This repository was archived by the owner on Jul 24, 2023. It is now read-only.

Commit 0f46921

Zaid Zawaideh committed
added handling of invalid UTF-8 byte sequence exceptions
1 parent 9ce498e commit 0f46921

2 files changed

Lines changed: 14 additions & 2 deletions


lib/openid/consumer/html_parse.rb

Lines changed: 5 additions & 1 deletion
@@ -34,7 +34,11 @@ def OpenID.unescape_hash(h)
 
 
   def OpenID.parse_link_attrs(html)
-    stripped = html.gsub(REMOVED_RE,'')
+    begin
+      stripped = html.gsub(REMOVED_RE,'')
+    rescue ArgumentError
+      stripped = html.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '').gsub(REMOVED_RE,'')
+    end
     parser = HTMLTokenizer.new(stripped)
 
     links = []
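
The fallback in this hunk can be tried in isolation. The following is a minimal sketch, not the library's code: `SKETCH_REMOVED_RE` and `strip_markup` are hypothetical names, and the pattern is only a stand-in because `REMOVED_RE` itself is not shown in this diff. It assumes that `gsub` with a regular expression raises `ArgumentError` on a string whose bytes are not valid UTF-8, which is the exception the commit rescues.

```ruby
# Hypothetical stand-in for REMOVED_RE; the real pattern is not part of this diff.
SKETCH_REMOVED_RE = /<!--.*?-->/m

# Same shape as the patched code: try the plain gsub, fall back on ArgumentError.
def strip_markup(html)
  html.gsub(SKETCH_REMOVED_RE, '')
rescue ArgumentError
  # Reinterpret the input as raw bytes and drop every non-ASCII byte,
  # which leaves a valid UTF-8 string that gsub can scan.
  html.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
      .gsub(SKETCH_REMOVED_RE, '')
end

broken = "<html><!-- note --><body>hello joel\255</body></html>".force_encoding('UTF-8')
cleaned = strip_markup(broken)

# The fallback is lossy: valid multibyte characters go too, not just the bad byte.
mixed = ("caf\u00e9".b + "\255".b).force_encoding('UTF-8')
lossy = mixed.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')

# String#scrub (Ruby 2.1+) is an alternative that removes only the invalid bytes.
# The commit does not use it; it is shown here only for comparison.
scrubbed = mixed.scrub('')
```

Because the `binary` to `UTF-8` transcoding treats every byte above 0x7F as undefined, the rescue path removes legitimate non-ASCII text as well. It only runs when the first `gsub` has already failed, so well-formed input is unaffected.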

test/test_linkparse.rb

Lines changed: 9 additions & 1 deletion
@@ -84,7 +84,8 @@ def test_linkparse
           assert(false, "datafile parsing error: bad header #{h}")
         end
       }
-      links = OpenID::parse_link_attrs(html)
+
+      links = OpenID::parse_link_attrs(html.force_encoding('UTF-8'))
 
       found = links.dup
       expected = expected_links.dup
@@ -97,5 +98,12 @@ def test_linkparse
         end
       }
       assert_equal(numtests, testnum, "Number of tests")
+
+      # test handling of invalid UTF-8 byte sequences
+      html = "<html><body>hello joel\255</body></html>".force_encoding("UTF-8")
+      assert_nothing_raised do
+        OpenID::parse_link_attrs(html)
+      end
+
     end
   end
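
The new test relies on `force_encoding` relabelling a string without checking its bytes. A small sketch of that behavior, using variable names that are illustrative only and do not come from the test suite:

```ruby
# "\255" is an octal escape for the single byte 0xAD, which cannot start a UTF-8 sequence.
raw = "hello joel\255".b                    # binary (ASCII-8BIT) copy of the bytes
tagged = raw.dup.force_encoding('UTF-8')    # relabel only; no bytes are changed

same_bytes = (tagged.bytes == raw.bytes)
last_byte  = tagged.bytes.last
valid      = tagged.valid_encoding?

# A regexp-based gsub on the mislabelled string is what triggers the rescued exception.
raised = begin
  tagged.gsub(/x/, '')
  false
rescue ArgumentError
  true
end
```

This is why the added assertion exercises the rescue branch: the string is tagged as UTF-8 but fails `valid_encoding?`, so the first `gsub` in `parse_link_attrs` is expected to raise before the fallback takes over.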
