Commit b53fb1d0 authored by Luis Javier Merino, committed by Tomaz Canabrava

Allow IPv6 literals in URIs

This allows recognizing URIs like the following:

http://[2a00:1450:4001:829::200e]/
http://[2a00:1450:4001:829::200e]:80/

Our regexp is not as strict as the syntax in RFC 3986; we just allow any
combination of hex digits, colons and dots between the brackets.

Besides, we don't go to the effort of forbidding IP-literals in www. URI
suffixes, so the following invalid suffix is recognized by our regexp:

www.[dead::beef]

but that was already recognized by the old regexp before 3b7e73f5.
parent da41c19a
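
The looseness described in the commit message can be illustrated with a small standalone sketch. It is not the Konsole code: it uses `std::regex` instead of Qt's regular expressions, so the possessive `++` from the commit is replaced by a plain `+`, and the helper name `looksLikeIpv6Literal` is hypothetical.

```cpp
#include <regex>
#include <string>

// Same character class as the commit's IPv6_literal macro, but with a greedy
// '+' because std::regex (ECMAScript grammar) has no possessive '++'.
static const std::regex ipv6Literal(R"(\[[0-9a-fA-F:.]+\])");

// Hypothetical helper: true if the whole string is a bracketed literal made
// only of hex digits, colons and dots. No RFC 3986 validation is attempted.
bool looksLikeIpv6Literal(const std::string &text)
{
    return std::regex_match(text, ipv6Literal);
}
```

Under these assumptions, `looksLikeIpv6Literal("[2a00:1450:4001:829::200e]")` and `looksLikeIpv6Literal("[dead::beef]")` are both true, and so is the syntactically invalid `"[::::dead:::beef::123.666.666.666::dead::::beef::::]"`. An IPvFuture literal such as `"[v1.fe80]"` is rejected, because `v` is not in the character class.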
@@ -66,6 +66,11 @@ void HotSpotFilterTest::testUrlFilterRegex_data()
 QTest::newRow("www_followed_by_colon") << "www.example.com:foo@bar.com"
 << "www.example.com" << true;
+QTest::newRow("ipv6") << "http://[2a00:1450:4001:829::200e]/"
+<< "http://[2a00:1450:4001:829::200e]" << true;
+QTest::newRow("ipv6_with_port") << "http://[2a00:1450:4001:829::200e]:80/"
+<< "http://[2a00:1450:4001:829::200e]:80" << true;
 }
 void HotSpotFilterTest::testUrlFilterRegex()
......
@@ -23,10 +23,12 @@ using namespace Konsole;
 //
 // It deviates from rfc3986:
 // - We only recognize URIs with authority (even if it is an empty authority)
-// - We match URIs starting with 'www.'
+// - We match URI suffixes starting with 'www.'
+// - We allow IPv6 literals right after 'www.', e.g: www.[dead::beef]
 // - "userinfo" is assumed to have a single ':' character
-// - We _don't_ match IPv6 addresses (e.g. http://[2010:836B:4179::836B:4179])
-// or IPvFuture
+// - We _don't_ match IPvFuture addresses
+// - We allow any combination of hex digits, colons and dots as IPv6 addresses,
+// e.g: https://[::::dead:::beef::123.666.666.666::dead::::beef::::]/foo
 // - "port" (':1234'), if present, is assumed to be non-empty
 // - We don't check the validity of percent-encoded characters
 // (e.g. "www.example.com/foo%XXbar")
@@ -50,7 +52,8 @@ static const char userInfo[] =
 "[" COMMON_1 "]+?:?"
 "[" COMMON_1 "]++@"
 ")?+";
-static const char host[] = "(?:[" COMMON_1 "]*+)"; // www.foo.bar
+#define IPv6_literal "\\[[0-9a-fA-F:.]++\\]"
+static const char host[] = "(?:[" COMMON_1 "]++|" IPv6_literal ")?+"; // www.foo.bar
 static const char port[] = "(?::[0-9]+)?+"; // :1234
 #define COMMON_2 "a-z0-9\\-._~%!$&'()*+,;=:@/"
......
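
The new `host` alternation followed by `port` can be tried in isolation with the sketch below. Several parts are assumptions: the definition of `COMMON_1` is not visible in this diff, so `REG_NAME_CHARS` is a hypothetical stand-in; `std::regex` is used instead of Qt, so the possessive quantifiers become ordinary greedy ones; and `isHostWithOptionalPort` is an illustrative name, not a Konsole function.

```cpp
#include <regex>
#include <string>

// HYPOTHETICAL stand-in for COMMON_1, whose definition is not shown in the diff.
#define REG_NAME_CHARS "a-zA-Z0-9._~-"
// Same character class as the commit's IPv6_literal, with '+' instead of '++'.
#define IPV6_LITERAL "\\[[0-9a-fA-F:.]+\\]"

// Host is either a run of name characters or a bracketed literal, and may be
// empty; the port, if present, needs at least one digit.
static const char hostPattern[] = "(?:[" REG_NAME_CHARS "]+|" IPV6_LITERAL ")?";
static const char portPattern[] = "(?::[0-9]+)?";

// Illustrative helper: true if the whole string is "host" or "host:port".
bool isHostWithOptionalPort(const std::string &text)
{
    static const std::regex re(std::string(hostPattern) + portPattern);
    return std::regex_match(text, re);
}
```

With these stand-ins, `"www.example.com"`, `"[2a00:1450:4001:829::200e]"` and `"[2a00:1450:4001:829::200e]:80"` are accepted, while `"[2a00:1450:4001:829::200e]:"` is rejected because the port must be non-empty. The empty string is also accepted, which mirrors the "empty authority" case mentioned in the source comment.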