url filter: remove trailing non-URL characters
Adjusted UrlFilter::newHotSpot to strip trailing non-URL characters (e.g., commas, dots) using the regex [',.:;]+$. This ensures correct URL parsing without trailing punctuation.
Test case: for the input https://example.com. the detected URL should exclude the trailing dot.
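To make the test case concrete, here is a minimal standalone sketch of the trimming step. It uses std::regex so that it compiles without Qt. The helper name stripTrailingPunctuation is hypothetical, and the actual change inside UrlFilter::newHotSpot may be written differently. Only the pattern [',.:;]+$ is taken from this description.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not the actual Konsole code: removes a trailing run
// of the characters ' , . : ; from a URL candidate, using the pattern
// quoted in this description.
std::string stripTrailingPunctuation(const std::string &url)
{
    // Anchored with $, so only characters at the very end can match.
    static const std::regex trailing("[',.:;]+$");
    return std::regex_replace(url, trailing, "");
}
```

With this sketch, stripTrailingPunctuation("https://example.com.") gives https://example.com, while dots inside the URL (as in https://example.com/a.b) are left alone because the pattern only matches at the end of the string.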
Potential Limitations:
- The implementation might impact performance due to the use of regular expressions.
- For URLs spanning multiple lines, the function only removes trailing non-URL characters in the last line.
- Unable to treat https://example.com,https://example.com as two separate URLs.
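The last limitation follows from the pattern being anchored at the end of the match. The sketch below illustrates this with std::regex as a stand-in for whatever regex class the real code uses. The function name trimUrlTail is an assumption made for illustration and is not part of the patch.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for the trimming step. Because the pattern ends
// with $, it can only remove punctuation at the very end of the candidate;
// a comma between two URLs is never touched.
std::string trimUrlTail(const std::string &candidate)
{
    static const std::regex tail("[',.:;]+$");
    return std::regex_replace(candidate, tail, "");
}
```

Passing https://example.com,https://example.com to this sketch returns the input unchanged, so the two URLs stay fused into one match. Splitting them would need a separate step that this change does not attempt.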
Edited by Wendi Gan