fix: reject parser-confusing chars in validate_url to close SSRF bypass (#24534)

urllib.parse.urlparse and requests/aiohttp disagree on how to split URLs
containing backslash, tab, CR, or LF in or around the netloc. urlparse
treats backslash as part of userinfo and uses what follows '@' as the
host; requests treats backslash as the start of the path and connects
to whatever precedes it. The same URL therefore passes the private-IP
filter (urlparse sees a public host) but reaches an internal target
(requests connects to e.g. 127.0.0.1). The result is an SSRF that the
existing IP block list cannot catch, because it evaluates the wrong
host.

PoC: http://127.0.0.1:6666\@1.1.1.1 yields urlparse hostname 1.1.1.1
(global, passes), while requests reaches 127.0.0.1 (loopback).
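
The urlparse half of the PoC can be reproduced with the standard library
alone. This is a minimal sketch: using ipaddress.is_global as a stand-in
for the private-IP filter is an assumption, not the project's actual
filter code, and the requests half is not exercised here because it
would need a live connection.

```python
import ipaddress
from urllib.parse import urlparse

# PoC URL from this commit message; '\\' is one literal backslash.
poc = 'http://127.0.0.1:6666\\@1.1.1.1'

# urlparse treats everything before the last '@' as userinfo,
# so the loopback address never shows up as the hostname.
parsed_host = urlparse(poc).hostname

# Assumed stand-in for the private-IP filter: accept only global addresses.
passes_filter = ipaddress.ip_address(parsed_host).is_global

print(parsed_host, passes_filter)
```

A filter that only inspects parsed_host approves this URL even though the
text before the backslash names a loopback address.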

Reject up front any URL containing one of the four documented parser-
confusing characters before either parser gets a chance to interpret
it. None of these characters is valid in an unencoded URL (\ should
always be %5C, whitespace should be %09 / %0A / %0D), so this is a
pure defensive rejection with no legitimate-input false positives.
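
As a sketch of the rejection described above, the check can be written as
a standalone predicate. The names PARSER_CONFUSING_CHARS and
has_parser_confusing_char are hypothetical; the actual change performs
the check inline in validate_url.

```python
# Hypothetical standalone form of the check; the commit does this inline.
PARSER_CONFUSING_CHARS = ('\\', '\t', '\n', '\r')


def has_parser_confusing_char(url: str) -> bool:
    """Return True if the raw URL contains a backslash, tab, LF, or CR."""
    return any(ch in url for ch in PARSER_CONFUSING_CHARS)


samples = [
    'http://127.0.0.1:6666\\@1.1.1.1',  # PoC: raw backslash
    'http://example.com/\tpath',        # raw tab
    'https://example.com/a%5Cb?q=1',    # percent-encoded backslash
]
for sample in samples:
    print(repr(sample), has_parser_confusing_char(sample))
```

The percent-encoded form is not flagged, because the check looks only at
the raw characters of the string before any parser runs.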

Reported by Fushuling and RacerZ-fighting in GHSA-8w7q-q5jp-jvgx.

Co-authored-by: Fushuling <Fushuling@users.noreply.github.com>
Co-authored-by: RacerZ-fighting <RacerZ-fighting@users.noreply.github.com>
Classic298
2026-05-10 17:57:48 +02:00
committed by GitHub
parent 5b13e3e3f0
commit e7ba8978c6

@@ -69,6 +69,14 @@ def validate_url(url: Union[str, Sequence[str]]):
 if isinstance(validators.url(url), validators.ValidationError):
     raise ValueError(ERROR_MESSAGES.INVALID_URL)
+# Reject parser-confusing chars: urlparse and requests/aiohttp split
+# on these differently, e.g. http://127.0.0.1\@1.1.1.1 → urlparse
+# extracts 1.1.1.1 (public, passes filter) while requests connects
+# to 127.0.0.1 (internal). Same shape with tab/CR/LF.
+if any(ch in url for ch in ('\\', '\t', '\n', '\r')):
+    log.warning(f'Blocked URL with parser-confusing char: {url!r}')
+    raise ValueError(ERROR_MESSAGES.INVALID_URL)
 parsed_url = urllib.parse.urlparse(url)
 # Protocol validation - only allow http/https