URL-parsing: consider ? a divider

The URL parser got a little stricter: it now considers a ? to be a host
name divider, so that slightly sloppier URLs work too. The change was
prompted by a reported problem with a URL like:
www.example.com?email=name@example.com This form of URL is not really a
legal URL (due to the missing slash after the host name) but is widely
accepted by all major browsers. libcurl already accepted it as well; it
was just the '@' character that triggered the problem.
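
To illustrate the effect of the scanset change, here is a minimal
standalone sketch. It is not the libcurl code: split_old() and
split_new() are hypothetical helpers that reuse the two format strings
from the scheme-less fallback in the diff, with field widths added so
the sketch cannot overflow its buffers.

```c
#include <stdio.h>

/* Hypothetical helpers, for illustration only. Both expect 'host' and
   'path' to point to buffers of at least 256 bytes. They return the
   number of fields sscanf() managed to fill in. */

/* Old scanset: the host part ends only at '/' or newline. */
static int split_old(const char *url, char *host, char *path)
{
  host[0] = 0;
  path[0] = 0;
  return sscanf(url, "%255[^\n/]%255[^\n]", host, path);
}

/* New scanset: the host part also ends at '?'. */
static int split_new(const char *url, char *host, char *path)
{
  host[0] = 0;
  path[0] = 0;
  return sscanf(url, "%255[^\n/?]%255[^\n]", host, path);
}
```

For www.example.com?email=name@example.com the old format puts the
whole string, query and '@' included, into the host buffer and leaves
the path empty. The new format stops the host at the '?' and hands
"?email=name@example.com" over as the path.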

The side-effect of this change is that libcurl no longer accepts the ?
character as part of the user name or password when given in the URL,
which it used to accept (and which was tested in test 191). RFC 3986
does however say that this character must be percent-encoded there,
since it is used as a divider.
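
For callers that build such URLs themselves, the encoding step could
look roughly like the sketch below. encode_qmark() is a hypothetical
helper, not a libcurl function, and it only handles the '?' character
discussed here; a real application would have to encode every other
reserved character in the credentials too.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: copy 'in' to 'out' and
   replace every '?' with "%3f". Returns 0 on success or -1 if 'out'
   (of size 'outlen') is too small for the result plus the
   terminating zero. */
static int encode_qmark(const char *in, char *out, size_t outlen)
{
  size_t o = 0;

  if(outlen == 0)
    return -1;

  for(; *in; in++) {
    if(*in == '?') {
      /* three output bytes, and still room for the terminator */
      if(o + 3 >= outlen)
        return -1;
      out[o++] = '%';
      out[o++] = '3';
      out[o++] = 'f';
    }
    else {
      if(o + 1 >= outlen)
        return -1;
      out[o++] = *in;
    }
  }
  out[o] = 0;
  return 0;
}
```

Applied to the credentials from test 191, "use?r" becomes "use%3fr" and
"pass?word" becomes "pass%3fword", which is the form the updated test
command uses.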

Bug: http://curl.haxx.se/bug/view.cgi?id=3090268
Daniel Stenberg 2010-10-19 20:20:06 +02:00
parent 6164d40fce
commit 98d9dc7840
4 changed files with 59 additions and 4 deletions


@ -3597,7 +3597,7 @@ static CURLcode parseurlandfillconn(struct SessionHandle *data,
path[0]=0;
if(2 > sscanf(data->change.url,
- "%15[^\n:]://%[^\n/]%[^\n]",
+ "%15[^\n:]://%[^\n/?]%[^\n]",
protobuf,
conn->host.name, path)) {
@ -3605,7 +3605,7 @@ static CURLcode parseurlandfillconn(struct SessionHandle *data,
* The URL was badly formatted, let's try the browser-style _without_
* protocol specified like 'http://'.
*/
- rc = sscanf(data->change.url, "%[^\n/]%[^\n]", conn->host.name, path);
+ rc = sscanf(data->change.url, "%[^\n/?]%[^\n]", conn->host.name, path);
if(1 > rc) {
/*
* We couldn't even get this format.


@ -68,7 +68,7 @@ EXTRA_DIST = test1 test108 test117 test127 test20 test27 test34 test46 \
test1108 test1109 test1110 test1111 test1112 test129 test567 test568 \
test569 test570 test571 test572 test804 test805 test806 test807 test573 \
test313 test1115 test578 test579 test1116 test1200 test1201 test1202 \
- test1203 test1117
+ test1203 test1117 test1118
filecheck:
@mkdir test-place; \

tests/data/test1118 Normal file

@ -0,0 +1,55 @@
<testcase>
<info>
<keywords>
HTTP
HTTP GET
</keywords>
</info>
#
# Server-side
<reply>
<data>
HTTP/1.1 200 OK
Date: Thu, 09 Nov 2010 14:49:00 GMT
Server: test-server/fake
Last-Modified: Tue, 13 Jun 2000 12:10:00 GMT
ETag: "21025-dc7-39462498"
Accept-Ranges: bytes
Content-Length: 6
Connection: close
Content-Type: text/html
Funny-head: yesyes
-foo-
</data>
</reply>
#
# Client-side
<client>
<server>
http
</server>
<name>
URL without slash and @-letter in query
</name>
<command>
http://%HOSTIP:%HTTPPORT?email=name@example.com/1118
</command>
</client>
#
# Verify data after the test has been "shot"
<verify>
<strip>
^User-Agent:.*
</strip>
<protocol>
GET /?email=name@example.com/1118 HTTP/1.1
Host: %HOSTIP:%HTTPPORT
Accept: */*
</protocol>
</verify>
</testcase>


@ -15,7 +15,7 @@ ftp
FTP URL with ?-letters in username and password
</name>
<command>
- "ftp://use?r:pass?word@%HOSTIP:%FTPPORT/191"
+ "ftp://use%3fr:pass%3fword@%HOSTIP:%FTPPORT/191"
</command>
</client>