63 Commits

Author SHA1 Message Date
Daniel Stenberg
309a517ffd
lib1560: verify that more bad host names are rejected
when setting the hostname component of a URL

Closes #10922
2023-04-11 11:33:07 +02:00
Daniel Stenberg
826e8011d5
urlapi: prevent setting invalid schemes with *url_set()
A typical mistake would be to try to set "https://", including the
separator. This is now rejected, since it would otherwise make
url_get(... URL...) extract an invalid URL.

Extended test 1560 to verify.

Closes #10911
2023-04-09 23:23:54 +02:00
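
The scheme check described in 826e8011d5 can be illustrated with a small standalone sketch. It is not libcurl's code: it assumes the generic scheme grammar from RFC 3986 (a letter followed by letters, digits, "+", "-" or "."), and `is_valid_scheme` is a hypothetical helper name.

```python
import re

# Assumption: RFC 3986 scheme grammar, ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ).
# libcurl's real check may differ in details such as a maximum length.
_SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme: str) -> bool:
    """Return True if the string looks like a bare scheme name."""
    return _SCHEME_RE.fullmatch(scheme) is not None


print(is_valid_scheme("https"))     # a bare scheme name
print(is_valid_scheme("https://"))  # separator included, so it is rejected
```

Under this grammar, a value that still carries the "://" separator fails the match, which is the kind of input the commit says is now refused.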
Daniel Stenberg
17a15d8846
urlapi: detect and error on illegal IPv4 addresses
Using bad numbers in an IPv4 numerical address now returns
CURLUE_BAD_HOSTNAME.

I noticed this while working on trurl, and it was originally reported here:
https://github.com/curl/trurl/issues/78

Updated test 1560 accordingly.

Closes #10894
2023-04-06 09:02:00 +02:00
Daniel Stenberg
f042e1e75d
urlapi: URL encoding for the URL missed the fragment
This meant the fragment was wrongly stored with spaces instead of %20
when spaces were allowed and URL encoding was also requested.

Discovered when playing with trurl.

Added test to lib1560 to verify the fix.

Closes #10887
2023-04-05 08:30:12 +02:00
Daniel Stenberg
0a0c9b6dfa
urlapi: '%' is illegal in host names
Update test 1560 to verify

Ref: #10708
Closes #10711
2023-03-08 15:33:43 +01:00
Daniel Stenberg
54605666ed
lib1560: fix enumerated type mixed with another type
Follow-up to c84c0f9aa3

Closes #10684
2023-03-06 08:14:42 +01:00
Daniel Stenberg
c84c0f9aa3
lib1560: test parsing URLs with ridiculously large fields
In the order of 120K.

Closes #10665
2023-03-03 23:23:53 +01:00
Daniel Stenberg
bb11969838
lib1560: add a test using %25 in the userinfo in a URL
Closes #10578
2023-02-21 16:10:13 +01:00
Daniel Stenberg
b30b0c3840
lib1560: add IPv6 canonicalization tests
Closes #10552
2023-02-17 23:22:05 +01:00
Daniel Stenberg
8b27799f8c
urlapi: do the port number extraction without using sscanf()
- sscanf() is rather complex and slow, strchr() much simpler

- the port number function does not need to fully verify the IPv6 address
  anyway as it is done later in the hostname_check() function and doing
  it twice is unnecessary.

Closes #10541
2023-02-17 16:21:26 +01:00
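
The idea in 8b27799f8c, finding the port with a plain character search instead of a format-string parser, can be sketched as follows. This is an illustration only, not libcurl's implementation: `split_port` is a hypothetical name, treating an empty port after the colon as "no port" is an assumption, and IPv6 validation is deliberately left out because the commit says it happens later.

```python
from typing import Optional, Tuple

_DIGITS = "0123456789"


def split_port(hostport: str) -> Tuple[str, Optional[int]]:
    """Split 'host[:port]' into (host, port); port is None when absent."""
    if hostport.startswith("["):
        # Bracketed IPv6 literal: only locate the closing bracket here.
        # Validating the address itself is assumed to happen in a later step.
        end = hostport.find("]")
        if end == -1:
            raise ValueError("missing closing bracket")
        host, rest = hostport[: end + 1], hostport[end + 1:]
        if rest and not rest.startswith(":"):
            raise ValueError("unexpected data after IPv6 literal")
        port_text = rest[1:]
    else:
        host, _, port_text = hostport.partition(":")
    if not port_text:
        return host, None  # no port given (assumption: empty port means none)
    if any(ch not in _DIGITS for ch in port_text):
        raise ValueError("non-digit in port number")
    port = int(port_text)
    if port > 65535:
        raise ValueError("port number out of range")
    return host, port  # port 0 is accepted, as in commit 92d1aee8b1
```

For example, `split_port("[::1]:443")` yields `("[::1]", 443)` without inspecting the address inside the brackets.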
Daniel Stenberg
2bc1d775f5
copyright: update all copyright lines and remove year ranges
- they are mostly pointless in all major jurisdictions
- many big corporations and projects already don't use them
- saves us from pointless churn
- git keeps history for us
- the year range is kept in COPYING

checksrc is updated to allow non-year using copyright statements

Closes #10205
2023-01-03 09:19:21 +01:00
Daniel Stenberg
901392cbb7
urlapi: add CURLU_PUNYCODE
Allows curl_url_get() to get the punycode version of host names for the
host name and URL parts.

Extend test 1560 to verify.

Closes #10109
2022-12-26 23:29:23 +01:00
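
As a rough illustration of what "the punycode version" of a host name means in 901392cbb7, the sketch below uses Python's built-in `idna` codec. This is an assumption made for demonstration: the codec implements IDNA 2003, which may not match the IDN library a given curl build uses, and `to_punycode` is a hypothetical helper.

```python
def to_punycode(host: str) -> str:
    """Convert an international host name to its ASCII (punycode) form."""
    # Python's "idna" codec converts each label separately and leaves
    # plain ASCII labels untouched.
    return host.encode("idna").decode("ascii")


idn_host = "r\u00e4ksm\u00f6rg\u00e5s.se"
ascii_host = to_punycode(idn_host)
print(ascii_host)                                 # labels with non-ASCII get an "xn--" prefix
print(ascii_host.encode("ascii").decode("idna"))  # round-trips to the original name
```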
Daniel Stenberg
b151faa083
lib1560: add some basic IDN host name tests
Closes #10094
2022-12-15 22:57:08 +01:00
Daniel Stenberg
c20b35ddae
urlapi: reject more bad letters from the host name: &+()
Follow-up from eb0167ff7d

Extend test 1560 to verify

Closes #10096
2022-12-15 08:23:48 +01:00
Daniel Stenberg
7d6cf06f57
urlapi: fix parsing URL without slash with CURLU_URLENCODE
When CURLU_URLENCODE is set, the parser would mistreat the path
component if the URL was specified without a slash like in
http://local.test:80?-123

Extended test 1560 to reproduce and verify the fix.

Reported-by: Trail of Bits

Closes #9763
2022-10-20 08:56:53 +02:00
Daniel Stenberg
eb0167ff7d
urlapi: reject more bad characters from the host name field
Extended test 1560 to verify

Report from the ongoing source code audit by Trail of Bits.

Closes #9608
2022-09-28 08:22:42 +02:00
Daniel Stenberg
1a87a1efba
url: a zero-length userinfo part in the URL is still a (blank) user
Adjusted test 1560 to verify

Reported-by: Jay Satiro

Fixes #9088
Closes #9590
2022-09-26 07:45:53 +02:00
Daniel Stenberg
c4768f168c
lib1560: extended to verify detect/reject of unknown schemes
... when no guessing is allowed.
2022-09-15 09:31:45 +02:00
Daniel Stenberg
ef80a87f40
libtest/lib1560: test basic websocket URL parsing
2022-09-09 15:11:14 +02:00
Daniel Stenberg
6fa89fa893
tests: several enumerated type cleanups
To please icc

Closes #9179
2022-07-23 13:39:29 +02:00
Pierrick Charron
4bf2c231d7
urlapi: make curl_url_set(url, CURLUPART_URL, NULL, 0) clear all parts
As per the documentation:

> Setting a part to a NULL pointer will effectively remove that
> part's contents from the CURLU handle.

But currently clearing CURLUPART_URL does nothing and returns
CURLUE_OK. This change will clear all parts of the URL at once.

Closes #9028
2022-06-20 08:15:51 +02:00
max.mehl
ad9bc5976d
copyright: make repository REUSE compliant
Add licensing and copyright information for all files in this repository. This
either happens in the file itself as a comment header or in the file
`.reuse/dep5`.

This commit also adds a GitHub workflow to check pull requests and adapts
copyright.pl to the changes.

Closes #8869
2022-06-13 09:13:00 +02:00
Daniel Stenberg
cfa47974fe
libtest/lib1560: verify the host name percent decode fix
2022-05-09 12:50:41 +02:00
Daniel Stenberg
eec5ce4ab4
urlapi: if possible, shorten given numerical IPv6 addresses
Extended test 1560 to verify

Closes #8206
2022-01-02 22:59:08 +01:00
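
Shortening a numerical IPv6 address, as in eec5ce4ab4, amounts to parsing it and printing it back in compressed form. The sketch below does that with Python's `ipaddress` module. It only shows the idea and does not claim to mirror how libcurl does it; `shorten_ipv6` is a hypothetical name, and returning unparsable input unchanged is a simplification.

```python
import ipaddress


def shorten_ipv6(host: str) -> str:
    """Return a bracketed IPv6 literal in its shortest textual form."""
    if not (host.startswith("[") and host.endswith("]")):
        return host  # not a bracketed IPv6 literal, leave untouched
    try:
        addr = ipaddress.IPv6Address(host[1:-1])
    except ValueError:
        return host  # does not parse as IPv6; a real parser would reject it
    return "[" + addr.compressed + "]"


print(shorten_ipv6("[2001:0db8:0000:0000:0000:0000:0000:0001]"))  # [2001:db8::1]
print(shorten_ipv6("[0:0:0:0:0:0:0:0]"))                          # [::]
```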
Daniel Stenberg
92d1aee8b1
urlapi: accept port number zero
This is a regression since 7.62.0 (fb30ac5a2d).

Updated test 1560 accordingly

Reported-by: Brad Fitzpatrick
Fixes #8090
Closes #8091
2021-12-03 22:58:41 +01:00
Daniel Stenberg
4183b8fe9a
urlapi: provide more detailed return codes
Previously, the return code CURLUE_MALFORMED_INPUT was used for almost
30 different URL format violations. This made it hard for users to
understand why a particular URL was not acceptable. Since the API cannot
point out a specific position within the URL for the problem, this now
instead introduces a number of additional and more fine-grained error
codes to allow the API to return more exactly in what "part" or section
of the URL a problem was detected.

Also bug-fixes curl_url_get() with CURLUPART_ZONEID, which previously
returned CURLUE_OK even if no zoneid existed.

Test cases in 1560 have been adjusted and extended. Tests 1538 and 1559
have been updated.

Updated libcurl-errors.3 and curl_url_strerror() accordingly.

Closes #8049
2021-11-25 08:36:04 +01:00
Daniel Stenberg
3e6eb18fce
urlapi: reject short file URLs
file URLs that are 6 bytes or shorter are not complete. Return
CURLUE_MALFORMED_INPUT for those. Extended test 1560 to verify.

Triggered by #8041
Closes #8042
2021-11-23 08:45:21 +01:00
Daniel Stenberg
9a8564a920
urlapi: URL decode percent-encoded host names
The host name is stored decoded and can be encoded when used to extract
the full URL. By default when extracting the URL, the host name will not
be URL encoded, to work as similarly as possible to before. When not URL
encoding the host name, the '%' character will however still be encoded.

Getting the URL with the CURLU_URLENCODE flag set will percent encode
the host name part.

As a bonus, setting the host name part with curl_url_set() no longer
accepts a name that contains space, CR or LF.

Test 1560 has been extended to verify percent encodings.

Reported-by: Noam Moshe
Reported-by: Sharon Brizinov
Reported-by: Raul Onitza-Klugman
Reported-by: Kirill Efimov
Fixes #7830
Closes #7834
2021-10-11 17:04:14 +02:00
Sergey Markelov
4b997626b1
urlapi: support UNC paths in file: URLs on Windows
- file://host.name/path/file.txt is a valid UNC path
  \\host.name\path\files.txt to a non-local file transformed into URI
  (RFC 8089 Appendix E.3)

- UNC paths on other OSs must be smb: URLs

Closes #7366
2021-09-27 08:32:41 +02:00
i-ky
3363eeb262
urlapi: add curl_url_strerror()
Add curl_url_strerror() to convert CURLUcode into readable string and
facilitate easier troubleshooting in programs using URL API.
Extend CURLUcode with CURLU_LAST for iteration in unit tests.
Update man pages with a mention of new function.
Update example code and tests with new functionality where it fits.

Closes #7605
2021-09-27 08:28:46 +02:00
Rikard Falkeborn
e75be2c4b2
cleanup: constify unmodified static structs
Constify a number of static structs that are never modified. Make them
const to show this.

Closes #7759
2021-09-23 12:54:35 +02:00
Daniel Stenberg
b67d3ba73e
curl_url_set: reject spaces in URLs w/o CURLU_ALLOW_SPACE
They were never officially allowed and slipped in only due to sloppy
parsing. Spaces (ascii 32) should be correctly encoded (to %20) before
being part of a URL.

The new flag bit CURLU_ALLOW_SPACE when a full URL is set, makes libcurl
allow spaces.

Updated test 1560 to verify.

Closes #7073
2021-06-15 10:49:49 +02:00
Daniel Stenberg
04488851e2
urlapi: make sure no +/- signs are accepted in IPv4 numericals
Follow-up to 56a037cc0a. Extends test 1560 to verify.

Reported-by: Tuomas Siipola
Fixes #6916
Closes #6917
2021-04-21 09:17:55 +02:00
Daniel Stenberg
56a037cc0a
urlapi: "normalize" numerical IPv4 host names
When the host name in a URL is given as an IPv4 numerical address, the
address can be specified with dotted numericals in four different ways:
a32, a.b24, a.b.c16 or a.b.c.d and each part can be specified in
decimal, octal (0-prefixed) or hexadecimal (0x-prefixed).

Instead of passing on the name as-is and leaving the handling to the
underlying name functions, which made them not work with c-ares but work
with getaddrinfo, this change now makes the curl URL API itself detect
and "normalize" host names specified as IPv4 numericals.

The WHATWG URL Spec says this is an okay way to specify a host name in a
URL. RFC 3986 does not allow them, but curl didn't prevent them before
and it seems other RFC 3986-using tools have not either. Host names used
like this are widely supported by other tools as well due to the
handling being done by getaddrinfo and friends.

I decided to add the functionality into the URL API itself so that all
users of these functions get the benefits, when for example wanting to
compare two URLs. Also, it makes curl built to use c-ares now support
them as well and make curl builds more consistent.

The normalization makes HTTPS and virtual hosted HTTP work fine even
when curl gets the address specified using one of the "obscure" formats.

Test 1560 is extended to verify.

Fixes #6863
Closes #6871
2021-04-19 08:34:55 +02:00
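
The four IPv4 notations listed in 56a037cc0a (a32, a.b24, a.b.c16 and a.b.c.d, each part decimal, octal or hexadecimal) can be normalized with a short routine. The following is a sketch written for this listing, not libcurl's implementation: the function names are hypothetical, and returning `None` for anything that is not a valid IPv4 numerical is a simplification.

```python
from typing import Optional


def _parse_part(text: str) -> Optional[int]:
    """Parse one part as decimal, octal (0-prefixed) or hex (0x-prefixed)."""
    lower = text.lower()
    if lower.startswith("0x"):
        digits, base = lower[2:], 16
    elif len(lower) > 1 and lower[0] == "0":
        digits, base = lower[1:], 8
    else:
        digits, base = lower, 10
    allowed = "0123456789abcdef"[:base]
    # Rejects empty parts, signs such as "+" or "-", and out-of-base digits.
    if not digits or any(ch not in allowed for ch in digits):
        return None
    return int(digits, base)


def normalize_ipv4(host: str) -> Optional[str]:
    """Return the dotted-quad form of an IPv4 numerical host name, else None."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    numbers = []
    for part in parts:
        value = _parse_part(part)
        if value is None:
            return None
        numbers.append(value)
    *head, last = numbers
    if any(n > 0xFF for n in head):
        return None  # every part except the last is a single byte
    remaining = 4 - len(head)  # bytes covered by the final part
    if last >= 1 << (8 * remaining):
        return None
    address = 0
    for n in head:
        address = (address << 8) | n
    address = (address << (8 * remaining)) | last
    return ".".join(str((address >> shift) & 0xFF) for shift in (24, 16, 8, 0))


for name in ("127.0.0.1", "0x7f.1", "2130706433", "0177.0.0.1"):
    print(name, "->", normalize_ipv4(name))  # each prints 127.0.0.1
```

Because all four spellings reduce to the same dotted-quad string, two URLs that differ only in how the address is written compare equal after normalization, which is the benefit the commit message points to.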
Daniel Stenberg
4d2f800677
curl.se: new home
Closes #6172
2020-11-04 23:59:47 +01:00
Daniel Stenberg
b7ea3d2c22
urlapi: URL encode a '+' in the query part
... when asked to with CURLU_URLENCODE.

Extended test 1560 to verify.
Reported-by: Dietmar Hauser
Fixes #6086
Closes #6087
2020-10-15 23:21:53 +02:00
Daniel Stenberg
17fcdf6a31
lib: fix -Wassign-enum warnings
configure --enable-debug now enables -Wassign-enum with clang, which
identified several enum "abuses" that are also fixed here.

Reported-by: Gisle Vanem
Bug: 879007f811 (commitcomment-42087553)

Closes #5929
2020-09-08 13:53:02 +02:00
Daniel Stenberg
259a81555d
lib1560: verify "redirect" to double-slash leading URL
Closes #5849
2020-08-25 13:06:34 +02:00
Martin V
b71628b633
test1560: avoid possibly negative association in wording
Closes #5549
2020-06-12 10:01:57 +02:00
Daniel Stenberg
7f1c098728
urlapi: accept :: as a valid IPv6 address
Test 1560 is extended to verify.

Reported-by: Pavel Volgarev
Fixes #5344
Closes #5351
2020-05-08 08:47:29 +02:00
Patrick Monnerat
a75f12768d
test 1560: avoid valgrind false positives
When using maximum code optimization level (-O3), valgrind wrongly
detects uses of uninitialized values in strcmp().

Preset buffers with all zeroes to avoid that.
2020-03-08 17:30:55 +01:00
Daniel Stenberg
d3dc0a07e9
urlapi: guess scheme correctly even with credentials given
In the "scheme-less" parsing case, we need to strip off credentials
first before we guess scheme based on the host name!

Assisted-by: Jay Satiro
Fixes #4856
Closes #4857
2020-01-28 08:40:16 +01:00
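
The ordering fixed in d3dc0a07e9, strip credentials first and only then guess the scheme from the host name, can be sketched like this. The prefix table is an assumption modeled on libcurl's documented guessing (DICT, FTP, IMAP, LDAP, POP3 or SMTP as the leading host name label, otherwise HTTP), and `guess_scheme` is a hypothetical helper, not a libcurl function.

```python
# Assumed prefix table, based on libcurl's documented scheme guessing.
_PREFIX_TO_SCHEME = (
    ("ftp.", "ftp"),
    ("dict.", "dict"),
    ("ldap.", "ldap"),
    ("imap.", "imap"),
    ("smtp.", "smtp"),
    ("pop3.", "pop3"),
)


def guess_scheme(authority: str) -> str:
    """Guess a scheme for a scheme-less URL from its authority part."""
    # Drop "user:password@" first, so the guess looks at the host name only.
    host = authority.rpartition("@")[2].lower()
    for prefix, scheme in _PREFIX_TO_SCHEME:
        if host.startswith(prefix):
            return scheme
    return "http"


print(guess_scheme("user:secret@ftp.example.com"))  # ftp
print(guess_scheme("ftp.user@example.com"))         # http
```

The second call shows why the order matters: without stripping the userinfo first, the "ftp." at the start of the user name would be mistaken for part of the host.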
Daniel Stenberg
6e7733f788
urlapi: question mark within fragment is still fragment
The parser would check for a query part before the fragment, which caused
it to misparse URLs where the fragment contains a question mark.

Extended test 1560 to verify.

Reported-by: Alex Konev
Fixes #4412
Closes #4413
2019-09-24 23:30:43 +02:00
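
The fix in 6e7733f788 comes down to parsing order: cut off the fragment at the first "#" before looking for a "?". A minimal sketch of that order, with a hypothetical function name and no claim to match libcurl's parser beyond the ordering:

```python
from typing import Optional, Tuple


def split_path_query_fragment(rest: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split the part of a URL after the authority into path, query, fragment."""
    # Fragment first: everything after the first '#' belongs to it,
    # including any '?' characters.
    before_fragment, hash_sep, fragment = rest.partition("#")
    # Only what precedes the fragment is searched for a query.
    path, query_sep, query = before_fragment.partition("?")
    return (
        path,
        query if query_sep else None,
        fragment if hash_sep else None,
    )


print(split_path_query_fragment("/path#frag?not-a-query"))
# ('/path', None, 'frag?not-a-query')
```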
Jens Finkhaeuser
0a4ecbdf1c
urlapi: CURLU_NO_AUTHORITY allows empty authority/host part
CURLU_NO_AUTHORITY is intended for use with unknown schemes (i.e. not
"file:///") to override cURL's default demand that an authority exists.

Closes #4349
2019-09-19 15:57:28 +02:00
Daniel Stenberg
eab3c580f9
urlapi: verify the IPv6 numerical address
It needs to parse correctly. Otherwise it could be tricked into letting
through a-f using host names that libcurl would then resolve. Like
'[ab.be]'.

Reported-by: Thomas Vegas
Closes #4315
2019-09-10 11:32:12 +02:00
Marcel Raad
e23c52b329
build: fix Codacy warnings
Reduce variable scopes and remove redundant variable stores.

Closes https://github.com/curl/curl/pull/3975
2019-06-05 20:38:06 +02:00
Daniel Stenberg
8b038bcc95
lib1560: add tests for parsing URL with too long scheme
Ref: #3905
2019-05-20 15:27:07 +02:00
Daniel Stenberg
9f9ec7da57
urlapi: require a non-zero host name length when parsing URL
Updated test 1560 to verify.

Closes #3880
2019-05-14 13:39:10 +02:00
Daniel Stenberg
2d0e9b40d3
urlapi: add CURLUPART_ZONEID to set and get
The zoneid can be used with IPv6 numerical addresses.

Updated test 1560 to verify.

Closes #3834
2019-05-05 15:52:46 +02:00
Daniel Stenberg
bdb2dbc103
urlapi: strip off scope id from numerical IPv6 addresses
... to make the host name "usable". Store the scope id and put it back
when extracting a URL out of it.

Also makes curl_url_set() syntax check CURLUPART_HOST.

Fixes #3817
Closes #3822
2019-05-03 12:17:22 +02:00