---
c: Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al.
SPDX-License-Identifier: curl
Title: curl_url_set
Section: 3
Source: libcurl
See-also:
  - CURLOPT_CURLU (3)
  - curl_url (3)
  - curl_url_cleanup (3)
  - curl_url_dup (3)
  - curl_url_get (3)
  - curl_url_strerror (3)
Protocol:
  - All
---

# NAME

curl_url_set - set a URL part

# SYNOPSIS

~~~c
#include <curl/curl.h>

CURLUcode curl_url_set(CURLU *url,
                       CURLUPart part,
                       const char *content,
                       unsigned int flags);
~~~

# DESCRIPTION

The *url* handle to work on, passed in as the first argument, must be a
handle previously created by curl_url(3) or curl_url_dup(3).

This function sets or updates individual URL components, or parts, held by the
URL object the handle identifies.

The *part* argument should identify the particular URL part (see list below)
to set or change, with *content* pointing to a null-terminated string with the
new contents for that URL part. The contents should be in the form and
encoding they would use in a URL: URL encoded.

When setting a part that is already set in the URL object, the new *content*
replaces the data previously stored for that part.

The caller does not have to keep *content* around after a successful call
as this function copies the content.

Setting a part to a NULL pointer removes that part's contents from the *CURLU*
handle.

This function has an 8 MB maximum length limit for all provided input strings.
In the real world, excessively long fields in URLs cause problems even if this
function accepts them.

When setting or updating the contents of individual URL parts, curl_url_set(3)
might accept data that a full URL parse would reject. Beware: extracting a
full URL from such components later on might produce an invalid URL.

The *flags* argument is a bitmask with independent features.

# PARTS

## CURLUPART_URL

Allows the full URL of the handle to be replaced. If the handle already is
populated with a URL, the new URL can be relative to the previous.

When successfully setting a new URL, relative or absolute, the handle contents
are replaced with the components of the newly set URL.

Pass a pointer to a null-terminated string in the *content* parameter. The
string must be a correctly formatted "RFC 3986+" URL, or the pointer must be
NULL.

By default, this API only accepts setting URLs using schemes for protocols
that are supported built-in. To make libcurl parse URLs generically even for
schemes it does not know about, the **CURLU_NON_SUPPORT_SCHEME** flags bit
must be set. Otherwise, this function returns *CURLUE_UNSUPPORTED_SCHEME* for
URL schemes it does not recognize.

Unless *CURLU_NO_AUTHORITY* is set, a blank hostname is not allowed in
the URL.

## CURLUPART_SCHEME

The scheme cannot be URL encoded on set. libcurl only accepts setting schemes
up to 40 bytes long.

## CURLUPART_USER

If only the user part is set and not the password, the URL is represented with
a blank password.

## CURLUPART_PASSWORD

If only the password part is set and not the user, the URL is represented with
a blank user.

## CURLUPART_OPTIONS

The options field is an optional field that might follow the password in the
userinfo part. It is only recognized and used when parsing URLs for the
following schemes: pop3, smtp and imap. This function, however, allows users
to set this field independently.

## CURLUPART_HOST

The hostname. If it is an International Domain Name (IDN), the string must be
encoded as your locale says, or as UTF-8 when WinIDN is used. If it is a
bracketed IPv6 numeric address, it may contain a zone id (or you can use
*CURLUPART_ZONEID*).

Note that setting an IPv6 address together with the *CURLU_URLENCODE* flag
ruins the address and causes an error.

Unless *CURLU_NO_AUTHORITY* is set, setting a blank hostname is not allowed.

## CURLUPART_ZONEID

If the hostname is a numeric IPv6 address, this field can also be set.

## CURLUPART_PORT

The port number cannot be URL encoded on set. The port number is provided as
a string, and the decimal number in it must be between 0 and 65535. Anything
else returns an error.

## CURLUPART_PATH

If a path is set in the URL without a leading slash, a slash is prepended
automatically.

## CURLUPART_QUERY

The query part gets spaces converted to pluses when asked to URL encode on set
with the *CURLU_URLENCODE* bit.

If used together with the *CURLU_APPENDQUERY* bit, the provided part is
appended to the end of the existing query.

The question mark in the URL is not part of the actual query contents.

## CURLUPART_FRAGMENT

The hash sign in the URL is not part of the actual fragment contents.

# FLAGS

The flags argument is zero, one or more bits set in a bitmask.

## CURLU_APPENDQUERY

Can be used when setting the *CURLUPART_QUERY* component. The provided new
part is then appended at the end of the existing query. If the existing query
does not end with an ampersand (&), an ampersand gets inserted before the
newly appended part.

When *CURLU_APPENDQUERY* is used together with *CURLU_URLENCODE*, the
first '=' symbol is not URL encoded.

## CURLU_NON_SUPPORT_SCHEME

If set, allows curl_url_set(3) to set a non-supported scheme.

## CURLU_URLENCODE

When set, curl_url_set(3) URL encodes the part on entry, except for
**scheme**, **port** and **URL**.

When setting the path component with URL encoding enabled, the slash character
is left as-is and not encoded.

The query part has its spaces converted to pluses before the URL encoding is
applied.

This URL encoding is charset unaware and converts the input in a byte-by-byte
manner.

## CURLU_DEFAULT_SCHEME

If set, allows the URL to be set without a scheme, in which case the scheme is
set to the default: HTTPS. Overrides the *CURLU_GUESS_SCHEME* option if both
are set.

## CURLU_GUESS_SCHEME

If set, allows the URL to be set without a scheme and instead "guesses"
which scheme was intended based on the hostname. If the outermost
subdomain name matches DICT, FTP, IMAP, LDAP, POP3 or SMTP then that scheme is
used, otherwise it picks HTTP. Conflicts with the *CURLU_DEFAULT_SCHEME*
option which takes precedence if both are set.

## CURLU_NO_AUTHORITY

If set, skips authority checks. The RFC allows individual schemes to omit the
host part (normally the only mandatory part of the authority), but libcurl
cannot know whether this is permitted for custom schemes. Specifying the flag
permits empty authority sections, similar to how the file scheme is handled.

## CURLU_PATH_AS_IS

When set for **CURLUPART_URL**, this skips the normalization of the
path. Normalization is the procedure in which libcurl otherwise removes
sequences such as dot-slash and dot-dot. The same option used for transfers
is called CURLOPT_PATH_AS_IS(3).

## CURLU_ALLOW_SPACE

If set, the URL parser allows space (ASCII 32) where possible. The URL syntax
normally does not allow spaces anywhere; they should instead be encoded as %20
or '+'. Even when spaces are allowed, they are still not allowed in the scheme.
When space is used and allowed in a URL, it is stored as-is unless
*CURLU_URLENCODE* is also set, which then makes libcurl URL encode the
space before it is stored. This affects how the URL is constructed when
curl_url_get(3) is subsequently used to extract the full URL or
individual parts. (Added in 7.78.0)

## CURLU_DISALLOW_USER

If set, the URL parser does not accept embedded credentials for the
**CURLUPART_URL**, and instead returns **CURLUE_USER_NOT_ALLOWED** for
such URLs.

# EXAMPLE

~~~c
#include <curl/curl.h>

int main(void)
{
  CURLUcode rc;
  CURLU *url = curl_url();
  if(!url)
    return 1; /* no handle could be created */
  rc = curl_url_set(url, CURLUPART_URL, "https://example.com", 0);
  if(!rc) {
    /* change it to an FTP URL */
    rc = curl_url_set(url, CURLUPART_SCHEME, "ftp", 0);
  }
  curl_url_cleanup(url);
  return rc ? 1 : 0;
}
~~~

# AVAILABILITY

Added in 7.62.0. CURLUPART_ZONEID was added in 7.65.0.

# RETURN VALUE

Returns a *CURLUcode* error value, which is CURLUE_OK (0) if everything
went fine. See the libcurl-errors(3) man page for the full list with
descriptions.

The input string passed to curl_url_set(3) must be shorter than eight
million bytes. Otherwise this function returns **CURLUE_MALFORMED_INPUT**.

If this function returns an error, no URL part is set.