- ignore duplicate "chunked" transfer-encodings from
a server to accomodate for broken implementations
- add test1482 and test1483
Reported-by: Mel Zuser
Fixes#13451Closes#13461
- curl's transfer handling may write 0-length chunks at the end of the
download with an EOS flag. (HTTP/2 does this commonly)
- content encoders need to pass-through such a write and not count this
as error in case they are finished decoding
Fixes#13209Fixes#13212Closes#13219
This clarifies the handling of server responses by folding the code for
the complicated protocols into their protocol handlers. This concerns
mainly HTTP and its bastard sibling RTSP.
The terms "read" and "write" are often used without clear context if
they refer to the connect or the client/application side of a
transfer. This PR uses "read/write" for operations on the client side
and "send/receive" for the connection, e.g. server side. If this is
considered useful, we can revisit renaming of further methods in another
PR.
Curl's protocol handler `readwrite()` method been changed:
```diff
- CURLcode (*readwrite)(struct Curl_easy *data, struct connectdata *conn,
- const char *buf, size_t blen,
- size_t *pconsumed, bool *readmore);
+ CURLcode (*write_resp)(struct Curl_easy *data, const char *buf, size_t blen,
+ bool is_eos, bool *done);
```
The name was changed to clarify that this writes reponse data to the
client side. The parameter changes are:
* `conn` removed as it always operates on `data->conn`
* `pconsumed` removed as the method needs to handle all data on success
* `readmore` removed as no longer necessary
* `is_eos` as indicator that this is the last call for the transfer
response (end-of-stream).
* `done` TRUE on return iff the transfer response is to be treated as
finished
This change affects many files only because of updated comments in
handlers that provide no implementation. The real change is that the
HTTP protocol handlers now provide an implementation.
The HTTP protocol handlers `write_resp()` implementation will get passed
**all** raw data of a server response for the transfer. The HTTP/1.x
formatted status and headers, as well as the undecoded response
body. `Curl_http_write_resp_hds()` is used internally to parse the
response headers and pass them on. This method is public as the RTSP
protocol handler also uses it.
HTTP/1.1 "chunked" transport encoding is now part of the general
*content encoding* writer stack, just like other encodings. A new flag
`CLIENTWRITE_EOS` was added for the last client write. This allows
writers to verify that they are in a valid end state. The chunked
decoder will check if it indeed has seen the last chunk.
The general response handling in `transfer.c:466` happens in function
`readwrite_data()`. This mainly operates now like:
```
static CURLcode readwrite_data(data, ...)
{
do {
Curl_xfer_recv_resp(data, buf)
...
Curl_xfer_write_resp(data, buf)
...
} while(interested);
...
}
```
All the response data handling is implemented in
`Curl_xfer_write_resp()`. It calls the protocol handler's `write_resp()`
implementation if available, or does the default behaviour.
All raw response data needs to pass through this function. Which also
means that anyone in possession of such data may call
`Curl_xfer_write_resp()`.
Closes#12480
This PR has these changes:
Renaming of unencode_* to cwriter, e.g. client writers
- documentation of sendf.h functions
- move max decode stack checks back to content_encoding.c
- define writer phase which was used as order before
- introduce phases for monitoring inbetween decode phases
- offering default implementations for init/write/close
Add type paramter to client writer's do_write()
- always pass all writes through the writer stack
- writers who only care about BODY data will pass other writes unchanged
add RAW and PROTOCOL client writers
- RAW used for Curl_debug() logging of CURLINFO_DATA_IN
- PROTOCOL used for updates to data->req.bytecount, max_filesize checks and
Curl_pgrsSetDownloadCounter()
- remove all updates of data->req.bytecount and calls to
Curl_pgrsSetDownloadCounter() and Curl_debug() from other code
- adjust test457 expected output to no longer see the excess write
Closes#12184
- move definitions from content_encoding.h to sendf.h
- move create/cleanup/add code into sendf.c
- installed content_encoding writers will always be called
on Curl_client_write(CLIENTWRITE_BODY)
- Curl_client_cleanup() frees writers and tempbuffers from
paused transfers, irregardless of protocol
Closes#11908
brotli v1.0.0 throughout current latest v1.0.9 and latest master [1]
trigger this warning.
It happened with CMake and GNU Make. autotools builds avoid it with
the `convert -I options to -isystem` macro.
llvm/clang:
```
In file included from ./curl/lib/content_encoding.c:36:
./brotli/x64-ucrt/usr/include/brotli/decode.h:204:34: warning: variable length array used [-Wvla]
const uint8_t encoded_buffer[BROTLI_ARRAY_PARAM(encoded_size)],
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./brotli/x64-ucrt/usr/include/brotli/port.h:253:34: note: expanded from macro 'BROTLI_ARRAY_PARAM'
^~~~~~
In file included from ./curl/lib/content_encoding.c:36:
./brotli/x64-ucrt/usr/include/brotli/decode.h:206:48: warning: variable length array used [-Wvla]
uint8_t decoded_buffer[BROTLI_ARRAY_PARAM(*decoded_size)]);
~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
./brotli/x64-ucrt/usr/include/brotli/port.h:253:35: note: expanded from macro 'BROTLI_ARRAY_PARAM'
~^~~~~
```
gcc:
```
In file included from ./curl/lib/content_encoding.c:36:
./brotli/x64-ucrt/usr/include/brotli/decode.h:204:5: warning: ISO C90 forbids variable length array 'encoded_buffer' [-Wvla]
204 | const uint8_t encoded_buffer[BROTLI_ARRAY_PARAM(encoded_size)],
| ^~~~~
./brotli/x64-ucrt/usr/include/brotli/decode.h:206:5: warning: ISO C90 forbids variable length array 'decoded_buffer' [-Wvla]
206 | uint8_t decoded_buffer[BROTLI_ARRAY_PARAM(*decoded_size)]);
| ^~~~~~~
```
[1] ed1995b6bd
Reviewed-by: Daniel Stenberg
Reviewed-by: Marcel Raad
Closes#10738
- they are mostly pointless in all major jurisdictions
- many big corporations and projects already don't use them
- saves us from pointless churn
- git keeps history for us
- the year range is kept in COPYING
checksrc is updated to allow non-year using copyright statements
Closes#10205
The unencoding stack is added to as Transfer-Encoding and
Content-Encoding fields are encountered with no distinction between the
two, meaning the stack will be incorrect if, e.g., the message has both
fields and a non-chunked Transfer-Encoding comes first. This commit
fixes this by ordering the stack with transfer encodings first.
Reviewed-by: Patrick Monnerat
Closes#10187
Detecting headers and lib separately makes sense when headers come in
variations or with extra ones, but this wasn't the case here. These were
duplicate/parallel macros that we had to keep in sync with each other
for a working build. This patch leaves a single macro for each of these
dependencies:
- Rely on `HAVE_LIBZ`, delete parallel `HAVE_ZLIB_H`.
Also delete CMake logic making sure these two were in sync, along with
a toggle to turn off that logic, called `CURL_SPECIAL_LIBZ`.
Also delete stray `HAVE_ZLIB` defines.
There is also a `USE_ZLIB` variant in `lib/config-dos.h`. This patch
retains it for compatibility and deprecates it.
- Rely on `USE_LIBSSH2`, delete parallel `HAVE_LIBSSH2_H`.
Also delete `LIBSSH2_WIN32`, `LIBSSH2_LIBRARY` from
`winbuild/MakefileBuild.vc`, these have a role when building libssh2
itself. And `CURL_USE_LIBSSH`, which had no use at all.
Also delete stray `HAVE_LIBSSH2` defines.
- Rely on `USE_LIBSSH`, delete parallel `HAVE_LIBSSH_LIBSSH_H`.
Also delete `LIBSSH_WIN32`, `LIBSSH_LIBRARY` and `HAVE_LIBSSH` from
`winbuild/MakefileBuild.vc`, these were the result of copy-pasting the
libssh2 line, and were not having any use.
- Delete unused `HAVE_LIBPSL_H` and `HAVE_LIBPSL`.
Reviewed-by: Daniel Stenberg
Closes#9652
The variable-sized encoding-specific storage of a struct contenc_writer
currently relies on void * alignment that may be insufficient with
regards to the specific storage fields, although having not caused any
problems yet.
In addition, gcc 11.3 issues a warning on access to fields of partially
allocated structures that can occur when the specific storage size is 0:
content_encoding.c: In function ‘Curl_build_unencoding_stack’:
content_encoding.c:980:21: warning: array subscript ‘struct contenc_writer[0]’ is partly outside array bounds of ‘unsigned char[16]’ [-Warray-bounds]
980 | writer->handler = handler;
| ~~~~~~~~~~~~~~~~^~~~~~~~~
In file included from content_encoding.c:49:
memdebug.h:115:29: note: referencing an object of size 16 allocated by ‘curl_dbg_calloc’
115 | #define calloc(nbelem,size) curl_dbg_calloc(nbelem, size, __LINE__, __FILE__)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
content_encoding.c:977:60: note: in expansion of macro ‘calloc’
977 | struct contenc_writer *writer = (struct contenc_writer *)calloc(1, sz);
To solve both these problems, the current commit replaces the
contenc_writer/params structure pairs by "subclasses" of struct
contenc_writer. These are structures that contain a contenc_writer at
offset 0. Proper field alignment is therefore handled by the compiler and
full structure allocation is performed, silencing the warnings.
Closes#9455
Instances of ISSPACE() use that should rather use ISBLANK(). I think
somewhat carelessly used because it sounds as if it checks for space or
whitespace, but also includes %0a to %0d.
For parsing purposes, we should only accept what we must and not be
overly liberal. It leads to surprises and surprises lead to bad things.
Closes#9432
Add licensing and copyright information for all files in this repository. This
either happens in the file itself as a comment header or in the file
`.reuse/dep5`.
This commit also adds a Github workflow to check pull requests and adapts
copyright.pl to the changes.
Closes#8869
Commit b5a434f7f0 inhibits the warning
on implicit fallthrough cases, since the current coding of indicating
fallthrough with comments is falling out of fashion with new compilers.
This attempts to make the issue smaller by rewriting fallthroughs to no
longer fallthrough, via either breaking the cases or turning switch
statements into if statements.
lib/content_encoding.c: the fallthrough codepath is simply copied
into the case as it's a single line.
lib/http_ntlm.c: the fallthrough case skips a state in the state-
machine and fast-forwards to NTLMSTATE_LAST. Do this before the
switch statement instead to set up the states that we actually
want.
lib/http_proxy.c: the fallthrough is just falling into exiting the
switch statement which can be done easily enough in the case.
lib/mime.c: switch statement rewritten as if statement.
lib/pop3.c: the fallthrough case skips to the next state in the
statemachine, do this explicitly instead.
lib/urlapi.c: switch statement rewritten as if statement.
lib/vssh/wolfssh.c: the fallthrough cases fast-forwards the state
machine, do this by running another iteration of the switch
statement instead.
lib/vtls/gtls.c: switch statement rewritten as if statement.
lib/vtls/nss.c: the fallthrough codepath is simply copied into the
case as it's a single line. Also twiddle a comment to not be
inside a non-brace if statement.
Closes: #7322
See-also: #7295
Reviewed-by: Daniel Stenberg <daniel@haxx.se>
... in most cases instead of 'struct connectdata *' but in some cases in
addition to.
- We mostly operate on transfers and not connections.
- We need the transfer handle to log, store data and more. Everything in
libcurl is driven by a transfer (the CURL * in the public API).
- This work clarifies and separates the transfers from the connections
better.
- We should avoid "conn->data". Since individual connections can be used
by many transfers when multiplexing, making sure that conn->data
points to the current and correct transfer at all times is difficult
and has been notoriously error-prone over the years. The goal is to
ultimately remove the conn->data pointer for this reason.
Closes#6425
include zstd curl patch for Makefile.m32 from vszakats
and include Add CMake support for zstd from Peter Wu
Helped-by: Viktor Szakats
Helped-by: Peter Wu
Closes#5453
- Stick to a single unified way to use structs
- Make checksrc complain on 'typedef struct {'
- Allow them in tests, public headers and examples
- Let MD4_CTX, MD5_CTX, and SHA256_CTX typedefs remain as they actually
typedef different types/structs depending on build conditions.
Closes#5338
Some servers issue raw deflate data that may be followed by an undocumented
trailer. This commit makes curl tolerate such a trailer of up to 4 bytes
before considering the data is in error.
Reported-by: clbr on github
Fixes#2719
- Get rid of variable that was generating false positive warning
(unitialized)
- Fix issues in tests
- Reduce scope of several variables all over
etc
Closes#2631
When a zeroed out allocation is required, use calloc() rather than
malloc() followed by an explicit memset(). The result will be the
same, but using calloc() everywhere increases consistency in the
codebase and avoids the risk of subtle bugs when code is injected
between malloc and memset by accident.
Closes https://github.com/curl/curl/pull/2497
Some servers return a "content-encoding" header with a non-standard
"none" value.
Add "none" as an alias to "identity" as a work-around, to avoid
unrecognised content encoding type errors.
Signed-off-by: Mohammad AlSaleh <CE.Mohammad.AlSaleh@gmail.com>
Closes https://github.com/curl/curl/pull/2298
Decoding loop implementation did not concern the case when all
received data is consumed by Brotli decoder and the size of decoded
data internally hold by Brotli decoder is greater than CURL_MAX_WRITE_SIZE.
For content with unencoded length greater than CURL_MAX_WRITE_SIZE this
can result in the loss of data at the end of content.
Closes#2194
- When zlib version is < 1.2.0.4, process gzip trailer before considering
extra data as an error.
- Inflate with Z_BLOCK instead of Z_SYNC_FLUSH to maximize correct data
and minimize corrupt data output.
- Do not try to restart deflate decompression in raw mode if output has
started or if the leading data is not available anymore.
- New test 232 checks inflating raw-deflated content.
Closes#2068
There is a conflict on symbol 'free_func' between openssl/crypto.h and
zlib.h on AIX. This is an attempt to resolve it.
Bug: https://curl.haxx.se/mail/lib-2017-11/0032.html
Reported-By: Michael Felt
- Don't call zlib's inflate() when avail_in stream bytes is 0.
This is a follow up to the parent commit 19e66e5. Prior to that change
libcurl's inflate_stream could call zlib's inflate even when no bytes
were available, causing inflate to return Z_BUF_ERROR, and then
inflate_stream would treat that as a hard error and return
CURLE_BAD_CONTENT_ENCODING.
According to the zlib FAQ, Z_BUF_ERROR is not fatal.
This bug would happen randomly since packet sizes are arbitrary. A test
of 10,000 transfers had 55 fail (ie 0.55%).
Ref: https://zlib.net/zlib_faq.html#faq05
Closes https://github.com/curl/curl/pull/2060
This uses the brotli external library (https://github.com/google/brotli).
Brotli becomes a feature: additional curl_version_info() bit and
structure fields are provided for it and CURLVERSION_NOW bumped.
Tests 314 and 315 check Brotli content unencoding with correct and
erroneous data.
Some tests are updated to accomodate with the now configuration dependent
parameters of the Accept-Encoding header.
This is implemented as an output streaming stack of unencoders, the last
calling the client write procedure.
New test 230 checks this feature.
Bug: https://github.com/curl/curl/pull/2002
Reported-By: Daniel Bankhead
This commit renames lib/setup.h to lib/curl_setup.h and
renames lib/setup_once.h to lib/curl_setup_once.h.
Removes the need and usage of a header inclusion guard foreign
to libcurl. [1]
Removes the need and presence of an alarming notice we carried
in old setup_once.h [2]
----------------------------------------
1 - lib/setup_once.h used __SETUP_ONCE_H macro as header inclusion guard
up to commit ec691ca3 which changed this to HEADER_CURL_SETUP_ONCE_H,
this single inclusion guard is enough to ensure that inclusion of
lib/setup_once.h done from lib/setup.h is only done once.
Additionally lib/setup.h has always used __SETUP_ONCE_H macro to
protect inclusion of setup_once.h even after commit ec691ca3, this
was to avoid a circular header inclusion triggered when building a
c-ares enabled version with c-ares sources available which also has
a setup_once.h header. Commit ec691ca3 exposes the real nature of
__SETUP_ONCE_H usage in lib/setup.h, it is a header inclusion guard
foreign to libcurl belonging to c-ares's setup_once.h
The renaming this commit does, fixes the circular header inclusion,
and as such removes the need and usage of a header inclusion guard
foreign to libcurl. Macro __SETUP_ONCE_H no longer used in libcurl.
2 - Due to the circular interdependency of old lib/setup_once.h and the
c-ares setup_once.h header, old file lib/setup_once.h has carried
back from 2006 up to now days an alarming and prominent notice about
the need of keeping libcurl's and c-ares's setup_once.h in sync.
Given that this commit fixes the circular interdependency, the need
and presence of mentioned notice is removed.
All mentioned interdependencies come back from now old days when
the c-ares project lived inside a curl subdirectory. This commit
removes last traces of such fact.
This reverts renaming and usage of lib/*.h header files done
28-12-2012, reverting 2 commits:
f871de0... build: make use of 76 lib/*.h renamed files
ffd8e12... build: rename 76 lib/*.h files
This also reverts removal of redundant include guard (redundant thanks
to changes in above commits) done 2-12-2013, reverting 1 commit:
c087374... curl_setup.h: remove redundant include guard
This also reverts renaming and usage of lib/*.c source files done
3-12-2013, reverting 3 commits:
13606bb... build: make use of 93 lib/*.c renamed files
5b6e792... build: rename 93 lib/*.c files
7d83dff... build: commit 13606bbfde follow-up 1
Start of related discussion thread:
http://curl.haxx.se/mail/lib-2013-01/0012.html
Asking for confirmation on pushing this revertion commit:
http://curl.haxx.se/mail/lib-2013-01/0048.html
Confirmation summary:
http://curl.haxx.se/mail/lib-2013-01/0079.html
NOTICE: The list of 2 files that have been modified by other
intermixed commits, while renamed, and also by at least one
of the 6 commits this one reverts follows below. These 2 files
will exhibit a hole in history unless git's '--follow' option
is used when viewing logs.
lib/curl_imap.h
lib/curl_smtp.h