libxml2

mirror of https://github.com/GNOME/libxml2.git synced 2025-04-18 19:40:25 +08:00

Author	SHA1	Message	Date
Nick Wellnhofer	cd220b93d8	valid: Remove duplicate error messages when streaming	2024-12-28 11:55:24 +01:00
Nick Wellnhofer	e8fb3d639f	parser: Convert some "internal errors" to meaningful codes	2024-01-02 19:48:23 +01:00
Nick Wellnhofer	37c6618be5	parser: Rework parsing of attribute and entity values Don't use a separate function to handle "complex" attributes. Validate UTF-8 byte sequences without decoding. This should improve performance considerably when parsing multi-byte UTF-8 sequences. Use a string buffer to avoid unnecessary allocations and copying when expanding entities. Normalize attribute values in a single pass while expanding entities. Be more lenient in recovery mode. If no entity substitution was requested, validate entities without expanding. Fixes #596. Also fixes #655.	2024-01-02 15:42:03 +01:00
Nick Wellnhofer	d944a41515	parser: Fix in-parameter-entity and in-external-dtd checks Use in ctxt->input->entity instead of ctxt->inputNr to determine whether we are inside a parameter entity. Stop using ctxt->external to check whether we're in an external DTD. This is signaled by ctxt->inSubset == 2.	2023-12-29 01:19:56 +01:00
Nick Wellnhofer	7d446e9736	parser: Fix namespaces redefined from default attributes This regressed in commit e0dd330b. Also fixes a long-standing issue where namespaces from default attributes weren't added if they match an existing namespace. Fixes #643.	2023-12-08 12:19:16 +01:00
Nick Wellnhofer	b76d81dab3	parser: Fix regression when push parsing parameter entities Short-lived regression from 834b8123. Also shrink parameter entity buffers when push parsing.	2023-10-06 13:11:19 +02:00
Nick Wellnhofer	0ba22c0513	parser: Support encoded external PEs in entity values Corner case which was never supported.	2023-10-06 12:28:59 +02:00
Nick Wellnhofer	bbd918b2e7	parser: Fix detection of null bytes Also suppress misleading extra errors. Fixes #122.	2023-08-29 18:43:10 +02:00
David Kilzer	cb1b8b8516	xmlValidatePopElement() can return invalid value (-1) Covered by: test/VC/ElementValid5 This only affects XML Reader API with LIBXML_REGEXP_ENABLED and LIBXML_VALID_ENABLED turned on. * result/VC/ElementValid5.rdr: - Update result to add missing error message. * python/tests/reader2.py: * result/VC/ElementValid6.rdr: * result/VC/ElementValid7.rdr: * result/valid/781333.xml.err.rdr: - Update result to fix grammar issue. * valid.c: (xmlValidatePopElement): - Check return value of xmlRegExecPushString() to handle -1, and assign 'ret = 0;' to return 0 from xmlValidatePopElement(). This change affects xmlTextReaderValidatePop() from xmlreader.c. - Fix grammar of error message by changing 'child' to 'children'.	2023-04-10 13:21:53 -07:00
Nick Wellnhofer	74aa61e0bd	parser: Halt parser on DTD errors If we try to continue parsing after an error in the internal or external subset, entity expansion accounting gets more complicated. Simply halt the parser. Found with libFuzzer.	2023-01-24 11:32:15 +01:00
Nick Wellnhofer	f1c32b4c78	Allow missing result files in runtest Treat missing files as empty.	2022-04-04 04:28:15 +02:00
Nick Wellnhofer	48b03c8479	Remove major parts of old test suite Remove all the parts of the old test suite which are covered by runtest.c for quite some time. The following test programs are removed: - testC14N - testHTML - testReader - testRelax - testSAX - testSchemas - testURI - testXPath This also removes a few results of unimportant tests only run by the old test suite.	2022-04-04 04:14:55 +02:00
Nick Wellnhofer	f480f7509c	Update NewsML DTD in test suite Switch to version 1.2 which has a clearer license. Fixes #291.	2022-02-03 14:43:17 +01:00
Nick Wellnhofer	d85245f934	Fix regression with PEs in external DTD Fix a regression introduced with commit a28f7d87. In some cases, parameter entity references in external DTDs wouldn't be expanded. Fixes #306.	2022-01-16 21:56:10 +01:00
Nick Wellnhofer	01411e7c5e	Check for invalid redeclarations of predefined entities Implement section "4.6 Predefined Entities" of the XML 1.0 spec and check whether redeclarations of predefined entities match the original definitions. Note that some test cases declared <!ENTITY lt "&#60;"> But the XML spec clearly states that this is illegal: > If the entities lt or amp are declared, they MUST be declared as > internal entities whose replacement text is a character reference to > the respective character (less-than sign or ampersand) being escaped; > the double escaping is REQUIRED for these entities so that references > to them produce a well-formed result. Also fixes #217 but the connection is only tangential. The integer overflow discovered by fuzzing was more related to the fact that various parts of the parser disagreed on whether to prefer predefined entities over their redeclarations. The whole situation is a mess and even depends on legacy parser options. But now that redeclarations are validated, it shouldn't make a difference. As noted in the added comment, this is also one of the cases where overly defensive checks can hide interesting logic bugs from fuzzers.	2021-02-08 21:51:26 +01:00
Jared Yanovich	2a350ee9b4	Large batch of typo fixes Closes #109.	2019-09-30 18:04:38 +02:00
Nick Wellnhofer	c51e38cb3a	Make xmlParseConditionalSections non-recursive Avoid call stack overflow in deeply nested conditional sections. Found by OSS-Fuzz.	2019-09-30 15:47:30 +02:00
Nick Wellnhofer	872fea9485	Get rid of "blanks wrapper" for parameter entities Now that replacement of parameter entities goes exclusively through xmlSkipBlankChars, we can account for the surrounding space characters there and remove the "blanks wrapper" hack.	2017-06-20 13:19:47 +02:00
Nick Wellnhofer	5f440d8cad	Rework entity boundary checks Make sure to finish all entities in the internal subset. Nevertheless, readd a sanity check in xmlParseStartTag2 that was lost in my previous commit. Also add a sanity check in xmlPopInput. Popping an input unexpectedly was the source of many recent memory bugs. The check doesn't mitigate such issues but helps with diagnosis. Always base entity boundary checks on the input ID, not the input pointer. The pointer could have been reallocated to the old address. Always throw a well-formedness error if a boundary check fails. In a few places, a validity error was thrown. Fix a few error codes and improve indentation.	2017-06-17 13:25:53 +02:00
Nick Wellnhofer	932cc9896a	Fix buffer size checks in xmlSnprintfElementContent xmlSnprintfElementContent failed to correctly check the available buffer space in two locations. Fixes bug 781333 (CVE-2017-9047) and bug 781701 (CVE-2017-9048). Thanks to Marcel Böhme and Thuan Pham for the report.	2017-06-05 19:38:19 +02:00
Nick Wellnhofer	e26630548e	Fix handling of parameter-entity references There were two bugs where parameter-entity references could lead to an unexpected change of the input buffer in xmlParseNameComplex and xmlDictLookup being called with an invalid pointer. Percent sign in DTD Names ========================= The NEXTL macro used to call xmlParserHandlePEReference. When parsing "complex" names inside the DTD, this could result in entity expansion which created a new input buffer. The fix is to simply remove the call to xmlParserHandlePEReference from the NEXTL macro. This is safe because no users of the macro require expansion of parameter entities. - xmlParseNameComplex - xmlParseNCNameComplex - xmlParseNmtoken The percent sign is not allowed in names, which are grammatical tokens. - xmlParseEntityValue Parameter-entity references in entity values are expanded but this happens in a separate step in this function. - xmlParseSystemLiteral Parameter-entity references are ignored in the system literal. - xmlParseAttValueComplex - xmlParseCharDataComplex - xmlParseCommentComplex - xmlParsePI - xmlParseCDSect Parameter-entity references are ignored outside the DTD. - xmlLoadEntityContent This function is only called from xmlStringLenDecodeEntities and entities are replaced in a separate step immediately after the function call. This bug could also be triggered with an internal subset and double entity expansion. This fixes bug 766956 initially reported by Wei Lei and independently by Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone involved. xmlParseNameComplex with XML_PARSE_OLD10 ======================================== When parsing Names inside an expanded parameter entity with the XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the GROW macro if the input buffer was exhausted. At the end of the parameter entity's replacement text, this function would then call xmlPopInput which invalidated the input buffer. There should be no need to invoke GROW in this situation because the buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and, at least for UTF-8, in xmlCurrentChar. This also matches the code path executed when XML_PARSE_OLD10 is not set. This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050). Thanks to Marcel Böhme and Thuan Pham for the report. Additional hardening ==================== A separate check was added in xmlParseNameComplex to validate the buffer size.	2017-06-05 18:38:33 +02:00
Daniel Veillard	a7a94612aa	Heap-based buffer overread in xmlNextChar For https://bugzilla.gnome.org/show_bug.cgi?id=759671 when the end of the internal subset isn't properly detected xmlParseInternalSubset should just return instead of trying to process input further.	2016-02-09 12:55:29 +01:00
Daniel Veillard	ef709ce2f7	Fix the spurious ID already defined error For https://bugzilla.gnome.org/show_bug.cgi?id=737840 the fix for 724903 introduced a regression on external entities carrying IDs, revert that patch in part and add a specific test to avoid readding it	2015-09-10 19:46:46 +08:00
Daniel Veillard	483272f3f0	Added a regression tests from bug 694228 data Provided by Mark Rowe <mrowe@apple.com>	2013-03-27 13:37:14 +08:00
Daniel Veillard	a7982ce272	Adding streaming validation to runtest checks	2012-10-25 15:39:39 +08:00
Daniel Veillard	e7bf892d8c	Improve error reporting on parser errors The extra string was being dismissed when provided. * parser.c: handle bot case properly * result/: this changes a few error reports	2012-07-30 20:09:25 +08:00
Daniel Veillard	cb3549e30a	Improve the error report on undefined REFs Use the tree node to provide the error context instead of the parser input which is not relevant anymore, based on a suggestion by François Delyon <f.delyon@satimage.fr>	2011-11-11 13:43:51 +08:00
Daniel Veillard	a721612e54	446613 small validation bug mixed content with NS * valid.c: fix a bug when valdating mixed content lists and some name use namespaces prefixes. * result/valid/notes.xml* test/valid/dtds/notes.dtd * test/valid/notes.xml: add the test case to the regression suite	2009-08-21 18:22:58 +02:00
Daniel Veillard	8bf64aef50	fix a problem reported by Ashwin for system parameter entities referenced * parser.c: fix a problem reported by Ashwin for system parameter entities referenced from entities in external subset, add a specific loading routine. * test/valid/dtds/external.ent test/valid/dtds/external2.ent test/valid/t11.xml result/valid/t11.xml*: added the test to the regression suite Daniel svn path=/trunk/; revision=3713	2008-03-24 20:45:21 +00:00
Daniel Veillard	57c9db0725	poblem with encoding detection for UTF-16 reported by Ashwin and found by * encoding.c: poblem with encoding detection for UTF-16 reported by Ashwin and found by Bill * test/valid/dtds/utf16b.ent test/valid/dtds/utf16l.ent test/valid/UTF16Entity.xml result/valid/UTF16Entity.xml*: added the example to the regression tests Daniel svn path=/trunk/; revision=3700	2008-03-06 14:37:10 +00:00
Daniel Veillard	9668826368	fixed bug #170489 reported by Jirka Kosek added the test to the regression * parser.c: fixed bug #170489 reported by Jirka Kosek * test/valid/objednavka.xml test/valid/dtds/objednavka.dtd result/valid/objednavka*: added the test to the regression suite. Daniel	2005-08-23 18:14:12 +00:00
William M. Brack	4119d1c61d	implemented bugfix from Massimo Morara for DTD dumping problem. added * valid.c: implemented bugfix from Massimo Morara for DTD dumping problem. * test/valid/t10.xml, result/valid/t10.: added regression for above configure.in: small change for my profile settings	2004-06-24 02:24:44 +00:00
Daniel Veillard	d45325589d	fixed #127877 , never output &quot; in element content this changes the * entities.c: fixed #127877, never output &quot; in element content * result/isolat3 result/slashdot16.xml result/noent/isolat3 result/noent/slashdot16.xml result/valid/REC-xml-19980210.xml result/valid/index.xml result/valid/xlink.xml: this changes the output of a few tests Daniel	2003-11-25 18:29:55 +00:00
Daniel Veillard	e70c877c83	swapped the attribute defaulting and attribute checking parts of parsing a * parser.c: swapped the attribute defaulting and attribute checking parts of parsing a new element start, fixes bug #127772 * result/valid/127772.* test/valid/127772.xml test/valid/dtds/127772.dtd: added the example in the regression tests Daniel	2003-11-25 07:21:18 +00:00
Daniel Veillard	05bcb7ed30	fixed to not send NULL to %s printing cleaning up some of the regression * HTMLparser.c: fixed to not send NULL to %s printing * python/tests/error.py result/HTML/doc3.htm.err result/HTML/test3.html.err result/HTML/wired.html.err result/valid/t8.xml.err result/valid/t8a.xml.err: cleaning up some of the regression tests error Daniel	2003-10-19 14:26:34 +00:00
Daniel Veillard	d96f6d3429	cleaning up XPath error reporting that time. applied the two patches for * error.c include/libxml/xmlerror.h include/libxml/xpath.h include/libxml/xpathInternals.h xpath.c: cleaning up XPath error reporting that time. * threads.c: applied the two patches for TLS threads on Windows from Jesse Pelton * parser.c: tiny safety patch for xmlStrPrintf() make sure the return is always zero terminated. Should also help detecting passing wrong buffer size easilly. * result/VC/* result/valid/rss.xml.err result/valid/xlink.xml.err: updated the results to follow the errors string generated by last commit. Daniel	2003-10-07 21:25:12 +00:00
Daniel Veillard	bb5ababa28	more cleanup in make tests more work in the transition to the new error * Makefile.am: more cleanup in make tests * error.c valid.c parser.c include/libxml/xmlerror.h: more work in the transition to the new error reporting strategy. * python/tests/reader2.py result/VC/* result/valid/*: few changes in the strings generated by the validation output Daniel	2003-10-03 22:21:51 +00:00
Daniel Veillard	2b8c4a151b	changed 'make tests' to use a concise output, scrolling to see where thing * Makefile.am: changed 'make tests' to use a concise output, scrolling to see where thing broke wasn't pleasant * configure.in: some beta4 preparation, but not ready yet * error.c globals.c include/libxml/globals.h include/libxml/xmlerror.h: new error handling code, last error informations are stored in the parsing context or a global variable, new APIs to handle the xmlErrorPtr type. * parser.c parserInternals.c valid.c : started migrating to the new error handling code, it's a royal pain. * include/libxml/parser.h include/libxml/parserInternals.h: moved the definition of xmlNewParserCtxt() * parser.c: small potential buffer access problem in push code provided by Justin Fletcher * result/.sax result/VC/PENesting result/namespaces/* result/valid/*.err: some error messages were sligthly changed. Daniel	2003-10-02 22:28:19 +00:00
Daniel Veillard	d9e9c9d8f3	fixing namespace DTD validations the output of defaulted namespaces is * SAX2.c: fixing namespace DTD validations * result/valid/ns2.xml result/valid/ns.xml: the output of defaulted namespaces is slightly different now. * Makefile.am: report the memory used in Timingtests (as well as time) Daniel	2003-09-18 22:03:46 +00:00
Daniel Veillard	bdbe0d4e78	factoring of more error handling code, serious size reduction and more * parser.c include/libxml/xmlerror.h: factoring of more error handling code, serious size reduction and more lisibility of the resulting code. * parserInternals.c parser.c include/libxml/parserInternals.h include/libxml/parser.h: changing the way VC:Proper Group/PE Nesting checks are done, use a counter for entities. Entities where freed and reallocated at the same address failing the check. * tree.c: avoid a warning * result/valid/* result/VC/*: this slightly changes some validation error messages. Daniel	2003-09-14 19:56:14 +00:00
Daniel Veillard	7b68df974b	fixed bug #118712 about mixed content, and namespaced element names. added * valid.c: fixed bug #118712 about mixed content, and namespaced element names. * test/valid/mixed_ns.xml result/valid/mixed_ns*: added a check in the regression tests Daniel	2003-08-03 22:58:54 +00:00
Daniel Veillard	8265a18a6a	do not generate &quot; for " outside of attributes this changes the output * entities.c: do not generate &quot; for " outside of attributes * result//*: this changes the output of some tests Daniel	2003-06-13 10:05:56 +00:00
William M. Brack	3b811174f7	Updated testfiles for error.c fix	2003-05-14 02:53:43 +00:00
Daniel Veillard	f431eb8144	applied the patch provided by Brent Hendricks fixing #105992 and * SAX.c test/valid/ns* test/result/ns*: applied the patch provided by Brent Hendricks fixing #105992 and integrated the examples in the testsuite. Daniel	2003-04-22 08:37:26 +00:00
Daniel Veillard	ef8dd7be29	fixing bug #108976 get the ID/REFs to reference the ID in the document * parser.c: fixing bug #108976 get the ID/REFs to reference the ID in the document content and not in the entity copy * SAX.c include/libxml/parser.h: more checking of the ID/REF stuff, better solution for #107208 * xmlregexp.c: removed a direct printf, dohhh * xmlreader.c: fixed a bug on streaming validation of empty elements in entities * result/VC/ElementValid8 test/VCM/v20.xml result/valid/xhtml1.xhtml: cleanup of the validation tests * test/valid/id* test/valid/dtds/destfoo.ent result/valid/id*: added more ID/IDREF tests to the suite Daniel	2003-03-23 12:02:56 +00:00
Daniel Veillard	d5c2f92df4	modified the existing APIs to handle XHTML1 serialization rules * tree.c include/libxml/tree.h: modified the existing APIs to handle XHTML1 serialization rules automatically, also add xmlIsXHTML() to libxml2 API. Some tweaking to make sure libxslt serialization uses it when needed without changing the library API. * test/xhtml1 result/noent/xhtml1 result/valid/xhtml1.xhtml result/xhtml1: added a new test specifically for xhtml1 output and updated the result of one XHTML1 test Daniel	2002-11-21 14:10:52 +00:00
Daniel Veillard	90d68fbb35	fixed bug #92518 validation error were not covering namespace * SAX.c valid.c include/libxml/valid.h: fixed bug #92518 validation error were not covering namespace declarations. * result/valid/dia.xml test/valid/dia.xml: the test wasn't valid, it was missing the attribute declaration for the namespace * result/VC/NS3: the fix now report breakages in that test Daniel	2002-09-26 16:10:21 +00:00
Daniel Veillard	76575769f3	working on better error reporting of validity errors, especially providing * error.c valid.c: working on better error reporting of validity errors, especially providing an accurate context. * result/valid/xlink.xml.err result/valid/rss.xml.err: better error reports in those cases. Daniel	2002-09-05 14:21:15 +00:00
Daniel Veillard	58e44c9daf	adding a new API for Christian Glahn: xmlParseBalancedChunkMemoryRecover * parser.c include/libxml/parser.h: adding a new API for Christian Glahn: xmlParseBalancedChunkMemoryRecover * valid.c: patch from Rick Jones for some grammar cleanup in validation messages * result/VC/* result/valid/*: this slightly change some of the regression tests outputs Daniel	2002-08-02 22:19:49 +00:00
Daniel Veillard	f5582f156c	applied a couple of patches from Peter Jacobi to start to get rid of * parser.c: applied a couple of patches from Peter Jacobi to start to get rid of ctxt->token, with a possible significant speed improvement to be gained once done. Better compliance with PE references constructs in DTDs too. * test/valid/t[0-9]* result/valid/t[0-9]*: added a set of tests from Peter too Daniel	2002-06-11 10:08:16 +00:00


66 Commits