postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2025-01-30 19:00:29 +08:00

Andrew Gierth c8ea87e4bd Avoid quadratic slowdown in regexp match/split functions.		2018-08-28 12:17:33 +01:00

regexp_matches, regexp_split_to_table and regexp_split_to_array all work by compiling a list of match positions as character offsets (NOT byte positions) in the source string.

Formerly, they then used text_substr to extract the matched text; but in a multi-byte encoding, that counts the characters in the string, and the characters needed to reach the starting byte position, on every call. Accordingly, the performance degraded as the product of the input string length and the number of match positions, such that splitting a string of a few hundred kbytes could take many minutes.

Repair by keeping the wide-character copy of the input string available (only in the case where encoding_max_length is not 1) after performing the match operation, and extracting substrings from that instead. This reduces the complexity to being linear in the number of result bytes, discounting the actual regexp match itself (which is not affected by this patch).

In passing, remove cleanup using retail pfree() which was obsoleted by commit `ff428cded` (Feb 2008) which made cleanup of SRF multi-call contexts automatic. Also increase (to ~134 million) the maximum number of matches and provide an error message when it is reached.

Backpatch all the way because this has been wrong forever.

Analysis and patch by me; review by Kaiting Chen.

Discussion: https://postgr.es/m/87pnyn55qh.fsf@news-spur.riddles.org.uk
see also https://postgr.es/m/87lg996g4r.fsf@news-spur.riddles.org.uk
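
The cost difference described in the commit message can be illustrated with a small standalone sketch. This is not the PostgreSQL code: it assumes valid UTF-8 input, and every name in it (`utf8_seq_len`, `substr_rescan`, `decode_utf8`, `substr_wide`) is hypothetical. `substr_rescan` mimics the old behaviour by walking from the start of the string on every call, while `decode_utf8` plus `substr_wide` decode the string once and then extract each substring in time proportional to its own length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Byte length of the UTF-8 sequence whose lead byte is b (valid input assumed). */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Slow path: convert a character offset to a byte offset by rescanning
 * from the start of s on every call, so each call costs O(start + len). */
char *substr_rescan(const char *s, size_t start, size_t len)
{
    const char *p = s;
    for (size_t i = 0; i < start && *p; i++)
        p += utf8_seq_len((unsigned char) *p);
    const char *q = p;
    for (size_t i = 0; i < len && *q; i++)
        q += utf8_seq_len((unsigned char) *q);
    size_t nbytes = (size_t) (q - p);
    char *out = malloc(nbytes + 1);
    assert(out != NULL);
    memcpy(out, p, nbytes);
    out[nbytes] = '\0';
    return out;
}

/* Decode the whole string once into an array of code points. */
uint32_t *decode_utf8(const char *s, size_t *nchars)
{
    const unsigned char *p = (const unsigned char *) s;
    uint32_t *w = malloc((strlen(s) + 1) * sizeof(uint32_t));
    size_t n = 0;
    assert(w != NULL);
    while (*p)
    {
        size_t l = utf8_seq_len(*p);
        uint32_t cp = (l == 1) ? p[0]
                    : (l == 2) ? (uint32_t) (p[0] & 0x1F)
                    : (l == 3) ? (uint32_t) (p[0] & 0x0F)
                    : (uint32_t) (p[0] & 0x07);
        for (size_t k = 1; k < l; k++)
            cp = (cp << 6) | (uint32_t) (p[k] & 0x3F);
        w[n++] = cp;
        p += l;
    }
    *nchars = n;
    return w;
}

/* Fast path: re-encode code points w[start .. start+len) as UTF-8.
 * The character offset is a direct array index, so the cost is O(len). */
char *substr_wide(const uint32_t *w, size_t nchars, size_t start, size_t len)
{
    assert(start <= nchars);
    if (len > nchars - start)
        len = nchars - start;
    char *out = malloc(len * 4 + 1);
    assert(out != NULL);
    unsigned char *o = (unsigned char *) out;
    for (size_t i = start; i < start + len; i++)
    {
        uint32_t cp = w[i];
        if (cp < 0x80)
            *o++ = (unsigned char) cp;
        else if (cp < 0x800)
        {
            *o++ = (unsigned char) (0xC0 | (cp >> 6));
            *o++ = (unsigned char) (0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            *o++ = (unsigned char) (0xE0 | (cp >> 12));
            *o++ = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
            *o++ = (unsigned char) (0x80 | (cp & 0x3F));
        }
        else
        {
            *o++ = (unsigned char) (0xF0 | (cp >> 18));
            *o++ = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
            *o++ = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
            *o++ = (unsigned char) (0x80 | (cp & 0x3F));
        }
    }
    *o = '\0';
    return out;
}
```

With m match positions in a string of n characters, calling `substr_rescan` for each match does work proportional to n times m in the worst case, whereas one `decode_utf8` pass followed by m calls to `substr_wide` does work proportional to n plus the total size of the results. The sketch trades extra memory for the decoded array against that saving, and it skips the single-byte-encoding case, where character offsets already equal byte offsets.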
..
backend	Avoid quadratic slowdown in regexp match/split functions.	2018-08-28 12:17:33 +01:00
bin	pg_verify_checksums: Message style improvements and NLS support	2018-08-28 11:49:11 +02:00
common	Require a C99-compliant snprintf(), and remove related workarounds.	2018-08-16 13:01:09 -04:00
fe_utils	Fix lexing of standard multi-character operators in edge cases.	2018-08-23 21:42:40 +01:00
include	Code review for simplehash.h.	2018-08-28 12:32:22 +12:00
interfaces	Fix lexing of standard multi-character operators in edge cases.	2018-08-23 21:42:40 +01:00
makefiles	Provide for contrib and pgxs modules to install include files.	2018-07-31 20:07:39 +01:00
pl	Fix snapshot leak warning for some procedures	2018-08-27 22:16:15 +02:00
port	Clean up assorted misuses of snprintf()'s result value.	2018-08-15 16:29:31 -04:00
template
test	Improve VACUUM and ANALYZE by avoiding early lock queue	2018-08-27 09:11:12 +09:00
timezone	Update time zone data files to tzdata release 2018e.	2018-05-09 13:56:22 -04:00
tools	Require C99 (and thus MSCV 2013 upwards).	2018-08-23 18:33:57 -07:00
tutorial	Deduplicate "invalid input syntax" messages for various types.	2018-07-22 14:58:01 -07:00
.gitignore
DEVELOPERS
Makefile
Makefile.global.in	Ensure we build generated headers at the start of some more cases.	2018-07-30 18:04:39 -04:00
Makefile.shlib
nls-global.mk