postgresql/contrib
Tom Lane afb0d0712f Replace the data structure used for keyword lookup.
Previously, ScanKeywordLookup was passed an array of string pointers.
This had some performance deficiencies: the strings themselves might
be scattered all over the place depending on the compiler (and some
quick checking shows that at least with gcc-on-Linux, they indeed
weren't reliably close together).  That led to very cache-unfriendly
behavior as the binary search touched strings in many different pages.
Also, depending on the platform, the string pointers might need to
be adjusted at program start, so that they couldn't be simple constant
data.  And the ScanKeyword struct had been designed with an eye to
32-bit machines originally; on 64-bit it requires 16 bytes per
keyword, making it even more cache-unfriendly.
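
To make the 16-byte figure concrete, here is a minimal sketch of an
entry that pairs a string pointer with two 16-bit fields.  The struct
name, field names, and helper function are assumptions made for
illustration; they are not copied from the source tree.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a pointer-based keyword entry.
 * On a typical 64-bit ABI: 8 bytes of pointer + 2 + 2 bytes of data,
 * then 4 bytes of tail padding so array elements stay pointer-aligned.
 */
typedef struct OldStyleKeyword
{
	const char *name;		/* pointer to the keyword text, stored elsewhere */
	int16_t		value;		/* token code */
	int16_t		category;	/* keyword category */
} OldStyleKeyword;

/* Size of one array element, padding included. */
size_t
old_style_entry_size(void)
{
	return sizeof(OldStyleKeyword);
}
```

On common 64-bit platforms old_style_entry_size() would be expected to
return 16, of which only 4 bytes are token code and category.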

Redesign so that the keyword strings themselves are allocated
consecutively (as part of one big char-string constant), thereby
eliminating the touch-lots-of-unrelated-pages syndrome.  And get
rid of the ScanKeyword array in favor of three separate arrays:
uint16 offsets into the keyword array, uint16 token codes, and
uint8 keyword categories.  That reduces the overhead per keyword
to 5 bytes instead of 16 (even less in programs that only need
one of the token codes and categories); moreover, the binary search
only touches the offsets array, further reducing its cache footprint.
This also lets us put the token codes somewhere other than where the
keyword strings are, which avoids some unpleasant build dependencies.
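
The layout described above can be sketched roughly as follows.  This
is an illustration only, not the generated code: every identifier
(demo_kw_string, demo_kw_offsets, demo_kw_tokens, demo_kw_categories,
demo_keyword_lookup), the four keywords, and the token and category
values are hypothetical.  The sketch also skips the case folding and
length checks a real scanner lookup would need.

```c
#include <stdint.h>
#include <string.h>

/*
 * All keywords in sorted order inside one string constant.  Each
 * keyword is NUL-terminated, so the bytes sit next to each other.
 */
const char	demo_kw_string[] =
	"abort\0"
	"begin\0"
	"commit\0"
	"select";

/* Byte offset of each keyword within demo_kw_string. */
const uint16_t demo_kw_offsets[] = {0, 6, 12, 19};

/* Parallel arrays, indexed by keyword number (made-up values). */
const uint16_t demo_kw_tokens[] = {258, 259, 260, 261};
const uint8_t demo_kw_categories[] = {0, 0, 0, 1};

/*
 * Binary search over the offsets array.  Returns the keyword number,
 * or -1 if the word is not a keyword.  The token and category arrays
 * are never read during the search.
 */
int
demo_keyword_lookup(const char *word)
{
	int			low = 0;
	int			high = (int) (sizeof(demo_kw_offsets) /
							  sizeof(demo_kw_offsets[0])) - 1;

	while (low <= high)
	{
		int			mid = low + (high - low) / 2;
		int			cmp = strcmp(demo_kw_string + demo_kw_offsets[mid], word);

		if (cmp == 0)
			return mid;
		if (cmp < 0)
			low = mid + 1;
		else
			high = mid - 1;
	}
	return -1;
}
```

A caller that needs the token code would index demo_kw_tokens with the
returned keyword number; a caller that only wants to know whether a
word is a keyword can omit the token and category arrays entirely.
The per-keyword overhead is 2 + 2 + 1 = 5 bytes when all three arrays
are present.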

While we're at it, wrap the data used by ScanKeywordLookup into
a struct that can be treated as an opaque type by most callers.
That doesn't change things much right now, but it will make it
less painful to switch to a hash-based lookup method, as is being
discussed in the mailing list thread.

Most of the change here is associated with adding a generator
script that can build the new data structure from the same
list-of-PG_KEYWORD header representation we used before.
The PG_KEYWORD lists that plpgsql and ecpg used to embed in
their scanner .c files have to be moved into headers, and the
Makefiles have to be taught to invoke the generator script.
This work is also necessary if we're to consider hash-based lookup,
since the generator script is what would be responsible for
constructing a hash table.

Aside from saving a few kilobytes in each program that includes
the keyword table, this seems to speed up raw parsing (flex+bison)
by a few percent.  So it's worth doing even as it stands, though
we think we can gain even more with a follow-on patch to switch
to hash-based lookup.

John Naylor, with further hacking by me

Discussion: https://postgr.es/m/CAJVSVGXdFVU2sgym89XPL=Lv1zOS5=EHHQ8XWNzFL=mTXkKMLw@mail.gmail.com
2019-01-06 17:02:57 -05:00
adminpack Update copyright for 2019 2019-01-02 12:44:25 -05:00
amcheck Update copyright for 2019 2019-01-02 12:44:25 -05:00
auth_delay Update copyright for 2019 2019-01-02 12:44:25 -05:00
auto_explain Update copyright for 2019 2019-01-02 12:44:25 -05:00
bloom Update copyright for 2019 2019-01-02 12:44:25 -05:00
btree_gin Provide separate header file for built-in float types 2018-07-29 03:30:48 +02:00
btree_gist Remove WITH OIDS support, change oid catalog column visibility. 2018-11-20 16:00:17 -08:00
citext Add a 64-bit hash function for type citext. 2018-11-23 13:24:45 -05:00
cube Make float exponent output on Windows look the same as elsewhere. 2018-10-12 11:14:27 -04:00
dblink Update copyright for 2019 2019-01-02 12:44:25 -05:00
dict_int Update copyright for 2019 2019-01-02 12:44:25 -05:00
dict_xsyn Update copyright for 2019 2019-01-02 12:44:25 -05:00
earthdistance Fix earthdistance test suite function name typo. 2018-07-29 12:02:07 -07:00
file_fdw Update copyright for 2019 2019-01-02 12:44:25 -05:00
fuzzystrmatch Update copyright for 2019 2019-01-02 12:44:25 -05:00
hstore Fix hstore hash function for empty hstores upgraded from 8.4. 2018-11-24 09:59:49 +00:00
hstore_plperl Still further rethinking of build changes for macOS Mojave. 2018-10-18 14:55:23 -04:00
hstore_plpython Fix out-of-tree build for transform modules. 2018-09-16 18:46:45 +01:00
intagg Schema-qualify some references to regprocedure. 2016-06-10 10:41:58 -04:00
intarray Update copyright for 2019 2019-01-02 12:44:25 -05:00
isn Update copyright for 2019 2019-01-02 12:44:25 -05:00
jsonb_plperl Still further rethinking of build changes for macOS Mojave. 2018-10-18 14:55:23 -04:00
jsonb_plpython Remove redundant allocation 2018-10-05 17:10:58 +02:00
lo lo: Add test suite 2017-09-14 22:22:59 -04:00
ltree Allow btree comparison functions to return INT_MIN. 2018-10-05 16:01:29 -04:00
ltree_plpython Fix out-of-tree build for transform modules. 2018-09-16 18:46:45 +01:00
oid2name Add PGXS options to control TAP and isolation tests, take two 2018-12-03 09:27:35 +09:00
pageinspect Update copyright for 2019 2019-01-02 12:44:25 -05:00
passwordcheck Update copyright for 2019 2019-01-02 12:44:25 -05:00
pg_buffercache Remove WITH OIDS support, change oid catalog column visibility. 2018-11-20 16:00:17 -08:00
pg_freespacemap Default monitoring roles 2017-03-30 14:18:53 -04:00
pg_prewarm Update copyright for 2019 2019-01-02 12:44:25 -05:00
pg_standby Integrate recovery.conf into postgresql.conf 2018-11-25 16:33:40 +01:00
pg_stat_statements Replace the data structure used for keyword lookup. 2019-01-06 17:02:57 -05:00
pg_trgm Update copyright for 2019 2019-01-02 12:44:25 -05:00
pg_visibility Update copyright for 2019 2019-01-02 12:44:25 -05:00
pgcrypto Remove configure switch --disable-strong-random 2019-01-01 20:05:51 +09:00
pgrowlocks Replace AclObjectKind with ObjectType 2018-01-19 14:01:15 -05:00
pgstattuple Update copyright for 2019 2019-01-02 12:44:25 -05:00
postgres_fdw Update copyright for 2019 2019-01-02 12:44:25 -05:00
seg Make float exponent output on Windows look the same as elsewhere. 2018-10-12 11:14:27 -04:00
sepgsql Update copyright for 2019 2019-01-02 12:44:25 -05:00
spi Remove timetravel extension. 2018-10-11 11:43:56 -07:00
sslinfo Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
start-scripts Remove contrib/start-scripts/osx/. 2017-11-17 12:53:20 -05:00
tablefunc Update copyright for 2019 2019-01-02 12:44:25 -05:00
tcn Update copyright for 2019 2019-01-02 12:44:25 -05:00
test_decoding Update copyright for 2019 2019-01-02 12:44:25 -05:00
tsm_system_rows Update copyright for 2019 2019-01-02 12:44:25 -05:00
tsm_system_time Update copyright for 2019 2019-01-02 12:44:25 -05:00
unaccent unaccent: Make generate_unaccent_rules.py Python 3 compatible 2019-01-04 11:12:31 +01:00
uuid-ossp Update copyright for 2019 2019-01-02 12:44:25 -05:00
vacuumlo Update copyright for 2019 2019-01-02 12:44:25 -05:00
xml2 Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
contrib-global.mk
Makefile Transforms for jsonb to PL/Perl 2018-04-03 09:47:18 -04:00
README

The PostgreSQL contrib tree
---------------------------

This subtree contains porting tools, analysis utilities, and plug-in
features that are not part of the core PostgreSQL system, mainly
because they address a limited audience or are too experimental to be
part of the main source tree.  This does not preclude their
usefulness.

User documentation for each module appears in the main SGML
documentation.

When building from the source distribution, these modules are not
built automatically, unless you build the "world" target.  You can
also build and install them all by running "make all" and "make
install" in this directory; or to build and install just one selected
module, do the same in that module's subdirectory.

Some directories supply new user-defined functions, operators, or
types.  To make use of one of these modules, after you have installed
the code you need to register the new SQL objects in the database
system by executing a CREATE EXTENSION command.  In a fresh database,
you can simply do

    CREATE EXTENSION module_name;

See the PostgreSQL documentation for more information about this
procedure.