Instead of ~1/4 the range we can use ~1/3 the range for better
distance. It is possible that using ~1/2 - 1 might be even better,
but this is a trivial tweak.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This adds copyright verbiage to the Perl scripts. Scripts that are
known to be clean w.r.t. the 2-clause BSD license are given that
license; unclear ones are given the "LGPL for now".
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Compress macros.c by representing macro directives with a single byte.
We can do this because we only use the ASCII character range inside
the standard macro files.
Note: we could save significant additional space by not having a
pointer array, and instead relying on the fact that we sweep
sequentially through the output array.
Concentrate compiler dependencies to compiler.h; make sure compiler.h
is included first in every .c file (since some prototypes may depend
on the presence of feature request macros.)
Actually use the conditional inclusion of various functions (totally
broken in previous releases.)
Use the same crc64 that we already use for the symbol table hash as
the perfect hash function prehash. We appear to get radically faster
convergence this way, and the crc64 is probably *faster*, since the
table likely to be resident in memory.
Combining arithmetric (add) and bitwise (xor) mixing seems to give
better result than either.
With the new prehash function, we find a valid hash much quicker.
Speed up pptok.c by just doing |= 0x20 instead of calling tolower() for
every character during prehashing. This is good enough for our needs,
since we don't have any tokens containing the characters @ [ \ ] _ nor
any high-bit characters (in which case we'd have to worry about multibyte
anyway.)