implement the md5() SQL-level function). The old code did the
following:
1. de-toast the datum
2. convert it to a cstring via textout()
3. get the length of the cstring via strlen()
Since we are treating the datum's contents as a blob of binary data,
the latter two steps are unnecessary. Once the data has been
detoasted, we can just use it as-is, and derive its length from
the varlena metadata.
This patch improves some run-of-the-mill md5() computations by
just under 10% in my limited tests, and passes the regression tests.
I also noticed that md5_text() wasn't checking the return value
of md5_hash(); encountering OOM at precisely the right moment
could result in returning a random md5 hash. This patch corrects
that. A better fix would be to make md5_hash() only return on
success (and/or allocate via palloc()), but since it's used in
the frontend as well I don't see an easy way to do that.
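Roughly, the reworked md5_text() ends up looking like the sketch below. This is an illustration only: the header locations, the local MD5_HASH_LEN definition, and the result-construction code are approximations rather than the committed code.

    #include "postgres.h"
    #include "fmgr.h"
    #include "libpq/crypt.h"        /* md5_hash(); header location approximate */

    #define MD5_HASH_LEN 32         /* length of the hex digest */

    Datum
    md5_text(PG_FUNCTION_ARGS)
    {
        text       *in_text = PG_GETARG_TEXT_P(0);  /* fmgr macro de-toasts the input */
        size_t      len = VARSIZE(in_text) - VARHDRSZ;  /* length from the varlena header */
        char        hexsum[MD5_HASH_LEN + 1];
        text       *result;

        /* hash the raw bytes directly: no cstring copy, no strlen() */
        if (!md5_hash(VARDATA(in_text), len, hexsum))
            ereport(ERROR,
                    (errcode(ERRCODE_OUT_OF_MEMORY),
                     errmsg("out of memory")));

        /* wrap the hex digest in a new text datum */
        result = (text *) palloc(MD5_HASH_LEN + VARHDRSZ);
        VARATT_SIZEP(result) = MD5_HASH_LEN + VARHDRSZ;    /* 8.0-era length macro */
        memcpy(VARDATA(result), hexsum, MD5_HASH_LEN);
        PG_RETURN_TEXT_P(result);
    }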
and parsing work in PL/PgSQL:
- memory management is now done via palloc(). The compiled representation
of each function now has its own memory context. Therefore, the storage
consumed by a function can be reclaimed via MemoryContextDelete().
During compilation, the CurrentMemoryContext is the function's memory
context. This means that a palloc() is sufficient to allocate memory
that will have the same lifetime as the function itself. As a result,
code invoked during compilation should be careful to pfree() temporary
allocations to avoid leaking memory. Since a lot of the code in the
backend is not careful about releasing palloc'ed memory, that means
we should switch into a temporary memory context before invoking
backend functions. A temporary context appropriate for such allocations
is `compile_tmp_cxt' (a sketch of this pattern appears after this list).
- The ability to use palloc() allows us to simplify a lot of the code in
the parser. Rather than representing lists of elements via ad hoc
linked lists or arrays, we can use the List type. Rather than doing
malloc followed by memset(0), we can just use palloc0().
- We now check that the user has supplied the right number of parameters
to a RAISE statement. Supplying either too few or too many results in
an error (at runtime).
- PL/PgSQL's parser needs to accept arbitrary SQL statements. Since we
do not want to duplicate the SQL grammar in the PL/PgSQL grammar, this
means we need to be quite lax in what the PL/PgSQL grammar considers
a "SQL statement". This can lead to misleading behavior if there is a
syntax error in the function definition, since we assume a malformed
PL/PgSQL construct is a SQL statement. Furthermore, these errors were
only detected at runtime (when we tried to execute the alleged "SQL
statement" via SPI).
To rectify this, the patch changes the parser to invoke the main SQL
parser when it sees a string it believes to be a SQL expression. This
means that syntactically invalid SQL will be rejected during the
compilation of the PL/PgSQL function. This is only done when compiling
for "validation" purposes (i.e. at CREATE FUNCTION time), so it should
not impose a runtime overhead.
- Fixes for the various buffer overruns I've patched in stable branches
in the past few weeks. I've rewritten code where I thought it was
warranted (unlike the patches applied to older branches, which were
minimally invasive).
- Various other minor changes and cleanups.
- Updates to the regression tests.
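Returning to the memory-management item above, here is a condensed sketch of the allocation pattern during compilation. The `compile_tmp_cxt' name is from the patch; `func_cxt' and the helper itself are illustrative, not code taken from it.

    #include "postgres.h"
    #include "utils/memutils.h"

    /*
     * Hypothetical helper illustrating the pattern: func_cxt is the
     * function's long-lived context (CurrentMemoryContext during
     * compilation), compile_tmp_cxt a short-lived context for calls into
     * backend code that may leak palloc'd memory.
     */
    static void *
    compile_step_sketch(MemoryContext func_cxt, MemoryContext compile_tmp_cxt,
                        Size size)
    {
        MemoryContext oldcxt;
        void       *node;

        /* A plain palloc0() here lives exactly as long as the function. */
        oldcxt = MemoryContextSwitchTo(func_cxt);
        node = palloc0(size);

        /* Sloppy backend helpers run in the temporary context, so whatever
         * they leak is reclaimed when compile_tmp_cxt is reset. */
        MemoryContextSwitchTo(compile_tmp_cxt);
        /* ... e.g. catalog lookups, format_type_be(), etc. ... */

        MemoryContextSwitchTo(oldcxt);
        return node;
    }

Because every long-lived allocation ends up in the function's own context, reclaiming a compiled function's storage is then a single MemoryContextDelete() on that context.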
Determine if printf supports %1$ argument selection, e.g. %5$ selects
the fifth argument after the printf format string.
This is not in the C99 standard, but is in the Single Unix Specification (SUS).
It is used in our language translation strings.
Nicolai Tufar with configure changes by Bruce.
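For reference, a tiny self-contained example of the feature being probed (positional conversions per SUS/POSIX); it is not guaranteed to work on every libc, which is exactly why configure has to test for it:

    #include <stdio.h>

    int
    main(void)
    {
        /* %1$ / %2$ pick arguments by position, so a translated format
         * string can reorder them without changing the argument list. */
        printf("%s is %d years old\n", "Alice", 30);
        printf("%2$d years old is %1$s\n", "Alice", 30);    /* same arguments, reordered */
        return 0;
    }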
during flat-file writing. The only difference is that SnapshotSelf
would consider tuples of the 'current command' within the current
transaction as valid, whereas SnapshotNow wouldn't. We can eliminate
the need for this with one extra CommandCounterIncrement call before
we start reading the catalogs.
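A hypothetical fragment showing the idea (the function name is illustrative, not from the patch): after bumping the command counter, an ordinary SnapshotNow scan already sees everything earlier commands of the current transaction did, so SnapshotSelf is unnecessary.

    #include "postgres.h"
    #include "access/heapam.h"
    #include "access/xact.h"
    #include "utils/tqual.h"

    static HeapScanDesc
    begin_catalog_scan(Relation rel)
    {
        /* make this transaction's prior commands visible to SnapshotNow */
        CommandCounterIncrement();
        return heap_beginscan(rel, SnapshotNow, 0, NULL);
    }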
the AMI_OVERRIDE flag. The fact that TransactionLogFetch treats
BootstrapTransactionId as always committed is sufficient to make
bootstrap work, and getting rid of extra tests in heavily used code
paths seems like a win. The files produced by initdb are demonstrably
the same after this change.
file now identifies group members by usesysid not name; this avoids
needing to depend on SearchSysCache which we can't use during startup.
(The old representation was entirely broken anyway, since we did not
regenerate the file following RENAME USER.) It's only a 95% solution
because if the group membership list is big enough to be toasted out
of line, we cannot read it during startup. I think this will do for
the moment, until we have time to implement the planned pg_role
replacement for pg_group.
in GetNewTransactionId(). Since the limit value has to be computed
before we run any real transactions, this requires adding code to database
startup to scan pg_database and determine the oldest datfrozenxid.
This can conveniently be combined with the first stage of an attack on
the problem that the 'flat file' copies of pg_shadow and pg_group are
not properly updated during WAL recovery. The code I've added to
startup resides in a new file src/backend/utils/init/flatfiles.c, and
it is responsible for rewriting the flat files as well as initializing
the XID wraparound limit value. This will eventually allow us to get
rid of GetRawDatabaseInfo too, but we'll need an initdb so we can add
a trigger to pg_database.
that return tuples (such as EXPLAIN). Per gripe from Michael Fuhr.
Side effect: fix an old bug that unintentionally disabled backward scans
for all SPI-created cursors.
column with a default expression. In that situation, we need to rewrite
the heap relation. To evaluate the new default expression, we use
ExecEvalExpr(); however, this can allocate memory in the current memory
context, and ATRewriteTable() does not switch out of the active portal's
heap memory context. The end result is a rather large memory leak (on
the order of gigabytes for a reasonably sized table).
This patch changes ATRewriteTable() to switch to the per-tuple memory
context before beginning the per-tuple loop. It also removes an explicit
heap_freetuple() in the loop, since that is no longer needed.
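Roughly, the loop now has the shape sketched below. This is a condensed illustration, not the committed code; build_new_tuple() is a hypothetical stand-in for the default-evaluation and tuple-forming work done in the real loop.

    #include "postgres.h"
    #include "access/heapam.h"
    #include "executor/executor.h"
    #include "utils/tqual.h"

    /* hypothetical stand-in for the per-tuple rewrite work */
    static HeapTuple build_new_tuple(HeapTuple oldtup, EState *estate);

    static void
    rewrite_table_sketch(Relation oldrel, Relation newrel, EState *estate)
    {
        HeapScanDesc scan = heap_beginscan(oldrel, SnapshotNow, 0, NULL);
        ExprContext *econtext = GetPerTupleExprContext(estate);
        HeapTuple   tuple;

        while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
        {
            /* Evaluate defaults and form the new tuple in per-tuple memory,
             * not in the active portal's heap context. */
            MemoryContext oldcxt =
                MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
            HeapTuple   newtuple = build_new_tuple(tuple, estate);

            simple_heap_insert(newrel, newtuple);
            MemoryContextSwitchTo(oldcxt);

            /* Releases everything palloc'd for this tuple, so the explicit
             * heap_freetuple() is no longer needed and nothing accumulates. */
            ResetExprContext(econtext);
        }
        heap_endscan(scan);
    }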
In an unrelated change, I noticed the code was scanning through the
attributes of the new tuple descriptor for each tuple of the old table.
I changed this to use precomputation, which should slightly speed up
the loop.
Thanks to steve@deefs.net for reporting the leak.
there are corner cases involving dropping toasted columns in which the
previous coding would fail, too: the new version of the table might not
have any TOAST table, but we'd still propagate possibly-wide values of
dropped columns forward.
form of CASE (eg, CASE 0 WHEN 1 THEN ...) can be constant-folded as it
was in 7.4. Also, avoid constant-folding result expressions that are
certainly unreachable --- the former coding was a bit cavalier about this
and could generate unexpected results for all-constant CASE expressions.
Add regression test cases. Per report from Vlad Marchenko.
tests. Contributed by Koju Iijima, review from Neil Conway, Gavin Sherry
and Tom Lane.
Also, fix error in description of WITH CHECK OPTION clause in the CREATE
VIEW reference page: it should be "CASCADED", not "CASCADE".
estimate to less than the number of values estimated for any one grouping
Var, as suggested by Manfred. This is intuitively right, and what's
more it puts the plan choices in the subselect regression test back the
way they were before ...
initially NULL. For 8.0 we changed the main executor to have this
behavior in an UPDATE of an array column, but plpgsql's equivalent case
was overlooked. Per report from Sven Willenberger.
clamp the estimated number of groups to table row count over 10, instead
of table row count; this reflects a heuristic that people probably won't
group over a near-unique set of columns, and the knowledge that we don't
currently have any way to estimate the correlation of the columns better
than guessing. This change creates a trivial plan change in one of the
regression tests.
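The heuristic amounts to roughly the following (a sketch only; the variable names are illustrative and the real code lives in the planner's group-size estimation in selfuncs.c):

    /* cap the distinct-groups estimate at one tenth of the table size */
    static double
    clamp_group_estimate(double reldistinct, double reltuples)
    {
        double      clamp = reltuples / 10.0;   /* was: reltuples */

        if (reldistinct > clamp)
            reldistinct = clamp;
        return reldistinct;
    }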
look at the actual aggregate transition datatypes and the actual overhead
needed by nodeAgg.c, instead of using pessimistic round numbers.
Per a discussion with Michael Tiemann.
command. This is useful because we can allow truncation of tables
referenced by foreign keys, so long as the referencing table is
truncated in the same command.
Alvaro Herrera
to avoid problems when a cursor depends on objects created or changed in
the same subtransaction. We'd like to do better someday, but this seems
the only workable answer for 8.0.1.
column values in -d mode. Per report from Marty Scholes. This doesn't
completely solve the issue, because we still need multiple copies of the
field value, but at least one copy can be got rid of painlessly ...
pre-7.3 pg_dump archive files: namespace isn't there, and in some cases
te->tag may already be quotified. Per report from Alan Pevec and
followup testing.
pass if "default_with_oids" is set to false. I took the approach of
explicitly adding WITH OIDS to the CREATE TABLEs where necessary, rather
than tweaking the default_with_oids GUC var.
(1) Keep a pin on the scan's current buffer and mark buffer. This
avoids the need to do a ReadBuffer() for each tuple produced by the
scan. Since ReadBuffer() is expensive, this is a significant win.
(2) Convert a ReleaseBuffer(); ReadBuffer() pair into
ReleaseAndReadBuffer(). Surely not a huge win, but it saves a lock
acquire/release... (see the sketch at the end of this entry).
(3) Remove a bunch of duplicated code in rtget.c; make rtnext() handle
both the "initial result" and "subsequent result" cases.
(4) Add support for index tuple killing
(5) Remove rtscancache(): it is dead code, for the same reason that
gistscancache() is dead code (an index scan ought not be invoked with
NoMovementScanDirection).
The end result is about a 10% improvement in rtree index scan perf,
according to contrib/rtree_gist/bench.
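As an illustration of points (1) and (2), the buffer handling boils down to something like the helper below; the helper itself is hypothetical, and only ReleaseAndReadBuffer() is the real bufmgr call.

    #include "postgres.h"
    #include "access/relscan.h"
    #include "storage/bufmgr.h"

    /* The scan keeps its current page pinned between calls; moving to
     * another page drops the old pin and takes the new one in a single
     * bufmgr call, which can return immediately when the requested page
     * is the one already pinned. */
    static Buffer
    rtree_move_to_page(IndexScanDesc scan, Buffer curbuf, BlockNumber blkno)
    {
        return ReleaseAndReadBuffer(curbuf, scan->indexRelation, blkno);
    }

Keeping the pin between calls is what removes the per-tuple ReadBuffer() mentioned in point (1).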