Cleanup the contrib/lo module: there is no need anymore to implement

a physically separate type.  Defining 'lo' as a domain over OID works
just fine and is more efficient.  Improve documentation and fix up the
test script.  (Would like to turn test script into a proper regression
test, but right now its output is not constant because of numeric OIDs;
plus it makes Unix-specific assumptions about files it can import.)
This commit is contained in:
Tom Lane 2005-06-23 00:06:37 +00:00
parent d20763dbee
commit 5b0c9d3603
6 changed files with 90 additions and 202 deletions

View File

@ -8,57 +8,77 @@ also), is that the specification assumes that references to BLOBS (Binary
Large OBjectS) are stored within a table, and if that entry is changed, the
associated BLOB is deleted from the database.
As PostgreSQL stands, this doesn't occur. It allocates an OID for each object,
and it is up to the application to store, and ultimately delete the objects.
As PostgreSQL stands, this doesn't occur. Large objects are treated as
objects in their own right; a table entry can reference a large object by
OID, but there can be multiple table entries referencing the same large
object OID, so the system doesn't delete the large object just because you
change or remove one such entry.
Now this is fine for new postgresql specific applications, but existing ones
using JDBC or ODBC wont delete the objects, arising to orphaning - objects
Now this is fine for new PostgreSQL-specific applications, but existing ones
using JDBC or ODBC won't delete the objects, resulting in orphaning - objects
that are not referenced by anything, and simply occupy disk space.
The Fix
I've fixed this by creating a new data type 'lo', some support functions, and
a Trigger which handles the orphaning problem.
a Trigger which handles the orphaning problem. The trigger essentially just
does a 'lo_unlink' whenever you delete or modify a value referencing a large
object. When you use this trigger, you are assuming that there is only one
database reference to any large object that is referenced in a
trigger-controlled column!
The 'lo' type was created because we needed to differentiate between plain
OIDs and Large Objects. Currently the JDBC driver handles this dilemma easily,
but (after talking to Byron), the ODBC driver needed a unique type. They had
created an 'lo' type, but not the solution to orphaning.
You don't actually have to use the 'lo' type to use the trigger, but it may be
convenient to use it to keep track of which columns in your database represent
large objects that you are managing with the trigger.
The 'lo' type was created because we needed to differenciate between normal
Oid's and Large Objects. Currently the JDBC driver handles this dilema easily,
but (after talking to Byron), the ODBC driver needed a unique type. They had created an 'lo' type, but not the solution to orphaning.
Install
Ok, first build the shared library, and install. Typing 'make install' in the
contrib/lo directory should do it.
Then, as the postgres super user, run the lo.sql script. This will install the
type, and define the support functions.
Then, as the postgres super user, run the lo.sql script in any database that
needs the features. This will install the type, and define the support
functions. You can run the script once in template1, and the objects will be
inherited by subsequently-created databases.
How to Use
The easiest way is by an example:
> create table image (title text,raster lo);
> create trigger t_image before update or delete on image for each row execute procedure lo_manage(raster);
> create table image (title text, raster lo);
> create trigger t_raster before update or delete on image
> for each row execute procedure lo_manage(raster);
Create a trigger for each column that contains a lo type, and give the column
name as the trigger procedure argument. You can have more than one trigger on
a table if you need multiple lo columns in the same table, but don't forget to
give a different name to each trigger.
Here, a trigger is created for each column that contains a lo type.
Issues
* dropping a table will still orphan any objects it contains, as the trigger
is not actioned.
* Dropping a table will still orphan any objects it contains, as the trigger
is not executed.
For now, precede the 'drop table' with 'delete from {table}'. However, this
could be fixed by having 'drop table' perform an additional
Avoid this by preceding the 'drop table' with 'delete from {table}'.
'select lo_unlink({colname}::oid) from {tablename}'
for each column, before actually dropping the table.
If you already have, or suspect you have, orphaned large objects, see
the contrib/vacuumlo module to help you clean them up. It's a good idea
to run contrib/vacuumlo occasionally as a back-stop to the lo_manage
trigger.
* Some frontends may create their own tables, and will not create the
associated trigger(s). Also, users may not remember (or know) to create
the triggers.
This can be solved, but would involve changes to the parser.
As the ODBC driver needs a permanent lo type (& JDBC could be optimised to
use it if it's Oid is fixed), and as the above issues can only be fixed by
some internal changes, I feel it should become a permanent built-in type.
@ -66,4 +86,3 @@ some internal changes, I feel it should become a permanent built-in type.
I'm releasing this into contrib, just to get it out, and tested.
Peter Mount <peter@retep.org.uk> June 13 1998

View File

@ -1,7 +1,7 @@
/*
* PostgreSQL type definitions for managed LargeObjects.
* PostgreSQL definitions for managed Large Objects.
*
* $PostgreSQL: pgsql/contrib/lo/lo.c,v 1.14 2003/11/29 19:51:35 pgsql Exp $
* $PostgreSQL: pgsql/contrib/lo/lo.c,v 1.15 2005/06/23 00:06:37 tgl Exp $
*
*/
@ -21,117 +21,12 @@
#define atooid(x) ((Oid) strtoul((x), NULL, 10))
/*
* This is the internal storage format for managed large objects
*
*/
/* forward declarations */
Datum lo_manage(PG_FUNCTION_ARGS);
typedef Oid Blob;
/*
* Various forward declarations:
*/
Blob *lo_in(char *str); /* Create from String */
char *lo_out(Blob * addr); /* Output oid as String */
Oid lo_oid(Blob * addr); /* Return oid as an oid */
Blob *lo(Oid oid); /* Return Blob based on oid */
Datum lo_manage(PG_FUNCTION_ARGS); /* Trigger handler */
/*
* This creates a large object, and sets its OID to the value in the
* supplied string.
*
* If the string is empty, then a new LargeObject is created, and its oid
* is placed in the resulting lo.
*/
Blob *
lo_in(char *str)
{
Blob *result;
Oid oid;
int count;
if (strlen(str) > 0)
{
count = sscanf(str, "%u", &oid);
if (count < 1)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("error in parsing \"%s\"", str)));
if (oid == InvalidOid)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("illegal oid: \"%s\"", str)));
}
else
{
/*
* There is no Oid passed, so create a new one
*/
oid = DatumGetObjectId(DirectFunctionCall1(lo_creat,
Int32GetDatum(INV_READ | INV_WRITE)));
if (oid == InvalidOid)
/* internal error */
elog(ERROR, "InvalidOid returned from lo_creat");
}
result = (Blob *) palloc(sizeof(Blob));
*result = oid;
return (result);
}
/*
* This simply outputs the Oid of the Blob as a string.
*/
char *
lo_out(Blob * addr)
{
char *result;
if (addr == NULL)
return (NULL);
result = (char *) palloc(32);
snprintf(result, 32, "%u", *addr);
return (result);
}
/*
* This function converts Blob to oid.
*
* eg: select lo_export(raster::oid,'/path/file') from table;
*
*/
Oid
lo_oid(Blob * addr)
{
if (addr == NULL)
return InvalidOid;
return (Oid) (*addr);
}
/*
* This function is used so we can convert oid's to lo's
*
* ie: insert into table values(lo_import('/path/file')::lo);
*
*/
Blob *
lo(Oid oid)
{
Blob *result = (Blob *) palloc(sizeof(Blob));
*result = oid;
return (result);
}
/*
* This handles the trigger that protects us from orphaned large objects
* This is the trigger that protects us from orphaned large objects
*/
PG_FUNCTION_INFO_V1(lo_manage);
@ -144,11 +39,10 @@ lo_manage(PG_FUNCTION_ARGS)
TupleDesc tupdesc; /* Tuple Descriptor */
HeapTuple rettuple; /* Tuple to be returned */
bool isdelete; /* are we deleting? */
HeapTuple newtuple = NULL; /* The new value for tuple */
HeapTuple newtuple; /* The new value for tuple */
HeapTuple trigtuple; /* The original value of tuple */
if (!CALLED_AS_TRIGGER(fcinfo))
/* internal error */
if (!CALLED_AS_TRIGGER(fcinfo)) /* internal error */
elog(ERROR, "not fired by trigger manager");
/*
@ -168,9 +62,12 @@ lo_manage(PG_FUNCTION_ARGS)
/* Are we deleting the row? */
isdelete = TRIGGER_FIRED_BY_DELETE(trigdata->tg_event);
/* Get the column were interested in */
/* Get the column we're interested in */
attnum = SPI_fnumber(tupdesc, args[0]);
if (attnum <= 0)
elog(ERROR, "column \"%s\" does not exist", args[0]);
/*
* Handle updates
*
@ -182,7 +79,7 @@ lo_manage(PG_FUNCTION_ARGS)
char *orig = SPI_getvalue(trigtuple, tupdesc, attnum);
char *newv = SPI_getvalue(newtuple, tupdesc, attnum);
if (orig != NULL && (newv == NULL || strcmp(orig, newv)))
if (orig != NULL && (newv == NULL || strcmp(orig, newv) != 0))
DirectFunctionCall1(lo_unlink,
ObjectIdGetDatum(atooid(orig)));
@ -196,7 +93,6 @@ lo_manage(PG_FUNCTION_ARGS)
* Handle deleting of rows
*
* Here, we unlink the large object associated with the managed attribute
*
*/
if (isdelete)
{

View File

@ -1,62 +1,29 @@
--
-- PostgreSQL code for LargeObjects
-- PostgreSQL code for managed Large Objects
--
-- $PostgreSQL: pgsql/contrib/lo/lo.sql.in,v 1.12 2005/01/29 22:35:02 tgl Exp $
-- $PostgreSQL: pgsql/contrib/lo/lo.sql.in,v 1.13 2005/06/23 00:06:37 tgl Exp $
--
--
-- Create the data type
--
-- used by the lo type, it takes an oid and returns an lo object
-- Adjust this setting to control where the objects get created.
SET search_path = public;
CREATE FUNCTION lo_in(cstring)
RETURNS lo
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE STRICT;
-- used by the lo type, it returns the oid of the object
CREATE FUNCTION lo_out(lo)
RETURNS cstring
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE STRICT;
-- finally the type itself
CREATE TYPE lo (
INTERNALLENGTH = 4,
EXTERNALLENGTH = variable,
INPUT = lo_in,
OUTPUT = lo_out
);
-- this returns the oid associated with a lo object
CREATE FUNCTION lo_oid(lo)
RETURNS oid
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE STRICT;
-- same function, named to allow it to be used as a type coercion, eg:
-- CREATE TABLE a (image lo);
-- SELECT image::oid FROM a;
--
CREATE FUNCTION oid(lo)
RETURNS oid
AS 'MODULE_PATHNAME', 'lo_oid'
LANGUAGE C IMMUTABLE STRICT;
CREATE CAST (lo as oid) WITH FUNCTION oid(lo) AS IMPLICIT;
-- Create the data type ... now just a domain over OID
--
-- this allows us to convert an oid to a managed lo object
-- ie: insert into test values (lo_import('/fullpath/file')::lo);
CREATE FUNCTION lo(oid)
RETURNS lo
AS 'MODULE_PATHNAME'
LANGUAGE C IMMUTABLE STRICT;
CREATE CAST (oid as lo) WITH FUNCTION lo(oid) AS IMPLICIT;
CREATE DOMAIN lo AS pg_catalog.oid;
--
-- For backwards compatibility, define a function named lo_oid.
--
-- The other functions that formerly existed are not needed because
-- the implicit casts between a domain and its underlying type handle them.
--
CREATE FUNCTION lo_oid(lo) RETURNS pg_catalog.oid AS
'SELECT $1::pg_catalog.oid' LANGUAGE SQL STRICT IMMUTABLE;
-- This is used in triggers
CREATE FUNCTION lo_manage()
RETURNS trigger
RETURNS pg_catalog.trigger
AS 'MODULE_PATHNAME'
LANGUAGE C;

View File

@ -1,15 +1,12 @@
--
-- This removes the type (and a test table)
-- This removes the LO type
-- It's used just for development
--
-- Adjust this setting to control where the objects get created.
SET search_path = public;
-- remove our test table
DROP TABLE a;
-- now drop the type and associated C functions
-- drop the type and associated functions
DROP TYPE lo CASCADE;
-- the trigger function has no dependency on the type, so drop separately

View File

@ -3,6 +3,11 @@
--
-- It's used just for development
--
-- XXX would be nice to turn this into a proper regression test
--
-- Check what is in pg_largeobject
SELECT count(DISTINCT loid) FROM pg_largeobject;
-- ignore any errors here - simply drop the table if it already exists
DROP TABLE a;
@ -11,18 +16,15 @@ DROP TABLE a;
CREATE TABLE a (fname name,image lo);
-- insert a null object
INSERT INTO a VALUES ('null');
-- insert an empty large object
INSERT INTO a VALUES ('empty','');
INSERT INTO a VALUES ('empty');
-- insert a large object based on a file
INSERT INTO a VALUES ('/etc/group',lo_import('/etc/group')::lo);
INSERT INTO a VALUES ('/etc/group', lo_import('/etc/group')::lo);
-- now select the table
SELECT * FROM a;
-- this select also returns an oid based on the lo column
-- check that coercion to plain oid works
SELECT *,image::oid from a;
-- now test the trigger
@ -32,7 +34,7 @@ FOR EACH ROW
EXECUTE PROCEDURE lo_manage(image);
-- insert
INSERT INTO a VALUES ('aa','');
INSERT INTO a VALUES ('aa', lo_import('/etc/hosts'));
SELECT * FROM a
WHERE fname LIKE 'aa%';
@ -43,7 +45,7 @@ SELECT * FROM a
WHERE fname LIKE 'aa%';
-- update the 'empty' row which should be null
UPDATE a SET image=lo_import('/etc/hosts')::lo
UPDATE a SET image=lo_import('/etc/hosts')
WHERE fname='empty';
SELECT * FROM a
WHERE fname LIKE 'empty%';
@ -60,10 +62,13 @@ WHERE fname LIKE 'aa%';
-- This deletes the table contents. Note, if you comment this out, and
-- expect the drop table to remove the objects, think again. The trigger
-- doesn't get thrown by drop table.
-- doesn't get fired by drop table.
DELETE FROM a;
-- finally drop the table
DROP TABLE a;
-- Check what is in pg_largeobject ... if different from original, trouble
SELECT count(DISTINCT loid) FROM pg_largeobject;
-- end of tests

View File

@ -1,9 +1,13 @@
$PostgreSQL: pgsql/contrib/vacuumlo/README.vacuumlo,v 1.4 2003/11/29 19:51:36 pgsql Exp $
$PostgreSQL: pgsql/contrib/vacuumlo/README.vacuumlo,v 1.5 2005/06/23 00:06:37 tgl Exp $
This is a simple utility that will remove any orphaned large objects out of a
PostgreSQL database. An orphaned LO is considered to be any LO whose OID
does not appear in any OID data column of the database.
If you use this, you may also be interested in the lo_manage trigger in
contrib/lo. lo_manage is useful to try to avoid creating orphaned LOs
in the first place.
Compiling
--------