mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-12-09 08:10:09 +08:00
221 lines
8.5 KiB
Plaintext
221 lines
8.5 KiB
Plaintext
|
|
-- EAN13 - UPC - ISBN (books) - ISMN (music) - ISSN (serials)
|
|
-------------------------------------------------------------
|
|
|
|
Copyright Germán Méndez Bravo (Kronuz), 2004 - 2006
|
|
This module is released under the same BSD license as the rest of PostgreSQL.
|
|
|
|
The information to implement this module was collected through
|
|
several sites, including:
|
|
http://www.isbn-international.org/
|
|
http://www.issn.org/
|
|
http://www.ismn-international.org/
|
|
http://www.wikipedia.org/
|
|
the prefixes used for hyphenation where also compiled from:
|
|
http://www.gs1.org/productssolutions/idkeys/support/prefix_list.html
|
|
http://www.isbn-international.org/en/identifiers.html
|
|
http://www.ismn-international.org/ranges.html
|
|
Care was taken during the creation of the algorithms and they
|
|
were meticulously verified against the suggested algorithms
|
|
in the official ISBN, ISMN, ISSN User Manuals.
|
|
|
|
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
|
THIS MODULE IS PROVIDED "AS IS" AND WITHOUT ANY WARRANTY
|
|
OF ANY KIND, EXPRESS OR IMPLIED.
|
|
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
|
|
|
|
-- Content of the Module
|
|
-------------------------------------------------
|
|
|
|
This directory contains definitions for a few PostgreSQL
|
|
data types, for the following international-standard namespaces:
|
|
EAN13, UPC, ISBN (books), ISMN (music), and ISSN (serials). This module
|
|
is inspired by Garrett A. Wollman's isbn_issn code.
|
|
|
|
I wanted the database to fully validate numbers and also to use the
|
|
upcoming ISBN-13 and the EAN13 standards, as well as to have it
|
|
automatically doing hyphenations for ISBN numbers.
|
|
|
|
This new module validates, and automatically adds the correct
|
|
hyphenations to the numbers. Also, it supports the new ISBN-13
|
|
numbers to be used starting in January 2007.
|
|
|
|
Premises:
|
|
1. ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers
|
|
2. EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some are)
|
|
3. some ISBN13 numbers can be displayed as ISBN
|
|
4. some ISMN13 numbers can be displayed as ISMN
|
|
5. some ISSN13 numbers can be displayed as ISSN
|
|
6. all UPC, ISBN, ISMN and ISSN can be represented as EAN13 numbers
|
|
|
|
Note: All types are internally represented as 64 bit integers,
|
|
and internally all are consistently interchangeable.
|
|
|
|
We have the following data types:
|
|
|
|
+ EAN13 for European Article Numbers.
|
|
This type will always show the EAN13-display format.
|
|
Te output function for this is -> ean13_out()
|
|
|
|
+ ISBN13 for International Standard Book Numbers to be displayed in
|
|
the new EAN13-display format.
|
|
+ ISMN13 for International Standard Music Numbers to be displayed in
|
|
the new EAN13-display format.
|
|
+ ISSN13 for International Standard Serial Numbers to be displayed
|
|
in the new EAN13-display format.
|
|
These types will always display the long version of the ISxN (EAN13)
|
|
The output function to do this is -> ean13_out()
|
|
* The need for these types is just for displaying in different
|
|
ways the same data:
|
|
ISBN13 is actually the same as ISBN, ISMN13=ISMN and ISSN13=ISSN.
|
|
|
|
+ ISBN for International Standard Book Numbers to be displayed in
|
|
the current short-display format.
|
|
+ ISMN for International Standard Music Numbers to be displayed in
|
|
the current short-display format.
|
|
+ ISSN for International Standard Serial Numbers to be displayed
|
|
in the current short-display format.
|
|
These types will display the short version of the ISxN (ISxN 10)
|
|
whenever it's possible, and it will show ISxN 13 when it's
|
|
impossible to show the short version.
|
|
The output function to do this is -> isn_out()
|
|
|
|
+ UPC for Universal Product Codes.
|
|
UPC numbers are a subset of the EAN13 numbers (they are basically
|
|
EAN13 without the first '0' digit.)
|
|
The output function to do this is also -> isn_out()
|
|
|
|
We have the following input functions:
|
|
+ To take a string and return an EAN13 -> ean13_in()
|
|
+ To take a string and return valid ISBN or ISBN13 numbers -> isbn_in()
|
|
+ To take a string and return valid ISMN or ISMN13 numbers -> ismn_in()
|
|
+ To take a string and return valid ISSN or ISSN13 numbers -> issn_in()
|
|
+ To take a string and return an UPC codes -> upc_in()
|
|
|
|
We are able to cast from:
|
|
+ ISBN13 -> EAN13
|
|
+ ISMN13 -> EAN13
|
|
+ ISSN13 -> EAN13
|
|
|
|
+ ISBN -> EAN13
|
|
+ ISMN -> EAN13
|
|
+ ISSN -> EAN13
|
|
+ UPC -> EAN13
|
|
|
|
+ ISBN <-> ISBN13
|
|
+ ISMN <-> ISMN13
|
|
+ ISSN <-> ISSN13
|
|
|
|
We have two operator classes (for btree and for hash) so each data type
|
|
can be indexed for faster access.
|
|
|
|
The C API is implemented as:
|
|
extern Datum isn_out(PG_FUNCTION_ARGS);
|
|
extern Datum ean13_out(PG_FUNCTION_ARGS);
|
|
extern Datum ean13_in(PG_FUNCTION_ARGS);
|
|
extern Datum isbn_in(PG_FUNCTION_ARGS);
|
|
extern Datum ismn_in(PG_FUNCTION_ARGS);
|
|
extern Datum issn_in(PG_FUNCTION_ARGS);
|
|
extern Datum upc_in(PG_FUNCTION_ARGS);
|
|
|
|
On success:
|
|
+ isn_out() takes any of our types and returns a string containing
|
|
the shortes possible representation of the number.
|
|
|
|
+ ean13_out() takes any of our types and returns the
|
|
EAN13 (long) representation of the number.
|
|
|
|
+ ean13_in() takes a string and return a EAN13. Which, as stated in (2)
|
|
could or could not be any of our types, but it certainly is an EAN13
|
|
number. Only if the string is a valid EAN13 number, otherwise it fails.
|
|
|
|
+ isbn_in() takes a string and return an ISBN/ISBN13. Only if the string
|
|
is really a ISBN/ISBN13, otherwise it fails.
|
|
|
|
+ ismn_in() takes a string and return an ISMN/ISMN13. Only if the string
|
|
is really a ISMN/ISMN13, otherwise it fails.
|
|
|
|
+ issn_in() takes a string and return an ISSN/ISSN13. Only if the string
|
|
is really a ISSN/ISSN13, otherwise it fails.
|
|
|
|
+ upc_in() takes a string and return an UPC. Only if the string is
|
|
really a UPC, otherwise it fails.
|
|
|
|
(on failure, the functions 'ereport' the error)
|
|
|
|
-- Testing/Playing Functions
|
|
-------------------------------------------------
|
|
isn_weak(boolean) - Sets the weak input mode.
|
|
This function is intended for testing use only!
|
|
isn_weak() gets the current status of the weak mode.
|
|
|
|
"Weak" mode is used to be able to insert "invalid" data to a table.
|
|
"Invalid" as in the check digit being wrong, not missing numbers.
|
|
|
|
Why would you want to use the weak mode? well, it could be that
|
|
you have a huge collection of ISBN numbers, and that there are so many of
|
|
them that for weird reasons some have the wrong check digit (perhaps the
|
|
numbers where scanned from a printed list and the OCR got the numbers wrong,
|
|
perhaps the numbers were manually captured... who knows.) Anyway, the thing
|
|
is you might want to clean the mess up, but you still want to be able to have
|
|
all the numbers in your database and maybe use an external tool to access
|
|
the invalid numbers in the database so you can verify the information and
|
|
validate it more easily; as selecting all the invalid numbers in the table.
|
|
|
|
When you insert invalid numbers in a table using the weak mode, the number
|
|
will be inserted with the corrected check digit, but it will be flagged
|
|
with an exclamation mark ('!') at the end (i.e. 0-11-000322-5!)
|
|
|
|
You can also force the insertion of invalid numbers even not in the weak mode,
|
|
appending the '!' character at the end of the number.
|
|
|
|
To work with invalid numbers, you can use two functions:
|
|
+ make_valid(), which validates an invalid number (deleting the invalid flag)
|
|
+ is_valid(), which checks for the invalid flag presence.
|
|
|
|
-- Examples of Use
|
|
-------------------------------------------------
|
|
--Using the types directly:
|
|
select isbn('978-0-393-04002-9');
|
|
select isbn13('0901690546');
|
|
select issn('1436-4522');
|
|
|
|
--Casting types:
|
|
-- note that you can only cast from ean13 to other type when the casted
|
|
-- number would be valid in the realm of the casted type;
|
|
-- thus, the following will NOT work: select isbn(ean13('0220356483481'));
|
|
-- but these will:
|
|
select upc(ean13('0220356483481'));
|
|
select ean13(upc('220356483481'));
|
|
|
|
--Create a table with a single column to hold ISBN numbers:
|
|
create table test ( id isbn );
|
|
insert into test values('9780393040029');
|
|
|
|
--Automatically calculating check digits (observe the '?'):
|
|
insert into test values('220500896?');
|
|
insert into test values('978055215372?');
|
|
|
|
select issn('3251231?');
|
|
select ismn('979047213542?');
|
|
|
|
--Using the weak mode:
|
|
select isn_weak(true);
|
|
insert into test values('978-0-11-000533-4');
|
|
insert into test values('9780141219307');
|
|
insert into test values('2-205-00876-X');
|
|
select isn_weak(false);
|
|
|
|
select id from test where not is_valid(id);
|
|
update test set id=make_valid(id) where id = '2-205-00876-X!';
|
|
|
|
select * from test;
|
|
|
|
select isbn13(id) from test;
|
|
|
|
-- Contact
|
|
-------------------------------------------------
|
|
Please suggestions or bug reports to kronuz at users.sourceforge.net
|
|
|
|
Last reviewed on August 23, 2006 by Kronuz.
|