Extending PostgreSQL with Python

One of the features I enjoy the most about PostgreSQL is the ability to write stored procedures in C, Perl, TCL, PgSQL, and yes… obviously also in Python. I’ve been using this feature since 7.4, so any recent version of PostgreSQL is pretty much guaranteed to support it, but you’ll need to have the pl/python procedural language contrib module installed. Once it’s installed, you can activate it for your current database using the following query:

CREATE PROCEDURAL LANGUAGE plpythonu;

Once the language bindings have been activated, you can start writing your stored procedures in python, however you should really read up on the following subjects first:

As an example, I’ve written a little SP in PL/Python to provide support for PCRE since the stock distribution of PostgreSQL only supports LIKE/SIMILAR, and POSIX Style Regular Expressions.

Let’s create our Python language binding, and create a standard text storage table:

-- Activate PL/Python
CREATE PROCEDURAL LANGUAGE plpythonu;
 
-- Create a plain text-storage table
CREATE TABLE text_storage
(
  id serial NOT NULL,
  payload CHARACTER VARYING(128),
  CONSTRAINT text_storage_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE text_storage OWNER TO xavier;
 
-- Let's throw in an index on the payload field for good measure
CREATE INDEX txt_payload_idx
  ON text_storage
  USING btree
  (payload);

Let’s now populate our table with some junk data:

INSERT INTO text_storage (payload) VALUES ('hello, world');
INSERT INTO text_storage (payload) VALUES ('the quick brown fox, blah blah blah');
INSERT INTO text_storage (payload) VALUES ('PCREs in Postgres');
INSERT INTO text_storage (payload) VALUES ('All hail Python!');
INSERT INTO text_storage (payload) VALUES ('Hello, test data!');
INSERT INTO text_storage (payload) VALUES ('Python would like to say Hello!');

And now we can go ahead and create our Python SP itself:

CREATE OR REPLACE FUNCTION pcre(text, text)
  RETURNS INTEGER AS
$BODY$import re
 
regex  = args[0]
in_str = args[1]
 
compiled = re.compile(regex)
 
IF compiled.search(in_str):
	RETURN 1
ELSE:
	RETURN 0$BODY$
  LANGUAGE 'plpythonu' VOLATILE
  COST 100;

As you can see, our PCRE matching system is extremely simple, yet pretty powerful. We import Python’s built-in re module, compile the specified regex argument, then attempt to match it against the other argument. Here’s a usage example on our test table:

SELECT id, payload FROM text_storage WHERE pcre('[H|h]ello', payload) = 1;
 id |             payload             
----+---------------------------------
  1 | hello, world
  5 | Hello, test DATA!
  6 | Python would LIKE TO say Hello!
(3 ROWS)

As always, feel free to suggest any improvements.

Related posts:

  1. Fixing custom sequences in PostgreSQL
  2. Ahh those cool little CLI tools…
  3. Apache2 shortcuts
  4. Algorithms in Python: Base Expansion
  5. Algorithms in Python: Binary exponentiation

3 thoughts on “Extending PostgreSQL with Python

  1. I’m using PostgreSQL 8.4 on Windows, and I have not found any way to enable Python for stored functions. Have I overlooked something? Has support for Python been removed from the Windows distribution or installer?

  2. any idea what performance is like with this? Tcl regex syntax is a bit of a pain-in-the-arse if you’re used to PCRE. I’m hoping to expose an interface on the web for regex querying against a PostgreSQL database, and I would rather let users use PCRE-style regexs.

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="" highlight="">