Fixing custom sequences in PostgreSQL

PostgreSQL provides a mechanism called sequences, which I believe is extracted from ANSI-SQL92, though I’m too lazy to check, which are basically stateful counters that provide some helper functions. The primary use of sequences in PostgreSQL and Oracle, is to implement auto-incrementing counters as a table field.

PostgreSQL will automatically create a sequence when you use the “SERIAL” datatype for your field, and will take care of assigning the default value of your field as the result of the nextval() call on the sequence, so most of the time you don’t need to interact directly with sequences.

Where things can get hairy however, is when you back up your data using pg_dump or any other mechanism, and restore that data in a table that is already populated. A common scenario for example, is populating a table that already has some recent data, with some older data you’ve been storing in archival. The opposite is true if you are trying to archive data from a table to a backup database for example.

Whenever you manually have to provide a value for the field assigned to a sequence, you are pretty much guaranteed to break the sequence unless you take the time to nextval() your sequence until it is in sync. This is a real problem, as pg_dump does not include sequence synchronization in its output.

The quickest way to synchronize a sequence, based on my observations, is to run the following query, taking care to replace the name of the sequence [SEQ] and the name of the associated table [TABLE] and field [FIELD]:

SELECT SETVAL([SEQ], COALESCE((SELECT [FIELD] FROM [TABLE] ORDER BY [FIELD] DESC LIMIT 1), 0)+1)

This will fetch the highest value of [FIELD] in [TABLE], increment it by 1, and synchronize the sequence [SEQ] to the new value.

I’ve also written the following little Python script that will look for any non-system sequence in the specified database, and use this method to repair it. Let me know if it works out for you, or if you’d like to suggest some improvements.

Requirements: Python 2.5+, PsycoPG2 Python module (python-psycopg2)

#!/usr/bin/env python
# -*- coding: utf-8 -*-
 
 
# Standard imports
import sys
import os
import time
from optparse import OptionParser
 
# psycopg2 import
try:
    import psycopg2
except ImportError, e:
    print 'You must install the python module named "psycopg2" in order to use this module.'
    sys.exit(os.EX_SOFTWARE)
 
 
 
class PgRepairman:
    def __init__(self, options, parser):
        self.options = options
        try:
            dsn = "dbname=%s host=%s user=%s" % (options.db, options.host, options.user)
            dsn += ("" != options.passwd) and ("password=%s" % options.password) or ""
            self.conn = psycopg2.connect(dsn)
            self.curs = self.conn.cursor()
        except Exception, e:
            print "ERROR - %s" % e
            sys.exit(1)
 
 
    # Returns a dict for a psycopg2 row object
    def _to_dict(self, desc, res):
        return dict(zip([x[0] for x in desc], res))
 
 
    # Attempt to locate all custom sequences
    def findSequences(self):
        seq_query = """
        SELECT pc1.relname AS seq, pc2.relname AS table, c.attname AS field 
        FROM pg_depend, pg_class pc1, pg_class pc2, pg_attribute c 
        WHERE pc1.oid = pg_depend.objid 
            AND pc2.oid = pg_depend.refobjid 
            AND c.attnum = pg_depend.refobjsubid 
            AND c.attrelid = pc2.oid 
            AND pc1.relkind = 'S' 
            AND pc1.relname NOT LIKE 'pg_toast%%'
        """
 
        try:
            self.print_verbose(seq_query)
            self.curs.execute(seq_query)
        except Exception, e:
            print "[ERROR] - %s" % e
            sys.exit(1)
 
        desc = self.curs.description
        for row in self.curs.fetchall():
            yield self._to_dict(desc, row)
 
 
    # Increment the key value to the value of the sequence + 1
    def fixSequences(self):
        for seq in self.findSequences():
            print "Fixing sequence %s in table %s" % (seq['seq'], seq['table'])
            fix_query = "SELECT setval('%s', COALESCE((SELECT %s FROM %s ORDER BY %s DESC LIMIT 1), 0)+1)" % (seq['seq'], seq['field'], seq['table'], seq['field'])
            try:
                self.print_verbose(fix_query)
                self.curs.execute(fix_query)
            except Exception, e:
                print "[WARNING] - %s" % e
                pass
 
    def print_verbose(self, msg):
        if (True == self.options.verbose):
            print "[DEBUG] - %s" % msg
 
if __name__=='__main__':
    usage       = "Usage: %prog <options> [-v --verbose] [-u --username | -p --password \ -o --host | -d --database]"
    version     = "%prog v1.0\nDistributed under the LGPL2 License"
    description = "Increments all sequences in a PostgreSQL database"
    parser = OptionParser(usage=usage, version=version, description=description)
    parser.add_option("-v", "--verbose", action="store_true", dest="verbose", default=False, help="Enable extra output")
    parser.add_option("-o", "--host", action="store", dest="host", default="127.0.0.1", help="Database hostname/IP")
    parser.add_option("-u", "--username", action="store", dest="user", default="postgres", help="Database Username")
    parser.add_option("-p", "--password", action="store", dest="passwd", default="", help="Database Password")
    parser.add_option("-d", "--database", action="store", dest="db", default="template1", help="Database Name")
 
 
    try:
        (options, args) = parser.parse_args()
        obj = PgRepairman(options, parser)
        obj.fixSequences()
    except KeyboardInterrupt, e:
        sys.exit(1)

You can also download the script here.

Related posts:

  1. Extending PostgreSQL with Python
  2. Apache2 shortcuts