PostgreSQL - how to run VACUUM from code outside transaction block?


Question

I am using Python with psycopg2 and I'm trying to run a full VACUUM after a daily operation which inserts several thousand rows. The problem is that when I try to run the VACUUM command within my code I get the following error:

psycopg2.InternalError: VACUUM cannot run inside a transaction block

How do I run this from the code outside a transaction block?

If it makes a difference, I have a simple DB abstraction class, a subset of which is displayed below for context (not runnable, exception-handling and docstrings omitted and line spanning adjustments made):

class db(object):
    def __init__(dbname, host, port, user, password):
        self.conn = psycopg2.connect("dbname=%s host=%s port=%s \
                                      user=%s password=%s" \
                                      % (dbname, host, port, user, password))

        self.cursor = self.conn.cursor()

    def _doQuery(self, query):
        self.cursor.execute(query)
        self.conn.commit()

    def vacuum(self):
        query = "VACUUM FULL"
        self._doQuery(query)
1
32
6/19/2009 11:26:33 AM

Accepted Answer

After more searching I have discovered the isolation_level property of the psycopg2 connection object. It turns out that changing this to 0 will move you out of a transaction block. Changing the vacuum method of the above class to the following solves it. Note that I also set the isolation level back to what it previously was just in case (seems to be 1 by default).

def vacuum(self):
    old_isolation_level = self.conn.isolation_level
    self.conn.set_isolation_level(0)
    query = "VACUUM FULL"
    self._doQuery(query)
    self.conn.set_isolation_level(old_isolation_level)

This article (near the end on that page) provides a brief explanation of isolation levels in this context.

53
6/19/2009 12:18:49 PM

While vacuum full is questionable in current versions of postgresql, forcing a 'vacuum analyze' or 'reindex' after certain massive actions can improve performance, or clean up disk usage. This is postgresql specific, and needs to be cleaned up to do the right thing for other databases.

from django.db import connection
# Much of the proxy is not defined until this is done
force_proxy = connection.cursor()
realconn=connection.connection
old_isolation_level = realconn.isolation_level
realconn.set_isolation_level(0)
cursor = realconn.cursor()
cursor.execute('VACUUM ANALYZE')
realconn.set_isolation_level(old_isolation_level)

Unfortunately the connection proxy provided by django doesn't provide access to set_isolation_level.


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon