PostgreSQL VACUUM and VACUUM FULL are not two different processes

 PostgreSQL’s VACUUM and VACUUM FULL are not separate processes but rather different operational modes of the same maintenance command. Here’s why:

Core Implementation

Both forms are parsed as the same VACUUM statement and are processed table by table through the vacuum_rel() function in PostgreSQL’s source code (src/backend/commands/vacuum.c). The key distinction lies in the FULL option, which decides what that function does with the table:

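  • Standard VACUUM:
    • Removes dead tuples (obsolete row versions) and records their space in the free space map so that later inserts and updates on the same table can reuse it
    • Does not normally return space to the operating system, except that empty pages at the end of the table may be truncated
    • Updates the visibility map, which enables index-only scans and lets later vacuums skip pages that are already all-visible
    • Takes a SHARE UPDATE EXCLUSIVE lock, so it runs concurrently with normal reads and writes
  • VACUUM FULL:
    • Rewrites the entire table into a new disk file, compacting it and returning the freed space to the operating system
    • Needs enough free disk space for the new copy, because the old file is only removed once the rewrite completes
    • Rebuilds all indexes and requires an ACCESS EXCLUSIVE lock, which blocks all other access to the table, including reads, until it finishes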
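
The difference in space reclamation can be observed on a throwaway table. The following is a minimal sketch, assuming a scratch database where creating and dropping a table is acceptable; the table name vacuum_demo is hypothetical, and the exact sizes depend on the PostgreSQL version and settings, so no output is shown.

```sql
-- Hypothetical scratch table used only for this demonstration.
CREATE TABLE vacuum_demo (id integer PRIMARY KEY, payload text);

-- Load rows, then delete every second one so the dead tuples are
-- spread across the whole table rather than clustered at the end.
INSERT INTO vacuum_demo
SELECT g, repeat('x', 200) FROM generate_series(1, 100000) AS g;
DELETE FROM vacuum_demo WHERE id % 2 = 0;

-- Size before any vacuuming.
SELECT pg_size_pretty(pg_total_relation_size('vacuum_demo')) AS before_vacuum;

-- Standard VACUUM: frees space for reuse inside the table.
VACUUM vacuum_demo;
SELECT pg_size_pretty(pg_total_relation_size('vacuum_demo')) AS after_vacuum;

-- VACUUM FULL: rewrites the table and its indexes into new files.
VACUUM FULL vacuum_demo;
SELECT pg_size_pretty(pg_total_relation_size('vacuum_demo')) AS after_vacuum_full;

DROP TABLE vacuum_demo;
```

The size would be expected to stay roughly unchanged after the standard VACUUM and to shrink after VACUUM FULL. VACUUM cannot run inside a transaction block, so the statements should be run in autocommit mode, for example one by one in psql.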

Key Differences in Behavior

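Aspect             | Standard VACUUM                                     | VACUUM FULL
-------------------+-----------------------------------------------------+-------------------------------------------
Space Reclamation  | Reuse within the table (trailing pages may shrink)  | OS-level space release
Locking            | SHARE UPDATE EXCLUSIVE (reads and writes continue)  | ACCESS EXCLUSIVE (full table lock)
Performance Impact | Lightweight, incremental                            | Heavy, resource-intensive
Use Case           | Routine maintenance                                 | Severe table bloat remediation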
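
The locking difference can be checked from a second session while a vacuum is running. This is a minimal sketch, assuming a table named my_table (a hypothetical name) that is being vacuumed at that moment:

```sql
-- Run in a second session while VACUUM or VACUUM FULL is in progress
-- on the hypothetical table my_table.
SELECT a.pid,
       l.mode,       -- lock mode held or requested on the table
       l.granted,    -- false means the session is still waiting
       a.query
FROM pg_locks AS l
JOIN pg_stat_activity AS a ON a.pid = l.pid
WHERE l.locktype = 'relation'
  AND l.relation = 'my_table'::regclass;
```

A standard VACUUM should appear with the mode ShareUpdateExclusiveLock, while VACUUM FULL should appear with AccessExclusiveLock. Other sessions queued behind a VACUUM FULL show up with granted set to false.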

 

Why They Aren’t Separate Processes

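  • Shared Code Path: Both enter through vacuum_rel(), which opens and locks the table and then branches. Without FULL it runs the regular (lazy) heap vacuum that removes dead tuples in place. With FULL it calls cluster_rel(), the table-rewrite routine also used by CLUSTER, which copies the surviving tuples into a new file
  • Configuration Integration: Both run under the same server settings for maintenance work, such as maintenance_work_mem. Autovacuum parameters like autovacuum_vacuum_scale_factor only control when autovacuum workers launch a standard VACUUM; autovacuum never runs VACUUM FULL, which has to be issued manually
  • Unified Command Structure: FULL is an option of the VACUUM statement rather than a standalone tool, so the same command name is used with or without it.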
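
For reference, the two forms differ only in the options passed to the same statement. A short sketch, with my_table as a hypothetical table name:

```sql
-- Standard VACUUM on one table.
VACUUM my_table;

-- The same command with the FULL option.
VACUUM FULL my_table;

-- Parenthesized option list: FULL is one option among several.
VACUUM (FULL, VERBOSE, ANALYZE) my_table;
```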

When to Use Each

  • Standard VACUUM: Routine maintenance, normally handled automatically by autovacuum, to prevent bloat from MVCC dead tuples
  • VACUUM FULL: Rarely, and in a maintenance window, for extreme cases where table size has grown uncontrollably because updates and deletes went unvacuumed for a long time
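
One way to judge which tables need attention is to look at the statistics PostgreSQL already collects. The query below is a sketch rather than a definitive bloat check: n_dead_tup is an estimate, and the LIMIT of 10 is an arbitrary choice for illustration.

```sql
-- Tables with the most dead tuples, plus when they were last vacuumed.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

A table whose dead tuple count stays high while last_autovacuum is old or empty is a candidate for a manual standard VACUUM first; VACUUM FULL is worth considering only if the table remains far larger than its live data justifies.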

In summary, while their outcomes differ significantly, VACUUM and VACUUM FULL are part of a single maintenance framework, differentiated primarily by the aggressiveness of space reclamation and locking behavior.
