Skip to main content

Huge archive generation with PostgreSQL cluster??

 Recently I found huge archive logs generation on one of the servers(for my client). Around 3500 files per day and it needs 3500 * 16 MB = 54GB space every day and space is low on the server. First thing came to my mind was checkpoint parameters, I had a look at them and they were at default. Hmm, so had changed them as below:


checkpoint_segments = 180
checkpoint_timeout = 30min
checkpoint_completion_target = 0.9 

Monitored for couple of days and found that archive logs generation decreased to 50%(around 1700 files). 

With the above parameter settings, archive log generation should be minimal. However if not, Here are some general suggestions to look at:

-- Make sure that you don't UPDATE a row to the same values it already has. The row will be logged even if old and new values match. One way to handle this is to make sure that any UPDATE statements which might set a column to its existing value include in the WHERE clause a test that oldvalue IS DISTINCT FROM new value. Or you can consider creating trigger on problem tables using the Postgres built-in function suppress_redundant_updates_trigger (you might want to look here: http://www.postgresql.org/docs/9.3/static/functions-trigger.html) 

-- Use one UPDATE statement instead of several where possible. If updates to different columns or from multiple processing steps can be combined, this can reduce WAL logging.

-- Consider using UNLOGGED or TEMPORARY tables where possible. For example, if data is inserted into the database and updated right away, perform the initial steps in an UNLOGGED or TEMPORARY table, and INSERT into the main table once the data has been prepared.

-- Identify and eliminate unused indexes. Index changes are WAL-logged so that indexes can be recovered on a crash. The storage space needed for indexes can be more than the storage space needed for the data itself, with a corresponding increase in generated WAL.

Comments

Popular posts from this blog

PostgreSQL Vacuum and Vacuum full are not two different processes

  PostgreSQL’s   VACUUM   and   VACUUM FULL   are not separate processes but rather different operational modes of the same maintenance command. Here’s why: Core Implementation Both commands share the same underlying codebase and are executed through the  vacuum_rel()  function in PostgreSQL’s source code ( src/backend/commands/vacuum.c ). The key distinction lies in the  FULL  option, which triggers additional steps: Standard  VACUUM : Removes dead tuples (obsolete rows) and marks space reusable  within PostgreSQL Updates the visibility map to optimize future queries Runs concurrently with read/write operations VACUUM FULL : Rewrites the entire table into a new disk file, compressing it and reclaiming space for the  operating system Rebuilds all indexes and requires an  ACCESS EXCLUSIVE  lock, blocking other operations Key Differences in Behavior Aspect Standard VACUUM VACUUM FULL Space Reclamation Internal reuse onl...

Job scheduler for PostgreSQL "pg_cron"

What is pg_cron   : -   pg_cron is a simple cron-based job scheduler for   PostgreSQL (9.5 or higher)   that runs inside the database as an extension. It uses the same syntax as regular cron, but it allows you to schedule PostgreSQL commands directly from the database . Why We need it ? Running periodic maintenance jobs or removing old data is a common requirement in PostgreSQL. A simple way to achieve this is to configure cron or another external daemon to periodically connect to the database and run a command. Let's see how it's works  Step 1 :-  For implementing/Installation of pg_cron you need to download source code from git Dowload link  export PATH=/usr/local/pgsql/bin:$PATH wget https://github.com/citusdata/pg_cron/archive/master.zip unzip master cd pg_cron-master/ make make install    Step 2 : - To start the pg_cron background worker when PostgreSQL starts, you need to add pg_cron to  shared_preload_libraries   in post...

All about pg_hba.conf(authentication methods- Postgresql)

  pg_hba.conf is the PostgreSQL access policy configuration file, which is located in the /var/lib/pgsql/10/data/ directory (PostgreSQL10) by default. The configuration file has 5 parameters, namely: TYPE (host type), DATABASE (database name), USER (user name), ADDRESS (IP address and mask), METHOD (encryption method) host all all 192.168.109.103/22 md5 host dbName user 192.168.109.106/22 trust Modify the server-side pg_hba.conf file Make the shell can connect to the postgres database secretly: Modify the authentication file $PGDATA/pg_hba.conf, add the following lines, and reload to make the configuration take effect immediately. host pankajconnect postgresql 192.168.8.103/32 trust Reload to take effect: pg_ctl reload -D $PGDATA Examples: 1. Allow local login to the database using PGAdmin3, database address  localhost, user user1, database user1db: host user1db user1 127.0.0.1/32 md5 2. Allow 10.1.1.0~10.1.1.255 network segments to log in to the database: host all all 10.1.1....