
Showing posts from August, 2021

Installing pgbadger on Linux for PostgreSQL log analysis

As DBAs, we often need to track PostgreSQL database performance by analyzing workloads and identifying bottlenecks. This is where pgbadger comes in – an open-source log analyzer for PostgreSQL. It parses the PostgreSQL log files and gives you a clear overview of the cluster, including query metrics (number of executions, query types, duration), session details, lock statistics, warnings, errors, and other event patterns. All of these details are presented as an HTML report viewable in your web browser.

Installation:

Prerequisite: pgbadger is written in pure Perl and uses a JavaScript library (flotr2) to draw graphs, so you need to ensure that a modern Perl distribution is available on your system. The charts are rendered by your web browser, so nothing additional is required. Use the command below to install perl if it is not already installed on your system: $ yum ins
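Besides Perl, pgbadger needs the server to log in a format it can parse. A minimal postgresql.conf fragment along the lines of what the pgbadger documentation recommends (verify the exact values against the version you install; log_min_duration_statement = 0 logs every statement and should be raised on busy systems):

```ini
# Logging settings suggested by the pgbadger documentation
log_min_duration_statement = 0
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
log_checkpoints = on
log_connections = on
log_disconnections = on
log_lock_waits = on
log_temp_files = 0
log_autovacuum_min_duration = 0
log_error_verbosity = default
```

After reloading the configuration, pgbadger can be pointed at the resulting log files to build the HTML report.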

InitDB uses in Postgresql

initdb creates a new PostgreSQL cluster. In Postgres, a cluster is a collection of databases managed by a single server instance. initdb creates the data directories in which the database data will live, generates the shared catalog tables, and creates the default databases:

* postgres – the default database users connect to
* template0 – user connections are not allowed; it is kept pristine to maintain consistency
* template1 – when we create a database, it is made as an exact copy of template1

If the data directory does not exist, initdb will create it, provided the user has permission on the parent directory. initdb initializes the database cluster with a default locale and character set encoding. The character set encoding, collation order (LC_COLLATE) and character set classes (LC_CTYPE, e.g., upper, lower, digit) can be set separately for a database when it is created. initdb determines those settings for the template1 database, which will serve as the default for all other databases. Before initdb initialises it, the data directory was em
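A sketch of a typical initdb invocation matching the description above. The path, encoding, and locale here are assumptions, and the command is echoed rather than executed so the sketch is safe to run on a machine without PostgreSQL installed:

```shell
# initdb must run as the postgres OS user against an empty directory
PGDATA=/tmp/demo_pgdata
mkdir -p "$PGDATA"
# The command you would actually run (shown, not executed):
echo initdb -D "$PGDATA" --encoding=UTF8 --locale=en_US.UTF-8
```

Locale and encoding chosen here become the template1 defaults described above; individual databases can still override them at CREATE DATABASE time.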

What is a Postmaster PID file in PostgreSQL?

As per the PostgreSQL architecture, the postmaster is the first background process that starts when the cluster is started. It is the parent process of all other background processes; if any of the other background processes fails, the postmaster automatically restarts the dead process. It also works as a listener: when a client sends a request to the server, it first lands at the postmaster, which validates the connection request and, once it passes, spawns a new backend connection – a process called "Pgprocess". Because the postmaster acts as a listener as well as the parent process, it is also called the "Supervisor".

ps -ef|grep postmaster
postgres 393 32729 0 22:28 pts/3 00:00:00 grep postmaster

Purpose of the postmaster PID file: The postmaster PID file resides in the data directory while the cluster is up and running. The purpose of this PID file is to capture: 1. the current postmaster process ID (PID), 2. the cluster data directory path and the postmaster start timestamp, 3. the port number, 4
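The fields the excerpt starts to enumerate can be seen by inspecting the file directly. Below is a mock postmaster.pid with the line layout used by recent PostgreSQL versions (all values are invented for illustration; in a real cluster the file lives in $PGDATA):

```shell
# Mock postmaster.pid: PID, data directory, start epoch, port,
# socket directory, listen address, shared-memory key, status
cat > /tmp/postmaster.pid <<'EOF'
1234
/var/lib/pgsql/11/data
1629000000
5432
/tmp
*
5432001  98765
ready
EOF
# Line 1 is the running postmaster's process ID:
head -1 /tmp/postmaster.pid
```

Tools like pg_ctl read this file to find the server process, which is why a stale copy left behind after a crash can block a restart.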

Data directory Layout in Postgres

One instance of Postgres is known as a cluster. On disk, a cluster is known as the data directory, which by default is located at "/var/lib/pgsql/11/data/".

1. global – contains cluster-wide database objects. The files here back shared catalog tables and other cluster-wide metadata, for example pg_filenode.map, a file that maps internal relations (mainly system catalog tables) to their on-disk filenode numbers. It is an internal file that is used by the server when the server is started.

2. base – the main directory, containing the databases; inside it we can see one subdirectory per database.

3. pg_tblspc – a directory that contains symbolic links to the tablespace locations, in case we have created tablespaces in our instance; each link points to the physical location of the tablespace. It will be
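The layout described above can be illustrated with a mocked directory tree, created under /tmp so the sketch runs without a real cluster (a real data directory contains more subdirectories than these, e.g. pg_wal, pg_xact):

```shell
# Mock the top-level subdirectories discussed in this post
PGDATA=/tmp/mock_pgdata
mkdir -p "$PGDATA"/base "$PGDATA"/global "$PGDATA"/pg_tblspc
ls "$PGDATA"
```

On a live system, the same listing against the real data directory (as the postgres user) shows these entries alongside the rest of the cluster's files.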

Backup Script PostgreSQL

Please use the below script to back up all databases using pg_dump if pg_dumpall is not an option. Exclude the databases you do not want to back up – for example, rdsadmin in the case of RDS, and template0 and template1, which are not user databases.

#!/bin/bash
## Backup script for Postgres
## The script takes a backup of each individual database and of global objects.
## It is suitable when the DB size is small and a backup of WAL files is not needed.
## Author - srinivas123raodba@gmail.com
set -x
DATE=`(date +"%Y-%m-%d")`
export PGPASSWORD=postgres
SIMPLE_DATE=`(date +"%Y%m%d")`
BACKUP_DIRECTORY=/opt/postgresql/postgres_backup/$SIMPLE_DATE
LOG_FILE_DIRECTORY=/opt/postgresql/postgres_backup/backup_log_file
BACKUP_USER=postgres
PG_DUMP=/opt/postgresql/pg96/bin/pg_dump
PORT=5432
PSQL=/opt/postgresql/pg96/bin/psql
DUMPALL=/opt/postgresql/pg96/bin/pg_dumpall
if [ ! -d $LOG_FILE_DIRECTORY ]
then
  /bin/mkdir -p $LOG_FILE_DIRECTORY
else
  echo "Log File Directory exists"
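The excerpt cuts off before the dump loop; a hedged sketch of how such a script typically continues, looping over every user database and dumping each with pg_dump. The paths and exclusion list below are assumptions mirroring the script's variables, and the psql call is guarded so the sketch runs even where PostgreSQL is not installed:

```shell
# Per-database dump loop (sketch; requires a running cluster to do real work)
PSQL=psql
PG_DUMP=pg_dump
PORT=5432
BACKUP_USER=postgres
BACKUP_DIRECTORY=/tmp/pg_backup_demo/$(date +"%Y%m%d")
mkdir -p "$BACKUP_DIRECTORY"
if command -v "$PSQL" >/dev/null 2>&1; then
  # List databases, skipping templates and RDS's internal database
  for DB in $("$PSQL" -U "$BACKUP_USER" -p "$PORT" -At \
        -c "SELECT datname FROM pg_database
            WHERE datname NOT IN ('template0','template1','rdsadmin')"); do
    # Custom-format dump, one file per database
    "$PG_DUMP" -U "$BACKUP_USER" -p "$PORT" -Fc "$DB" \
        > "$BACKUP_DIRECTORY/$DB.dump"
  done
fi
```

Globals (roles, tablespaces) are not included in per-database dumps, which is why such scripts usually also run pg_dumpall --globals-only.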

CLUSTER: IMPROVING POSTGRESQL PERFORMANCE

When designing a database application, the layout of data on disk is often neglected. However, the way data is stored by PostgreSQL can have a major performance impact, so it makes sense to take a look at what can be done to improve speed and throughput. In this post you will learn one of the most important tricks.

PostgreSQL: To sort or not to sort

To demonstrate the importance of the on-disk layout I have created a simple test set:

test=# CREATE TABLE t_test AS SELECT *
         FROM generate_series(1, 10000000);
SELECT 10000000
test=# CREATE TABLE t_random AS SELECT *
         FROM t_test
         ORDER BY random();
SELECT 10000000

Note that both data sets are absolutely identical. I have loaded 10 million rows into a simple table. However, in the first case the data has been sorted, then inserted: generate_series returns data in ascending order, and because the table is new, the data will be written to disk in that order. In the second case I decided to shuffle the dat
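Given the two tables above, a sketch of where a demonstration like this is typically headed: measuring physical ordering via pg_stats.correlation, then rewriting the badly ordered table in index order with CLUSTER. The index name below is an assumption; the column produced by SELECT * FROM generate_series(...) is named generate_series:

```sql
-- Index the shuffled table and gather statistics
CREATE INDEX idx_random ON t_random (generate_series);
ANALYZE t_random;

-- correlation near 1.0 means rows are physically ordered by this
-- column; near 0 means they are scattered across the table
SELECT correlation FROM pg_stats
 WHERE tablename = 't_random' AND attname = 'generate_series';

-- Rewrite the table in index order (takes an exclusive lock)
CLUSTER t_random USING idx_random;
```

CLUSTER is a one-time rewrite: subsequent inserts are not kept in order, so heavily updated tables drift back toward a low correlation over time.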