Skip to main content

Data directory Layout in Postgres

One instance of Postgres is known as a cluster .one cluster is on the disk known as a data directory


by default is located on “/var/lib/psql/11/data/”, this is the default location for the data directory


1. Global


-It contains cluster-wide database objects. we some files here each files some kind of tables or user info or metadata files which is related to the dictionary files or tables which pg_filenode.map = A file that maps internal relations (mainly system catalogue tables) to their OIDr here proper node file mapping info stored over here.

-It is an internal file that is used by the server when the server is started.




PostgreSQL Data directory Layout

2. Base


-It is the main directory and contains DB, inside the directory, we can see dB directories and its one directory per DB


3.pg_tblsc


-This is a directory that contains symbolic links to the tablespace locations, in case we have created the tablespace in our instance. Then link to the physical location of the tablespace It will be recreated inside the pg_tblsc directory


-We can see the links in case we have created the tablespaces.in case we used the default tablespace instead of creating new user tablespaces, we can’t see any links inside the directories


4.pg_wal


-It contains transaction logs or wal segments

-These files are binary-files


5.pg_log


-Which contains our start-up logs


6.logs


-Which contains the error log, All the error logs are located under these directories

-It is a human-readable file which is in text formats


7.Status Directories


-These are the multiple directories that start with pg_*, a lot of status data and lots of transaction info, temporary statistic info.

-It is just required the information of the server and content of the directories are temporary


8.Configuration Files


1.PostgreSQL.conf — one of the most parameter files inside the directory

2.pg_hba.conf — host-based access control file or host-based authentication file

3.pg_ident.conf — which is used for OS authentication

4.postgresql.auto.conf

5. postmaster. PID — which contains the postmaster details and is only available when the cluster is up and running

6.postmaster.opts — the option used to start the server

Comments

Popular posts from this blog

PostgreSQL Vacuum and Vacuum full are not two different processes

  PostgreSQL’s   VACUUM   and   VACUUM FULL   are not separate processes but rather different operational modes of the same maintenance command. Here’s why: Core Implementation Both commands share the same underlying codebase and are executed through the  vacuum_rel()  function in PostgreSQL’s source code ( src/backend/commands/vacuum.c ). The key distinction lies in the  FULL  option, which triggers additional steps: Standard  VACUUM : Removes dead tuples (obsolete rows) and marks space reusable  within PostgreSQL Updates the visibility map to optimize future queries Runs concurrently with read/write operations VACUUM FULL : Rewrites the entire table into a new disk file, compressing it and reclaiming space for the  operating system Rebuilds all indexes and requires an  ACCESS EXCLUSIVE  lock, blocking other operations Key Differences in Behavior Aspect Standard VACUUM VACUUM FULL Space Reclamation Internal reuse onl...

PostgreSQL Health Check - Performance Audit and Recommendations

   PostgreSQL Server Capacity Planning and Optimization: Conduct a thorough assessment of current and projected workload demands on the database server's critical resources. This evaluation should cover key factors such as CPU utilization, RAM consumption, storage requirements, I/O performance, and network bandwidth usage. Consider both short-term fluctuations and long-term usage patterns to accurately forecast future needs and potential bottlenecks. Collect and analyze performance metrics over extended periods, including peak and off-peak hours, to gain a comprehensive understanding of server behavior. Use appropriate monitoring tools to gather detailed data on query execution times, resource utilization patterns, and system responsiveness under varying loads. Identify recurring performance issues, resource contention points, or capacity limitations that may affect the overall efficiency of the PostgreSQL environment. Based on this analysis, develop actionable recommendations...

PostgreSQL - Architecture

When you start PostgreSQL`s instance (via pg_ctl start) the main process of the databases is started. In the past versions it was known as the postmaster process and now days it is called postgres. When this processes is started it allocates memory for two main things : shared memory and other background processes. The shared memory is a memory that the database uses mainly for cashing. Its main goal is to reduce the I/O operations and as a result of that improve the performance of the database. The other processes that the postgres process starts are responsible for the ongoing work of the database.  PostgreSQL`s architecture is based on a client/server communication. In the server side there is a server process that manages the files, database operations, connection and so on.. On the client side there is a tool/program which can be a graphical tool (gui)  like pgAdmin or a command line tool(cli) like psql  that is repnsible for connection the database and all...