AWS Database Blog

Improve PostgreSQL performance using the pgstattuple extension

As businesses continue to generate and store vast amounts of data, the need for efficient and reliable database management systems becomes increasingly important. PostgreSQL, an open source relational database management system (RDBMS), has established itself as a powerful solution for handling complex data requirements. One of the key strengths of PostgreSQL lies in its extensibility. Through a rich ecosystem of extensions and plugins, developers can enhance a database’s functionality to meet specific requirements. These extensions range from spatial data support and full-text search capabilities to advanced data types and performance optimization tools. Although PostgreSQL offers a wide range of features and capabilities, one extension that often goes overlooked is pgstattuple—a tool that can provide significant value for gaining insights into the internal workings of PostgreSQL databases.

In this post, we explore pgstattuple in depth—what insights it offers, how to use it to diagnose issues in HAQM Aurora PostgreSQL-Compatible Edition and HAQM Relational Database Service (HAQM RDS) for PostgreSQL, and best practices for harnessing its capabilities.

Overview of pgstattuple

The pgstattuple extension provides a set of functions to query detailed statistics at the tuple level within PostgreSQL tables and indexes. This offers visibility into the physical storage layer that standard PostgreSQL statistics views don’t provide.

Some of the table and index level metrics exposed by pgstattuple include:

  • tuple_count – Number of live tuples
  • dead_tuple_count – Number of dead tuples not yet cleaned up
  • tuple_len – Total length of live tuples in bytes
  • free_space – Total free space available in bytes
  • free_percent – Percentage of free space; higher values indicate more bloat
  • dead_tuple_len – Total length of dead tuples in bytes
  • dead_tuple_percent – Percentage of space occupied by dead tuples

These metrics are more than just numbers: they’re an early warning system for database health and performance issues. By monitoring these statistics, you can proactively identify storage issues that might be silently impacting your database performance. Whether it’s excessive table bloat consuming disk space or index fragmentation slowing down queries, pgstattuple helps you spot these issues before they become critical problems.

Using pgstattuple in Aurora and HAQM RDS

Both Aurora and HAQM RDS support the pgstattuple extension. To enable it, you first need to create the extension in your database using the CREATE EXTENSION pgstattuple; command. After it’s enabled, you can use functions like pgstattuple(relation) to get details on the physical storage used by a table, including the number of pages, live tuples, dead tuples, and more. The pgstattuple_approx(relation) function provides a faster estimate of these metrics. You can also get index statistics using pgstatindex(index). Analyzing this low-level data can help you identify bloated tables that need vacuuming, find tables with high dead tuple ratios that could benefit from being rewritten, and optimize your database’s physical storage utilization.
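
For example, the following statements enable the extension and call each of these functions. The table and index names assume the pgbench sample schema that we use later in this post:

-- Enable the extension (run as a user with sufficient privileges)
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Exact tuple-level statistics for a table (performs a full table scan)
SELECT * FROM pgstattuple('public.pgbench_accounts');

-- Faster, approximate statistics for the same table
SELECT * FROM pgstattuple_approx('public.pgbench_accounts');

-- B-tree index statistics
SELECT * FROM pgstatindex('public.pgbench_accounts_pkey');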

The output of pgstattuple provides actionable insights for monitoring, maintenance, and performance tuning, as discussed in the following sections.

Detecting and managing table bloat

Identifying and managing bloat is one of the most useful applications of pgstattuple for PostgreSQL tables. Bloat arises when UPDATE and DELETE operations leave behind unused space not automatically reclaimed. PostgreSQL maintains data consistency through a Multiversion Concurrency Control (MVCC) model, where each SQL statement sees a snapshot of data from some prior time, regardless of the underlying data’s current state. This prevents statements from viewing inconsistent data due to concurrent transactions updating the same rows, providing transaction isolation per database session. Unlike traditional locking methods, MVCC minimizes lock contention, allowing reasonable multiuser performance.

When deleting a row in MVCC systems like PostgreSQL, the row isn’t immediately removed from data pages. Instead, it’s marked as deleted or expired for the current transaction but remains visible to transactions viewing an older snapshot, avoiding conflicts. As transactions complete, these dead or expired tuples are expected to eventually be vacuumed and the space is reclaimed. In PostgreSQL, an UPDATE operation is equivalent to a combination of DELETE and INSERT. When a row is updated, PostgreSQL marks the old version as expired (like a DELETE) but keeps it visible to older transaction snapshots. It then inserts a new version of the row with the updated values (like an INSERT). Over time, the expired row versions accumulate until the VACUUM process removes them, reclaiming space. This approach enables PostgreSQL’s MVCC model, providing snapshot isolation without explicit locking during updates.

PostgreSQL’s autovacuum is an automated maintenance process that reclaims storage occupied by dead tuples and updates statistics used by the query planner. The autovacuum process triggers for a table when its maximum transaction age crosses autovacuum_freeze_max_age, or when the number of dead tuples exceeds the threshold autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * number of tuples. In this formula, autovacuum_vacuum_threshold represents the minimum number of updated or deleted tuples needed to initiate cleanup, and autovacuum_vacuum_scale_factor is the fraction of the table size added to the threshold to determine when maintenance should occur. If autovacuum is unable to clean up dead tuples, you may need to handle highly bloated tables manually.
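
To see how these settings translate into a concrete trigger point, you can query them directly. The following sketch computes the approximate dead tuple count at which autovacuum vacuums the pgbench_accounts table, assuming the global settings apply (per-table storage parameters, if set, override them):

-- Current global autovacuum trigger settings
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_vacuum_threshold',
               'autovacuum_vacuum_scale_factor',
               'autovacuum_freeze_max_age');

-- Approximate dead tuple count that triggers autovacuum for one table,
-- based on the planner's row estimate in pg_class.reltuples
SELECT current_setting('autovacuum_vacuum_threshold')::int
       + current_setting('autovacuum_vacuum_scale_factor')::float8 * c.reltuples
       AS autovacuum_dead_tuple_trigger
FROM pg_class AS c
WHERE c.oid = 'public.pgbench_accounts'::regclass;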

Dead tuples are stored alongside live tuples in data pages. Bloat can also come from free space within pages, for example after autovacuum has cleaned up dead tuples. When a table is bloated, queries must scan more pages to read the same number of live rows, which increases I/O and slows queries. Highly bloated tables cause database workloads to consume unnecessary read I/O, impacting application performance, so you may need to clean up bloat manually if autovacuum can’t keep up.
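
Before reaching for pgstattuple, a quick look at the cumulative statistics views can show where dead tuples are piling up. The following lightweight check against pg_stat_user_tables lists the tables with the most dead tuples along with the last time vacuum and autovacuum ran; a large n_dead_tup combined with an old or NULL last_autovacuum suggests autovacuum isn't keeping up on that table:

-- Tables with the most dead tuples and their most recent vacuum activity
SELECT schemaname, relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;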

Before we dive into analyzing table bloat with pgstattuple, make sure you have the following prerequisites in place:

  • Access to an HAQM RDS for PostgreSQL or Aurora PostgreSQL instance.
  • A client machine with psql installed and configured to connect to your database.
  • Permissions to create tables and install extensions in your PostgreSQL environment.
  • The pgstattuple extension installed on your database instance. If you haven’t installed it yet, run CREATE EXTENSION pgstattuple; as a user with sufficient privileges.
  • The pgbench_accounts table for sample data. If you don’t have this table already, run pgbench -i -s 10 to initialize a pgbench schema at scale factor 10, which creates pgbench_accounts along with the other pgbench tables.
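
With these prerequisites in place, you can confirm that the sample data and the extension are available before continuing. The following is a quick check, assuming you initialized the pgbench schema in the same database (at scale factor 10, pgbench_accounts contains 1,000,000 rows):

-- Confirm the sample data is loaded (expect 1000000 rows at scale factor 10)
SELECT count(*) FROM pgbench_accounts;

-- Confirm the pgstattuple extension is installed
SELECT extname, extversion FROM pg_extension WHERE extname = 'pgstattuple';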

While pgstattuple provides comprehensive table bloat analysis, it can be resource-intensive. We recommend first using lightweight bloat estimation queries, such as those available on the PostgreSQL Wiki, and turning to pgstattuple when more detailed analysis is required. The following example demonstrates how to use pgstattuple to analyze bloat in a table.

Create the table pgbench_accounts_test with 10,000 records:


bench=> create table pgbench_accounts_test as (select * from pgbench_accounts limit 10000);
SELECT 10000

In this example, querying pgstattuple returns a dead tuple count of zero, and pg_total_relation_size reports the table size as 1672 kB:

SELECT * from pgstattuple('pgbench_accounts_test');
-[ RECORD 1 ]------+--------
table_len          | 1343488
tuple_count        | 10000
tuple_len          | 1210000
tuple_percent      | 90.06
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 18896
free_percent       | 1.41

SELECT pg_size_pretty(pg_total_relation_size('public.pgbench_accounts_test'));
 pg_size_pretty
----------------
 1672 kB
(1 row)

To demonstrate the usage of pgstattuple, we turn off autovacuum (not recommended in a production environment), and update 2,500 records:

SHOW autovacuum;
autovacuum
off

UPDATE pgbench_accounts_test SET bid=2 WHERE aid <2501;
UPDATE 2500

Now, the pgstattuple data for this table shows that the 2,500 old row versions are counted as dead tuples:

SELECT * from pgstattuple('pgbench_accounts_test');
-[ RECORD 1 ]------+--------
table_len          | 1679360
tuple_count        | 10000
tuple_len          | 1210000
tuple_percent      | 72.05
dead_tuple_count   | 2500
dead_tuple_len     | 302500
dead_tuple_percent | 18.01
free_space         | 23620
free_percent       | 1.41

bench=> SELECT pg_size_pretty(pg_total_relation_size('public.pgbench_accounts_test'));
 pg_size_pretty
----------------
 1672 kB
(1 row)

Bloat percentage in PostgreSQL refers to the ratio of space that could be reclaimed in a table or index to its total size. You can calculate it from pgstattuple data as follows:

bloat_percentage = ((table_len - tuple_len) / table_len) * 100
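
As a quick check, you can compute this value directly from pgstattuple. With the post-UPDATE values shown earlier (a table_len of 1679360 and a tuple_len of 1210000), the following query returns roughly 27.95. Keep in mind that some per-page overhead is always present, so even a freshly loaded table doesn't report 0:

SELECT round(100 * (table_len - tuple_len)::numeric / NULLIF(table_len, 0), 2)
       AS bloat_percentage
FROM pgstattuple('pgbench_accounts_test');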

A bloat_percentage value exceeding 30%–40% often indicates problematic bloat requiring attention. To clean up bloat, use the VACUUM command:

VACUUM pgbench_accounts_test;

Let’s inspect the pgstattuple data after the VACUUM operation:

SELECT * from pgstattuple('pgbench_accounts_test');
-[ RECORD 1 ]------+--------
table_len          | 1679360
tuple_count        | 10000
tuple_len          | 1210000
tuple_percent      | 72.05
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 353220
free_percent       | 21.03

The VACUUM operation resets dead_tuple_count to 0. The reclaimed space remains allocated to the table and is available for future inserts and updates in the same table, which is why table_len stays the same after VACUUM. To reclaim the disk storage occupied by bloat, there are two options:

  • VACUUM FULL – VACUUM FULL can reclaim more disk space but runs much more slowly. It requires an ACCESS EXCLUSIVE lock on the table it’s working on, and therefore can’t be done in parallel with other uses of the table. While VACUUM FULL operations are generally discouraged in production environments, they might be acceptable during scheduled maintenance windows where downtime is planned and approved.
  • pg_repack – pg_repack is a PostgreSQL extension that efficiently removes table and index bloat while maintaining online availability. Unlike CLUSTER and VACUUM FULL, it minimizes exclusive lock duration during processing, offering performance comparable to CLUSTER. While pg_repack allows for online table and index reorganization with minimal application downtime, it’s important to consider its limitations. The extension still requires brief exclusive locks during operation and may struggle to complete on tables with high-velocity transactions, potentially impacting database performance. For heavily used tables where full repacking is challenging, consider the alternative of index-only repacking. Best practices include thorough testing in a non-production environment, scheduling during low-traffic periods, and having a monitoring and rollback plan in place. Despite its benefits, you should be aware of the potential risks and plan accordingly when implementing pg_repack in your PostgreSQL environment.

The VACUUM FULL operation reduces the table_len:

VACUUM FULL pgbench_accounts_test;
VACUUM
SELECT * from pgstattuple('pgbench_accounts_test');
-[ RECORD 1 ]------+--------
table_len          | 1343488
tuple_count        | 10000
tuple_len          | 1210000
tuple_percent      | 90.06
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 18896
free_percent       | 1.41

The VACUUM FULL operation returns the reclaimed space to the storage layer and reduces table_len. The following query identifies bloat for the top 10 largest tables in your database using pgstattuple:

SELECT relname,
       (table_len - tuple_len) AS bloat_size,
       ROUND(100 * (table_len - tuple_len)::numeric / NULLIF(table_len::numeric, 0), 2) AS bloat_percentage
FROM (
    SELECT relname,
           (pgstattuple(oid)).table_len,
           (pgstattuple(oid)).tuple_len
    FROM pg_class
    WHERE relname IN (
        SELECT relname
        FROM pg_statio_user_tables
        WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
        ORDER BY pg_total_relation_size(relid) DESC,
                 pg_relation_size(relid) DESC
        LIMIT 10
    )
) AS subquery
ORDER BY bloat_percentage DESC;

        relname        | bloat_size  | bloat_percentage
-----------------------+-------------+------------------
 pgbench_accounts_test |      469360 |            27.95
 pgbench_tellers       |     1307008 |            26.64
 pgbench_history       |     2081296 |            25.15
 pgbench_accounts      | 27945461248 |            18.76
 cust_accounts         |      267097 |             9.94
 seller_accounts       |     2661689 |             9.91
 prod_accounts         |     2661689 |             9.91
 prod_teller           |     2661689 |             9.91
 pgbench_accounts_1    |  6647546624 |             9.90
 stage_accounts        |    27659129 |             9.90
(10 rows)

pgstattuple performs a full table scan and can cause higher consumption of instance resources such as CPU and I/O, which makes it slow for large tables. Alternatively, the pgstattuple_approx(relation) function provides a faster estimate of these metrics. Although it’s less resource-intensive than pgstattuple, it can still be heavy on very large tables or busy systems, so consider running it during off-peak hours or on a replica if one is available.
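
For example, the following call returns approximate statistics for the same table. pgstattuple_approx skips pages that the visibility map marks as all-visible, so it reads far fewer pages on a well-vacuumed table, and columns whose values are estimated rather than exact carry an approx_ prefix in the output:

-- Faster, approximate statistics; dead tuple figures stay exact because
-- pages skipped via the visibility map contain no dead tuples
SELECT * FROM pgstattuple_approx('pgbench_accounts_test');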

Automating manual vacuum

Regularly monitoring for bloat enables you to identify maintenance needs proactively, before performance suffers. The bloat metrics can also help fine-tune autovacuum settings to clean up space more aggressively if needed. After you identify the top 10 bloated tables, you can automate the VACUUM operation by using the pg_cron extension. pg_cron is a cron-based job scheduler for PostgreSQL that runs inside the database as an extension. It uses the same syntax as regular cron, but it allows you to schedule PostgreSQL commands directly from the database. The following code is an example of using the pg_cron function cron.schedule to set up a job to use VACUUM on a specific table every day at 23:00 (GMT):

SELECT cron.schedule('manual vacuum', '0 23 * * *', 'VACUUM pgbench_accounts_test');
 schedule
----------
1
(1 row)
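
pg_cron also makes it straightforward to review or remove scheduled jobs. For example, you can list the jobs it manages and unschedule one by its job ID:

-- List scheduled jobs (the VACUUM job created above has jobid 1)
SELECT jobid, schedule, command FROM cron.job;

-- Remove a job by its ID when it's no longer needed
SELECT cron.unschedule(1);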

After the scheduled job runs, you can check its history in the cron.job_run_details table with the following query:

SELECT * FROM cron.job_run_details;
 jobid | runid | job_pid | database | username  | command                      | status    | return_message | start_time                    | end_time
-------+-------+---------+----------+-----------+------------------------------+-----------+----------------+-------------------------------+-------------------------------
     1 |     1 |    3395 | postgres | adminuser | vacuum pgbench_accounts_test | succeeded | VACUUM         | 2024-05-11 21:10:00.050386+00 | 2024-05-11 21:10:00.072028+00
(1 row)

Diagnosing and resolving index bloat

Like tables, indexes in PostgreSQL can experience bloat that wastes space and impacts performance. pgstattuple helps you detect index bloat through its pgstatindex function.

The following query shows the index identifier, total index size in bytes and average leaf density:

SELECT i.indexrelid::regclass AS index,
       s.index_size,
       s.avg_leaf_density
FROM pg_index AS i
CROSS JOIN LATERAL pgstatindex(i.indexrelid) AS s
WHERE indrelid = 'myschema.mytable'::regclass;

Average leaf density is the percentage of useful data in the leaf pages of the index. Significantly bloated indexes can be rebuilt with REINDEX or pg_repack to eliminate dead space and restore optimal performance. It’s recommended to periodically check busy, high-churn indexes for bloat.
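
For example, a heavily bloated index can be rebuilt online with REINDEX CONCURRENTLY, which is available in PostgreSQL 12 and later; the index name here is a placeholder:

-- Rebuild the index without blocking concurrent reads and writes;
-- this takes longer than a plain REINDEX and can't run inside a transaction block
REINDEX INDEX CONCURRENTLY myschema.mytable_idx;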

Assessing index fragmentation

Another valuable use of pgstattuple is identifying index fragmentation issues. Fragmentation occurs when index pages become scattered due to deletions, updates, and page splits. Heavily fragmented indexes have more dead tuples occupying space inefficiently.

We can check the degree of fragmentation using leaf_fragmentation:

SELECT * FROM pgstatindex('<index_name>');

-[ RECORD 1 ]------+-------
version            | 2
tree_level         | 1
index_size         | 196608
root_block_no      | 3
internal_pages     | 1
leaf_pages         | 22
empty_pages        | 0
deleted_pages      | 0
avg_leaf_density   | 64.48
leaf_fragmentation | 13.64

If leaf_fragmentation is high, the index is likely fragmented and a REINDEX should be considered. Rebuilding eliminates fragmentation and associated performance overhead.
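
To find fragmented indexes in bulk rather than one at a time, you can combine pgstatindex with the system catalogs. The following sketch flags B-tree indexes on a given table whose leaf_fragmentation exceeds an illustrative 30% threshold; pgstatindex supports only B-tree indexes, so the query filters on the access method (the table name and threshold are placeholders):

SELECT i.indexrelid::regclass AS index,
       s.leaf_fragmentation,
       s.avg_leaf_density
FROM pg_index AS i
JOIN pg_class AS c ON c.oid = i.indexrelid
JOIN pg_am AS am ON am.oid = c.relam AND am.amname = 'btree'
CROSS JOIN LATERAL pgstatindex(i.indexrelid) AS s
WHERE i.indrelid = 'myschema.mytable'::regclass
  AND s.leaf_fragmentation > 30
ORDER BY s.leaf_fragmentation DESC;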

Best practices for using pgstattuple

Consider the following best practices when using pgstattuple for PostgreSQL monitoring and maintenance:

  • To estimate bloat in PostgreSQL tables, start with the check_postgres queries referenced on the PostgreSQL Wiki.
  • Use the pgstattuple extension to analyze the physical storage of database tables, providing detailed statistics on space usage within the database, including how much space is wasted due to bloat.
  • Rebuild significantly bloated tables and indexes to reclaim dead space.
  • Watch for a high dead_tuple_percent to identify bloat and fragmentation issues.
  • Focus maintenance on the tables and indexes that are critical for workload performance.
  • Avoid running pgstattuple on highly active tables to prevent interference with the workload.
  • Use pgstattuple metrics to fine-tune autovacuum settings.
  • Combine pgstattuple with query analysis and logs for holistic database insights.

Conclusion

The pgstattuple extension is a powerful tool for uncovering crucial diagnostic metrics in PostgreSQL databases, exposing detailed storage statistics that help teams identify and address performance-impacting issues like bloat and index fragmentation. Working seamlessly with Aurora PostgreSQL-Compatible and HAQM RDS for PostgreSQL, the extension provides essential visibility into storage patterns and maintenance requirements. Following pgstattuple best practices is key to maintaining efficient, high-performing PostgreSQL databases. Organizations can further enhance their database management through AWS support options: AWS Enterprise Support, Enterprise On-Ramp, and Business Support customers can use AWS Countdown Premium engagements for optimization guidance, enabling teams to confidently implement best practices and maintain optimal performance while focusing on their core business objectives.

We welcome your comments and feedback in the comments section.



About the Author

Vivek Singh is a Principal Database Specialist Technical Account Manager with AWS focusing on HAQM RDS for PostgreSQL and HAQM Aurora PostgreSQL engines. He works with enterprise customers providing technical assistance on PostgreSQL operational performance and sharing database best practices. He has over 17 years of experience in open-source database solutions, and enjoys working with customers to help design, deploy, and optimize relational database workloads on AWS.

Kiran Singh is a Senior Partner Solutions Architect and an HAQM RDS and HAQM Aurora specialist at AWS focusing on relational databases. She helps customers and partners build highly optimized, scalable, and secure solutions; modernize their architectures; and migrate their database workloads to AWS.

Sagar Patel is a Principal Database Specialty Architect with the Professional Services team at HAQM Web Services. He works as a database migration specialist to provide technical guidance and help HAQM customers to migrate their on-premises databases to AWS.