Reading Postgres Execution Plans doesn't have to be so complicated

In this blog post, we’ll see

  • What is an SQL Execution Plan?

  • Why is it important?

  • Which tools can we use to read the plan?

What is an SQL Execution Plan?

Postgres documentation says: The execution plan shows how the table(s) referenced by the statement will be scanned — by plain sequential scan, index scan, etc. — and if multiple tables are referenced, what join algorithms will be used to bring together the required rows from each input table.

In other words, an SQL command only tells the database what data to fetch, but it doesn’t tell the database how to fetch it. This is by design: every database has a query optimizer that decides which indexes to use (if any), in what order to join the tables, and which algorithm to use for each join.
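As a sketch (using a hypothetical orders table, not one from this article), the same statement can be executed in more than one way, and it is the optimizer, not the author, that picks the access path:

```sql
-- Hypothetical schema, for illustration only
CREATE TABLE orders (
    id          bigint PRIMARY KEY,
    customer_id bigint NOT NULL,
    total       numeric NOT NULL
);
CREATE INDEX orders_customer_id_idx ON orders (customer_id);

-- The statement only says WHAT to fetch...
SELECT * FROM orders WHERE customer_id = 42;
-- ...the planner decides HOW: an index scan on orders_customer_id_idx
-- when the predicate is selective, or a sequential scan when most rows match.
```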

Why is it important?

An SQL command may be inefficient, whether we wrote it by hand or an ORM generated it (Sequelize, SQLAlchemy, Hibernate, etc.). To understand how exactly the database engine executes the SQL and where the bottlenecks are, we’ll have to:

  1. Get the execution plan (actual or estimated)

  2. Analyze the plan

The execution plan can show us multiple issues that result in performance degradation. For instance:

  • The query scans the entire table and doesn’t use any index. What’s worse, the index may be there, but the database engine decides not to use it. The execution plan shows us exactly which indexes are used, so we can see if some of them are ignored.

  • The query is suboptimal. It may run well on the small database of the Staging environment when all the data is in the cache (especially after we run QA tests), but it may not perform well in the Production environment. The execution plan shows the “cost” in terms of CPU and IO, so we can understand how the query will behave on a much bigger scale.

  • An UPDATE or DELETE command changes too many rows, which might indicate a data-related problem. The execution plan shows us how the rows to change are found and how many of them are affected.

Actual vs Estimated Plan

There are two types of execution plans: Actual and Estimated.

The Actual Execution Plan shows the real steps and measurements from executing the query. Since it contains accurate data, it is the recommended choice for troubleshooting. We need to keep in mind, though, that the database actually runs the query (as the name of the plan suggests). So if we want to understand why a query took 10 minutes to run, getting the actual plan will probably take another 10 minutes. And if the SQL command inserts new rows, running it again to get the actual plan will insert those rows once more, and we might get primary key violation errors.
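For data-modifying statements, a common workaround is to collect the actual plan inside a transaction and roll it back, so the side effects are discarded. A minimal sketch, reusing the hypothetical table1/col1 names from this article:

```sql
BEGIN;
-- EXPLAIN (ANALYZE) really executes the statement...
EXPLAIN (ANALYZE) UPDATE table1 SET col1 = col1 + 1 WHERE col1 = 5;
-- ...so roll back to discard the changes made while collecting the plan
ROLLBACK;
```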

The Estimated Execution Plan shows the approximate calculations. It is very fast to generate the estimated plan, usually under a few milliseconds. The database engine uses statistics about the data distribution of each column to generate the best execution plan. The statistics used by the database engine may be out of date. In that case, a suboptimal plan is generated. We’ll discuss fixing statistics in some other article.

How to get the execution plan in PostgreSQL

You can use the command EXPLAIN before the SQL command to generate the execution plan.

For example, the command

EXPLAIN SELECT * FROM table1 WHERE col1 = 5

generates the estimated execution plan. The EXPLAIN command accepts several options, detailed in the official PG documentation. The main ones we need to know are:

  1. We can run the SQL command and get the actual plan by using EXPLAIN (ANALYZE) SELECT ….

  2. We can get a detailed JSON with EXPLAIN (FORMAT JSON). As we can see below, some very useful tools know how to “read” this JSON and visualize it. The JSON contains more details than the plain textual form.
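These options can be combined. A typical troubleshooting call, using the same table1 example, might look like this (BUFFERS is optional and adds I/O counters):

```sql
-- ANALYZE executes the query and reports actual times and row counts;
-- BUFFERS adds shared-buffer hit/read counters;
-- FORMAT JSON emits the machine-readable form the visualization tools consume.
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT * FROM table1 WHERE col1 = 5;
```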

How to visualize the execution plan

The execution plan can be displayed in a textual form.
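As an illustrative sketch (the costs and row counts here are made up), the estimated plan for the EXPLAIN example above might print as:

```
Seq Scan on table1  (cost=0.00..35.50 rows=10 width=44)
  Filter: (col1 = 5)
```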

Longer queries, joining many tables, will generate a much longer plan. This may quickly become unreadable.

A graphic interface can be used to easily understand the plan and detect the most “expensive” steps in terms of CPU and IO.

The following tools can help with reading an execution plan:

The Tools

Explain.dalibo.com

About: a simple and useful tool. Just paste the execution plan and the SQL (optional) to get a clear visualization of the steps. Click on a step to get more information.

URL: https://explain.dalibo.com/

Free Tier: Yes

Open Source: Yes

Recommendations for how to solve the problems: No

Depesz

About: This tool shows the JSON as a table and highlights the expensive steps.

URL: https://explain.depesz.com/

Free Tier: Yes

Open Source: No

Recommendations for how to solve the problems: No

Explain.tensor.ru

About: This tool shows the JSON as a table and highlights the expensive steps. It also shows the diagram and how the tables are joined.

URL: https://explain.tensor.ru/about/

Free Tier: Yes

Open Source: No

Recommendations for how to solve the problems: No

pgMustard

About: pgMustard is another tool that can help you review Postgres query plans quickly

URL: https://www.pgmustard.com/

Free Tier: Yes, but limited to five uses

Open Source: No

Recommendations for how to solve the problems: No

Metis

About: Metis helps prevent your database code from breaking production. The app is designed for analyzing execution plans at scale. Metis was built for modern observability technologies: it integrates with Open Telemetry so developers can continuously send execution plans for analysis. Metis not only shows the raw plan (it uses Dalibo under the hood) but also provides insights and remediation plans, based on a long list of best practices.

It also has a CLI for a quick analysis of the SQL commands and integrates with the web app for an easier and deeper analysis.

URL: https://metisdata.io

Free Tier: Yes

Open Source: No

Recommendations for how to solve the problems: Yes. Metis also uses the schema to provide deep insights about the problems and how to solve them.

You can manually paste a query and its execution plan for analysis.

You can also analyze queries at scale: the recent activity is shown in one place, and the system analyzes each query and highlights the detected problems.

Query Analysis: the Open Telemetry traces integration gives you context about which REST API or GraphQL command generated the SQL statements. Under each query are the insights, which provide clear explanations about how the plan was executed and what problems were found.

You can read more about Open Telemetry instrumentation here: https://docs.metisdata.io/metis/getting-started/setup/sdk

A very powerful feature is schema enrichment. The execution plan only shows how the DB engine decided to execute the query; it doesn’t show which other indexes exist, what their structure is, or why the optimizer eventually decided not to use them. Adding this context saves a lot of time when analyzing why a query performs badly, especially for those who are not experts in database internals.

Metis also shows the SQL command and the access method of each table (sequential scan or index scan) on one screen. That saves time as well, by tightly correlating the logical commands with their physical implementation.

Conclusion

Analyzing an execution plan is critical for understanding why a query runs slowly, consumes too many resources, or prevents other queries from running (locks). A responsible developer checks the execution plan during development and testing to prevent production problems. Luckily, there are some powerful tools to help you with collecting the plans and analyzing them.

More Resources

https://scalegrid.io/blog/postgres-explain-cost/

https://scalegrid.io/blog/introduction-to-auto-explain-postgres/

About Open Telemetry: https://opentelemetry.io/