Like all AWS services, the roadmap for Amazon Aurora PostgreSQL is driven mostly by customer feedback and requests for product enhancement. The feedback from several enterprise customers who have migrated their databases from Oracle and Microsoft SQL Server to Amazon Aurora suggests two things. Enterprises that run their database workloads for critical applications require optimal database performance. Also, enterprises need stable and consistent performance during various system changes on Aurora PostgreSQL-compatible databases.
One of the major sources for variability in PostgreSQL database performance is the checkpoint process. It’s often a trade-off between performance and recoverability. On Aurora PostgreSQL, we have addressed this problem by eliminating database checkpoints. To decouple the logging and storage layer, we implement a log-based storage subsystem. Another major cause for response time variability is query plan instability. There are various factors that can unexpectedly change the execution plan of queries. For example:
- Change in optimizer statistics (manually or automatically)
- Changes to the query planner configuration parameters
- Changes to the schema, such as adding new index
- Changes to the bind variables used in the query
- Minor version or major version upgrades to the PostgreSQL database version. For example, PostgreSQL 9.6.x to 10.x
Query Plan Management
The Aurora PostgreSQL Query Plan Management (QPM) feature solves the problem of plan instability by allowing database users to maintain stable, yet optimal, performance for a set of managed SQL statements. QPM primarily serves two main objectives:
- Plan Stability. QPM prevents plan regression and improves plan stability when any of the above changes occur in the system.
- Plan Adaptability. QPM automatically detects new minimum-cost plans and controls when new plans may be used and adapts to the changes.
How QPM works
The flowchart below shows how QPM works. Steps 1 to 9 in the flowchart are explained below the flowchart diagram. For additional details, refer to QPM basics in the Amazon Aurora documentation.
- When the optimizer processes a SQL statement, it generates a minimum cost plan.
- When QPM is not enabled, the optimizer simply executes this minimum cost plan. When it is enabled, the optimizer uses query plan management.
- If the optimizer determines the SQL statement is not a managed statement, it executes the generated minimum cost plan.
- If the manual or automatic capture is enabled, the optimizer goes through the capture plan processing.
- If QPM use_plan_baselines option is not enabled, the optimizer runs the minimum-cost plan.
- If the optimizer’s generated plan is not a stored plan, the optimizer captures and stores the plan as a new unapproved plan.
- If the plan is not a rejected or disabled, optimizer picks the generated plan for execution. Optionally, you can configure QPM to simply execute any plan that has an estimated cost below a threshold that you specify. This avoids plan enforcement overhead for trivial plans.
- If the managed statement has any enabled and valid preferred plans, the optimizer runs the minimum cost one. If one or more preferred plans exist, then the optimizer re-costs each preferred plan, and selects the minimum-cost preferred plan that can be re-created.
- Optimizer chooses the approved and enabled minimum cost plan for execution. If no preferred plans exist or can be re-created, then the optimizer re-costs each approved plan, and selects the minimum-cost approved plan. If no preferred or approved plan exists or can be re-created, then the optimizer’s minimum-cost plan is used as a last resort.
Quick start guide on using QPM with automatic capture
Query plan management is available with Amazon Aurora PostgreSQL version 10.5-compatible (Aurora 2.1.0) and later, or Amazon Aurora PostgreSQL version 9.6.11-compatible (Aurora 1.4.0) and later. The quickest way to enable QPM is to use the automatic plan capture, which enables the plan capture for all SQL statements that run at least two times.
Here are the steps to configure and enable the use of QPM on your Aurora PostgreSQL cluster for automatic capture and using managed plans with QPM:
- Modify the Amazon Aurora DB Cluster and DB Instance Parameters related to the QPM.
a. Open the Amazon RDS console at https://console.aws.amazon.com/rds/.
b. Create a new instance-level parameter group to use for query plan management. For more information, see Creating a DB Parameter Group.
c. Create a new cluster-level parameter group to use for query plan management. For more information, see Creating a DB Cluster Parameter Group.
d. Open your cluster-level parameter group and set the rds.enable_plan_management parameter to 1. For more information, see Modifying Parameters in a DB Cluster Parameter Group.
e. Open your database level parameter group and set the apg_plan_mgmt.capture_plan_baselines parameter to automatic and apg_plan_mgmt.use_plan_baselines to true. For more information, see Modifying Parameters in a DB Parameter Group. Please note that these parameters can be set at either the cluster level or at the database level. The default recommendation would be to set it at the Aurora cluster level.
f. Restart your DB instance to enable this new setting.
g. Connect to your DB instance with a SQL client such as psql. For more information, see Using psql to Connect to a PostgreSQL DB Instance.
h. Query the Aurora PostgreSQL compatible version for the cluster.
- Create and verify the apg_plan_mgmt extension for your DB instance.
a. Create the apg_plan_mgmt extension for your DB instance.
b. Query to make sure that all QPM-related parameters are modified to the appropriate value.
- Run synthetic workload with automatic capture.
a. I use pgbench (a PostgreSQL benchmarking tool) to generate a simulated workload, which runs same queries for a specified period. With automatic capture enabled, QPM captures plans for each query that runs at least twice. Below is the example.
b. Query apg_plan_mgmt.dba_plans table to view the managed statements and the execution plans for the SQL statements started with the pgbench tool.
c. Turn off automatic capture.
Capturing all plans with automatic capture has little runtime overhead and can be enabled in production. We are turning off the automatic capture to make sure that we don’t capture SQL statements outside the pgbench workload. This can be turned off by setting the apg_plan_mgmt.capture_plan_baselines parameter to off from the DB instance-level parameter group.
d. Verify that the execution plan of the managed statement is the plan captured by QPM.
We have manually executed the explain plan on one of the managed statements (highlighted in the red above). The explain plan output does show the SQL hash and the plan hash matches with the QPM approved plan for that statement.
In addition to automatic plan capture, QPM also offers manual capture, which offers a mechanism to capture execution plans for known problematic queries. Capturing the plans automatically is recommended generally. However, there are situations where capturing plans manually would be the best option, such as:
- You don’t want to enable plan management at the Database level, but you do want to control a few critical SQL statements only.
- You want to save the plan for a specific set of literals or parameter values that are causing a performance problem.
QPM Plan adaptability with plan evolution mechanism
If the optimizer’s generated plan is not a stored plan, the optimizer captures and stores it as a new unapproved plan to preserve stability for the QPM-managed SQL statements.
Query plan management provides techniques and functions to add, maintain, and improve execution plans and thus provides Plan adaptability. Users can on demand or periodically instruct QPM to evolve all the stored plans to see if there is a better minimum cost plan available than any of the approved plans.
QPM provides apg_plan_mgmt.evolve_plan_baselines function to compare plans based on their actual performance. Depending on the outcome of your performance experiments, you can change a plan’s status from unapproved to either approved or rejected. You can instead decide to use the apg_plan_mgmt.evolve_plan_baselines function to temporarily disable a plan if it does not meet your requirements.
For additional details about the QPM Plan evolution, see Evaluating Plan Performance. For a detailed example of QPM plan evolution, see Use cases for query plan management in Amazon Aurora PostgreSQL.
That’s your quick introduction to the Aurora PostgreSQL Query Plan Management (QPM) and a quick start guide to QPM automatic capture. QPM helps you sustain your database performance by providing plan stability, adaptability, and consistent optimal database performance. It does this even with changes that otherwise cause problems with execution plans or performance.
For advanced use cases, see the second blog post in this series, Use cases for query plan management in Amazon Aurora PostgreSQL. For more details about QPM, see Managing Query Execution Plans for Aurora PostgreSQL.
About the Authors
Sameer Malik is a principal database solutions architect at Amazon Web Services.
Jim Finnerty is a principal software engineer for Aurora PostgreSQL at Amazon Web Services.