Best practices for Amazon Aurora MySQL database configuration

After migrating or launching your new Amazon Aurora MySQL instance in the AWS Cloud, have you asked yourself one or more of the following questions?

  • “What are the next steps? How can I make it run optimally?”
  • “Is it recommended to modify any existing parameters?”
  • “What parameters should I modify?”

If so, I hope that in this blog post I can provide some guidance on what to do (and what not to do).

In this post, I discuss, clarify, and provide recommendations on the configuration parameters for Amazon Aurora with MySQL compatibility. These database parameters and their values are important when carrying over, tuning, or reconfiguring your newly created or migrated instance in the AWS Cloud. I also discuss which parameters are carried over from an Amazon RDS for MySQL instance into an Aurora instance. I cover which are the default values, and which ones are critical for your instance’s stability and optimum performance.

The most important consideration before making changes is to understand the needs and motivations behind your change. Although most parameters settings are fine in their default values, application workload changes might cause these parameters to require adjustment. Before making any changes, ask yourself the following questions:

  • Am I experiencing stability issues, such as restarts or failovers?
  • Can I make my application run queries faster?

Quick primer on Aurora parameter groups

There are two types of Aurora MySQL parameter groups: DB parameter groups and DB cluster parameter groups. Some parameters affect the configuration for an entire DB cluster, like binary log format, time zone, and character set defaults. Others limit their scope to a single DB instance.

In this blog post, I classify them in a different context: which parameters affect your Aurora cluster’s behavior, stability, and functionality, and which ones affect performance when modified.

Keep in mind that both parameter types come out of the box with preset defaults, and some parameters allow modification.

For a refresher and more in-depth look into the basics of modifying and working with parameter groups, see these topics in the Aurora User Guide:

Before making changes to your production database

Parameter changes can produce unexpected results, including degraded performance and system instability. Follow these best practices before changing any database configuration parameters:

  • Make changes in a test environment by creating a clone or restoring a snapshot of your production instance (as described in the documentation). This way, your setup resembles your production environment as closely as possible.
  • Generate a workload for your test instance that mimics your production workload.
  • Check system performance on key performance indicators like CPU utilization, number of database connections, memory utilization, cache hit rates and query throughput, and latency. Do this before the change to obtain baseline numbers, and after to observe results.
  • Only make one parameter change at a time to avoid ambiguity.
  • If changes made no measurable impact on your test system, consider reverting your parameter back to default.
  • Document which parameter had the positive impact you expect and which key performance indicators showed improvement.

Default parameter values and their importance

Certain DB instance parameters contain variables or formulas, in which the value is determined by constants. Examples are the instance’s size and memory footprint, network port for the instance, and its allocated storage. It’s best to leave these unchanged, as they adjust automatically whenever an instance scale-up or down operation is performed.

For example, Aurora DB parameter innodb_buffer_pool_size defaults to:

{DBInstanceClassMemory*3/4}

DBInstanceClassMemory is a variable that is set to your instance’s memory size in GiB.

Example: For a db.r4.xlarge instance with 30.5 GiB of memory, this value is 20,090,716,160 bytes or 18.71 GiB.

Suppose that we decide to set this parameter to a fixed value, say to 18,000,000,000 bytes, and later we run a scale-down operation to db.r4.large, which has half as much memory (15.2 GiB). In this case, we will likely then encounter an out-of-memory condition on the database engine after modifying it, and the instance can’t properly start.

For a quick glance on which parameters are automatically calculated by system variables, you can search for these parameters inside your parameter group definition. To do this, search for the curly bracket character “{”.

If you want to query the actual value in use by the instance, there are two ways to do this on a command line. These are by using a SHOW GLOBAL VARIABLES or SELECT statement:

mysql> SHOW GLOBAL VARIABLES where Variable_Name='innodb_buffer_pool_size';
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| innodb_buffer_pool_size | 8001683456 |
+-------------------------+------------+
1 row in set (0.01 sec)

mysql> SELECT @@innodb_buffer_pool_size;
+---------------------------+
| @@innodb_buffer_pool_size |
+---------------------------+
|                8001683456 |
+---------------------------+
1 row in set (0.00 sec)

Symptoms and diagnosis for incorrectly set parameter values

When certain parameters are misconfigured, it can manifest itself as an out-of-memory condition that is recorded in your MySQL error log. In this case, the instance enters a rolling reboot state and generate event logs similar to the following, with recommendations about what parameter values to adjust:

2018-12-29 19:05:16 UTC  [-]MySQL has been crashing due to incompatible parameters. Please check your memory parameters, in particular the max_connections, innodb_buffer_pool_size, key_buffer_size, query_cache_size, tmp_table_size, innodb_additional_mem_pool_size and innodb_log_buffer_size. Modify the memory parameter and reboot the instance.

Classifying parameters

For the scope of this blog post, we can classify Aurora MySQL parameters into two main groups:

  1. Parameters that control the database’s behavior and functionality but have no impact on resource utilization and instance stability
  2. Parameters that might affect performance by managing how resources, such as caching and internal memory-based buffers, are allocated in the instance

Let’s take a look at some of these, their default values, and how they affect your instance’s behavior or performance when modified. The following table describes the parameter names as you find them in the parameter group, the Aurora and MySQL default values, and a summary of the functionality affected by modifying this parameter.

Parameter nameImpactsAurora default valueMySQL 5.6/5.7 valueParameter description
autocommitFunctionality1 (ON)1 (ON)When enabled, automatically commits transactions to disk. If disabled, multi-statement transactions must be explicitly started and committed or rolled back. If a transaction isn’t explicitly started, each successful statement is automatically committed.
max_connectionsFunctionality{Variable}{Variable}Limits the maximum number of concurrent database connections.
max_allowed_packetFunctionality4194304 (bytes)4194304 (bytes)Maximum allowed packet size that the server can receive.
group_concat_max_lenFunctionality1024 (bytes)1024 (bytes)Maximum result length returned by the server for the GROUP_CONCAT() function.
innodb_ft_result_cache_limitFunctionality2000000000 (bytes)2000000000 (bytes)Limits cache size for InnoDB full-text search query results.
max_heap_table_sizeFunctionality

16777216

(bytes)

16777216

(bytes)

Limits size of new user-defined MEMORY tables. Doesn’t apply limit to existing tables.
performance_schemaFunctionalityOFFOFFEnables or disables MySQL Performance Schema.
binlog_cache_sizePerformance

32768

(bytes)

32768 (bytes)Controls binary log cache size, increasing it improves performance on systems with large transactions. Limits value in environments with a large number of DB connections.
bulk_insert_buffer_sizePerformanceControls MyISAM cache size to speed up bulk insert operations. Doesn’t apply for Aurora MySQL.
innodb_buffer_pool_sizePerformance

{Variable}

3/4 of instance memory

134217728

(128 MB)

Controls memory size of the InnoDB buffer pool where table and index data is cached.
innodb_sort_buffer_sizePerformance

1048576

(bytes)

1048576

(bytes)

Defines how much data is read into memory for sorting operations before writing to disk.
join_buffer_sizePerformance

262144

(bytes)

262144

(bytes)

Minimum buffer size used for joins, indexing, and range scans that are not indexed.
key_buffer_sizePerformance

16777216

(bytes)

8388608

(bytes)

Key cache for MyISAM tables. Doesn’t apply to Aurora.
myisam_sort_buffer_sizePerformance

8388608

(bytes)

8388608

(bytes)

MyISAM index buffer. Doesn’t apply to Aurora.
query_cache_sizePerformance{Variable} 1/24 memory size

1048576

(bytes)

Amount of memory reserved for caching result sets. in multiples of 1,024.
query_cache_typePerformance10Enables or disables query caching.
read_buffer_sizePerformance

262144

(bytes)

262144

(bytes)

Controls memory allocation for multiple types of buffers, for example when sorting rows for an ORDER BY clause, partition inserts, and nested queries.
read_rnd_buffer_sizePerformance

524288

(bytes)

262144

(bytes)

Improves performance on systems with multi-range read queries.
table_open_cachePerformance60002000Limits number of open tables in memory for all threads.
table_definition_cachePerformance20000{Variable} less than 2000Limits number of table definitions stored in the cache without using file descriptors.
tmp_table_sizePerformance

16777216

(bytes)

16777216

(bytes)

Limits the size of the engine’s internal, in-memory temporary tables.

Recommendations and impact

Here’s a brief explanation on how each of these critical parameters can affect your database, along with some use cases on how to tune them:

autocommit

Recommended setting: Use the default value (1 or ON) to ensure that each SQL statement is automatically committed as you run it, unless it’s part of a transaction opened by the user.

Impact: A value of OFF might encourage incorrect usage patterns such as transactions that are held open longer than required, not closed, or committed at all. This can affect the performance and stability of the database.

max_connections

Recommended setting: Default (variable value). When using a custom value, configure only as many connections as the application actively uses to perform work.

Impact: Configuring a too-high connection limit can contribute to higher memory use even if connections are not actively used. It can also cause high database connection spikes that affect the databases’ performance and stability.

This variable parameter is automatically populated based on your instance’s memory allocation and size using the following formula, so use the default value first:

GREATEST({log(DBInstanceClassMemory/805306368,2)*45},{log(DBInstanceClassMemory/8187281408,2)*1000})

For example, for an Aurora MySQL db.r4.large instance with 15.25 GiB of memory, it is set to 1,000:

DBInstanceClassMemory = 16374562816 bytes
max_connections = GREATEST({log(16374562816/805306368,2)*45},{log(16374562816/8187281408,2)*1000})
max_connections = GREATEST(195.56,1000) = 1000

If you are encountering connection errors and getting excessive Too many connections in your error logs, you can set this parameter to a fixed value instead of the variable setting.

When you’re considering setting max_connections to a fixed value if your application requires a higher number of connections, consider using a connection pool or proxy between the application and the database. You can also do this if connections can’t be predicted or controlled reliably.

When you manually configure a value for this parameter that exceeds the recommended number of connections, Amazon CloudWatch metrics for DB connections show a red line where the threshold is exceeded. This is the formula that CloudWatch uses:

Threshold value for max_connections = {DBInstanceClassMemory/12582880}

For example, for a db.r4.large instance whose memory size is 15.25 GiB ( 15.25 x 1024 x 1024 x 1024 = 16374562816 bytes), the warning threshold is approximately 1,300 connections. You can still use the maximum number of configured connections, provided that there are enough resources on the instance.

max_allowed_packet

Recommended setting: Default (4,194,304 bytes). Use a custom value only if required by your database workload. Tune this parameter when you are dealing with queries returning large elements, like long strings or BLOBs.

Impact: Setting a large value here doesn’t affect the initial size of the message buffers. Instead, it allows them to grow up to the defined size if required by your queries. A large parameter value coupled with a large number of concurrent eligible queries can increase the risk of out-of-memory conditions.

The following example error is shown when setting this parameter too small:

 ERROR 1153 (08S01) at line 3: Got a packet bigger than 'max_allowed_packet' bytes

group_concat_max_len

Recommended setting: Default (1,024 bytes). Use a custom value only if your workload requires it. You need to tune this parameter only when you want to alter the return of the GROUP_CONCAT() statement and allow the engine to return longer column values. This value should be used in parallel with max_allowed_packet, as this determines the maximum size of a response.

Impact: Some of the symptoms of setting this parameter too high are high memory use and out-of-memory conditions. Setting it too low causes queries to fail.

innodb_ft_result_cache_limit

Recommended setting: Default (2,000,000,000 bytes). Use a custom value depending on your workload.

Impact: As the value is already close to 1.9 GiB, increasing it beyond its default can result in out-of-memory conditions.

max_heap_table_size

Recommended setting: Default (16,777,216 bytes). Limits the maximum size of tables created in memory as defined by a user.  Altering this value only has effect on newly created tables, and doesn’t affect existing ones.

Impact: Setting this parameter too high causes high memory utilization or out-of-memory conditions if in-memory tables grow.

performance_schema

Recommended setting: Disable for t2 instances due to its high memory utilization.

Impact: In Aurora MySQL 5.6, Performance Schema memory is preallocated heuristically. This preallocation is based on other configuration parameters such as max_connections, table_open_cache, and table_definition_cache. In Aurora MySQL 5.7, Performance Schema memory is allocated on-demand. The Performance Schema typically consumes around 1 to 3 GB of memory depending on the instance class, workload, and database configuration. If a DB instance is running low on memory, enabling Performance Schema can lead to out-of-memory conditions.

binlog_cache_size

Recommended setting: Default (32,768 bytes). This parameter controls the amount of memory that the binary log cache can use. By increasing it, you can improve performance on systems with large transactions by using buffers to avoid excessive disk writes. This cache is allocated on a per connection basis.

Impact: Limit this value in environments with a large number of DB connections to avoid causing an out-of-memory condition.

bulk_insert_buffer_size

Recommended setting: Leave as is, because it doesn’t apply to Aurora MySQL.

innodb_buffer_pool_size

Recommended setting: Default (variable value), as it is preconfigured in Aurora to 75 percent of instance memory size. You can see buffer pool use in the output of SHOW ENGINE INNODB STATUS.

Impact: A larger buffer pool improves overall performance by allowing less disk I/O when accessing the same table data repeatedly. The actual amount of allocated memory might be slightly higher than the actual configured value due to InnoDB engine overhead.

innodb_sort_buffer_size

Recommended setting: Default (1,048,576 bytes)

Impact: Higher-than-default values can increase the overall memory pressure on systems with a large number of concurrent queries

join_buffer_size

Recommended setting: Default (262,144 bytes). This value is preallocated for various types of operations (such as joins) and a single query may allocate multiple instances of this buffer. If you want to improve the performance of your joins, we recommend that you add indexes to such tables.

Impact: Changing this parameter can cause severe memory pressure in environments with a large number of concurrent queries. Increasing this value doesn’t provide faster JOIN query performance, even when you add indexes.

key_buffer_size

Recommended setting: Leave as the default value (16,777,216 bytes), because it is not relevant to Aurora and affects only MyISAM table performance.

Impact: No impact on Aurora’s performance.

myisam_sort_buffer_size

Recommended setting: Leave as the default value (8,388,608 bytes). It’s not applicable for Aurora because it has no effect on InnoDB.

Impact: No impact on Aurora’s performance.

query_cache_size

Recommended setting: Default (variable value). The parameter is pretuned in Aurora, and the value is much larger than MySQL default. Aurora’s query cache doesn’t suffer from scalability issues (as the query cache does in MySQL). It’s an acceptable practice to modify it for high-throughput, demanding workloads.

Impact: Query performance is affected when accessing queries through this cache. You can see query cache use in the output of a SHOW STATUS command under the “QCache” section.

query_cache_type

Recommended setting: Enabled. By default, the query cache is enabled in Aurora and is recommended to keep it enabled for its performance gains and low overhead. However, it’s fine to disable the query cache if you know that your workload doesn’t benefit from it. An example is a write-heavy workload with limited to no read queries.

Impact: Disabling query caching in Aurora might affect database performance if your workload reuses queries, like repeatable SQL statements. You can see query cache use in the output of a SHOW STATUS command under the “Qcache” section.

read_buffer_size

Recommended setting: Default (262,144 bytes).

Impact: Large values cause higher overall memory pressure and provoke out-of-memory issues. Don’t increase the setting unless you can prove that the higher value benefits performance without compromising stability.

read_rnd_buffer_size

Recommended setting: Default (524,288 bytes). There’s no need to increase the setting for Aurora, due to the performance characteristics of the underlying storage cluster.

Impact: Large values might cause out-of-memory issues.

table_open_cache

Recommended setting: Leave as is, unless your workload requires accessing a very large number of tables simultaneously. The table cache is a major memory consumer, and the default value in Aurora is significantly higher than the MySQL defaults. This parameter is automatically adjusted based on instance size.

Impact: A database with a large number of tables (in the hundreds of thousands) requires a larger setting, because not all tables fit in memory. Setting this value too high might contribute to out of memory conditions. This setting also indirectly contributes to Performance Schema memory usage, if the Performance Schema is enabled.

table_definition_cache

Recommended settings: Defaults. This setting is pretuned in Aurora to be significantly larger than in MySQL, and it’s automatically adjusted based on instance size and class. If your workload requires it and your database requires a very large number of tables to be opened concurrently, increasing this value might speed up opening tables operations. This parameter is used in conjunction with table_open_cache.

Impact: This setting also indirectly contributes to Performance Schema memory usage, if the Performance Schema is enabled. Watch out for higher-than-default settings as they might provoke out-of-memory issues.

tmp_table_size

Recommended settings: Default (16,777,216 bytes). Together with max_heap_table_size, this parameter limits the size for in-memory tables used for query processing. When the temporary table size limit is exceeded, tables are then swapped to disk.

Impact: Very large values (hundreds of megabytes or more) are notorious for causing memory issues and out-of-memory errors. This parameter doesn’t affect tables created with the MEMORY engine.

Conclusion and key points to take away

Many parameters have already been optimized when deploying a new Aurora MySQL instance, and they serve as a good baseline before performing any parameter changes. The exact combination of values for each parameter highly depends on individual systems, application workload, and desired throughput characteristics. Moreover, on database systems with a high rate of change, growth, data ingestion rates, and dynamic workloads, these parameters also require constant monitoring and evaluation. We recommend that you do this monitoring and evaluation every few months, perhaps every few weeks as you adapt your database to your application’s and business needs.

To perform successful parameter tuning that translate into measurable performance increases, it’s a good practice to experiment, establish a baseline, and compare your results after the changes are made. We recommend that you do this before changes are committed into a live, production system.

If you want further information on a specific parameter, contact AWS Support or your assigned AWS Technical Account team.


About the Author

Fabio Higa is a Database Specialist Technical Account Manager with AWS focusing on RDS Aurora/MySQL engines. He has worked with enterprise customers around the globe for more than 3 years. In his spare time, he likes working on his cars and taking them to the local racetracks.