Taking Advantage of Redshift Pause and Resume: The Smart Way

AWS Redshift Smart Pause and Resume is a serverless tool that automates when a Redshift cluster (single- or multi-node) is paused and resumed. The tool uses several AWS services, including Lambda functions, Amazon Forecast, Step Functions, and CloudWatch Metrics and Events, all deployed together using the Serverless Framework.

Last month AWS unveiled the ability to pause and resume Redshift clusters. This feature lets users shut down a Redshift cluster completely when it is not in use, reducing the associated costs. It is particularly useful for Redshift clusters used for development purposes, which can now be paused whenever no users are actively querying the database.
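As a rough illustration of the underlying capability, the sketch below wraps the Redshift pause and resume calls with a status check. It assumes the `redshift` argument is a boto3 Redshift client (for example `boto3.client("redshift")`); the helper names `pause_if_available` and `resume_if_paused` are hypothetical and are not taken from the tool itself.

```python
def _cluster_status(redshift, cluster_id):
    """Return the current ClusterStatus string, e.g. 'available' or 'paused'."""
    response = redshift.describe_clusters(ClusterIdentifier=cluster_id)
    return response["Clusters"][0]["ClusterStatus"]


def pause_if_available(redshift, cluster_id):
    """Request a pause only if the cluster is running; return True if requested."""
    if _cluster_status(redshift, cluster_id) != "available":
        return False  # already paused, or in the middle of another operation
    redshift.pause_cluster(ClusterIdentifier=cluster_id)
    return True


def resume_if_paused(redshift, cluster_id):
    """Request a resume only if the cluster is paused; return True if requested."""
    if _cluster_status(redshift, cluster_id) != "paused":
        return False  # already running, or in the middle of another operation
    redshift.resume_cluster(ClusterIdentifier=cluster_id)
    return True
```

Checking the status first is a design choice in this sketch: it avoids sending a pause or resume request to a cluster that is not in a state to accept it.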

This tool is built on top of that capability and uses CPU utilisation forecasting (with Amazon Forecast) to schedule when to pause and resume a Redshift cluster. Instead of following a fixed schedule, the pause and resume times change dynamically based on forecasts generated from patterns observed in past Redshift CPU utilisation data.

Average CPU utilisation data, at 15-minute intervals, is collected from a Redshift cluster and used to train a forecast model. The resulting forecasts then determine when to pause and resume the cluster, using the following heuristic. The cluster is resumed 30 minutes before the time its average CPU utilisation is predicted to rise above 5%, and it is paused 30 minutes after the time its average CPU utilisation is predicted to fall below 5%. To illustrate, if the forecasts show that CPU utilisation will rise above 5% at 7:00 and fall below 5% at 20:00, the tool will schedule the cluster to be resumed at 6:30 and paused at 20:30. This provides a buffer that gives the cluster ample time to start up and to wind down.
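The heuristic can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's actual code: the function name `plan_schedule`, the input shape (a time-ordered list of `(timestamp, predicted_cpu)` pairs) and the treatment of a value of exactly 5% as idle are all assumptions made for the example.

```python
from datetime import datetime, timedelta

THRESHOLD = 5.0                  # average CPU utilisation, in percent
BUFFER = timedelta(minutes=30)   # lead time before resume, lag after pause


def plan_schedule(forecast, threshold=THRESHOLD, buffer=BUFFER):
    """Turn a time-ordered forecast into a list of (action, when) pairs.

    A 'resume' is planned `buffer` before the first point predicted above
    the threshold; a 'pause' is planned `buffer` after the first point
    predicted back at or below it (assumption: exactly 5% counts as idle).
    """
    actions = []
    previous_busy = None
    for timestamp, cpu in forecast:
        busy = cpu > threshold
        if previous_busy is not None and busy != previous_busy:
            if busy:
                actions.append(("resume", timestamp - buffer))
            else:
                actions.append(("pause", timestamp + buffer))
        previous_busy = busy
    return actions


# Example: one day of 15-minute forecast points on an arbitrary date,
# busy from 7:00 up to (but not including) 20:00.
start = datetime(2020, 4, 1)
example_forecast = [
    (start + timedelta(minutes=15 * i),
     20.0 if 7 <= (start + timedelta(minutes=15 * i)).hour < 20 else 1.0)
    for i in range(96)
]
schedule = plan_schedule(example_forecast)
# schedule == [("resume", 2020-04-01 06:30), ("pause", 2020-04-01 20:30)]
```

The example reproduces the 7:00 and 20:00 scenario from the text, yielding a resume at 6:30 and a pause at 20:30.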

In terms of the AWS serverless architecture, the tool consists of two AWS Step Functions: (1) the Train Forecast Model Step Function and (2) the Generate Forecasts Step Function. These Step Functions consist of a number of Lambda functions that call the Amazon Forecast APIs. Each Step Function is started by a Lambda function that is scheduled to run using CloudWatch Events. The Train Forecast Model Step Function trains and retrains the forecast model on the collected data every month. The Generate Forecasts Step Function, on the other hand, runs daily: it scrapes Redshift CPU utilisation data, uses it alongside existing data to generate forecasts, and uses the forecast results to determine the right times to pause and resume the Redshift cluster.
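The data-collection step of the daily run could look roughly like the sketch below, which reads 15-minute average CPU utilisation from CloudWatch. It assumes the `cloudwatch` argument is a boto3 CloudWatch client (for example `boto3.client("cloudwatch")`); the function name `fetch_average_cpu` and the 24-hour default window are hypothetical choices for illustration, not details confirmed by the tool.

```python
from datetime import datetime, timedelta, timezone


def fetch_average_cpu(cloudwatch, cluster_id, end_time, hours=24):
    """Return [(timestamp, average_cpu_percent), ...] sorted by time."""
    response = cloudwatch.get_metric_statistics(
        Namespace="AWS/Redshift",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "ClusterIdentifier", "Value": cluster_id}],
        StartTime=end_time - timedelta(hours=hours),
        EndTime=end_time,
        Period=900,              # 900 seconds = 15-minute intervals
        Statistics=["Average"],
    )
    # CloudWatch does not guarantee chronological order, so sort explicitly.
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Average"]) for p in points]
```

A call such as `fetch_average_cpu(client, "my-cluster", datetime.now(timezone.utc))` would return the last day of data points, ready to be appended to the existing history before generating new forecasts.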