Streamline HAQM Aurora database operations at scale: Introducing the AWS Database Acceleration Toolkit

In this post, we introduce the AWS Database Acceleration Toolkit (DAT), an open source database accelerator. DAT is an infrastructure as code solution using Terraform to simplify and automate initial setup, provisioning, and on-going maintenance activities for HAQM Aurora. DAT helps you reduce the time to market, which in turn increases customer satisfaction and improves cost efficiency. We guide you through DAT’s high-level architecture, its key features, and how to use it to create a new Aurora cluster.

Whether you’re a DevOps-focused organization, a software as a service (SaaS) provider managing multi-tenant databases, or an enterprise migrating from commercial databases to Aurora, DAT offers a streamlined approach to database management. In fact, DAT was built based on our experience as Solution Architects working with a large SaaS provider who achieved remarkable results. One of our SaaS customers reduced their database migration time from 12 months to just 4 weeks for over 100 customer databases, achieved zero outages over a six-month period, improved database team productivity by 60 to 70%, and reduced infrastructure costs by 42%. While individual results can vary, these outcomes demonstrate the potential of DAT to transform database operations and deliver significant improvements in efficiency and reliability.

Solution overview

The following figure shows the high-level architecture of DAT.

DAT uses Terraform for automated resource provisioning and security. The source code is hosted in a Git repository for collaboration and version control.
The toolkit provides different options to provision Aurora clusters in AWS:
- Using Terraform command line interface (CLI)
- Using Jenkins pipelines.

This flexibility allows you to choose the method that best fits your existing workflows and tools.

DAT includes specialized Terraform modules designed to help you get started with Aurora:
- Aurora cluster: This module facilitates the creation of a new Aurora cluster in an existing HAQM Virtual Private Cloud (HAQM VPC) environment or from the latest database snapshot. The module allows customization of cluster configuration, reducing manual effort and time. Aurora supports both serverless and provisioned modes of cluster creation, providing flexibility to choose the appropriate mode based on your application requirements.
- HAQM RDS Proxy for Aurora cluster: HAQM RDS Proxy is a database proxy feature that boosts application performance and reliability by pooling database connections, reducing connection overhead, and enhancing security with credential management and failover support. With HAQM RDS Proxy module, you can generate a proxy for an existing Aurora cluster. This module enhances connectivity and availability for an already operational Aurora database.
- Aurora GlobalDB: This module facilitates the creation of Aurora Global Database clusters across primary and secondary AWS Regions. You can use this module to provision new Aurora clusters in multiple Regions to support fast disaster recovery and globally distributed read access
- Aurora Monitoring: This module sets up and configures HAQM CloudWatch monitoring dashboards for Aurora databases. These dashboards present key performance metrics and insights that efficiently monitor database health, performance, and usage.

Use case examples

The Git repository contains use case examples with step-by-step instructions for running these modules on Aurora PostgreSQL-Compatible Edition and Aurora MySQL-Compatible Edition engines. These examples serve as practical guides, offering detailed steps and automation scripts for executing various operations related to Aurora databases.

Aurora MySQL-Compatible examples

aurora-mysql-cluster-existing-vpc
aurora-mysql-cluster-global-db
aurora-mysql-cluster-latest-snapshot
aurora-mysql-dbproxy
aurora-mysql-monitoring

Aurora PostgreSQL-Compatible examples

aurora-postgres-cluster-existing-vpc
aurora-postgres-cluster-global-db
aurora-postgres-cluster-latest-snapshot
aurora-postgres-dbproxy
aurora-postgres-monitoring

After reviewing the prerequisites and deployment options, we walk you through an example deployment using an Aurora PostgreSQL cluster and Terraform.

Prerequisites

First, make sure that you have installed the following tools locally.

You need access to an AWS account. If you don’t have one, you can create a new AWS account.
Install and configure AWS Command Line Interface (AWS CLI).
Install Terraform.
Install git.

Deployment options

You can deploy AWS DAT using any of the following three options.

Deployment using Terraform: DAT deployment using the Terraform CLI is a straightforward approach that allows direct interaction with Terraform commands, offering flexibility and customization.
Deployment using a new Jenkins instance: If you prefer an integrated automation approach, you can deploy DAT using a new Jenkins instance. This option involves provisioning a new Jenkins server, configuring it, and setting up pipelines to deploy DAT examples. By integrating DAT deployment into Jenkins, you can use the power of Jenkins automation capabilities and streamline the workflow for consistent deployment.
Deployment using an existing Jenkins instance: If you already have a Jenkins setup, you can integrate DAT with your existing Jenkins environment. This option allows you to extend your existing automation processes to include the deployment of DAT examples, minimizing disruptions to their current workflows.

Provision an Aurora PostgreSQL cluster using Terraform

The following steps will guide you through provisioning a new Aurora PostgreSQL cluster using Terraform CLI with one writer and one reader instance. However, you can customize the reader and writer instances. See DB cluster prerequisites before setting up the database cluster.

Clone the source code from the DAT repository, which contains the DAT usage examples and terraform modules required to deploy the solution.
```
git clone http://github.com/aws-samples/aws-database-acceleration-toolkit.git
```

Navigate to the Aurora PostgreSQL folder.

cd aws-database-acceleration-toolkit/examples/aurora-postgres-cluster-existing-vpc

Review the Terraform variable definition file called terraform.tfvars and configure the values for the variables as needed for your use case. The updated terraform.tfvars looks like the following:

#(mandatory) AWS Region where your resources will be located
# For example: "us-west-2"
region = "<REGION>"

# VPC Id where your resources will be located. 
# For example: "vpc-11112222333344445"
vpc_id = "<VPC_ID>"

# Database Engine for your Aurora mysql Cluster.
engine = "aurora-postgresql"

# Database engine version (optional). If not specified, the default version for the selected engine will be used.
# For example: "15.3"  
engine_version = "<ENGINE_VERSION>"

# Database engine mode. Valid value: provisioned
# Refer AWS documentation for supported regions and engine versions for engine mode 
engine_mode = "provisioned"

# DB Instance class. 
# Refer AWS documentation for supported DB instance class for DB engine.
# For example: "db.r6g.large"  
instance_class ="<INSTANCE_CLASS>"

# Specify number of DB instances to be created in the cluster.
# Optionally, you can pass the configuration parameters and values (for e.g., instance_class="db.r6g.xlarge") for each instance within the curly braces.
# If no parameters are specified, all the DB instances will be created with the same values.
instances = {
    instance1   = {}
    instance2   = {}
}

# Database cluster name
# For example: "aurora-pg-poc"  
name = "<CLUSTER_NAME>"

# Database environment
# For example: "dev"  
environment = "<ENVIRONMENT>"

# Tagging : Team/Group Name
# For example: "data-engineering"  
groupname = "<GROUPNAME>"

# Tagging : Project or Application Name
# For example: "myapp"  
project = "<PROJECTNAME>"

# Skip final snapshot during cluster deletion (optional). If set to 'true' (default), no final snapshot will be taken before deleting the cluster.
skip_final_snapshot= "true"

Initialize the working directory using the terraform init command.
```
terraform init
```
Execute the terraform plan command to create an execution plan, which lets you preview the proposed changes that Terraform will make to your infrastructure. The plan command will not carry out the proposed changes.
```
terraform plan -var-file terraform.tfvars
```
Finally, execute the terraform apply command to execute the actions proposed in a Terraform plan. The terraform apply command can take up to 15 minutes to complete. After the deployment is successful, you can use the AWS Management Console to view the new Aurora cluster.
```
terraform apply -var-file terraform.tfvars
```

Additional aspects

DAT’s features enhance the security posture of database operations by providing strong encryption practices, secure credential management, flexible authentication, and comprehensive logging and monitoring capabilities.

DAT allows you to use your own customer managed keys (CMKs). Using a CMK gives you the ability to rotate the key according to your own policies. If a user doesn’t input a CMK, DAT defaults to using an AWS managed key.
DAT integrates with AWS Secrets Manager to manage master user passwords for your database clusters. This allows for centralized and secure management of sensitive database credentials.
DAT allows you to choose your preferred authentication option. If no specific authentication option is selected, DAT defaults to password authentication.
DAT provides enhanced monitoring and visualization of database activities, which can be crucial for performance tracking, troubleshooting, and security monitoring. DAT publishes events from your Aurora PostgreSQL DB cluster’s PostgreSQL log to CloudWatch. You can create CloudWatch dashboards as shown in the following figure based on the available log data in CloudWatch logs.

Cleanup

To clean up your environment, destroy the Aurora cluster created using Terraform by running the following command:

terraform destroy -var-file terraform.tfvars

Conclusion

In this post, we introduced the Database Acceleration Toolkit (DAT), a tool to help you automate database provisioning, efficiently manage your Aurora databases, and help ensure a resilient and scalable database infrastructure. With DAT, you can reduce the time to market and optimize cost efficiency by reducing manual intervention. For deeper insights into DAT and to experiment with practical examples, visit the Database Acceleration Toolkit (DAT) Wiki.

About the authors

Piyush Mattoo is a Sr. Solution Architect for Financial Services Data Provider segment at AWS. He is a software technology leader with over a decade long experience building scalable and distributed software systems to enable business value through the use of technology. He is based out of Southern California and current interests include outdoor camping and nature walks.

Mitesh Purohit is a Sr. Solutions Architect at AWS, based in Dallas, TX. He is a technologist who helps ISV fintech customers modernize on the cloud. His areas of depths and passions are serverless architectures, microservices, and help customers to design highly scalable, innovative, and secure cloud solutions.

Ravi Mathur is a Sr. Solutions Architect at AWS. He works with customers providing technical assistance and architectural guidance on various AWS services. He brings several years of experience in software engineering and architecture roles for various large-scale enterprises.

Munish Dabra is a Principal Solutions Architect at AWS. His current areas of focus are AI/ML and Containers. He has a strong background in designing and building scalable distributed systems. He enjoys helping customers innovate and transform their business in AWS.

Mythili Annamalai Sekar is a Solutions Architect at AWS. She specializes in providing technical assistance and architectural guidance to ISV customers on AWS platforms. She has a strong background in designing and building BPM applications, and her current areas of focus are Serverless and AI/ML.

AWS Database Blog