AWS Database Blog
Integrate HAQM Aurora MySQL and HAQM Bedrock using SQL
Because organizations store a large amount of their data in relational databases, there is a clear impetus to augment these datasets using generative artificial intelligence (AI) foundation models to elevate end-user experiences. In this post, we explore how to integrate HAQM Aurora MySQL-Compatible Edition with a generative AI model using HAQM Aurora Machine Learning. We walk through the HAQM Aurora MySQL integration with HAQM Bedrock and demonstrate two use cases:
- Complementing database information for service improvement – Obtain supplemental information using HAQM Bedrock and store it in an Aurora database for real-time access
- Improving productivity – Summarize a long text stored in an Aurora database using HAQM Bedrock
To learn about other ways to use ML with HAQM Aurora MySQL, refer to Build a generative AI-powered agent assistance application using HAQM Aurora and HAQM SageMaker JumpStart.
Solution overview
Generative AI is a type of AI that can create new content and ideas, including conversations, stories, images, videos, and music.
Foundation models (FMs) are machine learning (ML) models trained on a broad spectrum of generalized and unlabeled data. They’re capable of performing a wide variety of general tasks. Large language models (LLMs) are a class of FMs. LLMs are specifically focused on language-based tasks such as summarization, text generation, classification, open-ended conversation, and information extraction.
The solution is based on the following key components:
- HAQM Aurora – HAQM Aurora is a relational database management system (RDBMS) built for the cloud with MySQL and PostgreSQL compatibility. Aurora gives you the performance and availability of commercial-grade databases at one-tenth the cost. Aurora ML enables you to call a wide variety of ML algorithms, including ML-based predictions, generative AI, and sentiment analysis, using the familiar SQL programming language. You don’t need prior ML experience to use Aurora ML. Aurora ML provides straightforward, optimized, and secure integration between Aurora and AWS ML services without having to build custom integrations or move data around. Aurora calls HAQM SageMaker for a wide variety of ML algorithms, including FMs, HAQM Comprehend for sentiment analysis, and HAQM Bedrock models, so your application doesn’t need to call these services directly.
- HAQM Bedrock – HAQM Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and HAQM via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. HAQM Bedrock offers a straightforward way to build and scale generative AI applications with FMs. Because HAQM Bedrock is serverless, you don’t have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with.
The following diagram illustrates an example of using ML with HAQM Aurora MySQL.
In the following sections, we demonstrate how to make real-time calls to HAQM Bedrock using SQL queries from HAQM Aurora MySQL. The high-level steps are as follows:
- Create a new cluster.
- Create a database and database user.
- Create an AWS Identity and Access Management (IAM) role and policy for the Aurora cluster.
- Assign the IAM role to the Aurora cluster.
- Use Aurora ML by enabling HAQM Bedrock base models.
- Create functions for accessing HAQM Bedrock.
Prerequisites
This post assumes familiarity with navigating the AWS Management Console. For this solution, you’ll also need the following resources and services enabled in your AWS account:
- HAQM Aurora MySQL version 3.06.0 or later is required to use the HAQM Bedrock integration.
- The Aurora MySQL cluster must use a custom DB cluster parameter group.
- IAM access is required to create roles and permissions.
- You must have access to the specific FMs in HAQM Bedrock. In this post, we use HAQM Titan Text G1 – Express (amazon.titan-text-express-v1) and Anthropic Claude 3 Haiku (anthropic.claude-3-haiku-20240307-v1:0).
- The ML services must be running in the same AWS Region as your Aurora MySQL cluster.
- The network configuration of your Aurora MySQL cluster must allow outbound connections to endpoints for HAQM Bedrock.
Create an Aurora MySQL cluster
The first step is to create an Aurora MySQL cluster. For full instructions, refer to Creating and connecting to an Aurora MySQL DB cluster and Using HAQM Aurora machine learning with Aurora MySQL. We highlight a few specific configuration options used in this example:
- On the Aurora console, create a new cluster in a Region that supports HAQM Bedrock (for example, us-east-1).
- For Engine options, select Aurora (MySQL Compatible).
- For Engine version, we use Aurora MySQL 3.06.0, which supports the HAQM Bedrock integration.
- For Configuration options, select either Aurora Standard or Aurora I/O Optimized.
- For DB instance class, select your instance class.
- Because you need to modify the parameter group later for HAQM Bedrock, apply a custom DB cluster parameter group at this time.
- Create your Aurora cluster.
After the cluster is provisioned, you need to run a series of SQL commands to prepare your cluster for integrating with HAQM Bedrock.
- Log in to your Aurora cluster as a user with the rds_superuser_role privilege, such as the master user, by using the MySQL command line client, and run the following code. The AWS_BEDROCK_ACCESS database role must be granted to the user in order to work with HAQM Bedrock ML functions.
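The exact statements depend on your naming conventions; the following is a minimal sketch that creates the bedrockdb database and bedrock_user account used later in this post (choose your own password):

```sql
-- Create a database and a dedicated user for the Bedrock integration
CREATE DATABASE bedrockdb;
CREATE USER 'bedrock_user'@'%' IDENTIFIED BY 'choose_your_password';
GRANT ALL PRIVILEGES ON bedrockdb.* TO 'bedrock_user'@'%';

-- Grant the role required to invoke HAQM Bedrock ML functions
GRANT AWS_BEDROCK_ACCESS TO 'bedrock_user'@'%';
```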
Your database user is now set up to integrate with HAQM Bedrock. Now you can create an IAM role to provide the Aurora MySQL DB cluster access to HAQM Bedrock.
Create an IAM role and policy for the Aurora cluster
To allow Aurora ML to work with HAQM Bedrock, you first create an IAM policy that allows the Aurora cluster to communicate with HAQM Bedrock models. Complete the following steps:
- On the IAM console, choose Policies in the navigation pane.
- Choose Create policy.
- On the Specify permissions page, for Select a service, choose Bedrock.
- In the policy editor, expand Bedrock, and under Read, select InvokeModel to allow that action.
- For Resources, select All or Specific. As a best practice, make sure to only give access to models in HAQM Bedrock that your team requires.
- Choose Next.
- For Policy name, enter a name for your policy, such as AuroraBedrockInvokeModel.
- Choose Create policy.
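For reference, the resulting policy document looks similar to the following sketch; the Resource ARNs shown are examples scoped to the two models used in this post, so adjust them to the models and Region you use:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "bedrock:InvokeModel",
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-text-express-v1",
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"
      ]
    }
  ]
}
```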
- On the IAM console, choose Roles in the navigation pane.
- Choose Create role.
- For Trusted entity type, select AWS service.
- For Service or use case, choose RDS.
- Select RDS – Add Role to Database.
- Choose Next.
Now you assign the IAM policy you created in the previous step to this IAM role.
- For Permissions policies, find and select the AuroraBedrockInvokeModel policy.
- Choose Next.
- In the Role details section, enter a name (for this post, AuroraBedrockRole) and a description.
- Review the IAM role and confirm that the AuroraBedrockInvokeModel policy is attached.
- Choose Create role to create the role.
Assign the IAM role to the Aurora cluster
Now you need to assign the AuroraBedrockRole IAM role to the HAQM Aurora MySQL cluster. Complete the following steps:
- On the HAQM RDS console, navigate to your Aurora MySQL cluster details page.
- On the Connectivity & security tab, locate the Manage IAM roles section.
- Under Select IAM roles to add to this cluster, choose the AuroraBedrockRole role.
- Choose Add role.
- Add the ARN of this IAM role to the aws_default_bedrock_role parameter of the custom DB cluster parameter group associated with your Aurora MySQL cluster.
- Choose Save changes to save the setting.
You can confirm the parameters by using the AWS Management Console or AWS Command Line Interface (AWS CLI). You can also check using the MySQL Client tool as shown in the following example:
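For example, a quick check from the MySQL client might look like the following sketch (the variable reflects the cluster parameter you set earlier):

```sql
-- Confirm that the Bedrock role ARN is set on the cluster
SHOW GLOBAL VARIABLES LIKE 'aws_default_bedrock_role';
```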
Your cluster can now invoke models in HAQM Bedrock.
Use Aurora ML
Aurora ML is an Aurora feature that lets builders work directly with AWS ML services, including HAQM Bedrock, SageMaker, and HAQM Comprehend, using SQL commands.
You can list the HAQM Bedrock FMs using the AWS CLI:
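For example, the following command (a sketch; adjust the Region) lists the model IDs available to you:

```bash
aws bedrock list-foundation-models \
  --region us-east-1 \
  --query 'modelSummaries[].modelId' \
  --output table
```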
Before you use base models, make sure the target model is enabled on the HAQM Bedrock console. If it isn’t enabled, add model access for the target model.
You are now able to create functions that directly access HAQM Bedrock from Aurora. The following examples show how to create functions using the HAQM Titan Text G1 – Express and Anthropic Claude 3 Haiku models on HAQM Bedrock, both of which support the TEXT modality. If you want to use a different model, refer to Base model IDs for the model ID and Supported foundation models in HAQM Bedrock for a list of supported models.
- Log in with the bedrock_user account that you created earlier.
- Navigate to the bedrockdb database and create functions after setting the role to AWS_BEDROCK_ACCESS. The function definitions are provided in the GitHub repository.
- Create a function to call HAQM Titan Text G1 – Express:
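The following definition is a sketch based on the documented Aurora ML syntax; invoke_titan is an illustrative name:

```sql
SET ROLE AWS_BEDROCK_ACCESS;

-- Map a SQL function to the Titan Text G1 - Express model
CREATE FUNCTION invoke_titan (request_body TEXT)
RETURNS TEXT
ALIAS AWS_BEDROCK_INVOKE_MODEL
MODEL ID 'amazon.titan-text-express-v1'
CONTENT_TYPE 'application/json'
ACCEPT 'application/json';
```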
- Create a function to call Anthropic Claude 3 Haiku:
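Again a sketch with an illustrative function name; note the Claude 3 Haiku model ID:

```sql
-- Map a SQL function to the Claude 3 Haiku model
CREATE FUNCTION invoke_claude_haiku (request_body TEXT)
RETURNS TEXT
ALIAS AWS_BEDROCK_INVOKE_MODEL
MODEL ID 'anthropic.claude-3-haiku-20240307-v1:0'
CONTENT_TYPE 'application/json'
ACCEPT 'application/json';
```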
- Ask “What is the proportion of land and sea on Earth?” and invoke an HAQM Titan function:
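A sketch of the call; the Titan request body uses inputText, and the generated text is returned in results[0].outputText:

```sql
SELECT json_unquote(json_extract(invoke_titan(
  '{"inputText": "What is the proportion of land and sea on Earth?"}'
), '$.results[0].outputText')) AS answer;
```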
- Invoke an Anthropic Claude 3 function:
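Claude 3 on HAQM Bedrock uses the Anthropic Messages API, so the request body differs from Titan’s; the following is a sketch:

```sql
SELECT json_unquote(json_extract(invoke_claude_haiku(
  '{"anthropic_version": "bedrock-2023-05-31", "max_tokens": 1024,
    "messages": [{"role": "user",
      "content": [{"type": "text", "text": "What is the proportion of land and sea on Earth?"}]}]}'
), '$.content[0].text')) AS answer;
```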
You can verify the correctness of the response by checking the output of the question in HAQM Bedrock playgrounds on the HAQM Bedrock console. Refer to Anthropic’s Claude 3 Haiku model is now available on HAQM Bedrock for more information.
Now you have finished preparing your Aurora cluster to use HAQM Bedrock.
In the following sections, we demonstrate two use cases that integrate existing data with HAQM Bedrock.
If the Aurora cluster you created can’t communicate with HAQM Bedrock, you may need to adjust the network configuration so that it can. For more information, see Enabling network communication from HAQM Aurora MySQL to other AWS services, Create a VPC endpoint, and Use AWS PrivateLink to set up private access to HAQM Bedrock. In this post, we use the bedrock-runtime endpoint for a private connection between the VPC and HAQM Bedrock.
Use case 1: Complement existing data with HAQM Bedrock
Let’s see how you can complement existing data with information from HAQM Bedrock. In this example, you don’t have data in the database yet, so you create a table and add data for verification purposes. The table definition is provided in the GitHub repository.
- Create a sample table:
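The repository has the authoritative definition; the following is a hypothetical sketch (table and column names are illustrative) that leaves a column for HAQM Bedrock to fill in:

```sql
-- Illustrative sample table; popular_food is populated later by HAQM Bedrock
CREATE TABLE t_country (
  code VARCHAR(4) NOT NULL PRIMARY KEY,
  name VARCHAR(64) NOT NULL,
  popular_food VARCHAR(2048)
);
```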
- Insert sample data (assume that this is existing data stored in your database):
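For example:

```sql
INSERT INTO t_country (code, name) VALUES
  ('JP', 'Japan'),
  ('US', 'United States'),
  ('IT', 'Italy'),
  ('IN', 'India'),
  ('MX', 'Mexico');
```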
- Check that the data is stored in the table.
- Create a procedure that retrieves data from the table and then runs the function for each row. The procedure definition is provided in the GitHub repository:
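The following is a minimal sketch rather than the repository version; it loops over the rows with a cursor, asks the Titan function about each country, and stores the answer:

```sql
DELIMITER //
CREATE PROCEDURE update_popular_food()
BEGIN
  DECLARE done INT DEFAULT FALSE;
  DECLARE v_code VARCHAR(4);
  DECLARE v_name VARCHAR(64);
  DECLARE cur CURSOR FOR SELECT code, name FROM t_country;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

  OPEN cur;
  read_loop: LOOP
    FETCH cur INTO v_code, v_name;
    IF done THEN
      LEAVE read_loop;
    END IF;
    -- Ask HAQM Bedrock about each row and store the response
    UPDATE t_country
    SET popular_food = json_unquote(json_extract(invoke_titan(
          json_object('inputText',
            CONCAT('What is the most popular food in ', v_name, '? Answer briefly.'))
        ), '$.results[0].outputText'))
    WHERE code = v_code;
  END LOOP;
  CLOSE cur;
END //
DELIMITER ;
```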
- Run the procedure. It retrieves the data from the existing table, sends a request to HAQM Bedrock for each target row, and updates the table with the returned contents:
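Continuing the sketch above:

```sql
CALL update_popular_food();
SELECT code, name, popular_food FROM t_country;
```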
As a result, you should be able to obtain complementary information for your data.
The output lists popular foods from different countries.
For example, if you’re running a travel site and want to post tourist spots in Kyoto as reference information on your site, you can change the inquiry content as shown in the following example to obtain data. The data required varies depending on the use case, so try customizing the inquiry according to your needs:
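A sketch of such a query; the prompt is the only part that changes:

```sql
SELECT json_unquote(json_extract(invoke_titan(
  json_object('inputText', 'List five recommended sightseeing spots in Kyoto.')
), '$.results[0].outputText')) AS spots;
```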
Anthropic Claude 3 supports English, Spanish, Japanese, and multiple other languages. For example, the following request asks for five recommended sightseeing spots in Kyoto in Japanese:
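A sketch using the Claude function defined earlier, with the prompt written in Japanese:

```sql
SELECT json_unquote(json_extract(invoke_claude_haiku(
  json_object(
    'anthropic_version', 'bedrock-2023-05-31',
    'max_tokens', 1024,
    'messages', json_array(
      json_object('role', 'user', 'content', json_array(
        json_object('type', 'text', 'text', '京都でおすすめの観光スポットを5つ教えてください。')
      ))
    )
  )
), '$.content[0].text')) AS spots;
```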
Let’s also look at how to summarize long texts stored in a database as our next example.
Use case 2: Summarize existing data with HAQM Bedrock
Using HAQM Bedrock to summarize long texts stored in the database makes them less complicated to read. With summarization, you can get an overview in a short time, and if necessary, you can read the original data in the details. This use case has found application in summarizing product or service reviews and in summarizing case notes for support agents.
In this use case, we retrieve news about the latest releases of AWS services from What’s New with AWS?, then store it in the database, and retrieve only the product name from the saved data. We also summarize the release note. The table definition is provided in the GitHub repository.
Complete the following steps:
- Create a sample table for storing data:
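Again, the repository has the authoritative definition; this hypothetical sketch reserves columns for HAQM Bedrock to fill in:

```sql
-- Illustrative table; product and summary are populated later by HAQM Bedrock
CREATE TABLE t_aws_release (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(256),
  description TEXT,
  product VARCHAR(256),
  summary TEXT
);
```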
- Insert sample data by running the feed.py script from the GitHub repository in another terminal (requires Python 3.7 or later; you can run it with python3 feed.py).
After you run the script, the release documents are stored in the table.
Now you are ready to get the product name from the description column and store it in the product column.
- Create a procedure to add product names only from the description column:
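A sketch of such a procedure (update_product is an illustrative name; the repository version may differ). It asks Claude 3 Haiku to return only the product name for each description:

```sql
DELIMITER //
CREATE PROCEDURE update_product()
BEGIN
  -- Ask the model to return only the product name for each row
  UPDATE t_aws_release
  SET product = json_unquote(json_extract(invoke_claude_haiku(
        json_object(
          'anthropic_version', 'bedrock-2023-05-31',
          'max_tokens', 256,
          'messages', json_array(json_object(
            'role', 'user',
            'content', CONCAT('Extract the AWS product name from the following release description. ',
                              'Answer with the product name only: ', description))))
      ), '$.content[0].text'))
  WHERE product IS NULL;
END //
DELIMITER ;
```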
- Run the procedure to get the product name from the description column by using HAQM Bedrock:
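Continuing the sketch:

```sql
CALL update_product();
SELECT id, title, product FROM t_aws_release LIMIT 5;
```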
- After you run the procedure, you can confirm that the data has been added to the product column. This makes it more straightforward to understand the content.
If you want to summarize the description column, you need to change the question to something like “Please summarize the following descriptions in 200 characters or less” and change the target column to summary. It will summarize each description in approximately 200 characters, which can help improve work efficiency.
- Create a procedure to add the summary of the description column to the summary column:
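A sketch mirroring the previous procedure, with only the prompt and target column changed:

```sql
DELIMITER //
CREATE PROCEDURE update_summary()
BEGIN
  UPDATE t_aws_release
  SET summary = json_unquote(json_extract(invoke_claude_haiku(
        json_object(
          'anthropic_version', 'bedrock-2023-05-31',
          'max_tokens', 1024,
          'messages', json_array(json_object(
            'role', 'user',
            'content', CONCAT('Please summarize the following description in 200 characters or less: ',
                              description))))
      ), '$.content[0].text'))
  WHERE summary IS NULL;
END //
DELIMITER ;
```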
- Run the procedure to get summary data from HAQM Bedrock:
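Continuing the sketch:

```sql
CALL update_summary();
SELECT id, title, summary FROM t_aws_release LIMIT 3;
```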
- After you run the procedure, you can confirm that the data has been added to the summary column.
- In the following example, you can also check the number of characters in the description and summary columns:
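A sketch of such a check using MySQL’s char_length function:

```sql
SELECT id,
       char_length(description) AS description_chars,
       char_length(summary)     AS summary_chars
FROM t_aws_release;
```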
By using group_concat, it’s also possible to create a summary that combines multiple rows. The following is an example of taking data from 20 rows and creating a single summary. The sample code is provided in the GitHub repository:
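A sketch under the same assumptions; note that group_concat output is capped by group_concat_max_len, so raise it for long text:

```sql
-- Allow group_concat to return long concatenated text
SET SESSION group_concat_max_len = 65535;

SELECT json_unquote(json_extract(invoke_claude_haiku(
  json_object(
    'anthropic_version', 'bedrock-2023-05-31',
    'max_tokens', 1024,
    'messages', json_array(json_object(
      'role', 'user',
      'content', CONCAT('Please summarize the following release notes: ',
        (SELECT group_concat(description SEPARATOR ' ')
         FROM (SELECT description FROM t_aws_release LIMIT 20) AS t)))))
), '$.content[0].text')) AS combined_summary;
```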
Considerations
Although the demonstration shows a response within a minute when running SQL commands, this can vary depending on the granularity and volume of the context. It’s important to adapt this proof of concept to your own implementation before deploying it to production. If you plan to perform large amounts of processing, we also recommend that you refer to Quotas for HAQM Bedrock and HAQM Bedrock Pricing for costs.
Clean up
If you don’t need to use any of the resources you created, delete them when you are finished:
- Drop unnecessary objects and users from the database.
- If you no longer need to use Aurora ML but want to continue using your cluster, you can remove the ML-related parameter (aws_default_bedrock_role) and IAM roles from your cluster.
- If you no longer need the IAM role you created to access HAQM Bedrock, you can remove it, and you may need to update some of your networking configurations.
- If you no longer need your Aurora cluster, follow the instructions in Deleting Aurora DB clusters and DB instances.
Conclusion
Businesses today want to enhance the data stored in their relational databases with generative AI FMs to improve the end-user experience or improve productivity. In this post, we demonstrated how to obtain supplemental information for stored data using HAQM Bedrock. We also demonstrated how you can summarize the document stored in an Aurora database by using the Aurora ML integration capabilities with HAQM Bedrock.
The ability to invoke FMs on HAQM Bedrock as SQL functions using Aurora ML simplifies the learning curve in using LLMs while building generative AI applications. It provides straightforward, optimized, and secure integration between Aurora and AWS ML services without having to build custom integrations or move data around.
For more information about Aurora ML, see HAQM Aurora Machine Learning. To explore the latest in Aurora ML with HAQM Aurora MySQL, refer to Using HAQM Aurora machine learning with Aurora MySQL.
We invite you to leave feedback in the comments.
About the Authors
Steve Dille is a Senior Product Manager for HAQM Aurora. He leads all generative AI strategy and product initiatives with Aurora databases for AWS. Previous to this role, Steve founded the performance and benchmark team for Aurora and then built and launched the RDS Data API for HAQM Aurora Serverless v2. He has been with AWS for 4 years. Prior to this, he served as a software developer at NCR, product manager at HP and Data Warehousing Director at Sybase (SAP). He has over 20 years of experience as VP of Product or CMO on the executive teams of companies resulting in five successful company acquisitions and one IPO in the data management, analytics, and big data sectors. Steve earned a Master’s in Information and Data Science at UC Berkeley, an MBA from the University of Chicago Booth School of Business and a BS in Computer Science/Math with distinction from University of Pittsburgh.
Shinya Sugiyama is a Sr. Database Solutions Architect with HAQM Web Services. He worked for 10 years as a database technical consultant at a hardware and database software vendor. For over 10 years, he has been involved in a variety of work for internet companies, including site operations, production, and management. Currently, he supports AWS customers who use MySQL and MariaDB, mainly on HAQM RDS and HAQM Aurora, to solve their problems and provide solutions.