Build responsible AI applications with HAQM Bedrock Guardrails
As organizations embrace generative AI, they face critical challenges in making sure their applications align with the safeguards they have designed. Although foundation models (FMs) offer powerful capabilities, they can also introduce unique risks, such as generating harmful content, exposing sensitive information, being vulnerable to prompt injection attacks, and returning model hallucinations.
HAQM Bedrock Guardrails has helped address these challenges for multiple organizations, such as MAPFRE, KONE, Fiserv, PagerDuty, Aha!, and more. Just as traditional applications require multi-layered security, HAQM Bedrock Guardrails implements essential safeguards across model, prompt, and application levels, blocking up to 88% more undesirable and harmful multimodal content. HAQM Bedrock Guardrails helps filter over 75% of hallucinated responses in Retrieval Augmented Generation (RAG) and summarization use cases, and stands as the first and only safeguard using Automated Reasoning to prevent factual errors from hallucinations.
In this post, we show how to implement safeguards using HAQM Bedrock Guardrails in a healthcare insurance use case.
Solution overview
We consider an innovative AI assistant designed to streamline policyholders' interactions with a healthcare insurance firm. With this AI-powered solution, policyholders can check coverage details, submit claims, find in-network providers, and understand their benefits through natural, conversational interactions. The assistant provides around-the-clock support, handling routine inquiries while allowing human agents to focus on complex cases. To help the assistant operate securely and compliantly, we use HAQM Bedrock Guardrails as a critical safety framework that helps block undesirable and harmful multimodal content. This not only protects users, but also builds trust in the AI system, encouraging wider adoption and improving the overall customer experience in healthcare insurance interactions.
This post walks you through the capabilities of HAQM Bedrock Guardrails from the AWS Management Console. Refer to the following GitHub repo for information about creating, updating, and testing HAQM Bedrock Guardrails using the SDK.
HAQM Bedrock Guardrails provides configurable safeguards to help safely build generative AI applications at scale. It evaluates user inputs and model responses based on specific policies, working with all large language models (LLMs) on HAQM Bedrock, fine-tuned models, and external FMs using the ApplyGuardrail API. The solution integrates seamlessly with HAQM Bedrock Agents and HAQM Bedrock Knowledge Bases, so organizations can apply multiple guardrails across applications with tailored controls.
Guardrails can be implemented in two ways. The first is direct integration with the Invoke APIs (InvokeModel and InvokeModelWithResponseStream) and Converse APIs (Converse and ConverseStream) for models hosted on HAQM Bedrock, which applies safeguards during inference. The second is the flexible ApplyGuardrail API, which enables independent content evaluation without model invocation; this method is ideal for assessing inputs or outputs at various application stages and works with custom or third-party models that are not hosted on HAQM Bedrock. Both approaches empower developers to implement use case-specific safeguards aligned with responsible AI policies, helping to block undesirable and harmful multimodal content from generative AI applications.
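As a rough illustration of the first approach, the following Python sketch attaches a guardrail to a Converse API call using boto3; the model ID, guardrail ID, and version are placeholders for your own values.

```python
import boto3

# Bedrock Runtime client; the Region and identifiers below are placeholders.
runtime = boto3.client("bedrock-runtime", region_name="us-west-2")

response = runtime.converse(
    modelId="<model-id>",  # any model hosted on HAQM Bedrock
    messages=[{
        "role": "user",
        "content": [{"text": "What does my plan cover for physical therapy?"}],
    }],
    guardrailConfig={
        "guardrailIdentifier": "<guardrail-id>",  # ID or ARN of your guardrail
        "guardrailVersion": "1",
        "trace": "enabled",  # include the guardrail assessment in the response
    },
)
print(response["output"]["message"]["content"][0]["text"])
```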
The following diagram depicts the six safeguarding policies offered by HAQM Bedrock Guardrails.
Prerequisites
Before we begin, make sure you have access to the console with appropriate permissions for HAQM Bedrock. If you haven’t set up HAQM Bedrock yet, refer to Getting started in the HAQM Bedrock console.
Create a guardrail
To create a guardrail for our healthcare insurance assistant, complete the following steps:
- On the HAQM Bedrock console, choose Guardrails in the navigation pane.
- Choose Create guardrail.
- In the Provide guardrail details section, enter a name (for this post, we use MyHealthCareGuardrail), an optional description, and a message to display if your guardrail blocks the user prompt, then choose Next.
Configure multimodal content filters
Security is paramount when building AI applications. With image support in HAQM Bedrock Guardrails, content filters can detect and filter both text and image content across six protection categories: Hate, Insults, Sexual, Violence, Misconduct, and Prompt Attacks.
- In the Configure content filters section, for maximum protection, especially in sensitive sectors like healthcare in our example use case, set your confidence thresholds to High across all categories for both text and image content.
- Enable prompt attack protection to prevent system instruction tampering, and use input tagging to maintain accurate classification of system prompts, then choose Next. An SDK sketch of this filter configuration follows these steps.
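If you script guardrail creation with the SDK instead of the console, the choices above roughly correspond to a contentPolicyConfig argument for the CreateGuardrail API. The following is a minimal sketch; the category list and strengths mirror the console settings described in this section, and we pass this dictionary to create_guardrail in a later sketch.

```python
# contentPolicyConfig for create_guardrail: HIGH thresholds across all six
# categories, applied to both text and image content.
content_policy = {
    "filtersConfig": [
        {
            "type": category,
            "inputStrength": "HIGH",
            "outputStrength": "HIGH",
            "inputModalities": ["TEXT", "IMAGE"],
            "outputModalities": ["TEXT", "IMAGE"],
        }
        for category in ["HATE", "INSULTS", "SEXUAL", "VIOLENCE", "MISCONDUCT"]
    ]
    + [
        # Prompt attack detection applies to inputs; its output strength is NONE.
        {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"},
    ],
}
```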
Denied topics
In healthcare applications, we need clear boundaries around medical advice. Let’s configure HAQM Bedrock Guardrails to prevent users from attempting disease diagnosis, which should be handled by qualified healthcare professionals.
- In the Add denied topics section, create a new topic called Disease Diagnosis, add example phrases that represent diagnostic queries, and choose Confirm.
This setting helps make sure our application stays within appropriate boundaries for insurance-related queries while avoiding medical diagnosis discussions. For example, when users ask questions like “Do I have diabetes?” or “What’s causing my headache?”, the guardrail will detect these as diagnosis-related queries and block them with an appropriate response.
- After you set up your denied topics, choose Next to proceed with word filters. An SDK sketch of this denied topic follows.
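In SDK form, the same denied topic can be expressed as a topicPolicyConfig entry for the CreateGuardrail API; the definition wording below is illustrative and should be tuned to your policy.

```python
# topicPolicyConfig for create_guardrail: deny disease diagnosis discussions.
topic_policy = {
    "topicsConfig": [
        {
            "name": "Disease Diagnosis",
            # Illustrative definition text; refine it for your use case.
            "definition": "Requests to diagnose, confirm, or rule out a medical "
                          "condition based on a user's symptoms.",
            "examples": [
                "Do I have diabetes?",
                "What's causing my headache?",
            ],
            "type": "DENY",
        }
    ]
}
```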
Word filters
Configuring word filters in HAQM Bedrock Guardrails helps keep our healthcare insurance application focused and professional. These filters help maintain conversation boundaries and make sure responses stay relevant to health insurance queries.
Let’s set up word filters for two key purposes:
- Block inappropriate language to maintain professional discourse
- Filter irrelevant topics that fall outside the healthcare insurance scope
To set them up, do the following:
- In the Add word filters section, add custom words or phrases to filter (in our example, we include off-topic terms like “stocks,” “investment strategies,” and “financial performance”), then choose Next. An SDK sketch of this word filter follows.
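The equivalent wordPolicyConfig, sketched with the example terms above plus the managed profanity list to help maintain professional discourse:

```python
# wordPolicyConfig for create_guardrail: custom off-topic terms plus the
# managed profanity word list.
word_policy = {
    "wordsConfig": [
        {"text": "stocks"},
        {"text": "investment strategies"},
        {"text": "financial performance"},
    ],
    "managedWordListsConfig": [{"type": "PROFANITY"}],
}
```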
Sensitive information filters
With sensitive information filters, you can configure filters to block email addresses, phone numbers, and other personally identifiable information (PII), as well as set up custom regex patterns for industry-specific data requirements. For example, healthcare providers can use these filters to help maintain HIPAA compliance by automatically blocking the PII types they specify. This way, they can use AI capabilities while helping to maintain strict patient privacy standards.
- For our example, configure filters for blocking the email addresses and phone numbers of healthcare insurance users, then choose Next. An SDK sketch of this PII configuration follows.
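A minimal sensitiveInformationPolicyConfig sketch matching those choices:

```python
# sensitiveInformationPolicyConfig for create_guardrail: block email addresses
# and phone numbers in prompts and responses.
pii_policy = {
    "piiEntitiesConfig": [
        {"type": "EMAIL", "action": "BLOCK"},
        {"type": "PHONE", "action": "BLOCK"},
    ],
    # regexesConfig could be added here for custom patterns such as member IDs.
}
```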
Contextual grounding checks
We use HAQM Bedrock Guardrails contextual grounding and relevance checks in our application to help validate model responses, detect hallucinations, and support alignment with reference sources.
- Set up the thresholds for contextual grounding and relevance checks (we set them to 0.7), then choose Next. The sketch after this step shows the same thresholds in SDK form, combined with the earlier policy configurations.
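The grounding and relevance thresholds map to a contextualGroundingPolicyConfig block, and the policy dictionaries from the earlier sketches can then be combined into a single CreateGuardrail call. This is a minimal sketch, assuming those dictionaries are defined in the same script; the blocked messages are placeholder wording. The Automated Reasoning policy described in the next section is attached separately (it is in preview), so it isn't part of this call.

```python
import boto3

# contextualGroundingPolicyConfig: responses scoring below these thresholds
# are blocked as ungrounded or irrelevant.
grounding_policy = {
    "filtersConfig": [
        {"type": "GROUNDING", "threshold": 0.7},
        {"type": "RELEVANCE", "threshold": 0.7},
    ]
}

bedrock = boto3.client("bedrock")  # control-plane client for guardrail management

created = bedrock.create_guardrail(
    name="MyHealthCareGuardrail",
    description="Safeguards for the healthcare insurance assistant",
    contentPolicyConfig=content_policy,           # content filter sketch above
    topicPolicyConfig=topic_policy,               # denied topics sketch
    wordPolicyConfig=word_policy,                 # word filters sketch
    sensitiveInformationPolicyConfig=pii_policy,  # PII filters sketch
    contextualGroundingPolicyConfig=grounding_policy,
    blockedInputMessaging="Sorry, I can't respond to that request.",
    blockedOutputsMessaging="Sorry, I can't provide that response.",
)
print(created["guardrailId"], created["version"])
```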
Automated Reasoning checks
Automated Reasoning checks help detect hallucinations and provide verifiable proof that our application’s model (LLM) responses are accurate.
The first step to incorporate Automated Reasoning checks for our application is to create an Automated Reasoning policy, which is composed of a set of variables, each defined with a name, type, and description, and the logical rules that operate on those variables. These rules are expressed in formal logic, but they’re translated to natural language to make it straightforward for a user without formal logic expertise to refine a model. For example, a policy for our use case might define an integer variable for claim processing days together with a rule that new claims require a minimum of 7 days, mirroring the sample policy document we use later in this post. Automated Reasoning checks use the variable descriptions to extract their values when validating a Q&A.
- To create an Automated Reasoning policy, choose the new Automated Reasoning menu option under Safeguards.
- Create a new policy and give it a name, then upload an existing document that defines the right solution space, such as an HR guideline or an operational manual. For this demo, we use an example healthcare insurance policy document that includes the insurance coverage policies applicable to insurance holders.
Automated Reasoning checks are in preview in HAQM Bedrock Guardrails in the US West (Oregon) AWS Region. To request access to the preview, contact your AWS account team.
- Define the policy’s intent and processing parameters and choose Create policy.
The system now initiates an automated process to create your Automated Reasoning policy. This process involves analyzing your document, identifying key concepts, breaking down the document into individual units, translating these natural language units into formal logic, validating the translations, and finally combining them into a comprehensive logical model. You can review the generated structure, including the rules and variables, and edit these for accuracy through the UI.
- To attach the Automated Reasoning policy to your guardrail, turn on Enable Automated Reasoning policy, choose the policy and policy version you want to use, then choose Next.
- Review the configurations set in the previous steps and choose Create guardrail.
Test your guardrail
We can now test our healthcare insurance call center application with different inputs and see how the configured guardrail intervenes for harmful and undesirable multimodal content.
- On the HAQM Bedrock console, on the guardrail details page, choose Select model in the Test panel.
- Choose your model, then choose Apply.
For our example, we use the HAQM Nova Lite FM, which is a low-cost multimodal model that is lightning fast for processing image, video, and text input. For your use case, you can use another model of your choice.
- Enter a query prompt with a denied topic.
For example, if we ask “I have cold and sore throat, do you think I have Covid, and if so please provide me information on what is the coverage,” the system recognizes this as a request for a disease diagnosis. Because Disease Diagnosis is configured as a denied topic in the guardrail settings, the system blocks the response.
- Choose View trace to see the details of the intervention.
You can test with other queries. For example, if we ask “What is the financial performance of your insurance company in 2024?”, the word filter guardrail that we configured earlier intervenes. You can choose View trace to see that the word filter was invoked.
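If you run these tests through the SDK instead, the rough equivalent of View trace is the trace block returned by the Converse API when tracing is enabled, as in the earlier sketch. A brief sketch of reading it follows; the exact assessment fields depend on which policies intervened.

```python
# Continuing the earlier Converse sketch: when the guardrail intervenes, the
# stop reason changes and the trace carries the per-policy assessment.
print(response["stopReason"])  # "guardrail_intervened" when a policy blocks the turn

guardrail_trace = response.get("trace", {}).get("guardrail", {})
print(guardrail_trace.get("inputAssessment"))    # findings for the user prompt
print(guardrail_trace.get("outputAssessments"))  # findings for the model response
```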
Next, we use a prompt to validate whether PII data in the input can be blocked using the guardrail. We ask “Can you send my lab test report to abc@gmail.com?” Because the guardrail was set up to block email addresses, the trace shows an intervention due to PII detection in the input prompt.
If we enter the prompt “I am frustrated on someone, and feel like hurting the person,” the text content filter is invoked for Violence because we set a high threshold for detecting violent content when creating the guardrail.
If we provide an image file in the prompt that contains content of the category Violence, the image content filter gets invoked for Violence.
Finally, we test the Automated Reasoning policy by using the Test playground on the HAQM Bedrock console. You can input a sample user question and an incorrect answer to check if your Automated Reasoning policy works correctly. In our example, according to the insurance policy provided, new insurance claims take a minimum of 7 days to get processed. Here, we input the question “Can you process my new insurance claim in less than 3 days?” and the incorrect answer “Yes, I can process it in 3 days.”
The Automated Reasoning checks marked the answer as Invalid and provided details about why, including which specific rule was broken, the relevant variables it found, and recommendations for fixing the issue.
Independent API
In addition to using HAQM Bedrock Guardrails as shown in the preceding section for HAQM Bedrock hosted models, you can now use HAQM Bedrock Guardrails to apply safeguards on input prompts and model responses for FMs available in other services (such as HAQM SageMaker), for models running on infrastructure such as HAQM Elastic Compute Cloud (HAQM EC2) or in on-premises deployments, and for other third-party FMs beyond HAQM Bedrock. The ApplyGuardrail API assesses text using your preconfigured guardrails in HAQM Bedrock, without invoking the FMs.
While testing HAQM Bedrock Guardrails, select Use ApplyGuardrail API to validate user inputs using MyHealthCareGuardrail. This test doesn’t require you to choose an HAQM Bedrock hosted model; you can test your configured guardrails as an independent API.
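A minimal sketch of that standalone flow, reusing the guardrail created earlier (the identifier is a placeholder; the guardrail's ARN also works):

```python
import boto3

runtime = boto3.client("bedrock-runtime")

# Evaluate a user input against the guardrail without invoking any model.
result = runtime.apply_guardrail(
    guardrailIdentifier="<guardrail-id>",  # ID or ARN of MyHealthCareGuardrail
    guardrailVersion="DRAFT",              # or a numbered version
    source="INPUT",                        # "OUTPUT" evaluates a model response instead
    content=[{"text": {"text": "Can you send my lab test report to abc@gmail.com?"}}],
)

print(result["action"])  # "GUARDRAIL_INTERVENED" or "NONE"
for output in result.get("outputs", []):
    print(output["text"])  # the configured blocked message when intervention occurs
```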
Conclusion
In this post, we demonstrated how HAQM Bedrock Guardrails helps block harmful and undesirable multimodal content. Using a healthcare insurance call center scenario, we walked through the process of configuring and testing various guardrails. We also highlighted the flexibility of the ApplyGuardrail API, which implements guardrail checks on any input prompt, regardless of the FM in use. You can seamlessly integrate safeguards across models deployed on HAQM Bedrock or external platforms.
Ready to take your AI applications to the next level of safety and compliance? Check out HAQM Bedrock Guardrails announces IAM Policy-based enforcement to deliver safe AI interactions, which enables security and compliance teams to establish mandatory guardrails for model inference calls, helping to consistently enforce your guardrails across AI interactions. To dive deeper into HAQM Bedrock Guardrails, refer to Use guardrails for your use case, which includes advanced use cases with HAQM Bedrock Knowledge Bases and HAQM Bedrock Agents.
This guidance is for informational purposes only. You should still perform your own independent assessment and take measures to ensure that you comply with your own specific quality control practices and standards, and the local rules, laws, regulations, licenses and terms of use that apply to you, your content, and the third-party model referenced in this guidance. AWS has no control or authority over the third-party model referenced in this guidance and does not make any representations or warranties that the third-party model is secure, virus-free, operational, or compatible with your production environment and standards. AWS does not make any representations, warranties, or guarantees that any information in this guidance will result in a particular outcome or result.
About the authors
Divya Muralidharan is a Solutions Architect at AWS, supporting a strategic customer. Divya is an aspiring member of the AI/ML technical field community at AWS. She is passionate about using technology to accelerate growth, provide value to customers, and achieve business outcomes. Outside of work, she spends time cooking, singing, and growing plants.
Rachna Chadha is a Principal Technologist at AWS, where she helps customers leverage generative AI solutions to drive business value. With decades of experience in helping organizations adopt and implement emerging technologies, particularly within the healthcare domain, Rachna is passionate about the ethical and responsible use of artificial intelligence. She believes AI has the power to create positive societal change and foster both economic and social progress. Outside of work, Rachna enjoys spending time with her family, hiking, and listening to music.