AWS Machine Learning Blog

Build responsible AI applications with HAQM Bedrock Guardrails

As organizations embrace generative AI, they face critical challenges in making sure their applications align with their designed safeguards. Although foundation models (FMs) offer powerful capabilities, they can also introduce unique risks, such as generating harmful content, exposing sensitive information, being vulnerable to prompt injection attacks, and returning model hallucinations.

HAQM Bedrock Guardrails has helped address these challenges for multiple organizations, such as MAPRE, KONE, Fiserv, PagerDuty, Aha, and more. Just as traditional applications require multi-layered security, HAQM Bedrock Guardrails implements essential safeguards across model, prompt, and application levels—blocking up to 88% more undesirable and harmful multimodal content. HAQM Bedrock Guardrails helps filter over 75% hallucinated responses in Retrieval Augmented Generation (RAG) and summarization use cases, and stands as the first and only safeguard using Automated Reasoning to prevent factual errors from hallucinations.

In this post, we show how to implement safeguards using HAQM Bedrock Guardrails in a healthcare insurance use case.

Solution overview

We consider an innovative AI assistant designed to streamline interactions of policyholders with the healthcare insurance firm. With this AI-powered solution, policyholders can check coverage details, submit claims, find in-network providers, and understand their benefits through natural, conversational interactions. The assistant provides all-day support, handling routine inquiries while allowing human agents to focus on complex cases. To help enable secure and compliant operations of our assistant, we use HAQM Bedrock Guardrails to serve as a critical safety framework. HAQM Bedrock Guardrails can help maintain high standards of blocking undesirable and harmful multimodal content. This not only protects the users, but also builds trust in the AI system, encouraging wider adoption and improving overall customer experience in healthcare insurance interactions.

This post walks you through the capabilities of HAQM Bedrock Guardrails from the AWS Management Console. Refer to the following GitHub repo for information about creating, updating, and testing HAQM Bedrock Guardrails using the SDK.

HAQM Bedrock Guardrails provides configurable safeguards to help safely build generative AI applications at scale. It evaluates user inputs and model responses based on specific policies, working with all large language models (LLMs) on HAQM Bedrock, fine-tuned models, and external FMs using the ApplyGuardrail API. The solution integrates seamlessly with HAQM Bedrock Agents and HAQM Bedrock Knowledge Bases, so organizations can apply multiple guardrails across applications with tailored controls.

Guardrails can be implemented in two ways: direct integration with Invoke APIs (InvokeModel and InvokeModelWithResponseStream) and Converse APIs (Converse and ConverseStream) for models hosted on HAQM Bedrock, applying safeguards during inference, or through the flexible ApplyGuardrail API, which enables independent content evaluation without model invocation. This second method is ideal for assessing inputs or outputs at various application stages and works with custom or third-party models that are not hosted on HAQM Bedrock. Both approaches empower developers to implement use case-specific safeguards aligned with responsible AI policies, helping to block undesirable and harmful multimodal content from generative AI applications.

The following diagram depicts the six safeguarding policies offered by HAQM Bedrock Guardrails.

Diagram showing HAQM Bedrock Guardrails system flow from user input to final response with content filtering steps

Prerequisites

Before we begin, make sure you have access to the console with appropriate permissions for HAQM Bedrock. If you haven’t set up HAQM Bedrock yet, refer to Getting started in the HAQM Bedrock console.

Create a guardrail

To create guardrail for our healthcare insurance assistant, complete the following steps:

  1. On the HAQM Bedrock console, choose Guardrails in the navigation pane.
  2. Choose Create guardrail.
  3. In the Provide guardrail details section, enter a name (for this post, we use MyHealthCareGuardrail), an optional description, and a message to display if your guardrail blocks the user prompt, then choose Next.

HAQM Bedrock Guardrails configuration interface for MyHealthCareGuardrail with multi-step setup process and customizable options

Configuring Multimodal Content filters

Security is paramount when building AI applications. With image content filters in HAQM Bedrock Guardrails, content filters can now detect and filter both text and image content through six protection categories: Hate, Insults, Sexual, Violence, Misconduct, and Prompt Attacks.

  1. In the Configure content filters section, for maximum protection, especially in sensitive sectors like healthcare in our example use case, set your confidence thresholds to High across all categories for both text and image content.
  2. Enable prompt attack protection to prevent system instruction tampering, and use input tagging to maintain accurate classification of system prompts, then choose Next.

WS guardrail configuration interface for content filtering showing harmful content categories, threshold controls, and prompt attack prevention settings

Denied topics

In healthcare applications, we need clear boundaries around medical advice. Let’s configure HAQM Bedrock Guardrails to prevent users from attempting disease diagnosis, which should be handled by qualified healthcare professionals.

  1. In the Add denied topics section, create a new topic called Disease Diagnosis, add example phrases that represent diagnostic queries, and choose Confirm.

This setting helps makes sure our application stays within appropriate boundaries for insurance-related queries while avoiding medical diagnosis discussions. For example, when users ask questions like “Do I have diabetes?” or “What’s causing my headache?”, the guardrail will detect these as diagnosis-related queries and block them with an appropriate response.

HAQM Bedrock Guardrails interface showing Disease Diagnosis denied topic setup with sample phrases

  1. After you set up your denied topics, choose Next to proceed with word filters.

HAQM Bedrock Guardrails configuration interface with Disease Diagnosis as denied topic

Word filters

Configuring word filters in HAQM Bedrock Guardrails helps keep our healthcare insurance application focused and professional. These filters help maintain conversation boundaries and make sure responses stay relevant to health insurance queries.

Let’s set up word filters for two key purposes:

  • Block inappropriate language to maintain professional discourse
  • Filter irrelevant topics that fall outside the healthcare insurance scope

To set them up, do the following:

  1. In the Add word filters section, add custom words or phrases to filter (in our example, we include off-topic terms like “stocks,” “investment strategies,” and “financial performance”), then choose Next.

HAQM Bedrock guardrail creation interface showing word filter configuration steps and options

Sensitive information filtersWith sensitive information filters, you can configure filters to block email addresses, phone numbers, and other personally identifiable information (PII), as well as set up custom regex patterns for industry-specific data requirements. For example, healthcare providers use these filters to maintain HIPAA compliance to help automatically block PII types that they include. This way, they can use AI capabilities while helping to maintain strict patient privacy standards.

  1. For our example, configure filters for blocking the email address and phone number of healthcare insurance users, then choose Next.

HAQM Bedrock interface for configuring sensitive information filters with PII and regex options
Contextual grounding checks We use HAQM Bedrock Guardrails contextual grounding and relevance checks in our application to help validate model responses, detect hallucinations, and support alignment with reference sources.

  1. Set up the thresholds for contextual grounding and relevance checks (we set them to 0.7), then choose Next.

HAQM Bedrock guardrail configuration for contextual grounding and relevance checks

Automated Reasoning checks

Automated Reasoning checks help detect hallucinations and provide a verifiable proof that our application’s model (LLM) response is accurate.

The first step to incorporate Automated Reasoning checks for our application is to create an Automated Reasoning policy that is composed of a set of variables, defined with a name, type, and description, and the logical rules that operate on the variables. These rules are expressed in formal logic, but they’re translated to natural language to make it straightforward for a user without formal logic expertise to refine a model. Automated Reasoning checks use the variable descriptions to extract their values when validating a Q&A.

  1. To create an Automated Reasoning policy, choose the new Automated Reasoning menu option under Safeguards.
  2. Create a new policy and give it a name, then upload an existing document that defines the right solution space, such as an HR guideline or an operational manual. For this demo, we use an example healthcare insurance policy document that includes the insurance coverage policies applicable to insurance holders.

Automated Reasoning checks is in preview in HAQM Bedrock Guardrails in the US West (Oregon) AWS Region. To request to be considered for access to the preview today, contact your AWS account team.

  1. Define the policy’s intent and processing parameters and choose Create policy.

HAQM Bedrock interface showing HealthCareCoveragePolicy creation page with policy details, generation settings, and file upload

The system now initiates an automated process to create your Automated Reasoning policy. This process involves analyzing your document, identifying key concepts, breaking down the document into individual units, translating these natural language units into formal logic, validating the translations, and finally combining them into a comprehensive logical model. You can review the generated structure, including the rules and variables, and edit these for accuracy through the UI.

HAQM Bedrock policy editor displaying comprehensive healthcare coverage rules and variables with types, descriptions, and configuration options

  1. To attach the Automated Reasoning policy to your guardrail, turn on Enable Automated Reasoning policy, choose the policy and policy version you want to use, then choose Next.

HAQM Bedrock guardrail creation wizard on step 7, showing HealthCareCoveragePolicy Automated Reasoning configuration options

  1. Review the configurations set in the previous steps and choose Create guardrail.

HAQM Bedrock Guardrail 8-step configuration summary showing MyHealthCareGuardrail setup with safety measures and blocked response messages

HAQM Bedrock Guardrail content filter configuration showing harmful categories and denied topics

HAQM Bedrock Guardrail Steps 4-5 showing enabled profanity filter, word lists, and PII blocking settings

HAQM Bedrock Guardrail setup steps 6-7 with enabled grounding checks and HealthCareCoveragePolicy settings

Test your guardrail

We can now test our healthcare insurance call center application with different inputs and see how the configured guardrail intervenes for harmful and undesirable multimodal content.

  1. On the HAQM Bedrock console, on the guardrail details page, choose Select model in the Test panel.

HAQM Bedrock healthcare guardrail dashboard displaying overview, status, and test interface

  1. Choose your model, then choose Apply.

For our example, we use the HAQM Nova Lite FM, which is a low-cost multimodal model that is lightning fast for processing image, video, and text input. For your use case, you can use another model of your choice.

AWS Guardrail configuration interface showing model categories, providers, and inference options with Nova Lite selected

  1. Enter a query prompt with a denied topic.

For example, if we ask “I have cold and sore throat, do you think I have Covid, and if so please provide me information on what is the coverage,” the system recognizes this as a request for a disease diagnosis. Because Disease Diagnosis is configured as a denied topic in the guardrail settings, the system blocks the response.

HAQM Bedrock interface with Nova Lite model blocking COVID-19 related question

  1. Choose View trace to see the details of the intervention.

HAQM Bedrock Guardrails interface with Nova Lite model, blocked response for COVID-19 query
You can test with other queries. For example, if we ask “What is the financial performance of your insurance company in 2024?”, the word filter guardrail that we configured earlier intervenes. You can choose View trace to see that the word filter was invoked.

HAQM Bedrock interface showing blocked response due to guardrail word filter detection

Next, we use a prompt to validate if PII data in input can be blocked using the guardrail. We ask “Can you send my lab test report to abc@gmail.com?” Because the guardrail was set up to block email addresses, the trace shows an intervention due to PII detection in the input prompt.

HAQM Bedrock healthcare guardrail demonstration showing blocked response due to sensitive information filter detecting email

If we enter the prompt “I am frustrated on someone, and feel like hurting the person.” The text content filter is invoked for Violence because we set up Violence as a high threshold for detection of the harmful content while creating the guardrail.

HAQM Bedrock guardrail test interface showing blocked response due to detected violence in prompt

If we provide an image file in the prompt that contains content of the category Violence, the image content filter gets invoked for Violence.

HAQM Bedrock guardrail test interface showing blocked response due to detected violence

Finally, we test the Automated Reasoning policy by using the Test playground on the HAQM Bedrock console. You can input a sample user question and an incorrect answer to check if your Automated Reasoning policy works correctly. In our example, according to the insurance policy provided, new insurance claims take a minimum 7 days to get processed. Here, we input the question “Can you process my new insurance claim in less than 3 days?” and the incorrect answer “Yes, I can process it in 3 days.”

HAQM Bedrock Automated Reasoning interface showing HealthCareCoveragePolicy test playground and guardrail configuration

The Automated Reasoning checks marked the answer as Invalid and provided details about why, including which specific rule was broken, the relevant variables it found, and recommendations for fixing the issue.

Invalid validation result for electronic claims processing rule showing 7-10 day requirement with extracted CLAIM variable logic

Independent API

In addition to using HAQM Bedrock Guardrails as shown in the preceding section for HAQM Bedrock hosted models, you can now use HAQM Bedrock Guardrails to apply safeguards on input prompts and model responses for FMs available in other services (such as HAQM SageMaker), on infrastructure such as HAQM Elastic Compute Cloud (HAQM EC2), on on-premises deployments, and other third-party FMs beyond HAQM Bedrock. The ApplyGuardrail API assesses text using your preconfigured guardrails in HAQM Bedrock, without invoking the FMs.

While testing HAQM Bedrock Guardrails, select Use ApplyGuardrail API to validate user inputs using MyHealthCareGuardrail. The following test doesn’t require you to choose an HAQM Bedrock hosted model, you can test configured guardrails as an independent API.

HAQM Bedrock Guardrail API test interface with health-related prompt and safety intervention

Conclusion

In this post, we demonstrated how HAQM Bedrock Guardrails helps block harmful and undesirable multimodal content. Using a healthcare insurance call center scenario, we walked through the process of configuring and testing various guardrails. We also highlighted the flexibility of our ApplyGuardrail API, which implements guardrail checks on any input prompt, regardless of the FM in use. You can seamlessly integrate safeguards across models deployed on HAQM Bedrock or external platforms.

Ready to take your AI applications to the next level of safety and compliance? Check out HAQM Bedrock Guardrails announces IAM Policy-based enforcement to deliver safe AI interactions, which enables security and compliance teams to establish mandatory guardrails for model inference calls, helping to consistently enforce your guardrails across AI interactions. To dive deeper into HAQM Bedrock Guardrails, refer to Use guardrails for your use case, which includes advanced use cases with HAQM Knowledge Bases and HAQM Bedrock Agents.

This guidance is for informational purposes only. You should still perform your own independent assessment and take measures to ensure that you comply with your own specific quality control practices and standards, and the local rules, laws, regulations, licenses and terms of use that apply to you, your content, and the third-party model referenced in this guidance. AWS has no control or authority over the third-party model referenced in this guidance and does not make any representations or warranties that the third-party model is secure, virus-free, operational, or compatible with your production environment and standards. AWS does not make any representations, warranties, or guarantees that any information in this guidance will result in a particular outcome or result.

References


About the authors

Divya Muralidharan is a Solutions Architect at AWS, supporting a strategic customer. Divya is an aspiring member of the AI/ML technical field community at AWS. She is passionate about using technology to accelerate growth, provide value to customers, and achieve business outcomes. Outside of work, she spends time cooking, singing, and growing plants.

Blog AuthorRachna Chadha is a Principal Technologist at AWS, where she helps customers leverage generative AI solutions to drive business value. With decades of experience in helping organizations adopt and implement emerging technologies, particularly within the healthcare domain, Rachna is passionate about the ethical and responsible use of artificial intelligence. She believes AI has the power to create positive societal change and foster both economic and social progress. Outside of work, Rachna enjoys spending time with her family, hiking, and listening to music.