AWS Machine Learning Blog

Category: Artificial Intelligence

Build a serverless audio summarization solution with Amazon Bedrock and Whisper

In this post, we demonstrate how to use the OpenAI Whisper foundation model (FM) Whisper Large V3 Turbo to produce near real-time transcription. The model is available in Amazon Bedrock Marketplace, which offers access to over 140 models through a dedicated offering. The transcriptions are then processed by Amazon Bedrock for summarization and redaction of sensitive information.

Implement semantic video search using open source large vision models on Amazon SageMaker and Amazon OpenSearch Serverless

In this post, we demonstrate how to use large vision models (LVMs) for semantic video search using natural language and image queries. We introduce some use case-specific methods, such as temporal frame smoothing and clustering, to enhance the video search performance. Furthermore, we demonstrate the end-to-end functionality of this approach by using both asynchronous and real-time hosting options on Amazon SageMaker AI to perform video, image, and text processing using publicly available LVMs on the Hugging Face Model Hub. Finally, we use Amazon OpenSearch Serverless with its vector engine for low-latency semantic video search.

Multi-account support for Amazon SageMaker HyperPod task governance

In this post, we discuss how an enterprise with multiple accounts can access a shared Amazon SageMaker HyperPod cluster to run its heterogeneous workloads. We use SageMaker HyperPod task governance to enable this feature.

Build a Text-to-SQL solution for data consistency in generative AI using Amazon Nova

This post evaluates the key options for querying data using generative AI, discusses their strengths and limitations, and demonstrates why Text-to-SQL is the best choice for deterministic, schema-specific tasks. We show how to use Text-to-SQL effectively with Amazon Nova, a foundation model (FM) available in Amazon Bedrock, to derive precise and reliable answers from your data.

Modernize and migrate on-premises fraud detection machine learning workflows to Amazon SageMaker

Radial is the largest 3PL fulfillment provider, also offering integrated payment, fraud detection, and omnichannel solutions to mid-market and enterprise brands. In this post, we share how Radial optimized the cost and performance of their fraud detection machine learning (ML) applications by modernizing their ML workflow using Amazon SageMaker.

Contextual retrieval in Anthropic using Amazon Bedrock Knowledge Bases

Contextual retrieval enhances traditional RAG by adding chunk-specific explanatory context to each chunk before generating embeddings. This approach enriches the vector representation with relevant contextual information, enabling more accurate retrieval of semantically related content when responding to user queries. In this post, we demonstrate how to use contextual retrieval with Anthropic and Amazon Bedrock Knowledge Bases.
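
As a rough illustration of the idea described above, the following sketch prepends a short explanatory context to each chunk before it would be embedded. It is a minimal example under stated assumptions, not the implementation from the post: `Chunk`, `describe_chunk`, and `contextualize` are hypothetical names, and `describe_chunk` is a simple stand-in for the model call that would normally write the chunk-specific context.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_title: str  # title of the source document (assumed to be available)
    text: str       # raw chunk text


def describe_chunk(chunk: Chunk, position: int, total: int) -> str:
    """Hypothetical stand-in for the model call that writes chunk-specific context."""
    return (
        f"This excerpt is part {position} of {total} "
        f"from the document '{chunk.doc_title}'."
    )


def contextualize(chunks: list[Chunk]) -> list[str]:
    """Prepend explanatory context to each chunk.

    The returned strings, not the raw chunks, are what would be passed
    to the embedding model.
    """
    total = len(chunks)
    return [
        f"{describe_chunk(chunk, i, total)}\n\n{chunk.text}"
        for i, chunk in enumerate(chunks, start=1)
    ]
```

In practice, the context would be generated by a foundation model that sees the whole document, so that each embedded string carries information the isolated chunk lacks.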

Run small language models cost-efficiently with AWS Graviton and Amazon SageMaker AI

In this post, we demonstrate how to deploy a small language model on SageMaker AI by extending our pre-built containers to be compatible with AWS Graviton instances. We first provide an overview of the solution, and then provide detailed implementation steps to help you get started. You can find the example notebook in the GitHub repo.

Supercharge your development with Claude Code and Amazon Bedrock prompt caching

In this post, we explore how to combine Amazon Bedrock prompt caching with Claude Code, a coding agent released by Anthropic that is now generally available. This combination speeds up your development workflow by reducing inference response latency and lowering input token costs.

Unlocking the power of Model Context Protocol (MCP) on AWS

We’ve witnessed remarkable advances in model capabilities as generative AI companies have invested in developing their offerings. Language models such as Anthropic’s Claude Opus 4 and Sonnet 4 and Amazon Nova on Amazon Bedrock can reason, write, and generate responses with increasing sophistication. But even as these models grow more powerful, they can only work […]