AWS Machine Learning Blog

Category: Announcements

Use HAQM Bedrock Intelligent Prompt Routing for cost and latency benefits

Today, we’re happy to announce the general availability of HAQM Bedrock Intelligent Prompt Routing. In this blog post, we share highlights from our internal testing, show how you can get started, and point out some caveats and best practices. We encourage you to incorporate HAQM Bedrock Intelligent Prompt Routing into your new and existing generative AI applications.
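To give a flavor of the developer experience, here is a minimal sketch of calling a prompt router through the Converse API with boto3. The router ARN is a placeholder, and the trace field used to read back the selected model is our assumption; confirm both against the prompt routing documentation for your Region.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Placeholder ARN: look up real default or custom router ARNs in the
# Bedrock console (or via the ListPromptRouters API) for your account.
router_arn = "arn:aws:bedrock:us-east-1:111122223333:default-prompt-router/anthropic.claude:1"

# A prompt router ARN is accepted wherever a model ID is expected.
response = bedrock_runtime.converse(
    modelId=router_arn,
    messages=[{"role": "user",
               "content": [{"text": "Summarize intelligent prompt routing in one sentence."}]}],
)

print(response["output"]["message"]["content"][0]["text"])
# Assumption: the response trace reports which underlying model was chosen.
print(response.get("trace", {}).get("promptRouter", {}).get("invokedModelId"))
```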

Pixtral Large is now available in HAQM Bedrock

In this post, we demonstrate how to get started with the Pixtral Large model in HAQM Bedrock. The Pixtral Large multimodal model allows you to tackle a variety of use cases, such as document understanding, logical reasoning, handwriting recognition, image comparison, entity extraction, extracting structured data from scanned images, and caption generation.
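As a taste of what that looks like in code, the following sketch sends an image to Pixtral Large through the Bedrock Converse API. The model ID (a cross-Region inference profile) and the file name are assumptions; verify the exact identifier in the Bedrock console for your Region.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

with open("scanned_invoice.png", "rb") as f:  # any local image to analyze
    image_bytes = f.read()

response = bedrock_runtime.converse(
    # Assumed inference profile ID; confirm for your Region.
    modelId="us.mistral.pixtral-large-2502-v1:0",
    messages=[{
        "role": "user",
        "content": [
            {"image": {"format": "png", "source": {"bytes": image_bytes}}},
            {"text": "Extract the vendor name, invoice date, and total amount as JSON."},
        ],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```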

Llama 4 family of models from Meta is now available in SageMaker JumpStart

Today, we’re excited to announce the availability of Llama 4 Scout and Maverick models in HAQM SageMaker JumpStart. In this blog post, we walk you through how to deploy and prompt a Llama-4-Scout-17B-16E-Instruct model using SageMaker JumpStart.
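For orientation, here is a minimal deployment sketch with the SageMaker Python SDK. The JumpStart model ID is an assumption; list the Llama 4 entries in SageMaker JumpStart to confirm the exact identifier and default instance type.

```python
from sagemaker.jumpstart.model import JumpStartModel

# Model ID is an assumption; confirm it in SageMaker JumpStart.
model = JumpStartModel(model_id="meta-vlm-llama-4-scout-17b-16e-instruct")
predictor = model.deploy(accept_eula=True)  # Llama models require accepting the EULA

payload = {
    "inputs": "Explain mixture-of-experts models in two sentences.",
    "parameters": {"max_new_tokens": 128, "temperature": 0.2},
}
print(predictor.predict(payload))
```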

Evaluate models or RAG systems using HAQM Bedrock Evaluations – Now generally available

Today, we’re excited to announce the general availability of the model and RAG evaluation features in HAQM Bedrock Evaluations, along with significant enhancements that make them fully environment-agnostic, so you can evaluate models and RAG systems no matter where they are hosted. In this post, we explore these new features in detail, showing you how to evaluate both RAG systems and models with practical examples. We demonstrate how to use the comparison capabilities to benchmark different implementations and make data-driven decisions about your AI deployments.
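As a rough sketch of the programmatic path, the following starts an LLM-as-a-judge evaluation job with boto3. The configuration shape, metric names, model IDs, ARNs, and S3 paths are all our approximations; check the CreateEvaluationJob API reference for the authoritative schema.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# All names, ARNs, and S3 URIs below are placeholders.
bedrock.create_evaluation_job(
    jobName="my-model-eval",
    roleArn="arn:aws:iam::111122223333:role/BedrockEvalRole",
    evaluationConfig={
        "automated": {
            "datasetMetricConfigs": [{
                "taskType": "General",
                "dataset": {"name": "my-prompts",
                            "datasetLocation": {"s3Uri": "s3://my-bucket/prompts.jsonl"}},
                "metricNames": ["Builtin.Correctness", "Builtin.Helpfulness"],
            }],
            # LLM-as-a-judge: a judge model scores the generated responses.
            "evaluatorModelConfig": {
                "bedrockEvaluatorModels": [
                    {"modelIdentifier": "anthropic.claude-3-5-sonnet-20240620-v1:0"}
                ]
            },
        }
    },
    inferenceConfig={"models": [
        {"bedrockModel": {"modelIdentifier": "meta.llama3-1-70b-instruct-v1:0"}}
    ]},
    outputDataConfig={"s3Uri": "s3://my-bucket/eval-results/"},
)
```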

Introducing AWS MCP Servers for code assistants (Part 1)

We’re excited to announce the open source release of AWS MCP Servers for code assistants — a suite of specialized Model Context Protocol (MCP) servers that bring HAQM Web Services (AWS) best practices directly to your development workflow. This post is the first in a series covering AWS MCP Servers. In this post, we walk through how these specialized MCP servers can dramatically reduce your development time while incorporating security controls, cost optimizations, and AWS Well-Architected best practices into your code.

AWS App Studio introduces a prebuilt solutions catalog and cross-instance Import and Export

In a recent AWS What’s New post, App Studio announced two new features to accelerate application building: a prebuilt solutions catalog and cross-instance Import and Export. In this post, we walk through how to use the prebuilt solutions catalog to get started quickly and how to use the Import and Export feature to share applications across instances.

HAQM Bedrock Guardrails image content filters provide industry-leading safeguards, helping customers block up to 88% of harmful multimodal content: Generally available today

HAQM Bedrock Guardrails announces the general availability of image content filters, enabling you to moderate both image and text content in your generative AI applications. In this post, we discuss how to get started with image content filters in HAQM Bedrock Guardrails.
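To illustrate, here is a minimal sketch of checking an image against a guardrail with the ApplyGuardrail API. The guardrail ID and version are placeholders; you first need to create a guardrail with the image content filters enabled.

```python
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

with open("user_upload.jpg", "rb") as f:
    image_bytes = f.read()

response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier="gr-example123",  # placeholder guardrail ID
    guardrailVersion="1",
    source="INPUT",  # evaluate user input; use "OUTPUT" for model responses
    content=[
        {"text": {"text": "What is shown in this picture?"}},
        {"image": {"format": "jpeg", "source": {"bytes": image_bytes}}},
    ],
)
print(response["action"])  # "GUARDRAIL_INTERVENED" when content is blocked
```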

HAQM SageMaker JumpStart adds fine-tuning support for models in a private model hub

Today, we are announcing an enhanced private hub feature with several new capabilities that give organizations greater control over their ML assets. These enhancements include the ability to fine-tune SageMaker JumpStart models directly within the private hub, support for adding and managing custom-trained models, deep linking capabilities for associated notebooks, and improved model version management.
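As a sketch of how private hub fine-tuning might look with the SageMaker Python SDK, the following uses JumpStartEstimator scoped to a hub. The hub name, model ID, and S3 path are placeholders, and the hub_name argument is our assumption about how the SDK targets a private hub; gated models may additionally require accepting a EULA.

```python
from sagemaker.jumpstart.estimator import JumpStartEstimator

# Hub name, model ID, and S3 URI below are placeholders.
estimator = JumpStartEstimator(
    model_id="meta-textgeneration-llama-3-8b",
    hub_name="MyCompanyPrivateHub",  # assumed way to scope to a private hub
)
estimator.fit({"training": "s3://my-bucket/fine-tuning-data/"})

# Deploy the fine-tuned model for inference.
predictor = estimator.deploy()
```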

Unleashing the multimodal power of HAQM Bedrock Data Automation to transform unstructured data into actionable insights

Today, we’re excited to announce the general availability of HAQM Bedrock Data Automation, a powerful, fully managed capability within HAQM Bedrock that seamlessly transforms unstructured multimodal data into structured, application-ready insights with high accuracy, cost efficiency, and scalability.
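For a sense of the workflow, here is a hedged sketch of the asynchronous invocation path with boto3. The ARNs and S3 paths are placeholders, and the parameter names reflect our reading of the GA API; you would create a Data Automation project (and use your account's Data Automation profile) before invoking it.

```python
import boto3

bda_runtime = boto3.client("bedrock-data-automation-runtime", region_name="us-east-1")

# ARNs and S3 URIs below are placeholders.
response = bda_runtime.invoke_data_automation_async(
    inputConfiguration={"s3Uri": "s3://my-bucket/input/earnings-call.mp4"},
    outputConfiguration={"s3Uri": "s3://my-bucket/bda-output/"},
    dataAutomationConfiguration={
        "dataAutomationProjectArn":
            "arn:aws:bedrock:us-east-1:111122223333:data-automation-project/my-project"
    },
    dataAutomationProfileArn=
        "arn:aws:bedrock:us-east-1:111122223333:data-automation-profile/us.data-automation-v1",
)

# Poll the asynchronous job; structured results land in the output S3 prefix.
status = bda_runtime.get_data_automation_status(invocationArn=response["invocationArn"])
print(status["status"])
```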

How to run Qwen 2.5 on AWS AI chips using Hugging Face libraries

In this post, we outline how to get started with deploying the Qwen 2.5 family of models on an Inferentia instance, using HAQM Elastic Compute Cloud (HAQM EC2) or HAQM SageMaker with the Hugging Face Text Generation Inference (TGI) container and the Hugging Face Optimum Neuron library. Qwen2.5 Coder and Math variants are also supported.
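As a quick taste of the EC2 path, here is a minimal sketch that compiles and runs Qwen 2.5 with Optimum Neuron on an Inferentia2 instance. The compilation settings (batch size, sequence length, core count) are illustrative starting points, not tuned values.

```python
# Run on an Inferentia2 (inf2) instance with the Neuron SDK installed:
#   pip install "optimum[neuronx]"
from optimum.neuron import NeuronModelForCausalLM
from transformers import AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"

# export=True compiles the model for Neuron cores on first load.
model = NeuronModelForCausalLM.from_pretrained(
    model_id,
    export=True,
    batch_size=1,
    sequence_length=2048,
    num_cores=2,
    auto_cast_type="bf16",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("What are AWS Inferentia chips?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```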