AWS Machine Learning Blog

Category: Foundation models

AWS Field Experience reduced cost and delivered low latency and high performance with HAQM Nova Lite foundation model

The AWS Field Experience (AFX) team's migration to the HAQM Nova Lite model has delivered tangible enterprise value by enhancing sales workflows. The move has not only achieved significant cost savings and reduced latency, but has also empowered sellers with an intelligent and reliable solution.

Reduce ML training costs with HAQM SageMaker HyperPod

In this post, we explore the challenges of large-scale frontier model training, with a focus on hardware failures, and show how HAQM SageMaker HyperPod minimizes disruptions, enhances efficiency, and reduces training costs.
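To make the setup concrete, here is a minimal sketch of creating a HyperPod cluster with the SageMaker CreateCluster API via boto3. The cluster name, S3 lifecycle-script location, IAM role ARN, and instance choices are hypothetical placeholders; adapt them to your account and region.

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical names: replace the bucket, role ARN, and sizes with your own.
sm.create_cluster(
    ClusterName="frontier-training-cluster",
    InstanceGroups=[
        {
            "InstanceGroupName": "worker-group",
            "InstanceType": "ml.p5.48xlarge",  # GPU training nodes
            "InstanceCount": 4,
            # Lifecycle scripts bootstrap each node when it joins the cluster.
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://my-bucket/hyperpod-lifecycle/",
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodExecutionRole",
        }
    ],
)
```

When a hardware fault takes a node down, HyperPod replaces it and resumes the training job from the last checkpoint, which is the disruption-minimizing behavior the post focuses on.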

Pixtral Large is now available in HAQM Bedrock

In this post, we demonstrate how to get started with the Pixtral Large model in HAQM Bedrock. The Pixtral Large multimodal model allows you to tackle a variety of use cases, such as document understanding, logical reasoning, handwriting recognition, image comparison, entity extraction, structured data extraction from scanned images, and caption generation.
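As a starting point, the following minimal sketch calls the model through the Bedrock Converse API with boto3 to pull fields out of a scanned document. The region, input file, and model ID are assumptions; use the Pixtral Large model ID or inference profile enabled in your account.

```python
import boto3

# Bedrock Runtime client; the region is an assumption.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

# Hypothetical input: a scanned invoice to extract structured data from.
with open("invoice.png", "rb") as f:
    image_bytes = f.read()

response = client.converse(
    # Model ID is an assumption -- check the Bedrock console for the
    # Pixtral Large ID available to your account.
    modelId="us.mistral.pixtral-large-2502-v1:0",
    messages=[{
        "role": "user",
        "content": [
            {"text": "Extract the vendor name, date, and total from this invoice."},
            {"image": {"format": "png", "source": {"bytes": image_bytes}}},
        ],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```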

Revolutionizing customer service: MaestroQA’s integration with HAQM Bedrock for actionable insight

In this post, we dive deeper into one of MaestroQA's key features, conversation analytics, which helps support teams uncover customer concerns, address points of friction, adapt support workflows, and identify areas for coaching through the use of HAQM Bedrock. We discuss the unique challenges MaestroQA overcame and how they use AWS to build new features, drive customer insights, and improve operational efficiency.

Deploy DeepSeek-R1 distilled models on HAQM SageMaker using a Large Model Inference container

Deploying DeepSeek models on SageMaker AI provides a robust solution for organizations seeking to use state-of-the-art language models in their applications. In this post, we show how to deploy the distilled versions of the R1 model on SageMaker AI, which offers several options for hosting them.
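One of those options is hosting the model with the Large Model Inference (LMI) container. The sketch below, assuming a recent SageMaker Python SDK, deploys a distilled R1 checkpoint pulled from Hugging Face; the container image tag, instance type, and configuration values are assumptions to adapt for your region and model size.

```python
import sagemaker
from sagemaker.model import Model
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

session = sagemaker.Session()
role = sagemaker.get_execution_role()

# LMI container image; the tag is an assumption -- look up the current
# URI for your region in the AWS Deep Learning Containers list.
image_uri = "763104351884.dkr.ecr.us-east-1.amazonaws.com/djl-inference:0.31.0-lmi13.0.0-cu124"

model = Model(
    image_uri=image_uri,
    role=role,
    env={
        # The LMI container downloads the weights directly from Hugging Face.
        "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
        "OPTION_MAX_MODEL_LEN": "8192",
    },
    sagemaker_session=session,
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)

print(predictor.predict({
    "inputs": "Explain chain-of-thought prompting in one paragraph.",
    "parameters": {"max_new_tokens": 256},
}))
```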

Build a Multi-Agent System with LangGraph and Mistral on AWS

In this post, we explore how to use LangGraph and Mistral models on HAQM Bedrock to create a powerful multi-agent system that can handle sophisticated workflows. This integration enables AI agents to work together on complex problems, mimicking humanlike reasoning and collaboration.
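A minimal sketch of the pattern, assuming the langgraph and langchain-aws packages and a hypothetical two-agent research-then-write workflow (the Mistral model ID is also an assumption; use one enabled in your Bedrock account):

```python
from typing import TypedDict

from langchain_aws import ChatBedrockConverse
from langgraph.graph import StateGraph, START, END

# Mistral model served through HAQM Bedrock; the ID is an assumption.
llm = ChatBedrockConverse(model="mistral.mistral-large-2407-v1:0")

class State(TypedDict):
    question: str
    research: str
    answer: str

def researcher(state: State) -> dict:
    # First agent: gather the facts needed to answer the question.
    msg = llm.invoke(f"List key facts needed to answer: {state['question']}")
    return {"research": msg.content}

def writer(state: State) -> dict:
    # Second agent: compose a final answer from the researcher's notes.
    msg = llm.invoke(
        f"Using these notes:\n{state['research']}\n"
        f"Answer the question: {state['question']}"
    )
    return {"answer": msg.content}

# Wire the agents into a simple sequential graph.
graph = StateGraph(State)
graph.add_node("researcher", researcher)
graph.add_node("writer", writer)
graph.add_edge(START, "researcher")
graph.add_edge("researcher", "writer")
graph.add_edge("writer", END)
app = graph.compile()

result = app.invoke({"question": "What is speculative decoding?"})
print(result["answer"])
```

More sophisticated workflows add conditional edges so an agent can route the state to different collaborators based on intermediate results.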

Achieve ~2x speed-up in LLM inference with Medusa-1 on HAQM SageMaker AI

Researchers developed Medusa, a framework that speeds up LLM inference by adding extra decoding heads to predict multiple tokens simultaneously. This post demonstrates how to use Medusa-1, the first version of the framework, to accelerate an LLM by fine-tuning it on HAQM SageMaker AI, and confirms the speedup with a deployment and a simple load test. Medusa-1 achieves an inference speedup of around 2x without sacrificing model quality, with the exact improvement varying by model size and the data used; on our sample dataset, we observed a 1.8x speedup.
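The load test can be approximated with a short script like the one below, which compares median invocation latency between a baseline endpoint and a Medusa-1 fine-tuned endpoint; the endpoint names and payload format are hypothetical placeholders for your own deployments.

```python
import json
import statistics
import time

import boto3

# Hypothetical endpoint names for the baseline and Medusa-1 models.
ENDPOINTS = ["baseline-llm-endpoint", "medusa1-llm-endpoint"]

payload = json.dumps({
    "inputs": "Summarize the plot of a detective novel in three sentences.",
    "parameters": {"max_new_tokens": 256},
})

runtime = boto3.client("sagemaker-runtime")

for name in ENDPOINTS:
    latencies = []
    for _ in range(20):  # small sequential load test
        start = time.perf_counter()
        runtime.invoke_endpoint(
            EndpointName=name,
            ContentType="application/json",
            Body=payload,
        )
        latencies.append(time.perf_counter() - start)
    print(f"{name}: median latency {statistics.median(latencies):.2f}s")
```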