AWS Machine Learning Blog
Category: Foundation models
AWS Field Experience reduced cost and delivered low latency and high performance with Amazon Nova Lite foundation model
The AFX team’s migration to the Amazon Nova Lite model has delivered tangible enterprise value by enhancing sales workflows. The move not only achieved significant cost savings and reduced latency, but also empowered sellers with an intelligent, reliable solution.
Reduce ML training costs with Amazon SageMaker HyperPod
In this post, we explore the challenges of large-scale frontier model training, focusing on hardware failures and the benefits of Amazon SageMaker HyperPod, a solution that minimizes disruptions, enhances efficiency, and reduces training costs.
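For readers who want a concrete starting point, here is a minimal boto3 sketch of creating a HyperPod cluster; the cluster name, instance count, S3 lifecycle-script location, and IAM role below are hypothetical placeholders, not values from the post.

```python
# A minimal sketch of creating a SageMaker HyperPod cluster with boto3.
# ClusterName, SourceS3Uri, and ExecutionRole are hypothetical placeholders.
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

response = sm.create_cluster(
    ClusterName="frontier-training-cluster",  # hypothetical name
    InstanceGroups=[
        {
            "InstanceGroupName": "worker-group",
            "InstanceType": "ml.p5.48xlarge",  # GPU training nodes
            "InstanceCount": 16,
            "LifeCycleConfig": {
                # Lifecycle scripts (e.g., Slurm setup) staged in S3
                "SourceS3Uri": "s3://my-bucket/hyperpod-lifecycle/",
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodExecutionRole",
        }
    ],
)
print(response["ClusterArn"])
```

Once the cluster is up, HyperPod's resiliency features (such as automatic node replacement and job auto-resume) are what minimize the disruptions the post discusses.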
Pixtral Large is now available in Amazon Bedrock
In this post, we demonstrate how to get started with the Pixtral Large model in Amazon Bedrock. The Pixtral Large multimodal model allows you to tackle a variety of use cases, such as document understanding, logical reasoning, handwriting recognition, image comparison, entity extraction, extracting structured data from scanned images, and caption generation.
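As a taste of what the post covers, here is a minimal sketch of calling Pixtral Large through the Bedrock Converse API for a document-extraction prompt; the model ID and image file are assumptions, so check the Amazon Bedrock console for the exact ID available in your Region.

```python
# A minimal sketch of invoking Pixtral Large via the Bedrock Converse API.
# The model ID and the image file are assumptions, not values from the post.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

with open("invoice.png", "rb") as f:  # hypothetical scanned document
    image_bytes = f.read()

response = bedrock.converse(
    modelId="mistral.pixtral-large-2502-v1:0",  # assumed model ID; verify in your Region
    messages=[{
        "role": "user",
        "content": [
            {"text": "Extract the invoice number, date, and total as JSON."},
            {"image": {"format": "png", "source": {"bytes": image_bytes}}},
        ],
    }],
    inferenceConfig={"maxTokens": 512, "temperature": 0.1},
)
print(response["output"]["message"]["content"][0]["text"])
```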
Revolutionizing customer service: MaestroQA’s integration with Amazon Bedrock for actionable insight
In this post, we dive deeper into one of MaestroQA’s key features: conversation analytics, which helps support teams uncover customer concerns, address points of friction, adapt support workflows, and identify areas for coaching through the use of Amazon Bedrock. We discuss the unique challenges MaestroQA overcame and how they use AWS to build new features, drive customer insights, and improve operational efficiency.
Optimize hosting DeepSeek-R1 distilled models with Hugging Face TGI on Amazon SageMaker AI
In this post, we demonstrate how to optimize hosting DeepSeek-R1 distilled models with Hugging Face Text Generation Inference (TGI) on Amazon SageMaker AI.
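A minimal deployment sketch with the SageMaker Python SDK follows; the TGI container version, instance type, and token limits are illustrative assumptions rather than tuned recommendations from the post.

```python
# A minimal sketch of hosting a DeepSeek-R1 distilled model with Hugging Face
# TGI on SageMaker. Container version, instance type, and limits are assumptions.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes a SageMaker execution context

model = HuggingFaceModel(
    role=role,
    # Assumed TGI container version; pin to a current release
    image_uri=get_huggingface_llm_image_uri("huggingface", version="2.3.1"),
    env={
        "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
        "SM_NUM_GPUS": "1",            # TGI shards across this many GPUs
        "MAX_INPUT_LENGTH": "4096",    # illustrative limits
        "MAX_TOTAL_TOKENS": "8192",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)
print(predictor.predict({"inputs": "Explain speculative decoding briefly."}))
```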
Benchmarking Amazon Nova and GPT-4o models with FloTorch
A recent evaluation conducted by FloTorch compared the performance of Amazon Nova models with OpenAI’s GPT-4o. In this post, we discuss the findings from this benchmarking in more detail.
Deploy DeepSeek-R1 distilled models on Amazon SageMaker using a Large Model Inference container
Deploying DeepSeek models on SageMaker AI provides a robust solution for organizations seeking to use state-of-the-art language models in their applications. In this post, we show how to deploy the distilled versions of the R1 model on SageMaker AI, which offers several options for hosting them.
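The sketch below shows the general shape of a Large Model Inference (LMI) deployment with the SageMaker Python SDK; the container image tag is a placeholder (look up the current LMI image for your Region), and the OPTION_* settings are illustrative assumptions.

```python
# A minimal sketch of deploying a DeepSeek-R1 distilled model with the LMI
# (djl-inference) container on SageMaker. The image tag and OPTION_* values
# are illustrative assumptions; verify them against current LMI releases.
import sagemaker
from sagemaker import Model

role = sagemaker.get_execution_role()  # assumes a SageMaker execution context

model = Model(
    role=role,
    # Placeholder tag: substitute the region-specific LMI image URI
    image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/djl-inference:0.31.0-lmi13.0.0-cu124",
    env={
        "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
        "OPTION_ROLLING_BATCH": "vllm",        # LMI's continuous-batching backend
        "OPTION_MAX_MODEL_LEN": "8192",
        "OPTION_TENSOR_PARALLEL_DEGREE": "1",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g6.2xlarge",
)
```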
Build a Multi-Agent System with LangGraph and Mistral on AWS
In this post, we explore how to use LangGraph and Mistral models on Amazon Bedrock to create a powerful multi-agent system that can handle sophisticated workflows through collaborative problem-solving. This integration enables the creation of AI agents that can work together to solve complex problems, mimicking humanlike reasoning and collaboration.
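To make the pattern concrete, here is a minimal two-agent LangGraph sketch backed by a Mistral model on Amazon Bedrock; the model ID and the researcher/writer roles are illustrative assumptions, not the post's exact agents.

```python
# A minimal two-agent LangGraph sketch using a Mistral model on Bedrock.
# The model ID and agent roles are illustrative assumptions.
from typing import TypedDict
from langchain_aws import ChatBedrock
from langgraph.graph import StateGraph, START, END

llm = ChatBedrock(model_id="mistral.mistral-large-2407-v1:0")  # assumed ID

class State(TypedDict):
    question: str
    research: str
    answer: str

def researcher(state: State) -> dict:
    # First agent: gather the key facts for the question.
    msg = llm.invoke(f"List the key facts needed to answer: {state['question']}")
    return {"research": msg.content}

def writer(state: State) -> dict:
    # Second agent: compose a final answer from the researcher's notes.
    msg = llm.invoke(
        f"Using these notes:\n{state['research']}\n"
        f"Write a concise answer to: {state['question']}"
    )
    return {"answer": msg.content}

graph = StateGraph(State)
graph.add_node("researcher", researcher)
graph.add_node("writer", writer)
graph.add_edge(START, "researcher")
graph.add_edge("researcher", "writer")
graph.add_edge("writer", END)

app = graph.compile()
print(app.invoke({"question": "What is speculative decoding?"})["answer"])
```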
Pixtral-12B-2409 is now available on Amazon Bedrock Marketplace
In this post, we walk through how to discover, deploy, and use the Mistral AI Pixtral 12B model for a variety of real-world vision use cases.
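Bedrock Marketplace deployments are backed by SageMaker endpoints that Bedrock addresses by ARN, so a deployed Pixtral 12B endpoint can be invoked much like a first-party Bedrock model; the endpoint ARN below is a hypothetical placeholder for one from your own deployment.

```python
# A minimal sketch of calling a Pixtral 12B endpoint deployed from the
# Amazon Bedrock Marketplace. The endpoint ARN is a hypothetical placeholder.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

endpoint_arn = (
    "arn:aws:sagemaker:us-west-2:123456789012:endpoint/pixtral-12b-endpoint"
)

response = bedrock.converse(
    modelId=endpoint_arn,  # Marketplace endpoints are referenced by ARN
    messages=[{
        "role": "user",
        "content": [{"text": "Describe what a vision-language model can do."}],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```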
Achieve ~2x speed-up in LLM inference with Medusa-1 on Amazon SageMaker AI
Researchers developed Medusa, a framework that speeds up LLM inference by adding extra heads that predict multiple tokens simultaneously. This post demonstrates how to use Medusa-1, the first version of the framework, to speed up an LLM by fine-tuning it on Amazon SageMaker AI, and confirms the speedup with deployment and a simple load test. Medusa-1 achieves roughly a 2x inference speedup without sacrificing model quality, with the exact improvement varying by model size and data; on our sample dataset, we observed a 1.8x speedup.
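To illustrate the idea of extra decoding heads, here is a toy PyTorch sketch; it is a conceptual illustration of the Medusa mechanism, not the Medusa-1 implementation or the post's fine-tuning code.

```python
# A toy sketch of the Medusa idea: K extra heads on top of the base model's
# last hidden state, each predicting a token further ahead than the base
# next-token head, so several draft tokens are proposed per forward pass.
# Conceptual illustration only, not the Medusa-1 implementation.
import torch
import torch.nn as nn

class MedusaHeads(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int, num_heads: int = 4):
        super().__init__()
        # One small MLP + vocabulary projection per lookahead position.
        self.heads = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, hidden_size),
                nn.SiLU(),
                nn.Linear(hidden_size, vocab_size),
            )
            for _ in range(num_heads)
        )

    def forward(self, last_hidden: torch.Tensor) -> torch.Tensor:
        # last_hidden: (batch, hidden) -> logits: (batch, num_heads, vocab)
        return torch.stack([head(last_hidden) for head in self.heads], dim=1)

# Each head's greedy argmax becomes a draft token; the base model then
# verifies the drafts in one pass, keeping the longest accepted prefix.
heads = MedusaHeads(hidden_size=2048, vocab_size=32000)
draft_logits = heads(torch.randn(1, 2048))
draft_tokens = draft_logits.argmax(dim=-1)  # (1, 4) proposed tokens
print(draft_tokens.shape)
```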