AWS Machine Learning Blog

Category: HAQM SageMaker

Build an AI-powered document processing platform with open source NER model and LLM on HAQM SageMaker

In this post, we discuss how you can build an AI-powered document processing platform with an open source NER model and LLMs on HAQM SageMaker.
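
As a rough illustration of the NER building block such a platform might use, here is a minimal sketch that runs an open source token-classification model with the Hugging Face transformers pipeline; the model ID is a common open source choice, not necessarily the one used in the post.

```python
# Minimal NER sketch using the Hugging Face transformers pipeline.
# The model ID (dslim/bert-base-NER) is an assumption; the post may
# use a different open source NER model.
from transformers import pipeline

ner = pipeline(
    "ner",
    model="dslim/bert-base-NER",
    aggregation_strategy="simple",  # merge sub-word tokens into entity spans
)

text = "Acme Corp signed a lease in Seattle on March 3, 2024."
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```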

Supercharge your LLM performance with HAQM SageMaker Large Model Inference container v15

Today, we’re excited to announce the launch of HAQM SageMaker Large Model Inference (LMI) container v15, powered by vLLM 0.8.4 with support for the vLLM V1 engine. This release introduces significant performance improvements, expands model compatibility to multimodal workloads (text-to-text, image-to-text, and text-to-image), and builds in vLLM integration so you can seamlessly deploy and serve large language models (LLMs) at scale with high performance.
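
To make the deployment flow concrete, here is a minimal sketch of serving an LLM with an LMI container through the SageMaker Python SDK; the image URI, model ID, instance type, and environment options are illustrative assumptions, not values from the announcement.

```python
# Hedged sketch: deploying an LLM with a SageMaker LMI container.
# The image URI, model ID, and instance type are assumptions; look up
# the LMI v15 image for your AWS Region in the official documentation.
import sagemaker
from sagemaker.model import Model

role = sagemaker.get_execution_role()  # assumes a SageMaker execution role

image_uri = "<LMI v15 container image URI for your Region>"

model = Model(
    image_uri=image_uri,
    role=role,
    env={
        # The LMI container pulls and serves this model with vLLM.
        "HF_MODEL_ID": "meta-llama/Llama-3.1-8B-Instruct",  # illustrative
        "OPTION_MAX_MODEL_LEN": "8192",                     # illustrative
    },
)
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",  # illustrative GPU instance
)
print(predictor.endpoint_name)
```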

How Salesforce achieves high-performance model deployment with HAQM SageMaker AI

This post is a joint collaboration between Salesforce and AWS and is being cross-published on both the Salesforce Engineering Blog and the AWS Machine Learning Blog. The Salesforce AI Model Serving team is working to push the boundaries of natural language processing and AI capabilities for enterprise applications. Their key focus areas include optimizing large […]

Optimizing Mixtral 8x7B on HAQM SageMaker with AWS Inferentia2

This post demonstrates how to deploy and serve the Mixtral 8x7B language model on AWS Inferentia2 instances for cost-effective, high-performance inference. We walk through model compilation with Hugging Face Optimum Neuron, which provides tools for straightforward model loading, training, and inference, and deployment with the Text Generation Inference (TGI) container, Hugging Face’s toolkit for deploying and serving LLMs.
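
For flavor, compiling a checkpoint with Optimum Neuron can look roughly like the sketch below; the batch size, sequence length, and core count are illustrative assumptions that depend on the Inferentia2 instance you target.

```python
# Hedged sketch: compiling a model for Inferentia2 with Optimum Neuron.
# All compilation parameters are illustrative; real values depend on the
# instance size and workload.
from optimum.neuron import NeuronModelForCausalLM

model = NeuronModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    export=True,            # compile the checkpoint to a Neuron artifact
    batch_size=4,           # illustrative input shape
    sequence_length=4096,   # illustrative input shape
    num_cores=24,           # illustrative NeuronCore count
    auto_cast_type="bf16",  # cast weights to bfloat16 for inference
)
model.save_pretrained("mixtral-8x7b-neuron")  # reusable compiled artifact
```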

Reduce ML training costs with HAQM SageMaker HyperPod

In this post, we explore the challenges of large-scale frontier model training, focusing on hardware failures and the benefits of HAQM SageMaker HyperPod, a solution that minimizes disruptions, enhances efficiency, and reduces training costs.

Model customization, RAG, or both: A case study with HAQM Nova

The introduction of HAQM Nova models represents a significant advancement in the field of AI, offering new opportunities for large language model (LLM) optimization. In this post, we demonstrate how to effectively perform model customization and RAG with HAQM Nova models as a baseline. We conducted a comprehensive comparison of model customization and RAG using the latest HAQM Nova models, and we share the resulting insights.
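
As a reminder of what the RAG side of such a comparison involves, here is a deliberately minimal, self-contained sketch of the retrieve-then-generate pattern; the toy similarity function stands in for real embeddings, and the final prompt would go to an LLM endpoint such as an HAQM Nova model.

```python
# Minimal sketch of the RAG pattern: retrieve the most relevant document,
# then prepend it to the prompt. The bag-of-words similarity is a toy
# stand-in for embeddings and a vector store.
from collections import Counter
import math

docs = [
    "SageMaker HyperPod reduces large-scale training disruptions.",
    "HAQM Nova models support both model customization and RAG.",
]

def similarity(a: str, b: str) -> float:
    """Toy cosine similarity over word counts."""
    x, y = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(x[w] * y[w] for w in x)
    norm = math.sqrt(sum(v * v for v in x.values())) * math.sqrt(
        sum(v * v for v in y.values())
    )
    return dot / norm if norm else 0.0

query = "How do HAQM Nova models relate to RAG?"
context = max(docs, key=lambda d: similarity(query, d))
prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # in a real system, this prompt is sent to the LLM
```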

Llama 4 family of models from Meta are now available in SageMaker JumpStart

Today, we’re excited to announce the availability of Llama 4 Scout and Maverick models in HAQM SageMaker JumpStart. In this blog post, we walk you through how to deploy and prompt a Llama-4-Scout-17B-16E-Instruct model using SageMaker JumpStart.
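
Deploying a JumpStart model generally follows the SageMaker Python SDK pattern sketched below; the model ID is a hypothetical placeholder, so look up the exact Llama 4 Scout identifier in SageMaker JumpStart before running it.

```python
# Hedged sketch: deploying and prompting a JumpStart model. The model_id
# is a hypothetical placeholder for the Llama-4-Scout-17B-16E-Instruct
# entry; check SageMaker JumpStart for the real identifier.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-llama-4-scout-17b-16e-instruct")
predictor = model.deploy(accept_eula=True)  # Meta models require EULA acceptance

response = predictor.predict({
    "inputs": "Summarize the benefits of mixture-of-experts models.",
    "parameters": {"max_new_tokens": 256, "temperature": 0.2},
})
print(response)
```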

Advanced tracing and evaluation of generative AI agents using LangChain and HAQM SageMaker AI MLflow

In this post, I show you how to combine LangChain’s LangGraph, HAQM SageMaker AI, and MLflow into a workflow for developing, evaluating, and deploying sophisticated generative AI agents. This integration provides the tools needed to gain deep insights into a generative AI agent’s performance, iterate quickly, and maintain version control throughout the development process.
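
The tracing half of that workflow can be switched on with a few lines; a minimal sketch follows, assuming a SageMaker-managed MLflow tracking server (the ARN and experiment name are placeholders, and the sagemaker-mlflow plugin is installed alongside mlflow).

```python
# Hedged sketch: routing LangChain/LangGraph traces to a SageMaker-managed
# MLflow tracking server. The ARN and experiment name are placeholders;
# requires the mlflow and sagemaker-mlflow packages.
import mlflow

mlflow.set_tracking_uri(
    "arn:aws:sagemaker:us-east-1:111122223333:mlflow-tracking-server/my-server"
)
mlflow.set_experiment("agent-tracing-demo")

# Automatically record traces for LangChain and LangGraph invocations.
mlflow.langchain.autolog()

# ... build and invoke a LangGraph agent here; each invocation shows up
# as a trace in the MLflow UI for inspection and evaluation.
```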

Fine-tune large language models with reinforcement learning from human or AI feedback

In this post, we introduce a state-of-the-art method for fine-tuning LLMs with reinforcement learning, review the pros and cons of RLHF, RLAIF, and DPO, and show how to scale LLM fine-tuning with RLAIF. We also show how to implement an end-to-end RLAIF pipeline on SageMaker using the Hugging Face Transformers and TRL libraries, either applying off-the-shelf toxicity reward models to align responses during PPO or directly prompting an LLM to generate quantitative reward feedback during PPO.
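
The heart of such a pipeline in TRL looks roughly like the sketch below, which uses the classic PPOTrainer API from earlier TRL releases (newer versions have reorganized this interface); the model name, hyperparameters, and fixed reward are illustrative assumptions.

```python
# Hedged sketch of a single PPO step with TRL's classic API (pre-0.12);
# model name, hyperparameters, and the reward value are illustrative.
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model_name = "gpt2"  # small stand-in for the LLM being aligned
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ppo_trainer = PPOTrainer(
    config=PPOConfig(batch_size=1, mini_batch_size=1),
    model=model,
    tokenizer=tokenizer,
)

query = tokenizer("The product review was", return_tensors="pt").input_ids[0]
response = ppo_trainer.generate(
    query, return_prompt=False, max_new_tokens=16,
    pad_token_id=tokenizer.eos_token_id,
)[0]

# In practice the reward comes from a toxicity reward model or an LLM
# judge; here it is a fixed placeholder scalar.
reward = torch.tensor(1.0)
stats = ppo_trainer.step([query], [response], [reward])
```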

How Lumi streamlines loan approvals with HAQM SageMaker AI

Lumi is a leading Australian fintech lender empowering small businesses with fast, flexible, and transparent funding solutions. They use real-time data and machine learning (ML) to offer customized loans that fuel sustainable growth and solve the challenges of accessing capital. This post explores how Lumi uses HAQM SageMaker AI to meet this goal, enhance their transaction processing and classification capabilities, and ultimately grow their business through faster processing of loan applications, more accurate credit decisions, and an improved customer experience.