AWS Machine Learning Blog
Category: HAQM SageMaker AI
Build an AI-powered document processing platform with an open source NER model and LLM on HAQM SageMaker
In this post, we discuss how you can build an AI-powered document processing platform with an open source NER model and LLMs on SageMaker.
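As a rough sketch of the two-stage pattern this post describes, the snippet below chains two SageMaker endpoints with boto3: an open source NER model extracts entities, and an LLM then reasons over the document together with those entities. The endpoint names, payload shapes, and prompt are illustrative assumptions, not the post's actual implementation.

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def extract_entities(text: str) -> list:
    # Stage 1: an open source NER model tags entities in the raw document text.
    resp = runtime.invoke_endpoint(
        EndpointName="open-source-ner-endpoint",  # placeholder endpoint name
        ContentType="application/json",
        Body=json.dumps({"inputs": text}),
    )
    return json.loads(resp["Body"].read())

def summarize_with_llm(text: str, entities: list) -> str:
    # Stage 2: an LLM reasons over the document plus the extracted entities.
    prompt = f"Document:\n{text}\n\nEntities: {entities}\n\nSummarize the key facts."
    resp = runtime.invoke_endpoint(
        EndpointName="llm-endpoint",  # placeholder endpoint name
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": 256}}),
    )
    return json.loads(resp["Body"].read())

document = "Acme Corp signed a lease with Jane Doe on 2024-03-01."
print(summarize_with_llm(document, extract_entities(document)))
```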
How Salesforce achieves high-performance model deployment with HAQM SageMaker AI
This post is a joint collaboration between Salesforce and AWS and is being cross-published on both the Salesforce Engineering Blog and the AWS Machine Learning Blog. The Salesforce AI Model Serving team is working to push the boundaries of natural language processing and AI capabilities for enterprise applications. Their key focus areas include optimizing large […]
Reduce ML training costs with HAQM SageMaker HyperPod
In this post, we explore the challenges of large-scale frontier model training, focusing on hardware failures and the benefits of HAQM SageMaker HyperPod – a solution that minimizes disruptions, enhances efficiency, and reduces training costs.
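For context on what provisioning such a cluster looks like, here is a minimal sketch using the SageMaker CreateCluster API. The cluster name, instance counts, role ARN, and lifecycle-script S3 location are placeholders; HyperPod's resiliency features (instance health checks and automatic node replacement) come with the managed cluster itself rather than anything in this request.

```python
import boto3

sm = boto3.client("sagemaker")

sm.create_cluster(
    ClusterName="frontier-training-cluster",  # placeholder name
    InstanceGroups=[
        {
            "InstanceGroupName": "worker-group",
            "InstanceType": "ml.p5.48xlarge",
            "InstanceCount": 16,
            "LifeCycleConfig": {
                # Scripts that bootstrap each node (install Slurm, mount storage, etc.)
                "SourceS3Uri": "s3://my-bucket/hyperpod/lifecycle/",
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": "arn:aws:iam::111122223333:role/HyperPodExecutionRole",
        }
    ],
)
```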
Advanced tracing and evaluation of generative AI agents using LangChain and HAQM SageMaker AI MLflow
In this post, I show you how to combine LangChain’s LangGraph, HAQM SageMaker AI, and MLflow to demonstrate a powerful workflow for developing, evaluating, and deploying sophisticated generative AI agents. This integration provides the tools needed to gain deep insights into the generative AI agent’s performance, iterate quickly, and maintain version control throughout the development process.
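A minimal sketch of the wiring involved, assuming a SageMaker AI managed MLflow tracking server (the ARN below is a placeholder) and a trivial one-node LangGraph agent: `mlflow.langchain.autolog()` captures LangChain and LangGraph traces, so each `invoke` call is recorded for inspection in MLflow.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END
import mlflow

# Placeholder ARN for a SageMaker AI managed MLflow tracking server.
mlflow.set_tracking_uri(
    "arn:aws:sagemaker:us-east-1:111122223333:mlflow-tracking-server/my-server"
)
mlflow.set_experiment("agent-tracing-demo")
mlflow.langchain.autolog()  # auto-capture LangChain/LangGraph traces

class State(TypedDict):
    question: str
    answer: str

def answer_node(state: State) -> State:
    # Stand-in for an LLM call; replace with your SageMaker model invocation.
    return {"question": state["question"], "answer": "42"}

graph = StateGraph(State)
graph.add_node("answer", answer_node)
graph.set_entry_point("answer")
graph.add_edge("answer", END)
app = graph.compile()

with mlflow.start_run():
    print(app.invoke({"question": "What is the meaning of life?"}))
```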
How Lumi streamlines loan approvals with HAQM SageMaker AI
Lumi is a leading Australian fintech lender empowering small businesses with fast, flexible, and transparent funding solutions. They use real-time data and machine learning (ML) to offer customized loans that fuel sustainable growth and solve the challenges of accessing capital. This post explores how Lumi uses HAQM SageMaker AI to meet this goal, enhance their transaction processing and classification capabilities, and ultimately grow their business by providing faster processing of loan applications, more accurate credit decisions, and improved customer experience.
Enhance deployment guardrails with inference component rolling updates for HAQM SageMaker AI inference
In this post, we discuss the challenges faced by organizations when updating models in production. Then we deep dive into the new rolling update feature for inference components and provide practical examples using DeepSeek distilled models to demonstrate this feature. Finally, we explore how to set up rolling updates in different scenarios.
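For a sense of what configuring such an update looks like, below is a hedged sketch of an UpdateInferenceComponent call with a rolling-update policy. The component and model names, alarm, and sizing values are hypothetical, and the exact request shape should be verified against the current SageMaker API documentation.

```python
import boto3

sm = boto3.client("sagemaker")

sm.update_inference_component(
    InferenceComponentName="deepseek-r1-distill-ic",  # placeholder name
    Specification={
        "ModelName": "deepseek-r1-distill-v2",  # the new model version
        "ComputeResourceRequirements": {
            "NumberOfAcceleratorDevicesRequired": 1,
            "MinMemoryRequiredInMb": 24576,
        },
    },
    DeploymentConfig={
        "RollingUpdatePolicy": {
            # Shift traffic one copy at a time, waiting 5 minutes between batches.
            "MaximumBatchSize": {"Type": "COPY_COUNT", "Value": 1},
            "WaitIntervalInSeconds": 300,
            # Roll back in larger batches if an alarm fires.
            "RollbackMaximumBatchSize": {"Type": "COPY_COUNT", "Value": 2},
        },
        "AutoRollbackConfiguration": {
            "Alarms": [{"AlarmName": "ic-high-4xx-errors"}]  # placeholder alarm
        },
    },
)
```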
Unleash AI innovation with HAQM SageMaker HyperPod
In this post, we show how SageMaker HyperPod, together with the new features introduced at AWS re:Invent 2024, is designed to meet the demands of modern AI workloads, offering a persistent, optimized cluster tailored for distributed training and accelerated inference at cloud scale, with attractive price-performance.
How to run Qwen 2.5 on AWS AI chips using Hugging Face libraries
In this post, we outline how to get started with deploying the Qwen2.5 family of models on an Inferentia instance using HAQM Elastic Compute Cloud (HAQM EC2) and HAQM SageMaker, with the Hugging Face Text Generation Inference (TGI) container and the Hugging Face Optimum Neuron library. The Qwen2.5 Coder and Math variants are also supported.
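A minimal sketch of the SageMaker path, assuming a TGI Neuronx container retrieved via the SageMaker Python SDK; the role ARN is a placeholder, and the instance size, core count, and sequence lengths are illustrative values that must match how the model is compiled for Neuron.

```python
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

# Retrieve the TGI container image built for AWS Neuron (Inferentia/Trainium).
image_uri = get_huggingface_llm_image_uri("huggingface-neuronx")

model = HuggingFaceModel(
    image_uri=image_uri,
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder
    env={
        "HF_MODEL_ID": "Qwen/Qwen2.5-7B-Instruct",
        "HF_NUM_CORES": "2",          # NeuronCores used for tensor parallelism
        "HF_AUTO_CAST_TYPE": "bf16",
        "MAX_BATCH_SIZE": "4",
        "MAX_INPUT_TOKENS": "2048",
        "MAX_TOTAL_TOKENS": "4096",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf2.8xlarge",
    container_startup_health_check_timeout=1800,  # allow time for model load
)
print(predictor.predict({"inputs": "Write a haiku about autumn."}))
```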
Optimize hosting DeepSeek-R1 distilled models with Hugging Face TGI on HAQM SageMaker AI
In this post, we demonstrate how to optimize hosting DeepSeek-R1 distilled models with Hugging Face Text Generation Inference (TGI) on HAQM SageMaker AI.
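As a hedged sketch of the hosting pattern, the snippet below deploys a DeepSeek-R1 distilled model with the GPU TGI container through the SageMaker Python SDK. The role ARN is a placeholder, and the instance type and token limits are illustrative assumptions rather than the post's tuned configuration.

```python
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface"),  # GPU TGI image
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder
    env={
        "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
        "SM_NUM_GPUS": "1",              # tensor parallel degree
        "MAX_INPUT_TOKENS": "4096",
        "MAX_TOTAL_TOKENS": "8192",
        "MAX_BATCH_PREFILL_TOKENS": "8192",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=900,
)
```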
Deploy DeepSeek-R1 distilled models on HAQM SageMaker using a Large Model Inference container
Deploying DeepSeek models on SageMaker AI provides a robust solution for organizations seeking to use state-of-the-art language models in their applications. In this post, we show how to deploy the distilled versions of the R1 model using the several hosting options that SageMaker AI offers.
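A minimal sketch of the Large Model Inference (LMI) container path, under stated assumptions: the image URI below is a placeholder (look up the current LMI image for your region), the role ARN is hypothetical, and the environment variables configure the container's vLLM rolling-batch backend in the usual LMI style.

```python
import sagemaker
from sagemaker import Model

session = sagemaker.Session()

model = Model(
    # Placeholder LMI (DJL Serving) image URI; substitute the current one
    # for your region and LMI release.
    image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/djl-inference:0.31.0-lmi13.0.0-cu124",
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder
    env={
        "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
        "OPTION_ROLLING_BATCH": "vllm",        # continuous batching backend
        "OPTION_TENSOR_PARALLEL_DEGREE": "1",
        "OPTION_MAX_ROLLING_BATCH_SIZE": "16",
    },
    sagemaker_session=session,
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g6.2xlarge",
    container_startup_health_check_timeout=900,
)
```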