AWS Machine Learning Blog

Category: HAQM SageMaker

AWS Inferentia and AWS Trainium deliver lowest cost to deploy Llama 3 models in HAQM SageMaker JumpStart

Today, we’re excited to announce the availability of Meta Llama 3 inference on AWS Trainium and AWS Inferentia-based instances in HAQM SageMaker JumpStart. The Meta Llama 3 models are a collection of pre-trained and fine-tuned generative text models. HAQM Elastic Compute Cloud (HAQM EC2) Trn1 and Inf2 instances, powered by AWS Trainium and AWS […]

Revolutionize Customer Satisfaction with tailored reward models for your business on HAQM SageMaker

As more powerful large language models (LLMs) are used to perform a variety of tasks with greater accuracy, the number of applications and services that are being built with generative artificial intelligence (AI) is also growing. With great power comes responsibility, and organizations want to make sure that these LLMs produce responses that align with […]

Simple guide to training Llama 2 with AWS Trainium on HAQM SageMaker

Large language models (LLMs) are making a significant impact in the realm of artificial intelligence (AI). Their impressive generative abilities have led to widespread adoption across various sectors and use cases, including content generation, sentiment analysis, chatbot development, and virtual assistant technology. Llama 2 by Meta is an example of an LLM available on AWS. Llama […]

Fine-tune and deploy language models with HAQM SageMaker Canvas and HAQM Bedrock

Imagine harnessing the power of advanced language models to understand and respond to your customers’ inquiries. HAQM Bedrock, a fully managed service providing access to such models, makes this possible. Fine-tuning large language models (LLMs) on domain-specific data supercharges tasks like answering product questions or generating relevant content. In this post, we show how HAQM […]

Cohere Command R and R+ are now available in HAQM SageMaker JumpStart

This blog post is co-written with Pradeep Prabhakaran from Cohere. Today, we are excited to announce that Cohere Command R and R+ foundation models are available through HAQM SageMaker JumpStart to deploy and run inference. Command R/R+ are state-of-the-art retrieval-augmented generation (RAG)-optimized models designed to tackle enterprise-grade workloads. In this post, we walk through how […]

Databricks DBRX is now available in HAQM SageMaker JumpStart

Today, we are excited to announce that the DBRX model, an open, general-purpose large language model (LLM) developed by Databricks, is available for customers through HAQM SageMaker JumpStart to deploy with one click for running inference. The DBRX LLM employs a fine-grained mixture-of-experts (MoE) architecture, pre-trained on 12 trillion tokens of carefully curated data and […]
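The heart of a mixture-of-experts layer is the router that sends each token to a small subset of experts. A toy top-k routing sketch (illustrative only; DBRX's actual fine-grained MoE, which activates a few of its 16 experts per layer, is far more involved):

```python
import math

def route_token(logits: list[float], k: int = 2) -> tuple[list[int], list[float]]:
    """Toy top-k mixture-of-experts router for a single token.

    Illustrative sketch, not DBRX's implementation. `logits` are the
    router's per-expert scores; the token is dispatched to the k
    highest-scoring experts with softmax-normalized weights.
    """
    # Pick the k experts with the highest router scores.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over only the selected experts' scores, so the
    # chosen experts' mixing weights sum to 1.
    exps = [math.exp(logits[i]) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    return top, weights
```

In a full model, each selected expert's feed-forward output is multiplied by its weight and summed; routing only k of N experts is what keeps per-token compute low while total parameter count stays high.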


Deploy a Hugging Face (PyAnnote) speaker diarization model on HAQM SageMaker as an asynchronous endpoint

Speaker diarization, an essential process in audio analysis, segments an audio file based on speaker identity. This post delves into integrating Hugging Face’s PyAnnote for speaker diarization with HAQM SageMaker asynchronous endpoints. We provide a comprehensive guide on how to deploy speaker segmentation and clustering solutions using SageMaker on the AWS Cloud.
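Asynchronous endpoints take their payload from HAQM S3 rather than the HTTP request body, which suits large audio files. A sketch of assembling the arguments for boto3's `invoke_endpoint_async` call (the endpoint name, S3 URI, and content type below are placeholders, and the diarization deployment itself is not shown):

```python
def build_async_invoke_args(endpoint_name: str, input_s3_uri: str,
                            content_type: str = "audio/wav") -> dict:
    """Assemble keyword arguments for the sagemaker-runtime
    ``invoke_endpoint_async`` API.

    Illustrative helper (not an AWS API): it only packages the request;
    the content type your container expects may differ.
    """
    return {
        "EndpointName": endpoint_name,
        # Async endpoints read the request body from this S3 location
        # instead of the HTTP request, so large audio files work.
        "InputLocation": input_s3_uri,
        "ContentType": content_type,
    }

# Usage (requires AWS credentials and a deployed async endpoint):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint_async(
#     **build_async_invoke_args("pyannote-diarization", "s3://my-bucket/call.wav"))
# The response's OutputLocation points at where results will land in S3.
```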

Evaluate the text summarization capabilities of LLMs for enhanced decision-making on AWS

Organizations across industries are using automatic text summarization to more efficiently handle vast amounts of information and make better decisions. In the financial sector, investment banks condense earnings reports down to key takeaways to rapidly analyze quarterly performance. Media companies use summarization to monitor news and social media so journalists can quickly write stories on […]
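Evaluating generated summaries usually starts with overlap metrics such as ROUGE. A minimal ROUGE-1 F1 sketch (real evaluations typically use a library such as rouge-score, several ROUGE variants, and often LLM-based or human judgment on top):

```python
from collections import Counter

def rouge1_f(summary: str, reference: str) -> float:
    """Minimal ROUGE-1 F1: unigram overlap between a candidate summary
    and a reference summary.

    Illustrative sketch: whitespace tokenization and lowercasing only,
    no stemming, single reference.
    """
    cand = Counter(summary.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped matching unigram count
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```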


Improve LLM performance with human and AI feedback on HAQM SageMaker for HAQM Engineering

The HAQM EU Design and Construction (HAQM D&C) team is the engineering team designing and constructing HAQM warehouses. The team navigates a large volume of documents and locates the right information to make sure the warehouse design meets the highest standards. In the post A generative AI-powered solution on HAQM SageMaker to help HAQM EU […]

Accelerate ML workflows with HAQM SageMaker Studio Local Mode and Docker support

We are excited to announce two new capabilities in HAQM SageMaker Studio that will accelerate iterative development for machine learning (ML) practitioners: Local Mode and Docker support. ML model development often involves slow iteration cycles as developers switch between coding, training, and deployment. Each step requires waiting for remote compute resources to start up, which […]