AWS Machine Learning Blog

Category: Amazon SageMaker

Ground truth curation and metric interpretation best practices for evaluating generative AI question answering using FMEval

In this post, we discuss best practices for working with Foundation Model Evaluations Library (FMEval) in ground truth curation and metric interpretation for evaluating question answering applications for factual knowledge and quality.

Deploy Amazon SageMaker pipelines using AWS Controllers for Kubernetes

In this post, we show how ML engineers familiar with Jupyter notebooks and SageMaker environments can efficiently work with DevOps engineers familiar with Kubernetes and related tools to design and maintain an ML pipeline with the right infrastructure for their organization. This enables DevOps engineers to manage all the steps of the ML lifecycle with the same set of tools and environment they are used to.

Effectively manage foundation models for generative AI applications with Amazon SageMaker Model Registry

In this post, we explore the new features of Model Registry that streamline foundation model (FM) management: you can now register unzipped model artifacts and pass an End User License Agreement (EULA) acceptance flag without needing users to intervene.

How Thomson Reuters Labs achieved AI/ML innovation at pace with AWS MLOps services

In this post, we show you how Thomson Reuters Labs (TR Labs) developed an efficient, flexible, and powerful MLOps process by adopting a standardized MLOps framework that uses Amazon SageMaker, SageMaker Experiments, SageMaker Model Registry, and SageMaker Pipelines. The goal is to accelerate how quickly teams can experiment and innovate using AI and machine learning (ML), whether using natural language processing (NLP), generative AI, or other techniques. We discuss how this has helped decrease the time to market for fresh ideas and helped build a cost-efficient machine learning lifecycle.

Use LangChain with PySpark to process documents at massive scale with Amazon SageMaker Studio and Amazon EMR Serverless

In this post, we explore how to build a scalable and efficient Retrieval Augmented Generation (RAG) system using the new EMR Serverless integration, Spark’s distributed processing, and an Amazon OpenSearch Service vector database powered by the LangChain orchestration framework. This solution enables you to process massive volumes of textual data, generate relevant embeddings, and store them in a powerful vector database for seamless retrieval and generation.

Best practices for prompt engineering with Meta Llama 3 for Text-to-SQL use cases

In this post, we explore a solution that uses the vector engine ChromaDB and Meta Llama 3, a publicly available foundation model hosted on SageMaker JumpStart, for a Text-to-SQL use case. We share a brief history of Meta Llama 3, best practices for prompt engineering with Meta Llama 3 models, and an architecture pattern that uses few-shot prompting and RAG to extract the relevant schemas stored as vectors in ChromaDB.

Get started with NVIDIA NIM Inference Microservices on Amazon SageMaker

Accelerate Generative AI Inference with NVIDIA NIM Microservices on Amazon SageMaker

In this post, we provide a walkthrough of how customers can use generative artificial intelligence (AI) models and LLMs through the NVIDIA NIM integration with SageMaker. We demonstrate how this integration works and how you can deploy these state-of-the-art models on SageMaker while optimizing their performance and cost.

Provide a personalized experience for news readers using Amazon Personalize and Amazon Titan Text Embeddings on Amazon Bedrock

In this post, we show how you can recommend breaking news to a user using AWS AI/ML services. By taking advantage of the power of Amazon Personalize and Amazon Titan Text Embeddings on Amazon Bedrock, you can show articles to interested users within seconds of publication.

Snowflake Arctic models are now available in Amazon SageMaker JumpStart

Today, we are excited to announce that the Snowflake Arctic Instruct model is available through Amazon SageMaker JumpStart to deploy and run inference. In this post, we walk through how to discover and deploy the Snowflake Arctic Instruct model using SageMaker JumpStart, and provide example use cases with specific prompts.