AWS Machine Learning Blog
Category: Learning Levels
Dynamic metadata filtering for HAQM Bedrock Knowledge Bases with LangChain
HAQM Bedrock Knowledge Bases has a metadata filtering capability that allows you to refine search results based on specific attributes of the documents, improving retrieval accuracy and the relevance of responses. These metadata filters can be used in combination with the typical semantic (or hybrid) similarity search. In this post, we discuss using metadata filters with HAQM Bedrock Knowledge Bases.
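As a rough illustration of the idea (not code from the post), here is a minimal sketch that attaches a metadata filter to a LangChain retriever backed by a knowledge base; the knowledge base ID and the "year" metadata key are placeholders.

```python
# Minimal sketch: metadata-filtered retrieval against an HAQM Bedrock
# knowledge base via LangChain. The knowledge base ID and the "year"
# metadata key are placeholders, not values from the post.
from langchain_aws import HAQMKnowledgeBasesRetriever

retriever = HAQMKnowledgeBasesRetriever(
    knowledge_base_id="KB12345678",  # placeholder ID
    retrieval_config={
        "vectorSearchConfiguration": {
            "numberOfResults": 4,
            # Only consider documents whose "year" metadata equals 2024;
            # the filter applies on top of the semantic similarity search.
            "filter": {"equals": {"key": "year", "value": 2024}},
        }
    },
)

docs = retriever.invoke("What changed in our returns policy?")
for doc in docs:
    print(doc.metadata, doc.page_content[:120])
```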
Customize DeepSeek-R1 distilled models using HAQM SageMaker HyperPod recipes – Part 1
In this two-part series, we discuss how you can reduce DeepSeek model customization complexity by using the pre-built fine-tuning workflows (also called “recipes”) for both the DeepSeek-R1 model and its distilled variants, released as part of HAQM SageMaker HyperPod recipes. In this first post, we build a solution architecture for fine-tuning DeepSeek-R1 distilled models and demonstrate the approach with a step-by-step example of customizing the DeepSeek-R1-Distill-Qwen-7B model using recipes, achieving an average of 25% across all ROUGE scores, with a maximum of 49% on the ROUGE-2 score, on both SageMaker HyperPod and SageMaker training jobs. The second part of the series will focus on fine-tuning the DeepSeek-R1 671B model itself.
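As a hypothetical sketch of what launching such a recipe can look like, assuming the SageMaker Python SDK's recipe integration (the `training_recipe` and `recipe_overrides` parameters of the PyTorch estimator); the recipe name, role ARN, and S3 paths are placeholders, not the post's actual configuration.

```python
# Hypothetical sketch of launching a recipe-based fine-tuning job with the
# SageMaker Python SDK; the recipe name, role, and S3 paths are placeholders.
import sagemaker
from sagemaker.pytorch import PyTorch

sess = sagemaker.Session()

estimator = PyTorch(
    base_job_name="deepseek-r1-distill-qwen-7b-sft",
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder
    instance_type="ml.p5.48xlarge",
    instance_count=1,
    sagemaker_session=sess,
    # Pre-built HyperPod recipe identifier (placeholder name).
    training_recipe="fine-tuning/deepseek/hf_deepseek_r1_distilled_qwen_7b_seq8k_gpu_fine_tuning",
    # Override individual recipe fields instead of editing the recipe YAML.
    recipe_overrides={"trainer": {"max_steps": 100}},
)

estimator.fit({
    "train": "s3://my-bucket/train/",
    "validation": "s3://my-bucket/val/",
})
```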
Streamline work insights with the HAQM Q Business connector for Smartsheet
This post explains how to integrate Smartsheet with HAQM Q Business to use natural language and generative AI capabilities for enhanced insights. Smartsheet, the AI-enhanced enterprise-grade work management platform, helps users manage projects, programs, and processes at scale.
Level up your problem-solving and strategic thinking skills with HAQM Bedrock
In this post, we show how Anthropic’s Claude 3.5 Sonnet in HAQM Bedrock can be used for a variety of business-related cognitive tasks, such as problem-solving, critical thinking, and ideation, to help augment human thinking and improve decision-making among knowledge workers, accelerating innovation.
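A minimal sketch of the basic pattern, using the HAQM Bedrock Converse API; the system prompt and task are illustrative choices, not taken from the post.

```python
# Minimal sketch: prompting Anthropic's Claude 3.5 Sonnet in HAQM Bedrock
# for a structured problem-solving task via the Converse API.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    system=[{"text": "You are a strategy analyst. Think step by step and "
                     "structure your answer as problem, options, recommendation."}],
    messages=[{
        "role": "user",
        "content": [{"text": "Our support backlog doubled this quarter. "
                             "What are three plausible root causes, and how "
                             "would you test each one?"}],
    }],
    inferenceConfig={"maxTokens": 1024, "temperature": 0.5},
)

print(response["output"]["message"]["content"][0]["text"])
```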
Evaluate healthcare generative AI applications using LLM-as-a-judge on AWS
In this post, we demonstrate how to implement an LLM-as-a-judge evaluation framework using HAQM Bedrock, compare the performance of different generator models, including Anthropic’s Claude and HAQM Nova on HAQM Bedrock, and showcase how to use the new RAG evaluation feature to optimize knowledge base parameters and assess retrieval quality.
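The post uses HAQM Bedrock's managed evaluation features; as a rough illustration of the underlying LLM-as-a-judge pattern, here is a minimal hand-rolled sketch in which a judge model scores a candidate answer against a reference. The model ID, rubric, and JSON output contract are illustrative assumptions.

```python
# Minimal hand-rolled LLM-as-a-judge sketch (not HAQM Bedrock's managed
# evaluation feature): a judge model scores a candidate answer against a
# reference on a 1-5 scale. The rubric and output format are assumptions.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

JUDGE_PROMPT = """You are grading a healthcare Q&A system.
Question: {question}
Reference answer: {reference}
Candidate answer: {candidate}

Score the candidate from 1 (wrong or unsafe) to 5 (fully correct and complete).
Respond with JSON only: {{"score": <int>, "rationale": "<one sentence>"}}"""

def judge(question: str, reference: str, candidate: str) -> dict:
    response = bedrock.converse(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
        messages=[{"role": "user", "content": [{
            "text": JUDGE_PROMPT.format(
                question=question, reference=reference, candidate=candidate)
        }]}],
        inferenceConfig={"maxTokens": 256, "temperature": 0.0},
    )
    return json.loads(response["output"]["message"]["content"][0]["text"])
```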
How to configure cross-account model deployment using HAQM Bedrock Custom Model Import
In this guide, we walk you through step-by-step instructions for configuring cross-account access for HAQM Bedrock Custom Model Import, covering both non-encrypted scenarios and scenarios encrypted with AWS Key Management Service (AWS KMS).
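A hypothetical sketch of two of the moving parts, assuming a typical cross-account layout (model artifacts in one account, the import job in another); the account IDs, role names, and bucket names below are placeholders, not the guide's actual values.

```python
# Hypothetical sketch of the cross-account pieces: a KMS key policy
# statement that lets the importing account's role decrypt artifacts in S3,
# and the Custom Model Import job call itself. All IDs and names are
# placeholders.
import json
import boto3

IMPORTING_ACCOUNT = "111122223333"  # placeholder account ID

# Statement to append to the KMS key policy in the bucket-owning account.
kms_statement = {
    "Sid": "AllowCrossAccountDecryptForModelImport",
    "Effect": "Allow",
    "Principal": {"AWS": f"arn:aws:iam::{IMPORTING_ACCOUNT}:role/BedrockImportRole"},
    "Action": ["kms:Decrypt", "kms:DescribeKey"],
    "Resource": "*",
}
print(json.dumps(kms_statement, indent=2))

# In the importing account, start the import job with a role that can read
# the cross-account bucket and use the KMS key.
bedrock = boto3.client("bedrock", region_name="us-east-1")
bedrock.create_model_import_job(
    jobName="cross-account-import",
    importedModelName="my-custom-model",
    roleArn=f"arn:aws:iam::{IMPORTING_ACCOUNT}:role/BedrockImportRole",
    modelDataSource={"s3DataSource": {"s3Uri": "s3://model-bucket-in-account-a/model/"}},
)
```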
ByteDance processes billions of daily videos using their multimodal video understanding models on AWS Inferentia2
At ByteDance, we collaborated with HAQM Web Services (AWS) to deploy multimodal large language models (LLMs) for video understanding using AWS Inferentia2 across multiple AWS Regions around the world. By using sophisticated ML algorithms, the platform efficiently scans billions of videos each day. In this post, we discuss the use of multimodal LLMs for video understanding, the solution architecture, and techniques for performance optimization.
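For readers unfamiliar with Inferentia2, here is an illustrative sketch (not ByteDance's production code) of the basic deployment step: compiling a PyTorch model ahead of time with torch-neuronx; a ResNet-50 stands in for the video encoder.

```python
# Illustrative sketch: compiling a vision model for AWS Inferentia2 with
# torch-neuronx ahead-of-time tracing. A ResNet-50 stands in for the
# multimodal video encoder; this is not the production pipeline.
import torch
import torch_neuronx
from torchvision.models import resnet50

model = resnet50().eval()
example = torch.rand(1, 3, 224, 224)

# trace() compiles the model into a Neuron executable for Inferentia2.
neuron_model = torch_neuronx.trace(model, example)
neuron_model.save("resnet50_neuron.pt")

# Reload and run on an inf2 instance like any TorchScript module.
loaded = torch.jit.load("resnet50_neuron.pt")
out = loaded(example)
```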
How Rocket Companies modernized their data science solution on AWS
In this post, we share how we modernized Rocket Companies’ data science solution on AWS to increase the speed to delivery from eight weeks to under one hour, improve operational stability and support by reducing incident tickets by over 99% in 18 months, power 10 million automated data science and AI decisions made daily, and provide a seamless data science development experience.
Reducing hallucinations in LLM agents with a verified semantic cache using HAQM Bedrock Knowledge Bases
This post introduces a solution to reduce hallucinations in large language models (LLMs) by implementing a verified semantic cache using HAQM Bedrock Knowledge Bases, which checks whether user questions match curated and verified responses before generating new answers. The solution combines the flexibility of LLMs with reliable, verified answers to improve response accuracy, reduce latency, and lower costs while preventing potential misinformation in critical domains such as healthcare, finance, and legal services.
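A minimal sketch of the cache-first check, assuming the curated Q&A pairs live in their own knowledge base; the knowledge base ID, similarity threshold, and the `verified_answer` metadata field are placeholders, not the post's implementation.

```python
# Minimal sketch of a verified semantic cache: query a knowledge base of
# curated Q&A pairs first, and only fall back to LLM generation when no
# cached entry is similar enough. IDs, threshold, and the metadata field
# are placeholders.
import boto3

agent_rt = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

CACHE_KB_ID = "KBCACHE123"   # knowledge base holding verified Q&A pairs
SCORE_THRESHOLD = 0.80       # placeholder similarity cutoff

def answer(question: str) -> str:
    hits = agent_rt.retrieve(
        knowledgeBaseId=CACHE_KB_ID,
        retrievalQuery={"text": question},
        retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 1}},
    )["retrievalResults"]

    # Serve the curated answer when the match is confident enough.
    if hits and hits[0]["score"] >= SCORE_THRESHOLD:
        return hits[0]["metadata"]["verified_answer"]  # assumed metadata field

    # Otherwise generate a fresh answer with the LLM.
    response = bedrock.converse(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
        messages=[{"role": "user", "content": [{"text": question}]}],
    )
    return response["output"]["message"]["content"][0]["text"]
```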
LLM continuous self-instruct fine-tuning framework powered by a compound AI system on HAQM SageMaker
In this post, we present the continuous self-instruct fine-tuning framework as a compound AI system implemented with the DSPy framework. The framework first generates a synthetic dataset from the domain knowledge base and documents for self-instruction, then drives model fine-tuning through supervised fine-tuning (SFT), and finally introduces a human-in-the-loop workflow to collect human and AI feedback on model responses, which is used to further improve model performance by aligning it with human preferences through reinforcement learning from human or AI feedback (RLHF/RLAIF).
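An illustrative DSPy sketch of one stage of such a pipeline, generating a synthetic instruction-response pair from a document chunk for later SFT; the model ID, signature fields, and sample passage are assumptions, not the post's code.

```python
# Illustrative DSPy sketch of the self-instruction stage: generate a
# synthetic question-answer pair from a document chunk for later SFT.
# The model ID and signature fields are assumptions.
import dspy

lm = dspy.LM("bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0")
dspy.configure(lm=lm)

class SelfInstruct(dspy.Signature):
    """Write a question a domain expert might ask about the passage, then answer it."""
    passage: str = dspy.InputField()
    question: str = dspy.OutputField()
    answer: str = dspy.OutputField()

generate = dspy.ChainOfThought(SelfInstruct)

chunk = ("HAQM SageMaker HyperPod provides resilient clusters for "
         "large-scale distributed training...")
pair = generate(passage=chunk)
print(pair.question, pair.answer, sep="\n")
```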