AWS Machine Learning Blog
Multi-tenancy in RAG applications in a single HAQM Bedrock knowledge base with metadata filtering
This post demonstrates how HAQM Bedrock Knowledge Bases can help you scale your data management effectively while maintaining proper access controls at different organizational levels.
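The mechanism behind this is metadata filtering on retrieval: each tenant's documents are tagged with metadata, and every query is constrained to that tenant's partition. The sketch below shows one way to build such a tenant-scoped retrieval request; the knowledge base ID and the `tenant_id` metadata key are illustrative assumptions, not values from the post.

```python
# Sketch: building a tenant-scoped retrieval request for a single shared
# Bedrock knowledge base. The knowledge base ID and the "tenant_id"
# metadata key are hypothetical placeholders.

def build_retrieve_request(knowledge_base_id, query, tenant_id):
    """Return kwargs for the bedrock-agent-runtime retrieve() call,
    restricting results to documents tagged with the caller's tenant."""
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {
                "numberOfResults": 5,
                # Only return chunks whose source-document metadata
                # matches this tenant, so tenants never see each
                # other's data even though they share one index.
                "filter": {"equals": {"key": "tenant_id", "value": tenant_id}},
            }
        },
    }

request = build_retrieve_request("KB123EXAMPLE", "What is our refund policy?", "tenant-a")
# With AWS credentials configured, you would then call:
# boto3.client("bedrock-agent-runtime").retrieve(**request)
```

Because the filter is applied server-side at query time, adding a tenant is just a matter of tagging documents at ingestion rather than provisioning a new knowledge base.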
Effectively use prompt caching on HAQM Bedrock
Prompt caching, now generally available on HAQM Bedrock with Anthropic’s Claude 3.5 Haiku and Claude 3.7 Sonnet, along with Nova Micro, Nova Lite, and Nova Pro models, lowers response latency by up to 85% and reduces costs by up to 90% by caching frequently used prompts across multiple API calls. This post provides a detailed overview of the prompt caching feature on HAQM Bedrock and offers guidance on how to use it effectively to achieve improved latency and cost savings.
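In practice, you opt in by inserting a cache point marker into a Converse API request so the static prefix (for example, a long system prompt) is cached and reused across calls. The following is a minimal sketch of that request shape; the model ID and prompt text are illustrative.

```python
# Sketch: marking a cache point in a Bedrock Converse request so the
# long, static system prompt can be cached and reused across calls.
# The model ID and prompt content are illustrative assumptions.

LONG_SYSTEM_PROMPT = "You are a support assistant for ExampleCo. " * 200  # static prefix

def build_converse_request(user_question):
    """Return kwargs for bedrock-runtime's converse() call with the
    system prompt marked as cacheable."""
    return {
        "modelId": "anthropic.claude-3-5-haiku-20241022-v1:0",
        "system": [
            {"text": LONG_SYSTEM_PROMPT},
            # Everything before this marker is eligible for caching;
            # repeat calls with the same prefix hit the cache.
            {"cachePoint": {"type": "default"}},
        ],
        "messages": [
            {"role": "user", "content": [{"text": user_question}]}
        ],
    }

req = build_converse_request("How do I reset my password?")
# With AWS credentials configured, you would then call:
# boto3.client("bedrock-runtime").converse(**req)
```

Only the portion of the prompt before the cache point is cached; the per-turn user message after it still varies freely, which is why static instructions and documents belong in front of the marker.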
Advanced tracing and evaluation of generative AI agents using LangChain and HAQM SageMaker AI MLflow
In this post, I show you how to combine LangChain’s LangGraph, HAQM SageMaker AI, and MLflow to demonstrate a powerful workflow for developing, evaluating, and deploying sophisticated generative AI agents. This integration provides the tools needed to gain deep insights into the generative AI agent’s performance, iterate quickly, and maintain version control throughout the development process.
Prompting for the best price-performance
In this post, we discuss how to optimize prompting in HAQM Nova for the best price-performance.
Evaluate models or RAG systems using HAQM Bedrock Evaluations – Now generally available
Today, we’re excited to announce the general availability of these evaluation features in HAQM Bedrock Evaluations, along with significant enhancements that make them fully environment-agnostic. In this post, we explore these new features in detail, showing you how to evaluate both RAG systems and models with practical examples. We demonstrate how to use the comparison capabilities to benchmark different implementations and make data-driven decisions about your AI deployments.
Fine-tune large language models with reinforcement learning from human or AI feedback
In this post, we introduce a state-of-the-art method to fine-tune LLMs with reinforcement learning, review the pros and cons of RLHF, RLAIF, and DPO, and show how to scale LLM fine-tuning efforts with RLAIF. We also show how to implement an end-to-end RLAIF pipeline on SageMaker using the Hugging Face Transformers and TRL libraries, either with off-the-shelf toxicity reward models to align responses during PPO or by directly prompting an LLM to generate quantitative reward feedback during PPO.
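The "prompt an LLM for quantitative reward feedback" idea boils down to asking a judge model for a numeric score and converting its free-text reply into a scalar reward for PPO. Here is a minimal sketch of that parsing step; the judge prompt wording and the 0–10 scale are assumptions for illustration, not details from the post.

```python
import re

# Sketch of the LLM-as-reward idea: prompt a judge model for a 0-10
# score, then parse its free-text reply into a normalized scalar
# reward usable by PPO. The template and scale are assumptions.

JUDGE_TEMPLATE = (
    "Rate the following response for helpfulness and non-toxicity "
    "on a scale of 0 to 10. Reply with the number only.\n\n"
    "Response: {response}"
)

def parse_reward(judge_output, lo=0.0, hi=10.0):
    """Extract the first number from the judge's reply, clamp it to
    [lo, hi], and normalize to [0, 1]; return 0.0 if no number found."""
    match = re.search(r"\d+(?:\.\d+)?", judge_output)
    if match is None:
        return 0.0
    score = min(max(float(match.group()), lo), hi)
    return (score - lo) / (hi - lo)

# In a real pipeline, judge_output would come from an LLM call on
# JUDGE_TEMPLATE.format(response=...); here we exercise the parser.
print(parse_reward("8"))           # 0.8
print(parse_reward("Score: 7.5"))  # 0.75
```

Clamping and normalizing matters because PPO is sensitive to reward scale, and a judge model occasionally returns out-of-range or malformed numbers.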
How Lumi streamlines loan approvals with HAQM SageMaker AI
Lumi is a leading Australian fintech lender empowering small businesses with fast, flexible, and transparent funding solutions. They use real-time data and machine learning (ML) to offer customized loans that fuel sustainable growth and solve the challenges of accessing capital. This post explores how Lumi uses HAQM SageMaker AI to meet this goal, enhance their transaction processing and classification capabilities, and ultimately grow their business by providing faster processing of loan applications, more accurate credit decisions, and improved customer experience.
How AWS Sales uses generative AI to streamline account planning
Every year, AWS Sales personnel draft in-depth, forward-looking strategy documents for established AWS customers. These documents help the AWS Sales team align with our customer growth strategy and collaborate with the entire sales team on long-term growth ideas for AWS customers. In this post, we showcase how the AWS Sales product team built a generative AI assistant for drafting account plans.
Shaping the future: OMRON’s data-driven journey with AWS
OMRON Corporation is a leading technology provider in industrial automation, healthcare, and electronic components. In their Shaping the Future 2030 (SF2030) strategic plan, OMRON aims to address diverse social issues, drive sustainable business growth, transform business models and capabilities, and accelerate digital transformation. At the heart of this transformation is the OMRON Data & Analytics Platform (ODAP), an innovative initiative designed to revolutionize how the company harnesses its data assets. This post explores how OMRON Europe is using HAQM Web Services (AWS) to build its advanced ODAP and its progress toward harnessing the power of generative AI.
AI Workforce: using AI and drones to simplify infrastructure inspections
Inspecting wind turbines, power lines, 5G towers, and pipelines is a tough job. It’s often dangerous, time-consuming, and prone to human error. This post is the first in a three-part series exploring AI Workforce, the AWS AI-powered drone inspection system. In this post, we introduce the concept and key benefits. The second post dives into the AWS architecture that powers AI Workforce, and the third focuses on the drone setup and integration.