AWS Machine Learning Blog

Category: Learning Levels

Evaluation Workflow

Evaluate Amazon Bedrock Agents with Ragas and LLM-as-a-judge

In this post, we introduced the Open Source Bedrock Agent Evaluation framework, a Langfuse-integrated solution that streamlines the agent development process. We demonstrated how this evaluation framework can be integrated with pharmaceutical research agents. We used it to evaluate agent performance against biomarker questions and sent traces to Langfuse to view evaluation metrics across question types.

Combine keyword and semantic search for text and images using Amazon Bedrock and Amazon OpenSearch Service

In this post, we walk you through how to build a hybrid search solution using OpenSearch Service powered by multimodal embeddings from the Amazon Titan Multimodal Embeddings G1 model through Amazon Bedrock. This solution demonstrates how you can enable users to submit both text and images as queries to retrieve relevant results from a sample retail image dataset.

Accuracy evaluation framework for Amazon Q Business – Part 2

In the first post of this series, we introduced a comprehensive evaluation framework for Amazon Q Business, a fully managed Retrieval Augmented Generation (RAG) solution that uses your company’s proprietary data without the complexity of managing large language models (LLMs). The first post focused on selecting appropriate use cases, preparing data, and implementing metrics to […]

Use Amazon Bedrock Intelligent Prompt Routing for cost and latency benefits

Today, we’re happy to announce the general availability of Amazon Bedrock Intelligent Prompt Routing. In this blog post, we detail various highlights from our internal testing, how you can get started, and point out some caveats and best practices. We encourage you to incorporate Amazon Bedrock Intelligent Prompt Routing into your new and existing generative AI applications.

Amazon Bedrock Prompt Optimization Drives LLM Applications Innovation for Yuewen Group

Today, we are excited to announce the availability of Prompt Optimization on Amazon Bedrock. With this capability, you can now optimize your prompts for several use cases with a single API call or a click of a button on the Amazon Bedrock console. In this blog post, we discuss how Prompt Optimization improves the performance of large language models (LLMs) for intelligent text processing tasks at Yuewen Group.

Build an automated generative AI solution evaluation pipeline with Amazon Nova

In this post, we explore the importance of evaluating LLMs in the context of generative AI applications, highlighting the challenges posed by issues like hallucinations and biases. We introduce a comprehensive solution using AWS services to automate the evaluation process, allowing for continuous monitoring and assessment of LLM performance. By using tools like the FMeval Library, Ragas, LLMeter, and Step Functions, the solution provides flexibility and scalability, meeting the evolving needs of LLM consumers.

Build a FinOps agent using Amazon Bedrock with multi-agent capability and Amazon Nova as the foundation model

In this post, we use the multi-agent feature of Amazon Bedrock to demonstrate an approach to AWS cost management. Using Amazon Nova FMs, we’ve developed a solution that shows how AI-driven agents can help organizations analyze, optimize, and manage their AWS costs.

The future of quality assurance: Shift-left testing with QyrusAI and Amazon Bedrock

In this post, we explore how QyrusAI and Amazon Bedrock are revolutionizing shift-left testing, enabling teams to deliver better software faster. Amazon Bedrock is a fully managed service that allows businesses to build and scale generative AI applications using foundation models (FMs) from leading AI providers. It enables seamless integration with AWS services, offering customization, security, and scalability without managing infrastructure.

Racing beyond DeepRacer: Debut of the AWS LLM League

The AWS LLM League was designed to lower the barriers to entry in generative AI model customization by providing an experience where participants, regardless of their prior data science experience, could engage in fine-tuning LLMs. Using Amazon SageMaker JumpStart, attendees were guided through the process of customizing LLMs to address real business challenges adaptable to their domain.