AWS Machine Learning Blog

Category: Generative AI

Reduce ML training costs with HAQM SageMaker HyperPod

In this post, we explore the challenges of large-scale frontier model training, focusing on hardware failures and the benefits of HAQM SageMaker HyperPod – a solution that minimizes disruptions, enhances efficiency, and reduces training costs.

Model customization, RAG, or both: A case study with HAQM Nova

The introduction of HAQM Nova models represents a significant advancement in the field of AI, offering new opportunities for large language model (LLM) optimization. In this post, we demonstrate how to effectively perform model customization and RAG with HAQM Nova models as a baseline. We conducted a comprehensive comparison study between model customization and RAG using the latest HAQM Nova models, and we share the resulting insights.
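To make the RAG side of that comparison concrete, here is a minimal sketch of querying an existing HAQM Bedrock knowledge base through the retrieve_and_generate API. The knowledge base ID and model ARN are placeholders you would swap for your own, and the surrounding setup (data ingestion, IAM role) is assumed to already be in place.

    import boto3

    # Runtime client for knowledge base queries (RAG) on HAQM Bedrock.
    agent_runtime = boto3.client("bedrock-agent-runtime")

    response = agent_runtime.retrieve_and_generate(
        input={"text": "What were the main drivers of Q3 revenue?"},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                # Placeholders: use your own knowledge base ID and a Nova model ARN.
                "knowledgeBaseId": "KB123EXAMPLE",
                "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/...",
            },
        },
    )
    print(response["output"]["text"])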

Pixtral Large is now available in HAQM Bedrock

In this post, we demonstrate how to get started with the Pixtral Large model in HAQM Bedrock. The Pixtral Large multimodal model allows you to tackle a variety of use cases, such as document understanding, logical reasoning, handwriting recognition, image comparison, entity extraction, extracting structured data from scanned images, and caption generation.
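As a starting point for the document-understanding use case, the sketch below sends a scanned image plus an extraction prompt through the HAQM Bedrock Converse API. The model ID shown is a placeholder; copy the current Pixtral Large identifier for your Region from the HAQM Bedrock console.

    import boto3

    bedrock = boto3.client("bedrock-runtime")

    # Read the scanned document to send alongside the text prompt.
    with open("invoice.png", "rb") as f:
        image_bytes = f.read()

    response = bedrock.converse(
        modelId="mistral.pixtral-large-...",  # placeholder; use the real ID from the console
        messages=[{
            "role": "user",
            "content": [
                {"image": {"format": "png", "source": {"bytes": image_bytes}}},
                {"text": "Extract the vendor, invoice date, and total amount as JSON."},
            ],
        }],
    )
    print(response["output"]["message"]["content"][0]["text"])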

Boost team productivity with HAQM Q Business Insights

In this post, we explore HAQM Q Business Insights capabilities and its importance for organizations. We begin with an overview of the available metrics and how they can be used for measuring user engagement and system effectiveness. Then we provide instructions for accessing and navigating this dashboard.

Multi-LLM routing strategies for generative AI applications on AWS

Organizations are increasingly using multiple large language models (LLMs) when building generative AI applications. Although an individual LLM can be highly capable, it might not optimally address a wide range of use cases or meet diverse performance requirements. The multi-LLM approach enables organizations to effectively choose the right model for each task, adapt to different […]
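One simple pattern in this space is a rule-based router that maps each task type to a preferred model. The sketch below is illustrative only: the task-to-model table and the model IDs are placeholders you would fill in from the HAQM Bedrock console based on your own quality and cost evaluations.

    import boto3

    bedrock = boto3.client("bedrock-runtime")

    # Hypothetical task-to-model mapping; populate with real model IDs from
    # the HAQM Bedrock console based on your own evaluations.
    MODEL_BY_TASK = {
        "classification": "small-fast-model-id",
        "summarization": "midsize-model-id",
        "complex_reasoning": "frontier-model-id",
    }
    DEFAULT_MODEL = "midsize-model-id"

    def route(task_type: str, prompt: str) -> str:
        """Send the prompt to the model registered for this task type."""
        model_id = MODEL_BY_TASK.get(task_type, DEFAULT_MODEL)
        response = bedrock.converse(
            modelId=model_id,
            messages=[{"role": "user", "content": [{"text": prompt}]}],
        )
        return response["output"]["message"]["content"][0]["text"]

In production, the routing rule is often itself a small classifier or an LLM call rather than a static table, but the shape of the dispatch stays the same.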

Build an enterprise synthetic data strategy using HAQM Bedrock

In this post, we explore how to use HAQM Bedrock for synthetic data generation, weighing the challenges involved against the potential benefits, to develop effective strategies for AI and machine learning (ML) applications across multiple industries.
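For a flavor of the approach, here is a minimal sketch that asks a Bedrock-hosted model to emit structured synthetic records. The model ID is a placeholder, and in practice you would layer schema validation and deduplication on top of the raw generation step.

    import json
    import boto3

    bedrock = boto3.client("bedrock-runtime")
    MODEL_ID = "replace-with-a-model-id"  # placeholder; any capable text model works

    prompt = (
        "Generate 5 synthetic customer support tickets as a JSON array. "
        "Fields: id, product, severity (low|medium|high), description. "
        "Return only the JSON array."
    )

    response = bedrock.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"temperature": 0.9},  # higher temperature encourages variety
    )

    # Assumes the model complied with the JSON-only instruction; validate in practice.
    tickets = json.loads(response["output"]["message"]["content"][0]["text"])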

Effectively use prompt caching on HAQM Bedrock

Prompt caching, now generally available on HAQM Bedrock with Anthropic’s Claude 3.5 Haiku and Claude 3.7 Sonnet, along with the Nova Micro, Nova Lite, and Nova Pro models, lowers response latency by up to 85% and reduces costs by up to 90% by caching frequently used prompts across multiple API calls. This post provides a detailed overview of the prompt caching feature on HAQM Bedrock and offers guidance on how to use it effectively to achieve improved latency and cost savings.
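Mechanically, you mark where the reusable prefix of a prompt ends with a cache point. The sketch below shows the shape of that call through the Converse API, assuming a caching-capable model; the model ID and document are placeholders.

    import boto3

    bedrock = boto3.client("bedrock-runtime")
    MODEL_ID = "replace-with-a-caching-capable-model-id"  # placeholder

    long_document = open("policy.txt").read()  # large, reusable context
    questions = ["What is the refund window?", "Who approves exceptions?"]

    # Everything before the cachePoint block is the cacheable prefix; repeat
    # calls sharing this prefix can read it from cache instead of reprocessing
    # it, which is where the latency and cost savings come from.
    system = [
        {"text": long_document},
        {"cachePoint": {"type": "default"}},
    ]

    for question in questions:
        response = bedrock.converse(
            modelId=MODEL_ID,
            system=system,
            messages=[{"role": "user", "content": [{"text": question}]}],
        )
        print(response["output"]["message"]["content"][0]["text"])

The usage block in each response should also report cache read and write token counts, which you can inspect to confirm that repeat calls are actually hitting the cache.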

Advanced tracing and evaluation of generative AI agents using LangChain and HAQM SageMaker AI MLflow

In this post, I show you how to combine LangChain’s LangGraph, HAQM SageMaker AI, and MLflow to demonstrate a powerful workflow for developing, evaluating, and deploying sophisticated generative AI agents. This integration provides the tools needed to gain deep insights into the generative AI agent’s performance, iterate quickly, and maintain version control throughout the development process.
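The core wiring is small. The sketch below builds a trivial LangGraph graph and turns on MLflow’s LangChain autologging so each invocation is traced; the tracking server ARN is a placeholder, and the stub node stands in for a real LLM call.

    import mlflow
    from langgraph.graph import StateGraph, MessagesState, START, END

    # Point MLflow at your SageMaker AI managed MLflow tracking server (placeholder ARN).
    mlflow.set_tracking_uri("arn:aws:sagemaker:...")
    mlflow.set_experiment("agent-tracing-demo")

    # Autologging captures traces for LangChain and LangGraph invocations.
    mlflow.langchain.autolog()

    def respond(state: MessagesState):
        # Stub node; in the post this step would call an LLM.
        return {"messages": [("assistant", "stub response")]}

    builder = StateGraph(MessagesState)
    builder.add_node("respond", respond)
    builder.add_edge(START, "respond")
    builder.add_edge("respond", END)
    agent_graph = builder.compile()

    # This invocation shows up as a trace in the MLflow UI.
    agent_graph.invoke({"messages": [("user", "What is our refund policy?")]})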

Evaluate models or RAG systems using HAQM Bedrock Evaluations – Now generally available

Today, we’re excited to announce the general availability of the model and RAG system evaluation features in HAQM Bedrock Evaluations, along with significant enhancements that make them fully environment-agnostic. In this post, we explore these new features in detail, showing you how to evaluate both RAG systems and models with practical examples. We demonstrate how to use the comparison capabilities to benchmark different implementations and make data-driven decisions about your AI deployments.
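For orientation, an evaluation job can also be started programmatically. The sketch below follows our understanding of the CreateEvaluationJob API; the role, buckets, dataset, metric names, and model ID are placeholders, so verify the field names against the current SDK documentation before use.

    import boto3

    bedrock = boto3.client("bedrock")

    response = bedrock.create_evaluation_job(
        jobName="qa-eval-demo",
        roleArn="arn:aws:iam::111122223333:role/BedrockEvalRole",  # placeholder role
        evaluationConfig={
            "automated": {
                "datasetMetricConfigs": [{
                    "taskType": "QuestionAndAnswer",
                    "dataset": {
                        "name": "qa-dataset",
                        "datasetLocation": {"s3Uri": "s3://example-bucket/qa.jsonl"},
                    },
                    "metricNames": ["Builtin.Accuracy", "Builtin.Robustness"],
                }],
            },
        },
        inferenceConfig={
            "models": [{"bedrockModel": {"modelIdentifier": "replace-with-a-model-id"}}],
        },
        outputDataConfig={"s3Uri": "s3://example-bucket/eval-results/"},
    )
    print(response["jobArn"])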