AWS Machine Learning Blog

Category: Launch

Supercharge your LLM performance with HAQM SageMaker Large Model Inference container v15

Today, we’re excited to announce the launch of HAQM SageMaker Large Model Inference (LMI) container v15, powered by vLLM 0.8.4 with support for the vLLM V1 engine. This release delivers significant performance improvements and expanded model compatibility, including multimodal models (that is, models that can handle text-to-text, image-to-text, and text-to-image tasks), and provides built-in vLLM integration to help you seamlessly deploy and serve large language models (LLMs) at scale with high performance.
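As a rough sketch of what deployment looks like, the following builds a SageMaker CreateModel request for an endpoint backed by the LMI container. The image URI, environment variable names, and placeholder values are illustrative assumptions, not exact values from this release; consult the LMI container documentation for your Region.

```python
import json

# Sketch of a SageMaker CreateModel request for an LMI-backed endpoint.
# The image tag, env variable names, and placeholders below are assumptions.
model_config = {
    "ModelName": "llm-lmi-v15-demo",  # hypothetical name
    "PrimaryContainer": {
        # Placeholder image URI -- substitute the LMI v15 image for your Region
        "Image": "<account>.dkr.ecr.<region>.amazonaws.com/djl-inference:<lmi-v15-tag>",
        "Environment": {
            "HF_MODEL_ID": "<hugging-face-model-id>",  # model to serve
            "OPTION_MAX_MODEL_LEN": "8192",            # vLLM context window
            "OPTION_MAX_ROLLING_BATCH_SIZE": "64",     # continuous-batching size
        },
    },
    "ExecutionRoleArn": "arn:aws:iam::<account>:role/<sagemaker-role>",
}

# With credentials configured, the request would be sent with:
#   boto3.client("sagemaker").create_model(**model_config)
print(json.dumps(model_config["PrimaryContainer"]["Environment"], indent=2))
```

An endpoint configuration and endpoint would then be created from this model in the usual SageMaker hosting workflow.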

Harness the power of MCP servers with HAQM Bedrock Agents

The Model Context Protocol (MCP) gives agents standardized access to an expanding list of tools that they can use to accomplish a variety of tasks. In this post, we show you how to build an HAQM Bedrock agent that uses MCP to access data sources so you can quickly build generative AI applications.

Simplify multimodal generative AI with HAQM Bedrock Data Automation

HAQM Bedrock Data Automation, now in public preview, offers a unified experience for developers of all skill levels to automate the extraction, transformation, and generation of relevant insights from documents, images, audio, and video to build generative AI–powered applications. In this post, we demonstrate how to use HAQM Bedrock Data Automation in the AWS Management Console and with the AWS SDK for Python (Boto3) for media analysis and intelligent document processing (IDP) workflows.
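As a hedged sketch, an asynchronous Data Automation invocation from Boto3 might be shaped as follows. The client name, method, and field names are assumptions based on the public preview; verify them against the Boto3 reference before use.

```python
# Sketch of an asynchronous HAQM Bedrock Data Automation invocation.
# Client, method, and field names are assumptions -- check the Boto3 reference.
request = {
    "inputConfiguration": {"s3Uri": "s3://<bucket>/input/claim-form.pdf"},
    "outputConfiguration": {"s3Uri": "s3://<bucket>/output/"},
    "dataAutomationConfiguration": {
        "dataAutomationProjectArn": (
            "arn:aws:bedrock:<region>:<account>:data-automation-project/<id>"
        ),
    },
}

# With credentials configured:
#   client = boto3.client("bedrock-data-automation-runtime")
#   job = client.invoke_data_automation_async(**request)
#   # Poll the job status, then read extracted insights from the output prefix.
print(request["inputConfiguration"]["s3Uri"])
```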

Use HAQM Bedrock tooling with HAQM SageMaker JumpStart models

In this post, we explore how to deploy AI models from SageMaker JumpStart and use them with HAQM Bedrock’s powerful features, combining SageMaker JumpStart’s model hosting with HAQM Bedrock’s security and monitoring tools. We demonstrate this with the Gemma 2 9B Instruct model as an example, showing how to deploy it and apply HAQM Bedrock’s advanced capabilities.
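For orientation, deploying a JumpStart model with the SageMaker Python SDK follows the pattern sketched below. The model identifier and inference parameters are assumptions for illustration; use the ID shown on the model card.

```python
# JumpStart deployment (requires the `sagemaker` package and AWS credentials),
# shown as comments so the payload below stays self-contained:
#
#   from sagemaker.jumpstart.model import JumpStartModel
#   model = JumpStartModel(model_id="<gemma-2-9b-instruct-model-id>")  # from the model card
#   predictor = model.deploy(accept_eula=True)

# Request payload a text-generation endpoint would typically accept;
# the parameter names follow a common text-generation schema (assumption).
payload = {
    "inputs": "In one sentence, why add guardrails to an LLM application?",
    "parameters": {"max_new_tokens": 128, "temperature": 0.6},
}
# predictor.predict(payload) would return the generated text.
print(payload["parameters"])
```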

A guide to HAQM Bedrock Model Distillation (preview)

This post introduces the HAQM Bedrock Model Distillation workflow. We first cover the general concept of model distillation in HAQM Bedrock, and then walk through the key steps: setting up permissions, selecting the models, providing the input dataset, starting the model distillation job, and evaluating and deploying the distilled student model.
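The steps above can be sketched as a single customization-job request. The method and field names are assumptions about the distillation API shape; check the Boto3 reference for the exact schema.

```python
# Sketch of a model distillation job request (field names are assumptions).
job_request = {
    "jobName": "distill-demo-job",  # hypothetical
    "customizationType": "DISTILLATION",
    "roleArn": "arn:aws:iam::<account>:role/<distillation-role>",  # step: permissions
    "baseModelIdentifier": "<student-model-id>",   # step: select the student model
    "customizationConfig": {
        "distillationConfig": {
            "teacherModelConfig": {
                "teacherModelIdentifier": "<teacher-model-id>",  # larger teacher
            }
        }
    },
    "trainingDataConfig": {"s3Uri": "s3://<bucket>/prompts/input.jsonl"},  # step: input dataset
    "outputDataConfig": {"s3Uri": "s3://<bucket>/distillation-output/"},
}
# With credentials configured, the job would be started with:
#   boto3.client("bedrock").create_model_customization_job(**job_request)
# Evaluation and deployment of the resulting student model follow as separate steps.
print(job_request["customizationType"])
```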

John Snow Labs Medical LLMs are now available in HAQM SageMaker JumpStart

Today, we are excited to announce that John Snow Labs’ Medical LLM – Small and Medical LLM – Medium large language models (LLMs) are now available in HAQM SageMaker JumpStart. For medical doctors, these models provide a rapid understanding of a patient’s medical journey, aiding timely and informed decision-making across extensive documentation. This summarization capability not only boosts efficiency but also helps ensure that no critical details are overlooked, supporting optimal patient care and better healthcare outcomes.

Build cost-effective RAG applications with Binary Embeddings in HAQM Titan Text Embeddings V2, HAQM OpenSearch Serverless, and HAQM Bedrock Knowledge Bases

Today, we are happy to announce the availability of Binary Embeddings for HAQM Titan Text Embeddings V2 in HAQM Bedrock Knowledge Bases and HAQM OpenSearch Serverless. This post summarizes the benefits of the new binary vector support and shows you how to get started.
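As a getting-started sketch, the request body for a binary embedding from Titan Text Embeddings V2 might look like the following. Treat the field names and response shape as assumptions and verify them against the model’s request schema.

```python
import json

# Request body for a binary embedding (field names are assumptions based on
# the Titan Text Embeddings V2 schema -- verify against the documentation).
body = json.dumps({
    "inputText": "What is retrieval-augmented generation?",
    "dimensions": 1024,            # output vector length
    "embeddingTypes": ["binary"],  # request the compact binary vector
})

# With credentials configured:
#   brt = boto3.client("bedrock-runtime")
#   resp = brt.invoke_model(modelId="amazon.titan-embed-text-v2:0", body=body)
#   binary_vec = json.loads(resp["body"].read())["embeddingsByType"]["binary"]
print(json.loads(body)["embeddingTypes"])
```

Binary vectors store one bit per dimension instead of a 32-bit float, which is what drives the storage and cost savings in vector stores such as HAQM OpenSearch Serverless.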

Introducing Stable Diffusion 3.5 Large in HAQM SageMaker JumpStart

We are excited to announce the availability of Stability AI’s latest and most advanced text-to-image model, Stable Diffusion 3.5 Large, in HAQM SageMaker JumpStart. In this post, we provide an implementation guide for subscribing to Stable Diffusion 3.5 Large in SageMaker JumpStart, deploying the model in HAQM SageMaker Studio, and generating images using text-to-image prompts.
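Once the model is deployed, generating an image reduces to sending a prompt payload to the endpoint. The field names below are assumptions for illustration; the exact accepted schema is shown with the model in SageMaker Studio.

```python
# Hypothetical text-to-image request payload for a deployed
# Stable Diffusion 3.5 Large endpoint (field names are assumptions).
payload = {
    "prompt": "A watercolor lighthouse at dawn, soft light",  # text-to-image prompt
    "seed": 42,                                               # for reproducibility
}
# predictor.predict(payload) on the deployed endpoint would return the
# generated image, typically as base64-encoded bytes.
print(sorted(payload))
```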

Track, allocate, and manage your generative AI cost and usage with HAQM Bedrock

HAQM Bedrock has launched a capability that organizations can use to tag on-demand models and monitor associated costs. Organizations can now label all HAQM Bedrock models with AWS cost allocation tags, aligning usage to specific organizational taxonomies such as cost centers, business units, and applications.
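The tagging workflow can be sketched as creating an application inference profile that carries cost allocation tags; the API and field names below are assumptions, so verify them against the HAQM Bedrock documentation.

```python
# Sketch: tag on-demand model usage via an application inference profile.
# Method and field names are assumptions -- verify against the Bedrock docs.
request = {
    "inferenceProfileName": "claims-summarization",  # hypothetical profile name
    "modelSource": {
        "copyFrom": "arn:aws:bedrock:<region>::foundation-model/<model-id>",
    },
    "tags": [
        {"key": "CostCenter", "value": "claims"},          # maps spend to a cost center
        {"key": "Application", "value": "doc-summarizer"}, # maps spend to an application
    ],
}
# With credentials configured:
#   boto3.client("bedrock").create_inference_profile(**request)
# Invocations made through the profile's ARN then accrue cost under these tags.
tag_keys = [t["key"] for t in request["tags"]]
print(tag_keys)
```

Activating the same keys as AWS cost allocation tags makes the per-tag spend visible in AWS Cost Explorer.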