AWS Machine Learning Blog

Category: Compute

Create a generative AI–powered custom Google Chat application using HAQM Bedrock

AWS offers powerful generative AI services, including HAQM Bedrock, which organizations can use to build tailored applications such as AI chat-based assistants that answer questions from knowledge contained in their own documents, and much more. Many businesses want to integrate these cutting-edge AI capabilities with their existing collaboration tools, such as Google Chat, to […]

Automate HAQM Bedrock batch inference: Building a scalable and efficient pipeline

Although batch inference offers numerous benefits, it's limited to 10 batch inference jobs submitted per model per Region. To address this limitation and make better use of batch inference, we developed a scalable solution using AWS Lambda and HAQM DynamoDB. This post guides you through implementing a queue management system that automatically monitors available job slots and submits new jobs as slots become available.
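The post's actual solution tracks job state in DynamoDB and runs the scheduling logic in Lambda; as a simplified in-memory sketch of the same slot-limited queue pattern (class and method names here are illustrative, not from the post):

```python
# Simplified sketch of the queue pattern: jobs wait until one of a
# fixed number of slots (the Bedrock per-model, per-Region quota of 10
# batch jobs) is free. The real solution persists this state in
# DynamoDB and runs the scheduler in Lambda.
from collections import deque

MAX_SLOTS = 10  # batch inference jobs per model per Region


class JobQueue:
    def __init__(self, max_slots=MAX_SLOTS):
        self.max_slots = max_slots
        self.pending = deque()  # jobs waiting for a free slot
        self.running = set()    # jobs currently submitted

    def submit(self, job_id):
        """Queue a job; it starts only when a slot is free."""
        self.pending.append(job_id)
        self._fill_slots()

    def complete(self, job_id):
        """Mark a running job finished, freeing its slot."""
        self.running.discard(job_id)
        self._fill_slots()

    def _fill_slots(self):
        # Promote pending jobs while slots remain available.
        while self.pending and len(self.running) < self.max_slots:
            self.running.add(self.pending.popleft())


q = JobQueue(max_slots=2)
for job in ["a", "b", "c"]:
    q.submit(job)
print(sorted(q.running), list(q.pending))  # ['a', 'b'] ['c']
q.complete("a")
print(sorted(q.running), list(q.pending))  # ['b', 'c'] []
```

When a running job completes, its slot is immediately refilled from the pending queue, which is what keeps all available slots busy without exceeding the quota.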

Deploy a serverless web application to edit images using HAQM Bedrock

In this post, we explore a sample solution that you can use to deploy an image editing application using AWS serverless services and generative AI services. We use HAQM Bedrock and an HAQM Titan foundation model (FM), which allow you to edit images by using prompts.

Create a multimodal chatbot tailored to your unique dataset with HAQM Bedrock FMs

In this post, we show how to create a multimodal chat assistant on HAQM Web Services (AWS) using HAQM Bedrock models, where users can submit images and questions, and text responses will be sourced from a closed set of proprietary documents.

Accelerate pre-training of Mistral’s Mathstral model with highly resilient clusters on HAQM SageMaker HyperPod

In this post, we present an in-depth guide to starting a continual pre-training job using PyTorch Fully Sharded Data Parallel (FSDP) for Mistral AI's Mathstral model on SageMaker HyperPod.

Building an efficient MLOps platform with OSS tools on HAQM ECS with AWS Fargate

In this post, we show you how Zeta Global, a data-driven marketing technology company, has built an efficient MLOps platform to streamline the end-to-end ML workflow, from data ingestion to model deployment, while optimizing resource utilization and cost efficiency.

Introducing HAQM EKS support in HAQM SageMaker HyperPod

This post is designed for Kubernetes cluster administrators and ML scientists, providing an overview of the key features that SageMaker HyperPod introduces to facilitate large-scale model training on an EKS cluster.

HAQM EC2 P5e instances are generally available

In this post, we discuss the core capabilities of HAQM Elastic Compute Cloud (HAQM EC2) P5e instances and the use cases they’re well-suited for. We walk you through an example of how to get started with these instances and carry out inference deployment of Meta Llama 3.1 70B and 405B models on them.

Accelerate performance using a custom chunking mechanism with HAQM Bedrock

This post explores how Accenture used the customization capabilities of Knowledge Bases for HAQM Bedrock to incorporate its data processing workflow and custom logic into a custom chunking mechanism that enhances Retrieval Augmented Generation (RAG) performance and unlocks the potential of PDF data.
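Accenture's specific chunking logic isn't shown in this excerpt; as a rough illustration of what a custom chunking mechanism for RAG can look like, the sketch below packs whole paragraphs into size-bounded chunks rather than cutting text at fixed character offsets (the function name and size limit are hypothetical):

```python
def chunk_by_paragraph(text, max_chars=500):
    """Pack blank-line-separated paragraphs into chunks of at most
    max_chars characters, never splitting a paragraph in half.

    A generic example of custom chunking for RAG, not the mechanism
    described in the post.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for p in paragraphs:
        # Start a new chunk if adding this paragraph would overflow.
        if current and len(current) + len(p) + 2 > max_chars:
            chunks.append(current)
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(current)
    return chunks


doc = "intro text\n\nsection one body\n\nsection two body"
print(len(chunk_by_paragraph(doc, max_chars=30)))  # 2
```

Keeping semantically complete units (paragraphs, sections) inside a single chunk is the usual motivation for custom chunking: retrieval returns passages that carry their own context instead of arbitrary text windows.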