AWS Machine Learning Blog
Category: Expert (400)
Accelerate performance using a custom chunking mechanism with HAQM Bedrock
This post explores how Accenture used the customization capabilities of Knowledge Bases for HAQM Bedrock to incorporate their data processing workflow and custom logic into a custom chunking mechanism that enhances the performance of Retrieval Augmented Generation (RAG) and unlocks the potential of PDF data.
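The post's Accenture-specific chunking logic isn't reproduced here, but the general idea behind custom chunking for RAG can be sketched in a few lines. The function below is a hypothetical, minimal example of structure-aware chunking (packing whole paragraphs into size-bounded chunks rather than splitting at arbitrary character offsets); the actual mechanism in the post is tailored to PDF structure.

```python
def chunk_by_paragraph(text: str, max_chars: int = 500) -> list[str]:
    """Greedy paragraph-aware chunking: pack whole paragraphs into
    chunks of up to max_chars characters, never splitting a paragraph
    across chunks unless a single paragraph exceeds the limit.

    A generic sketch only -- a production chunker (like the one in the
    post) would also respect headings, tables, and PDF layout.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # +2 accounts for the paragraph separator we re-insert below.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Keeping semantically complete units (paragraphs, sections) inside a single chunk is what typically improves retrieval quality over fixed-size splitting, because each chunk stays self-contained enough to match a query on its own.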
Improve RAG accuracy with fine-tuned embedding models on HAQM SageMaker
This post demonstrates how to use HAQM SageMaker to fine-tune a Sentence Transformers embedding model and deploy it to an HAQM SageMaker endpoint. The code from this post and more examples are available in the GitHub repo.
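The full training code lives in the linked repo; as a rough illustration of the kind of contrastive objective commonly used when fine-tuning Sentence Transformers on (query, relevant passage) pairs, here is a minimal PyTorch sketch of an in-batch-negatives ranking loss. The function name and scale value are illustrative, not taken from the post.

```python
import torch
import torch.nn.functional as F

def in_batch_negatives_loss(query_emb: torch.Tensor,
                            pos_emb: torch.Tensor,
                            scale: float = 20.0) -> torch.Tensor:
    """Contrastive loss with in-batch negatives: each query should
    score its own positive passage highest among all positives in
    the batch. query_emb and pos_emb are (batch, dim) tensors where
    row i of pos_emb is the relevant passage for row i of query_emb.
    """
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(pos_emb, dim=-1)
    scores = q @ p.T * scale               # (batch, batch) scaled cosine sims
    labels = torch.arange(scores.size(0))  # diagonal entries are the positives
    return F.cross_entropy(scores, labels)
```

Every other passage in the batch serves as a free negative example, which is why larger batch sizes tend to help when fine-tuning embedding models with this style of loss.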
Accelerated PyTorch inference with torch.compile on AWS Graviton processors
PyTorch originally used an eager mode, where each operation that forms the model runs independently as soon as it's reached. PyTorch 2.0 introduced torch.compile to speed up PyTorch code over the default eager mode. In contrast to eager mode, torch.compile pre-compiles the entire model into a single graph in a manner that's optimal for […]
Pre-training genomic language models using AWS HealthOmics and HAQM SageMaker
Pre-train HyenaDNA, a genomic language model with context lengths of up to 1 million tokens, using HealthOmics storage and SageMaker's managed training environment to catalyze breakthroughs in precision medicine, agriculture, and biotechnology.
Fine-tune large multimodal models using HAQM SageMaker
Large multimodal models (LMMs) integrate multiple data types into a single model. By combining text data with images and other modalities during training, multimodal models such as Claude 3, GPT-4V, and Gemini Pro Vision gain a more comprehensive understanding and an improved ability to process diverse data types. The multimodal approach allows models to handle a wider range […]
Evaluation of generative AI techniques for clinical report summarization
In this post, we provide a comparison of results obtained by two such techniques: zero-shot and few-shot prompting. We also explore the utility of the RAG prompt engineering technique as it applies to the task of summarization.
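The difference between the two prompting techniques compared in the post comes down to how the prompt is assembled. A minimal sketch (the sample report and example pair below are invented for illustration, not clinical data from the post):

```python
# Hypothetical report text; in the post these are real clinical notes.
report = "Patient presented with chest pain. ECG normal. Discharged with follow-up."

# Zero-shot: a task instruction only, no worked examples.
zero_shot = f"Summarize the following clinical report in one sentence:\n\n{report}"

# Few-shot: prepend a small number of worked examples so the model can
# infer the expected style and level of detail before seeing the input.
examples = [
    ("Patient admitted with fever and cough. Chest X-ray clear. "
     "Treated with antibiotics.",
     "Patient treated for suspected respiratory infection; imaging clear."),
]
shots = "\n\n".join(f"Report: {r}\nSummary: {s}" for r, s in examples)
few_shot = f"{shots}\n\nReport: {report}\nSummary:"
```

Few-shot prompting usually costs more input tokens but tends to produce summaries that better match the demonstrated format, which is exactly the trade-off the post evaluates.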
Evaluating LLMs is an undervalued part of the machine learning (ML) pipeline.
Build a Hugging Face text classification model in HAQM SageMaker JumpStart
HAQM SageMaker JumpStart provides a suite of built-in algorithms, pre-trained models, and pre-built solution templates to help data scientists and machine learning (ML) practitioners get started on training and deploying ML models quickly. You can use these algorithms and models for both supervised and unsupervised learning. They can process various types of input data, including […]
Information extraction with LLMs using HAQM SageMaker JumpStart
Large language models (LLMs) have unlocked new possibilities for extracting information from unstructured text data. Although much of the current excitement is around LLMs for generative AI tasks, many of the key use cases that you might want to solve have not fundamentally changed. Tasks such as routing support tickets, recognizing customer intents from a […]
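A common pattern for LLM-based information extraction is to ask the model for a JSON object and then parse its reply. The helpers below are a hypothetical sketch of that pattern (function names and field names are invented); the model call itself is omitted, so the parser is shown against a simulated reply.

```python
import json

def build_extraction_prompt(ticket_text: str, fields: list[str]) -> str:
    """Ask the model to return only a JSON object with the given keys."""
    keys = ", ".join(f'"{f}"' for f in fields)
    return (
        f"Extract the following fields from the support ticket as a "
        f"JSON object with keys {keys}. Return only the JSON.\n\n"
        f"Ticket: {ticket_text}"
    )

def parse_extraction(model_output: str, fields: list[str]) -> dict:
    """Parse the model's JSON reply, tolerating any surrounding text
    by slicing from the first '{' to the last '}'."""
    start = model_output.find("{")
    end = model_output.rfind("}") + 1
    data = json.loads(model_output[start:end])
    # Keep only the requested fields; missing keys map to None.
    return {f: data.get(f) for f in fields}
```

Constraining the model to a fixed schema keeps downstream code (ticket routers, databases) simple, which is why structured-output prompting is a natural fit for the "classic" extraction use cases the excerpt mentions.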
Improve LLM performance with human and AI feedback on HAQM SageMaker for HAQM Engineering
The HAQM EU Design and Construction (HAQM D&C) team is the engineering team designing and constructing HAQM warehouses. The team navigates a large volume of documents and locates the right information to make sure the warehouse design meets the highest standards. In the post A generative AI-powered solution on HAQM SageMaker to help HAQM EU […]
Integrate HyperPod clusters with Active Directory for seamless multi-user login
HAQM SageMaker HyperPod is purpose-built to accelerate foundation model (FM) training, removing the undifferentiated heavy lifting involved in managing and optimizing a large training compute cluster. With SageMaker HyperPod, you can train FMs for weeks and months without disruption. Typically, HyperPod clusters are used by multiple users: machine learning (ML) researchers, software engineers, data scientists, […]