AWS Machine Learning Blog

Category: Technical How-to

Enterprise-grade natural language to SQL generation using LLMs: Balancing accuracy, latency, and scale

In this post, the AWS and Cisco teams present a methodical approach to the challenges of enterprise-grade SQL generation. The teams were able to reduce the complexity of the NL2SQL process while delivering higher accuracy and better overall performance.
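
The full pipeline isn't reproduced in this teaser, but the core generation step can be sketched against HAQM Bedrock: pass the table schema and the user's question to an LLM and ask it to return SQL only. This is a minimal illustration of the NL2SQL pattern, not the teams' implementation; the model ID, schema, and prompt wording are placeholder assumptions.

```python
import boto3

# Minimal NL2SQL sketch: prompt an LLM on HAQM Bedrock with a table schema
# and a natural language question, asking for SQL only.
bedrock = boto3.client("bedrock-runtime")

SCHEMA = """CREATE TABLE orders (
    order_id INT, customer_id INT, order_date DATE, total_usd DECIMAL(10, 2)
);"""  # placeholder schema for illustration

def nl2sql(question: str, model_id: str = "anthropic.claude-3-5-sonnet-20240620-v1:0") -> str:
    prompt = (
        "You are a SQL assistant. Given this schema:\n"
        f"{SCHEMA}\n"
        f"Write one SQL query that answers: {question}\n"
        "Return only the SQL, with no explanation."
    )
    response = bedrock.converse(
        modelId=model_id,  # example model ID; any Bedrock text model works here
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 512, "temperature": 0},
    )
    return response["output"]["message"]["content"][0]["text"]

print(nl2sql("What was total revenue in March 2024?"))
```

In production, the generated SQL would be validated against the schema and permissions before execution.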

Build an AI-powered document processing platform with open source NER model and LLM on HAQM SageMaker

In this post, we discuss how you can build an AI-powered document processing platform with an open source NER model and LLMs on SageMaker.
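
As a rough sketch of that pattern (not the post's exact pipeline), an open source NER model can pre-extract entities that are then folded into an LLM prompt for structured interpretation. The checkpoint name and sample document below are illustrative assumptions.

```python
from transformers import pipeline

# Step 1: an open source NER model extracts entities from raw document text.
# dslim/bert-base-NER is a public checkpoint chosen here purely for illustration.
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

document = "Acme Corp signed a lease with Jane Doe in Seattle on March 1, 2024."
entities = ner(document)

# Step 2: combine the entities with the original text into a prompt for an LLM
# (for example, one hosted on a SageMaker endpoint) to produce structured output.
entity_summary = ", ".join(f"{e['word']} ({e['entity_group']})" for e in entities)
llm_prompt = (
    f"Document: {document}\n"
    f"Detected entities: {entity_summary}\n"
    "Return a JSON object with the fields: parties, location, date."
)
print(llm_prompt)  # send this prompt to your hosted LLM endpoint
```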

Host concurrent LLMs with LoRAX

In this post, we explore how Low-Rank Adaptation (LoRA) can be used to address the challenges of serving a growing portfolio of fine-tuned models. Specifically, we discuss LoRA serving with LoRA eXchange (LoRAX) on HAQM Elastic Compute Cloud (HAQM EC2) GPU instances, which allows organizations to efficiently manage and serve their fine-tuned models, optimize costs, and provide seamless performance for their customers.
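
LoRAX exposes a TGI-compatible HTTP API in which the LoRA adapter is chosen per request, so one base model on a GPU instance can serve many fine-tuned variants. A minimal client sketch, assuming a LoRAX server already running on the EC2 instance at port 8080 and a hypothetical adapter ID:

```python
import requests

# Query a running LoRAX server; the base model stays resident on the GPU
# and the requested LoRA adapter is applied dynamically per request.
LORAX_URL = "http://localhost:8080/generate"  # assumed local deployment

payload = {
    "inputs": "Summarize our Q3 support ticket trends.",
    "parameters": {
        "max_new_tokens": 128,
        # adapter_id selects which fine-tuned LoRA adapter handles this request;
        # "acme/support-summarizer" is a hypothetical adapter name.
        "adapter_id": "acme/support-summarizer",
    },
}
response = requests.post(LORAX_URL, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["generated_text"])
```

Because adapters are loaded on demand, requests targeting different adapters can share the same base model weights, which is what keeps per-model serving costs down.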

Optimizing Mixtral 8x7B on HAQM SageMaker with AWS Inferentia2

This post demonstrates how to deploy and serve the Mixtral 8x7B language model on AWS Inferentia2 instances for cost-effective, high-performance inference. We walk through model compilation with Hugging Face Optimum Neuron, which provides a set of tools for straightforward model loading, training, and inference, and through serving with the Text Generation Inference (TGI) container, Hugging Face's toolkit for deploying and serving LLMs.
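
The compilation step can be sketched with Optimum Neuron's NeuronModelForCausalLM, which exports the checkpoint into Neuron-compiled artifacts that the TGI Neuron container can then serve. The core count and input shapes below are illustrative and need to match your Inferentia2 instance size.

```python
from optimum.neuron import NeuronModelForCausalLM

# Compile Mixtral 8x7B for Inferentia2. Input shapes are fixed at compile
# time; the values here are illustrative (e.g., sized for an inf2.48xlarge).
compiler_args = {"num_cores": 24, "auto_cast_type": "bf16"}
input_shapes = {"batch_size": 1, "sequence_length": 4096}

model = NeuronModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    export=True,  # triggers compilation with the Neuron compiler
    **compiler_args,
    **input_shapes,
)

# Persist the compiled artifacts so the TGI container can load and serve them.
model.save_pretrained("./mixtral-8x7b-neuron")
```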

Building an AIOps chatbot with HAQM Q Business custom plugins

In this post, we demonstrate how you can use custom plugins for HAQM Q Business to build a chatbot that can interact with multiple APIs using natural language prompts. We showcase how to build an AIOps chatbot that enables users to interact with their AWS infrastructure through natural language queries and commands. The chatbot can handle tasks such as querying data about HAQM Elastic Compute Cloud (HAQM EC2) ports and HAQM Simple Storage Service (HAQM S3) bucket access settings.
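
A custom plugin is defined by an OpenAPI schema that tells HAQM Q Business which operations the chatbot may invoke. A rough registration sketch with boto3 follows; the application ID and schema are placeholders, the no-auth configuration assumes an open test API, and the parameter shapes should be verified against the current qbusiness API reference.

```python
import boto3

qbusiness = boto3.client("qbusiness")

# Hypothetical OpenAPI schema describing one operation the chatbot may call.
OPENAPI_SCHEMA = """
openapi: 3.0.0
info: {title: EC2 Port Lookup, version: "1.0"}
paths:
  /ec2/open-ports:
    get:
      operationId: listOpenPorts
      responses:
        "200": {description: Open ports per instance}
"""

response = qbusiness.create_plugin(
    applicationId="app-0123456789",  # placeholder HAQM Q Business application ID
    displayName="aiops-ec2-plugin",
    type="CUSTOM",
    authConfiguration={"noAuthConfiguration": {}},  # assumes an unauthenticated API
    customPluginConfiguration={
        "description": "Query EC2 port and S3 bucket access settings",
        "apiSchemaType": "OPEN_API_V3",
        "apiSchema": {"payload": OPENAPI_SCHEMA},
    },
)
print(response["pluginId"])
```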

Model customization, RAG, or both: A case study with HAQM Nova

The introduction of HAQM Nova models represents a significant advancement in the field of AI, offering new opportunities for large language model (LLM) optimization. In this post, we demonstrate how to effectively perform model customization and RAG with HAQM Nova models as a baseline. We conducted a comprehensive comparison study between model customization and RAG using the latest HAQM Nova models, and we share the resulting insights.
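
As an illustration of the RAG half of that comparison (a sketch of the pattern, not the study's exact setup), passages can be retrieved from a HAQM Bedrock knowledge base and passed to a Nova model through the Converse API; the knowledge base and model IDs below are placeholders.

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")
bedrock = boto3.client("bedrock-runtime")

question = "What is our refund policy for annual plans?"

# Retrieve supporting passages from a knowledge base (placeholder ID).
retrieval = agent_runtime.retrieve(
    knowledgeBaseId="KB0123456789",
    retrievalQuery={"text": question},
    retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 3}},
)
context = "\n\n".join(r["content"]["text"] for r in retrieval["retrievalResults"])

# Ground the Nova model's answer in the retrieved context.
response = bedrock.converse(
    modelId="amazon.nova-lite-v1:0",  # example Nova model ID
    messages=[{
        "role": "user",
        "content": [{"text": f"Context:\n{context}\n\nQuestion: {question}"}],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```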

Workflow diagram:
1. Import your user, item, and interaction data into HAQM Personalize.
2. Train an HAQM Personalize "Top picks for you" recommender.
3. Get the top recommended movies for each user.
4. Use a prompt template, the recommended movies, and the user demographics to generate the model prompt.
5. Use HAQM Bedrock LLMs to generate personalized outbound communication with the prompt.
6. Share the personalized outbound communication with each of your users.

Generate user-personalized communication with HAQM Personalize and HAQM Bedrock

In this post, we demonstrate how to use HAQM Personalize and HAQM Bedrock to generate personalized outreach emails for individual users, using a video-on-demand use case. The same pattern applies to other domains, such as creating compelling customer experiences for ecommerce and digital marketing use cases.
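
Steps 3 through 5 of the workflow above reduce to a few calls: fetch top picks from a Personalize recommender, fold them into a prompt template, and let a Bedrock model draft the message. This is a minimal sketch; the recommender ARN, user ID, and model ID are placeholders, and in practice you would map item IDs back to movie titles before prompting.

```python
import boto3

personalize = boto3.client("personalize-runtime")
bedrock = boto3.client("bedrock-runtime")

# Step 3: get the top recommended movies for a user (placeholder ARN and ID).
recs = personalize.get_recommendations(
    recommenderArn="arn:aws:personalize:us-east-1:111122223333:recommender/top-picks",
    userId="user-42",
    numResults=5,
)
movie_ids = [item["itemId"] for item in recs["itemList"]]

# Steps 4-5: build the prompt from a template plus the recommendations, then
# have a Bedrock LLM draft the personalized outbound email.
prompt = (
    "Write a short, friendly email recommending these movies to a viewer in "
    f"the 25-34 age group: {', '.join(movie_ids)}."
)
response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    messages=[{"role": "user", "content": [{"text": prompt}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```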