AWS Partner Network (APN) Blog

Tag: Arcee AI

Running GenAI Inference with AWS Graviton and Arcee AI Models

Growing demand for generative AI (GenAI) applications has increased the need for compute resources that can run these workloads efficiently. In this post, we share a step-by-step guide to optimizing GenAI inference workloads on AWS Graviton-based instances. We walk you through downloading Arcee AI small language models (SLMs), applying quantization techniques, and deploying the models for efficient inference on AWS Graviton instances.