Posted On: Apr 22, 2020

Amazon SageMaker customers can now select Inf1 instances when deploying their machine learning models for real-time inference. Amazon SageMaker is a fully managed service that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. Using Inf1 instances on Amazon SageMaker, customers can run large-scale machine learning and deep learning inference applications such as image recognition, speech recognition, natural language processing, personalization, forecasting, and fraud detection with high performance and significantly lower cost. 

Inf1 instances are built from the ground up to support machine learning inference applications and feature up to 16 AWS Inferentia chips, machine learning chips designed and built by AWS to optimize cost for deep learning inference. The Inferentia chips are coupled with the latest custom 2nd generation Intel® Xeon® Scalable processors and 100 Gbps networking to provide high performance and the lowest cost in the industry for ML inference applications. With 1 to 16 AWS Inferentia chips per instance, Inf1 instances can scale in performance up to 2,000 Tera Operations per Second (TOPS) and deliver up to 3x higher throughput and up to 45% lower cost per inference compared to AWS GPU-based instances. The large on-chip memory on the AWS Inferentia chips used in Inf1 instances allows caching of machine learning models directly on the chip, eliminating the need to access outside memory resources during inference and enabling low latency and high inference throughput. To learn more about Inf1 instances, visit the product pages.  

Inf1 instances in Amazon SageMaker are now available in the N. Virginia and Oregon AWS Regions in the US and come in four sizes: ml.inf1.xlarge, ml.inf1.2xlarge, ml.inf1.6xlarge, and ml.inf1.24xlarge. Machine learning models developed using the TensorFlow and MXNet frameworks can be deployed on Inf1 instances in Amazon SageMaker for real-time inference. To use Inf1 instances in Amazon SageMaker, compile your trained model using Amazon SageMaker Neo, then select an Inf1 instance type when deploying the compiled model on Amazon SageMaker.  
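As a rough illustration of that compile-then-deploy flow, the sketch below builds the request payloads you would pass to the boto3 SageMaker client's `create_compilation_job` (targeting AWS Inferentia via Neo) and `create_endpoint_config` (selecting an Inf1 instance type) calls. All job names, S3 URIs, the role ARN, and the example input shape are placeholders, not values from this announcement; the dicts are constructed locally and the actual AWS calls are only indicated in comments.

```python
# Hedged sketch: request payloads for compiling a TensorFlow model with
# SageMaker Neo for AWS Inferentia and deploying it on an Inf1 instance.
# Names, ARNs, S3 URIs, and the input shape below are placeholders.

def neo_compilation_job(job_name, role_arn, model_s3_uri, output_s3_uri):
    """Payload for sagemaker_client.create_compilation_job(**payload)."""
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,                            # trained model artifact
            "DataInputConfig": '{"input_1": [1, 224, 224, 3]}',  # example input shape
            "Framework": "TENSORFLOW",
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": "ml_inf1",  # compile for AWS Inferentia
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
    }


def inf1_endpoint_config(config_name, model_name):
    """Payload for sagemaker_client.create_endpoint_config(**payload)."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,      # model created from the compiled artifact
                "InitialInstanceCount": 1,
                "InstanceType": "ml.inf1.xlarge",  # smallest of the four Inf1 sizes
            }
        ],
    }
```

In practice you would pass these dicts to a `boto3.client("sagemaker")` instance and then call `create_endpoint` with the resulting endpoint config; the SageMaker Python SDK offers a higher-level equivalent via `Model.compile(...)` and `deploy(...)`.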

Visit the Amazon SageMaker developer guide for more information, and see the Amazon SageMaker examples on GitHub to learn more about how to deploy machine learning models on Inf1 instances in Amazon SageMaker.  

Modified 8/27/2021 – In an effort to ensure a great experience, expired links in this post have been updated or removed from the original post.