Posted On: Mar 18, 2020

You can now use HAQM Elastic Inference to accelerate inference and reduce inference costs for PyTorch models in HAQM SageMaker, HAQM EC2, and HAQM ECS. Enhanced PyTorch libraries for Elastic Inference are available automatically in HAQM SageMaker, the AWS Deep Learning AMIs, and AWS Deep Learning Containers, so you can deploy your PyTorch models in production with minimal code changes. Elastic Inference supports TorchScript-compiled models on PyTorch: to use Elastic Inference with PyTorch, you convert your model to TorchScript and run inference through the Elastic Inference API. With this launch, PyTorch joins TensorFlow and Apache MXNet as deep learning frameworks supported by Elastic Inference.
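
As a minimal sketch of that workflow, the example below traces a model into TorchScript and then runs it through an attached accelerator. It assumes the EI-enabled PyTorch build described in the developer guide, in which torch.jit.optimized_execution accepts a target_device argument (stock PyTorch takes only a boolean); the model choice, file name, and eia:0 device ID are illustrative.

```python
import torch
import torchvision

# Load a pretrained model and switch it to inference mode.
model = torchvision.models.resnet18(pretrained=True)
model.eval()

# Convert to TorchScript by tracing with a representative input.
example_input = torch.rand(1, 3, 224, 224)
traced_model = torch.jit.trace(model, example_input)
traced_model.save("resnet18_traced.pt")

# On an instance with an Elastic Inference accelerator attached, load
# the TorchScript model and route execution to the accelerator. The
# target_device argument is specific to the EI-enabled PyTorch build
# (an assumption here; see the developer guide for the exact API).
loaded_model = torch.jit.load("resnet18_traced.pt")
with torch.jit.optimized_execution(True, {"target_device": "eia:0"}):
    output = loaded_model(example_input)
```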

Elastic Inference allows you to attach just the right amount of GPU-powered acceleration to any HAQM SageMaker instance, EC2 instance, or ECS task to reduce the cost of running deep learning inference by up to 75%.
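
For instance, in HAQM SageMaker an accelerator is attached at deployment time via the SageMaker Python SDK's accelerator_type parameter, as in the sketch below. The S3 model path, IAM role, entry-point script, and accelerator size are placeholders, and framework_version should match a PyTorch version supported by Elastic Inference.

```python
from sagemaker.pytorch import PyTorchModel

# Wrap a trained TorchScript model artifact for SageMaker hosting.
pytorch_model = PyTorchModel(
    model_data="s3://my-bucket/model.tar.gz",  # hypothetical artifact path
    role="MySageMakerRole",                    # hypothetical IAM role
    entry_point="inference.py",                # hypothetical inference script
    framework_version="1.3.1",
)

# accelerator_type attaches an Elastic Inference accelerator to the
# endpoint's CPU instance, so you pay for GPU acceleration only in the
# amount you provision.
predictor = pytorch_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    accelerator_type="ml.eia2.medium",
)
```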

Elastic Inference support for PyTorch is available in all regions where HAQM Elastic Inference is offered. For more information, see Using PyTorch Models with Elastic Inference in the developer guide and our blog post, “Reduce ML inference costs on HAQM SageMaker for PyTorch models using HAQM Elastic Inference”.