Posted On: Jan 18, 2019
Amazon Elastic Inference is a service that lets you attach accelerators to any Amazon SageMaker or Amazon EC2 instance type to speed up deep learning inference workloads. Elastic Inference accelerators give you the low-latency, high-throughput benefits of GPU acceleration at a much lower cost (up to 75% savings). You can use Elastic Inference to deploy TensorFlow, Apache MXNet, and ONNX models for inference.
Amazon Elastic Inference now supports TensorFlow 1.12, the latest version. It provides EIPredictor, a new, easy-to-use Python API for deploying TensorFlow models with Amazon Elastic Inference accelerators. EIPredictor makes it easy to experiment and to compare performance with and without Amazon Elastic Inference. To learn more about running TensorFlow models using Amazon Elastic Inference, see this blog post.
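As a rough illustration, a call with EIPredictor might look like the following sketch. The import path reflects the ei_for_tf package bundled with the Elastic Inference-enabled TensorFlow build; the model directory, input key, and use_ei toggle shown here are assumptions for illustration, so check the blog post and documentation for the exact interface.

```python
import numpy as np
# Assumed import path from the Elastic Inference-enabled TensorFlow package.
from ei_for_tf.python.predictor.ei_predictor import EIPredictor

# Load a TensorFlow SavedModel; the model directory is a hypothetical path.
eia_predictor = EIPredictor(
    model_dir='/tmp/my_saved_model/1/',
    # Assumed flag: flip to False to benchmark the same model without an
    # Elastic Inference accelerator and compare latency and throughput.
    use_ei=True,
)

# Feed-dict keys must match the SavedModel signature's input names;
# 'inputs' and the image shape below are illustrative.
batch = np.random.rand(1, 224, 224, 3).astype(np.float32)
predictions = eia_predictor({'inputs': batch})
```

Because the predictor is just a callable, you can construct one instance with acceleration enabled and one without, then time repeated calls on the same inputs to measure the speedup.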
To learn more about Amazon Elastic Inference, visit the web page and the documentation user guide.