AWS Machine Learning Blog

Category: HAQM Elastic Inference

Reduce inference costs on HAQM EC2 for PyTorch models with HAQM Elastic Inference

Note: HAQM Elastic Inference is no longer available. Please see HAQM SageMaker for similar capabilities. You can now use HAQM Elastic Inference to accelerate inference and reduce inference costs for PyTorch models in both HAQM SageMaker and HAQM EC2. PyTorch is a popular deep learning framework that uses dynamic computational graphs. This allows you to […]

Increasing performance and reducing the cost of MXNet inference using HAQM SageMaker Neo and HAQM Elastic Inference

When running deep learning models in production, balancing infrastructure cost versus model latency is always an important consideration. At re:Invent 2018, AWS introduced HAQM SageMaker Neo and HAQM Elastic Inference, two services that can make models more efficient for deep […]

Reduce ML inference costs on HAQM SageMaker for PyTorch models using HAQM Elastic Inference

Today, we are excited to announce that you can now use HAQM Elastic Inference to accelerate inference and reduce inference costs for PyTorch models in both HAQM SageMaker and HAQM EC2. PyTorch is a popular deep learning framework that uses […]

Optimizing TensorFlow model serving with Kubernetes and HAQM Elastic Inference

This post offers a deep dive into how to use HAQM Elastic Inference with HAQM Elastic Kubernetes Service. When you combine Elastic Inference with EKS, you can run low-cost, scalable inference workloads with your preferred container orchestration system. Elastic Inference […]

Serving deep learning at Curalate with Apache MXNet, AWS Lambda, and HAQM Elastic Inference

This is a guest blog post by Jesse Brizzi, a computer vision research engineer at Curalate. At Curalate, we’re always coming up with new ways to use deep learning and computer vision to find and leverage user-generated content (UGC) and […]

Optimizing costs in HAQM Elastic Inference with TensorFlow

HAQM Elastic Inference allows you to attach low-cost GPU-powered acceleration to HAQM EC2 and HAQM SageMaker instances, and reduce the cost of running deep learning inference by up to 75 percent. The EIPredictor API makes it easy to use Elastic Inference. In this post, […]
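The "up to 75 percent" figure comes from replacing a standalone GPU instance with a cheaper CPU instance plus an attached accelerator sized to the model's actual inference needs. A minimal sketch of that cost arithmetic, using placeholder hourly rates rather than real AWS pricing:

```python
# Illustrative cost comparison: standalone GPU instance vs.
# CPU instance + Elastic Inference accelerator.
# All hourly rates below are PLACEHOLDERS, not real AWS pricing.

def savings_fraction(gpu_rate: float, cpu_rate: float, accel_rate: float) -> float:
    """Fraction of hourly cost saved by swapping a standalone GPU
    instance for a CPU instance with an attached accelerator."""
    combined = cpu_rate + accel_rate
    return (gpu_rate - combined) / gpu_rate

# Hypothetical rates (USD/hour):
gpu = 3.06    # standalone GPU instance
cpu = 0.17    # general-purpose CPU instance
accel = 0.60  # attached accelerator

print(f"savings: {savings_fraction(gpu, cpu, accel):.0%}")  # prints "savings: 75%"
```

The point of the arithmetic is that inference rarely saturates a full GPU, so paying only for the accelerator capacity a model needs is where the savings come from.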

Running Java-based deep learning with MXNet and HAQM Elastic Inference

The new release of MXNet 1.4 for HAQM Elastic Inference now includes Java and Scala support. Apache MXNet is an open source deep learning framework used to build, train, and deploy deep neural networks. HAQM Elastic Inference (EI) is a […]

Launch EI accelerators in minutes with the HAQM Elastic Inference setup tool for EC2

The HAQM Elastic Inference (EI) setup tool is a Python script that enables you to quickly get started with EI. Elastic Inference allows you to attach low-cost GPU-powered acceleration to HAQM EC2 and HAQM SageMaker instances to reduce the cost of running […]

Reducing deep learning inference cost with MXNet and HAQM Elastic Inference

HAQM Elastic Inference (HAQM EI) is a service that allows you to attach low-cost GPU-powered acceleration to HAQM EC2 and HAQM SageMaker instances. MXNet has supported HAQM EI since its initial release at AWS re:Invent 2018. In this blog post, […]

Model serving with HAQM Elastic Inference

HAQM Elastic Inference (EI) is a service that allows you to attach low-cost GPU-powered acceleration to HAQM EC2 and HAQM SageMaker instances. EI reduces the cost of running deep learning inference by up to 75%. Model Server for Apache MXNet […]