Posted On: Nov 18, 2016

You can now configure policies that automatically add (scale out) and terminate (scale in) nodes in your Amazon EMR cluster. Amazon EMR can programmatically scale out applications such as Apache Spark and Apache Hive to use additional nodes for increased performance, and scale in the number of nodes in your cluster to save costs when utilization is low. Your cluster can scale based on Amazon CloudWatch metrics provided by Amazon EMR, including YARN utilization metrics.
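As a rough illustration of such a policy, the sketch below builds one scale-out rule and one scale-in rule keyed to a YARN memory metric. The field names follow the Amazon EMR auto scaling policy structure as we understand it; the helper `build_scaling_policy`, the rule names, the thresholds, and the cooldown values are assumptions chosen for this example, not recommendations. The code only constructs and prints a dictionary and makes no AWS call.

```python
import json


def build_scaling_policy(min_nodes, max_nodes):
    """Return an auto scaling policy as a plain dict (hypothetical helper)."""
    if min_nodes > max_nodes:
        raise ValueError("min_nodes must not exceed max_nodes")

    def rule(name, comparison, threshold_pct, adjustment):
        return {
            "Name": name,
            "Action": {
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "CHANGE_IN_CAPACITY",
                    "ScalingAdjustment": adjustment,  # +N adds nodes, -N removes nodes
                    "CoolDown": 300,  # seconds between scaling activities (assumed value)
                }
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "Namespace": "AWS/ElasticMapReduce",
                    "MetricName": "YARNMemoryAvailablePercentage",
                    "ComparisonOperator": comparison,
                    "Threshold": threshold_pct,
                    "Statistic": "AVERAGE",
                    "Unit": "PERCENT",
                    "Period": 300,  # seconds per evaluation period (assumed value)
                    "EvaluationPeriods": 1,
                }
            },
        }

    return {
        "Constraints": {"MinCapacity": min_nodes, "MaxCapacity": max_nodes},
        "Rules": [
            # Little YARN memory left: add a node (threshold is an assumption).
            rule("ScaleOutOnLowMemory", "LESS_THAN", 15.0, 1),
            # Plenty of YARN memory free: remove a node (threshold is an assumption).
            rule("ScaleInOnHighMemory", "GREATER_THAN", 75.0, -1),
        ],
    }


policy = build_scaling_policy(min_nodes=2, max_nodes=10)
print(json.dumps(policy, indent=2))
```

A dictionary of this shape could then be supplied as the policy when attaching auto scaling to an instance group through the console, CLI, or SDK.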

Amazon EMR’s scale-down behavior is now configurable. Starting with release 5.1.0, when scaling in your cluster, Amazon EMR terminates nodes as they approach the instance-hour boundary for Amazon EC2 billing, regardless of task completion. To keep the previous default behavior, you can instead configure your cluster to wait for all running tasks on a node to complete before termination, regardless of proximity to the instance-hour boundary.
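The choice between the two behaviors can be sketched as a single setting on the cluster request. The two string values below are the scale-down behavior names used by the Amazon EMR API as we understand it; the helper `cluster_request`, the cluster name, and the reduced set of fields are assumptions for illustration, and a real request needs further parameters (instances, roles, and so on).

```python
# Scale-down behavior values as we understand them from the Amazon EMR API.
TERMINATE_AT_INSTANCE_HOUR = "TERMINATE_AT_INSTANCE_HOUR"
TERMINATE_AT_TASK_COMPLETION = "TERMINATE_AT_TASK_COMPLETION"


def cluster_request(name, release_label, wait_for_tasks):
    """Build a partial cluster request dict (hypothetical helper, no AWS call)."""
    behavior = (
        TERMINATE_AT_TASK_COMPLETION  # previous default: let running tasks finish
        if wait_for_tasks
        else TERMINATE_AT_INSTANCE_HOUR  # terminate near the instance-hour boundary
    )
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "ScaleDownBehavior": behavior,
    }


hourly = cluster_request("example-cluster", "emr-5.1.0", wait_for_tasks=False)
graceful = cluster_request("example-cluster", "emr-5.1.0", wait_for_tasks=True)
print(hourly["ScaleDownBehavior"])
print(graceful["ScaleDownBehavior"])
```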

You can create or modify auto scaling policies from the Amazon EMR console, the AWS Command Line Interface (CLI), or the AWS SDK with the Amazon EMR API. Auto Scaling can be enabled with Amazon EMR releases 4.x and 5.x, and scaling down at an hourly boundary is supported on release 5.1.0 and later. Please visit the Amazon EMR documentation for more information about Auto Scaling and scale-down behavior.
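The release requirements above can be expressed as a small check, for example before submitting a cluster request from a script. This is a sketch only: the helper names are hypothetical, the `emr-x.y.z` label format is assumed, and the rules encode just what this announcement states (releases after 5.x are not covered here).

```python
def parse_release(label):
    """Turn a label such as 'emr-5.1.0' into a tuple like (5, 1, 0)."""
    version = label[4:] if label.startswith("emr-") else label
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:  # pad short versions such as '5.1'
        parts.append(0)
    return tuple(parts[:3])


def supports_auto_scaling(label):
    # The announcement names releases 4.x and 5.x.
    return parse_release(label)[0] in (4, 5)


def supports_hourly_boundary_scale_down(label):
    # Supported on release 5.1.0 and later.
    return parse_release(label) >= (5, 1, 0)


for label in ("emr-4.8.2", "emr-5.0.3", "emr-5.1.0"):
    print(label, supports_auto_scaling(label), supports_hourly_boundary_scale_down(label))
```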