Posted On: May 22, 2020

AWS has updated Real-Time Analytics with Spark Streaming, an AWS Solution that automatically deploys a highly available, cost-effective batch and real-time data analytics architecture on the AWS Cloud using Apache Spark Streaming and Amazon Kinesis. The solution is designed to support custom Apache Spark Streaming applications and uses Amazon EMR to process vast amounts of data across dynamically scalable Amazon Elastic Compute Cloud (Amazon EC2) instances.

The solution now includes an updated consumer application that uses the latest version of Spark and modern features such as Spark SQL and DataFrames, granular custom IAM policies, encryption at rest (enabled by default), VPC flow logs, sample Spark Streaming applications ported from Scala to Java, and several maintenance upgrades, such as updating Python to version 3.8 and Amazon EMR to version 5.29.0. To learn more about Real-Time Analytics with Spark Streaming on AWS, see the solution webpage.

Additional AWS Solutions offerings are available on the AWS Solutions webpage, where customers can browse solutions by product category or industry to find AWS-vetted, automated, turnkey reference implementations that address specific business needs.