AWS Big Data Blog
Tag: Amazon Cloudwatch
Deliver Amazon CloudWatch logs to Amazon OpenSearch Serverless
In this blog post, we show how to use Amazon OpenSearch Ingestion to deliver CloudWatch logs to OpenSearch Serverless in near real time. We outline a mechanism that connects a Lambda subscription filter to OpenSearch Ingestion and delivers logs to OpenSearch Serverless without needing a separate subscription filter for it.
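As a rough illustration of the Lambda side of such a pipeline, the sketch below decodes the payload that a CloudWatch Logs subscription filter hands to a Lambda function (base64-encoded, gzip-compressed JSON under `awslogs.data`). The function name `decode_subscription_event` and the shape of the returned records are assumptions made for this example and are not taken from the post; forwarding the records to an OpenSearch Ingestion pipeline is deliberately left out.

```python
import base64
import gzip
import json


def decode_subscription_event(event: dict) -> list[dict]:
    """Return the log events carried by a CloudWatch Logs subscription event.

    Hypothetical helper: the output record layout is an assumption for
    illustration, not the format used in the original post.
    """
    # The subscription payload is base64-encoded, gzip-compressed JSON.
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))

    # CONTROL_MESSAGE payloads only check reachability and carry no logs.
    if payload.get("messageType") != "DATA_MESSAGE":
        return []

    return [
        {
            "log_group": payload["logGroup"],
            "log_stream": payload["logStream"],
            "timestamp": log_event["timestamp"],
            "message": log_event["message"],
        }
        for log_event in payload["logEvents"]
    ]
```

A Lambda handler could call this helper on its incoming `event` and then send the resulting records to whatever ingestion endpoint the pipeline exposes.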
Monitor Spark streaming applications on Amazon EMR
This post demonstrates how to implement a simple SparkListener, monitor and observe Spark streaming applications, and set up some alerts. The post also shows how to use alerts to set up automatic scaling on Amazon EMR clusters, based on your CloudWatch custom metrics.
Optimize Amazon EMR costs with idle checks and automatic resource termination using advanced Amazon CloudWatch metrics and AWS Lambda
Many customers use Amazon EMR to run big data workloads, such as Apache Spark and Apache Hive queries, in their development environment. Data analysts and data scientists frequently use these types of clusters, known as analytics EMR clusters. Users often forget to terminate the clusters after their work is done. This leads to idle running […]
Build and automate a serverless data lake using an AWS Glue trigger for the Data Catalog and ETL jobs
September 2022: This post was reviewed and updated with latest screenshots and instructions. Today, data is flowing from everywhere, whether it is unstructured data from resources like IoT sensors, application logs, and clickstreams, or structured data from transaction applications, relational databases, and spreadsheets. Data has become a crucial part of every business. This has resulted […]
Improve the Operational Efficiency of Amazon Elasticsearch Service Domains with Automated Alarms Using Amazon CloudWatch
A customer has been successfully creating and running multiple Amazon Elasticsearch Service (Amazon ES) domains to support their business users’ search needs across products, orders, support documentation, and a growing suite of similar needs. The service has become heavily used across the organization. This led to some domains running at 100% capacity during peak times, while others began to run low on storage space. Because of this increased usage, the technical teams were in danger of missing their service level agreements. They contacted me for help.
This post shows how you can set up automated alarms to warn when domains need attention.
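To give a sense of what one such alarm might look like, here is a minimal sketch that builds the keyword arguments for CloudWatch's `put_metric_alarm` call, targeting the `FreeStorageSpace` metric in the `AWS/ES` namespace. The helper name `low_storage_alarm`, the alarm naming scheme, the period, and the threshold are all assumptions chosen for illustration; the original post may use different metrics and values.

```python
def low_storage_alarm(
    domain_name: str,
    account_id: str,
    threshold_mb: float,
    sns_topic_arn: str,
) -> dict:
    """Build put_metric_alarm parameters for a low-free-storage alarm.

    Hypothetical helper: the alarm name, period, and threshold are
    illustrative assumptions, not values from the original post.
    """
    return {
        "AlarmName": f"{domain_name}-free-storage-low",
        "Namespace": "AWS/ES",
        "MetricName": "FreeStorageSpace",  # reported in megabytes
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},
        ],
        "Statistic": "Minimum",
        "Period": 60,  # seconds
        "EvaluationPeriods": 1,
        "Threshold": threshold_mb,
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # notify this SNS topic on alarm
    }
```

With boto3 installed and credentials configured, the result could be passed along as `boto3.client("cloudwatch").put_metric_alarm(**params)`; building the parameters separately keeps the alarm definition easy to review and reuse across domains.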
Dynamically Create Friendly URLs for Your Amazon EMR Web Interfaces
This solution provides a serverless approach to automatically assigning a friendly name for your EMR cluster for easy access to popular notebooks and other web interfaces.
Visualize and Monitor Amazon EC2 Events with Amazon CloudWatch Events and Amazon Kinesis Firehose
February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. Monitoring your AWS environment is important for security, performance, and cost control purposes. For example, by monitoring […]
Respond to State Changes on Amazon EMR Clusters with Amazon CloudWatch Events
Jonathan Fritz is a Senior Product Manager for Amazon EMR. Customers can take advantage of the Amazon EMR API to create and terminate EMR clusters, scale clusters using Auto Scaling or manual resizing, and submit and run Apache Spark, Apache Hive, or Apache Pig workloads. These decisions are often triggered from cluster state-related information. Previously, […]
Analyze a Time Series in Real Time with AWS Lambda, Amazon Kinesis and Amazon DynamoDB Streams
This is a guest post by Richard Freeman, Ph.D., a solutions architect and data scientist at JustGiving. JustGiving in their own words: “We are one of the world’s largest social platforms for giving that’s helped 26.1 million registered users in 196 countries raise $3.8 billion for over 27,000 good causes.” Introduction As more devices, sensors […]