AWS Big Data Blog

Category: Amazon CloudWatch

Amazon EMR Serverless observability, Part 1: Monitor Amazon EMR Serverless workers in near real time using Amazon CloudWatch

We have launched job worker metrics in Amazon CloudWatch for EMR Serverless. This feature allows you to monitor vCPUs, memory, ephemeral storage, and disk I/O allocation and usage metrics at an aggregate worker level for your Spark and Hive jobs. This post is part of a series about EMR Serverless observability. In this post, we discuss how to use these CloudWatch metrics to monitor EMR Serverless workers in near real time.

Create a customizable cross-company log lake for compliance, Part I: Business Background

As builders, sometimes you want to dissect a customer experience, find problems, and figure out ways to make it better. That means going a layer down to mix and match primitives together to get more comprehensive features and more customization, flexibility, and freedom. In this post, we introduce Log Lake, a do-it-yourself data lake based on logs from CloudWatch and AWS CloudTrail.

Deliver Amazon CloudWatch logs to Amazon OpenSearch Serverless

In this blog post, we show how to use Amazon OpenSearch Ingestion to deliver CloudWatch logs to OpenSearch Serverless in near real time. We outline a mechanism to connect a Lambda subscription filter with OpenSearch Ingestion and deliver logs to OpenSearch Serverless without explicitly needing a separate subscription filter for it.

Disaster recovery strategies for Amazon MWAA – Part 1

In the dynamic world of cloud computing, ensuring the resilience and availability of critical applications is paramount. Disaster recovery (DR) is the process by which an organization anticipates and addresses technology-related disasters. For organizations implementing critical workload orchestration using Amazon Managed Workflows for Apache Airflow (Amazon MWAA), it is crucial to have a DR plan […]

Enable metric-based and scheduled scaling for Amazon Managed Service for Apache Flink

Thousands of developers use Apache Flink to build streaming applications to transform and analyze data in real time. Apache Flink is an open source framework and engine for processing data streams. It’s highly available and scalable, delivering high throughput and low latency for the most demanding stream-processing applications. Monitoring and scaling your applications is critical […]

Monitor data pipelines in a serverless data lake

AWS serverless services, including but not limited to AWS Lambda, AWS Glue, AWS Fargate, Amazon EventBridge, Amazon Athena, Amazon Simple Notification Service (Amazon SNS), Amazon Simple Queue Service (Amazon SQS), and Amazon Simple Storage Service (Amazon S3), have become the building blocks for any serverless data lake, providing key mechanisms to ingest and transform data […]

Centralize near-real-time governance through alerts on Amazon Redshift data warehouses for sensitive queries

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud that delivers powerful and secure insights on all your data with the best price-performance. With Amazon Redshift, you can analyze your data to derive holistic insights about your business and your customers. In many organizations, one or multiple Amazon Redshift data warehouses […]

Push Amazon EMR step logs from Amazon EC2 instances to Amazon CloudWatch Logs

Amazon EMR is a big data service offered by AWS to run Apache Spark and other open-source applications on AWS to build scalable data pipelines in a cost-effective manner. Monitoring the logs generated from the jobs deployed on EMR clusters is essential to help detect critical issues in real time and identify root causes quickly. […]

Monitor AWS workloads without a single line of code with Logz.io and Kinesis Firehose

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Observability data provides near real-time insights into the health and performance of AWS workloads, so that engineers can quickly address production issues and troubleshoot them before widespread customer impact. As AWS workloads […]