AWS Big Data Blog

Category: Management Tools

Amazon EMR Serverless observability, Part 1: Monitor Amazon EMR Serverless workers in near real time using Amazon CloudWatch

We have launched job worker metrics in Amazon CloudWatch for EMR Serverless. This feature allows you to monitor vCPUs, memory, ephemeral storage, and disk I/O allocation and usage metrics at an aggregate worker level for your Spark and Hive jobs. This post is part of a series about EMR Serverless observability. In this post, we discuss how to use these CloudWatch metrics to monitor EMR Serverless workers in near real time.

Create a customizable cross-company log lake for compliance, Part I: Business Background

As builders, sometimes you want to dissect a customer experience, find problems, and figure out ways to make it better. That means going a layer down to mix and match primitives together to get more comprehensive features and more customization, flexibility, and freedom. In this post, we introduce Log Lake, a do-it-yourself data lake based on logs from CloudWatch and AWS CloudTrail.

Deliver Amazon CloudWatch logs to Amazon OpenSearch Serverless

In this blog post, we show how to use Amazon OpenSearch Ingestion to deliver CloudWatch logs to OpenSearch Serverless in near real time. We outline a mechanism to connect a Lambda subscription filter with OpenSearch Ingestion and deliver logs to OpenSearch Serverless without needing a separate subscription filter for it.

Enhance monitoring and debugging for AWS Glue jobs using new job observability metrics, Part 2: Real-time monitoring using Grafana

Monitoring data pipelines in real time is critical for catching issues early and minimizing disruptions. AWS Glue has made this more straightforward with the launch of AWS Glue job observability metrics, which provide valuable insights into your data integration pipelines built on AWS Glue. However, you might need to track key performance indicators across multiple […]

Disaster recovery strategies for Amazon MWAA – Part 1

In the dynamic world of cloud computing, ensuring the resilience and availability of critical applications is paramount. Disaster recovery (DR) is the process by which an organization anticipates and addresses technology-related disasters. For organizations implementing critical workload orchestration using Amazon Managed Workflows for Apache Airflow (Amazon MWAA), it is crucial to have a DR plan […]

Enable metric-based and scheduled scaling for Amazon Managed Service for Apache Flink

Thousands of developers use Apache Flink to build streaming applications to transform and analyze data in real time. Apache Flink is an open source framework and engine for processing data streams. It’s highly available and scalable, delivering high throughput and low latency for the most demanding stream-processing applications. Monitoring and scaling your applications is critical […]

Monitor data pipelines in a serverless data lake

AWS serverless services, including but not limited to AWS Lambda, AWS Glue, AWS Fargate, Amazon EventBridge, Amazon Athena, Amazon Simple Notification Service (Amazon SNS), Amazon Simple Queue Service (Amazon SQS), and Amazon Simple Storage Service (Amazon S3), have become the building blocks for any serverless data lake, providing key mechanisms to ingest and transform data […]

Centralize near-real-time governance through alerts on Amazon Redshift data warehouses for sensitive queries

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud that delivers powerful and secure insights on all your data with the best price-performance. With Amazon Redshift, you can analyze your data to derive holistic insights about your business and your customers. In many organizations, one or multiple Amazon Redshift data warehouses […]

Simplify AWS Glue job orchestration and monitoring with Amazon MWAA

Organizations across all industries have complex data processing requirements for their analytical use cases across different analytics systems, such as data lakes on AWS, data warehouses (Amazon Redshift), search (Amazon OpenSearch Service), NoSQL (Amazon DynamoDB), machine learning (Amazon SageMaker), and more. Analytics professionals are tasked with deriving value from data stored in these distributed systems […]