AWS Big Data Blog

Category: Monitoring and observability

Correlate telemetry data with Amazon OpenSearch Service and Amazon Managed Grafana

In this post, we show you how to use Amazon OpenSearch Service and Amazon Managed Grafana to correlate observability signals, which improves root cause analysis and reduces Mean Time to Resolution (MTTR). We also provide a reference solution that can be used at scale for proactive monitoring of enterprise applications to avoid problems before they occur.

How FINRA established real-time operational observability for Amazon EMR big data workloads on Amazon EC2 with Prometheus and Grafana

FINRA performs big data processing with large volumes of data and workloads with varying instance sizes and types on Amazon EMR. Amazon EMR is a cloud-based big data environment designed to process large amounts of data using open source tools such as Hadoop, Spark, HBase, Flink, Hudi, and Presto. In this post, we talk about our challenges and show how we built an observability framework to provide operational metrics insights for big data processing workloads on Amazon EMR on Amazon Elastic Compute Cloud (Amazon EC2) clusters.

Amazon EMR Serverless observability, Part 1: Monitor Amazon EMR Serverless workers in near real time using Amazon CloudWatch

We have launched job worker metrics in Amazon CloudWatch for EMR Serverless. This feature allows you to monitor vCPUs, memory, ephemeral storage, and disk I/O allocation and usage metrics at an aggregate worker level for your Spark and Hive jobs. This post is part of a series about EMR Serverless observability. In this post, we discuss how to use these CloudWatch metrics to monitor EMR Serverless workers in near real time.

Configure monitoring, limits, and alarms in Amazon Redshift Serverless to keep costs predictable

Amazon Redshift Serverless makes it simple to run and scale analytics in seconds. It automatically provisions and intelligently scales data warehouse compute capacity to deliver fast performance, and you pay only for what you use. Just load your data and start querying right away in the Amazon Redshift Query Editor or in your favorite business […]

Monitor Apache HBase on Amazon EMR using Amazon Managed Service for Prometheus and Amazon Managed Grafana

Amazon EMR provides a managed Apache Hadoop framework that makes it straightforward, fast, and cost-effective to run Apache HBase. Apache HBase is a massively scalable, distributed big data store in the Apache Hadoop ecosystem. It is an open-source, non-relational, versioned database that runs on top of the Apache Hadoop Distributed File System (HDFS). It’s built […]

Monitor AWS workloads without a single line of code with Logz.io and Kinesis Firehose

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Observability data provides near real-time insights into the health and performance of AWS workloads, so that engineers can quickly address production issues and troubleshoot them before widespread customer impact. As AWS workloads […]

Microservice observability with Amazon OpenSearch Service part 2: Create an operational panel and incident report

In the first post in our series, we discussed setting up a microservice observability architecture and application troubleshooting steps using log and trace correlation with Amazon OpenSearch Service. In this post, we discuss using PPL to create visualizations in operational panels, and creating a simple incident report using notebooks. To try out the solution […]

Stream Amazon EMR on EKS logs to third-party providers like Splunk, Amazon OpenSearch Service, or other log aggregators

Spark jobs running on Amazon EMR on EKS generate logs that are very useful in identifying issues with Spark processes and also as a way to see Spark outputs. You can access these logs from a variety of sources. On the Amazon EMR virtual cluster console, you can access logs from the Spark History UI. […]

Query and visualize Amazon Redshift operational metrics using the Amazon Redshift plugin for Grafana

Grafana is a rich interactive open-source tool by Grafana Labs for visualizing data across one or many data sources. It’s used in a variety of modern monitoring stacks, allowing you to have a common technical base and apply common monitoring practices across different systems. Amazon Managed Grafana is a fully managed, scalable, and secure Grafana-as-a-service […]