AWS Cloud Operations Blog
Category: Amazon Managed Service for Prometheus
Introducing vended logs for Amazon Managed Service for Prometheus
Customers are using Amazon Managed Service for Prometheus to monitor and alert on their container metrics. Amazon Managed Service for Prometheus ships with alert manager, the open source alert routing component in Prometheus. Alert manager routes alerts to Amazon Simple Notification Service (Amazon SNS). However, there are some common reasons why alert manager may fail […]
Monitoring Windows desktops on Amazon WorkSpaces using Amazon Managed Service for Prometheus and Amazon Managed Grafana
Many organizations use Amazon WorkSpaces as a cloud-based virtual Windows desktop-as-a-service (DaaS) solution to replace their traditional desktops, shifting the cost and effort of maintaining laptops and desktops to a cloud pay-as-you-go model. Customers using Amazon WorkSpaces need managed services to monitor the operations of their WorkSpaces environment. […]
Viewing Amazon CloudWatch metrics with Amazon Managed Service for Prometheus and Amazon Managed Grafana
Monitoring the AWS services that make up a customer workload with Amazon CloudWatch is important for the resiliency of that workload. Customers can bring their CloudWatch data alongside their existing Prometheus data sources to improve their ability to join or query across both for a holistic view of their systems. Amazon Managed Service for Prometheus is a serverless […]
Auto-scaling Amazon EC2 using Amazon Managed Service for Prometheus and alert manager
Customers want to migrate their existing Prometheus workloads to the cloud and utilize all that the cloud offers. AWS has services like Amazon EC2 Auto Scaling, which lets you scale out Amazon Elastic Compute Cloud (Amazon EC2) instances based on metrics like CPU or memory utilization. Applications that use Prometheus metrics can easily integrate into […]
Viewing custom metrics from statsd with Amazon Managed Service for Prometheus and Amazon Managed Grafana
Monitoring applications based on custom metrics is important for a resilient system. One of the mechanisms to generate custom metrics from applications is statsd, a Node.js process that periodically collects custom application performance metrics. However, statsd doesn’t provide long-term storage, rich querying, visualization, or an alerting solution. Amazon Managed Service for Prometheus and Amazon […]
Viewing collectd statistics with Amazon Managed Service for Prometheus and Amazon Managed Service for Grafana
Monitoring systems are essential for a resilient solution. A popular tool to monitor Linux-based physical or virtual machines is collectd, a daemon that periodically collects system and application performance metrics. However, collectd doesn’t provide long-term storage for metrics, rich querying, visualization, or an alerting solution. Amazon Managed Service for Prometheus is a serverless […]
Introducing Amazon EKS Observability Accelerator
Some of the details in this blog post are now outdated. For the latest information on the AWS Observability Accelerator, please see Announcing AWS Observability Accelerator to configure comprehensive observability for Amazon EKS. Also explore the GitHub repository where you can find more details on how to get started. Observability is critical for any application […]
Monitor Istio on EKS using Amazon Managed Prometheus and Amazon Managed Grafana
Service meshes are an integral part of the Kubernetes environment, enabling secure, reliable, and observable communication. Istio is an open-source service mesh that provides advanced network features without requiring any changes to the application code. These capabilities include service-to-service authentication, monitoring, and more. Istio generates detailed telemetry for all service communications within a mesh. This telemetry […]
Introducing vended metrics for Amazon Managed Service for Prometheus
Today, I’m happy to announce that Amazon Managed Service for Prometheus now vends usage metrics to Amazon CloudWatch. These metrics can be used to help you gain better visibility into your Amazon Managed Service for Prometheus workspace. Let’s dive in to see how you could use these new Prometheus usage metrics in CloudWatch. I’ve set […]
Monitoring Amazon EMR on EKS with Amazon Managed Prometheus and Amazon Managed Grafana
Apache Spark is an open-source, lightning-fast cluster computing framework built for distributed data processing. Combined with the cloud, Spark delivers high performance for both batch and real-time data processing at petabyte scale. Spark on Kubernetes is supported from Spark 2.3 onwards, and it has gained a lot of traction among enterprises for high performance and […]