AWS Cloud Operations Blog
Category: Amazon CloudWatch
Observe your Azure and AWS workloads simultaneously with Amazon CloudWatch
Effective operation of cloud applications and services demands a strong focus on monitoring and observability. It’s critical for your teams to define, capture, and analyze metrics, ensuring operational visibility and extracting actionable insights from logs. In many companies, technical teams share integrated systems to monitor the services or infrastructure they manage. Shared observability systems […]
What’s new in AWS Observability at re:Invent 2023
Let’s recap the week at AWS re:Invent 2023 with a round-up of the AWS Observability launches across Amazon CloudWatch, Amazon Managed Grafana, and Amazon Managed Service for Prometheus. From automatic instrumentation and operation of applications in CloudWatch, to agentless scraping of Prometheus metrics in Managed Service for Prometheus, read on to learn about the features […]
Four APM features to elevate your observability experience
Application performance monitoring (or APM) is the practice of taking key application performance indicators to ensure system availability, improve system performance, and improve the end-user experience. This week we announced Amazon CloudWatch Application Signals, a new set of features built into Amazon CloudWatch to help you speed up troubleshooting, reduce application disruptions, and operational costs, […]
Leverage generative AI to create custom dashboard widgets in Amazon CloudWatch using Amazon CodeWhisperer
Observability describes how well you can understand what is happening in a system, often by instrumenting it to collect metrics, logs, and traces. To achieve operational excellence and meet business objectives, you need to understand how your systems are performing. To accomplish this, many customers use Amazon CloudWatch to get real-time monitoring, alerts […]
Analyzing Amazon Lex conversation log data with Amazon Managed Grafana
To support business and internal processes, organizations are increasing their use of conversational interfaces. They offer opportunities for more availability, improved service levels, and reduced costs. As these conversational services become more important, so does the need to monitor the performance and effectiveness of these interfaces with analytics and dashboards. This analysis is used to drive […]
Monitoring GPU workloads on Amazon EKS using AWS managed open-source services
As machine learning (ML) workloads continue to grow in popularity, many customers are looking to run them on Kubernetes with graphics processing unit (GPU) support. Amazon Elastic Compute Cloud (Amazon EC2) instances powered by NVIDIA GPUs deliver the scalable performance needed for fast ML training and cost-effective ML inference. Monitoring GPU utilization gives valuable information for researchers working […]
Announcing Amazon CloudWatch Container Insights with Enhanced Observability for Amazon EKS on EC2
Amazon CloudWatch Container Insights is a fully managed monitoring and observability service that provides DevOps engineers, developers, SREs, and IT managers with out-of-the-box visibility into their containerized applications and microservice environments. With Amazon CloudWatch Container Insights, you can monitor, isolate, and diagnose issues in your Kubernetes clusters with minimal effort. It delivers infrastructure telemetry like […]
Lowering MTTR with Amazon CloudWatch and AWS X-Ray
Customers running microservice-based workloads in a serverless environment frequently have issues with troubleshooting incidents as the data they need can be distributed across hundreds or thousands of components. In this blog post, I will demonstrate how you can reduce the mean time to resolution (MTTR, or the average time it takes to repair or mitigate […]
Observe dynamic sites with Amazon CloudWatch Synthetics and AWS Systems Manager Parameter Store
Maintaining and improving the end-user experience is key, and as your business grows, the number of endpoints you need to observe can grow quickly. It can become more challenging and time-consuming to build multiple canaries to observe them. This solution is designed to show how you can use a consistent and automated approach […]
Observability using native Amazon CloudWatch and AWS X-Ray for serverless modern applications
In this blog post, we will share how you can use AWS-native observability tools to measure the current state of your modern serverless applications and how to get started with minimal effort. We will review tools like Amazon CloudWatch and AWS X-Ray and explore how these services can help you instrument your application […]