AWS Cloud Operations Blog
Category: Monitoring and observability
Using Amazon CloudWatch with Amazon EventBridge for cross-account event monitoring
We often talk about event-driven architectures, where an event is something that happens within your application or architecture. It could be a new file received by your application or an alert triggered by high CPU utilization. We can act on these events by scanning the file contents or scaling out more […]
Automating the installation and configuration of Prometheus using Systems Manager documents
As organizations migrate workloads to the cloud, they want to ensure their teams spend more time on tasks that move the organization forward and less time managing infrastructure. Installing patches and configuring software is what AWS calls undifferentiated heavy lifting, or the hard IT work that doesn’t add value to the mission of the organization. […]
Collecting Apache Flink metrics in the Amazon CloudWatch agent
Apache Flink is a distributed stream processing engine. You can run Flink on Amazon EMR as a YARN application. You can view Flink metrics through its web UI, but what if you want to react to them? In this blog post, I’ll show you how to use the CloudWatch agent to collect Flink metrics into […]
Use AWS CloudWatch Contributor Insights to monitor CIS AWS Foundations Benchmark controls
Contributor Insights is a feature of AWS CloudWatch that analyzes log data to create time series that display contributor data. This helps you understand who or what is impacting your system and application performance by identifying top talkers, pinpointing outliers, finding the heaviest traffic patterns, and ranking the top system […]
Introducing CloudWatch Resource Health to monitor your EC2 hosts
Today, AWS announced Amazon CloudWatch Resource Health, a fully managed solution that customers can use to automatically discover, manage, and visualize the health and performance of Amazon Elastic Compute Cloud (Amazon EC2) hosts across their applications. Resource Health provides a centralized view of your EC2 hosts by performance dimensions such as CPU or memory utilization. […]
Using VPC endpoints for AWS X-Ray
Today, AWS X-Ray announces the general availability of VPC endpoint support, which makes it possible for you to establish a private connection between your VPC and AWS X-Ray. Applications running in your VPC can now communicate with AWS X-Ray to send trace data without going through the public internet. In this post, I will show […]
Monitoring your EC2 server fleet with advanced CloudWatch agent capabilities
Customers who are running fleets of Amazon Elastic Compute Cloud (Amazon EC2) instances use advanced monitoring techniques to observe their operational performance. Capabilities like aggregated and custom dimensions help customers categorize and customize their metrics across server fleets for fast and efficient decision making. Customers need visibility not only into infrastructure metrics (like CPU and […]
How Wealthfront utilizes AWS X-Ray to analyze and debug distributed applications
This blog post was written by Harichandan Pulagam, a Data Engineer at Wealthfront. In this blog post, we describe how Wealthfront used AWS X-Ray to streamline the development and operations of a distributed application. About Wealthfront: Wealthfront’s mission is to build a financial system that favors people, not institutions. They strive to provide better experiences […]
Delete Amazon CloudWatch Synthetics dependent resources when you delete a CloudFormation stack
Amazon CloudWatch Synthetics allows you to monitor application endpoints more easily. It runs tests on your endpoints every minute, and alerts you if your application endpoints don’t behave as expected. These tests can be customized to check for availability, latency, transactions, broken or dead links, page load errors, load latencies for UI assets, complex wizard […]
Sending CloudFront standard logs to CloudWatch Logs for analysis
Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data, videos, applications, and APIs to customers globally with low latency and high transfer speeds, all within a developer-friendly environment. CloudFront standard logs (also known as access logs) give you visibility into requests that are made to a CloudFront distribution. The logs can […]