AWS Cloud Operations Blog

Tag: Infrastructure Monitoring

Automate insights for your EC2 fleets across AWS accounts and regions

Gaining insights into and managing a large Amazon Elastic Compute Cloud (Amazon EC2) fleet that is spread across multiple accounts and regions can be challenging. It’s crucial to have a quick and efficient method to identify which instances are managed by AWS Systems Manager (SSM) and gather detailed information about the instances that are […]

Monitor your Lambda function and get notified with AWS Chatbot

AWS Lambda is a serverless compute service that helps you run code without provisioning or managing hardware. You can run an AWS Lambda function to execute code in response to triggers such as changes in data or system state. For example, you can use Amazon S3 to trigger AWS Lambda to process data immediately after […]

Monitoring your EC2 server fleet with advanced CloudWatch agent capabilities

Customers who are running fleets of Amazon Elastic Compute Cloud (Amazon EC2) instances use advanced monitoring techniques to observe their operational performance. Capabilities like aggregated and custom dimensions help customers categorize and customize their metrics across server fleets for fast and efficient decision making. Customers need visibility not only into infrastructure metrics (like CPU and […]

Manage Amazon EC2 instance clock accuracy using Amazon Time Sync Service and Amazon CloudWatch – Part 2

In Part 1 of this series, I covered important concepts about measuring the accuracy of time on Amazon EC2 instances. I discussed calculating ClockErrorBound (ε) and using its value as a range between which system time is accurate. In this part, I walk through the process of using Amazon CloudWatch to measure and monitor […]

Manage Amazon EC2 instance clock accuracy using Amazon Time Sync Service and Amazon CloudWatch – Part 1

This two-part series discusses the measurement and management of time accuracy on Amazon EC2 instances. Part 1 covers the important concepts related to system and reference time. Part 2 covers the mechanisms to measure, monitor, and maintain accurate system time on EC2 instances. A large and diverse set of customer workloads depends on the observed […]