AWS Cloud Operations Blog
Category: Intermediate (200)
How BT uses Amazon CloudWatch to monitor millions of devices
In this guest post, Ciaran Kearney, Data Engineer at multinational telecommunications company BT, discusses how BT built a monitoring solution using Amazon CloudWatch dashboards, composite alarms, and embedded metric format to support the monitoring of millions of devices. Customers with high-cardinality monitoring use cases often face challenges when it comes to implementing observability. Monitoring high-cardinality workloads […]
Create immutable servers using EC2 Image Builder and AWS CodePipeline
When you run an application on multiple Amazon Elastic Compute Cloud (Amazon EC2) instances, you want to avoid differences between the instances because they can cause unpredictable behavior and make it hard to troubleshoot and solve issues. The best way to prevent differences is to replace your instances whenever you want to make a change—to […]
Viewing permission issues with service-linked roles
Each AWS service requires explicit access to resources, endpoints, and objects that reside in the domain of another service. This is referred to as the permission boundary. Services like AWS Config, Amazon Macie, and Amazon GuardDuty require an AWS Identity and Access Management (IAM) role that grants access to resources outside of their control. Understanding […]
Deliver ML-powered operational insights to your on-call teams via PagerDuty with Amazon DevOps Guru
Amazon DevOps Guru, now in preview, is an ML-powered cloud operations service that assists you in improving application availability. It’s easy to set up and use, and leverages machine learning models informed by years of operational expertise in building, scaling, and maintaining highly available applications at Amazon.com. DevOps Guru continuously analyzes streams of disparate data […]
Manage your Amazon EC2 macOS instances with AWS Systems Manager
Are you using macOS for developing, building, testing, and signing applications for Apple devices? To the thriving community of millions of developers worldwide building applications on Apple platforms, we at AWS bring you the first-ever macOS-based compute environments in the public cloud. Yes, you read that right! You can now run macOS […]
How to aggregate and visualize AWS Health events using AWS Organizations and Amazon Elasticsearch Service
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. In this post, I show you how to aggregate AWS Health events centrally from all accounts in your organization using AWS Organizations, AWS Lambda, and AWS Health API, and then build automation to ingest and visualize the operations data using […]
Cross-Region application monitoring using Amazon CloudWatch Synthetics and AWS CloudFormation
Customers need a way to find problems with their application before the real end users encounter them. They need to predict how their application will perform in supported geographies and isolate the root cause of any detected bottlenecks. Synthetic monitoring allows customers to emulate business processes or user transactions from different geographies and monitor their […]
Build a scheduler as a service with Amazon CloudWatch Events, Amazon EventBridge, and AWS Lambda
There are multiple ways to build a scheduler as a service in AWS. In this blog post, we provide step-by-step instructions for building a scheduler as a service with Amazon CloudWatch Events and Amazon EventBridge with AWS Lambda. We also demonstrate how to build a dynamic API scheduler using EventBridge and Lambda. CloudWatch Events deliver […]
How The Washington Post’s Arc XP uses CloudWatch Metrics Explorer to reduce costs
This post describes how The Washington Post’s Arc XP uses Metrics Explorer to monitor its global SaaS platform and reduce costs.
Secure monitoring of user workflow experience using Amazon CloudWatch Synthetics and AWS Secrets Manager
Customers often need an easy way to monitor the URLs, API endpoints, and critical GUI workflows of their web applications in a secure fashion. Monitoring helps keep the service available by detecting performance bottlenecks and operational issues as soon as they arise. Customers also want to be alerted when availability and latency issues occur so […]