AWS Cloud Operations Blog
Category: Intermediate (200)
Collecting Apache Flink metrics in the Amazon CloudWatch agent
Apache Flink is a distributed stream processing engine. You can run Flink on Amazon EMR as a YARN application. You can view Flink metrics through its web UI, but what if you want to react to them? In this blog post, I’ll show you how to use the CloudWatch agent to collect Flink metrics into […]
Introducing CloudWatch Resource Health to monitor your EC2 hosts
Today, AWS announced Amazon CloudWatch Resource Health, a fully managed solution that customers can use to automatically discover, manage, and visualize the health and performance of Amazon Elastic Compute Cloud (Amazon EC2) hosts across their applications. Resource Health provides a centralized view of your EC2 hosts by performance dimensions such as CPU or memory utilization. […]
Behind the scenes as AWS AppConfig builds a Lambda extension
In this blog post, I will share why the AWS AppConfig team built an AWS Lambda extension (hint: customers wanted it), the effort required to build it (hint: it was easy), and the outcomes of building our Lambda extension (hint: lots). I will cover the technical and business aspects of building a Lambda extension and […]
Using VPC endpoints for AWS X-Ray
Today, AWS X-Ray announces the general availability of VPC endpoint support, which makes it possible for you to establish a private connection between your VPC and AWS X-Ray. Applications running in your VPC can now communicate with AWS X-Ray to send trace data without going through the public internet. In this post, I will show […]
Reinventing automated operations (Part II)
The first post in this series, Reinventing automated operations (Part I), covered the importance of operations in the cloud and how deferring the creation of an operations plan can slow down your migration. In this post, I’ll share the primary mechanism of iterative improvement (aka flywheel) that AWS Managed Services (AMS) uses to increase operational […]
Detecting and remediating process issues on EC2 instances using Amazon CloudWatch and AWS Systems Manager
Customers want to have visibility into processes running inside their Amazon Elastic Compute Cloud (Amazon EC2) instances. Critical processes and services in these instances can crash unexpectedly and when they do, it’s crucial for customers to be notified so they can maintain continued business operations. There are multiple ways to see if a service is […]
How Wealthfront utilizes AWS X-Ray to analyze and debug distributed applications
This blog post was written by Harichandan Pulagam, a Data Engineer at Wealthfront. In this blog post, we describe how Wealthfront used AWS X-Ray to streamline the development and operations of a distributed application. About Wealthfront: Wealthfront’s mission is to build a financial system that favors people, not institutions. They strive to provide better experiences […]
Creating contacts, escalation plans, and response plans in AWS Systems Manager Incident Manager
Many of our customers need an effective incident management and response solution to achieve operational excellence and performance efficiency. Transparency between those who are affected by the incident and those who respond to the incident is key to any incident management process. Finding the right team to mitigate the impact of application or workload incidents […]
AWS Systems Manager Incident Manager integration with Amazon CloudWatch
This is the second post in a two-part series about AWS Systems Manager Incident Manager. In the first post, we covered onboarding steps like creating contacts, an escalation plan, and a response plan in Incident Manager. In this post, we discuss the integration between Incident Manager and Amazon CloudWatch and how Incident Manager components manage an […]
Cost optimization with SQL BYOL using license included Windows instance on Amazon EC2 Dedicated Hosts
Do you want to bring your eligible SQL Server licenses to use on AWS? Do you have SQL Server licenses but not accompanying Windows Server licenses? Are you worried that you do not have Software Assurance for SQL Server? You can now run license included Windows Server instances on Amazon EC2 Dedicated Hosts, which makes […]