Analyzing Java applications performance with async-profiler in HAQM EKS
This blog was authored by Sascha Möllering, Principal Specialist Solutions Architect, Containers, and Yuriy Bezsonov, Senior Partner Solutions Architect.
Introduction
Container startup performance presents a significant challenge for Java applications running on Kubernetes, particularly during scaling events and recovery scenarios. Although Java remains a top choice for both legacy modernization and new microservices development, especially with frameworks such as Spring Boot or Quarkus, containerized Java applications often face unique challenges around startup times and runtime behavior.
Performance profiling in containerized Java applications has long presented significant challenges. async-profiler, a lightweight sampling solution from the HAQM Corretto team, offers a low-overhead approach for Java workloads running on HAQM Elastic Kubernetes Service (HAQM EKS). By eliminating the traditional safepoint bias problem (more information about Java safepoints can be found in this post), it enables more accurate performance analysis.
In this post we explore practical implementations for both on-demand and continuous profiling scenarios, using the Mountpoint for HAQM S3 Container Storage Interface (CSI) driver to efficiently manage profiling data in your Kubernetes environment.
Solution overview
To avoid depending on the JVM reaching a safepoint before sampling, async-profiler uses HotSpot-specific APIs to collect stack traces. It works with OpenJDK and other Java runtimes based on the HotSpot JVM. async-profiler can trace the following kinds of events:
- CPU cycles.
- Hardware and software performance counters, such as cache misses, branch misses, page faults, and context switches.
- Allocations in the Java heap.
- Contended lock attempts, including both Java object monitors and ReentrantLocks.
- Wall-clock time (also called wall time), which is the time it takes to run a block of code.
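For illustration, these event types map to invocations of the asprof launcher that ships with async-profiler 3.x. The following sketch composes the commands; the binary path, 30-second duration, PID, and output file names are placeholders, not values from this post:

```shell
# asprof_cmd composes an async-profiler launcher invocation for one of the
# event types above; path, duration, PID, and file names are placeholders.
asprof_cmd() { # $1 = event type, $2 = output file, $3 = target JVM pid
  printf './asprof -e %s -d 30 -f %s %s\n' "$1" "$2" "$3"
}

asprof_cmd cpu          cpu.html   12345   # CPU cycles
asprof_cmd cache-misses pmu.html   12345   # a hardware performance counter
asprof_cmd alloc        alloc.html 12345   # Java heap allocations
asprof_cmd lock         lock.html  12345   # contended locks
asprof_cmd wall         wall.html  12345   # wall-clock time
```

Running any of the printed commands against a JVM process produces a profile for that event type in the named output file.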
We use UnicornStore as an example of a Java application to be profiled, as shown in the following figure. UnicornStore is a Spring Boot 3 Java application that provides a RESTful API. It stores data in a relational database running on HAQM Aurora Serverless with the HAQM Relational Database Service (HAQM RDS) for PostgreSQL engine, and then publishes events about the performed actions to HAQM EventBridge.
We use HAQM EKS Auto Mode to deploy a ready-to-use Kubernetes cluster.

Figure 1: UnicornStore application architecture
Prerequisites
The solution is based on the infrastructure as code (IaC) of the “Java on AWS Immersion Day”, which streamlines the setup of the environment. You only need an AWS account and AWS CloudShell to bootstrap the environment.
Walkthrough
The following steps walk you through this solution.
Setting up the environment
You can use the following setup to create a solution infrastructure with Visual Studio Code for the Web and with all the necessary tools installed:
- Navigate to CloudShell in the AWS console.
- Deploy the AWS CloudFormation template. You can also deploy the template directly in the CloudFormation console using the file from the provided link.
- Wait until the command finishes successfully. The deployment takes about 15-20 minutes.
After successful creation of the CloudFormation stacks, you can access the VS Code instance using the IdeUrl and the IdePassword from the output of the preceding command.
All following commands must be run in the Terminal window of the VS Code instance.
Instrumenting a container image with a profiler
In this post you use wall-clock profiling. For functions specifically, wall-clock time measures the total duration from when the function starts until it completes. This measurement encompasses all delays, such as time spent waiting for locks to release and threads to synchronize. When wall-clock time exceeds CPU time, it suggests your code is spending time in a waiting state. A significant gap between these times often points to potential resource bottlenecks in your application.
Conversely, when CPU time closely matches wall-clock time, it indicates computationally heavy code, where the processor is actively working for most of the running period. These CPU-bound code segments that take considerable time to run may benefit from performance optimization efforts.
- Add profiler binaries to a container image. Use multi-stage build to build container images:
- Build and push a new container image to the HAQM Elastic Container Registry (HAQM ECR):
- Deploy the Java application to the EKS cluster:
The deployment takes about 3-5 minutes.
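As a rough sketch, the build, push, and deploy steps above could look like the following. The account ID, Region, repository name, and manifest path are placeholders, not values from this post:

```shell
# Placeholders: substitute your own account ID, Region, and manifest path.
ACCOUNT_ID=123456789012
REGION=eu-west-1
REPO=unicorn-store-spring
ECR_REGISTRY="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com"

# Authenticate Docker to HAQM ECR, then build, tag, and push the image.
aws ecr get-login-password --region "$REGION" |
  docker login --username AWS --password-stdin "$ECR_REGISTRY"
docker build -t "$REPO:latest" .
docker tag "$REPO:latest" "$ECR_REGISTRY/$REPO:latest"
docker push "$ECR_REGISTRY/$REPO:latest"

# Deploy the application manifest to the EKS cluster.
kubectl apply -f k8s/deployment.yaml
```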
On-demand profiling
Now you can start on-demand profiling and benchmark the Java application under load.
1. Start on-demand profiling in the container and get the status, create load with Artillery for one minute with 200 concurrent POST requests of createUnicorn, create a folder for profiling results, and then stop the profiling when the benchmarking is finished:
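A sketch of these steps using kubectl and the asprof launcher might look like the following; the namespace, pod selection, load-test configuration, and file paths are assumptions for this sketch:

```shell
# Namespace, pod selection, and paths are assumptions for this sketch.
NS=unicorn-store-spring
POD=$(kubectl get pods -n "$NS" -o name | head -n 1)

# Start wall-clock profiling of the JVM (PID 1 inside the container)
# and check the profiler status.
kubectl exec -n "$NS" "$POD" -- asprof start -e wall 1
kubectl exec -n "$NS" "$POD" -- asprof status 1

# Generate load for one minute (the Artillery scenario file is not shown).
artillery run createUnicorn-load.yaml

# Create a folder for the results, then stop profiling and write the
# flame graph into the container's file system.
kubectl exec -n "$NS" "$POD" -- mkdir -p /home/spring/profiling
kubectl exec -n "$NS" "$POD" -- asprof stop -f /home/spring/profiling/profile.html 1
```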
2. The resulting file is stored in the container's /home/spring/profiling/ folder. Copy it to the development instance and save the Summary report output from the benchmarking tool for later comparison.
You can download the resulting file to your computer by right-clicking the file name and choosing Download..., then open the file in a browser.

Figure 2: Download on-demand profiling results
Analyzing the profiling results and performing optimization
As a result, you get a flame graph.
1. Choose the Search button in the top left corner and search for the UnicornController.createUnicorn method.
A significant portion of the run time is waiting for a database connection. You can improve this by increasing the number of database connections in the Java application properties file, as shown in the following figure.

Figure 3: Initial on-demand profiling results
2. Change the number of database connections from 1 to 10:
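In Spring Boot with the default HikariCP connection pool, this is typically controlled by the following property; whether UnicornStore exposes exactly this key is an assumption for this sketch:

```properties
# Increase the HikariCP connection pool from 1 to 10 connections.
spring.datasource.hikari.maximum-pool-size=10
```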
Having changed the datasource value to 10 to address the cause of the waiting time in the requests, you can benchmark the application with profiling again and compare the results.
3. Build and push a new version container image to the HAQM ECR container registry:
4. Redeploy the application to the EKS cluster:
5. Repeat the benchmark test with profiling using the same command or with the following script, and download the results:
6. Open the downloaded file in a browser and search for UnicornController.createUnicorn, as shown in the following figure.

Figure 4: Improved on-demand profiling results
You can see that the waiting time for getting a connection is significantly decreased. The benchmarks for response time were improved, too, as shown in the following figure.

Figure 5: On-demand results comparison
Setting up the infrastructure to store results of continuous profiling
On-demand profiling helps find problems with the application, but those issues can happen at any time. It is a great advantage to be able to go back in time and investigate the state of a Java application when an issue happened. To achieve that, you can set up continuous profiling, create trace files, and store them in HAQM S3.
1. Create an S3 bucket to store the results of continuous profiling:
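A minimal sketch of this step follows; the account ID is a placeholder (in practice you can look it up with aws sts get-caller-identity), and S3 bucket names must be globally unique:

```shell
# The account ID and bucket name are placeholders for this sketch.
ACCOUNT_ID=123456789012
export S3PROFILING="unicorn-profiling-$ACCOUNT_ID"
aws s3 mb "s3://$S3PROFILING"
```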
2. Create an AWS Identity and Access Management (IAM) policy to allow pods with the Java application to access the S3 bucket:
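The policy document might look like the following, based on the permissions the Mountpoint for HAQM S3 CSI driver documentation lists for full bucket access; the bucket name is a placeholder:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "MountpointBucketAccess",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::unicorn-profiling-123456789012"]
    },
    {
      "Sid": "MountpointObjectAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:DeleteObject"
      ],
      "Resource": ["arn:aws:s3:::unicorn-profiling-123456789012/*"]
    }
  ]
}
```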
For access to the S3 bucket, use the Mountpoint for HAQM S3 Container Storage Interface (CSI) driver, which is deployed as an HAQM EKS add-on. This driver allows the Java application running in the EKS cluster to write files to HAQM S3 through a file system interface. Built on Mountpoint for HAQM S3, the Mountpoint CSI driver presents an S3 bucket as a storage volume accessible by containers in a Kubernetes cluster.
At the time of writing, there is no equivalent of this CSI driver for HAQM Elastic Container Service (HAQM ECS). However, the documentation describes how a similar approach can be implemented for an ECS cluster with HAQM Elastic Compute Cloud (HAQM EC2) instances.
3. Associate IAM OIDC provider with the EKS cluster and create an IAM role for the add-on:
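With eksctl, this step might look like the following; the cluster name, account ID, and role/policy names are assumptions for this sketch (--role-only is used because the add-on creates the service account itself):

```shell
# Cluster, account ID, and role/policy names are placeholders.
CLUSTER=unicorn-store
POLICY_ARN=arn:aws:iam::123456789012:policy/unicorn-profiling-s3-policy

# Associate the OIDC provider, then create an IAM role for the
# s3-csi-driver-sa service account in kube-system.
eksctl utils associate-iam-oidc-provider --cluster "$CLUSTER" --approve
eksctl create iamserviceaccount \
  --name s3-csi-driver-sa \
  --namespace kube-system \
  --cluster "$CLUSTER" \
  --attach-policy-arn "$POLICY_ARN" \
  --role-name unicorn-s3-csi-driver-role \
  --role-only \
  --approve
```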
We use IAM roles for service accounts (IRSA) and not EKS Pod Identities due to the current limitations of Mountpoint for HAQM S3 CSI driver. Refer to the official installation guide for the updated installation procedure.
4. Install the Mountpoint for HAQM S3 CSI driver as the add-on to the EKS cluster:
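This can be done with the AWS CLI roughly as follows; the role ARN is a placeholder for the IAM role created for the driver's service account:

```shell
# The role ARN is a placeholder for the role from the previous step.
ROLE_ARN=arn:aws:iam::123456789012:role/unicorn-s3-csi-driver-role

aws eks create-addon \
  --cluster-name unicorn-store \
  --addon-name aws-mountpoint-s3-csi-driver \
  --service-account-role-arn "$ROLE_ARN"
```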
5. Create PersistentVolume and PersistentVolumeClaim to access the S3 bucket from pods:
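A static provisioning manifest for the Mountpoint CSI driver might look like this; the object names and bucket name are placeholders, and the capacity value is required by Kubernetes but ignored by the driver:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: s3-profiling-pv
spec:
  capacity:
    storage: 100Gi # required by Kubernetes, ignored by the driver
  accessModes:
    - ReadWriteMany
  mountOptions:
    - allow-delete
  csi:
    driver: s3.csi.aws.com
    volumeHandle: s3-profiling-volume
    volumeAttributes:
      bucketName: unicorn-profiling-123456789012 # placeholder bucket
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: s3-profiling-pvc
  namespace: unicorn-store-spring
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: "" # required for static provisioning
  resources:
    requests:
      storage: 100Gi
  volumeName: s3-profiling-pv
```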
6. Deploy the manifests for persistent objects to the EKS cluster:
Instrumenting a container image and deployment for continuous profiling
1. Override the command and arguments from the Dockerfile in the deployment and start profiling using the launch-as-agent instruction. With this approach, you can change profiling parameters without needing to rebuild the container image. async-profiler starts together with the Java application and creates call stacks each minute. Add the commands to deployment.yaml:
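The override in deployment.yaml could look roughly like the following fragment. The profiler path, JAR name, sampling interval, and mount path are assumptions for this sketch; %t expands to a timestamp in async-profiler file names, and loop=1m restarts the profile every minute:

```yaml
spec:
  containers:
    - name: unicorn-store-spring
      command: ["java"]
      args:
        - "-agentpath:/async-profiler/lib/libasyncProfiler.so=start,event=wall,interval=5ms,loop=1m,collapsed,file=/profiling/profile-%t.txt"
        - "-jar"
        - "/app/store-spring.jar"
      env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
      volumeMounts:
        - name: profiling-volume
          mountPath: /profiling
          subPathExpr: $(POD_NAME) # one S3 prefix per pod
  volumes:
    - name: profiling-volume
      persistentVolumeClaim:
        claimName: s3-profiling-pvc # placeholder claim name
```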
2. Deploy the manifests to the EKS cluster and restart the deployment:
3. Check the state of the Java application pod and profiler:
Analyzing the results of continuous profiling
1. Create a load for five minutes:
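The post uses Artillery for load generation; as an alternative sketch, a minimal curl-based load generator could look like this, where the endpoint and payload are placeholders:

```shell
# load_create_unicorn posts a placeholder payload in a loop for N seconds.
load_create_unicorn() { # $1 = duration in seconds, $2 = base URL
  end=$(( $(date +%s) + $1 ))
  while [ "$(date +%s)" -lt "$end" ]; do
    curl -s -X POST "$2/unicorns" \
      -H 'Content-Type: application/json' \
      -d '{"name":"Rainbow"}' > /dev/null
  done
}

# Example: five minutes against the service endpoint (placeholder URL):
# load_create_unicorn 300 "http://unicorn-store.example.com"
```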
Each minute, the profiler creates a profile-YYYYMMDD-HHMISS.txt file in the S3 bucket path corresponding to the pod name, as shown in the following figure.

Figure 6: Continuous profiling results on HAQM S3
You can convert any of those files to a flame graph using converter.jar from async-profiler.
2. Create a folder for profiling stacks and copy the files from the S3 bucket:
POD_NAME=$(kubectl get pods -n unicorn-store-spring | grep Running | awk '{print $1}')
mkdir -p ~/environment/unicorn-store-spring/stacks/$POD_NAME
aws s3 cp s3://$S3PROFILING/$POD_NAME ~/environment/unicorn-store-spring/stacks/$POD_NAME/ --recursive
3. Download async-profiler to the development instance:
cd ~/environment/unicorn-store-spring
wget https://github.com/async-profiler/async-profiler/releases/download/v3.0/async-profiler-3.0-linux-x64.tar.gz
mkdir ~/environment/unicorn-store-spring/async-profiler
tar -xvzf ./async-profiler-3.0-linux-x64.tar.gz -C ~/environment/unicorn-store-spring/async-profiler --strip-components=1
rm ./async-profiler-3.0-linux-x64.tar.gz
4. Choose one of the files and convert it to a flame graph, for example the first file:
cd ~/environment/unicorn-store-spring
STACK_FILE=$(find ~/environment/unicorn-store-spring/stacks/$POD_NAME -type f -printf '%T+ %p\n' | sort | head -n 1 | cut -d' ' -f2-)
java -cp ./async-profiler/lib/converter.jar FlameGraph $STACK_FILE ./profile.html
5. Download profile.html to your computer, open it in a browser, and search for UnicornController.createUnicorn, as shown in the following figure.

Figure 7: Continuous profiling graph
This approach allows you to analyze the state of a Java application during a specific period in time, such as the startup phase.
Cleaning up
1. To avoid incurring future charges, delete the deployed AWS resources with the following commands in the VS Code terminal:
~/java-on-aws/infrastructure/scripts/cleanup/eks.sh
aws cloudformation delete-stack --stack-name eksctl-unicorn-store-addon-iamserviceaccount-kube-system-s3-csi-driver-sa
aws s3 rm s3://$S3PROFILING --recursive
aws s3 rb s3://$S3PROFILING
2. Close the tab with VS Code, then open CloudShell and run the commands to finish cleaning up:
The deletion of the stack can take about 20 minutes.
Delete the S3 bucket that you used to deploy the AWS CloudFormation template. Check the remaining resources and stacks and delete them manually if necessary.
Conclusion
In this post we demonstrated the use of async-profiler with HAQM EKS in both on-demand and continuous profiling modes. We initially set up the infrastructure with an EKS cluster and instrumented the UnicornStore Java application container image with async-profiler. We built and uploaded the container image to HAQM ECR, deployed it to the EKS cluster, and ran on-demand profiling under load. We then created an HAQM S3 bucket and connected it to persistent volumes in the EKS cluster using Mountpoint for HAQM S3 with the corresponding CSI driver. After successful deployment of a pod based on the created container image, the results of continuously profiling the Java application were stored in the S3 bucket.
With the help of async-profiler, we found a bottleneck in the Java application and eliminated it. We also created a solution that continuously captures profiling data for further analysis.
If you want to dive deeper into the internals of profiling with async-profiler, we recommend this three-hour playlist covering all of its features.
We hope we have given you some ideas on how to profile your existing Java applications using async-profiler. Feel free to submit enhancements to the sample application in the source repository.