AWS Cloud Operations Blog
Visualizing metrics across Amazon Managed Service for Prometheus workspaces using Amazon Managed Grafana
This post provides step-by-step instructions for aggregating and visualizing your Amazon Elastic Kubernetes Service (Amazon EKS) monitoring metrics using Amazon Managed Service for Prometheus and Amazon Managed Grafana. As part of this solution, promxy, a Prometheus proxy, is deployed to enable a single Grafana data source to query multiple Prometheus workspaces. Please note that this solution uses an open source project that AWS does not support. It is assumed that you will perform all necessary security assessments before using this solution in production.
Amazon EKS is a managed Kubernetes service that makes it easy to run Kubernetes on AWS and on-premises. Amazon Managed Service for Prometheus is a Prometheus-compatible monitoring and alerting service that makes it easy to monitor containerized applications and infrastructure at scale. Amazon Managed Grafana is a fully managed service that enables you to analyze your metrics, logs, and traces without having to provision servers, configure and update software, or do the heavy lifting involved in securing and scaling Grafana in production. For help setting up your EKS cluster, Amazon Managed Service for Prometheus workspaces, and Amazon Managed Grafana workspace used in this post, please reference the AWS Observability Workshop.
Overview of solution
With Amazon EKS for container deployment and management, Amazon Managed Service for Prometheus for container monitoring, and Amazon Managed Grafana for data visualization, you can deploy, monitor, and visualize your containerized applications. However, Grafana dashboards that span multiple Prometheus workspaces require additional setup and configuration because separate queries for each workspace must be created.
Promxy is an open-source utility that acts as a Prometheus proxy, enabling a single query to retrieve data from multiple Prometheus workspaces. This utility simplifies dashboards and data source management in Grafana.
To implement this solution, complete the following steps. You will dive deep into each of these steps in the following sections.
- Amazon EKS cluster preparation
- Application load balancer controller deployment
- NGINX controller deployment
- Promxy authentication
- Promxy deployment
- Amazon Managed Grafana configuration
Amazon EKS is not a mandatory component in the architecture. Other platforms can be used for this deployment, including Amazon Elastic Compute Cloud (Amazon EC2) or Amazon Elastic Container Service (Amazon ECS).
After this process, the following monitoring architecture will be in place. A data source created in Amazon Managed Grafana points to the application load balancer. The load balancer sends requests to NGINX, which performs basic authentication and forwards the request to promxy. Promxy connects to multiple Amazon Managed Service for Prometheus workspaces to obtain Prometheus metrics for display in the Grafana dashboard.
Prerequisites
For this walkthrough, you should have the following prerequisites in place:
- An AWS account
- Multiple existing Amazon Managed Service for Prometheus workspaces
- Metrics from multiple Amazon EKS clusters sending to the workspaces via a Prometheus server or AWS Distro for OpenTelemetry (ADOT) Collector
- An existing Amazon Managed Grafana workspace
- An integrated development environment (IDE) with AWS Command Line Interface (AWS CLI) and Helm installed
- An existing Amazon EKS cluster for promxy deployment
In this walkthrough, we use an AWS Cloud9 IDE to run commands. You can use any IDE but ensure you have the AWS CLI and Helm installed.
Amazon EKS cluster preparation
To use AWS Identity and Access Management (IAM) roles for service accounts, you need an IAM OIDC identity provider for your cluster. First, retrieve the OpenID Connect (OIDC) issuer URL for your cluster (for instructions, refer to Step 1 in Create an IAM OIDC provider for your cluster). Next, create an OIDC identity provider by issuing the following command:
eksctl utils associate-iam-oidc-provider --cluster <cluster_name> --approve
Substitute your Amazon EKS cluster name for <cluster_name>.
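Retrieving the issuer URL mentioned in the first step can be sketched as follows. This is a minimal sketch, assuming the AWS CLI is configured for the cluster's Region; the function name oidc_issuer_url is hypothetical.

```shell
# Hypothetical helper: print the OIDC issuer URL of an EKS cluster.
# $1 = cluster name
oidc_issuer_url() {
  aws eks describe-cluster \
    --name "$1" \
    --query "cluster.identity.oidc.issuer" \
    --output text
}
```

For example, `oidc_issuer_url my-cluster` prints a URL whose final path segment is the OIDC ID that the promxy trust policy needs later.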
Application load balancer controller deployment
Application load balancing on Amazon EKS is accomplished using the AWS Load Balancer Controller, which manages the deployment of elastic load balancers for a Kubernetes cluster. The controller automatically provisions an application load balancer when a Kubernetes ingress resource is created. This ingress resource is created as part of the promxy deployment described later. This section includes information and examples of the commands run to install the controller; for detailed instructions, refer to Installing the AWS Load Balancer Controller add-on. Please note the use of the promxy namespace in place of kube-system.
Create the IAM policy
Create the iam_policy.json file first, as described in Step 1a of Installing the AWS Load Balancer Controller add-on. Next, create the IAM policy associated with the Kubernetes service account role using the following command.
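A minimal sketch of the policy creation, assuming the policy name AWSLoadBalancerControllerIAMPolicy (the name used in the AWS Load Balancer Controller documentation) and that iam_policy.json is in the current directory. The function name is hypothetical.

```shell
# Hypothetical helper: create the IAM policy for the load balancer controller.
# Assumes iam_policy.json exists in the current working directory.
create_lb_controller_policy() {
  aws iam create-policy \
    --policy-name AWSLoadBalancerControllerIAMPolicy \
    --policy-document file://iam_policy.json
}
```

The command output includes the policy ARN, which is needed again during cleanup.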
Create the IAM role and service account
An IAM role can be associated with a Kubernetes service account to provide AWS permissions to the containers in any pod that uses the service account. The following command creates the service account and IAM role:
Replace <cluster_name> with your EKS cluster name and <account_id> with your AWS account ID.
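One way to express this step with eksctl is sketched below. The service account name aws-load-balancer-controller and the policy name are assumptions carried over from the AWS documentation; the function name is hypothetical.

```shell
# Hypothetical helper: create the service account and its IAM role.
# $1 = cluster name, $2 = AWS account ID
create_lb_service_account() {
  eksctl create iamserviceaccount \
    --cluster "$1" \
    --namespace promxy \
    --name aws-load-balancer-controller \
    --attach-policy-arn "arn:aws:iam::$2:policy/AWSLoadBalancerControllerIAMPolicy" \
    --override-existing-serviceaccounts \
    --approve
}
```

Note the promxy namespace, which replaces kube-system throughout this walkthrough.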
Deploy the application load balancer controller
Install the AWS application load balancer controller using Helm:
Replace <cluster_name> with your EKS cluster name.
If you have not previously added the eks-charts repository, you can add it by following steps 5a and 5b in Installing the AWS Load Balancer Controller add-on.
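The Helm installation, including the repository setup, might look like the following sketch. It assumes the service account created in the previous step is named aws-load-balancer-controller; the function name is hypothetical.

```shell
# Hypothetical helper: add the eks-charts repository and install the controller.
# $1 = cluster name
install_lb_controller() {
  helm repo add eks https://aws.github.io/eks-charts
  helm repo update
  # Reuse the existing service account instead of letting the chart create one.
  helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
    --namespace promxy \
    --set clusterName="$1" \
    --set serviceAccount.create=false \
    --set serviceAccount.name=aws-load-balancer-controller
}
```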
Ingress-NGINX controller deployment
The promxy utility doesn’t provide an authentication mechanism. Therefore, you use the Ingress-NGINX Controller, which provides various authentication methods. This deployment utilizes basic authentication for accessing promxy. This ingress resource is created as part of the promxy deployment described later.
Deploy the Ingress-NGINX controller
Install the Ingress-NGINX controller Helm chart. Please note that the service type must be set to NodePort to prevent the controller from automatically creating an AWS classic load balancer when an ingress resource is created. Instead, this solution uses an application load balancer created by the application load balancer controller.
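A sketch of the installation, assuming the release name ingress-nginx and the upstream chart repository; the function name is hypothetical.

```shell
# Hypothetical helper: install Ingress-NGINX with a NodePort service.
install_ingress_nginx() {
  helm upgrade --install ingress-nginx ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace promxy \
    --set controller.service.type=NodePort
}
```

With this release name, the chart exposes the controller through a service named ingress-nginx-controller, which is what the application load balancer will target.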
Promxy authentication
NGINX provides authentication, but first, you must create a secret containing the user information.
Create the htpasswd file
Use the htpasswd command to generate a file named auth, containing the user name, promxy-admin, and associated password.
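As a sketch, assuming the htpasswd utility (part of the Apache HTTP server tools) is installed; the function name is hypothetical.

```shell
# Hypothetical helper: create the basic-auth file.
create_htpasswd_file() {
  # -c creates (or overwrites) the file named auth; htpasswd prompts for the password.
  htpasswd -c auth promxy-admin
}
```

The file name matters: Ingress-NGINX expects the secret built from it to carry a key named auth.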
Convert htpasswd into a secret
Convert the file into a Kubernetes secret:
Examine the secret to confirm it’s created correctly:
You get the following output:
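The two commands can be sketched as follows, assuming the secret is named basic-auth (a hypothetical name that must match the NGINX ingress annotation used later). Function names are hypothetical as well.

```shell
# Hypothetical helper: store the htpasswd file as a secret with the key "auth".
create_basic_auth_secret() {
  kubectl create secret generic basic-auth --from-file=auth --namespace promxy
}

# Hypothetical helper: print the secret for inspection.
show_basic_auth_secret() {
  kubectl get secret basic-auth --namespace promxy --output yaml
}
```

In the YAML that the second command prints, expect a data section with an auth key holding the base64-encoded htpasswd line.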
Promxy deployment
Complete the following steps to deploy promxy.
Create an IAM role for promxy
In this step, you create an IAM role to give promxy permission to query metrics from Amazon Managed Service for Prometheus workspaces.
- On the IAM console, choose Roles in the navigation pane.
- Choose Create role and choose Custom trust policy.
- Replace the custom trust policy with the following policy. Update the trust policy’s account number, Region, and OIDC ID.
- Choose Next and add the AWS managed policy AmazonPrometheusQueryAccess.
- Choose Next.
- Enter the IAM role name, promxy-prometheus-access-role, and choose Create role.
- Copy the IAM role Amazon Resource Name (ARN) for later use, which looks like arn:aws:iam::<account number>:role/promxy-prometheus-access-role.
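For the trust policy in the steps above, a typical IAM-roles-for-service-accounts document can be generated as sketched below. This is an illustration under assumptions, not necessarily the exact policy from the original post: it assumes promxy runs under a service account named promxy in the promxy namespace and that the role lives in the standard aws partition. The function and file names are hypothetical.

```shell
# Hypothetical helper: write a trust policy for the promxy service account.
# $1 = AWS account ID, $2 = Region, $3 = OIDC ID (last segment of the issuer URL)
write_promxy_trust_policy() {
  cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::$1:oidc-provider/oidc.eks.$2.amazonaws.com/id/$3"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.$2.amazonaws.com/id/$3:sub": "system:serviceaccount:promxy:promxy"
        }
      }
    }
  ]
}
EOF
}
```

The contents of trust-policy.json can then be pasted into the Custom trust policy editor on the IAM console.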
Clone the promxy GitHub repository
Changes and additions to the promxy GitHub repository files are required. Therefore, run the following commands to clone the promxy repository to the local file system:
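A sketch of the clone step, assuming the upstream repository at github.com/jacksontj/promxy and the ~/ekspromxy working directory referenced in later steps. The function name is hypothetical.

```shell
# Hypothetical helper: clone promxy into ~/ekspromxy.
clone_promxy() {
  mkdir -p "$HOME/ekspromxy"
  cd "$HOME/ekspromxy" || return 1
  git clone https://github.com/jacksontj/promxy.git
}
```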
Create supplementary promxy files
In this step, you create supplementary files to modify the default promxy deployment and create the application load balancer and NGINX ingress resources.
Promxy override values
Some of the default promxy configuration needs to be changed. Create a new file named promxy_override_values.yaml in the ~/ekspromxy directory. Replace the account number and Region with the appropriate values for your installation. Update the path_prefix with the correct Amazon Managed Service for Prometheus workspace ID.
Please note that the entire static_configs section is duplicated and each contains a separate workspace ID. This section should be replicated for every Amazon Managed Service for Prometheus workspace to which promxy will proxy requests (in the example, two workspaces are defined). Finally, update the certificate name if you’re using HTTPS to connect to promxy.
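The shape of such an override file is sketched below for two workspaces. Several details are assumptions to verify against values.yaml in the cloned chart: that the chart reads promxy configuration from the config.promxy key and service account annotations from serviceAccount.annotations, that the SigV4 proxy sidecar listens on localhost:8005, and that the prom_workspace label names are free to choose. The sketch leaves out the Region and HTTPS certificate settings. The function name is hypothetical.

```shell
# Hypothetical helper: write a two-workspace override file.
# $1 = AWS account ID, $2 = first workspace ID, $3 = second workspace ID
write_promxy_override_values() {
  cat > "$HOME/ekspromxy/promxy_override_values.yaml" <<EOF
serviceAccount:
  create: true
  name: promxy
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::$1:role/promxy-prometheus-access-role
config:
  promxy:
    server_groups:
      # One entry per workspace; requests go through the local SigV4 proxy (assumed port).
      - static_configs:
          - targets:
              - localhost:8005
        path_prefix: workspaces/$2
        labels:
          prom_workspace: workspace_1
      - static_configs:
          - targets:
              - localhost:8005
        path_prefix: workspaces/$3
        labels:
          prom_workspace: workspace_2
EOF
}
```

Adding a third workspace means appending another server_groups entry with its own path_prefix and label.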
ALB ingress resource
Create a new file named ingress_alb.yaml in the ~/ekspromxy/promxy/deploy/k8s/helm-charts/promxy/templates directory, which defines the application load balancer ingress resource:
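A minimal sketch of such a resource, written from the shell, is shown below. It is an assumption-laden illustration: an internet-facing HTTP listener on port 80, an IngressClass named alb provided by the controller installation, and the NGINX controller service named ingress-nginx-controller. The resource name promxy-alb and the function name are hypothetical.

```shell
# Hypothetical helper: write the ALB ingress template into the chart.
write_alb_ingress() {
  cat > "$HOME/ekspromxy/promxy/deploy/k8s/helm-charts/promxy/templates/ingress_alb.yaml" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: promxy-alb
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    # "instance" targets route to the NodePort of the NGINX controller service.
    alb.ingress.kubernetes.io/target-type: instance
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ingress-nginx-controller
                port:
                  number: 80
EOF
}
```

For HTTPS, the listener annotation and a certificate annotation would need to change accordingly.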
NGINX ingress resource
Create a new file named ingress_nginx.yaml in the ~/ekspromxy/promxy/deploy/k8s/helm-charts/promxy/templates directory, which defines the NGINX ingress resource:
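A sketch of an NGINX ingress with basic authentication follows. The auth annotations are standard Ingress-NGINX annotations; the secret name basic-auth, the backend service name promxy, and port 8082 are assumptions that should be checked against the secret you created and the service the chart renders. The resource and function names are hypothetical.

```shell
# Hypothetical helper: write the NGINX ingress template into the chart.
write_nginx_ingress() {
  cat > "$HOME/ekspromxy/promxy/deploy/k8s/helm-charts/promxy/templates/ingress_nginx.yaml" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: promxy-nginx
  annotations:
    nginx.ingress.kubernetes.io/auth-type: basic
    # Must match the name of the secret that holds the htpasswd data.
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: promxy
                port:
                  number: 8082
EOF
}
```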
Update promxy files
Promxy doesn’t provide the ability to authenticate, which Amazon Managed Service for Prometheus requires. Signature Version 4 (SigV4) is the process of adding authentication information to AWS API requests sent by HTTP. To utilize SigV4, you deploy an AWS SigV4 Proxy Kubernetes sidecar container that signs the requests from promxy and forwards them to Amazon Managed Service for Prometheus.
Update the existing deployment.yaml file in the ~/ekspromxy/promxy/deploy/k8s/helm-charts/promxy/templates directory to deploy the sidecar:
The bottom of the deployment.yaml file looks like the following code after the update.
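The sidecar entry itself can be sketched as below; it covers only the container to add under the pod's containers list, not the whole file. The image tag and port 8005 are assumptions based on the AWS SigV4 proxy examples in the Amazon Managed Service for Prometheus documentation, and indentation must be adjusted to match the template. The function name is hypothetical.

```shell
# Hypothetical helper: print a SigV4 proxy sidecar container entry.
# $1 = Region of the Amazon Managed Service for Prometheus workspaces
print_sigv4_sidecar() {
  cat <<EOF
        - name: aws-sigv4-proxy-sidecar
          image: public.ecr.aws/aws-observability/aws-sigv4-proxy:1.0
          args:
            - --name
            - aps
            - --region
            - $1
            - --host
            - aps-workspaces.$1.amazonaws.com
            - --port
            - :8005
          ports:
            - name: aws-sigv4-proxy
              containerPort: 8005
EOF
}
```

Running `print_sigv4_sidecar us-east-1` prints the YAML to paste; because a single sidecar points at one Region, this sketch assumes all workspaces share that Region.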
Install the promxy helm chart
Install the promxy Helm chart in the promxy namespace using the previously created override file:
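A sketch of the installation from the local chart directory, assuming the release name promxy; the function name is hypothetical.

```shell
# Hypothetical helper: install promxy from the cloned chart with the overrides.
install_promxy() {
  helm install promxy "$HOME/ekspromxy/promxy/deploy/k8s/helm-charts/promxy" \
    --namespace promxy \
    --values "$HOME/ekspromxy/promxy_override_values.yaml"
}
```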
You will see the pods and services created for the two controllers and promxy:
Run the following command to obtain the application load balancer URL:
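Both checks can be sketched with kubectl; the function names are hypothetical.

```shell
# Hypothetical helper: list the pods and services in the promxy namespace.
list_promxy_resources() {
  kubectl get pods,services --namespace promxy
}

# Hypothetical helper: list ingresses; the ADDRESS column of the ALB ingress
# shows the load balancer DNS name once provisioning completes.
list_promxy_ingresses() {
  kubectl get ingress --namespace promxy
}
```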
Amazon Managed Grafana configuration
In this step, you create the Amazon Managed Grafana data source.
- On the Amazon Managed Grafana console, choose your Grafana workspace URL, and log in.
- Choose Configuration and then Data sources.
- Choose Add data source and choose Prometheus.
- For Name, enter a name.
- For URL, enter the URL obtained in the prior step. Be sure to prefix the URL with http or https, depending on what was specified in the ALB ingress resource definition.
- Choose Basic auth and then enter the user name and password created and stored in the Kubernetes secret.
- Choose Save & test, and the message “Data source is working” will be displayed.
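If the data source test fails, the same endpoint can be checked outside Grafana. The sketch below sends a query to the Prometheus HTTP API path that promxy serves, using the basic-auth credentials; the function name is hypothetical.

```shell
# Hypothetical helper: query promxy through the load balancer.
# $1 = base URL including http:// or https://, $2 = password for promxy-admin
check_promxy() {
  curl --silent --user "promxy-admin:$2" "$1/api/v1/query?query=up"
}
```

A JSON response containing "status":"success" suggests that the load balancer, NGINX authentication, and promxy are all reachable.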
The final step is to create a dashboard to display Prometheus metrics. In the following dashboard, two Amazon EKS clusters send metrics to two Amazon Managed Service for Prometheus workspaces. Configuring the Amazon Managed Grafana data source to point to promxy makes it possible to query metrics from all Amazon Managed Service for Prometheus workspaces. Amazon EKS pod memory and CPU metrics aggregated within an Amazon EKS cluster are displayed on the left, and metrics aggregated across Amazon EKS clusters from multiple Amazon Managed Service for Prometheus workspaces are on the right.
Cleaning up
After testing this solution, remember to complete the following steps to avoid incurring charges to your AWS account.
Uninstall promxy
Uninstall the promxy helm chart and delete the local promxy GitHub repository:
Delete the IAM role that was created for promxy:
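These two cleanup steps can be sketched as follows, assuming the release name promxy and the role and managed policy names used earlier. An attached managed policy must be detached before the role can be deleted. Function names are hypothetical.

```shell
# Hypothetical helper: remove the promxy release and the local clone.
uninstall_promxy() {
  helm uninstall promxy --namespace promxy
  rm -rf "$HOME/ekspromxy"
}

# Hypothetical helper: detach the managed policy, then delete the role.
delete_promxy_role() {
  aws iam detach-role-policy \
    --role-name promxy-prometheus-access-role \
    --policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess
  aws iam delete-role --role-name promxy-prometheus-access-role
}
```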
Delete the Kubernetes secret
Delete the file generated using htpasswd and the Kubernetes secret that was created for basic authentication within NGINX:
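As a sketch, assuming the secret name basic-auth and that the auth file is in the current directory; the function name is hypothetical.

```shell
# Hypothetical helper: remove the htpasswd file and the basic-auth secret.
remove_basic_auth() {
  rm -f auth
  kubectl delete secret basic-auth --namespace promxy
}
```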
Uninstall the controllers
Uninstall both helm charts for the NGINX and application load balancer controllers:
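A sketch assuming the release names ingress-nginx and aws-load-balancer-controller; the function name is hypothetical.

```shell
# Hypothetical helper: uninstall both controller releases.
uninstall_controllers() {
  helm uninstall ingress-nginx --namespace promxy
  helm uninstall aws-load-balancer-controller --namespace promxy
}
```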
Delete the EKS service account, role, and policy
Finally, delete the Amazon EKS service account, role, and policy used by the application load balancer controller:
Remember to replace <cluster_name> with the name of your cluster and <policy_arn> with the ARN of the IAM policy you created. The eksctl command automatically removes the associated IAM role.
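This final step can be sketched as below, assuming the service account name aws-load-balancer-controller; the function name is hypothetical.

```shell
# Hypothetical helper: delete the service account (and its role), then the policy.
# $1 = cluster name, $2 = ARN of the load balancer controller IAM policy
delete_lb_service_account_and_policy() {
  eksctl delete iamserviceaccount \
    --cluster "$1" \
    --namespace promxy \
    --name aws-load-balancer-controller
  aws iam delete-policy --policy-arn "$2"
}
```

The policy deletion only succeeds once the policy is no longer attached to the role, so the order of the two commands matters.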
Conclusion
This post demonstrates how to visualize and aggregate metrics from multiple Amazon Managed Service for Prometheus workspaces in an Amazon Managed Grafana dashboard using a single data source. To accomplish this, you used an open-source tool, promxy, to connect to each Amazon Managed Service for Prometheus workspace. Amazon Managed Grafana was then able to pull metrics from each workspace using a single data source connecting to promxy. Promxy was configured to run within an Amazon EKS cluster, along with an application load balancer and NGINX controller. The application load balancer controller automatically created an AWS Application Load Balancer in front of promxy, and the NGINX controller provided basic authentication.
To learn more and get hands-on experience with Amazon EKS, Amazon Managed Service for Prometheus, and Amazon Managed Grafana, explore the EKS Workshop.