Multi-cluster cost monitoring for HAQM EKS using Kubecost and HAQM Managed Service for Prometheus
Introduction
HAQM Managed Service for Prometheus is a Prometheus-compatible service that monitors and provides alerts on containerized applications and infrastructure at scale. In the previous post, Integrating Kubecost with HAQM Managed Service for Prometheus, we discussed how you can integrate Kubecost with HAQM Managed Service for Prometheus (AMP) to get granular visibility into your HAQM Elastic Kubernetes Service (HAQM EKS) cluster costs, letting you aggregate costs by most Kubernetes contexts, from the cluster level down to the container level. That integration helps customers monitor a single HAQM EKS cluster without worrying about scaling the Prometheus instance. However, complexity increases when your infrastructure grows to multiple HAQM EKS clusters running across numerous Regions and AWS accounts. To track costs and generate showback or chargeback reports across those clusters, you need to gather cost data from multiple endpoints, which is a time-consuming and complicated process.
As part of AWS’ partnership with Kubecost, we are excited to announce this follow-up integration with HAQM Managed Service for Prometheus to help customers effectively monitor their Kubernetes costs without worrying about scaling the Prometheus instance. With the HAQM EKS optimized Kubecost bundle or with the Kubecost Enterprise License, AWS customers can now get a unified view into Kubernetes costs across multiple HAQM EKS clusters. In this post, you’ll learn how to set up cost monitoring across multiple HAQM EKS clusters in a federated view with Kubecost and HAQM Managed Service for Prometheus.
Solution overview
The architecture of this integration is similar to HAQM EKS cost monitoring with Kubecost, which is described in the previous post, with some enhancements as follows:
In this integration, an additional AWS SigV4 container is added to the cost-analyzer pod, which acts as a proxy to help query metrics from HAQM Managed Service for Prometheus using the AWS SigV4 signing process. It enables password-less authentication to reduce the risk of exposing your AWS credentials.
When the HAQM Managed Service for Prometheus integration is enabled, the bundled Prometheus server in the Kubecost Helm Chart is configured in remote_write mode. The bundled Prometheus server sends the collected metrics to HAQM Managed Service for Prometheus using the AWS SigV4 signing process. All metrics and data are stored in HAQM Managed Service for Prometheus, and Kubecost queries the metrics directly from HAQM Managed Service for Prometheus instead of the bundled Prometheus. As a result, customers don't need to maintain or scale the local Prometheus instance.
There are two architectures you can deploy:
- The Quick-Start architecture supports the setup of up to 100 clusters.
- The Federated architecture supports the setup of over 100 clusters.
Quick-Start architecture
The infrastructure can manage up to 100 clusters. The following architecture diagram illustrates the small-scale infrastructure setup:
Federated architecture
To support large-scale infrastructure with over 100 clusters, Kubecost uses HAQM Simple Storage Service (HAQM S3) to improve query performance. In addition to the HAQM Managed Service for Prometheus workspace, Kubecost stores its extract, transform, and load (ETL) data in a central HAQM S3 bucket. Kubecost’s ETL data is a computed cache based on Prometheus metrics, from which customers can perform all possible Kubecost queries. Storing the ETL data in an HAQM S3 bucket adds resiliency to your cost allocation data, improves performance, and enables a high-availability architecture for your Kubecost setup.
The following architecture diagram illustrates the large-scale infrastructure setup:
Walkthrough
Prerequisites
- You have an existing AWS account.
- You have AWS Identity and Access Management (AWS IAM) credentials to create HAQM Managed Service for Prometheus and AWS IAM roles programmatically.
- You have an existing HAQM EKS cluster with OpenID Connect (OIDC) enabled.
- Your HAQM EKS clusters have the HAQM Elastic Block Store (HAQM EBS) Container Storage Interface (CSI) driver installed.
Create HAQM Managed Service for Prometheus workspace
Step 1: run the following command to get the information of your current EKS cluster:
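The command for this step is not reproduced here. As a minimal sketch, assuming your kubeconfig was written by `aws eks update-kubeconfig` (so the context name is the cluster ARN), a helper such as `parse_eks_context` (a hypothetical name) could extract the Region, account ID, and cluster name from the current context:

```bash
# Split an EKS context of the form
#   arn:aws:eks:<region>:<account-id>:cluster/<cluster-name>
# into "<region> <account-id> <cluster-name>".
# Assumption: the context name is the cluster ARN; contexts generated by
# other tools (for example eksctl) use a different format.
parse_eks_context() {
  local arn="$1"
  local region account name
  IFS=':' read -r _ _ _ region account name <<< "${arn}"
  printf '%s %s %s\n' "${region}" "${account}" "${name#cluster/}"
}

# Usage (requires kubectl and a configured context):
#   read -r AWS_REGION AWS_ACCOUNT_ID YOUR_CLUSTER_NAME \
#     < <(parse_eks_context "$(kubectl config current-context)")
```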
Step 2: run the following command to create a new HAQM Managed Service for Prometheus workspace:
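One way to do this with the AWS CLI is sketched below. The alias `kubecost-amp` and the function name are illustrative assumptions, not values taken from this post:

```bash
# Hypothetical alias; any name that helps you identify the workspace works.
AMP_WORKSPACE_ALIAS="kubecost-amp"

# Create the workspace in the given Region (requires the AWS CLI and credentials).
create_amp_workspace() {
  local region="$1"
  aws amp create-workspace --alias "${AMP_WORKSPACE_ALIAS}" --region "${region}"
}

# Usage:
#   create_amp_workspace "${AWS_REGION}"
```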
The HAQM Managed Service for Prometheus workspace should be created in a few seconds. Run the following command to get the workspace ID:
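A possible lookup, assuming you know the alias the workspace was created with (the function name is hypothetical). Note that `list-workspaces --alias` matches aliases that begin with the given value, so this sketch takes the first match:

```bash
# Print the ID of the first workspace whose alias starts with the given value.
get_amp_workspace_id() {
  local ws_alias="$1" region="$2"
  aws amp list-workspaces --alias "${ws_alias}" --region "${region}" \
    --query 'workspaces[0].workspaceId' --output text
}

# Usage:
#   export AMP_WORKSPACE_ID="$(get_amp_workspace_id "<YOUR_WORKSPACE_ALIAS>" "${AWS_REGION}")"
```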
Set up the environment
Step 1: set environment variables for integrating Kubecost with HAQM Managed Service for Prometheus
Run the following command to set environment variables for integrating Kubecost with HAQM Managed Service for Prometheus
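The variable names below are the ones used by the Helm command later in this post; their values are placeholders or assumptions. In particular, this sketch assumes the SigV4 proxy container listens on `localhost:8005`, and the helper name `set_kubecost_amp_urls` is hypothetical:

```bash
# Derive the two AMP URLs that the Helm install consumes.
# Assumption: the SigV4 proxy sidecar in the cost-analyzer pod listens on port 8005.
set_kubecost_amp_urls() {
  local region="$1" workspace_id="$2"
  export REMOTEWRITEURL="https://aps-workspaces.${region}.amazonaws.com/workspaces/${workspace_id}/api/v1/remote_write"
  export QUERYURL="http://localhost:8005/workspaces/${workspace_id}"
}

# Usage, with placeholders you must replace:
#   export RELEASE="kubecost"                        # assumed release/namespace name
#   export YOUR_CLUSTER_NAME="<YOUR_EKS_CLUSTER_NAME>"
#   export VERSION="<KUBECOST_CHART_VERSION>"
#   export KC_BUCKET="<YOUR_ETL_BUCKET_NAME>"
#   export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
#   set_kubecost_amp_urls "${AWS_REGION}" "${AMP_WORKSPACE_ID}"
```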
Step 2: set up HAQM S3 bucket, AWS IAM policy, and Kubernetes secret for storing Kubecost ETL files
Note: You can skip step 2 for the small-scale (Quick-Start) infrastructure setup.
a. Create an HAQM S3 bucket as the object store for Kubecost ETL metrics:
Run the following command in your workspace:
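A minimal sketch of the bucket creation, assuming `KC_BUCKET` and `AWS_REGION` are already set (the function name is hypothetical):

```bash
# Create the central S3 bucket that holds Kubecost ETL files.
create_etl_bucket() {
  local bucket="$1" region="$2"
  aws s3 mb "s3://${bucket}" --region "${region}"
}

# Usage:
#   create_etl_bucket "${KC_BUCKET}" "${AWS_REGION}"
```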
b. Create AWS IAM Policy to grant access to the HAQM S3 bucket.
The following policy is for demo purposes only. You may need to consult your security team and make appropriate changes depending on your organization’s requirements.
Run the following command in your workspace:
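The policy document itself is not reproduced here. As an illustration only, a minimal policy granting list access on the bucket and read/write/delete access on its objects might be rendered like this. The function name, file name, and policy name are assumptions, and your setup may need additional S3 actions:

```bash
# Print a minimal IAM policy document for the ETL bucket.
render_etl_bucket_policy() {
  local bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListEtlBucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Sid": "ReadWriteEtlObjects",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
}

# Usage (hypothetical policy and file names):
#   render_etl_bucket_policy "${KC_BUCKET}" > policy-kubecost-s3.json
#   aws iam create-policy \
#     --policy-name "kubecost-s3-federated-policy-${YOUR_CLUSTER_NAME}" \
#     --policy-document file://policy-kubecost-s3.json
```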
c. Create Kubernetes secret to allow Kubecost to write ETL files to the HAQM S3 bucket.
Run the following command in your workspace:
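A sketch of this step, assuming Kubecost reads a Thanos-style object store configuration from a Kubernetes secret. The secret name `kubecost-object-store` and file name `federated-store.yaml` are assumptions; check them against the federation values file you install with. No access keys are included because credentials are expected to come from the IAM role attached to the service account:

```bash
# Print an object store configuration pointing at the ETL bucket.
render_federated_store() {
  local bucket="$1" region="$2"
  cat <<EOF
type: S3
config:
  bucket: "${bucket}"
  endpoint: "s3.amazonaws.com"
  region: "${region}"
  insecure: false
  signature_version2: false
EOF
}

# Usage (secret and file names are assumptions):
#   render_federated_store "${KC_BUCKET}" "${AWS_REGION}" > federated-store.yaml
#   kubectl create namespace "${RELEASE}"
#   kubectl create secret generic kubecost-object-store \
#     -n "${RELEASE}" --from-file federated-store.yaml
```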
Step 3: set up IRSA to allow Kubecost and Prometheus to read and write metrics from HAQM Managed Service for Prometheus
The following commands automate these tasks:
- Create an AWS IAM role with the AWS managed IAM policy and trust policy for the following service accounts: kubecost-cost-analyzer-amp, kubecost-prometheus-server-amp.
- Annotate the existing Kubernetes service accounts to attach the new AWS IAM role.
Run the following command in your workspace:
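A sketch using `eksctl`, assuming the environment variables from the earlier step are set. The function name is hypothetical; the two policy ARNs are the AWS managed policies for querying and remote-writing to AMP. For the federated setup you would also attach the S3 policy created in step 2 with an additional `--attach-policy-arn`:

```bash
# Create (or update) one IRSA-enabled service account for Kubecost.
create_kubecost_irsa() {
  local sa_name="$1"
  eksctl create iamserviceaccount \
    --name "${sa_name}" \
    --namespace "${RELEASE}" \
    --cluster "${YOUR_CLUSTER_NAME}" \
    --region "${AWS_REGION}" \
    --attach-policy-arn "arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess" \
    --attach-policy-arn "arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess" \
    --override-existing-serviceaccounts \
    --approve
}

# Usage:
#   create_kubecost_irsa "kubecost-cost-analyzer-amp"
#   create_kubecost_irsa "kubecost-prometheus-server-amp"
```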
For more information, see the AWS documentation on AWS IAM roles for service accounts, and learn more about the HAQM Managed Service for Prometheus managed policies at Identity-based policy examples for HAQM Managed Service for Prometheus.
Integrating Kubecost with HAQM Managed Service for Prometheus
Prepare the configuration file
Run the following command to create a file called config-values.yaml, which contains the defaults that Kubecost uses for connecting to your HAQM Managed Service for Prometheus workspace.
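The file contents are not reproduced here. The sketch below assumes the Kubecost Helm chart's `global.amp` and `sigV4Proxy` value keys; only `global.amp.prometheusServerEndpoint` and `global.amp.remoteWriteService` are confirmed by the Helm command later in this post, so treat the remaining keys, the port 8005, and the function name as assumptions to verify against your chart version:

```bash
# Print Helm values that point Kubecost at the AMP workspace.
render_config_values() {
  local region="$1" workspace_id="$2"
  cat <<EOF
global:
  amp:
    enabled: true
    prometheusServerEndpoint: http://localhost:8005/workspaces/${workspace_id}
    remoteWriteService: https://aps-workspaces.${region}.amazonaws.com/workspaces/${workspace_id}/api/v1/remote_write
    sigv4:
      region: ${region}
sigV4Proxy:
  region: ${region}
  host: aps-workspaces.${region}.amazonaws.com
EOF
}

# Usage:
#   render_config_values "${AWS_REGION}" "${AMP_WORKSPACE_ID}" > config-values.yaml
```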
Primary cluster
Run this command to install Kubecost and integrate it with the HAQM Managed Service for Prometheus workspace as the primary:
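The primary-cluster command is not reproduced here. The sketch below mirrors the flags of the additional-cluster command shown later in this post and is an assumption, not the authors' exact command. The federation values file for the primary cluster is passed as an optional argument because its location is not given here; omit it for the small-scale setup. The function name is hypothetical:

```bash
# Install or upgrade Kubecost on the primary cluster.
# $1 (optional): path or URL of a federation values file for the primary cluster.
install_kubecost_primary() {
  local federation_values="${1:-}"
  local extra=()
  if [ -n "${federation_values}" ]; then
    extra=(-f "${federation_values}")
  fi
  helm upgrade -i "${RELEASE}" \
    oci://public.ecr.aws/kubecost/cost-analyzer --version "${VERSION}" \
    --namespace "${RELEASE}" --create-namespace \
    -f http://tinyurl.com/kubecost-amazon-eks \
    -f config-values.yaml \
    "${extra[@]}" \
    --set global.amp.prometheusServerEndpoint="${QUERYURL}" \
    --set global.amp.remoteWriteService="${REMOTEWRITEURL}" \
    --set kubecostProductConfigs.clusterName="${YOUR_CLUSTER_NAME}" \
    --set kubecostProductConfigs.projectID="${AWS_ACCOUNT_ID}" \
    --set prometheus.server.global.external_labels.cluster_id="${YOUR_CLUSTER_NAME}" \
    --set serviceAccount.create=false \
    --set prometheus.serviceAccounts.server.create=false \
    --set serviceAccount.name=kubecost-cost-analyzer-amp \
    --set prometheus.serviceAccounts.server.name=kubecost-prometheus-server-amp
}

# Usage:
#   install_kubecost_primary                          # small-scale setup
#   install_kubecost_primary "<PRIMARY_FEDERATION_VALUES_FILE>"
```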
Additional clusters
The installation steps are similar to those for the primary cluster, except that you skip the section Create HAQM Managed Service for Prometheus workspace and update the environment variables to match each additional cluster. Note that AMP_WORKSPACE_ID and KC_BUCKET stay the same as for the primary cluster.
Run this command to install Kubecost and integrate it with the HAQM Managed Service for Prometheus workspace as the additional cluster:
```bash
# For the small-scale (Quick-Start) infrastructure, remove the
# "-f .../etl-federation/agent-federated.yaml" line below.
helm upgrade -i ${RELEASE} \
  oci://public.ecr.aws/kubecost/cost-analyzer --version $VERSION \
  --namespace ${RELEASE} --create-namespace \
  -f http://tinyurl.com/kubecost-amazon-eks \
  -f config-values.yaml \
  -f http://raw.githubusercontent.com/kubecost/poc-common-configurations/main/etl-federation/agent-federated.yaml \
  --set global.amp.prometheusServerEndpoint=${QUERYURL} \
  --set global.amp.remoteWriteService=${REMOTEWRITEURL} \
  --set kubecostProductConfigs.clusterName=${YOUR_CLUSTER_NAME} \
  --set kubecostProductConfigs.projectID=${AWS_ACCOUNT_ID} \
  --set prometheus.server.global.external_labels.cluster_id=${YOUR_CLUSTER_NAME} \
  --set serviceAccount.create=false \
  --set prometheus.serviceAccounts.server.create=false \
  --set serviceAccount.name=kubecost-cost-analyzer-amp \
  --set prometheus.serviceAccounts.server.name=kubecost-prometheus-server-amp \
  --set federatedETL.federator.useMultiClusterDB=true
```
Monitoring costs of your multi-cluster infrastructure
Expose Kubecost dashboard
After you install Kubecost on the primary cluster and all additional clusters, you can switch back to your primary cluster and run the following command to expose the Kubecost dashboard:
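A port-forward is one way to do this. The sketch assumes the Helm release name is in `RELEASE`, so the deployment is named `${RELEASE}-cost-analyzer` (for example `kubecost-cost-analyzer`). The function name is hypothetical:

```bash
# Forward local port 9090 to the Kubecost cost-analyzer deployment.
# Runs in the foreground until interrupted with Ctrl+C.
expose_kubecost_dashboard() {
  kubectl port-forward --namespace "${RELEASE}" \
    "deployment/${RELEASE}-cost-analyzer" 9090
}

# Usage:
#   expose_kubecost_dashboard
```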
On your web browser, navigate to http://localhost:9090 to access the dashboard.
You can now start monitoring your HAQM EKS cluster cost and efficiency. Depending on your organization’s requirements and setup, there are several options to expose Kubecost for ongoing internal access. You can also check this AWS workshop to learn how to expose Kubecost using AWS Load Balancer Controller.
Using Kubecost dashboard
When you access Kubecost dashboard, the default Overview view shows you comprehensive information about all HAQM EKS clusters monitored by Kubecost with HAQM Managed Service for Prometheus (active) and a list of unmonitored HAQM EKS clusters (unmonitored). You can see it in the following example screenshot:
In the Monitor/Allocation view, Kubecost provides granular visibility into the costs of your multiple HAQM EKS clusters, aggregated by Kubernetes contexts such as namespaces, controllers, pods, or labels. This helps you understand which parts of your applications or projects contribute to HAQM EKS spend. The following screenshot shows an example of HAQM EKS cluster cost aggregated by Namespace.
Additionally, to monitor your AWS services costs in one platform, you can integrate Kubecost with your AWS Cost and Usage reports and enable Cloud Costs to see the costs of each AWS service across your AWS accounts. The following example screenshot shows the cost of each AWS service in the Monitor/Cloud Costs view.
Additional usage with HAQM Managed Service for Prometheus
Because all cost metrics emitted by Kubecost are centrally stored and managed in HAQM Managed Service for Prometheus for multiple HAQM EKS clusters, you can integrate with other observability tools supported by HAQM Managed Service for Prometheus to use that data. For example, you can write custom cost-related PromQL queries and visualize them on HAQM Managed Grafana, or use Alert Manager in multi-cluster mode. You can learn more about these integrations at Using AWS Observability Accelerator. To learn more about the HAQM Managed Service for Prometheus service quotas, refer to the documentation at HAQM Managed Service for Prometheus service quotas.
Cleaning up
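The clean-up commands are not reproduced in this section. As a hedged sketch, assuming the resources were created as in the walkthrough and the same environment variables are still set, the helpers below (hypothetical names) remove them. Deleting the workspace and the bucket permanently removes the stored metrics and ETL data, so run them only when you no longer need that data:

```bash
# Remove the Kubecost Helm release from the current cluster.
uninstall_kubecost() {
  helm uninstall "${RELEASE}" --namespace "${RELEASE}"
}

# Delete the IAM role and service account created for one IRSA binding.
delete_kubecost_irsa() {
  local sa_name="$1"
  eksctl delete iamserviceaccount --name "${sa_name}" \
    --namespace "${RELEASE}" --cluster "${YOUR_CLUSTER_NAME}" --region "${AWS_REGION}"
}

# Delete the AMP workspace (all stored metrics are lost).
delete_amp_workspace() {
  aws amp delete-workspace --workspace-id "${AMP_WORKSPACE_ID}" --region "${AWS_REGION}"
}

# Delete the ETL bucket and everything in it (federated setup only).
delete_etl_bucket() {
  aws s3 rb "s3://${KC_BUCKET}" --force
}

# Usage, per cluster first, then shared resources once:
#   uninstall_kubecost
#   delete_kubecost_irsa "kubecost-cost-analyzer-amp"
#   delete_kubecost_irsa "kubecost-prometheus-server-amp"
#   delete_amp_workspace
#   delete_etl_bucket
```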
Conclusion
In this post, we showed you how to use Kubecost to monitor multi-cluster HAQM EKS environments using HAQM Managed Service for Prometheus as the metrics store, so you don’t have to manage your own infrastructure to store Kubecost data. In collaboration with Kubecost, we’re excited to release this new feature, which lets you monitor and track costs across multiple HAQM EKS clusters in a single pane of glass. This setup offers rich features exclusively to HAQM EKS customers with no additional Kubecost license required, and includes Kubecost troubleshooting support. If you have Kubecost’s Enterprise license, additional features are enabled, such as Governance features that let you set budget rules for different projects or audit costly deployments on your HAQM EKS cluster. Enterprise licenses are available from Kubecost or through AWS Marketplace. If you would like to learn more from the Kubecost team, contact them here.
Other useful resources for AWS Observability:
- Hands-on workshop for all AWS Observability services – One Observability Workshop
- Terraform-based, easy-to-use Observability setup for HAQM EKS – AWS Observability Accelerator
- AWS Observability Best Practices Guide
Linh Lam, Solutions Architect, Kubecost
Linh Lam is a Kubecost Solution Architect, ISV, focusing on integration and building solutions for customers. He is also passionate about application modernization, serverless, and container technology. Outside of work he enjoys hiking, camping, and building his home audio systems.