AWS Cloud Operations Blog

Monitoring Windows desktops on HAQM WorkSpaces using HAQM Managed Service for Prometheus and HAQM Managed Grafana

Many organizations use HAQM WorkSpaces as a cloud-based virtual Windows Desktop-as-a-Service (DaaS) solution to replace their traditional desktop estate and shift the cost and effort of maintaining laptops and desktops to a cloud pay-as-you-go model. Customers running HAQM WorkSpaces need managed services to monitor the operations of their WorkSpaces environment. A cloud-based, managed, open source monitoring solution built on HAQM Managed Service for Prometheus and HAQM Managed Grafana helps IT teams quickly set up and operate monitoring while saving costs. Monitoring CPU, memory, network, and disk activity from HAQM WorkSpaces eliminates guesswork when troubleshooting the HAQM WorkSpaces environment, whether in real time or after an issue has occurred.

A managed monitoring solution for your HAQM WorkSpaces Windows OS environments yields the following organizational benefits:

  • Service desk staff can quickly identify and drill down to HAQM WorkSpaces issues that need investigation.
  • Service desk staff can investigate HAQM WorkSpaces issues after the event using the historical data in HAQM Managed Service for Prometheus.
  • Service desk teams can shorten or eliminate long calls that waste time questioning business users on HAQM WorkSpaces issues.

In this post, we’ll set up HAQM Managed Service for Prometheus, HAQM Managed Grafana, and a Prometheus server on HAQM Elastic Compute Cloud (HAQM EC2) to provide a monitoring solution for HAQM WorkSpaces. We’ll automate the deployment of Prometheus agents on any new HAQM WorkSpaces using Active Directory Group Policy Objects (GPO).

Solution Architecture

The following diagram demonstrates the solution to monitor your HAQM WorkSpaces environment using AWS native managed services, such as HAQM Managed Service for Prometheus and HAQM Managed Grafana. This solution deploys a Prometheus server on an HAQM EC2 instance, which periodically polls the Prometheus agents on your HAQM WorkSpaces Windows 10 instances and remote writes the metrics to HAQM Managed Service for Prometheus. The EC2 instance is necessary to pull the data from the WorkSpaces and forward it to HAQM Managed Service for Prometheus. We'll use HAQM Managed Grafana to query and visualize metrics from your HAQM WorkSpaces infrastructure.

Solution Walkthrough

These steps will deploy the solution in your environment.

Prerequisites

You’ll need the following to complete the steps in this post:

Creating AWS resources for our solution

Let’s start by setting a few environment variables in the AWS CLI:

AWS_REGION=us-east-1  #Change to your appropriate AWS region.
AMP_WORKSPACE_NAME=MYWorkspacename
WORKSPACES_VPCID=<VPC ID of your workspaces environment>
WORKSPACES_DIRECTORY=$(aws workspaces describe-workspace-directories \
  --query Directories[].DirectoryId \
  --output text)
WORKSPACES_USER=<valid active directory username>
WORKSPACES_BUNDLE=wsb-8vbljg4r6 #  Standard bundle with Windows 10 (Server 2016 based) (PCoIP)

We'll use a default Windows WorkSpaces bundle for this solution. You can review the available WorkSpaces bundles and pick a different one using the AWS CLI, as shown below.
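
The following command is one way to list the HAQM-provided bundles in your Region so you can choose a bundle ID; the query expression is illustrative, so adjust the fields to your needs:

aws workspaces describe-workspace-bundles --owner HAQM \
  --query "Bundles[].[BundleId,Name]" --output table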

Next, we'll use the prometheusmonitoring.sh shell script from the aws-o11y-recipes GitHub repository to create the AWS resources required for this demonstration. Run the following commands:

git clone http://github.com/aws-observability/aws-o11y-recipes.git
cd aws-o11y-recipes/docs/recipes/WorkSpaces-Monitoring-AMP-AMG/
# Running the script to create AWS resources for the solution.
chmod +x prometheusmonitoring.sh
. ./prometheusmonitoring.sh

The automation script creates the AWS resources needed for this demonstration, including the HAQM Managed Service for Prometheus workspace, the Prometheus EC2 server and its key pair, and the HAQM WorkSpaces environment.

Wait a few minutes for the prometheusmonitoring.sh script to complete. You can modify prometheusmonitoring.sh as needed for different CIDRs, existing VPCs, or other changes.

Testing HAQM Managed Service for Prometheus WorkSpace

To test whether HAQM Managed Service for Prometheus is ready for metrics, use awscurl. This tool enables you to send HTTPS requests through the command line with AWS Sigv4 authentication. Therefore, you must have AWS credentials set up locally with the correct permissions to query from HAQM Managed Service for Prometheus. For instructions on installing this tool, see awscurl.
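
If you don't already have awscurl, it's distributed on PyPI; a minimal install sketch, assuming Python 3 and pip are available:

pip3 install awscurl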

AMP_WORKSPACE_ID=$(aws amp list-workspaces --alias $AMP_WORKSPACE_NAME \
   --region=${AWS_REGION} --query 'workspaces[0].[workspaceId]' --output text)
AMP_WORKSPACE_URL=$(aws amp describe-workspace --workspace-id $AMP_WORKSPACE_ID | jq .workspace.prometheusEndpoint -r)
AMP_QUERY=${AMP_WORKSPACE_URL}api/v1/query?query=up  

awscurl --service="aps" --region=$AWS_REGION $AMP_QUERY
# the output should look like this, no data is sent at this time so the result will be empty
{"status":"success","data":{"resultType":"vector","result":[]}}

Configuring Prometheus on the HAQM EC2 Server

First, run the following code to obtain your WorkSpaces IP addresses for the Prometheus server:

# Build the Prometheus scrape targets line from the WorkSpaces private IP addresses.
# Note: the grep assumes your WorkSpaces use a 172.x CIDR; port 9182 is the Windows Exporter default.
WORKSPACE_IP_FIELDS=$(aws workspaces describe-workspaces --query \
WorkSpaces[*].[IpAddress,UserName] | grep 172 | /usr/bin/sed \
  -e "s/\"//" -e "s/..$//" -e "s/$/:9182', /" -e "s/^//"   \
  -e 's/\n//' -e "s/172/'172/" )
WORKSPACE_IP=`echo $WORKSPACE_IP_FIELDS | /usr/bin/sed -e "s/.$//" \
  -e "s/$//"  -e "s/$/]/"  -e "s/^/- targets: [/"`

Next, we must configure Prometheus on the HAQM EC2 instance to remote write metrics to HAQM Managed Service for Prometheus by adding the HAQM Managed Service for Prometheus URL to the /etc/prometheus/prometheus.yml file. First, connect to your EC2 Prometheus server and log on using the key pair created by our automation.
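
A hedged connection example, assuming an HAQM Linux-based instance (the ec2-user login) and the key pair file generated by the automation script; the key file name and instance address below are placeholders you must substitute:

# Connect to the Prometheus EC2 instance (key file name and address are assumptions).
ssh -i ./<your-keypair-file>.pem ec2-user@<Prometheus-EC2-IP-or-DNS>

Once logged on, run the following to write the Prometheus configuration that points at your WorkSpaces targets and your HAQM Managed Service for Prometheus workspace: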

sudo su
AWS_REGION=us-east-1 #Change to your appropriate AWS region.
AMP_WORKSPACE_NAME=MYWorkspacename
AMP_WORKSPACE_ID=<populate this value from the section above>
WORKSPACE_IP="<populate this value from the section above>" # WORKSPACE_IP needs to be in quotes
cat > /etc/prometheus/prometheus.yml << EOF
## Prometheus server configuration that enables remote write to HAQM Managed Service for Prometheus
global:
  scrape_interval: 60s
  external_labels:
    monitor: 'prometheus'

# add the workspaces IP:9182 in the targets line separated by commas
scrape_configs:
    - job_name: 'prometheus'
      static_configs:
          $WORKSPACE_IP
          #- targets: ['localhost:9182']

#adjust as needed
remote_write:
  -
    url: http://aps-workspaces.${AWS_REGION}.amazonaws.com/workspaces/${AMP_WORKSPACE_ID}/api/v1/remote_write
    queue_config:
     max_samples_per_send: 1000
     max_shards: 200
     capacity: 2500
    sigv4:
     region: $AWS_REGION
EOF

# Review the output and verify the changes were saved.
cat /etc/prometheus/prometheus.yml

# Enable and restart Prometheus.
systemctl enable prometheus
systemctl restart prometheus

# You can now log off the EC2 server; the baseline EC2 server configuration is complete.
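
Optionally, before logging off, confirm that the Prometheus service started cleanly and is scraping your WorkSpaces targets; a short sketch using standard Prometheus endpoints (jq is optional):

systemctl status prometheus --no-pager
# Readiness check and a quick look at the health of the configured scrape targets.
curl -s http://localhost:9090/-/ready
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[].health'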

Active Directory Setup with Group policies to push Prometheus agents to HAQM WorkSpaces

Log in to one of the HAQM WorkSpaces created by the automation using the HAQM WorkSpaces client or HAQM WorkSpaces web access. A Group Policy Object (GPO) can be used to install the Prometheus Windows Exporter agent. Download the latest Windows Exporter release and save it locally on the HAQM WorkSpace that you're using to prepare the GPO. This can be done in PowerShell with the following command, and the resulting file will be placed in C:\Temp.

(New-Object System.Net.WebClient).DownloadFile("http://github.com/prometheus-community/windows_exporter/releases/download/v0.18.1/windows_exporter-0.18.1-amd64.msi", "C:\Temp\windows_exporter-0.18.1-amd64.msi")  
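
If C:\Temp doesn't already exist on the WorkSpace, the download will fail; you can create the folder first with a small PowerShell sketch:

# Create C:\Temp if it doesn't exist (no effect when it already does).
New-Item -ItemType Directory -Force -Path "C:\Temp" | Out-Null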

Next, you'll install the downloaded software on this HAQM WorkSpace to use as a template for all of the other WorkSpaces. Navigate to the location of the software and install it by double-clicking the .msi file.

Then, verify that the windows_exporter service is installed by checking the Windows Services listing. Type Services in the search box, and verify that windows_exporter is listed as a service.
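
You can also verify the exporter from PowerShell instead of the Services console; a quick sketch that checks the service state and confirms metrics are being served on the default port 9182:

Get-Service windows_exporter
# The metrics endpoint should return an HTTP 200 with Prometheus-format text.
(Invoke-WebRequest -UseBasicParsing -Uri "http://localhost:9182/metrics").StatusCode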

Continue by following these steps to create the Group Policy Object that installs the Prometheus Windows Exporter on all HAQM WorkSpaces. Upload the Prometheus Windows Exporter MSI to a share accessible by all HAQM WorkSpaces, and note its location over the network. This is usually in a format such as \\servername\sharename. If you're unfamiliar with setting up Windows shares over Active Directory, then we encourage you to use HAQM FSx for Windows File Server for this by leveraging Use Case 2 in that post.

Open the Group Policy Management tool using an account with Windows administrator permissions. Create a new Group Policy Object by right-clicking and selecting New. Then right-click the new GPO and select Edit.

Navigate to Computer Configuration > Policies > Software Settings and create a new software installation with a right-click, and then select New Package.

Create a Software Installation configuration using the share that is accessible for every HAQM WorkSpaces in the Active Directory Domain. Browse to the share created previously and select the Windows Exporter MSI file. Note that the drive path is a network format of \\servername\sharename and not a local drive path (such as C:).

You can see the Group Policy Management Editor shows the Share path in the Source path.

Create a Services action to set an automatic (delayed start) and apply a restart service action to the Prometheus Exporter agent. Navigate to Computer Configuration > Preferences > Control Panel Settings > Services as shown in the following screenshot:

Next, select New Service:

Next, select the three dot button, and then select windows_exporter:

Next, change the Startup to Automatic (Delayed Start):

Next, select the Recovery tab and set the failure responses to Restart the Service for all three failure events, as shown in this screenshot:

Exit the GPO Editor by selecting the X in the top right. Right-click on the Organizational Unit (OU) that contains the HAQM WorkSpaces in the Group Policy Management tool, and then select Link an Existing GPO:

Next, select the Group Policy that you just created for linking to the WorkSpaces OU, and select OK.

Next, right-click and verify the GPO is Enforced and Link Enabled. This will make sure that the software is installed on WorkSpaces. Exit the Group Policy Management tool.

After the GPO is linked to the WorkSpaces OU, HAQM WorkSpaces will automatically install the software on the next reboot or Group Policy replication cycle, as long as they are in the WorkSpaces OU in Active Directory.

If you want to apply the Group Policy without waiting for replication, you can update it on an HAQM WorkSpace by running the following command on your Windows 10 HAQM WorkSpaces instance. The Group Policy update applies immediately, although the Windows 10 OS may need to be rebooted:

gpupdate /force
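
After the policy applies (and the WorkSpace reboots, if required), you can confirm from PowerShell that the GPO installed the exporter and that it is listening; a hedged sketch:

Get-Service windows_exporter
Test-NetConnection -ComputerName localhost -Port 9182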

HAQM Managed Grafana Setup

Two steps are necessary to set up HAQM Managed Grafana, log in, and query metrics from the HAQM Managed Service for Prometheus workspace. First, to set up authentication and authorization, follow the instructions in the HAQM Managed Grafana User Guide for enabling AWS IAM Identity Center. Second, set up the data source for HAQM Managed Service for Prometheus. You may also reference the Monitor Istio on EKS using HAQM Managed Prometheus and HAQM Managed Grafana blog post, starting from the AWS Single Sign-On (SSO) section, for the HAQM Managed Grafana setup.
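
If you prefer the AWS CLI to the console for creating the Grafana workspace, a sketch along these lines should work, assuming AWS IAM Identity Center is already enabled and a service-managed permission type is acceptable (the workspace name is an example; adjust the options for your environment):

aws grafana create-workspace \
  --workspace-name workspaces-monitoring \
  --account-access-type CURRENT_ACCOUNT \
  --authentication-providers AWS_SSO \
  --permission-type SERVICE_MANAGED \
  --region $AWS_REGION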

Querying Windows Metrics

Let’s import a Grafana dashboard which lets us visualize metrics from the Windows 10 instances in WorkSpaces. Go to the plus sign on the left navigation bar, and select Import as shown in the following:

In the Import screen, type 12422 in the Import via grafana.com textbox and select the Prometheus data source in the drop-down at the bottom. Then, select Import. Once complete, you can see the Grafana dashboard showing metrics from the HAQM WorkSpaces through HAQM Managed Service for Prometheus data source as shown in the following. The WorkSpaces IP addresses will be selectable by a dropdown box in the upper-left.

Troubleshooting

This solution has many components that require correct networking to communicate. The following are some troubleshooting tips in case of issues with this setup:

  • The network path must be open between the Prometheus EC2 server, VPCs, HAQM WorkSpaces, and HAQM Managed Service for Prometheus. Use typical network troubleshooting tools such as VPC Reachability Analyzer, ping, and traceroute. Remember to verify security groups and network access control lists (NACLs) on your VPC along the network path.
  • The Prometheus server is memory intensive, and you may need to adjust your HAQM EC2 instance type to reflect your specific environment. Consider leveraging AWS Compute Optimizer for your environment.
  • The EC2 Prometheus server stores data from HAQM WorkSpaces. With several hundred WorkSpaces exporters, data storage can reach several gigabytes in the default configuration. The EC2 instance's configuration retains four hours of data. If the server stores more data than expected, adjust the retention time and restart the systemd Prometheus service (see the retention sketch at the end of this section). The data is stored under the /var/lib/prometheus/ path. The solution leverages the scalability of HAQM Managed Service for Prometheus to store and process the data.
  • To test whether HAQM Managed Service for Prometheus received the metrics, use awscurl. Refer to the instructions in the Testing HAQM Managed Service for Prometheus WorkSpace section for setup.
AMP_WORKSPACE_URL=$(aws amp describe-workspace --workspace-id $AMP_WORKSPACE_ID | jq .workspace.prometheusEndpoint -r)
AMP_QUERY=${AMP_WORKSPACE_URL}api/v1/query?query=up
  
awscurl --service="aps" --region=$AWS_REGION $AMP_QUERY

# the output should look like this and reference the IP of the workspace(s)

{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "up",
                    "instance": "172.16.0.26:9182",
                    "job": "prometheus",
                    "monitor": "prometheus"
                },
                "value": [
                    1660084860.224,
                    "1"
                ]
            }
        ]
    }
}
  • To verify the Prometheus EC2 server, check the server URL http://PROMETHEUSFORWARDERSERVER:9090/targets to see that targets are appearing. This verifies that the endpoints are configured and shows their status. This URL should be used from a Windows 10 WorkSpace whose security group allows access to this port on the HAQM EC2 server.
  • To verify that HAQM Managed Service for Prometheus and HAQM Managed Grafana are working correctly, check that data is coming in from the forwarder server by running manual queries in Grafana. Confirm that the data from the Prometheus EC2 forwarder is arriving in the HAQM Managed Service for Prometheus workspace by running a PromQL query in Grafana using the following instructions:

Choose Explore from the menu on the left in the Grafana Console and use the Metrics Browser to find process_cpu_seconds_total, and then select the blue box labeled Use Query at the bottom. Make sure that your Data Source is selected in the top drop-down box to the right of the Explore icon.

Next, you should see data in the graph results. If you don’t see data after a few minutes, then there is a problem in the data flow which you might have to troubleshoot.

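As noted in the retention tip earlier in this section, if the Prometheus EC2 server stores more local data than expected, you can shorten the TSDB retention time and restart the service. A minimal sketch, assuming Prometheus runs as a systemd service on the instance; the exact unit file location and existing flags will vary with your installation:

# Edit the Prometheus systemd unit (or an override) and add or change the retention flag
# on the ExecStart line, for example:
#   --storage.tsdb.retention.time=2h --storage.tsdb.path=/var/lib/prometheus/
sudo systemctl daemon-reload
sudo systemctl restart prometheus
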
Cleaning Up

You will continue to incur costs until you delete the infrastructure that you created for this post. Use the following commands to clean up the AWS resources created for this demonstration. We've provided a cleanup.sh script in the cloned repo; use it to remove all of the AWS services created by this solution. Verify that the variables used to create the solution are still set before running the cleanup.

chmod +x cleanup.sh
. ./cleanup.sh

Second, delete the HAQM WorkSpaces environment you created using the quickstart. The steps are in the WorkSpaces documentation. If you wish to keep the HAQM WorkSpaces environment, delete the GPO from your Active Directory environment to avoid the installation of Prometheus exporter.

Finally, navigate to the HAQM Managed Grafana console to delete the created HAQM Managed Grafana workspace.
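
If you prefer to script this last step, the HAQM Managed Grafana workspace can also be removed with the AWS CLI; a sketch that assumes you can identify the workspace by the example name used earlier:

# Look up the Grafana workspace ID by name (the name filter is illustrative), then delete it.
GRAFANA_WS_ID=$(aws grafana list-workspaces \
  --query "workspaces[?name=='workspaces-monitoring'].id" --output text)
aws grafana delete-workspace --workspace-id $GRAFANA_WS_ID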

Conclusion

In this post, we demonstrated a solution to monitor your HAQM WorkSpaces environment using AWS native managed services, such as HAQM Managed Service for Prometheus and HAQM Managed Grafana. This solution deployed a Prometheus server on the HAQM EC2 instance, which polls the Prometheus agents on your HAQM WorkSpaces environment periodically and remote writes metrics to HAQM Managed Service for Prometheus. We used Active Directory group policies in this solution to make a seamless deployment of Prometheus Agents to new HAQM WorkSpaces. We also used HAQM Managed Grafana to query and visualize metrics on your HAQM WorkSpaces infrastructure. Learn even more about monitoring your EC2 instances using HAQM Managed Service for Prometheus. For more information and hands-on experience with HAQM Managed Grafana, check out the interactive and immersive One Observability Workshop.

About the authors

Elamaran Shanmugam

Elamaran (Ela) Shanmugam is a Sr. Container Specialist Solutions Architect with HAQM Web Services. Ela is a Container, Observability, and Multi-Account Architecture SME and helps AWS customers design and build scalable, secure, and optimized container workloads on AWS. His passion is building and automating infrastructure to allow customers to focus more on their business. He is based out of Tampa, Florida, and you can reach him on Twitter @IamElaShan.

Kevin Cox

Kevin Cox is a Cloud Infrastructure Architect in the World Wide Public Sector (WWPS) NonProfit Health (NPH) practice. He has experience in broad areas of technology, with a strong technical background, leadership experience, strategic and tactical focus, and an understanding of the commercial, non-profit, and public sectors.