AWS Parallel Computing Service FAQs

General

AWS PCS is a managed service that makes it easy to run and scale high performance computing (HPC) workloads and build scientific and engineering models on AWS using Slurm. Use AWS PCS to build compute clusters that integrate AWS compute, storage, networking, and visualization. Run simulations or build scientific and engineering models. Streamline and simplify your cluster operations using built-in management and observability capabilities. Empower your users to focus on research and innovation by enabling them to run their applications and jobs in a familiar environment.

AWS PCS is currently available in the following Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).

AWS PCS currently supports Slurm, a popular open source job scheduler and workload manager.

Slurm is a popular open source scheduler for managing distributed HPC workloads.

AWS PCS works by provisioning a managed Slurm controller, operating the scaling logic, and launching compute nodes for you.

Without AWS PCS, you need to run a Slurm controller on a provisioned head node, launch several compute nodes, and manage fleet operations to scale capacity to match the demand in your job queues. With AWS PCS, you simply define your job queues and compute preferences. The service is built to manage the Slurm controller and handle fleet scaling in a highly available and secure configuration. This removes operational burden and lets you focus on simulations or science instead of managing AWS infrastructure.
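
As a rough illustration of the fleet-scaling work described above, the sketch below computes a target instance count from queue demand and clamps it to configured bounds. It is not the algorithm AWS PCS uses; `NodeGroupLimits`, `target_instance_count`, and the numbers are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class NodeGroupLimits:
    """Hypothetical scaling bounds for one compute node group."""
    min_instances: int
    max_instances: int


def target_instance_count(pending_nodes: int, running_nodes: int,
                          limits: NodeGroupLimits) -> int:
    """Return how many instances the group should have.

    pending_nodes: nodes requested by jobs still waiting in the queue.
    running_nodes: nodes currently allocated to running jobs.
    """
    wanted = running_nodes + pending_nodes
    # Clamp total demand to the configured minimum and maximum.
    return max(limits.min_instances, min(wanted, limits.max_instances))


limits = NodeGroupLimits(min_instances=0, max_instances=8)
print(target_instance_count(pending_nodes=3, running_nodes=2, limits=limits))   # 5
print(target_instance_count(pending_nodes=20, running_nodes=2, limits=limits))  # 8
print(target_instance_count(pending_nodes=0, running_nodes=0, limits=limits))   # 0
```

A self-managed cluster would need to run logic like this continuously and then launch or terminate instances to match the result.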

AWS PCS provisions Amazon Elastic Compute Cloud (Amazon EC2) instances in your account. This means you can take advantage of Amazon EC2 purchase options (On-Demand, Spot) and pricing constructs (Instance Savings Plans, other discounts) and optimize that capacity through AWS PCS.

AWS PCS builds environments using services such as Amazon EC2, Amazon Elastic Block Store (Amazon EBS), Elastic Fabric Adapter (EFA), Amazon Elastic File System (Amazon EFS), Amazon FSx, NICE DCV, and Amazon Simple Storage Service (Amazon S3) to configure the compute, visualization, storage, and networking infrastructure to run HPC workloads on AWS.

AWS PCS uses service-linked roles and managed AWS Identity and Access Management (IAM) policies for fine-grained access control. It delivers metrics and application logs to Amazon CloudWatch and emits auditable events to AWS CloudTrail. The service supports LDAP-based user authentication and authorization for Amazon EC2 instances. It can integrate with EC2 Image Builder for Amazon Machine Image (AMI) build automation. Finally, the service supports AWS CloudFormation so you can deploy and manage AWS PCS clusters and associated infrastructure.

AWS PCS is designed for a wide range of scientific and engineering workloads such as computational fluid dynamics, weather modeling, finite element analysis, electronic design automation, and reservoir simulations. AWS PCS is built to support traditional HPC customers across verticals (such as mechanical, energy, aerospace, electronics, oil and gas, weather, and public sector) that run compute or data-intensive simulations to validate their models and designs.

Scientific and engineering modeling and simulation, and high performance data analytics (HPDA) workloads are a good fit for AWS PCS.

See the AWS PCS Service Level Agreement (SLA) for details.

Features

AWS PCS supports nearly all of the EC2 instance types available in the Region in which you are using AWS PCS.

If you have a Savings Plan, it will automatically be applied to the EC2 instances that AWS PCS launches in your account. If you have one or more capacity reservations, you can configure AWS PCS to use them through API parameters.

Yes, you can use AWS PCS to run workloads using GPUs, AWS Trainium, and AWS Inferentia instance types.

AWS PCS supports Amazon EFS, Amazon EBS, Amazon FSx for Lustre, Amazon FSx for NetApp ONTAP, Amazon FSx for OpenZFS, Amazon S3, Mountpoint for Amazon S3, and Amazon File Cache. You can also connect to your own self-managed storage resources. See the documentation.

AWS PCS supports a wide range of EC2 instances with advanced networking options, including use of EFA. The service supports isolated subnets, AWS PrivateLink, and Amazon Virtual Private Cloud (Amazon VPC) endpoints.

With AWS PCS, you can create compute and login node groups that launch EC2 instances either in a single Availability Zone or across multiple Availability Zones.

Yes, you can configure your AWS PCS compute node groups to work with directory services such as Microsoft Active Directory, Microsoft Entra ID, and OpenLDAP.

Yes. You can start with any AMI that meets the AWS PCS AMI specification and install the AWS PCS client on it. You can review the AWS PCS AMI specification in the documentation. We also provide a sample AMI that you can use to try out the service, as described in the documentation.

AWS PCS is compatible with Amazon Linux 2, Ubuntu 22.04, Red Hat Enterprise Linux 9 (RHEL 9), and Rocky Linux 9.

Yes. You can build custom AMIs for AWS PCS based on Amazon Linux 2 and Ubuntu 22.04 Deep Learning AMIs (DLAMI).

Yes, AWS PCS sets AWS tags at both the cluster and compute node group level, so you can track historical Amazon EC2 spend at those granularities.
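
To show how such tags could be used, here is a minimal sketch that totals cost records by tag value. The tag keys `aws:pcs:cluster-id` and `aws:pcs:compute-node-group-id` are assumptions to verify against the tags on your own instances, and the cost records are made-up sample data standing in for a billing export.

```python
from collections import defaultdict

# Assumed tag keys; confirm them against the tags on your EC2 instances.
CLUSTER_TAG = "aws:pcs:cluster-id"
NODE_GROUP_TAG = "aws:pcs:compute-node-group-id"


def spend_by_tag(records, tag_key):
    """Sum cost per value of tag_key; untagged records are grouped together."""
    totals = defaultdict(float)
    for record in records:
        key = record["tags"].get(tag_key, "untagged")
        totals[key] += record["cost_usd"]
    return dict(totals)


# Made-up sample records, for illustration only.
records = [
    {"cost_usd": 12.5, "tags": {CLUSTER_TAG: "cluster-a", NODE_GROUP_TAG: "cng-1"}},
    {"cost_usd": 7.5, "tags": {CLUSTER_TAG: "cluster-a", NODE_GROUP_TAG: "cng-2"}},
    {"cost_usd": 4.0, "tags": {}},
]

print(spend_by_tag(records, CLUSTER_TAG))
print(spend_by_tag(records, NODE_GROUP_TAG))
```

Grouping by the cluster key gives per-cluster spend, while the node group key splits the same records at the finer granularity.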

Yes, you can use an on-premises node as a login node in an AWS PCS cluster and have users directly submit jobs to their AWS PCS cluster to run workloads on AWS from there. AWS PCS does not currently support Slurm federated scheduling or multi-cluster operation.
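
As an illustration of submitting work from a login node, the sketch below renders a small Slurm batch script. The `#SBATCH` directives and `srun` are standard Slurm; the job name, the queue name `demo-queue`, and the `./solver` command are hypothetical, and the example assumes the AWS PCS queue appears to Slurm as a partition with the same name.

```python
def render_batch_script(job_name, partition, nodes, command):
    """Build the text of a Slurm batch script for the given job settings."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",  # assumed to match the queue name
        f"#SBATCH --nodes={nodes}",
        "",
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: all values below are placeholders.
script = render_batch_script("cfd-demo", "demo-queue", 2, "./solver input.dat")
print(script)
```

A user would save the output to a file on the login node and submit it with `sbatch`.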

CloudWatch provides monitoring of your AWS PCS cluster health and performance by collecting metrics from the cluster at intervals. You can access historical data and gain insights into your cluster's performance over time. With CloudWatch, you can also monitor the EC2 instances launched by AWS PCS to meet your scaling requirements.

Getting started

To get started, visit the AWS PCS console. You must have an AWS account to access this service. If you do not have an account, you will be prompted to create one. After signing in, visit the AWS PCS documentation page to access the getting started guide.