AWS HPC Blog
Category: Compute
Automate scheduling of jobs on AWS Batch and AWS Fargate with Amazon EventBridge
In this post we’ll show how to use AWS Batch, AWS Fargate, and Amazon EventBridge to create a job scheduling solution for containers that’s fully managed, serverless, and event-driven.
Improving NFL player health using machine learning with AWS Batch
In this post we’ll show you how the NFL used AWS to scale their ML workloads and produce the first comprehensive dataset of helmet impacts across multiple NFL seasons. They were able to reduce manual labor by 90%, and the results beat human labelers in accuracy by 12%!
Diving Deeper into Fair-Share Scheduling in AWS Batch
Today we dive into details of AWS Batch fair share policies and show how they affect job placement. You’ll see the result of different share policies, and hear about practical use cases where you can benefit from fair share job queues in Batch.
Call for participation: RADIUSS Tutorial Series 2023
Lawrence Livermore National Laboratory (LLNL) and AWS are again joining forces to provide a training opportunity for emerging HPC tools and applications. In this post you’ll find the details of those tutorials and learn how to participate.
Automate your clusters by creating self-documenting HPC with AWS ParallelCluster
Today we’re going to show you how you can automate cluster deployment and create self-documenting infrastructure at the same time, which leads to more repeatable results that are easier to manage (and replicate).
Running protein structure prediction at scale using a web interface for researchers
Today, we’ll show you our open-source sample implementation of a web frontend and cloud HPC backend to support researchers using AI tools like AlphaFold for drug discovery and design.
Instance sizes in the Amazon EC2 Hpc7 family – a different experience
Hpc7g is the first Amazon EC2 HPC instance offering with multiple instance sizes, but this is quite different from the experience of getting smaller instances from other non-HPC instance families. Today, we want to take a moment to explore why this is different, and how it helps.
Application deep-dive into the AWS Graviton3E-based Amazon EC2 Hpc7g instance
In this post we’ll show you application performance and scaling results from Hpc7g, a new instance powered by AWS Graviton3E, across a wide range of HPC workloads and disciplines.
How SeatGeek simulates massive load with AWS Batch to prepare for big events
In this post we explore SeatGeek’s load testing system that simulates 50k simultaneous users. Originally built to prep SeatGeek for large-event traffic spikes, it now runs weekly to help them harden their code.
Customize Slurm settings with AWS ParallelCluster 3.6
With AWS ParallelCluster 3.6, you can directly specify Slurm settings in the cluster config file, improving reproducibility and taking another step towards self-documentation for your HPC infrastructure.