AWS Storage Blog
New on the Machine Learning blog: Speed up training on Amazon SageMaker using Amazon FSx for Lustre and Amazon EFS file systems
Deploying analytics applications and machine learning models requires storage that can scale in capacity and performance to handle workload demands with high throughput and low-latency file operations.
A common use case we’re seeing centers around data science teams doing some form of analytics (e.g., machine learning, genomics). AWS offers two scalable, durable, highly available file solutions for big data and analytics workloads. Amazon EFS is a cloud-native, shared NFS storage solution for Linux-based applications, as well as ML frameworks and shared notebook systems. Customers like Faculty are leveraging EFS to scale their analytics workloads and are seeing increased agility to deliver insights faster.
Amazon FSx for Lustre is a high-performance file system for processing Amazon S3 or on-premises data. It provides sub-millisecond access to your data and lets you read and write at up to hundreds of gigabytes per second of throughput and millions of IOPS. Amazon FSx for Lustre works natively with Amazon S3, making it easy to process cloud data sets for compute-intensive workloads. Conductor Technologies uses FSx for Lustre for its cloud rendering platform, bringing simplicity, scale, and lower TCO to its VFX and animation studio customers.
This week, we’re excited about the AWS SageMaker team’s announcement that customers can now speed up machine learning training jobs by accessing data directly from both EFS and FSx for Lustre, helping them inform decision making and improve their customer experiences.
Check out their blog post to learn more about AWS file storage solutions for machine learning workloads.
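For a rough sense of what pointing a training job at a file system involves, here is a minimal sketch in plain Python. It builds one input channel in the shape the SageMaker CreateTrainingJob API expects for a file system data source. The helper name file_system_channel, the file system ID, and the directory path are all hypothetical placeholders, not values from the announcement; substitute your own.

```python
# Sketch only: builds the dictionary for one training input channel backed by
# Amazon EFS or Amazon FSx for Lustre. It does not call AWS; the result would be
# passed as an element of InputDataConfig in a CreateTrainingJob request.

VALID_TYPES = ("EFS", "FSxLustre")   # file system types accepted by the API
VALID_MODES = ("ro", "rw")           # read-only or read-write access


def file_system_channel(channel_name, file_system_id, file_system_type,
                        directory_path, access_mode="ro"):
    """Return one InputDataConfig entry that reads from a file system.

    This helper is hypothetical (not part of any AWS SDK); it only validates
    the arguments and assembles the request structure.
    """
    if file_system_type not in VALID_TYPES:
        raise ValueError(f"file_system_type must be one of {VALID_TYPES}")
    if access_mode not in VALID_MODES:
        raise ValueError(f"access_mode must be one of {VALID_MODES}")
    return {
        "ChannelName": channel_name,
        "DataSource": {
            "FileSystemDataSource": {
                "FileSystemId": file_system_id,
                "FileSystemAccessMode": access_mode,
                "FileSystemType": file_system_type,
                "DirectoryPath": directory_path,
            }
        },
    }


# Placeholder ID and path, for illustration only.
train_channel = file_system_channel(
    "train", "fs-0123456789abcdef0", "FSxLustre", "/fsx/train"
)
```

The training job also has to run in a VPC that can reach the file system, so the request would typically carry a VpcConfig with matching subnets and security groups. The SageMaker team's post covers the full setup.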