Posted On: Nov 22, 2023
The HAQM S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access and store data in HAQM S3. PyTorch is an open source machine learning framework widely used by AWS customers to build and train machine learning models. The HAQM S3 Connector for PyTorch automatically optimizes S3 read and list requests to improve data loading and checkpoint performance for your training workloads. Saving machine learning training model checkpoints is up to 40% faster with the HAQM S3 Connector for PyTorch than saving to HAQM EC2 instance storage.
The HAQM S3 Connector for PyTorch provides a new implementation of PyTorch's dataset primitive that you can use to load training data from HAQM S3. It supports both map-style datasets for random data access patterns and iterable-style datasets for sequential data access patterns. The connector also includes a checkpointing interface that saves checkpoints directly to HAQM S3 and loads them directly from it, so you no longer need to save to local storage first and then write custom code to upload to HAQM S3.
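To make the two dataset styles and the direct checkpoint path concrete, here is a minimal, standard-library-only sketch. It does not use the connector's actual API. `FAKE_BUCKET`, `MapStyleObjects`, `IterableStyleObjects`, and `ObjectWriter` are hypothetical names invented for this illustration, an in-memory dictionary stands in for an S3 bucket, and `pickle` stands in for `torch.save`, which likewise accepts a file-like object.

```python
import io
import pickle
from typing import Dict, Iterator, Tuple

# Hypothetical in-memory stand-in for an S3 bucket: object key -> object bytes.
FAKE_BUCKET: Dict[str, bytes] = {
    "train/0001.bin": b"sample-1",
    "train/0002.bin": b"sample-2",
    "train/0003.bin": b"sample-3",
    "val/0001.bin": b"sample-4",
}


class MapStyleObjects:
    """Map-style access: list keys up front, then allow len() and lookup by index."""

    def __init__(self, bucket: Dict[str, bytes], prefix: str) -> None:
        self._bucket = bucket
        self._keys = sorted(k for k in bucket if k.startswith(prefix))

    def __len__(self) -> int:
        return len(self._keys)

    def __getitem__(self, index: int) -> Tuple[str, bytes]:
        key = self._keys[index]
        return key, self._bucket[key]


class IterableStyleObjects:
    """Iterable-style access: stream objects in listing order, with no indexing."""

    def __init__(self, bucket: Dict[str, bytes], prefix: str) -> None:
        self._bucket = bucket
        self._prefix = prefix

    def __iter__(self) -> Iterator[Tuple[str, bytes]]:
        for key in sorted(self._bucket):
            if key.startswith(self._prefix):
                yield key, self._bucket[key]


class ObjectWriter(io.BytesIO):
    """File-like writer that stores its bytes under `key` when it is closed."""

    def __init__(self, bucket: Dict[str, bytes], key: str) -> None:
        super().__init__()
        self._bucket = bucket
        self._key = key

    def close(self) -> None:
        if not self.closed:
            # Capture the buffered bytes before the buffer is released.
            self._bucket[self._key] = self.getvalue()
        super().close()


# Random access: jump straight to the last training object.
map_ds = MapStyleObjects(FAKE_BUCKET, "train/")
last_key, last_body = map_ds[len(map_ds) - 1]

# Sequential access: consume the training objects as a stream.
streamed_keys = [key for key, _ in IterableStyleObjects(FAKE_BUCKET, "train/")]

# Checkpointing: serialize straight into the object store, with no local file.
state = {"epoch": 0, "weights": [0.1, 0.2]}
with ObjectWriter(FAKE_BUCKET, "checkpoints/epoch0.ckpt") as writer:
    pickle.dump(state, writer)
restored = pickle.loads(FAKE_BUCKET["checkpoints/epoch0.ckpt"])
```

In a real training job the two classes would correspond to PyTorch's `torch.utils.data.Dataset` and `torch.utils.data.IterableDataset`. Map-style suits shuffled sampling because any index can be fetched on demand, while iterable-style suits large sequential scans where listing and reading can overlap.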
The HAQM S3 Connector for PyTorch is an open source project. To get started, visit the GitHub page.