This Guidance demonstrates how to automate data transfers to simplify management and improve both the accessibility and cost-effectiveness of archived data. It shows how to automatically restore, copy, and transfer Amazon Simple Storage Service (Amazon S3) Glacier vault archives to S3 buckets and desired storage classes, including S3 Glacier storage classes. This automation saves time and reduces the likelihood of human error during data transfer, helping ensure more reliable and consistent management of archived data.
Architecture Diagram

Step 1
Invoke a transfer workflow using an AWS Systems Manager document.
Step 2
The Systems Manager document starts an AWS Step Functions Orchestrator execution.
Step 3
The Step Functions Orchestrator execution initiates a nested Step Functions Get Inventory workflow to retrieve the inventory file.
Step 4
Upon completion of the inventory retrieval, the Guidance invokes the Initiate Retrieval nested Step Functions workflow.
Step 5
When a job is ready, Amazon Simple Storage Service (Amazon S3) Glacier sends a notification to an Amazon Simple Notification Service (Amazon SNS) topic, indicating job completion.
Step 6
The Guidance stores all job completion notifications in the Amazon Simple Queue Service (Amazon SQS) Notifications queue.
Step 7
When an archive job is ready, the Amazon SQS Notifications queue invokes the AWS Lambda Notifications Processor function. This Lambda function prepares the initial steps for archive retrieval.
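As an illustration of what a notifications processor might do first, the sketch below unpacks an SQS-delivered SNS notification and keeps only the archive retrieval jobs that finished successfully. It is not the Guidance's actual handler: the function name `extract_ready_archive_jobs` and the returned dictionary keys are hypothetical, and it assumes raw message delivery is disabled on the SNS subscription, so each SQS record body is an SNS envelope whose `Message` field carries the job description as a JSON string.

```python
import json
from typing import Any, Dict, List


def extract_ready_archive_jobs(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the completed archive retrieval jobs found in an SQS event.

    Hypothetical helper: the output keys are illustrative only.
    """
    ready = []
    for record in event.get("Records", []):
        # Assumption: the SQS body is an SNS envelope (raw delivery disabled).
        envelope = json.loads(record["body"])
        # The job description is itself JSON, serialized inside "Message".
        job = json.loads(envelope["Message"])
        if job.get("Action") != "ArchiveRetrieval":
            continue  # for example, inventory retrieval jobs are skipped here
        if job.get("Completed") and job.get("StatusCode") == "Succeeded":
            ready.append(
                {
                    "job_id": job["JobId"],
                    "archive_id": job["ArchiveId"],
                    "size_bytes": job["ArchiveSizeInBytes"],
                }
            )
    return ready
```

Filtering on the job action and status up front keeps failed or unrelated notifications from ever producing chunk retrieval work.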
Step 8
The Lambda Notifications Processor function places chunk retrieval messages in the Amazon SQS Chunks Retrieval queue for chunk processing.
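To make the chunking idea concrete, here is a minimal sketch of how an archive could be divided into byte ranges, one per chunk retrieval message. The function name `plan_chunks`, the message shape, and the choice of chunk size are assumptions for illustration, not the Guidance's implementation. Each chunk is paired with a multipart upload part number, since Amazon S3 numbers parts from 1 to 10,000.

```python
from typing import Dict, List, Union

MAX_S3_PARTS = 10_000  # S3 multipart uploads accept part numbers 1 through 10,000


def plan_chunks(archive_size: int, chunk_size: int) -> List[Dict[str, Union[int, str]]]:
    """Split an archive of archive_size bytes into inclusive byte ranges.

    Hypothetical helper: returns one dict per chunk with a part number and
    a range string in the "bytes=start-end" form.
    """
    if archive_size <= 0 or chunk_size <= 0:
        raise ValueError("archive_size and chunk_size must be positive")
    chunk_count = -(-archive_size // chunk_size)  # ceiling division
    if chunk_count > MAX_S3_PARTS:
        raise ValueError("chunk_size is too small for a single multipart upload")
    chunks = []
    for index in range(chunk_count):
        start = index * chunk_size
        end = min(start + chunk_size, archive_size) - 1  # ranges are inclusive
        chunks.append({"part_number": index + 1, "range": f"bytes={start}-{end}"})
    return chunks
```

For example, `plan_chunks(10, 4)` yields the ranges `bytes=0-3`, `bytes=4-7`, and `bytes=8-9`. In practice the chunk size would be far larger, typically a multiple of 1 MiB.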
Step 9
The Amazon SQS Chunks Retrieval queue invokes the Lambda Chunk Retrieval function to process each chunk.
Step 10
The Lambda Chunk Retrieval function downloads the chunk from Amazon S3 Glacier.
Step 11
The Lambda Chunk Retrieval function uploads the chunk to Amazon S3 as a part of a multipart upload.
Step 12
After a new chunk is downloaded, the Guidance stores chunk metadata (for example, etag, checksum_sha_256, and tree_checksum) in Amazon DynamoDB.
Step 13
The Lambda Chunk Retrieval function verifies whether all chunks for that archive have been processed. If so, it inserts an event into the Amazon SQS Validation queue to invoke the Lambda Validate function.
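The completeness check can be pictured with an in-memory sketch like the one below. The `ChunkRecord` class and both helper functions are hypothetical; only the field names `etag`, `checksum_sha_256`, and `tree_checksum` come from the stored chunk metadata described above, and how the Guidance actually queries DynamoDB is not shown. The second helper builds a parts list in the shape that the S3 CompleteMultipartUpload request expects, assuming `checksum_sha_256` holds the base64-encoded part checksum.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Union


@dataclass(frozen=True)
class ChunkRecord:
    """Illustrative stand-in for one chunk metadata item."""
    part_number: int
    etag: str
    checksum_sha_256: str  # assumed to be base64-encoded
    tree_checksum: str


def all_chunks_processed(records: Iterable[ChunkRecord], expected_chunks: int) -> bool:
    """True when every part number from 1 to expected_chunks has a record."""
    seen = {record.part_number for record in records}
    return seen == set(range(1, expected_chunks + 1))


def build_parts(records: Iterable[ChunkRecord]) -> List[Dict[str, Union[int, str]]]:
    """Order parts by part number, as multipart completion requires."""
    ordered = sorted(records, key=lambda record: record.part_number)
    return [
        {
            "PartNumber": record.part_number,
            "ETag": record.etag,
            "ChecksumSHA256": record.checksum_sha_256,
        }
        for record in ordered
    ]
```

Comparing sets of part numbers, rather than counting records, means a duplicate record for one chunk cannot mask a missing chunk.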
Step 14
The Lambda Validate function performs an integrity check against the tree hash in the inventory, calculates a checksum, and passes it into the call that completes the multipart upload. If that checksum is wrong, Amazon S3 rejects the request.
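For readers unfamiliar with tree hashes, the following sketch computes an S3 Glacier style SHA-256 tree hash over a byte string: hash each 1 MiB block, then repeatedly hash adjacent pairs of digests until a single digest remains. It works on data held in memory, which is a simplification; the Guidance presumably combines the per-chunk tree checksums it stored earlier rather than rereading the archive. The function names are hypothetical.

```python
import hashlib
from typing import List

MIB = 1024 * 1024  # tree hash leaves are computed over 1 MiB blocks


def tree_hash(data: bytes) -> str:
    """Return the hex SHA-256 tree hash of data."""
    # Leaf level: one SHA-256 digest per 1 MiB block (a single digest of
    # the empty string when there is no data).
    hashes: List[bytes] = [
        hashlib.sha256(data[offset:offset + MIB]).digest()
        for offset in range(0, len(data), MIB)
    ] or [hashlib.sha256(b"").digest()]
    # Combine adjacent pairs until one digest is left; an unpaired digest
    # at the end of a level is carried up unchanged.
    while len(hashes) > 1:
        next_level = []
        for i in range(0, len(hashes), 2):
            if i + 1 < len(hashes):
                next_level.append(hashlib.sha256(hashes[i] + hashes[i + 1]).digest())
            else:
                next_level.append(hashes[i])
        hashes = next_level
    return hashes[0].hex()


def matches_inventory(data: bytes, inventory_tree_hash: str) -> bool:
    """Compare the computed tree hash with the value listed in the inventory."""
    return tree_hash(data) == inventory_tree_hash.lower()
```

For data of 1 MiB or less, the tree hash equals the plain SHA-256 digest of the data, which gives a quick sanity check for an implementation.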
Step 15
DynamoDB Streams invokes the Lambda Metrics Processor function to update the transfer process metrics in DynamoDB.
Step 16
The Step Functions Orchestrator execution enters an async wait, pausing until the archive retrieval workflow concludes before initiating the Step Functions Cleanup workflow.
Step 17
The DynamoDB stream invokes the Lambda Async Facilitator function, which unlocks asynchronous waits in Step Functions.
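One common way to implement an asynchronous wait of this kind is the Step Functions task token callback pattern, where a paused state resumes once something calls `SendTaskSuccess` with its token. Whether the Guidance uses exactly this mechanism is an assumption, and the sketch below is only illustrative: the attribute names `task_token` and `job_status`, the `finished` status value, and the function name are all hypothetical. The client is passed in so the logic can be exercised without AWS access; in a Lambda function it would be a boto3 Step Functions client.

```python
import json
from typing import Any, Dict


def release_waiting_executions(
    event: Dict[str, Any],
    sfn_client: Any,
    token_attr: str = "task_token",   # hypothetical attribute name
    status_attr: str = "job_status",  # hypothetical attribute name
    done_value: str = "finished",     # hypothetical status value
) -> int:
    """Send task success for each stream record that marks work as done.

    Returns the number of waits released.
    """
    released = 0
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # deletions carry no new item state
        # Stream records encode attributes with type tags, such as {"S": "..."}.
        image = record.get("dynamodb", {}).get("NewImage", {})
        token = image.get(token_attr, {}).get("S")
        status = image.get(status_attr, {}).get("S")
        if token and status == done_value:
            sfn_client.send_task_success(
                taskToken=token,
                output=json.dumps({"status": status}),
            )
            released += 1
    return released
```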
Step 18
Amazon EventBridge rules periodically initiate the Step Functions Extend Download Window and Update Amazon CloudWatch Dashboard workflows.
Step 19
Monitor the transfer progress using a CloudWatch dashboard.
Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This Guidance automates the process of copying archives from Amazon S3 Glacier vaults to S3 buckets, reducing manual effort and the risk of errors to improve operational efficiency. Moving data to different Amazon S3 storage classes enables storage cost optimization based on access patterns and retention requirements. The pre-built CloudWatch dashboard visualizes copy operation progress, providing better visibility into the data transfer process and enabling effective monitoring and troubleshooting.
Security
Lambda is a serverless compute service, which helps reduce the attack surface and responsibilities associated with managing underlying infrastructure. This minimizes user involvement in managing and securing compute resources, improving the overall security posture.
Reliability
The pre-built CloudWatch dashboard provides visibility into the data transfer process, allowing you to monitor progress and identify potential issues or bottlenecks. This enhanced visibility enables you to quickly detect and address reliability-related problems, helping ensure successful completion of data transfers. By using a serverless compute service that automatically scales and manages the underlying infrastructure, you can reduce the risk of infrastructure-related failures or performance degradation.
Performance Efficiency
Lambda functions are triggered by events, such as the initiation of the data transfer process. Because the functions are event-driven, compute resources run only when they are needed, which helps reduce overall resource utilization and improve efficiency. The automatic scaling and management of underlying infrastructure helps ensure that the necessary compute resources are allocated on demand.
Cost Optimization
By allowing users to move data to different Amazon S3 storage classes, this Guidance enables storage cost optimization based on access patterns and retention requirements. This helps reduce overall storage costs by placing frequently accessed data in performance-optimized storage classes while moving less frequently accessed data to more cost-effective storage classes. Lambda helps optimize costs by only charging for compute time used, rather than requiring users to manage and pay for underlying infrastructure.
Sustainability
Lambda reduces energy consumption and carbon footprint associated with managing and maintaining underlying infrastructure. Serverless computing leads to more efficient resource utilization and potentially lower energy usage compared to traditional server-based architectures.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.