Posted On: Nov 29, 2017

Amazon Glacier Select is a new way to query archived data in Amazon Glacier. Glacier Select runs queries directly on data stored in Amazon Glacier and retrieves only the data you need from your archives for analytics. This lets you reduce total cost of ownership while extending your data lake into cost-effective archive storage.

With Amazon Glacier Select, you provide a SQL query and the Amazon Glacier archive to run it against. You specify how soon you need results by choosing one of three options: Expedited Retrievals take 1-5 minutes, Standard Retrievals take 3-5 hours, and Bulk Retrievals take up to 12 hours. You are notified through Amazon Simple Notification Service (SNS) when a query completes, and you can specify the Amazon S3 bucket where the output results are stored.
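As a rough illustration of how those pieces fit together, the sketch below assembles the parameters for a select job in Python: the SQL query, the archive, the retrieval option, the SNS topic, and the S3 output location. It assumes the boto3 Glacier client and its `initiate_job` call; the field names reflect that API as best understood and should be checked against the current documentation. The helper names (`build_select_job`, `submit_select_job`), the CSV input format, and every identifier passed in are hypothetical placeholders, not part of the announcement.

```python
from typing import Any, Dict, Optional

# The three retrieval options named in the announcement.
TIERS = ("Expedited", "Standard", "Bulk")


def build_select_job(
    archive_id: str,
    sql: str,
    bucket: str,
    prefix: str,
    tier: str = "Standard",
    sns_topic_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble job parameters for a Glacier Select query (hypothetical helper)."""
    if tier not in TIERS:
        raise ValueError(f"tier must be one of {TIERS}, got {tier!r}")
    params: Dict[str, Any] = {
        "Type": "select",
        "ArchiveId": archive_id,
        "Tier": tier,
        "SelectParameters": {
            # Assumption: the archive is a CSV file whose first row is a header.
            "InputSerialization": {"csv": {"FileHeaderInfo": "USE"}},
            "ExpressionType": "SQL",
            "Expression": sql,
            "OutputSerialization": {"csv": {}},
        },
        # Where the query results are written.
        "OutputLocation": {"S3": {"BucketName": bucket, "Prefix": prefix}},
    }
    if sns_topic_arn is not None:
        # Topic that is notified when the query completes.
        params["SNSTopic"] = sns_topic_arn
    return params


def submit_select_job(vault_name: str, params: Dict[str, Any]) -> str:
    """Submit the job and return its ID. Needs boto3 and AWS credentials."""
    import boto3  # imported here so building parameters works without boto3

    glacier = boto3.client("glacier")
    response = glacier.initiate_job(vaultName=vault_name, jobParameters=params)
    return response["jobId"]
```

A call might look like `submit_select_job("my-vault", build_select_job("ARCHIVE_ID", "SELECT * FROM archive", "my-results-bucket", "queries/", tier="Expedited"))`, where the vault, archive ID, and bucket are placeholders. Keeping parameter assembly separate from submission makes it possible to inspect or log the request before any retrieval charges are incurred.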

Using Amazon Glacier Select, you can perform operations such as auditing and pattern matching over large amounts of data archived in Amazon Glacier. For example, you can use Amazon Glacier Select to find and retrieve only the records matching a particular account, or only the billing data for a particular customer. You can also integrate the Amazon Glacier Select APIs into your application to extend query-over-archive capability to many more use cases, such as machine learning and big data.

Amazon Glacier Select is generally available today in all AWS commercial regions where Amazon Glacier is offered. To learn more about Amazon Glacier Select, visit the Amazon Glacier details page.