AWS Storage Blog
Category: Advanced (300)
Design patterns for multi-tenant access control on Amazon S3
Large organizations and software as a service (SaaS) platforms often share storage resources across multiple users, groups, or tenants. The design pattern chosen to implement this shared storage can significantly impact how access permissions are managed at scale. This decision is key because it directly affects platforms’ security and ease of scale. A well thought […]
Archiving relational databases to Amazon S3 Glacier storage classes for cost optimization
Many customers are growing their data footprints rapidly, with significantly more data stored in their relational database management systems (RDBMS) than ever before. Additionally, organizations subject to data compliance including the Health Insurance Portability and Accountability Act (HIPAA), the Payment Card Industry Data Security Standard (PCI-DSS) and General Data Protection Regulation (GDPR) are often required […]
Cost-optimized log aggregation and archival in Amazon S3 using s3tar
According to a study by the International Data Corporation (IDC), the global datasphere is expected to grow from 33 zettabytes (ZB) in 2018 to 175 ZB by 2025, a staggering five-fold increase. Organizations that leverage distributed architectures generate a significant portion of their data footprint from observability data, including application logs, metrics, and traces, which […]
Designing for multi-account scenarios using AWS Disaster Recovery Service
Disaster recovery (DR) plays an important role in the overall business continuity strategy of an organization. When implementing a DR solution, you must understand business drivers along with any governance, security, and operational requirements that influence the final solution. For example, organizations may have a requirement to maintain different accounts for security isolation, control cost […]
Backing up Oracle databases to Amazon S3 at scale
In today’s data-driven world, safeguarding critical information stored in Oracle databases is crucial for enterprises. Companies struggle to efficiently back up vast amounts of data from hundreds of databases powering enterprise resource planning (ERP) systems and critical applications. These backups must be secure, durable, and easily restorable to ensure business continuity, guard against ransomware, and […]
Enhance business continuity within an Availability Zone using AWS Elastic Disaster Recovery
At Amazon Web Services (AWS), we recommend running workloads across multiple Availability Zones (AZs) for high availability and fault tolerance. However, there are certain situations where users need to run their workloads in a single AZ. These include legacy or commercial off-the-shelf (COTS) applications that don’t support deployments across multiple AZs, workloads that […]
Analyzing Amazon S3 Metadata with Amazon Athena and Amazon QuickSight
UPDATE (1/27/2025): Amazon S3 Metadata is generally available. Object storage provides virtually unlimited scalability, but managing billions, or even trillions, of objects can pose significant challenges. How do you know what data you have? How can you find the right datasets at the right time? By implementing a robust metadata management strategy, you can answer these […]
Build a managed transactional data lake with Amazon S3 Tables
UPDATE (12/19/2024): Added guidance for Amazon EMR setup. Customers commonly use Apache Iceberg today to manage ever-growing volumes of data. Apache Iceberg’s relational database transaction capabilities (ACID transactions) help customers deal with frequent updates, deletions, and the need for transactional consistency across datasets. However, getting the most out of Apache Iceberg tables and running it […]
How Amazon S3 Tables use compaction to improve query performance by up to 3 times
Today businesses managing petabytes of data must optimize storage and processing to drive timely insights while being cost-effective. Customers often choose Apache Parquet for improved storage and query performance. Additionally, customers use Apache Iceberg to organize Parquet datasets to take advantage of its database-like features such as schema evolution, time travel, and ACID transactions. Customers […]
Manage costs for replicated delete markers in a disaster recovery setup on Amazon S3
Many businesses recognize the critical importance of safeguarding their essential data from potential disasters such as fires, floods, or ransomware events. Designing an effective disaster recovery (DR) strategy includes thoughtfully evaluating and selecting cost-effective solutions that fulfill compliance requirements. By using Amazon S3 features such as S3 object tags, S3 Versioning, and S3 Lifecycle, you can […]