AWS Database Blog
12 things you should know about HAQM DocumentDB (with MongoDB compatibility)
This blog post was last reviewed and updated in February 2022.
HAQM DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. You can use the same MongoDB 3.6, 4.0, or 5.0 application code, drivers, and tools to run, manage, and scale workloads on HAQM DocumentDB without having to worry about managing the underlying infrastructure. As a document database, HAQM DocumentDB makes it easy to store, query, and index JSON data.
AWS built HAQM DocumentDB to solve your challenges around availability, reliability, durability, scalability, backup, and more. In doing so, we built several novel capabilities to remove undifferentiated heavy lifting and help reduce costs. This post introduces you to 12 HAQM DocumentDB capabilities you may not be aware of that can help you build and scale your MongoDB workloads on HAQM DocumentDB.
1. Modern, cloud-native architecture
HAQM DocumentDB was built from the ground up with a cloud-native database architecture. Its architecture separates storage and compute so that each layer can scale independently. HAQM DocumentDB uses a purpose-built, distributed, fault-tolerant, self-healing storage system that is highly available and durable by replicating data six ways across three AWS Availability Zones (AZs). For more information, see the video AWS re:Invent 2019: HAQM DocumentDB deep dive on YouTube. The following diagram shows the separation of compute and storage in the HAQM DocumentDB architecture and how data is replicated six ways across three AZs.
2. Scale compute in minutes, regardless of data size
Because the storage volume is separated from the compute instances, the compute instances don’t rely on attached storage that is unique to the instance. Each instance in the cluster mounts the distributed storage volume; therefore, when new instances are added, no copying of data is required. As a result, you can add a replica instance to your cluster or scale up instances in minutes to increase throughput up to millions of reads per second, regardless of data size. Similarly, you can scale down and scale in just as easily, without impacting the performance of your other instances.
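As a rough illustration of adding a replica, the sketch below builds the request for one more instance in an existing cluster. It assumes a boto3-style DocumentDB client is passed in as `client` (for example, one created with `boto3.client("docdb")`); the cluster name, instance name, and instance class in the usage note are hypothetical.

```python
def replica_request(cluster_id: str, instance_id: str, instance_class: str) -> dict:
    """Build the keyword arguments for adding one instance to an existing cluster."""
    return {
        "DBClusterIdentifier": cluster_id,    # the cluster whose storage volume is shared
        "DBInstanceIdentifier": instance_id,  # name for the new instance
        "DBInstanceClass": instance_class,    # compute size of the new instance
        "Engine": "docdb",
    }


def add_replica(client, cluster_id: str, instance_id: str, instance_class: str):
    """Request a new instance. Nothing about data size appears in the request,
    because the new instance attaches to the existing cluster volume."""
    return client.create_db_instance(
        **replica_request(cluster_id, instance_id, instance_class)
    )
```

A call might look like `add_replica(client, "my-cluster", "my-cluster-replica-2", "db.r5.large")`, where every identifier is a placeholder.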
3. Automatic, no impact, inexpensive backups
In traditional database architectures, backups run at the compute layer, which can affect database performance. In HAQM DocumentDB, backups are instead handled by the storage layer and are continually streamed to HAQM S3. As a result, taking a snapshot doesn’t affect database performance, so you can take snapshots whenever you need to without slowing down your production database.
In HAQM DocumentDB, continuous backup is enabled by default, providing 1 day of point-in-time restore (PITR). You can’t disable backup, but you can increase the backup retention period for PITR to up to 35 days. Additionally, you can take manual snapshots for long-term archival at any time. To offset the cost of enabling 1 day of backups by default, HAQM DocumentDB doesn’t charge for backup storage of up to 100% of your total cluster storage for a Region. Additional backup storage costs $0.02/GB per month. Furthermore, because backups happen at the storage layer, not at the compute layer, backups don’t use your compute resources or incur I/O costs.
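The backup pricing rule above can be worked through with a little arithmetic. The sketch below is only an estimate under the figures quoted in this post (free backup storage up to 100% of cluster storage, $0.02/GB per month beyond that); the function names are illustrative, and current prices may differ by Region.

```python
from decimal import Decimal

# Price quoted in this post; treat it as an assumption, not a current price list.
BACKUP_PRICE_PER_GB_MONTH = Decimal("0.02")


def billable_backup_gb(cluster_storage_gb, backup_storage_gb) -> Decimal:
    """Backup storage above 100% of the cluster storage is billable."""
    excess = Decimal(str(backup_storage_gb)) - Decimal(str(cluster_storage_gb))
    return max(Decimal(0), excess)


def monthly_backup_cost(cluster_storage_gb, backup_storage_gb) -> Decimal:
    """Estimated monthly charge for backup storage beyond the free allowance."""
    return billable_backup_gb(cluster_storage_gb, backup_storage_gb) * BACKUP_PRICE_PER_GB_MONTH
```

For example, a 500 GB cluster with 400 GB of backups stays within the free allowance, while the same cluster with 800 GB of backups would be billed for the 300 GB above it.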
4. Autoscaling storage and I/Os
When you provision an HAQM DocumentDB cluster, you don’t need to specify how much storage or I/Os you need for your cluster. HAQM DocumentDB uses a unique storage system that automatically scales from 10 GB up to 64 TB of data per cluster in 10 GB increments. Autoscaling of storage and I/Os helps you save time and money by not having to worry about capacity planning or over-provisioning storage infrastructure.
5. Scaling reads on replicas
In HAQM DocumentDB, the storage layer handles data replication and durability. Unlike traditional database architectures, replica instances in HAQM DocumentDB aren’t data bearing and don’t participate in a replication protocol to achieve quorum for durability. As a result, you can scale reads on your replica instances to get more performance from the compute resources you’re paying for and achieve high availability. For more information, see Connecting to HAQM DocumentDB as a Replica Set.
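To make read scaling concrete, here is a minimal sketch that assembles a replica-set connection string whose read preference sends reads to replicas when they are available. The option values (`rs0`, `secondaryPreferred`, `retryWrites=false`, and the `global-bundle.pem` CA file name) follow the conventions in the service documentation as best understood here and should be treated as assumptions; the user, password, and endpoint in the usage note are placeholders.

```python
from urllib.parse import quote_plus, urlencode


def replica_set_uri(user: str, password: str, cluster_endpoint: str,
                    port: int = 27017, ca_file: str = "global-bundle.pem") -> str:
    """Build a MongoDB URI that connects in replica set mode and prefers replicas for reads."""
    options = {
        "tls": "true",                           # connections are encrypted in transit
        "tlsCAFile": ca_file,                    # assumed local path to the CA bundle
        "replicaSet": "rs0",                     # assumed replica set name
        "readPreference": "secondaryPreferred",  # route reads to replicas when possible
        "retryWrites": "false",                  # assumed: retryable writes are not used
    }
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{cluster_endpoint}:{port}/?{urlencode(options)}"
```

The resulting string can be passed to a MongoDB driver, for example `pymongo.MongoClient(replica_set_uri("app", "secret", "<cluster endpoint>"))`.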
6. Implicit transactions
In HAQM DocumentDB, all CRUD statements (findAndModify, update, insert, delete) guarantee atomicity and consistency, even for operations that modify multiple documents. This behavior is different from MongoDB 3.6, which only provides atomic guarantees for commands that modify a single document. For example, a single update or delete statement that matches many documents is both atomic and consistent in HAQM DocumentDB.
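A minimal sketch of multi-document statements of this kind, assuming PyMongo-style collection objects are passed in. The collection and field names (`miles`, `credit_card`, `flight_miles`, `products`, `cost`) are hypothetical and serve only to illustrate the shape of the calls.

```python
def double_miles_for_card_holders(miles):
    """One update statement that modifies every matching document."""
    return miles.update_many(
        {"credit_card": True},            # filter: all card holders
        {"$mul": {"flight_miles": 2}},    # update: double each balance
    )


def remove_products_costing_more_than(products, max_cost):
    """One delete statement that removes every matching document."""
    return products.delete_many({"cost": {"$gt": max_cost}})
```

With a live connection, `miles` and `products` would be collections such as `client["travel"]["miles"]`; each function issues a single statement, which is the unit the atomicity guarantee described above applies to.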
7. DMS for migrations to HAQM DocumentDB
AWS Database Migration Service (AWS DMS) helps you migrate databases to HAQM DocumentDB quickly and securely. You can use AWS DMS to migrate your on-premises or EC2 MongoDB databases to HAQM DocumentDB with virtually no downtime. For more information, see AWS Database Migration Service and Migrating to HAQM DocumentDB.
8. Highly durable, single-instance clusters for development and testing
HAQM DocumentDB is highly durable by default. Because the storage layer handles durability and is independent of how many instances you have in a cluster, you can create a single-instance cluster that’s still highly durable. Single-instance clusters are useful to save costs for dev and test workloads. For information about reducing costs, see Cost Optimization.
9. Broad set of compliance certifications and security controls
HAQM DocumentDB provides numerous security controls. It supports role-based access control (RBAC), so you can create users and attach built-in roles to restrict what operations each user can perform. HAQM DocumentDB is a VPC-only service. HAQM VPC lets you provision a logically isolated section of the AWS Cloud where you can launch AWS resources, like an HAQM DocumentDB cluster, in a virtual network that you define. HAQM DocumentDB allows you to encrypt your databases using keys you create and control through AWS KMS. On a cluster running with HAQM DocumentDB encryption, data stored at rest in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster. By default, connections between a client and HAQM DocumentDB are encrypted in transit with TLS.
HAQM DocumentDB meets the highest security standards and makes it easy for you to verify AWS security and meet your own regulatory and compliance obligations. HAQM DocumentDB is assessed to comply with PCI DSS; ISO 9001, 27001, 27017, and 27018; SOC 1, 2, and 3; and the Health Information Trust Alliance (HITRUST) Common Security Framework (CSF) certification, in addition to being HIPAA eligible. AWS compliance reports are available for download in AWS Artifact.
10. Starting and stopping HAQM DocumentDB clusters
HAQM DocumentDB enables you to stop and start clusters to help save on costs. This makes it easy and affordable to use clusters for development and test purposes where the cluster isn’t required to be running all the time. When you stop a cluster, you bring the cluster’s compute cost down to zero. For more information, see Stopping and Starting an HAQM DocumentDB Cluster.
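One way to use stop and start for a development cluster is a small scheduled job that keeps the cluster running only during working hours. The sketch below assumes a boto3-style DocumentDB client is passed in as `client`; the working-hours window and the function names are illustrative choices, not part of the service.

```python
from datetime import datetime


def should_be_running(now: datetime, start_hour: int = 8, end_hour: int = 18) -> bool:
    """True on weekdays (Monday to Friday) between start_hour and end_hour."""
    return now.weekday() < 5 and start_hour <= now.hour < end_hour


def apply_schedule(client, cluster_id: str, now: datetime) -> str:
    """Start or stop the cluster to match the schedule and report the action taken.

    A production script would first check the cluster's current status,
    because starting a running cluster (or stopping a stopped one) is rejected.
    """
    if should_be_running(now):
        client.start_db_cluster(DBClusterIdentifier=cluster_id)
        return "start"
    client.stop_db_cluster(DBClusterIdentifier=cluster_id)
    return "stop"
```

Run from a scheduler with `client = boto3.client("docdb")`, this stops paying for compute on evenings and weekends while the data stays in place.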
11. Profiling for slow queries
You can use the profiler in HAQM DocumentDB to log the execution time and details of queries performed on your cluster to HAQM CloudWatch Logs. The profiler is useful for monitoring the slowest operations on your cluster to help you improve individual query performance and overall cluster performance. For more information, see Profiling HAQM DocumentDB Operations.
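As a hedged sketch of turning the profiler on, the code below sets the profiler parameters on a cluster parameter group and enables export of profiler logs to CloudWatch Logs. It assumes a boto3-style DocumentDB client passed in as `client`, and it assumes the parameter names `profiler`, `profiler_threshold_ms`, and `profiler_sampling_rate` as described in the service documentation; the default threshold, sampling rate, and apply method shown here are illustrative.

```python
def profiler_parameters(threshold_ms: int = 100, sampling_rate: float = 1.0,
                        apply_method: str = "immediate") -> list:
    """Build the parameter list that enables the profiler for slow operations."""
    settings = {
        "profiler": "enabled",
        "profiler_threshold_ms": str(threshold_ms),    # log operations slower than this
        "profiler_sampling_rate": str(sampling_rate),  # fraction of slow operations to log
    }
    return [
        {"ParameterName": name, "ParameterValue": value, "ApplyMethod": apply_method}
        for name, value in settings.items()
    ]


def enable_profiler(client, parameter_group: str, cluster_id: str, **options) -> None:
    """Apply the profiler parameters and export profiler logs to CloudWatch Logs."""
    client.modify_db_cluster_parameter_group(
        DBClusterParameterGroupName=parameter_group,
        Parameters=profiler_parameters(**options),
    )
    client.modify_db_cluster(
        DBClusterIdentifier=cluster_id,
        CloudwatchLogsExportConfiguration={"EnableLogTypes": ["profiler"]},
    )
```

The parameter group must be a custom group already attached to the cluster; lowering `threshold_ms` or raising `sampling_rate` captures more operations at the cost of more log volume.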
12. Per-second pricing
HAQM DocumentDB instances are billed in 1-second increments. With transparent on-demand pricing and no up-front commitment required, HAQM DocumentDB’s per-second billing provides additional granularity, so you only pay for the capacity you use. For more information, see HAQM DocumentDB (with MongoDB compatibility) pricing.
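To see what 1-second granularity means in practice, the sketch below compares per-second billing with a hypothetical scheme that rounds up to whole hours. The hourly rate is a made-up figure for illustration, and any minimum charge that the service may apply is not modeled.

```python
from decimal import Decimal


def per_second_cost(hourly_rate, seconds_running: int) -> Decimal:
    """Cost when every second of runtime is billed at the pro-rated hourly rate."""
    return Decimal(str(hourly_rate)) * Decimal(seconds_running) / Decimal(3600)


def whole_hour_cost(hourly_rate, seconds_running: int) -> Decimal:
    """Cost under a hypothetical scheme that rounds runtime up to full hours."""
    hours = -(-seconds_running // 3600)  # ceiling division
    return Decimal(str(hourly_rate)) * Decimal(hours)
```

For an instance that runs for 90 minutes at a hypothetical $0.30 per hour, per-second billing charges for 1.5 hours, while whole-hour rounding would charge for 2.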
Summary
AWS built HAQM DocumentDB as a fully managed database service to solve your challenges around availability, reliability, durability, scalability, backup, and more. This post introduced you to 12 HAQM DocumentDB capabilities you may not be aware of that can help you build and scale your MongoDB workloads on HAQM DocumentDB.
To get started with HAQM DocumentDB, see Getting Started with HAQM DocumentDB (with MongoDB compatibility); Part 2 – using AWS Cloud9. To learn more about migrating to HAQM DocumentDB, see the migration guide and a demo of a live migration.
About the Authors
Joseph Idziorek is a Principal Product Manager at HAQM Web Services.
Jeff Duffy is a Sr NoSQL Specialist Solutions Architect at HAQM Web Services.