AWS Database Blog

Reduce cost and improve performance by migrating to HAQM DocumentDB 5.0

HAQM DocumentDB (with MongoDB compatibility) is a fully managed native JSON document database that makes it easy and cost effective to operate critical document workloads at virtually any scale without managing infrastructure. HAQM DocumentDB simplifies your architecture by providing built-in security best practices, continuous backups, and native integrations with other AWS services. You can enhance your applications with generative artificial intelligence (AI) and machine learning (ML) capabilities using vector search for HAQM DocumentDB and integration with HAQM SageMaker Canvas. As a document database, HAQM DocumentDB makes it straightforward to store, query, and index JSON data.

HAQM DocumentDB 5.0 introduced several new features and enhancements that can improve performance and help manage database cost, making it a compelling choice for various workloads. The following are some of the key benefits of HAQM DocumentDB 5.0:

  • Compatibility with the MongoDB API version 5.0 – HAQM DocumentDB 5.0 offers compatibility with the MongoDB API version 5.0, allowing you to take advantage of the latest capabilities and enhancements of MongoDB APIs.
  • Vector search – HAQM DocumentDB 5.0 now supports vector search, a new capability that enables you to store, index, and query millions of vectors with millisecond response time. Vectors are numerical representations of unstructured data, such as text, created from machine learning (ML) models that help capture the semantic meaning of the underlying data. Vector search for HAQM DocumentDB can store vectors from HAQM Bedrock, HAQM SageMaker, and more.

    The following diagram shows how to generate vectors using embedding models for fields inside your documents, store vectors inside HAQM DocumentDB, and perform vector search.

  • Text search – With text search, you can search large string data for specific terms or phrases using the $text and $search operators, assign different significance levels to the indexed fields using weights, and sort the search results by relevance using the $meta operator.
  • I/O-Optimized storage configuration – HAQM DocumentDB 5.0 introduced I/O-Optimized storage configuration. This new storage configuration offers improved performance, increased write throughput, and reduced latency for demanding workloads, along with up to 40% cost savings for I/O-intensive applications.
  • Performance and indexing improvements – HAQM DocumentDB 5.0 introduced $in performance improvements and index scans with the $elemMatch operator, allowing for better query performance. HAQM DocumentDB also launched index improvements, enabling faster index builds on collections and the ability to view index build statuses. HAQM DocumentDB index builds can now be up to 14 times faster when using parallel workers compared to using a single worker.

    With partial indexes, you can create an index on a subset of documents that meet a specific filter criterion. Because the index covers only that subset, queries that target it scan less data, which reduces query times and improves performance.

  • Client-side field-level encryption – With HAQM DocumentDB 5.0, you can encrypt sensitive data in your applications before it is sent to the database, enhancing data security and compliance.
  • Document compression – The new version introduced support for document compression, which allows you to lower storage and I/O costs by compressing the documents in your collections. You can enable document compression at a collection level for documents with a size of 2 KB and larger.
  • No-code ML with HAQM SageMaker Canvas – HAQM DocumentDB now integrates with HAQM SageMaker Canvas to enable no-code ML with data stored in HAQM DocumentDB. You can build ML models for regression and forecasting needs and use foundation models for content summarization and generation using data stored in HAQM DocumentDB without writing a single line of code. The following diagram shows how HAQM DocumentDB works with SageMaker Canvas.
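
Several of the features listed above (text search, partial indexes, vector search, and document compression) are turned on through index or collection options. The following Python sketch builds the corresponding database command documents as plain dictionaries so the option shapes are easy to compare. It is a minimal illustration rather than a definitive implementation: the collection names, field names, index names, and the HNSW tuning values (`m`, `efConstruction`) are assumptions chosen for the example, and the option names should be checked against the DocumentDB Developer Guide before use. Nothing in the block connects to a database.

```python
"""Sketch: command documents for a few DocumentDB 5.0 features.

All collection, field, and index names are hypothetical. Each function
only builds a dict; sending it to a cluster is left to the caller.
"""


def text_index(collection, weights):
    """createIndexes command for a text index with per-field weights."""
    return {
        "createIndexes": collection,
        "indexes": [{
            "key": {field: "text" for field in weights},
            "name": "text_idx",
            "weights": dict(weights),
        }],
    }


def text_query(collection, terms):
    """find command that searches with $text/$search and sorts by $meta score."""
    score = {"$meta": "textScore"}
    return {
        "find": collection,
        "filter": {"$text": {"$search": terms}},
        "projection": {"score": score},
        "sort": {"score": score},
    }


def partial_index(collection, field, filter_expr):
    """createIndexes command that indexes only documents matching filter_expr."""
    return {
        "createIndexes": collection,
        "indexes": [{
            "key": {field: 1},
            "name": f"{field}_partial",
            "partialFilterExpression": filter_expr,
        }],
    }


def vector_index(collection, field, dimensions, similarity="cosine"):
    """createIndexes command for an HNSW vector index (tuning values assumed)."""
    return {
        "createIndexes": collection,
        "indexes": [{
            "key": {field: "vector"},
            "name": f"{field}_hnsw",
            "vectorOptions": {
                "type": "hnsw",
                "dimensions": dimensions,
                "similarity": similarity,
                "m": 16,
                "efConstruction": 64,
            },
        }],
    }


def vector_search_stage(field, query_vector, k, similarity="cosine"):
    """Aggregation stage that returns the k nearest vectors to query_vector."""
    return {
        "$search": {
            "vectorSearch": {
                "vector": list(query_vector),
                "path": field,
                "similarity": similarity,
                "k": k,
            }
        }
    }


def compressed_collection(collection):
    """create command that enables document compression for a new collection."""
    return {
        "create": collection,
        "storageEngine": {"documentDB": {"compression": {"enable": True}}},
    }
```

With a MongoDB-compatible driver such as PyMongo, a dictionary like `vector_index("products", "embedding", 1536)` could be passed to `db.command(...)`, and `vector_search_stage(...)` could be used as the first stage of an `aggregate` pipeline. The dimension count must match the embedding model that produced the vectors.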

Upgrade to HAQM DocumentDB 5.0

HAQM DocumentDB 5.0 is easy to get started with. If you are already using HAQM DocumentDB 3.6 or 4.0 clusters, you can upgrade to HAQM DocumentDB 5.0 using an in-place major version upgrade (MVU), which makes it straightforward to upgrade to the latest version without needing to create new clusters or rely on database migration tools.
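
As a rough illustration of what an in-place MVU request might look like from code, the following sketch prepares the arguments for the `modify_db_cluster` operation of the AWS SDK for Python (boto3) `docdb` client. The cluster identifier is hypothetical, the target version string `5.0.0` is an assumption, and the sketch leaves out the preparation steps (such as taking a snapshot and testing on a cloned cluster) that you would normally complete first. Treat it as a starting point under those assumptions, not as the documented upgrade procedure.

```python
def major_version_upgrade_params(cluster_id, target_version="5.0.0",
                                 apply_immediately=False):
    """Build keyword arguments for an in-place major version upgrade.

    target_version is an assumed engine version string; confirm the exact
    value offered for your cluster before using it.
    """
    return {
        "DBClusterIdentifier": cluster_id,
        "EngineVersion": target_version,
        # Required when the change crosses a major version boundary.
        "AllowMajorVersionUpgrade": True,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }


def start_upgrade(cluster_id, apply_immediately=False):
    """Send the upgrade request; needs boto3 and AWS credentials."""
    import boto3  # third-party dependency, imported only when called

    client = boto3.client("docdb")
    params = major_version_upgrade_params(
        cluster_id, apply_immediately=apply_immediately
    )
    return client.modify_db_cluster(**params)
```

Keeping the argument construction separate from the API call makes it possible to review or log the request before anything is changed. Calling `start_upgrade("my-docdb-cluster")`, where the cluster name is a placeholder, would schedule the upgrade for the next maintenance window.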

Summary

The release of HAQM DocumentDB 5.0 introduced a range of new features and performance enhancements geared toward improving workload performance and reducing database costs. HAQM DocumentDB supports MongoDB API versions 3.6, 4.0, and 5.0, so you can use the same MongoDB-compatible drivers, applications, and tools with little or no change. Furthermore, the in-place upgrade feature makes it simple to upgrade existing 3.6 or 4.0 clusters, removing the need for data migration or new cluster creation and providing a straightforward path to the latest capabilities offered by HAQM DocumentDB 5.0.

As always, AWS welcomes your feedback. Leave any thoughts or questions in the comments section.


About the Authors

Srinivas Margasahayam leads the worldwide DocumentDB specialist Solution Architects team. Srinivas has over two decades of experience in databases and in leading engineering and operations teams. At Yahoo!, Srinivas was the director of the Engineering and Operations teams, where he oversaw platform operations. At HAQM, Srinivas has been leading Solution Architecture teams for over 8 years, helping customers through the intricacies of cloud migration and workload optimization.

Anshu Vajpayee is a Senior DocumentDB Specialist Solutions Architect at HAQM Web Services (AWS). He has been helping customers to adopt NoSQL databases and modernize applications leveraging HAQM DocumentDB. Before joining AWS, he worked extensively with relational and NoSQL databases.