12 things you should know about Amazon DocumentDB (with MongoDB compatibility)

Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. You can use the same MongoDB 3.6 application code, drivers, and tools to run, manage, and scale workloads on Amazon DocumentDB without having to worry about managing the underlying infrastructure. As a document database, Amazon DocumentDB makes it easy to store, query, and index JSON data.

AWS built Amazon DocumentDB to solve your challenges around availability, reliability, durability, scalability, backup, and more. In doing so, we built several novel capabilities to remove undifferentiated heavy lifting and help reduce costs. This post introduces 12 Amazon DocumentDB capabilities you may not be aware of that can help you build and scale your MongoDB workloads.

1. Modern, cloud-native architecture

Amazon DocumentDB was built from the ground up with a cloud-native database architecture. Its architecture separates storage and compute so that each layer can scale independently. Amazon DocumentDB uses a purpose-built, distributed, fault-tolerant, self-healing storage system that achieves high availability and durability by replicating data six ways across three AWS Availability Zones (AZs). For more information, see the video AWS re:Invent 2019: Amazon DocumentDB deep dive on YouTube. The following diagram shows the separation of compute and storage in the Amazon DocumentDB architecture and how data is replicated six ways across three AZs.

2. Scale compute in minutes, regardless of data size

Because the storage volume is separated from the compute instances, the compute instances don’t rely on attached storage that is unique to the instance. Each instance in the cluster mounts the distributed storage volume; therefore, when new instances are added, no copying of data is required. As a result, you can add a replica instance to your cluster or scale up instances in minutes, increasing throughput up to millions of reads per second, regardless of data size. Similarly, you can scale down and scale in just as easily, without affecting the performance of your other instances.

3. Automatic, no-impact, inexpensive backups

In traditional database architectures, backups run at the compute layer, which can affect database performance. In Amazon DocumentDB, backups are instead handled by the storage layer and are continually streamed to Amazon S3. As a result, taking a snapshot doesn’t affect database performance, so you can take snapshots whenever you need to without slowing your production database.

In Amazon DocumentDB, continuous backup is enabled by default, providing 1 day of point-in-time restore (PITR). You can’t disable backups, but you can increase the PITR backup retention period to up to 35 days. Additionally, you can take manual snapshots for long-term archival at any time. To offset the cost of enabling 1 day of backups by default, Amazon DocumentDB doesn’t charge for backup storage of up to 100% of your total cluster storage for a Region. Additional backup storage costs $0.02/GB per month. Furthermore, because backups happen at the storage layer, not at the compute layer, they don’t use your compute resources or incur I/O costs.
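
The backup pricing described above can be sketched as a small calculation. This is an illustration only: the function name `monthlyBackupCost`, its parameters, and the sample sizes are hypothetical, and the $0.02/GB rate is simply the figure quoted in this post, so check current pricing for your Region before relying on it.

```javascript
// Illustrative sketch: estimates the monthly backup storage charge from the
// figures quoted in this post. All names here are hypothetical.
const BACKUP_RATE_PER_GB_MONTH = 0.02; // USD, as quoted in this post

function monthlyBackupCost(totalClusterStorageGB, backupStorageGB) {
  // Backup storage up to 100% of total cluster storage is not charged.
  const freeGB = totalClusterStorageGB;
  const billableGB = Math.max(0, backupStorageGB - freeGB);
  return billableGB * BACKUP_RATE_PER_GB_MONTH;
}

// Example: 500 GB of cluster storage and 750 GB of backups leaves 250 GB billable.
console.log(monthlyBackupCost(500, 750).toFixed(2));
```

In this sketch, backups that fit within the size of your cluster storage produce a charge of zero, which matches the default 1-day retention case described above.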

4. Autoscaling storage and I/Os

When you provision an Amazon DocumentDB cluster, you don’t need to specify how much storage or I/Os you need for your cluster. Amazon DocumentDB uses a unique storage system that automatically scales from 10 GB up to 64 TB of data per cluster in 10 GB increments. Autoscaling of storage and I/Os helps you save time and money by not having to worry about capacity planning or over-provisioning storage infrastructure.
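
As a rough mental model of the growth behavior described above, the following sketch rounds a data size up to the next 10 GB increment within the 10 GB to 64 TB range. It is a toy model, not how the service is implemented or billed: the function name `volumeSizeGB` is hypothetical, and the sketch assumes 1 TB equals 1,000 GB.

```javascript
// Toy model of storage growth in 10 GB increments. Names are hypothetical,
// and 1 TB is assumed to be 1,000 GB for the purposes of this sketch.
const INCREMENT_GB = 10;
const MIN_VOLUME_GB = 10;
const MAX_VOLUME_GB = 64 * 1000;

function volumeSizeGB(dataGB) {
  if (dataGB < 0) {
    throw new RangeError("dataGB must be non-negative");
  }
  if (dataGB > MAX_VOLUME_GB) {
    throw new RangeError("data exceeds the 64 TB per-cluster limit");
  }
  // Round up to the next full increment, never going below the minimum size.
  const rounded = Math.ceil(dataGB / INCREMENT_GB) * INCREMENT_GB;
  return Math.max(MIN_VOLUME_GB, rounded);
}

console.log(volumeSizeGB(123)); // 123 GB of data rounds up to the next increment
```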

5. Scaling reads on replicas

In Amazon DocumentDB, the storage layer handles data replication and durability. Unlike traditional database architectures, replica instances in Amazon DocumentDB aren’t data bearing and don’t participate in a replication protocol to achieve quorum for durability. As a result, you can scale reads on your replica instances to get more performance from the compute resources you’re paying for and achieve high availability. For more information, see Connecting to Amazon DocumentDB as a Replica Set.
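
One common way to send reads to replicas is to connect in replica set mode with a read preference that favors secondaries. The sketch below builds such a connection URI. The helper name `buildReplicaSetUri` and all argument values are hypothetical placeholders, and the replica set name `rs0` and the `ssl=true` option are assumptions based on the AWS documentation referenced above rather than on this post, so confirm them against the connection string shown for your own cluster.

```javascript
// Builds a MongoDB connection URI that connects in replica set mode and
// prefers replica instances for reads. Helper name and values are hypothetical.
function buildReplicaSetUri({ user, password, clusterEndpoint, port = 27017 }) {
  // Percent-encode credentials so characters such as "@" or "/" stay valid in the URI.
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const options = new URLSearchParams({
    ssl: "true",                          // assumption: TLS is enabled on the cluster
    replicaSet: "rs0",                    // assumption: replica set name from the AWS docs
    readPreference: "secondaryPreferred", // read from replicas when available
  });
  return `mongodb://${credentials}@${clusterEndpoint}:${port}/?${options.toString()}`;
}

console.log(
  buildReplicaSetUri({
    user: "appUser",
    password: "<password>",
    clusterEndpoint: "example-cluster.example.com", // placeholder endpoint
  })
);
```

With `secondaryPreferred`, a driver reads from replicas when they are available and falls back to the primary otherwise, so reads continue even on a single-instance cluster.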

6. Implicit transactions

In Amazon DocumentDB, all CRUD statements (findAndModify, update, insert, delete) guarantee atomicity and consistency, even for operations that modify multiple documents. This behavior differs from MongoDB 3.6, which provides atomic guarantees only for commands that modify a single document. The following code shows example Amazon DocumentDB operations that modify multiple documents atomically and consistently:

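// Double the flight miles of every credit card holder in one multi-document update.
db.miles.update(
   {"credit_card": {$eq: true}},
   {$mul: { "flight_miles.$[]": NumberInt(2) }},
   { multi: true }
)

// The same multi-document update, expressed with updateMany.
db.miles.updateMany(
   {"credit_card": {$eq: true}},
   {$mul: { "flight_miles.$[]": NumberInt(2) }}
)

// The same update, issued through the underlying update command.
db.runCommand({
   update: "miles",
   updates: [
      {q: {"credit_card": {$eq: true}}, u: {$mul: { "flight_miles.$[]": NumberInt(2) }}, multi: true}
   ]
})

// Delete every product that costs more than 30.00.
db.products.deleteMany({ "cost": { $gt: 30.00 } });

// The same delete as a command; limit: 0 removes all matching documents.
db.runCommand({
   delete: "products",
   deletes: [ {q: { "cost": { $gt: 30.00 } }, limit: 0 } ]
})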

7. Free DMS for migrations to Amazon DocumentDB

AWS Database Migration Service (DMS) helps you migrate databases to Amazon DocumentDB quickly and securely. You can use AWS DMS for free (for 6 months) to easily migrate your on-premises or EC2 MongoDB databases to Amazon DocumentDB with virtually no downtime. For more information, see AWS Database Migration Service: Free DMS. For more information about migrations, see Migrating to Amazon DocumentDB.

8. Highly durable, single-instance clusters for development and testing

Amazon DocumentDB is highly durable by default. Because the storage handles its durability, and storage isn’t a function of how many instances you have in a cluster, you can create a single-instance cluster that’s still highly durable. Single-instance clusters are useful to save costs for dev and test workloads. For information about reducing costs, see Cost Optimization.

9. Broad set of compliance certifications and security controls

Amazon DocumentDB provides numerous security controls. It supports role-based access control (RBAC), so you can create users and attach built-in roles to restrict which operations each user can perform. Amazon DocumentDB is also a VPC-only service. Amazon VPC lets you provision a logically isolated section of the AWS Cloud where you can launch AWS resources, like an Amazon DocumentDB cluster, in a virtual network that you define. Amazon DocumentDB allows you to encrypt your databases using keys you create and control through AWS KMS. On a cluster running with Amazon DocumentDB encryption, data stored at rest in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster. By default, connections between a client and Amazon DocumentDB are encrypted in transit with TLS.
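
To make the RBAC description concrete, the following sketch builds a `createUser` command document that attaches a single built-in role to a new user. The helper name `buildCreateUserCommand`, the user name, and the database name are hypothetical, and the sketch assumes the standard MongoDB `createUser` command shape and the built-in `read` role.

```javascript
// Builds a createUser command document that grants one built-in role on one
// database. The helper name and the sample values below are hypothetical.
function buildCreateUserCommand(user, password, role, database) {
  return {
    createUser: user,
    pwd: password,
    roles: [{ role: role, db: database }],
  };
}

// Example: a read-only user for a hypothetical "sales" database.
// In the mongo shell, a document like this could be passed to db.runCommand().
const readOnlyUser = buildCreateUserCommand("reportingUser", "<password>", "read", "sales");
console.log(JSON.stringify(readOnlyUser, null, 2));
```

Granting the narrowest built-in role that covers what an application needs, such as `read` for a reporting job, keeps the set of operations each user can perform as small as possible.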

Amazon DocumentDB meets the highest security standards and makes it easy for you to verify AWS security and meet your own regulatory and compliance obligations. Amazon DocumentDB is assessed to comply with PCI DSS; ISO 9001, 27001, 27017, and 27018; SOC 1, 2, and 3; and the Health Information Trust Alliance (HITRUST) Common Security Framework (CSF) certification, in addition to being HIPAA eligible. AWS compliance reports are available for download in AWS Artifact.

10. Starting and stopping Amazon DocumentDB clusters

Amazon DocumentDB enables you to stop and start clusters to help save on costs. This makes it easy and affordable to use clusters for development and test purposes where the cluster isn’t required to be running all the time. When you stop a cluster, you bring its compute, and the associated instance cost, down to zero. For more information, see Stopping and Starting an Amazon DocumentDB Cluster.

11. Profiling for slow queries

You can use the profiler in Amazon DocumentDB to log the execution time and details of queries performed on your cluster to Amazon CloudWatch Logs. The profiler is useful for monitoring the slowest operations on your cluster to help you improve individual query performance and overall cluster performance. For more information, see Profiling Amazon DocumentDB Operations.
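
Once profiler output has been exported from CloudWatch Logs, a short script can surface the slowest operations. The sketch below assumes that each log line is a JSON document with a numeric `millis` field holding the execution time, plus `op` and `ns` fields. Those field names, the helper name `slowestOperations`, and the sample entries are assumptions for illustration, so compare them with your actual log entries.

```javascript
// Returns the slowest logged operations at or above a threshold, slowest first.
// Assumes each line is a JSON document with a numeric "millis" field.
function slowestOperations(logLines, thresholdMs, limit = 5) {
  return logLines
    .map((line) => JSON.parse(line))
    .filter((entry) => typeof entry.millis === "number" && entry.millis >= thresholdMs)
    .sort((a, b) => b.millis - a.millis)
    .slice(0, limit);
}

// Made-up sample entries; real profiler entries carry more fields.
const sampleLines = [
  '{"op":"query","ns":"shop.products","millis":120}',
  '{"op":"update","ns":"shop.miles","millis":450}',
  '{"op":"query","ns":"shop.orders","millis":35}',
];

for (const entry of slowestOperations(sampleLines, 100)) {
  console.log(`${entry.op} on ${entry.ns}: ${entry.millis} ms`);
}
```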

12. Per-second pricing

Amazon DocumentDB instances are billed in 1-second increments. With transparent on-demand pricing and no up-front commitment required, Amazon DocumentDB’s per-second billing provides additional granularity, so you only pay for the capacity you use. For more information, see Amazon DocumentDB (with MongoDB compatibility) pricing.
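
The effect of 1-second billing granularity can be illustrated with a short calculation that compares it with a hypothetical scheme that rounds up to whole hours. The hourly rate below is a made-up placeholder, not a published price, the function names are hypothetical, and the sketch ignores any minimum charges or other pricing terms that might apply.

```javascript
// Estimates an instance charge under per-second billing.
// Assumption: partial seconds are rounded up to the next whole second.
function costBilledPerSecond(hourlyRateUsd, secondsRunning) {
  if (secondsRunning < 0) {
    throw new RangeError("secondsRunning must be non-negative");
  }
  return (hourlyRateUsd / 3600) * Math.ceil(secondsRunning);
}

// Hypothetical comparison: the same usage if billing rounded up to whole hours.
function costBilledPerHour(hourlyRateUsd, secondsRunning) {
  if (secondsRunning < 0) {
    throw new RangeError("secondsRunning must be non-negative");
  }
  return hourlyRateUsd * Math.ceil(secondsRunning / 3600);
}

const exampleRate = 0.36;        // hypothetical USD per instance-hour
const exampleSeconds = 90 * 60;  // an instance that runs for 90 minutes
console.log(costBilledPerSecond(exampleRate, exampleSeconds).toFixed(2));
console.log(costBilledPerHour(exampleRate, exampleSeconds).toFixed(2));
```

In this example, the 90-minute run is charged for exactly 5,400 seconds rather than for two full hours, which is the granularity benefit described above.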

Summary

AWS built Amazon DocumentDB as a fully managed database service to solve your challenges around availability, reliability, durability, scalability, backup, and more. This post introduced you to 12 Amazon DocumentDB capabilities you may not be aware of that can help you build and scale your MongoDB workloads on Amazon DocumentDB.

To get started with Amazon DocumentDB, see Getting Started with Amazon DocumentDB (with MongoDB compatibility); Part 2 – using AWS Cloud9. To learn more about migrating to Amazon DocumentDB, see the migration guide and a demo of a live migration.

About the Authors

Joseph Idziorek is a Principal Product Manager at Amazon Web Services.

Jeff Duffy is a Sr NoSQL Specialist Solutions Architect at Amazon Web Services.