Advertising technology (ad tech) companies use Amazon DynamoDB to store various kinds of marketing data, such as user profiles, user events, clicks, and visited links. Some of the uses include real-time bidding (RTB), ad targeting, and attribution. In this blog post, I identify the most common use cases and design patterns of ad tech companies that use DynamoDB.
These use cases require a high request rate (millions of requests per second), low and predictable latency, and reliability. Companies use caching through DynamoDB Accelerator (DAX) when they have high read volumes or need submillisecond read latency. Increasingly, ad tech companies deploy their RTB and ad targeting platforms in more than one geographic region, which requires data replication between AWS Regions.
As a fully managed service, DynamoDB allows ad tech companies to meet all of these requirements without having to invest resources in database operations. These companies also find DynamoDB cost effective: migrating to it typically reduces their database spending. For example, when GumGum migrated their digital advertising platform to DynamoDB, they estimated their cost savings over the old database to be 65–70 percent.
Terminology used in this post
This post uses the following data modeling and design pattern terminology:
- 1:1 modeling: One-to-one relationship modeling using a partition key as the primary key.
- 1:M modeling: One-to-many relationship modeling using a partition key and a sort key as the primary key.
- Caching via DAX: The use of DAX as the read cache in front of DynamoDB helps reduce the latency of reads, as well as handle high read load on frequently accessed items in a cost-effective way.
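To make the first two patterns concrete, the sketch below shows hypothetical item shapes for each. The attribute names (`user_id`, `segment`, and so on) are illustrative examples, not a schema from any of the companies mentioned:

```python
# Illustrative item shapes for the two modeling patterns.
# All attribute names here are hypothetical examples.

# 1:1 modeling: the partition key alone is the primary key,
# and one item holds the whole profile.
profile_1to1 = {
    "user_id": "u#12345",  # partition key
    "country": "US",
    "interests": ["sports", "travel"],
}

# 1:M modeling: partition key plus sort key; one user owns many
# items, each holding one segment of the profile.
profile_segments_1toM = [
    {"user_id": "u#12345", "segment": "demo#core", "age_band": "25-34"},
    {"user_id": "u#12345", "segment": "intent#auto", "score": 0.82},
]

# Every segment item for a user shares the same partition key.
assert all(item["user_id"] == "u#12345" for item in profile_segments_1toM)
```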
Ad tech use cases and design patterns
|Use Case|Data Modeling or Design Pattern|
|---|---|
|User profile store in RTB and ad targeting|1:1 modeling, 1:M modeling|
|User events, clickstream, and impressions data store|1:M modeling|
|Metadata store for assets|1:1 modeling|
|Popular-item cache|Caching via DAX|
Use case: User profile store in real-time bidding (RTB) and ad targeting
RTB and ad targeting use cases require real-time response latency of 100 milliseconds or less. To ensure this low latency, ad tech companies have to manage the latency of each step in the processing flow, including database access. In addition, an RTB platform that serves impressions on the order of 100 billion per day requires a database that handles millions of requests per second and stores billions of user profiles and hundreds of terabytes of data. At such a scale, failing to answer even a small fraction of bids can amount to a million-dollar loss. For these reasons, ad tech companies such as AdRoll rely on DynamoDB to deliver single-digit millisecond latency at any scale.
Design patterns: User profiles are stored in a DynamoDB table using 1:1 or 1:M modeling. In the 1:1 modeling approach, user profiles are partitioned and accessed by user ID. In cases where more fine-grained access is needed, user profiles are segmented and stored as an item hierarchy using a 1:M modeling approach, with the user ID as the partition key and the segment as the sort key. The Query API is used to aggregate and retrieve all segments of a user profile. If the sort key is used to further segment profile data into a hierarchy, a key condition expression with the begins_with function is used for fine-grained access to the parts of the profile that match the specified key prefix.
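As a sketch of this access pattern, the helper below builds Query parameters for a hypothetical 1:M profile table with partition key `user_id` and sort key `segment` (the table and attribute names are assumptions for illustration):

```python
def build_profile_query(table_name, user_id, segment_prefix=None):
    """Build DynamoDB Query parameters for a 1:M user-profile table.

    Assumed schema: partition key 'user_id', sort key 'segment'.
    With no prefix, the query retrieves all segments of a profile;
    with a prefix, begins_with narrows it to one branch of the
    segment hierarchy.
    """
    params = {
        "TableName": table_name,
        "KeyConditionExpression": "user_id = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }
    if segment_prefix is not None:
        params["KeyConditionExpression"] += (
            " AND begins_with(segment, :prefix)"
        )
        params["ExpressionAttributeValues"][":prefix"] = {"S": segment_prefix}
    return params

# Retrieve only the 'intent#' branch of a user's profile hierarchy.
q = build_profile_query("UserProfiles", "u#12345", "intent#")
```

The resulting dictionary can be passed to the low-level client as `boto3.client("dynamodb").query(**q)`.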
Use case: User events, clickstream, and impressions data store
Ad tech companies store user events such as clicks and impressions in DynamoDB for fast access by user and time. For example, DataXu uses DynamoDB to store user events in their RTB platform’s attribution engine. They chose DynamoDB because it is a fully managed, cost-effective service that provides the kind of scale and performance they need. With DynamoDB, they do not have to manage scaling work such as adding nodes, and they easily scale to hundreds of millions of requests per day.
Design patterns: User events are stored as key-value pairs (a 1:M modeling pattern), with the user ID as the partition key and the timestamp as the sort key of the primary key. To save on write and read capacity and storage, data can be stored as compressed binary payloads. Typically, data is stored for a limited time, such as one month. To remove data from DynamoDB after it is no longer needed, companies use Time to Live (TTL) at no additional cost to delete data automatically when it expires.
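The sketch below combines both ideas: it compresses an event payload and stamps the item with a TTL attribute. The schema and attribute names (`user_id`, `ts`, `payload`, `expires_at`) are assumptions for illustration; DynamoDB TTL expects the designated attribute to hold an expiration time in epoch seconds.

```python
import json
import time
import zlib

def make_event_item(user_id, event, retention_days=30):
    """Build a user-event item with a compressed payload and a TTL.

    Hypothetical schema: partition key 'user_id', sort key 'ts'
    (epoch milliseconds), binary attribute 'payload', and TTL
    attribute 'expires_at' in epoch seconds.
    """
    now = time.time()
    return {
        "user_id": user_id,
        "ts": int(now * 1000),
        # Compressing the JSON payload saves write/read capacity
        # and storage for larger events.
        "payload": zlib.compress(json.dumps(event).encode("utf-8")),
        # DynamoDB TTL deletes the item after this epoch-seconds value.
        "expires_at": int(now) + retention_days * 86400,
    }

item = make_event_item("u#12345", {"type": "click", "ad_id": "a#9"})
# Reading the event back is a decompress-and-parse step.
original = json.loads(zlib.decompress(item["payload"]))
```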
Use case: Metadata store for assets
Ad tech companies also use DynamoDB to store the metadata of assets such as images, pages, and links. For example, in addition to user profiles, GumGum uses DynamoDB in their ad-targeting platform to store metadata for images and pages. Their use case requires large data stores that handle high traffic with low latencies spanning multiple data centers over geographical boundaries. GumGum selected DynamoDB because it satisfies these criteria, and it is cost effective and serverless, allowing their developers to easily scale and maintain the platform.
In another example, Branch, a mobile marketing and linking platform, uses DynamoDB as the database in their deep-linking platform. This platform provides a unified solution for serving and managing deep links to product webpages as well as link analytics. The platform requires a high-performance, scalable, and reliable key-value database that stores tens of billions of links and the associated metadata, amounting to dozens of terabytes of data. Branch uses DynamoDB because it meets these requirements, is cost effective, and has a predictable cost model. As a fully managed service, DynamoDB removes the operational burden from the Branch operations team.
Design patterns: Metadata for various assets—such as pages, images, and deep links—is stored in a DynamoDB table, partitioned by asset (a 1:1 modeling pattern).
Use case: Popular-item cache
This use case goes hand in hand with the earlier “Metadata store for assets” use case. Using DAX with DynamoDB absorbs massive read spikes on hot items and reduces read latency. By reducing the read load on DynamoDB, DAX can also help you reduce how much you spend on DynamoDB.
Design patterns: In Branch’s use case, the deep-linking platform uses DynamoDB to store links and their associated metadata, and DAX to cache new and popular links. The data is stored in a table that is partitioned by unique link ID, which is used as the primary key (a 1:1 modeling pattern). Though all links have to be available for fast access, a small percentage of links is accessed at a high rate daily. Those links are cached in DAX for efficient, submillisecond latency access and to reduce read load and cost.
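The read-through behavior that DAX provides can be sketched with an in-memory stand-in. DAX itself is a managed cluster that a DAX client accesses transparently with the same API as DynamoDB; the dict-based cache and the function names below are only an illustration of the access pattern for hot link IDs:

```python
# Minimal read-through cache sketch. In production, DAX plays the
# role of 'cache' transparently; this dict only illustrates the flow.
cache = {}

def fetch_from_dynamodb(link_id):
    # Stand-in for a DynamoDB GetItem call on the links table.
    return {"link_id": link_id, "target": "https://example.com/p/1"}

def get_link(link_id):
    if link_id not in cache:       # cache miss: read from DynamoDB,
        cache[link_id] = fetch_from_dynamodb(link_id)  # then populate
    return cache[link_id]          # cache hit: the submillisecond path

first = get_link("lnk#42")   # miss, populates the cache
second = get_link("lnk#42")  # hit, served from the cache
```

Because only a small percentage of links is hot on any given day, the cache stays small relative to the table while absorbing most of the read traffic.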
This blog post describes some of the most common ways that ad tech companies use DynamoDB. For more information about ad tech solutions on AWS, see AWS for Digital Marketing. Submit comments below, or start a new thread on the DynamoDB forum.
About the author
Edin Zulich is a senior NoSQL specialist solutions architect at AWS who helps customers in all industries design scalable and cost-effective solutions to challenging data management problems. Edin has been with AWS since 2016, and he has worked on and with distributed data technologies since 2005.
This blog post is the second in a series of industry-vertical posts. The first post is Amazon DynamoDB: Gaming use cases and design patterns.