Array Slice Cache

[Figure: cache coherency diagram, courtesy of http://upload.wikimedia.org/wikipedia/commons/a/a1/Cache_Coherency_Generic.png]

The array slice cache is a document-level cache that leverages the $slice operator in update statements, allowing us to maintain embedded arrays in documents. The operator ensures that an embedded array contains at most n entries.

A positive sliceAt value keeps the first n elements of the array after the push, while a negative value keeps the last n. The first example has a negative sliceAt value. It takes:

var array = [ 40, 50, 60 ]
var sliceAt = -5
var items = [ 80, 78, 86 ]
var result = [ 50, 60, 80, 78, 86 ]

The three items are appended, giving six elements, and the slice keeps the last five. The second example has a positive sliceAt value. It pushes two values onto the end of the array and slices the result down to 3 elements, keeping the first three and dropping the last pushed value.

var array = [ 89, 90 ]
var sliceAt = 3
var items = [ 100, 20 ]
var result = [ 89, 90, 100 ]
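The semantics of the two examples can be sketched in plain JavaScript. pushWithSlice is a hypothetical helper written for illustration, not a MongoDB API; it mimics what $push with $each and $slice does to an embedded array.

```javascript
// Sketch of $push + $each + $slice semantics on a plain array.
// pushWithSlice is a hypothetical helper, not part of MongoDB.
function pushWithSlice(array, items, sliceAt) {
  var combined = array.concat(items); // $each appends all items
  if (sliceAt >= 0) {
    return combined.slice(0, sliceAt); // positive: keep the first n elements
  }
  return combined.slice(sliceAt); // negative: keep the last n elements
}

// First example: negative sliceAt keeps the last five elements
var first = pushWithSlice([40, 50, 60], [80, 78, 86], -5);

// Second example: positive sliceAt keeps the first three elements
var second = pushWithSlice([89, 90], [100, 20], 3);
```

Note that a sliceAt of 0 empties the array entirely, which matches the behavior of $slice: 0 in MongoDB.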

The schema we will showcase here is a blog post document that contains the 10 latest comments inside an embedded array. This embedded array acts as a cache to avoid additional queries against the comments collection when retrieving the main blog web view.

Schema Observations

  • Allows for keeping a cache inside the document itself.
  • Lets the application optimize for single document reads.
  • Requires additional logic in your application to keep the cache current.

Schema

In our case, we’re going to use a very simple blog post schema to show how the cache works.

Let’s look at a post document from the posts collection with a single comment.

{
    "_id": ObjectId("54fd7392742abeef6186a68e")
  , "blogId": ObjectId("54fd7392742abeef6186a68e")
  , "createdOn": ISODate("2015-03-11T12:03:42.615Z")
  , "publishedOn": ISODate("2015-03-11T12:03:42.615Z")
  , "title": "Awesome blog post"
  , "content": "Testing this blog software"
  , "latestComments": [{
        "_id": ObjectId("54fd7392742abeef6186a68e")
      , "publishedOn": ISODate("2015-03-11T12:03:42.615Z")
      , "message": "Awesome post"
    }]
}

A simple comment document from the comments collection.

{
    "_id": ObjectId("54fd7392742abeef6186a68e")
  , "postId": ObjectId("54fd7392742abeef6186a68e")
  , "publishedOn": ISODate("2015-03-11T12:03:42.615Z")
  , "message": "Awesome post"
}

Operations

Add a comment

When a user enters a comment, it is first written to the comments collection and then added to the latestComments embedded array cache in the blog post document. For this example, we are going to assume that we have a blog post in the posts collection with the _id field set to the value 1.

var postId = 1;
var posts = db.getSisterDB("blog").posts;
var comments = db.getSisterDB("blog").comments;

var comment = {
    "_id": ObjectId("54fd7392742abeef6186a68e")
  , "postId": postId
  , "publishedOn": ISODate("2015-03-11T12:03:42.615Z")
  , "message": "Awesome post"
};

comments.insert(comment);

posts.update({
    "_id": postId
  }, {
    $push: {
      latestComments: {
          $each: [comment]
        , $slice: -10
      }
    }
  });

Example 1: Add comment to the latestComments cache

This pushes the new comment onto the end of the latestComments embedded array and slices the array, keeping only the last 10 elements. To display the ten latest comments in descending order, all we need to do is read the blog post document and reverse the latestComments array to get the comments in the right order.
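The read side can be sketched as follows. The post object here is a stub standing in for the result of posts.findOne({_id: postId}); the comment messages are illustrative.

```javascript
// Sketch of the read side: a stubbed post document standing in for
// the result of posts.findOne({_id: postId}).
var post = {
  title: "Awesome blog post",
  latestComments: [
    { message: "oldest cached comment" },
    { message: "middle comment" },
    { message: "newest comment" }
  ]
};

// slice() copies the array before reversing, so the cached order
// inside the document object is left untouched.
var latestFirst = post.latestComments.slice().reverse();
```

The newest comment now comes first, ready for rendering in the blog view.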

Indexes

We do not need any specific indexes in this example as we are always referencing documents by the _id field.

Scaling

Secondary Reads

Secondary reads come with a latency penalty, as any changes to the cache might take some time to replicate to the secondaries. However, depending on the application's needs, this latency might be acceptable. An example might be a Content Management System (CMS) where a published article does not need to be immediately visible across the entire web site.

Here, secondary reads would allow us to offload reads from the primary, and distribute the load across multiple secondaries.

Sharding

For sharding, it makes sense to select a shard key that places all posts for a given blog on the same shard. For the comments, we likewise want all queries for a specific blog post's comments routed to a single shard.

To do this efficiently, we should use a compound shard key that contains a routing part (the blogId) as well as a rough time stamp (the _id ObjectId, which embeds a creation timestamp). In our case, the logical key is {blogId: 1, _id: 1} for the posts collection.

var admin = db.getSisterDB("admin");
admin.runCommand({enableSharding: 'blog'});
admin.runCommand({
    shardCollection: 'blog.posts'
  , key: {blogId: 1, _id: 1}
});

Example 2: Sharding key posts collection

Similarly, for the comments collection, we want to group all the comments for a post together on the same shard, so we use the compound shard key {postId: 1, _id: 1}, routing by postId.

var admin = db.getSisterDB("admin");
admin.runCommand({enableSharding: 'blog'});
admin.runCommand({
    shardCollection: 'blog.comments'
  , key: {postId: 1, _id: 1}
});

Example 3: Sharding key comments collection

Performance

Performance is good here, as the $slice update operator lets us do mostly in-place updates. In the blog example we might see more document moves, since the size of latestComments changes with varying comment sizes. With uniformly sized documents, we will not incur any additional document moves.

tip

Preallocate

If the documents inserted into the document-level cache are uniformly sized, we can avoid document moves by pre-allocating the embedded array, filling it with empty documents of the expected size. The document will then not be moved around once we start pushing real documents into the cache.
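The pre-allocation step can be sketched as follows. buildPlaceholders, CACHE_SIZE, and EXPECTED_MESSAGE_BYTES are all hypothetical names and sizes chosen for illustration; the padding length should match the expected size of a real comment.

```javascript
// Sketch: pre-allocate a latestComments cache with placeholder entries
// of roughly the expected comment size, so later $push/$slice updates
// do not grow the document. Sizes here are illustrative assumptions.
var CACHE_SIZE = 10;
var EXPECTED_MESSAGE_BYTES = 140;

function buildPlaceholders() {
  var placeholders = [];
  for (var i = 0; i < CACHE_SIZE; i++) {
    // new Array(n + 1).join("x") produces a string of n characters
    placeholders.push({ message: new Array(EXPECTED_MESSAGE_BYTES + 1).join("x") });
  }
  return placeholders;
}

// In the mongo shell, the post would then be inserted with the
// pre-allocated cache already in place, for example:
// posts.insert({ _id: postId, title: "...", latestComments: buildPlaceholders() });
```

Once real comments start arriving, the $push with $slice: -10 gradually replaces the placeholders while the document footprint stays roughly constant.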

A simple exploration of the performance on a single machine with MongoDB 3.0 shows the difference between MMAP and WiredTiger for this pattern, using the schema simulation framework mongodb-schema-simulator.

Scenarios

https://github.com/christkv/mongodb-schema-simulator/blob/master/examples/scripts/single_or_replset/array_slice/cache_slice_scenario.js

https://github.com/christkv/mongodb-schema-simulator/blob/master/examples/scripts/single_or_replset/array_slice/pre_allocated_cache_slice_scenario.js

MongoDB runs locally on a MacBook Pro Retina 2015 with an SSD and 16 GB of RAM. The simulation runs with the following parameters against a single MongoDB instance under OS X 10.10 Yosemite.

Pre-allocated slice cache

  • processes: 4
  • poolSize per process: 50
  • type: linear
  • Resolution in milliseconds: 1000
  • Iterations run: 25
  • Number of users: 1000
  • Execution strategy: slicetime

Slice cache no pre-allocation

  • processes: 4
  • poolSize per process: 50
  • type: linear
  • Resolution in milliseconds: 1000
  • Iterations run: 25
  • Number of users: 1000
  • Execution strategy: slicetime

MMAP

The MMAP engine is run using the default settings on MongoDB 3.0.1.

Pre-allocated cache


  • Runtime: 30.98 seconds
  • Mean: 0.668 milliseconds
  • Standard Deviation: 0.163 milliseconds
  • 75 percentile: 0.715 milliseconds
  • 95 percentile: 0.874 milliseconds
  • 99 percentile: 1.155 milliseconds
  • Minimum: 0.361 milliseconds
  • Maximum: 5.799 milliseconds

Slice cache no pre-allocation


  • Runtime: 30.507 seconds
  • Mean: 0.663 milliseconds
  • Standard Deviation: 0.155 milliseconds
  • 75 percentile: 0.715 milliseconds
  • 95 percentile: 0.86 milliseconds
  • 99 percentile: 1.088 milliseconds
  • Minimum: 0.365 milliseconds
  • Maximum: 8.494 milliseconds

WiredTiger

The WiredTiger engine is run using the default settings on MongoDB 3.0.1.

Pre-allocated cache


  • Runtime: 30.606 seconds
  • Mean: 0.756 milliseconds
  • Standard Deviation: 0.199 milliseconds
  • 75 percentile: 0.812 milliseconds
  • 95 percentile: 1.085 milliseconds
  • 99 percentile: 1.418 milliseconds
  • Minimum: 0.408 milliseconds
  • Maximum: 5.627 milliseconds

Slice cache no pre-allocation


  • Runtime: 30.033 seconds
  • Mean: 1.012 milliseconds
  • Standard Deviation: 0.4 milliseconds
  • 75 percentile: 1.234 milliseconds
  • 95 percentile: 1.783 milliseconds
  • 99 percentile: 2.185 milliseconds
  • Minimum: 0.417 milliseconds
  • Maximum: 8.582 milliseconds

The difference between MMAP and WiredTiger is minor in both cases. The small gap between pre-allocation and no pre-allocation is mostly due to the uniform object size used in this test. If the cache document is fairly big, it will have a larger impact on WiredTiger, since it needs to rewrite more data on each update.

Notes

This is an effective way to cache results in the main document without incurring the side effects of unbounded growth in embedded arrays. It's a useful strategy when you wish to optimize for single-document reads for specific web or application views. In this specific example, we can imagine a web page showing a blog post with its ten latest comments.