Account Transactions

Metadata, courtesy of http://www.flickr.com/photos/68751915@N05/6629034769

MongoDB 3.0 support any notion of a transaction across one or more documents. It does, however, guarantee atomic write operations on single documents. This allows us to implement a Two-Phase commit strategy using double bookkeeping to simulate transactions. However, it’s important to note that due to only single document operations being atomic, MongoDB can only offer transaction-like semantics. It’s still possible for applications to return intermediate results during the two-phase commit or rollback.

In this chapter, we will use two collections to simulate a bank account system. The first collection is accounts and contains all the customer accounts, while the second one is transactions and is our bookkeeping collection. The goal is to transfer 100 from Joe to Peter using a two-phase commit.

Schema Observations

  • Simple.
  • Requires background processes to manage unfinished transactions.
  • Can still cause intermediate results during a two-phase commit or rollback.

Schema

In our simplified schema, we will only keep the basic necessary fields needed to perform the account transaction.

Let’s look at an example document representing an account.

{
    "_id": ObjectId("54fd7392742abeef6186a68e")
  , "name": "Joe Moneylender"
  , "balance": 1000
  , "pendingTransactions": []
}
Schema Attributes Description
name Name of the account owner (name is used for simplicities sake)
balance The account balance in a magic universal currency.
pendingTransactions Contains any pending transactions on the account

A transaction document example would look something like.

{
    "_id": ObjectId("54fd7392742abeef6186a68e")
  , "source": "Joe Moneylender"
  , "destination": "Peter Bum"
  , "balance": 200
  , "state": "initial"
}
Schema Attributes Description
source Name of the source bank account
destination Name of the destination bank account
balance Contains the amount of the transaction
state The state of the transaction. Can be one of initial, pending, committed, done and canceled

Operations

Performing a successful Transfer

Create two accounts for Joe and Peter, crediting them each with an initial balance of 1000.

var col = db.getSisterDB("bank").accounts;
col.insert({name: "Joe Moneylender", balance: 1000, pendingTransactions:[]});
col.insert({name: "Peter Bum", balance: 1000, pendingTransactions:[]});

Example 1: Create two accounts

We are going perform a transfer from Joe to Peter of 100. We start by creating a new transaction and setting its state to initial.

var col = db.getSisterDB("bank").transactions;
col.insert({source: "Joe", destination: "Peter", amount: 100, state: "intital"});

Example 2: Setup initial transaction

After successfully creating the initial transaction, we are going to attempt to apply it to both accounts. First we need to flip the transaction state to pending in order to signal that we are attempting to apply it to both accounts.

var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "initial"});
col.update({_id: transaction._id}, {$set: {state: "pending"}});

Example 3: Update transaction to pending

Next, we attempt to apply the transaction to the source account. The transaction variable represents the previously created transaction document.

var col = db.getSisterDB("bank").accounts;
col.update({
    name: transaction.source, pendingTransactions: {$ne: transaction._id}, balance: { $gte: amount}
  }, {
    $inc: {balance: -transaction.amount}, $push: {pendingTransactions: transaction._id}
  });

Example 4: Debit source account

The update will only succeed if the transaction is not already present on the account and there is enough money to cover the attempted transaction (there is no concept of credit in this world).

If the criteria are met, we push the transaction id to the pendingTransactions embedded array and deduct the amount from the account balance.

After applying the transaction successfully to the source account, we credit the destination account with the transfer amount requested.

var col = db.getSisterDB("bank").accounts;
col.update({
    name: transaction.source, pendingTransactions: {$ne: transaction._id}, balance: { $gte: amount}
  }, {
    $inc: {balance: -transaction.amount}, $push: {pendingTransactions: transaction._id}
  });

Example 5: Credit destination account

As with the source we ensure the transaction does not already exist on the destination account to guard against applying the same transaction multiple times.

Once we have successfully applied the transaction to the source and the destination accounts, our account documents will look something like the following (the pendingTransactions ObjectId entries will vary).

{
    "_id": ObjectId("54fd7392742abeef6186a68e")
  , "name": "Joe Moneylender"
  , "balance": 900
  , "pendingTransactions": [ObjectId("54fd7392742abeef6186a68e")]
}

Example 6: Source account

{
    "_id": ObjectId("54fd7392742abeef6186a68e")
  , "name": "Joe Moneylender"
  , "balance": 900
  , "pendingTransactions": [ObjectId("54fd7392742abeef6186a68e")]
}

Example 7: Destination account

After having applied the transaction to the account, we need to update the transaction state to committed.

var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "pending"});
col.update({_id: transaction._id}, {$set: {state: "committed"}});

Example 8: Update transaction to committed

Once the transaction is in committed state, we need to clean up the source and destination accounts by removing the transaction from both accounts. Let’s start by removing it from the source account.

var col = db.getSisterDB("bank").transactions;
col.update({name: transaction.source}, {$pull: {pendingTransactions: transaction._id}});

Example 9: Remove the transaction from source account

After removing it from the source account, we remove it from the destination account.

var col = db.getSisterDB("bank").transactions;
col.update({name: transaction.source}, {$pull: {pendingTransactions: transaction._id}});

Example 10: Remove the transaction from destination account

Finally, we set the transaction to the done state, signaling the transfer has been completed.

var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "pending"});
col.update({_id: transaction._id}, {$set: {state: "done"}});

Example 11: Update transaction to done

That wraps up the successful transfer of an amount between two accounts. Next, we need to look at potential failure scenarios.

Transaction failure between initial and pending

If there is a failure while flipping the transaction state to pending, the recovery process is to set the transaction state to canceled.

var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "initial"});
col.update({_id: transaction._id}, {$set: {state: "cancel"}});

Example 12: Cancel transaction

Transaction failure between pending and committed

We have a partially applied transaction, leaving the transaction in an unfinished state. In this case we need to reverse the transaction.

First reverse the debit of the source account.

var col = db.getSisterDB("bank").accounts;
col.update({
    name: transaction.source, pendingTransactions: {$in: [transaction._id]}
  }, {
    $inc: {balance: transaction.amount}, $pull: {pendingTransactions: transaction._id}
  });

Example 13: Reverse debit

Then, reverse the credit to the destination account.

var col = db.getSisterDB("bank").accounts;
col.update({
    name: transaction.source, pendingTransactions: {$in: [transaction._id]}
  }, {
    $inc: {balance: -transaction.amount}, $pull: {pendingTransactions: transaction._id}
  });

Example 14: Reverse credit

Finally, set the transaction itself to the canceled state.

var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "initial"});
col.update({_id: transaction._id}, {$set: {state: "cancel"}});

Example 15: Cancel transaction

Transaction failure between committed and done

If the transaction fails between the state of committed and done we need to clean up the state of the accounts.

var col = db.getSisterDB("bank").transactions;
col.update({name: transaction.source}, {$pull: {pendingTransactions: transaction._id}});

Example 16: Remove the transaction from source account

After removing it from the source account remove it from the destination account.

var col = db.getSisterDB("bank").transactions;
col.update({name: transaction.source}, {$pull: {pendingTransactions: transaction._id}});

Example 17: Remove the transaction from destination account

Finally retry changing the transaction state to done.

tip

Recovery Process

It’s helpful to have a process running in the background that will look for any transactions left in pending or committed state. To determine the time a transaction has been sitting in an interrupted state, it might be helpful to add a createAt timestamp to all transaction documents.

Undoing/Rolling back a transaction

For some cases you might need to undo (rollback) a transaction due to the application canceling the transaction or because it cannot be recovered (for example, if the one of the accounts does not exist during the transaction).

There are two points in the two-phase commit we can rollback.

  1. If you have applied the transaction to the accounts, you should not rollback. Instead, create a new transaction based on the existing transaction and switch the original source and destination fields.
  2. If you have created the transaction but have not yet applied it, you can use the following steps.

First set the transaction state to canceling

var col = db.getSisterDB("bank").transactions;
col.update({_id: transaction._id}, {$set: {state: "canceling"}});

Example 18: Set transaction state to canceling

Next, let’s undo the transaction. Notice that a non-applied transaction means the transaction _id has not yet been removed from the pendingTransactions array. Let’s undo the transaction on the source account.

var col = db.getSisterDB("bank").accounts;
col.update({
    name: transaction.source, pendingTransactions: transaction._id
  }, {
    $inc: {balance: transaction.value}, $pull: {pendingTransactions: transaction._id}
  });

Example 19: Undo the source account transaction

Next undo the transaction on the destination account.

var col = db.getSisterDB("bank").accounts;
col.update({
    name: transaction.destination, pendingTransactions: transaction._id
  }, {
    $inc: {balance: -transaction.value} , $pull: {pendingTransactions: transaction._id}
  });

Example 20: Undo the destination account transaction

Finally set the transaction state to canceled

var col = db.getSisterDB("bank").transactions;
col.update({_id: transaction._id}, {$set: {state: "canceled"}});

Example 21: Set transaction state to canceled

Concurrent Transaction Application

Let’s imagine applications A1 and A2 that both start processing the single transaction, T1 while the transaction is still in theinitial` state.

  1. A1 can apply T1 before A2 starts
  2. A2 will then apply T1 again because it does not appear as pending in the accounts documents.

You can avoid this by explicitly stating in the transaction which application is handling it.

var col = db.getSisterDB("bank").transactions;
col.findAndModify({
    query: {state: "initial", application: {$exists: 0}}
  , update: {$set: {state: "pending", application: "A1"}}
  , new: true});

Example 22: Ensuring application isolation

findAndModify will retrieve and update the document in one atomic operation. This guarantees that only a single application can tag a transaction that is being processed by it. In this case, a transaction in the initial state, is marked as being processed by A1 should the application field not exist.

If the transaction fails or needs to be rolled back, you can retrieve the pending transactions for the specific application, A1.

var col = db.getSisterDB("bank").transactions;
col.transactions.find({application: "A1", state: "pending"});

Example 23: Retrieve pending transactions for A1

Indexes

Since we are retrieving transactions by the _id field we do not need any additional indexes for the transactions collection. For the accounts collection we are retrieving the accounts by the name field.

var col = db.getSisterDB("bank").accounts;
col.ensureIndex({name:1});

Example 24: Create the bank account name index

If you have processes looking up transactions that are unfinished and are still in the committed or pending states, continuing to either apply the transactions to the accounts or to roll them back, an additional index on the transaction state field will make retrieval of transactions more efficient.

var col = db.getSisterDB("bank").transactions;
col.ensureIndex({state:1});

Example 25: Create the transaction account state index

Scaling

Secondary Reads

All the operations against the queue are write operations so secondary reads are not useful for this schema.

Sharding

In this case we can shard the accounts collection by the name field and use a hashed index for the field to ensure even distribution across all out shards.

var admin = db.getSisterDB("admin");
db.runCommand({enableSharding:'bank'});
db.runCommand({
    shardCollection: 'bank.accounts'
  , key: {name:'hashed'}
});

Example 26: Shard by accounts name key

For the transactions collection we will hash the _id field as we are looking up transactions mostly by _id. Looking up the transaction by the state field is a rare case that we can let incur a scatter gather query.

var admin = db.getSisterDB("admin");
db.runCommand({enableSharding:'bank'});
db.runCommand({
    shardCollection: 'bank.transactions'
  , key: {_id:'hashed'}
});

Example 27: Shard by transactions _id key

Performance

A simple exploration of the performance on a single machine with MongoDb 3.0 shows the difference between MMAP and WiredTiger for a narrow simulation using the schema simulation framework mongodb-schema-simulator.

Scenario

https://github.com/christkv/mongodb-schema-simulator/blob/master/examples/scripts/single_or_replset/account/account_100_successful_transfer_scenario.js

MongoDb runs on a MacBook Pro Retina 2013 with 512 GB ssd and 16 gb ram. The simulation runs with the following parameters against a single mongodb instance under osx 10.10 Yosemite.

Parameters Value
processes 4
poolSize per process 50
type linear
Resolution in milliseconds 1000
Iterations run 25
Number of users querying using metadata per iteration 1000
Execution strategy slicetime

MMAP

The MMAP engine is run using the default settings on MongoDB 3.0.1.

Successfully transfer between accounts

Successfully transfer between accounts

Statistics Value
Runtime 48.086 seconds
Mean 9951.659 milliseconds
Standard Deviation 7131.847 milliseconds
75 percentile 12220.925 milliseconds
95 percentile 25750.843 milliseconds
99 percentile 34520.997 milliseconds
Minimum 65.398 milliseconds
Maximum 41428.625 milliseconds

WiredTiger

The WiredTiger engine is run using the default settings on MongoDB 3.0.1.

Successfully transfer between accounts

Successfully transfer between accounts

Statistics Value
Runtime 46.653 seconds
Mean 8655.022 milliseconds
Standard Deviation 5743.181 milliseconds
75 percentile 11037.199 milliseconds
95 percentile 19937.439 milliseconds
99 percentile 29972.029 milliseconds
Minimum 77.812 milliseconds
Maximum 40603.657 milliseconds

The two storage engines MMAP and WiredTiger are fairly close in performance, as the document actually changes are small and thus do not adversely affect WiredTiger performance.

Notes

important

Notes

Real world applications will likely be more complex and require updating more than just the balance of the accounts. The application might need to update pending credits, pending debits and other details.

If these fields are part of the account document they can still occur within a single update, ensuring an atomic update of all the fields.