- Schema Book
- Account Transactions
Account Transactions

MongoDB 3.0 support any notion of a transaction across one or more documents. It does, however, guarantee atomic write operations on single documents. This allows us to implement a Two-Phase commit strategy using double bookkeeping to simulate transactions. However, it’s important to note that due to only single document operations being atomic, MongoDB can only offer transaction-like semantics. It’s still possible for applications to return intermediate results during the two-phase commit or rollback.
In this chapter, we will use two collections to simulate a bank account system. The first collection is accounts and contains all the customer accounts, while the second one is transactions and is our bookkeeping collection. The goal is to transfer 100 from Joe to Peter using a two-phase commit.
Schema Observations
- Simple.
- Requires background processes to manage unfinished transactions.
- Can still cause intermediate results during a two-phase commit or rollback.
Schema
In our simplified schema, we will only keep the basic necessary fields needed to perform the account transaction.
Let’s look at an example document representing an account.
{
"_id": ObjectId("54fd7392742abeef6186a68e")
, "name": "Joe Moneylender"
, "balance": 1000
, "pendingTransactions": []
}
| Schema Attributes | Description |
|---|---|
| name | Name of the account owner (name is used for simplicities sake) |
| balance | The account balance in a magic universal currency. |
| pendingTransactions | Contains any pending transactions on the account |
A transaction document example would look something like.
{
"_id": ObjectId("54fd7392742abeef6186a68e")
, "source": "Joe Moneylender"
, "destination": "Peter Bum"
, "balance": 200
, "state": "initial"
}
| Schema Attributes | Description |
|---|---|
| source | Name of the source bank account |
| destination | Name of the destination bank account |
| balance | Contains the amount of the transaction |
| state | The state of the transaction. Can be one of initial, pending, committed, done and canceled |
Operations
Performing a successful Transfer
Create two accounts for Joe and Peter, crediting them each with an initial balance of 1000.
var col = db.getSisterDB("bank").accounts;
col.insert({name: "Joe Moneylender", balance: 1000, pendingTransactions:[]});
col.insert({name: "Peter Bum", balance: 1000, pendingTransactions:[]});
We are going perform a transfer from Joe to Peter of 100. We start by creating a new transaction and setting its state to initial.
var col = db.getSisterDB("bank").transactions;
col.insert({source: "Joe", destination: "Peter", amount: 100, state: "intital"});
After successfully creating the initial transaction, we are going to attempt to apply it to both accounts. First we need to flip the transaction state to pending in order to signal that we are attempting to apply it to both accounts.
var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "initial"});
col.update({_id: transaction._id}, {$set: {state: "pending"}});
Next, we attempt to apply the transaction to the source account. The transaction variable represents the previously created transaction document.
var col = db.getSisterDB("bank").accounts;
col.update({
name: transaction.source, pendingTransactions: {$ne: transaction._id}, balance: { $gte: amount}
}, {
$inc: {balance: -transaction.amount}, $push: {pendingTransactions: transaction._id}
});
The update will only succeed if the transaction is not already present on the account and there is enough money to cover the attempted transaction (there is no concept of credit in this world).
If the criteria are met, we push the transaction id to the pendingTransactions embedded array and deduct the amount from the account balance.
After applying the transaction successfully to the source account, we credit the destination account with the transfer amount requested.
var col = db.getSisterDB("bank").accounts;
col.update({
name: transaction.source, pendingTransactions: {$ne: transaction._id}, balance: { $gte: amount}
}, {
$inc: {balance: -transaction.amount}, $push: {pendingTransactions: transaction._id}
});
As with the source we ensure the transaction does not already exist on the destination account to guard against applying the same transaction multiple times.
Once we have successfully applied the transaction to the source and the destination accounts, our account documents will look something like the following (the pendingTransactions ObjectId entries will vary).
{
"_id": ObjectId("54fd7392742abeef6186a68e")
, "name": "Joe Moneylender"
, "balance": 900
, "pendingTransactions": [ObjectId("54fd7392742abeef6186a68e")]
}
{
"_id": ObjectId("54fd7392742abeef6186a68e")
, "name": "Joe Moneylender"
, "balance": 900
, "pendingTransactions": [ObjectId("54fd7392742abeef6186a68e")]
}
After having applied the transaction to the account, we need to update the transaction state to committed.
var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "pending"});
col.update({_id: transaction._id}, {$set: {state: "committed"}});
Once the transaction is in committed state, we need to clean up the source and destination accounts by removing the transaction from both accounts. Let’s start by removing it from the source account.
var col = db.getSisterDB("bank").transactions;
col.update({name: transaction.source}, {$pull: {pendingTransactions: transaction._id}});
After removing it from the source account, we remove it from the destination account.
var col = db.getSisterDB("bank").transactions;
col.update({name: transaction.source}, {$pull: {pendingTransactions: transaction._id}});
Finally, we set the transaction to the done state, signaling the transfer has been completed.
var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "pending"});
col.update({_id: transaction._id}, {$set: {state: "done"}});
That wraps up the successful transfer of an amount between two accounts. Next, we need to look at potential failure scenarios.
Transaction failure between initial and pending
If there is a failure while flipping the transaction state to pending, the recovery process is to set the transaction state to canceled.
var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "initial"});
col.update({_id: transaction._id}, {$set: {state: "cancel"}});
Transaction failure between pending and committed
We have a partially applied transaction, leaving the transaction in an unfinished state. In this case we need to reverse the transaction.
First reverse the debit of the source account.
var col = db.getSisterDB("bank").accounts;
col.update({
name: transaction.source, pendingTransactions: {$in: [transaction._id]}
}, {
$inc: {balance: transaction.amount}, $pull: {pendingTransactions: transaction._id}
});
Then, reverse the credit to the destination account.
var col = db.getSisterDB("bank").accounts;
col.update({
name: transaction.source, pendingTransactions: {$in: [transaction._id]}
}, {
$inc: {balance: -transaction.amount}, $pull: {pendingTransactions: transaction._id}
});
Finally, set the transaction itself to the canceled state.
var col = db.getSisterDB("bank").transactions;
var transaction = col.findOne({state: "initial"});
col.update({_id: transaction._id}, {$set: {state: "cancel"}});
Transaction failure between committed and done
If the transaction fails between the state of committed and done we need to clean up the state of the accounts.
var col = db.getSisterDB("bank").transactions;
col.update({name: transaction.source}, {$pull: {pendingTransactions: transaction._id}});
After removing it from the source account remove it from the destination account.
var col = db.getSisterDB("bank").transactions;
col.update({name: transaction.source}, {$pull: {pendingTransactions: transaction._id}});
Finally retry changing the transaction state to done.
tip
Recovery Process
It’s helpful to have a process running in the background that will look for any transactions left in pending or committed state. To determine the time a transaction has been sitting in an interrupted state, it might be helpful to add a createAt timestamp to all transaction documents.
Undoing/Rolling back a transaction
For some cases you might need to undo (rollback) a transaction due to the application canceling the transaction or because it cannot be recovered (for example, if the one of the accounts does not exist during the transaction).
There are two points in the two-phase commit we can rollback.
- If you have applied the transaction to the accounts, you should not rollback. Instead, create a new transaction based on the existing transaction and switch the original source and destination fields.
- If you have created the transaction but have not yet applied it, you can use the following steps.
First set the transaction state to canceling
var col = db.getSisterDB("bank").transactions;
col.update({_id: transaction._id}, {$set: {state: "canceling"}});
Next, let’s undo the transaction. Notice that a non-applied transaction means the transaction _id has not yet been removed from the pendingTransactions array. Let’s undo the transaction on the source account.
var col = db.getSisterDB("bank").accounts;
col.update({
name: transaction.source, pendingTransactions: transaction._id
}, {
$inc: {balance: transaction.value}, $pull: {pendingTransactions: transaction._id}
});
Next undo the transaction on the destination account.
var col = db.getSisterDB("bank").accounts;
col.update({
name: transaction.destination, pendingTransactions: transaction._id
}, {
$inc: {balance: -transaction.value} , $pull: {pendingTransactions: transaction._id}
});
Finally set the transaction state to canceled
var col = db.getSisterDB("bank").transactions;
col.update({_id: transaction._id}, {$set: {state: "canceled"}});
Concurrent Transaction Application
Let’s imagine applications A1 and A2 that both start processing the single transaction, T1 while the transaction is still in theinitial` state.
A1can applyT1beforeA2startsA2will then applyT1again because it does not appear as pending in theaccountsdocuments.
You can avoid this by explicitly stating in the transaction which application is handling it.
var col = db.getSisterDB("bank").transactions;
col.findAndModify({
query: {state: "initial", application: {$exists: 0}}
, update: {$set: {state: "pending", application: "A1"}}
, new: true});
findAndModify will retrieve and update the document in one atomic operation. This guarantees that only a single application can tag a transaction that is being processed by it. In this case, a transaction in the initial state, is marked as being processed by A1 should the application field not exist.
If the transaction fails or needs to be rolled back, you can retrieve the pending transactions for the specific application, A1.
var col = db.getSisterDB("bank").transactions;
col.transactions.find({application: "A1", state: "pending"});
Indexes
Since we are retrieving transactions by the _id field we do not need any additional indexes for the transactions collection. For the accounts collection we are retrieving the accounts by the name field.
var col = db.getSisterDB("bank").accounts;
col.ensureIndex({name:1});
name index
If you have processes looking up transactions that are unfinished and are still in the committed or pending states, continuing to either apply the transactions to the accounts or to roll them back, an additional index on the transaction state field will make retrieval of transactions more efficient.
var col = db.getSisterDB("bank").transactions;
col.ensureIndex({state:1});
state index
Scaling
Secondary Reads
All the operations against the queue are write operations so secondary reads are not useful for this schema.
Sharding
In this case we can shard the accounts collection by the name field and use a hashed index for the field to ensure even distribution across all out shards.
var admin = db.getSisterDB("admin");
db.runCommand({enableSharding:'bank'});
db.runCommand({
shardCollection: 'bank.accounts'
, key: {name:'hashed'}
});
name key
For the transactions collection we will hash the _id field as we are looking up transactions mostly by _id. Looking up the transaction by the state field is a rare case that we can let incur a scatter gather query.
var admin = db.getSisterDB("admin");
db.runCommand({enableSharding:'bank'});
db.runCommand({
shardCollection: 'bank.transactions'
, key: {_id:'hashed'}
});
_id key
Performance
A simple exploration of the performance on a single machine with MongoDb 3.0 shows the difference between MMAP and WiredTiger for a narrow simulation using the schema simulation framework mongodb-schema-simulator.
Scenario
MongoDb runs on a MacBook Pro Retina 2013 with 512 GB ssd and 16 gb ram. The simulation runs with the following parameters against a single mongodb instance under osx 10.10 Yosemite.
| Parameters | Value |
|---|---|
| processes | 4 |
| poolSize per process | 50 |
| type | linear |
| Resolution in milliseconds | 1000 |
| Iterations run | 25 |
| Number of users querying using metadata per iteration | 1000 |
| Execution strategy | slicetime |
MMAP
The MMAP engine is run using the default settings on MongoDB 3.0.1.

Successfully transfer between accounts
| Statistics | Value |
|---|---|
| Runtime | 48.086 seconds |
| Mean | 9951.659 milliseconds |
| Standard Deviation | 7131.847 milliseconds |
| 75 percentile | 12220.925 milliseconds |
| 95 percentile | 25750.843 milliseconds |
| 99 percentile | 34520.997 milliseconds |
| Minimum | 65.398 milliseconds |
| Maximum | 41428.625 milliseconds |
WiredTiger
The WiredTiger engine is run using the default settings on MongoDB 3.0.1.

Successfully transfer between accounts
| Statistics | Value |
|---|---|
| Runtime | 46.653 seconds |
| Mean | 8655.022 milliseconds |
| Standard Deviation | 5743.181 milliseconds |
| 75 percentile | 11037.199 milliseconds |
| 95 percentile | 19937.439 milliseconds |
| 99 percentile | 29972.029 milliseconds |
| Minimum | 77.812 milliseconds |
| Maximum | 40603.657 milliseconds |
The two storage engines MMAP and WiredTiger are fairly close in performance, as the document actually changes are small and thus do not adversely affect WiredTiger performance.
Notes
important
Notes
Real world applications will likely be more complex and require updating more than just the balance of the accounts. The application might need to update pending credits, pending debits and other details.
If these fields are part of the account document they can still occur within a single update, ensuring an atomic update of all the fields.