The concurrency shoot-out: SQL Server versus MongoDB (part 2)

Elton Stoneman

This is the *old* blog. The new one is at blog.sixeyed.com

Following up on The concurrency shoot-out: SQL Server versus MongoDB (part 1), I extended my performance tests with some variations to look at the options for performance and reliability, and see how they affected the results.

Variant 2: reuse the database context

In the first version of my performance tests, I used a new database context for each part of the workflow (instantiating a new EntityContext in EF, or a new MongoCollection in Mongo for parts 2, 3 and 4 of the workflow).

Realistically, the workflow would run as a unit of work with a shared context, so in the second version I reused the context for the lifetime of each workflow. We still had the same concurrency at the server, but each client workflow was reusing one connection.
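As a rough sketch of the idea (EntityContext is the EF context from the tests; the Workflows set, the entity property and the workflowId variable here are just illustrative placeholders, assuming a DbContext-style API), the change looks like this:

    // Original version: a new context for each part of the workflow
    using (var context = new EntityContext()) { /* part 2: load the entity */ }
    using (var context = new EntityContext()) { /* part 3: update the entity */ }
    using (var context = new EntityContext()) { /* part 4: save the changes */ }

    // Variant 2: one context shared for the lifetime of the workflow
    using (var context = new EntityContext())
    {
        var entity = context.Workflows.Find(workflowId);  // part 2: load
        entity.Status = "Processed";                       // part 3: update (illustrative property)
        context.SaveChanges();                             // part 4: save
    }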

That made a big difference to the Entity Framework results:

[chart: workflow timings]

EF+SQL Server got a lot better, processing 1,000 workflows at 100 concurrency in 23.4 seconds, and 10,000 workflows in 220.2 seconds. But Mongo performance improved too, running 1,000 workflows in 5.1 seconds and 10,000 in 53.2 seconds, so Mongo ran 4 times faster.

This is an obvious optimization, so the next variations built on this one.

Variant 3: NOLOCK with Entity Framework

With the default options, EF runs all database access (reads and writes) inside a SERIALIZABLE transaction, which is safe but not great for performance. In our scenario we are working with one entity per workflow, so parallel workflows aren't accessing the same data. We can tune EF to send reads with NOLOCK, meaning the database can read tables even if they are locked (e.g. for an INSERT in another workflow), rather than waiting for the lock to be released.

You can't always do that, because of the risk of dirty reads: being returned data which has been changed in another transaction but not yet committed. If the other transaction fails and rolls back, you've got data which is out of sync with the database. If you can safely deal with that in your domain, though, NOLOCK can give a big performance increase.
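One common way to get that behaviour with EF (a sketch of the approach, not necessarily the exact helper used in the tests) is to wrap the work in a TransactionScope with ReadUncommitted isolation, which has the same effect as NOLOCK hints on the reads:

    using System.Transactions;

    // Queries inside this scope run at READ UNCOMMITTED isolation,
    // so they don't block on locks held by other transactions.
    var options = new TransactionOptions { IsolationLevel = IsolationLevel.ReadUncommitted };
    using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
    using (var context = new EntityContext())
    {
        var entity = context.Workflows.Find(workflowId);  // read without waiting for write locks
        // ... rest of the workflow ...
        scope.Complete();
    }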

Running our EF context within a no-lock scope gave us a 30% improvement:

[chart: workflow timings]

SQL Server now ran 1,000 workflows in 15.8 seconds and 10,000 in 171.2 seconds, so Mongo is 3 times faster.

Variant 4: Journaling with Mongo (default journal commit interval)

With Mongo, when you call Save() the default driver option uses a write concern of Acknowledged – meaning you get a positive response from the server if the save is valid, but before the database has actually committed the changes.

You could call Save and get no errors, thinking your changes are permanent, but if the server went down before committing the journal, your changes would be lost.

The default period for committing the journal is every 100ms, so we have a very small window for potential data loss, but the risk does exist.

For safety, you can specify a write concern of Acknowledged with journaling, meaning the client call waits until the server has committed the changes to the journal before receiving an acknowledgement.
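With the C# driver that roughly translates to passing a different write concern on the Save call (a sketch assuming the 1.x driver API; collection and entity here stand in for the MongoCollection and document used in the workflow):

    // Default: Acknowledged - the server accepted the write, but it may not be journaled yet
    collection.Save(entity, WriteConcern.Acknowledged);

    // Journaled acknowledgement - don't return until the change is in the on-disk journal
    collection.Save(entity, new WriteConcern { Journal = true });

You can also set the write concern once for the whole connection (for example with journal=true in the connection string) rather than on each call.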

If the database does go down, when it comes back up it will apply any outstanding updates from the journal to the data files before accepting new connections. With journaled acknowledgements, if you get a positive response from the Save, you can be confident the data is safe, even if the server goes down.

That comes at a pretty big performance cost:

[chart: workflow timings]

Mongo ran 1,000 workflows in 27.0 seconds, and 10,000 in 241.8 seconds. That performance is on a par with variant 2 in the SQL test, with a reused entity context for the workflow, but it’s 4.5 times slower than Mongo without journaled acknowledgements. And it’s 40% slower than SQL running with NOLOCK reads.

Variant 5: Journaling with Mongo (minimum journal commit interval)

It's easy to tweak journaling in Mongo with the --journalCommitInterval switch, which lets you specify how long Mongo waits before committing in-memory changes to the journal on disk.

The default is 100ms, but you can set it between 2ms and 300ms, to get a balance between reliability and performance.
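On the server that's just a startup option for mongod, for example (the data path here is only a placeholder):

    mongod --dbpath /data/db --journal --journalCommitInterval 2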

For the final test I set the journal interval to the minimum of 2ms – meaning a lot more disk access, but better response times for journaled acknowledgements:

[chart: workflow timings]

Mongo now ran 1,000 workflows in 7.0 seconds and 10,000 in 79.1 seconds, so even with a highly reliable setup it runs over twice as fast as the fastest SQL variant, and is much closer to the fastest Mongo variant.

Next time round we’ll look at which variants we’re likely to actually go with, and why the performance difference between SQL Server and MongoDB is so marked.

Posted on Wednesday, December 18, 2013 4:24 PM | SQL Server, Mongo



