[v.2.9.0 2015-08-19]
I’ve decided to keep the first version of this post live. It is interesting to see how tests evolve and how results change. For v.1.0.10, see here.
Version 2 was inspired by OniBait; thanks a lot! He added code for the Avro and Bond serializers, added the 99th percentile to the measurements, and made other improvements that significantly affected the results.
That’s the power and beauty of open source.
Small improvements in the project code change the results all the time, so I only mention the date of the latest improvement.
The project code is on GitHub.
Any distributed system requires serialization to transfer data between systems and applications. Serializers used to be hidden in adapters and proxies, so developers did not deal with the serialization process explicitly. WCF serialization is an example: all we needed to know was where to place the [Serializable] attributes. Current trends bring serializers to the surface. In .NET development on Windows, this might have started when James Newton-King created the Json.Net serializer, and even Microsoft officially declared it the recommended serializer for .NET.
There are many kinds of serializers; they produce very compact data very fast. There are serializers for messaging, for data stores, for marshaling objects.
What is the best serializer in .NET?
Nope… this project is not about the best serializer. Here I show, in a few lines of code, how to use different .NET serializers. Want to serialize an object and looking for sample code? You are in the right place: just copy and paste this code into your project. The goal is to help developers with samples. The samples should be simple enough to copy and paste without effort, and efficient enough for most messaging scenarios. I want to show each serializer in the simplest way, but it is good to know that it will not hurt the performance of your code. That is why I added some measurements, so you can make the right decisions.
Please do not take these measurements too seriously. I have some numbers, but this project is not the right place to make decisions about serializer performance. I did not spend time getting the best results. If you have the expertise, please feel free to modify the code to get more reliable numbers.
Note: I have not tested the serializers that require an IDL for serialization: Thrift, Cap’n Proto, FlatBuffers, Simple Binary Encoding. Those sophisticated beasts are not easy to work with; they are needed for something more specialized than straightforward serialization for messaging. These serializers are on my to-do list. The ProtoBuf implementation for .NET was upgraded to use attributes instead of an IDL, kudos to Marc Gravell. The single exception is the new Microsoft Bond (thanks again to OniBait for coding the Bond part).
Installation
Most of the serializers are installed as NuGet packages. Look in the “packages.config” file for the package names. I have included comments about this in the code.
Tests
The test data is created by the Randomizer. It fills in the fields of the Person object with randomly generated data. This object is used for a single test cycle with all serializers, and then it is regenerated for the next cycle.
If you want to test the serializers with a different object size or with different primitive types, change the Person object.
The measured time covers the combined serialization and deserialization of the same object. The first call to a serializer takes the longest. This longest time span is important, and it is measured as the Max time. If we need only a single serialization/deserialization step, this is the most significant value for us. If we repeat serialization/deserialization many times, the most significant values are the Average time and the Min time.
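The test cycle described above can be sketched in a few lines. This is a language-neutral illustration in Python, not the project's C# code: the Person fields, the function names `random_person` and `run_cycles`, and the two stand-in serializers (`json` and `pickle` from the Python standard library) are all assumptions made for the example.

```python
import json
import pickle
import random
import string
import time


def random_person(rng):
    """Build a hypothetical Person record filled with random values."""
    def word(length):
        return "".join(rng.choice(string.ascii_letters) for _ in range(length))

    return {
        "first_name": word(8),
        "last_name": word(12),
        "age": rng.randint(0, 120),
        "is_active": rng.random() < 0.5,
    }


# Stand-in serializers: (serialize, deserialize) pairs, not the .NET ones.
SERIALIZERS = {
    "json": (json.dumps, json.loads),
    "pickle": (pickle.dumps, pickle.loads),
}


def run_cycles(repetitions, seed=0):
    """Run every serializer on the same object, then regenerate the object."""
    rng = random.Random(seed)
    timings = {name: [] for name in SERIALIZERS}
    for _ in range(repetitions):
        person = random_person(rng)  # one object per cycle, shared by all serializers
        for name, (serialize, deserialize) in SERIALIZERS.items():
            start = time.perf_counter()
            restored = deserialize(serialize(person))  # timed round trip
            timings[name].append(time.perf_counter() - start)
            if restored != person:
                raise ValueError(f"{name} did not round-trip the object")
    return timings


timings = run_cycles(repetitions=5)
```

To try a different object size or different primitive types, only `random_person` would need to change, which mirrors the role of the Person object in the project.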
There are two average measurements:
- Avg-100%: all measured times are used in the calculation.
- Avg-90%: the slowest 10% of measurements are ignored.
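As a rough illustration, these figures can be computed from a list of per-cycle timings as shown here. This is a Python sketch rather than the project's C# code; the function name `summarize` and the rule of dropping the slowest 10% of samples by count (rounded down) are assumptions.

```python
def summarize(times):
    """Return Min, Max, Avg-100% and Avg-90% for a list of timings."""
    if not times:
        raise ValueError("need at least one measurement")
    ordered = sorted(times)
    # Assumption: Avg-90% drops the slowest 10% of the samples, rounded down.
    keep = len(ordered) - len(ordered) // 10
    fastest = ordered[:keep]
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "avg_100": sum(ordered) / len(ordered),
        "avg_90": sum(fastest) / len(fastest),
    }


# One slow first call followed by nine fast calls (made-up numbers).
samples = [1000.0] + [10.0] * 9
stats = summarize(samples)
```

With these made-up samples, the single slow first call dominates Avg-100% but is excluded from Avg-90%, which is why the two averages can rank serializers differently.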
Some serializers serialize to strings, others to byte arrays. I used a string as the common denominator: byte arrays are converted to strings in base64 format. I know this is not fair, because in many cases we serialize only to a byte array, not to a string. UTF-8 could also be a more compact format.
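To see why the base64 conversion penalizes binary serializers, here is a small Python sketch (not the project's C# code; the helper name `size_as_string` is an assumption). Base64 emits 4 characters for every 3 input bytes, padded up to a multiple of 4, so the reported size grows by roughly a third.

```python
import base64


def size_as_string(payload: bytes) -> int:
    """Length of a binary payload once it is converted to a base64 string."""
    return len(base64.b64encode(payload).decode("ascii"))


raw = bytes(range(100))             # a 100-byte binary payload
encoded_size = size_as_string(raw)  # 34 groups of 3 bytes, 4 characters each
```

So a binary serializer that produces 100 bytes is charged 136 characters in a size comparison of this kind.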
Test Results
Again, do not take the test results too seriously. I have some numbers, but this project is not the right place to draw conclusions about serializer performance. You have to take this code and run it on your specific data and your specific workflows.
Here are the test results for 100 repetitions. This number of repetitions gives stable results.
There is no such thing as the “best serializer”. If you invest time in optimizing the code, the loser can become the winner. If you change the test data, the winner may not be the winner anymore. So I mark a serializer as a “winner” only if it shows a significant lead in the numbers, not just 1-3%.
There are several winners in each category.
Compression
This category is important if you need the smallest data size (on the wire, in a store…). Look at the Size: Avg measurement. All winners in this category use proprietary, non-human-readable formats.
Winners are:
- Salar.Bois
- Avro
- MsgPack
- MessageSharkSerializer
- NetSerializer
- Bond
- ProtoBuf
Notes:
- All Json serializers create strings of almost the same size.
- You can see the serialized output strings. They are written to the trace output, so use DebugView to see them.
- Many Json serializers do not work well with the DateTime format out of the box. Only NetSerializer and Json.Net handle the DateTime format without additional customization.
Speed on Single Run
This category is important if you only need to make a single serialization/deserialization. Look at the Max measurement.
Winners are:
- Microsoft NetDataContract (surprise)
- Json.Net
- Microsoft Binary
Notes:
- Avro, Bond, and NetSerializer also show good results.
- Jil shows the worst result but, again, that means nothing.
- Serializers show the greatest variation in this category.
Speed on the Large Scale
This category is important if you need many serialization/deserialization operations on the same objects. Look at the Min and Avg-90% measurements.
Winners are:
- NetJSON
- Bond
- MessageSharkSerializer
- NetSerializer
- Jil
- Avro
- ProtoBuf
- Salar.Bois
Notes:
- MsgPack and BondJson also show good results.
Speed on Several Cycles
This category is important if you need several serialization/deserialization operations on the same objects. Look at the Avg-100% measurement. It is a mix of the slowest and fastest times. As you can see, MessageShark, Jil and ProtoBuf are no longer among the leaders because of their suboptimal initialization, while Json.Net is now a leader because of its good initialization time on our test data.
Winners are:
- Avro
- Bond
- NetSerializer
- Json.Net
Notes:
- The classic Json.Net serializer is used in two tests: Json.Net (Helper) and Json.Net (Stream). The tests show the difference between using streams and using the serializer helper classes. Helper classes can seriously decrease performance, so streams are a good way to keep the speed high without too much hassle.
- The first call to a serializer initializes it, which is why it might take a thousand times longer than the following calls.
- For Microsoft Avro, my simple serialization interface is patched. For some reason it is not possible to pass a generic type T into it, so the type is hardcoded. Because of this hack, Avro runs faster in the tests.
- The Json and binary formats do not differ much in how compact the serialized strings are.
- Serializing classes with explicit constructors usually has a negative impact on serialization speed. That’s why the test class (Person) is created by a method rather than by a constructor.
- Many serializers do not work well with the Json DateTime format out of the box. Only NetSerializer and Json.Net handle the DateTime format without special treatment.
- The test prints its results to the console. It also traces the errors and the serialized strings, which can be seen in DebugView, for example.
- The Apolyton.Json and HaveBoxJSON serializers failed in the tests, which is why we see zeroes in several measurements.