First, before going into preventing duplicate messages, we need to investigate where they come from.
In most cases, a problem getting data from A to B causes messages to be resent and therefore potentially delivered multiple times. For example, if there is a complicated or unreliable system of routers (and multiple paths) between sender and receiver, you could see the following happen:
- Message is sent down route A
- A network bottleneck appears for some reason so message is delayed en route
- The sender gives up waiting for an acknowledgement of delivery and resends but the routers now use route B because of the bottleneck
- Bottleneck is cleared and original message reaches destination via route A
- The resent message also arrives at destination via route B but is now a duplicate
There are various other causes but the theme is the same - the sender (either the original MSMQ machine or some routing server or hardware router on the way) cannot know that the original message is still in transit or has actually been delivered before it needs to decide whether to resend or not.
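The sequence above can be sketched as a toy simulation. This is only an illustration of the timing problem, not how MSMQ or any router is implemented: the function name, the route labels, and the single `ack_timeout` value are all hypothetical, and the time an acknowledgement takes to travel back is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    message_id: str
    route: str
    arrival_time: float


def simulate_resend(ack_timeout: float, route_a_delay: float, route_b_delay: float):
    """Return every copy of one message that reaches the receiver (hypothetical model)."""
    # The original copy always arrives eventually via route A.
    deliveries = [Delivery("msg-1", "A", route_a_delay)]
    # The sender only knows that no acknowledgement arrived in time. It cannot
    # tell "lost" from "delayed", so it resends, and the resend takes route B.
    if route_a_delay > ack_timeout:
        deliveries.append(Delivery("msg-1", "B", ack_timeout + route_b_delay))
    return sorted(deliveries, key=lambda d: d.arrival_time)


# Route A is delayed past the timeout, so the receiver ends up with two copies.
copies = simulate_resend(ack_timeout=5.0, route_a_delay=8.0, route_b_delay=4.0)
for copy in copies:
    print(copy)
```

With these numbers the original arrives via route A at time 8 and the resent copy via route B at time 9, so the receiver sees the same message ID twice.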
As far as message delivery is concerned, MSMQ has two types of messages:
- Non-transactional messages
- Transactional messages
Non-transactional messages are Express messages, plus Recoverable messages that are not flagged as Transactional. You can regard Transactional messages as a special form of Recoverable messages.
The two types have independent mechanisms for ensuring that messages do not get duplicated.
Preventing non-transactional duplicate messages
This is a relatively simple (but not foolproof) method: the MSMQ service on a machine keeps in-memory lists of the messages it has received from other queue managers. There is a list for each queue manager (identified by its QMID), which simply contains the IDs of all received messages. As each message arrives from a particular sender, its ID is checked against the list and any duplicate is discarded. This is discussed further in the following Knowledge Base article:
255546 Microsoft Message Queuing duplicate message removal mechanism
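As a rough mental model of the mechanism described above, the per-sender lists can be pictured as a dictionary of sets keyed by QMID. This is a minimal sketch, not MSMQ's actual data structure: the class name, method name, and the sample QMID strings are hypothetical, and size limits and cleanup are left out.

```python
from collections import defaultdict


class DuplicateFilter:
    """Toy model of an in-memory duplicate-removal cache (hypothetical)."""

    def __init__(self):
        # One set of seen message IDs per sending queue manager, keyed by QMID.
        self._seen = defaultdict(set)

    def accept(self, qmid: str, message_id: int) -> bool:
        """Return True for a new message, False for a duplicate to discard."""
        seen = self._seen[qmid]
        if message_id in seen:
            return False
        seen.add(message_id)
        return True


duplicate_filter = DuplicateFilter()
results = [
    duplicate_filter.accept("qm-A", 1),
    duplicate_filter.accept("qm-A", 2),
    duplicate_filter.accept("qm-A", 1),  # same sender, same ID: duplicate
    duplicate_filter.accept("qm-B", 1),  # same ID from another sender: new
]
print(results)  # [True, True, False, True]
```

Because the sets live only in memory, everything in this model disappears when the process exits.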
Note that by default the lists hold a total of 10,000 message IDs, and old entries are cleaned out every half hour. This leaves at least three scenarios in which a duplicate can sneak in under the wire because the original is no longer listed in the cache:
- If the queue manager is receiving over 10,000 non-transactional messages within the cleanup interval.
  - Solution - benchmark the throughput of the MSMQ system and increase RemoveDuplicateSize accordingly.
  - Note - longer lists take extra time to process which may impact overall performance.
- If duplicates take over 30 minutes to arrive after the original messages were sent.
  - Solution - this one is tricky, as how do you work out the worst case for the lifetime of a duplicate? You could just keep raising RemoveDuplicateCleanup until no more duplicate non-transactional messages are processed.
  - Note - increasing the cleanup interval may itself impact overall performance, as entries stay in the lists for longer and the lists grow larger.
- If the cache is lost as a result of the MSMQ service being restarted.
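For the first scenario, a back-of-the-envelope calculation helps when deciding how far to raise RemoveDuplicateSize. The sketch below assumes the cache must be able to hold every ID received within one cleanup interval; that is a simplification, and the function names and the `headroom` safety factor are hypothetical.

```python
import math

DEFAULT_CACHE_SIZE = 10_000         # default total entries, per the text above
DEFAULT_CLEANUP_SECONDS = 30 * 60   # default cleanup interval of half an hour


def max_safe_rate(cache_size: int, cleanup_seconds: int) -> float:
    """Highest sustained arrival rate (messages/second) the cache can cover."""
    return cache_size / cleanup_seconds


def required_cache_size(messages_per_second: float, cleanup_seconds: int,
                        headroom: float = 1.5) -> int:
    """Entries needed to cover one cleanup interval, with a safety margin."""
    return math.ceil(messages_per_second * cleanup_seconds * headroom)


print(round(max_safe_rate(DEFAULT_CACHE_SIZE, DEFAULT_CLEANUP_SECONDS), 2))  # 5.56
print(required_cache_size(50, DEFAULT_CLEANUP_SECONDS))  # 135000
```

Under this assumption the defaults only cover about 5.5 non-transactional messages per second, so a system benchmarked at 50 messages per second would need a cache more than ten times larger.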
If duplicate non-transactional messages must NOT be processed, one approach is to add application code that detects information which has already been processed (by referencing a back-end database, for example). This might not be practical: it depends on whether duplicate processing can be detected at all, and on the performance overhead of checking for it in your own code.
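One way to picture such an application-level check is a table of processed message IDs with a uniqueness constraint. The sketch below uses an in-memory SQLite database purely for illustration; the table name, function names, and the ID format are hypothetical, and a real system would use whatever back-end database the application already has.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # The primary key makes a second insert of the same ID a no-op.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed (message_id TEXT PRIMARY KEY)"
    )
    return conn


def process_once(conn: sqlite3.Connection, message_id: str, handler) -> bool:
    """Run handler only if message_id is new; return True if it ran."""
    with conn:  # commits on success, rolls back if handler raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed (message_id) VALUES (?)",
            (message_id,),
        )
        if cur.rowcount == 0:
            return False  # ID already recorded: this is a duplicate
        handler()
    return True


conn = open_store()
handled = []
first = process_once(conn, "qm-A:1001", lambda: handled.append("order 1001"))
second = process_once(conn, "qm-A:1001", lambda: handled.append("order 1001"))
print(first, second, handled)  # True False ['order 1001']
```

Recording the ID and running the handler in one transaction means a failed handler leaves no record behind, so the message can be retried. The cost is one extra database round trip per message, which is the overhead mentioned above.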
Preventing transactional duplicate messages
From the following article:
174307 Interpreting file names in the Storage directory in MSMQ
"To allow this continuous, cyclic writing to the QMLog, the state of receiving ordered messages and the state of all active transactions is periodically saved to the following files (respectively):
So when transactional messages arrive, they are tracked in files on disk. This persistent information is not lost on service restart, so it is much more reliable than the system used for non-transactional messages. The downside is that updating the files requires disk I/O, which can cause performance issues. If the MSMQ system is a high-performance one receiving many transactional messages, you may see periodic pauses in message arrivals as the queue manager flushes data to the files before accepting new messages. If you are experiencing this then check out:
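The trade-off between persistence and disk I/O can be illustrated with a toy tracker that writes its state to a file. This is not MSMQ's file format or algorithm: the class, the JSON file, and the per-sender sequence-number scheme are all assumptions made for the example, and it flushes on every message rather than periodically.

```python
import json
import os
import tempfile


class PersistentSequenceTracker:
    """Toy tracker that survives restarts by saving state to disk (hypothetical)."""

    def __init__(self, path: str):
        self._path = path
        self._last_seen = {}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as fh:
                self._last_seen = json.load(fh)

    def accept(self, sender: str, sequence_number: int) -> bool:
        """Return True for a new message, False for one already received."""
        if sequence_number <= self._last_seen.get(sender, 0):
            return False
        self._last_seen[sender] = sequence_number
        self._flush()  # every accepted message costs a disk write
        return True

    def _flush(self) -> None:
        # Write to a temporary file, force it to disk, then swap it into place.
        tmp = self._path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(self._last_seen, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, self._path)


state_file = os.path.join(tempfile.mkdtemp(), "inseq_state.json")
tracker = PersistentSequenceTracker(state_file)
first = tracker.accept("qm-A", 1)

restarted = PersistentSequenceTracker(state_file)  # simulates a service restart
after_restart = restarted.accept("qm-A", 1)
print(first, after_restart)  # True False
```

The duplicate is still rejected after the simulated restart, which the in-memory cache cannot do, but each `_flush` call blocks on the disk. That blocking is the kind of pause described above.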
897326 FIX: You may encounter a slow performance issue causing any program to stop responding for several minutes on a Windows 2000-based computer