Convoys are frequently used in BizTalk Server orchestrations. Sometimes convoys behave counterintuitively: messages and orchestrations get suspended in an unpredictable manner. This issue is well known, and the suspended messages have been nicknamed "zombies". The name is unofficial, but the issue is real. Here I describe in detail what these zombie situations are, and when and why they happen.
An orchestration can be enlisted with many subscriptions. In other words, it can have several Receive shapes. Usually the first Receive uses the Activation subscription, while the other Receives create Instance subscriptions. [See “Publish and Subscribe Architecture” in MSDN]
Here is a sample process.
Let's start the experiment.
There are three possible scenarios, depending on the message sequence.
First scenario: everything is OK
The Activation subscription for the Sample message is created when the SampleProcess orchestration is enlisted.
The Instance subscription is created only when the SampleProcess orchestration instance starts, and it is removed when the orchestration instance ends.
So far so good: Message_2 is delivered exactly within this time interval and is consumed.
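The subscription lifetime described above can be sketched as a toy model. This is only an illustration, not BizTalk code: the `MessageBox` class and its `enlist` and `publish` methods are hypothetical names invented here, and the real Message Box evaluates subscriptions in SQL Server rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class MessageBox:
    """Toy stand-in for the Message Box: it only tracks subscriptions."""
    activation: set = field(default_factory=set)  # types that start a new instance
    instance: set = field(default_factory=set)    # types awaited by a running instance

    def enlist(self, message_type):
        # Enlisting the orchestration creates the activation subscription.
        self.activation.add(message_type)

    def publish(self, message_type):
        # Report what happens to a message at the moment it is delivered.
        if message_type in self.instance:
            return "routed to running instance"
        if message_type in self.activation:
            return "starts new instance"
        return "no subscriber"


box = MessageBox()
box.enlist("Sample")

before = box.publish("Sample_2")   # the instance has not started yet
started = box.publish("Sample")    # activates SampleProcess
box.instance.add("Sample_2")       # instance subscription appears on start
during = box.publish("Sample_2")   # delivered inside the interval
box.instance.discard("Sample_2")   # instance ends, subscription is removed
after = box.publish("Sample_2")

print(before, "|", started, "|", during, "|", after)
```

The point of the sketch is that routing is decided at delivery time: the same message type gets a different outcome depending on whether the instance subscription exists at that moment.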
Second scenario: no consumers
Three Sample_2 messages are delivered. The first one is delivered before the SampleProcess orchestration starts, that is, before the instance subscription is created. The second message is delivered within the correct time interval. The third one is delivered after the SampleProcess orchestration has ended and the instance subscription has been removed.
Note:
- The Sample_2 message that gets consumed is not the first one. The first one was ahead in the queue, but it did not wait: it was suspended as soon as it was delivered to the Message Box, because it had no subscribers at that moment.
The first and the last Sample_2 messages are Suspended (Nonresumable) in the Message Box. For each of them, two (!) service instances are created. One service instance has the ServiceClass of Messaging, and its Error Description is:
The second service instance has the ServiceClass of RoutingFailureReport, and its Error Description is:
Third scenario: something goes wrong
Two Sample_2 messages are delivered. Both arrive in the same interval, while the SampleProcess orchestration is running and the instance subscription exists.
The first Sample_2 is consumed. The second Sample_2 has a subscription, but the subscriber, the SampleProcess orchestration, will not consume it. After the SampleProcess orchestration ends (and only after! I will discuss this in the next article), the orchestration is suspended (Nonresumable). This time only one service instance is suspended. This service instance has the ServiceClass of Orchestration, and its Error Description is:
In the Message tab the Sample_2 message is in the Suspended (Resumable) status.
Notes:
- The orchestration consumes the extra message(s) and gets suspended together with them. These messages are not consumed in the sense of “processed by the orchestration”, but they are consumed in the sense of “delivered to the subscriber”. The Receive shape in the orchestration does not receive these extra messages, yet they are routed to the orchestration. The error information looks ambiguous.
- The time interval between the last Receive shape and the end of the orchestration is a "dangerous zone". The message delivery pattern should be scrutinized to avoid it.
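The difference between "routed to the orchestration" and "received by a Receive shape" can be made concrete with a small simulation. It is a minimal sketch under stated assumptions: `SampleProcessInstance` and its methods are hypothetical names, and the per-instance queue is a simplification of what the Message Box actually stores.

```python
from collections import deque


class SampleProcessInstance:
    """Toy orchestration instance with a single Receive for Sample_2."""

    def __init__(self):
        self.queue = deque()   # messages routed to this instance
        self.received = []     # messages actually taken by the Receive shape
        self.running = True

    def route(self, message):
        # The instance subscription matches for as long as the instance runs,
        # even when no Receive shape is left to pick the message up.
        if not self.running:
            raise RuntimeError("no instance subscription")
        self.queue.append(message)

    def receive_one(self):
        # The single Receive shape takes exactly one message.
        self.received.append(self.queue.popleft())

    def end(self):
        # Anything still queued at the end was routed but never received.
        self.running = False
        zombies = list(self.queue)
        self.queue.clear()
        status = "Suspended" if zombies else "Completed"
        return status, zombies


instance = SampleProcessInstance()
instance.route("Sample_2 #1")
instance.route("Sample_2 #2")  # arrives while the subscription still exists
instance.receive_one()
status, zombies = instance.end()
print(status, zombies)
```

In this model the second message is never rejected at delivery time. It only surfaces as a problem when the instance ends, which matches the observation that the orchestration is suspended together with the extra message.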
Uniform Sequential convoy
Now for one more scenario.
It is a uniform sequential convoy. The activation subscription is for the same message type as the instance subscription: the Sample_2 message is now a Sample message. For simplicity, the SampleProcess orchestration consumes only two Sample messages. In this scenario an orchestration usually consumes many messages inside a loop, but here there are only two.
First message starts the orchestration, the second message goes inside this orchestration. Then the next pair of messages follows, and so on.
But if the input messages arrive at shorter intervals, we have a problem.
We lose messages in an unpredictable manner.
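The effect of the arrival interval can be illustrated with a toy timeline. This is a sketch, not a measurement: `run_convoy` and its `tail` parameter (the time an instance keeps running after its second Receive) are hypothetical, and the time units are arbitrary.

```python
def run_convoy(arrival_times, tail):
    """Toy timeline: each instance takes two Sample messages, then runs for
    `tail` more time units before it ends."""
    processed, lost = [], []
    received = 0      # messages received by the current instance
    ends_at = None    # end time of the current instance, once it has both messages
    for t in sorted(arrival_times):
        if ends_at is not None and t >= ends_at:
            received, ends_at = 0, None  # instance ended, subscription removed
        if ends_at is not None:
            lost.append(t)               # routed to the instance, never received
            continue
        processed.append(t)              # activates an instance or is its second message
        received += 1
        if received == 2:
            ends_at = t + tail           # dangerous zone: [t, t + tail)
    return processed, lost


slow = run_convoy([0, 10, 20, 30], tail=5)
fast = run_convoy([0, 1, 2, 3, 4, 5], tail=2.5)
print(slow)  # ([0, 10, 20, 30], [])
print(fast)  # ([0, 1, 4, 5], [2, 3])
```

With widely spaced messages every pair is processed. With closely spaced messages, the ones that arrive after the second Receive but before the instance ends are lost, and which ones those are depends entirely on timing.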
Conclusion:
- It might be better if the orchestration removed the instance subscription right after the message is consumed, not at the end of the orchestration. The current behavior looks like a bug, but for now it is a “feature” of the BizTalk subscription mechanism.
- The time interval between the last Receive shape and the end of the orchestration is a "dangerous zone". The message delivery pattern should be scrutinized to avoid this zone as much as possible.
Note [2011-2-9]:
- Several times I have seen an explanation of zombies in which a zombie can be created only in the time window between the moment an orchestration is scheduled for disposal and the moment it is actually disposed. By that reasoning, the average dangerous window is about half of the MessageBox heartbeat interval (1 sec by default), that is, 0.5 sec. This is not correct. The dangerous zone lies between the last receive (for the message with the instance subscription) and the end of the orchestration, and it can be much longer than a heartbeat.
See more about zombies in BizTalk from the BizTalk Core Engine Blog
Posted on Saturday, February 5, 2011 5:10 PM