August 2009 Entries

Question marks in your flatfile output CONTINUED


In order to get the FFASM encoding bug looked at, Microsoft Support asked me to call an 0800 number and provide payment info :-(

Since I am still convinced every bug should be reported I will try again next month when I have my client's support contract details (the guy with the info is on a very long summer holiday :).

In the meantime I have developed a custom fix which you may use at your own risk...

This TranscodingStream class is a binary transformation stream decorator: while reading, the underlying bytes are transcoded on the fly from a source to a target encoding. Both source and target encoding are configurable.
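To make the idea concrete, here is a minimal sketch of such a read-only transcoding decorator. It is written in Python purely for illustration and is not the actual TranscodingStream code (which is a .NET class); the constructor parameters and the chunk size are assumptions. The important detail is the use of stateful incremental codecs, so a character whose bytes straddle two reads is still decoded correctly. In .NET the analogous stateful objects come from Encoding.GetDecoder() and Encoding.GetEncoder().

```python
import codecs
import io


class TranscodingStream(io.RawIOBase):
    """Read-only decorator: bytes read from `inner` are transcoded on the fly."""

    def __init__(self, inner, source_encoding="utf-8",
                 target_encoding="cp1252", chunk_size=4096):
        self._inner = inner
        # Incremental codecs remember partial characters between calls.
        self._decoder = codecs.getincrementaldecoder(source_encoding)()
        self._encoder = codecs.getincrementalencoder(target_encoding)()
        self._chunk_size = chunk_size
        self._pending = b""   # transcoded bytes not yet handed to the caller
        self._eof = False

    def readable(self):
        return True

    def readinto(self, buffer):
        # Keep pulling from the inner stream until there is output or EOF.
        while not self._pending and not self._eof:
            chunk = self._inner.read(self._chunk_size)
            self._eof = not chunk
            text = self._decoder.decode(chunk, final=self._eof)
            self._pending = self._encoder.encode(text, final=self._eof)
        count = min(len(buffer), len(self._pending))
        buffer[:count] = self._pending[:count]
        self._pending = self._pending[count:]
        return count  # 0 signals end of stream
```

Wrapping an `io.BytesIO` holding UTF-8 bytes and calling `read()` on the wrapper returns the same text encoded as Windows-1252. No BOM is emitted, matching the behaviour described in this post.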

You can make your own pipeline component that replaces the incoming message's body part stream with this TranscodingStream (and uses a context property to dynamically set the target encoding). In fact I have built it already; if you want it, just let me know.

This way you can leave the FFASM at its default (UTF-8) and let this new component do the work right after it. Also mind that the TranscodingStream doesn't prefix a BOM (for the simple reason that I didn't need it, as in YAGNI), which could be a problem for UTF-16.
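If a consumer does expect a BOM, one option is to prepend it once, before the first transcoded byte. The following Python sketch only illustrates that idea; the helper name `with_bom` and the lookup table are hypothetical and not part of the component described here.

```python
import codecs

# Byte order marks for encodings where a reader may expect one.
BOMS = {
    "utf-16-le": codecs.BOM_UTF16_LE,
    "utf-16-be": codecs.BOM_UTF16_BE,
    "utf-8": codecs.BOM_UTF8,
}


def with_bom(payload: bytes, encoding: str) -> bytes:
    """Prefix the BOM for `encoding`, or return payload unchanged if none applies."""
    return BOMS.get(encoding.lower(), b"") + payload


utf16_body = "é".encode("utf-16-le")          # no BOM: just the code unit
marked = with_bom(utf16_body, "utf-16-le")    # BOM followed by the code unit
```

Single-byte code pages such as Windows-1252 have no BOM, so for the scenario in this post nothing needs to be added.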

In retrospect my stream decorator could have been done a bit better. It violates the single responsibility principle, since it does decoding and encoding in one step. This explains why the cyclomatic complexity of the Read method is too high. A cleaner design would have been a DecodingStream decorator that translates whatever source encoding to Unicode, plus an EncodingStream decorator that translates Unicode to whatever target encoding.
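The two-step design can be sketched as two small stages that are composed afterwards. This is a Python illustration using generators rather than stream classes, so the function names are assumptions and do not correspond to existing code.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(chunks: Iterable[bytes], source_encoding: str) -> Iterator[str]:
    """Stage 1: source bytes -> Unicode text, tolerant of split characters."""
    decoder = codecs.getincrementaldecoder(source_encoding)()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)  # raises if input ended mid-character
    if tail:
        yield tail


def encode_chunks(texts: Iterable[str], target_encoding: str) -> Iterator[bytes]:
    """Stage 2: Unicode text -> target bytes."""
    encoder = codecs.getincrementalencoder(target_encoding)()
    for text in texts:
        data = encoder.encode(text)
        if data:
            yield data
    tail = encoder.encode("", final=True)
    if tail:
        yield tail


def transcode(chunks: Iterable[bytes], source_encoding: str,
              target_encoding: str) -> Iterator[bytes]:
    """Compose the two single-purpose stages."""
    return encode_chunks(decode_chunks(chunks, source_encoding), target_encoding)
```

Each stage now has one reason to change, and either one can be tested or reused on its own.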

posted @ Thursday, August 06, 2009 3:19 AM | Feedback (0) | Filed Under [ BizTalk - EAI - B2B ]


Question marks in your flatfile output?


Or how to fix a bug while introducing another one that's a lot nastier. Let's start by describing the original bug: the BizTalk flatfile assembler has issues with custom target (output) encodings.

When you compile a custom pipeline with the FF assembler and configure the 'target charset' in the pipeline designer, everything works as expected and you get your messages in the desired encoding.

When you want to dynamically control the encoding, according to the docs, you should also be able to do this by writing the XMLNORM.TargetCharSet property onto the message context.

In my case the desired output was 'Windows-1252'.

I was able to verify that this technique indeed works using a default XmlTransmit assembler pipeline.
With a custom FF assembling pipeline, though, I got a question mark in the flatfile output for every special character.
 
Since the only difference is the pipeline, I am quite confident that it's an FFASM bug.

Now to come to the second problem: apparently this bug was already discovered over a year ago, and I guess it was never reported to Microsoft (since it is still not fixed in BTS 2009, the RC I checked). It might be a regression, since people reported that it only happens from version 2006 R2 onwards.

Instead, the author decided the shortest path to success was to develop a custom encoding pipeline component that takes care of the output encoding. The issue showed up a few more times since then and it looks like other people were inspired by his workaround.

Now take a good look again at how the author implemented the custom encoding. There is something terribly wrong with it.

HINT: the input encoding is UTF-8, a variable byte length encoding. Inside the loop a fixed number of bytes is read from the input. Got it?
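The class of bug the hint points at can be reproduced in a few lines. The sketch below is a Python illustration, not the workaround's actual code: it transcodes UTF-8 to Windows-1252 in fixed-size chunks, once decoding each chunk independently and once with a stateful incremental decoder. The function names and the chunk size are assumptions chosen so that a two-byte character lands on a chunk boundary.

```python
import codecs

data = "naïve café".encode("utf-8")  # 'ï' and 'é' take two bytes each in UTF-8


def transcode_naive(data: bytes, chunk_size: int) -> bytes:
    """Buggy: every chunk is decoded on its own, ignoring split characters."""
    out = b""
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        text = chunk.decode("utf-8", errors="replace")
        out += text.encode("cp1252", errors="replace")
    return out


def transcode_incremental(data: bytes, chunk_size: int) -> bytes:
    """Correct: the decoder carries incomplete byte sequences to the next chunk."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    out = b""
    for i in range(0, len(data), chunk_size):
        out += decoder.decode(data[i:i + chunk_size]).encode("cp1252")
    out += decoder.decode(b"", final=True).encode("cp1252")
    return out


broken = transcode_naive(data, 3)        # 'ï' is cut in half by the first chunk
fixed = transcode_incremental(data, 3)   # same chunks, state kept between them
```

With a chunk size of 3 the naive version turns the split 'ï' into question marks while 'é', which happens to fall inside a single chunk, survives. That is what makes this kind of bug nasty: it only shows up when a multi-byte character happens to sit on a buffer boundary.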

posted @ Monday, August 03, 2009 4:54 AM | Feedback (0) | Filed Under [ BizTalk - EAI - B2B ]