Scott Muc

Another .Net Developer Named Scott

It came to my attention recently that some XML output (ASX files) is not working on the Mac platform. ASX is the Windows Media playlist file format. I am using the XmlTextWriter object to write these files as text, and they sit on a web server so people can click on them and listen to pretty music.

Unfortunately these files don't work on the Mac because it chokes on the BOM (the byte order mark at the very start of the file). Removing the BOM fixed the issue. Yay, I'm happy and so is the client. Still, I'm left feeling like I shouldn't have had to make that change. Why would the BOM be a problem?
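For illustration only, here is a minimal sketch of what "removing the BOM" means at the byte level. It is written in Python rather than the .NET code behind this post, and the function name `strip_utf8_bom` and the sample playlist bytes are made up for the example.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical playlist content, prefixed with a BOM as a writer might emit it.
with_bom = codecs.BOM_UTF8 + b'<asx version="3.0"></asx>'
without_bom = strip_utf8_bom(with_bom)

print(with_bom[:3].hex())   # the three BOM bytes
print(without_bom)          # the same content, starting directly at '<'
```

A consumer that expects the file to begin with `<` sees three unexpected bytes first when the BOM is present, which is one plausible reason a player would reject it.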

I did some research and found a couple of good resources: Unicode.org, Wikipedia

The one thing that I picked up is that the UTF-8 encoding doesn't have a specific endianness: it is defined as a sequence of single bytes, so there is no byte order to get wrong and I don't need an explicit label to declare the endianness. Also, for characters in the ASCII range, UTF-8 is byte for byte the same as ASCII. It's only when you have a character outside that range, in high-Latin space for example, that the document begins to break (Scott Hanselman).
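Both points are easy to check by looking at the encoded bytes. The snippet below is a small sketch in Python (again, not the .NET code from this post), and the sample strings are arbitrary.

```python
ascii_text = "playlist"

# For pure ASCII text, the UTF-8 and ASCII encodings produce identical bytes.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# A character outside the ASCII range (here U+00E9, e with acute accent)
# becomes a multi-byte sequence in UTF-8.
e_acute = "\u00e9".encode("utf-8")

# UTF-16 has two possible byte orders, which is what a BOM disambiguates.
utf16_le = "A".encode("utf-16-le")
utf16_be = "A".encode("utf-16-be")

print(same_bytes)          # True
print(e_acute)             # b'\xc3\xa9'
print(utf16_le, utf16_be)  # b'A\x00' b'\x00A'
```

UTF-8 has only one serialization for any given character, so the BOM carries no byte-order information there; at most it acts as a signature saying "this is UTF-8".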

Because of these findings, I've decided to start writing all of my XML output without the BOM. The decision rests on the following assumptions:

  • I don't have to worry about apps that can't handle a BOM
  • Files will work on Unix without any BOM issues in text editors
  • In web requests, I can get the encoding from the MIME type (the charset parameter of the Content-Type header)
  • If I am consuming XML in my code, I already know it's XML and don't need a BOM signature to tell me
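The last two assumptions can be sketched in a few lines. This is an illustration in Python under stated assumptions, not the .NET code from this post: the helper name `charset_from_content_type`, the header values, and the sample document are all hypothetical.

```python
from email.message import Message
import xml.etree.ElementTree as ET

def charset_from_content_type(header_value: str, default: str = "utf-8") -> str:
    """Pull the charset parameter out of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset in lowercase, or None if absent.
    return msg.get_content_charset() or default

# Hypothetical header values a web server might send.
declared = charset_from_content_type("text/xml; charset=UTF-8")
fallback = charset_from_content_type("text/xml")

# A BOM-less UTF-8 document parses fine when the consumer already knows it is XML.
body = '<?xml version="1.0" encoding="utf-8"?><asx version="3.0"/>'.encode("utf-8")
root = ET.fromstring(body)

print(declared, fallback)             # utf-8 utf-8
print(root.tag, root.get("version"))  # asx 3.0
```

The fallback to UTF-8 when no charset is declared is a choice made for this sketch; a real consumer may want a different default.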

The main negative that I see is that if a document is something other than UTF-8 and I attempt to open it in a text editor, I won't know the encoding... but if I only discard the BOM in UTF-8 documents, this shouldn't be a problem.
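To show why keeping the BOM on non-UTF-8 documents is enough, here is a rough sketch of the kind of sniffing a text editor could do. It is Python, the function name `sniff_encoding` is invented for the example, and treating "no BOM" as UTF-8 is an assumption that mirrors the policy above.

```python
import codecs

# UTF-32 is checked before UTF-16 because the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_encoding(data: bytes, default: str = "utf-8") -> str:
    """Guess the encoding from a leading BOM; assume `default` when there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default

print(sniff_encoding(b"<asx/>"))                          # no BOM, so the default
print(sniff_encoding(codecs.BOM_UTF16_BE + b"\x00<"))     # utf-16-be
```

With this scheme, a file with no BOM is read as UTF-8, and anything else still announces itself through its BOM.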

posted on Tuesday, April 17, 2007 3:11 PM