For a project, I have been setting up a BizTalk schema using the flat file extension. The file to read resembles a modified EDIFACT message:
PNA,H123
LIN,2,XYZ0001
RFF,REFNO01
IMD,C,CODE1, VALUE1
IMD,C,CODE2,VALUE2
IMD,C,CODE3, VALUE3
The schema was created so that it reflects the structure of the message. For reading flat files, a few things had to be changed:
- Add the flat file extension to the schema
- Set the tag identifier for each record that can occur in the flat file (note: some additional group nodes were added to allow for 'group repeat')
- Set the child delimiter character, type and order for all nodes in the schema
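To make the roles of the tag identifier and the child delimiter concrete, here is a minimal Python sketch that splits the sample message into records. It is only an illustration and not how the BizTalk flat file parser works internally. The function names `parse_records` and `group_by_lin` are hypothetical, and the rule that a LIN record opens a new group is an assumption based on the group nodes mentioned above.

```python
SAMPLE = """PNA,H123
LIN,2,XYZ0001
RFF,REFNO01
IMD,C,CODE1, VALUE1
IMD,C,CODE2,VALUE2
IMD,C,CODE3, VALUE3"""


def parse_records(text, record_delimiter="\n", child_delimiter=","):
    """Split the text into (tag, fields) pairs.

    The first field of each record acts as its tag identifier;
    the remaining fields are separated by the child delimiter.
    """
    records = []
    for line in text.split(record_delimiter):
        line = line.rstrip("\r")  # tolerate CR LF line endings
        if not line:
            continue
        tag, *fields = line.split(child_delimiter)
        records.append((tag, fields))
    return records


def group_by_lin(records):
    """Collect records into groups, each starting at a LIN record.

    Assumption: everything before the first LIN is header data, and
    every LIN opens a new repeating group.
    """
    header, groups = [], []
    for tag, fields in records:
        if tag == "LIN":
            groups.append([(tag, fields)])
        elif groups:
            groups[-1].append((tag, fields))
        else:
            header.append((tag, fields))
    return header, groups


records = parse_records(SAMPLE)
header, groups = group_by_lin(records)
```

Note that the fields are not trimmed, so the leading space in " VALUE1" survives; whether that space is significant depends on the source system.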
A good indication of the effect of all these settings can be obtained by using “Generate Instance” every now and then. BizTalk should generate a “native” (flat file) instance.
After some changes, the generated instance looked OK. Then I tried it the other way around: validating an instance against the schema, using a given example of the file. Even after I had resolved a few bugs related to the child delimiter settings, I continued to get errors. The parser reported messages like:
Unexpected data found while looking for:
'ERC'
'\r\n'
The current definition being parsed is LINgroup.
All problems were related to the optional, repeating records.
A blog entry gave an important hint: the “Parser Optimization” property (of the Schema node) has to be set to “Complexity”. The default setting (“Speed”) cannot handle the optional, repeating records.