BizTalk Server and Agile. Can they live together?

So far I have seen only the waterfall methodology in BizTalk project development. I have worked for small and big companies, and everywhere I saw only waterfall. Is there something special about BizTalk projects that keeps Agile from ever being used with them?

So far we, BizTalk developers, get all the disappointments of waterfall development: long project stages, huge and unusable documentation, disagreements between users, stakeholders and developers, bloated code, unsatisfactory code quality, scary deployments and modifications.

Recently our team decided to use Agile principles to address these issues. We had our victories and our defeats, but right now we feel better on our journey to the “Agile world”.

The business goal is simple: we desperately need faster development. When we deploy and test new code in hours, not weeks, we make a lot more iterations; we make, find and fix a lot more errors. Which is just great. Now an error is not a catastrophe, it is a small thing. That means more reliable code. Our applications are more reliable and we fix errors very fast.

The main reason to use Agile is economics. It is not only faster and more reliable, it is cheaper.

We decided to use Agile together with SOA and microservice architecture. At first we thought BizTalk was too heavy a tool set to be used with Agile. But it turns out BizTalk Server has a very special set of attributes that suits a microservice architecture very well out of the box. If you think about orchestrations and ports as microservices, this part of BizTalk fits SOA perfectly.

Stoppers

Three main things keep BizTalk developers from using Agile: artifact dependencies, “niche, unnecessary tools”, and manual deployment.

Here I am talking only about the technical side of the problem. The management side will be touched on later.

Artifact Dependencies

Dependency is the other side of code reuse. And here lies one of the main differences between BizTalk applications and generic applications. The latter are created as a set of dll-s in C#, Java or any other programming language. In many cases we prefer to simplify things with code reuse, which creates some dependency problems, but usually this is not a big deal. What about BizTalk applications? BizTalk Server keeps strict control of the working artifacts. Reliability is the king. We cannot just replace one buggy artifact if there is a dependency on it. It hurts redeployment but, remember, reliability is the king. Anything else is not so important. So for us, as BizTalk developers, the cost of dependency is really high.

Moreover, dependency is a stopper for a SOA application. Keep services independent, and it is easy to modify them, to add new ones, and to version them.

Niche, Unnecessary Tools

BizTalk Server is a big toolset. It is impossible to hire a full team of expert BizTalk developers. Most folks are not deeply specialized in BizTalk.

So if we keep the technology stack limited, the time to make a new developer productive is short. Any tool could be replaced by C# code, so the decision was to use a bare minimum of the BizTalk tools: Schemas, Maps, Orchestrations.

Completely prohibited are BRE and the ESB Toolkit.

Custom pipelines, direct binding and XSLT are limited to very special cases.

The differentiator was the question: “Is this tool 100% necessary and does it require a special skill set?”

I am not going to start another holy war. Our decisions are based only on our use cases. In your company, in your zoo, the decisions would be different.

Deployment

How do we make development iterations fast? One part of the development cycle is deployment. It is not a problem at all if you write your program in Python or Go. It is not a big problem if you write your app in .NET. But in BizTalk development it is the Problem. So any method that keeps deployment quick is very important.

BTDF was the necessary tool in our case, but the BizTalk PowerShell Provider is also used.

And deployment is always about dependencies. Penalize more dependencies and reward fewer dependencies, that is the idea.

Technology Rules

We started by defining rules for the technology side of the projects. We enforced several SOA and microservice rules:

  • Service size: A service encapsulates a single business function and exposes only a single interface. In most cases this rule effectively cuts a BizTalk application down to a single orchestration (or a couple of ports).
  • Shared Contracts, API: The services must communicate only through contracts. The only permitted dependencies between services are the contracts. We never share maps, orchestrations, ports, or pipelines between applications. We only share schemas and APIs.
  • Versioning: A service upgrade is published as a new service. Once published, a service is never changed; it can only be removed.
  • Tests: Tests are an important part of an application. An application without tests is not approved. We need only user acceptance tests. The minimum test set should cover successful tests and failure tests. Performance tests and unit tests are not mandatory. Special test data is part of the design. We tried to design our applications in such a way that test data can be used in production together with production data.
  • Automation: Automated deployment is an important part of an application. An application without automated deployment is not approved.

As you can see, this list has some specifics for BizTalk projects. BizTalk is oriented to XML, which is why we say an XML Schema when we mean a Contract. In BizTalk projects it is really hard to test endpoints, so we tried to go easy on testing. The deployment of a BizTalk application is a complex and long process, so we enforce deployment automation.

These rules were not easy to push into development. There were many unanswered questions on the way: What size is “small” and what is “big”? What modification is considered a new version? What test coverage is enough?

These rules are not ideal. We had discussions, we added and removed rules, we changed them. We are still in the process.

Team Rules

Those are the “technological” rules. But we also need management and team rules, because Agile is about team structure and communication, right? Conway’s law cannot be ignored if you think about Agile. Also we cannot avoid all the hype around the DevOps movement. So we put into practice some additional “team” rules:

  • One service implemented by one developer.
  • One service implemented in one week sprint.
  • The service developer is a DevOp; he/she is responsible for deployment and all operations of the service in all environments, including production.

The “one week” rule just happened after a couple of months of experiments. We tried 2-3 days and 2 weeks. One week works because we want to keep all projects in one team. All developers work remotely; we have never met together as a team. And our team is not big. So “one week” was the result of those factors.

One of the key issues with Agile and DevOps is that service knowledge is tightly coupled with a single developer. Enterprises cannot tolerate this issue because it drastically increases the risk of losing the service if the developer is not accessible. So we added a new role (the senior developer) and two more “team” rules:

  • A senior developer approves the service design. This senior developer performs the tests and the deployment of the service in production and signs the service off to production.
  • If the service developer or the service senior developer is not accessible, a new developer should take over that role.

The “Senior Developer” rule is tricky. With this rule a senior developer cannot just sign documents, approve something and voila. No way. This rule effectively forces the senior developer to do a good code review and monitor all development steps. Officially this rule covers two short tasks, but they are short only if this team of two invests a good amount of time to communicate all details of the service.

Dependency Rule

The true heart of SOA and microservices is in limited dependencies. With BizTalk applications the dependency problem is even more important than in standalone applications, because BizTalk controls dependencies and prohibits many shortcuts permitted in standalone applications. So eliminating dependencies simplifies development and makes it possible to build simple services, the microservices, which is our goal.

But we cannot just remove all dependencies. Service dependencies are a necessary evil. We have to share schemas, dll-s, services. So there is a special “dependency” rule:

  • Any change in the shared resources, any change that could go outside of the service boundaries, must be approved by the whole service team. All team members should agree on the change. Any team member can veto it.

For example, we implemented shared infrastructure for logging. The first approach was to create a shared library (dll) and force everybody to use it. It was vetoed. The second approach was to create a special logging service and expose it as a single endpoint. It was also vetoed, because the code to use this service was too big. The next try was to use log4net or NLog and standardize only the log format. Then there was the next attempt, and more. Now we are discussing InfluxDB, but what matters is that now we know much, much more about what we need in reality and what we don’t.

Complete Rule Set

Now we have this rule set:

  • Single Interface
  • Shared Contract
  • Never Change [Published]
  • Test inside Application
  • Automated Deployment
  • DevOp
  • Senior Developer
  • New Developer
  • New Dependency

These rules are not ideal. For example, we are still struggling with the Wrong Requirements problem. How do we fix it?

Our rule set is too long. Now we are considering merging the Test and Deployment rules.

This rule set works for our team. What is special about our team? Half of the team members are full-time employees, which enables the DevOp and Senior Developer rules. I am not sure these rules would work if all members were contractors. Our team has a big list of projects to develop and a big list of applications to support. If you have mostly applications to support, our approach is possibly not the best fit for you.

I did not describe the communication with our customers and stakeholders, which is an important part of the whole picture.

I did not describe the Agile practices we use (but we definitely use a Kanban board); that is not the point of this topic.

We were happy with our management, which took the risk of changing the processes and the team. Now the management is happy (hmm… almost happy) with BizTalk development and operations. Now we are a SUPERFAST team! We are not sharks, not yet, but we are not jellyfish anymore.

NServiceBus vs. MassTransit

I got an interesting request to compare and choose the right integration technology for one of my customers.

At the end of the day the candidates were narrowed down to just two systems: NServiceBus and MassTransit.

The comparison process is simple. We create a sieve and filter the technologies through it. In the end we get several winners, one, or even zero.

The filters are a mix of technology, life cycle, and plain common-sense questions. The whole filtering process looks unscientific and unsystematic, but it is sane (I hope) and simple.

Here are the filters:

  • What systems do we integrate?
  • What are your development resources: the team skills, the team agility, the team size?
  • Java, .NET or both?
  • What is the life horizon of the integration? 1, 2, 5, 10 years?
  • The nonfunctional requirements:
    • reliability
    • sustainability
    • scalability
    • performance: throughput (messages per sec/day); message size; latency
  • etc.

An integration technology needs to fail just one filter to be knocked out of the competition.

So how do NServiceBus and MassTransit compete here? They are very similar in technology aspects, so a non-technology factor should give us the winner. It is the life horizon factor.

Here is the development activity on the code base of both systems:

image

 

image

Do I need to comment on those graphs? Probably not.

OK, let’s do one more check with Google Trends:

image

There is a company that supports NServiceBus: Particular Software. MassTransit is supported only by the open source community, which showed a lot of enthusiasm in the early years but has pretty much stopped by now. The key developers of MassTransit have moved on to other projects.

I have not worked with either of these systems and I am not, by any means, choosing the “best” integration system. Some features of one system could be much better than the other’s; it doesn't matter now. I do not choose the better system; I only identify the system that fails my specific requirements.

It happens that the customer wants the integration code to work and to constantly improve for at least the next 5 years. From this perspective MassTransit failed.

[Update: 2014-07-16
 BTW the creator of NServiceBus, Udi Dahan, made a comment on this post about the first picture. It seems a good proof of a healthy system.]

Domain Standards and Integration Architecture

The domain standard schemas are the repositories of domain knowledge. The message schemas are the schemas for the real data transferred between real systems. Use the domain standard schemas as a reference model for your schemas. Do not use the domain standard schemas as your message schemas.

EDI to XML or EDI to SQL?

In EDI processing we need to transform data from the EDI format to the data formats of our applications and back.

App to EDI to App - ERD

What is the best way to do this?

The most popular transformation uses an intermediate XML format. We use the XML Schema and XSLT standards to transform one XML format to another XML format. Is it the best way?

    Let’s look at the whole data processing chain:

    EDI to XML to SQL - ERD

     

    1. EDI to XML transformation;

    2. data transformation (with XML Schema and XSLT);

    3. XML to SQL transformation.

        Now I’m going to step back and look at the EDI document structure a little bit.

        Does the EDI structure resemble the XML structure or the SQL structure?

Definitely it resembles SQL. The EDI segments are like SQL table records. The EDI elements are like table columns. (Sometimes the elements are composed of several sub-elements, but that is unimportant now.)

Does the EDI structure resemble the XML structure? It does not. XML relations are expressed by nesting structures. For example, in XML the Order Detail records are nested inside an Order record. In EDI the Order Detail and Order segments are not nested; the relations are defined the same way as in SQL, by correlated IDs in related segments.

So the EDI structure resembles the SQL structure, not the XML structure.

Also remember our final goal, which is the application data in SQL format. Can we just bypass the XML format and transform EDI directly to the SQL format?

This is more natural. We throw out the two complex EDI to XML and XML to SQL transformations and replace them with a single EDI to SQL transformation. Why are the EDI to XML and XML to SQL transformations so complex? Because we have to map the reference relations to the nesting relations or vice versa. It is not simple. There are whole books teaching us tips and tricks for these transformations.

EDI to XML OR XML to SQL - ERD

Why is the EDI to SQL transformation simple? Because it is a one-to-one mapping: a segment to a record and an element to a field, the SQL-like EDI structures to the SQL application structures.

        EDI to SQL - ERD

There is one problem with this new approach. We have to create code for the EDI to SQL transformation. It is not a hard problem if we use contemporary techniques like LINQ or Entity Framework. Those techniques look competitive even against the adapters implemented in specialized integration systems like BizTalk Server or Mule ESB. The code is pretty straightforward if we use a two-step transformation.

        EDI to SQL to SQL - ERD

        1. EDI to SQL transformation;

        2. data transformation (SQL to SQL)

The first step is to transform EDI segments and elements to SQL records and fields that mirror the EDI structure (EDI SQL), and the second step is to map those SQL records to the SQL records of our application database (App SQL). A minimal sketch of this two-step idea is shown below.
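The following is a minimal C# sketch of the two-step idea, not taken from a real project. The EdiOrderSegment/EdiOrderDetailSegment and AppOrder/AppOrderLine classes are hypothetical names standing for the "EDI SQL" and "App SQL" records, and step 1 (parsing the EDI text into the flat records) is omitted.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

class EdiOrderSegment              // "EDI SQL" record: one row per EDI Order segment
{
    public string OrderId;
    public string OrderDate;       // assumed CCYYMMDD, as in typical EDI elements
}

class EdiOrderDetailSegment        // "EDI SQL" record: one row per Order Detail segment
{
    public string OrderId;         // correlation id, exactly like a foreign key in SQL
    public string Sku;
    public int Quantity;
}

class AppOrder                     // "App SQL" record: the application's own shape
{
    public string Id;
    public DateTime Date;
    public List<AppOrderLine> Lines;
}

class AppOrderLine
{
    public string Sku;
    public int Quantity;
}

static class EdiToSqlMapper
{
    // Step 2: a flat, set-based mapping - essentially a SQL join expressed in LINQ.
    public static IEnumerable<AppOrder> Map(
        IEnumerable<EdiOrderSegment> orders,
        IEnumerable<EdiOrderDetailSegment> details)
    {
        return orders.Select(o => new AppOrder
        {
            Id = o.OrderId,
            Date = DateTime.ParseExact(o.OrderDate, "yyyyMMdd", CultureInfo.InvariantCulture),
            Lines = details.Where(d => d.OrderId == o.OrderId)
                           .Select(d => new AppOrderLine { Sku = d.Sku, Quantity = d.Quantity })
                           .ToList()
        });
    }
}

The point of the sketch is that step 2 stays a flat join on the correlation Id, exactly like a SQL join, with no nesting gymnastics.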

This example shows how we can simplify our solutions, i.e. how to do some real architecture.

          BizTalk: Deployment Hell & BizTalk Deployment Framework

          Is there something special in the BizTalk Server application deployment? Why is it so special?

          BizTalk Deployment Hell

For .NET applications life is simple. There is an exe file and maybe several additional dll-s. Copy them and that is pretty much all the deployment we need.

BizTalk Server requires the dll-s to be placed in the GAC and registered in the special Management database. Why is this required? Mostly because BizTalk automatically hosts applications in the cluster and because of reliability. A BizTalk application may not be easy to deploy, but it is also not easy to break. For example, if application A depends on application B, BizTalk will prevent us from removing B accidentally.

This Why? question is a big theme and I will not cover it here.

Another factor is that BizTalk has many pieces that require special treatment in deployment and at run-time.

Yet another factor is that a BizTalk application integrates independent applications / systems, which means the application should take care of many configuration parameters, such as endpoint addresses, user names and passwords, and so on.

As a result, BizTalk application deployment is complex, error prone, slow, unreliable, and requires a good amount of resources.

And look, here is a savior: the BizTalk Deployment Framework.

          The BizTalk Deployment Framework - the Champion

The BizTalk Deployment Framework (BTDF) is an essential tool in the arsenal of BizTalk Server developers and administrators. It solves many problems and speeds up development and deployment enormously.

It definitely has a prominent place in the BizTalk Server Hall of Fame.

BTDF was created by Scott Colestock and Thomas Abraham.

It is an open source project, despite the fact that BTDF is more powerful, reliable and thorough than most commercial BizTalk Server third-party tools. I think it is fair to donate several dollars to those incredible guys on CodePlex. Just think about the days and months we save in our projects and in our private lives.

BTDF is an integration tool and it is created by guys with a pure integration mindset. It integrates a whole bunch of open-source products into one beautiful BTDF. Below I copy the "Contributors and Acknowledgements" topic from the BTDF Help:

          Thanks also to:

          • Tim Rayburn for contributing a bug fix and unit tests for the ElementTunnel tool
          • Giulio Van for contributing ideas and bug fixes.
          • The hundreds of active users across the world who have promoted the Deployment Framework, reported bugs, offered suggestions and taken the time to help other users! …”

And how exactly does BTDF save our lives?

          BTDF is installed and tuned up. Now it is time to deploy an application.

          DeploymentMenu.Deploy

The deployment was successful and I got a long output showing exactly what was done in this deployment.

          Here is a log. I've marked my comments in the deployment log with “+++”. The full redeployment lasted 2 minutes.

If I performed the same amount of tasks manually, I would spend at least 3-4 additional minutes fully concentrated on this long list of deployment tasks. The probability of missing some step would be quite high, as would the probability of errors. With BTDF I start the deployment and am free to do other tasks.

          So my gain is 3+ minutes on each deployment. Risk of an error is zero, everything is automated.

There is one more psychological problem with manual deployment. It is complicated and it requires full concentration. As a developer I am concentrated on the application logic, and a manual deployment task breaks my concentration. Each time after deployment I have to concentrate on my application logic again. And this is where development performance goes to hell.

          There are other helpful BTDF commands:

          DeploymentMenu.All

• Restart all host instances (only those used by this application) and restart IIS if web services are deployed, all with the "Bounce BizTalk" command.
• Terminate all orchestration and messaging instances left over after tests.
• Install the .NET component assemblies into the GAC.
• Include the modified configuration parameters into a binding file.
• Import this binding file.
• Update SSO with the modified configuration parameters.

          All this with one click.

          You do not have to use the slow BizTalk Administrative Console to do those things anymore.

Check my comments in the BTDF deployment log. Several automated tasks are like a blessing! We do not create the drop folders anymore and do not assign permissions to those folders. BTDF does this automatically.

We do not care about the undeployment and deployment order; BTDF gets everything right.

          We do not stop and start ports, orchestrations, applications, host instances, IIS. Everything is automated.

          Look at the list of the top level parameters in BTDF:

          DeploymentProperties

Remember, I told you that BizTalk application deployment is a little bit complicated? Each of the application components in this list requires slightly different treatment in deployment. In a simple application we do not use all those components; a typical application uses maybe 1/3 or 1/2 of them, but you get the idea.

          How to tune up BTDF?

One thing I love in BTDF is the Help. It is exemplary, ideal, flawless help.

If you have never tried BTDF, there is a detailed description of all parts and all tasks. Moreover, there is the most unusual part: discussions of the BTDF principles and the BizTalk deployment processes. I got more knowledge about BizTalk Server deployment there than from the official Microsoft BizTalk Server Help.

The BTDF Help helps whether you are a new user or have used BTDF for several years in a row. Descriptions are clear, not dumbed down, and arranged in a clear hierarchy. The BTDF Help is one of the best. You are never lost.

Of course there is a detailed tutorial in the Help, and there are sample applications.

OK, now we have to start with tuning.

The typical BTDF workflow for setting up the deployment of a BizTalk application:

1. Create a BTDF project.
2. Set up the deployment project.
3. Create an Excel table with the configuration parameters.
4. Set up a binding file.

          All configuration parameters are managed inside an Excel table:

Forget about managing different binding files for each environment. Everything is inside one Excel table. BTDF will pass parameters from this table into the binding file and other configuration stores.

Excel helps when we compose parameters from several sources. For example, we keep all file port folders under one root. The folder structure below this root is the same in all environments; only the root itself is different. So there is a "RootFolder" parameter, and we use it as part of the full folder paths for all file port folders. For example, we have a "GLD_Samples_BTDFApp_Port1_File_Path" parameter which is defined in a cell with a formula like = "RootFolder" + "GLD_Samples_Folder" + "BTDFApp_Folder" + "Order\\*.xml" (of course in the Excel cell it would be something like = C22 & C34 & C345 & "Order\\*.xml"). If we modify the RootFolder path, all related folder paths are modified automatically.

OK, we worked hard setting up the application deployment. Now, surprise, we have to create a new environment. Here comes the best part of the BTDF configuration fiesta: all configuration parameters for ALL ENVIRONMENTS of an application are in just one table. (If you are tenacious enough you could keep a single table for ALL applications, but you would have to ask me how.)

Settings for different environments: in our Excel table we copy-paste the column of one of the existing environments into a new column for the new environment. Then we modify the values in this new column. And again, this is an Excel table: we define a single RootFolder value for the new environment and voila, all file port paths for this environment are modified.

          Now we have to pass those configuration parameters to the binding files, right?

We replace all values in the binding file that differ between environments, like the Host names, NTGroupName, Addresses, and transport parameters such as connection strings. We replace these values with variables, like this:

BindingFileParameters

Now we save this binding file under the name PortBindingsMaster.xml.

That is pretty much everything we need. Now execute the Deploy BizTalk Application command and the application is deployed.

Deployment into a Production environment is different. It is limited in the installed tools and we have to do additional installation steps. BTDF creates an msi and a command file. This msi includes all the additional pieces we need to install. We no longer have to manually add resources to the BizTalk application in the Administrative Console.

Conclusion: the BizTalk Deployment Framework is a mandatory tool in BizTalk development. If you are a BizTalk developer, you must know it and use it.

          Complex XML schemas. How to simplify?

          [Sample code is here: ]


XML Schemas are used for two main tasks:

• for processing XML documents (for XML document validation and transformation);
• for defining domain-specific standards.

          XML Schemas and Domain Standards

Let's talk about the domain standards: EDI, RosettaNet, NIEM, ebXML, Global Justice XML Data Model, SWIFT, OpenTravel, Maritime Data Standards, HIPAA, HL7, etc. If we look at those standards, we see that the schemas embrace the domain knowledge in a form which can be formally and officially validated. [In this article I discussed those standards in more detail: Domain Standards and Integration Architecture] XML Schemas are very helpful for such tasks.

Compare standards defined in the form of XML schemas with standards defined in the form of text documents. It is almost impossible to verify whether the data satisfies the standard or not if all we have is the text document where this standard is defined. And it is possible to validate it, and validate it automatically, if we use XML Schemas.

The domain specialists use XML Schemas to define standards in an unambiguous, machine-verifiable form.

Those schemas tend to be large, huge even, and very detailed. And this is for very good reasons.

But if we start to use XML Schemas for the first task, for processing XML documents in our programs, we need something different: we need small schemas. In system integration we need small schemas.

We don't need the abundance of the HIPAA schemas in most applications. We only need a small portion of the schema to validate or transform the part that is significant for this application.

We upload megabyte-size schemas, we perform mapping on these huge schemas, and it lasts an eternity and consumes a huge amount of CPU and memory.

For most integration projects we don't want to validate that the data satisfies the standard. We want to transfer data between systems as fast as possible with minimal development effort.

How do we work with those wealthy schemas? How do we make our integration fast at run-time and in development?

First we have to decide: does our application require the whole schema or not?

If the answer is "No", read on to the solution.

          How to Simplify?

The solution is to simplify the schema. Cut out all unused parts of the schema.

The first step in our simplification is to decide which parts of the original schema we want to transfer further, i.e. map to another schema. We keep these parts unchanged and we simplify all the other, unnecessary schema parts.

The second step is to find out whether the target integrated system performs validation of the input data or not. A good system usually validates input data. Validation includes data format validation (is this field an integer or a date, does it match a regex?), data range validation (is this string too long, is this integer too big?), coding validation (does this value belong to the code table?), etc.

If the target system performs this validation, it doesn't make sense for us to perform the same data validation in the integration layer. We just pass the data to the target system without any validation. Let this system validate the data and decide what to do with errors: send errors back to the source system, try to repair them, or something else. Actually it is not good architecture if an intermediary (our integration system) tries to make such validations and decisions. It means spreading the business logic between systems, where the target system delegates its data validation logic to the intermediary. The integration system deals with data validation only where it is needed.

          Example: HIPAA Schema Simplification in the BizTalk Server

Now let's get more technical. The next example is implemented with BizTalk Server and the HIPAA schemas, but you can use the same principles with other systems and standards.

The first step in the schema simplification is the structural modification. It is pretty simple. We replace the unused schema parts with <any> tags [http://www.w3.org/TR/xmlschema-0/#any]. If we still want to map such a schema part, but without any details, we can use the Mass Copy functoid.

          The second part of the schema simplification is the type simplification.

          For the HIPAA schemas I use these regex replacements:

Open your schema in the XML (Text) Editor mode:

          image

Press Ctrl-Shift-H (Find and Replace in Files) and check the "Use Regular Expressions" option:

          image

Make two replacements (a scripted alternative is sketched after the list):

          • type="X12_.*"  --> type="xs:string"
          • <xs:restriction base="X12_.*">.*\n.*\n.*\n.*</xs:restriction>  --> <xs:restriction base="xs:string"/>
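If you prefer to script these two replacements instead of using the Find and Replace dialog, a minimal C# sketch could look like this. The schema path is an assumption; the patterns are exactly the two listed above and behave the same way as in Visual Studio, which also uses .NET regular expressions.

using System.IO;
using System.Text.RegularExpressions;

class SchemaTypeSimplifier
{
    static void Main()
    {
        // Assumption: path to the HIPAA schema you opened in the XML (Text) Editor.
        string path = @"C:\Schemas\HIPAA_Schema.xsd";
        string schema = File.ReadAllText(path);

        // 1. Point all X12_* type references to plain xs:string.
        schema = Regex.Replace(schema, "type=\"X12_.*\"", "type=\"xs:string\"");

        // 2. Collapse the multi-line X12_* restrictions into an empty xs:string restriction.
        schema = Regex.Replace(schema,
            "<xs:restriction base=\"X12_.*\">.*\n.*\n.*\n.*</xs:restriction>",
            "<xs:restriction base=\"xs:string\"/>");

        File.WriteAllText(path, schema);
    }
}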

          Save and close.

Open the schema again in the Schema Editor, make any small change and undo it. The editor recalculates the type information and pops up the Clean Up Global Data Types window. Check all types and click OK.

          image

          This cleans up all unused Global Data Types.

Previously we replaced all those types with the "xs:string" type, so those types are not used anymore.

          It takes 5 minutes for this replacement. What is the result?

          image

The modified schema is half the size.

image - the dll size with the original schema.

image - the dll size with the modified schema.

The assembly for the modified schema is also cut in half.

Not a bad result for a 5 minute job.

How do these simplified schemas change performance?

All projects with these schemas and maps compile notably faster in Visual Studio. As a developer, I like this improvement.

          How about the run-time performance?

          I have made a simple proof of concept project to check the performance changes.

          Test project

The test project is composed of two BizTalk applications and two BizTalk Visual Studio projects. Do not do this in production projects! One Visual Studio solution should hold one and exactly one BizTalk application.

Each project holds one HIPAA schema, one very simple schema, one "complex" map (HIPAA schema to HIPAA schema), and one simple map (HIPAA schema to the very simple schema).

The first project works with the original HIPAA schema and the second project with the simplified HIPAA schema.

Build and deploy one project.

Each BizTalk application is composed of a file receive location and a file send port. The receive location uses the EdiReceive pipeline to convert the text EDI documents into XML documents. So we need to add a reference to the "BizTalk EDI Application":

          image

After deployment, import the binding file which you will find in the project folder. Create the In and Out folders and apply the necessary permissions to them. Change the folder paths in the file locations to your folders.

There is also a UnitTests project with several unit tests. Change the folder paths in the test code.

Perform the tests.

Then delete the application, deploy the second BizTalk project, and perform the tests again.

          Do not deploy both projects side by side.

          Performance results:

Note: before each test, start the Backup BizTalk job to clean up the MessageBox a little.

          image

Tests with 1, 10 and 100 messages did not show a visible difference. In my environment the difference became noticeable in the 1000 message and 3K message batch tests. The above table shows the test results for the 3K batch tests.

The performance gain is about 10%. It is not breathtaking, but it is not bad for 5 minutes of effort.

Conclusion: the schema type simplification is worth doing if the application expects sustained high payloads or high peak payloads, and anywhere you want to get the best possible performance.

          BizTalk: Complex decoding in data transformations

Sometimes we need to do complex decoding in data transformations. This happens especially with big EDI documents such as HIPAA.

          Let’s start with examples.

In one example we need to decode a field. We have the source codes and the target codes for this field. The number of codes on both sides is small and the mapping is one-to-one or many-to-one (1-1, M-1). One of the simplest solutions is to create a Decode .NET component. Store the code table as a dictionary and decoding will be fast. We could hard-code the code table, if the codes are stable, or cache it, reading it from a database or a configuration file. A minimal sketch of such a component follows.
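Here is a minimal sketch of such a Decode component, assuming a hard-coded code table; the class name and the codes themselves are made up for illustration.

using System.Collections.Generic;

public static class GenderCodeDecoder
{
    // Source (external) code -> target (application) code; many-to-one entries are fine too.
    private static readonly Dictionary<string, string> CodeTable =
        new Dictionary<string, string>
        {
            { "M",   "Male" },
            { "F",   "Female" },
            { "U",   "Unknown" },
            { "UNK", "Unknown" }   // many-to-one mapping
        };

    public static string Decode(string sourceCode)
    {
        string targetCode;
        return CodeTable.TryGetValue(sourceCode, out targetCode) ? targetCode : "Unknown";
    }
}

A map can call such a method through a Scripting functoid configured to use an external assembly, or an orchestration can call it directly.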

The next example is on the opposite side of the complexity scale. Here we need to decode several fields. The target codes relate to several source codes/values (M-1). It is not plain value-to-value decoding; it also includes several if-then-else conditions, which opens a can of worms with 1-M and M-M cardinality. Moreover, the code tables are big and cannot be kept in memory.

We can implement this decoding with numerous calls to the database to get the target codes, and perform these calls inside a map or inside a .NET component. As a result, for each document transformation we call the database many times.

But there is another way to implement this without flooding the database with these calls. I call this method "SQL Decoding".

We should remember that SQL operations are set operations working with relational data directly. SQL Server is very powerful in executing these operations. Set operations are so powerful that we might decode all fields in a single operation. It is possible, but all source values must already be in the database at that moment. All we have to do is load the whole source document into SQL tables. Hence our method is:

1. Load the source message into a SQL database.
2. Execute the decoding as a SQL operation or a series of operations.
3. Extract the target message back from SQL.

We can do all structure transformations in maps (XSLT) and perform only the decoding in SQL. Or we can also do some structure transformations in SQL. It is up to us.

          The Pros of this implementation are:

          • It is fast and highly optimized.
          • It does not flood database with numerous calls.
          • It nicely utilizes the SQL engine for complex decoding logic.

          The Cons are:

          • Steps 1 and 3 may not be simple.

In real life we usually don't have a clear separation between these scenarios, and intermediate solutions can be handy. For example, we can load and extract not the whole message but only the part of it related to decoding.

Personally, I use SQL Decoding in the most complex cases, where the mapping takes more than 2 days of development.

          Note:

• If you are familiar with LINQ, you can avoid steps 1 and 3 and execute the set operations directly on the XML document, as in the sketch below. I personally prefer to use LINQ. But if the XML document is really big, the SQL approach works better.
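A minimal LINQ to XML sketch of that idea follows; the ProcedureCode element name and the in-memory code table are assumptions made for illustration.

using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class LinqDecoder
{
    // One set-like pass over the document: every source code found in the code
    // table is rewritten in place with its target code.
    public static void DecodeProcedureCodes(XDocument document,
        IDictionary<string, string> codeTable)
    {
        var decodable =
            from code in document.Descendants("ProcedureCode")
            where codeTable.ContainsKey((string)code)
            select code;

        foreach (XElement code in decodable.ToList())
            code.Value = codeTable[(string)code];
    }
}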

Conclusion: if we need complex decoding/encoding of an XML document, consider using set operations with SQL or LINQ.

          BizTalk Integration Development Architecture

You can find some architecture information in the BizTalk documentation. There are several tutorials and a good amount of samples. Almost all of them relate to the infrastructure architecture, i.e. how to create highly available systems with clusters, how to scale out BizTalk systems, etc.
I covered several development aspects of architecture in a series of articles:

          Copying a new build to all environments

          I am doing this task again and again, so maybe this code will be helpful not only for me.

That is a standard routine. I develop BizTalk Server applications and use the BizTalk Deployment Framework (BTDF) for all my deployments. When an application is ready for testing and, in the end, for production, the build files have to be deployed. Usually a BizTalk installation has several environments, for example: Development, QA, Staging, Production. Sometimes fewer, sometimes more. The best practice is to keep all environments isolated from each other, so each environment keeps its deployment packages separately. That means, in my case, the build files should be copied to all environments.

A good practice is to save the old builds in case of a rollback.

          The folder structure for the builds looks like this:

          image

          The Current folder keeps the currently deployed build. The [YYYYMMDD_hh_mm_ss] folders keep the old builds.

          What is interesting in this code?

The Copy performs two nested loops: one through the EnvironmentName items and the second through the NewBuildToCopy files.

The code also generates the folder name in the [YYYYMMDD_hh_mm_ss] format.

          Here is the code:


          <!-- Copy a new deployment build to all environments and to a Personal share. 
            Before this, rename the Current folder to [CurrentDateTime] to save the old build. 
            -->
          <Target Name="AfterInstaller" AfterTargets="Installer">
            <PropertyGroup>
              <NewBuild>..\Deployment\bin\$(Configuration)</NewBuild>
              <CurrentDateTime>$([System.DateTime]::Now.ToString("yyyyMMdd_hh_mm_ss"))</CurrentDateTime>
              <Shares>\\fileshares.domain.com\Shares\</Shares>
              <SourceCodeShare>\BizTalk\Deployment\$(ProjectName)</SourceCodeShare>
              <PersonalShare>Z:\Projects\BizTalk\GLD\Samples\Deployment\$(ProjectName)</PersonalShare>
            </PropertyGroup>
          
            <!-- Rename Current shares to the [CurrentDateTime]: -->
            <ItemGroup>
              <EnvironmentName Include="QA;STG;PROD"/>
            </ItemGroup>
            <ItemGroup>
              <CurrentShare Include="$(Shares)%(EnvironmentName.Identity)$(SourceCodeShare)" />
              <CurrentShare Include="$(PersonalShare)" />
            </ItemGroup>
          
            <Exec Condition="Exists('%(CurrentShare.Identity)\Current')"
                   Command='Rename "%(CurrentShare.Identity)\Current" "$(CurrentDateTime)"'/>
          
            <ItemGroup>
              <NewBuildToCopy Include="$(NewBuild)\**\*.*">
                <Destination>%(CurrentShare.Identity)</Destination>
              </NewBuildToCopy>
            </ItemGroup>
          
            <!-- Copy the last build to the Current shares: -->
            <Copy Condition="@(NewBuildToCopy) != ''"
                  SourceFiles="@(NewBuildToCopy)"
                  DestinationFiles="@(NewBuildToCopy->'%(Destination)\Current\%(RecursiveDir)%(Filename)%(Extension)')" />
          </Target>


This target can be part of the Deployment.btdfproj file (the project file from the BTDF Deployment project). You can also add it to the BizTalkDeploymentFramework.targets file.

          BizTalk: Custom API: Promoted Properties

How do we get the Promoted Properties from .NET code? This sample exposes a very simple API to access all Promoted Properties currently deployed to a BizTalk group. One possible way to build such an API is sketched below.
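The sample itself is not reproduced here, but one possible approach is sketched below. Every promoted property in a deployed property schema compiles into a class derived from Microsoft.XLANGs.BaseTypes.PropertyBase, so reflecting over such an assembly reveals the property names. The assembly name is a placeholder, and this is only an assumption about how the API could be built, not the original sample.

using System;
using System.Reflection;
using System.Xml;
using Microsoft.XLANGs.BaseTypes;   // reference Microsoft.XLANGs.BaseTypes.dll

class PromotedPropertyLister
{
    static void Main()
    {
        // Assumption: a property schema assembly deployed to the GAC under this (hypothetical) name.
        Assembly assembly = Assembly.Load(
            "GLD.Samples.PropertySchemas, Version=1.0.0.0, Culture=neutral, PublicKeyToken=1234567890abcdef");

        foreach (Type type in assembly.GetExportedTypes())
        {
            if (!typeof(PropertyBase).IsAssignableFrom(type) || type.IsAbstract)
                continue;

            // The generated property classes expose their qualified name via the Name property.
            var property = (PropertyBase)Activator.CreateInstance(type);
            XmlQualifiedName qname = property.Name;
            Console.WriteLine("{0} -> {1}#{2}", type.FullName, qname.Namespace, qname.Name);
        }
    }
}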

          The Best Application Server from Microsoft

-Are you stupid? The BizTalk Server is an Integration Server. It has nothing to do with Application Servers.

          That’s what you are probably thinking now…


          Application Types

          Let’s discuss different application types.

          Sequential Processing Application

These applications work as single-threaded processes. An application gets data portions and processes them one after another.

BatchProcessing

One example of this kind is a file processing application. The application reads a file and processes it, then it reads another file, etc. It starts a new file only after finishing the current one. It works with data in sequence; it processes one data portion and then starts on another.

The OS works as the application server for such applications. The applications are started by users or when the OS starts. A Windows Service is such an application. The OS provides additional management features like an automatic restart after a failure, etc.
image

           

          The Service [Application]

An example of this kind of application is a web site. The Application Server here is Internet Information Services (IIS). A Service Application is constantly waiting for requests from clients. A separate service instance is started for each client request.

          ShortRunningTransactions

If there are too many requests, the application server places the waiting requests in a queue, so the requests do not flood the server. For the sequential application only a single application instance is working at a time; here, for the service application, many application instances are working simultaneously, one instance per client request.

Usually a service does not store the client data in durable storage. If the service instance crashes for any reason, the client just repeats the request and a new service instance is created. Reliability is good enough if the service instance works for a short period of time. But if a service instance works for hours or days, the possibility of a crash increases and reliability is not good anymore.

          The Long-Running Service

In the simplest case a service instance gets a client request, processes it, returns the result data, and that's it. A slightly more complex case is when a service instance itself requests data from an external service. In this case the life duration of this service instance is unpredictable. We cannot manage the external services; we don't know when they will respond. Our service instance can possibly wait hours and days, so the application server can move a snapshot of the service instance to durable storage and move it back when the result from the external service arrives. This process is called dehydration-rehydration, and it increases the service reliability. It also frees the computer resources, so a large number of dehydrated service instances does not clog the computer.

          LongRunningTransactions

Another problem with long-running services is request correlation.

Everything is simple if a new independent service instance is created for every client request: a service instance processes a request and returns a result to the client. But now the service instance itself requests an external service. Imagine hundreds of service instances working simultaneously, all of them sending requests to an external service. The external service processes these requests and returns the responses. As a result we have hundreds of service instances and hundreds of responses. How do we route a response to the right service instance, the one that requested exactly this response? Each response must carry some information which links it exactly to the correct service instance. This process of linking is called correlation. For example, the service instance Id is included in the request, then this Id is copied into the response and is used for routing this response back, to correlate it with the right service instance.

The correlation data should be unique, so that each response can be correlated with the corresponding service instance.

Ideally the application server provides the correlation mechanism. A toy sketch of the idea is shown below.
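Just to illustrate the idea, here is a toy C# sketch (not BizTalk code): the correlation Id is the key that routes a response back to the one instance that is waiting for exactly that response.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Correlator
{
    private readonly ConcurrentDictionary<Guid, TaskCompletionSource<string>> _pending =
        new ConcurrentDictionary<Guid, TaskCompletionSource<string>>();

    // Called by a service instance before it sends a request to the external service.
    public Task<string> RegisterRequest(Guid correlationId)
    {
        var tcs = new TaskCompletionSource<string>();
        _pending[correlationId] = tcs;
        return tcs.Task;   // the instance waits here until the matching response arrives
    }

    // Called when any response arrives; the Id correlates it with the right instance.
    public void OnResponse(Guid correlationId, string responseBody)
    {
        TaskCompletionSource<string> tcs;
        if (_pending.TryRemove(correlationId, out tcs))
            tcs.SetResult(responseBody);
        // else: an orphaned or late response - nobody is waiting for it anymore
    }
}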

Another problem with long-running services is the "dead instance cleanup". For example, a service instance makes a request to an external service and doesn't get a response back within a predefined period of time. The external service could be down, the network could be overloaded, etc.; there can be many reasons in the distributed world. In another example, the service instance hangs and does not respond in any way. It would be clever to declare such service instances "dead" and remove them from memory.

One popular method for dead instance cleanup is the "heartbeat". A special heartbeat service periodically pings all working service instances, and if some instance does not respond, it is pronounced dead and removed from memory.

Requests to the external services can return errors for different reasons. Sometimes the external service returns a legitimate business error and we cannot do anything about it: the request is processed with an error and it will be an error again if we repeat the request. But most frequently the error is temporary. For example, the network is overloaded, the database is overloaded, the I/O queue is overcrowded, the external service is busy, etc. In such cases the request can be repeated after a small delay, and after several retries we get a good response back. The error is raised only if the request has been retried too many times.

So we have a list of additional functionality for an application server that works with long-running services:

          • a durable storage for the service instances
          • a correlation mechanism
          • a system for monitoring and cleanup of the dead instances
          • a retry system for the requests to the external systems


And it happens that Microsoft has such an application server in its arsenal. It is the BizTalk Server.

The BizTalk database, which is called the MessageBox, acts as the reliable data storage. A service instance can dehydrate into this storage and wait there for an infinite time. Thousands, even hundreds of thousands of service instances can wait, and the performance of the whole system will not decrease.

The correlation mechanism is part of the MessageBox. Requests are analyzed and routed to the service instance that is waiting for exactly this request; otherwise a request starts a new service instance.

There are heartbeat and dead instance cleanup mechanisms.

There is a retry system which automatically resends requests to the external services.

And here is the main feature of the BizTalk Server, its famous reliability. The BizTalk Server processing power is spread across several computers and data is stored on SQL clusters. Deployed applications work for years without any attention. The power grid could go down, the servers could crash, the network could fail, the hard drives could die. But the applications are not affected. The BizTalk Server restarts automatically; hundreds or thousands of interrupted service instances are restored and work as if nothing had happened.

          Upgrade an Application from the BizTalk Server 2010 to 2013. One error

I was moving projects from BizTalk Server 2010 to BizTalk Server 2013 under Visual Studio 2012.

I spent a good chunk of time investigating this error and decided to blog about it to save you that time.

          The error happens at deployment time at this command:

          BTSTask.exe AddResource -Type:BizTalkAssembly -Source:"..\<ApplicationName>\bin\Debug\<ApplicationName>.dll" -ApplicationName:"<ApplicationName>" -Options:GacOnAdd,GacOnImport,GacOnInstall

or when I deploy a BizTalk project from Visual Studio 2012.

          error DEPLOY: Access to the path 'C:\Users\…\AppData\Local\Temp\2\BT\PID3656\BizTalkAssembly\c871b7bbc5c2ac36b7da3592c65912d5\<ApplicationName>.dll' is denied.

This error was coupled with another error:

          error MSB3073: The command "BTSTask.exe AddResource -Type:BizTalkAssembly -Source:"..\<ApplicationName>\bin\Debug\<ApplicationName>.dll" -ApplicationName:"<ApplicationName>" -Options:GacOnAdd,GacOnImport,GacOnInstall" exited with code 1. […\<ApplicationShortName>\Deployment\Deployment.btdfproj]

          Do you think it is something with permissions? Access is denied, right?

          Nope. Nothing was wrong with permissions.

The problem was in the AssemblyInfo.cs file, which is under the Properties folder of the project. For some reason this file was not stored in source control and was not moved to the upgraded project.

          To resolve this error create this file and… That’s it.

          [UPDATE] Not so easy, not so easy...

Not just any AssemblyInfo.cs will work! Make sure it has this line:

          [assembly: Microsoft.XLANGs.BaseTypes.BizTalkAssembly(typeof(Microsoft.BizTalk.XLANGs.BTXEngine.BTXService))]
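For reference, a minimal AssemblyInfo.cs could look like the sketch below; the title, version and GUID values are placeholders for your own project, and the last attribute is the one that matters for this error.

using System.Reflection;
using System.Runtime.InteropServices;

[assembly: AssemblyTitle("GLD.Samples.UpgradedApplication")]      // placeholder
[assembly: AssemblyVersion("1.0.0.0")]
[assembly: AssemblyFileVersion("1.0.0.0")]
[assembly: ComVisible(false)]
[assembly: Guid("00000000-0000-0000-0000-000000000000")]          // placeholder

// The line the upgraded BizTalk project cannot live without:
[assembly: Microsoft.XLANGs.BaseTypes.BizTalkAssembly(typeof(Microsoft.BizTalk.XLANGs.BTXEngine.BTXService))]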

          BizTalk: the Naming Conventions in Examples

          See the BizTalk: BizTalk Solution Naming Conventions article.

In a small application we do not really care about names. When the number of objects starts growing, we start paying attention to the names. Experienced developers working on big projects are recognizable by the carefully crafted names in their code.

When a new developer starts working in a new team, he/she spends the first several hours reading the general-level documentation, and the naming convention is usually part of it. Then he/she starts to develop, and now time should be spent reviewing the existing code base to become accustomed to the coding standards, and again the naming conventions are the main part of them.

So the naming conventions document is important, but the existing code base is also important. The documentation might not be up to date, but the code always is.

A code example could easily replace the documentation; that is why I decided to show the BizTalk naming conventions in real-life examples.

          Example description

This BizTalk deployment is composed of many BizTalk applications. Each application has 1-10 Visual Studio projects. The applications were developed over many years by different development teams. The applications integrate several systems. Some systems have just one interface, some systems have many interfaces. In most cases one system interface is integrated with another system interface. Sometimes several systems are integrated with one system interface.

One solution is chosen for this example. Why was the GLD.Samples.Names.Shared.Schemas solution created in real life?

As we know, an interface in BizTalk is defined by a protocol and one or several XML schemas. In many cases those schemas are dictated by the integrated system, not by the BizTalk application developer. Usually the BizTalk developer uses the Adapter Wizard to generate/import these schemas. These schemas are managed (created, modified) not by the BizTalk application developer but by the external system owners. Let's call these schemas the external schemas.

One interesting aspect of the external schemas is that one schema can be used by several BizTalk applications. As we know, the schemas for receive locations should be unique; they cannot be deployed in several assemblies. That means if we want to use the same schema in several projects, we should share an assembly with this schema and reference this assembly from all these projects.

In this situation special attention is paid to these schemas.

All external schemas are placed in a separate Visual Studio solution. The schemas are grouped by system. Inside each system the schemas are grouped by interface name.

This design gives us one more useful feature: we always know where to find the external schemas. They are not spread across many projects and applications but are always placed, and can be found, in one solution.

The sample provides two versions of the naming conventions. I called them "Long Names" and "Short Names".

          Long Names

          It is the Solution Explorer view:

          Solution.Long

          It is the folder view:

          Folders.Long

          Short Names

          It is the Solution Explorer view:

          Solution

          It is the folder view.

          Folders

          Comparing

The Short Name convention looks cleaner. There are no generic prefixes (GLD.Samples.Names.Shared.Schemas) inside the project and folder names. If you are fighting the well-known TFS limitation on the length of full file names, it would be the preferable variant.

I personally prefer the Long Name convention. The project name is equal to the project namespace and the project assembly name. The full project names are better in situations where a project might be included in several Visual Studio solutions or where projects are frequently shuffled between solutions.

Q&A:

1. Is it OK to use names composed of 4 or more pieces, like GLD.Samples.Names.Shared.Schemas?
2. Do we need the GLD.Samples prefix in the BizTalk application name and in the solution name?
3. Why do we use the _Ops suffix for the SystemA, SystemB, and SystemC projects?

          BizTalk Applications

          Here is the BizTalk Administration Console view:

          BizTalkAdminConsole

As you can see, the GLD.Samples.Names.Shared.Schemas solution is deployed as the Names.Shared.Schemas application. The GLD.Samples prefix is removed because all deployed solutions use this GLD.Samples prefix. If all artifacts have the same name part, this part can definitely be removed. Why is this part not removed from the solution/project name? Because the projects (assemblies) work together with many system, Microsoft, and other assemblies in one global namespace, in the .NET run-time space. All assemblies should be placed in the GAC, where the GLD.Samples prefix helps us find our assemblies. This is the answer to the second question. Smile

          This is the Short Name convention for the application names.

          The Long Name convention is simple:

the BizTalk Application Name equals the Visual Studio Solution Name

I personally prefer the Long Name convention as the faster way to work: copy/paste without any change, and no guessing the full names of the assemblies.

How complex can the name hierarchy be?

For example, we have a name such as GLD.Samples.Names.Shared.Schemas.Sap.AV_CREDIT_STATUS_MODIFY. Isn't it too complex?

Not at all. All parts of this name are here for a reason: each part shows the grouping of an object (the GLD.Samples prefix, the Names.Shared.Schemas solution, the Sap system, the AV_CREDIT_STATUS_MODIFY interface), so the names unambiguously show us the object hierarchy. Just try to drop some parts to get something shorter and you will be in trouble: you will end up with not one but numerous naming conventions and numerous rules for how to use them.

A hierarchical naming convention is the king of BizTalk projects. These projects are complex enough to need object grouping, hence hierarchy. But they are not so complex as to push us into the areas where hierarchical grouping does not work well and where keys or tags work better.

I hope this answers the first question.

Now the answer to the third question: why do we use the _Ops suffix for the SystemA, SystemB, and SystemC projects?

For instance, we have the GLD.Samples.Names.Shared.Schemas.SystemA.CreditCheck_Ops project. Why is it not just GLD.Samples.Names.Shared.Schemas.SystemA.CreditCheck?

The problem is that this project has a schema with several roots, and one of them is CreditCheck. What is wrong with that? Why can we not use the same name for a part of the project name and for a schema root?

The XML schema is compiled into .NET classes, and each schema root becomes one such class. So we would get a class with the fully qualified name GLD.Samples.Names.Shared.Schemas.SystemA.CreditCheck.CreditCheck. And while the project is building, we get an error like this:

          Error.SymbolXIsAlreadyDefined

The description would be a little bit different from the picture above: “symbol ‘GLD.Samples.Names.Shared.Schemas.SystemA.CreditCheck’ is already defined; the first definition is in assembly…”.

The root of the problem is that the same dotted name ends up meaning a class in one place and a namespace in another, and the build cannot resolve the ambiguity, which creates this kind of error.

To avoid this kind of error we have to avoid creating .NET classes whose names are equal to one of the namespace parts. The "_Ops" suffix is used for this purpose.
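
A minimal C# sketch of the collision (this is my illustration, not the code BizTalk actually generates; the class and namespace names only mimic the sample):

    // Generated from a schema in the ...SystemA project: the CreditCheck root becomes a class.
    namespace GLD.Samples.Names.Shared.Schemas.SystemA
    {
        public sealed class CreditCheck { }
    }

    // The types of a project named ...SystemA.CreditCheck live in a namespace whose fully
    // qualified name is exactly the same as the class above.
    namespace GLD.Samples.Names.Shared.Schemas.SystemA.CreditCheck
    {
        public sealed class CreditCheck { }
    }

    // Once both assemblies take part in the build, the dotted name
    // GLD.Samples.Names.Shared.Schemas.SystemA.CreditCheck is a class in one assembly and
    // a namespace in the other, which produces the "symbol ... is already defined" error
    // shown above. Renaming the project to CreditCheck_Ops removes the clash.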

          This is an implementation detail, but we have to take care of it in the naming conventions.

          Conclusion

There is no single naming convention for each and every situation. Be reasonable and do not create over-engineered conventions. They are not the goal of development but helpers.

Quiz for readers:

          • Propose your improvements to these naming conventions.
          • What are the Pros and Cons of placing the external schemas in a separate solution?
• This solution could become very big. Would this be a problem?

          BizTalk: Internals: Schema Uniqueness Rule

The source code can be downloaded from here.

Global artifacts are usually tricky things. Languages and tools have different methods to limit artifact visibility; think about public and private variables, for example. BizTalk usually limits the artifact visibility by the assembly (project) boundaries: for example, the port types and the correlation set types. The BizTalk applications were introduced as containers for artifacts, and they naturally limit the artifact visibility. Artifacts are not visible outside of an application by default.

But sometimes in BizTalk the artifact visibility is global. We can place the artifacts in different assemblies, and it does not limit the global visibility; we can place the artifacts in different BizTalk applications, and it does not limit the global visibility either.
I am talking about schemas. In my previous post, BizTalk: Internals: namespaces, I have shown that schemas have an additional parameter, the Xml [target] namespace. Why is it so important?
BizTalk receives messages on the Receive Locations. Theoretically, messages are just byte arrays. But BizTalk frequently has to do additional tasks, like changing the message format from one system format to another. In this case a message should be received in a well-known format. Without knowing the message format, BizTalk can perform neither message validation nor message transformation. (The internal BizTalk message format is Xml or a byte array; here I am talking about the messages in the Xml format.)
So the Xml messages are received by the Receive port as byte arrays and have to be parsed into the Xml format. The XMLReceive pipeline makes this transformation. The first thing the XMLReceive pipeline does is search for an Xml namespace and a root node. These two values form the MessageType property, which is promoted into the message context.
BizTalk then uses the MessageType to find the schema for this message. This Xml schema is used to parse the whole message into the Xml format.
So the first step is to find an Xml namespace and a root node inside the byte array, regardless of anything else; the second step is to find the Xml schema in the schema repository; then this schema is used to parse the whole message into the Xml format.
Now we are discussing the second step: how BizTalk searches for the right schema.
BizTalk searches through the whole schema list regardless of the application boundaries. Each schema belongs to one of the BizTalk applications, but BizTalk ignores this. The global schema list is stored inside the BizTalk management database. When we deploy a project with a schema, the schema is registered in this database.
When an inbound message is received and processed by the XMLReceive pipeline, BizTalk extracts a MessageType (a namespace + a root node) and searches for this MessageType in the management database. An error is raised if the MessageType is not found. An error is also raised if more than one schema with this MessageType is found.
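
To make these two steps concrete, here is a rough C# sketch (my illustration, not the actual BizTalk code; the class MessageTypeDemo and both method names are made up):

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Xml;

    static class MessageTypeDemo
    {
        // Step 1: find the Xml namespace and the root node; joined with '#' they form the MessageType.
        public static string GetMessageType(Stream inbound)
        {
            using (var reader = XmlReader.Create(inbound))
            {
                reader.MoveToContent();                              // positions the reader on the root element
                return reader.NamespaceURI + "#" + reader.LocalName; // e.g. "http://MyCompany/CreditCheck#CreditCheck"
            }
        }

        // Step 2: look the MessageType up in the global schema list
        // (deployedMessageTypes stands in for the management database).
        public static void CheckUniqueness(IEnumerable<string> deployedMessageTypes, string messageType)
        {
            int matches = deployedMessageTypes.Count(mt => mt == messageType);
            if (matches == 0)
                throw new InvalidOperationException("Document specification not found: " + messageType);
            if (matches > 1)
                throw new InvalidOperationException("Multiple schemas matched the message type: " + messageType);
            // Exactly one match: that schema is used to parse and validate the whole message.
        }
    }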

An important note: the schema uniqueness rule is not verified at deployment time.
I have created several samples to show how it works. These samples are bare-bones projects, focused on the “schema uniqueness” rule.

Identical schemas in one BizTalk application

The application includes two projects. Each project consists of one schema.

          OneApp.VSSolution

Both schemas have the same structure and the same Xml namespace.
Schema1

The only difference is the .NET namespaces: each schema is placed in a different assembly.
OneApp.DeployedSchemas

I have created one receive port with one File receive location and the XMLReceive pipeline. A send port works as a subscriber to this receive port.

There is nothing else in this application: two identical schemas in different assemblies, plus the receive and send ports.

I dropped a test Xml file into the receive folder. What do you think happened? You are right, I got an error and two related events, 5753 and 5719.
OneApp.Events
OneApp.Events.5753
OneApp.Events.5719

It is the famous "Cannot locate document specification because multiple schemas matched the message type ###" error. :)

Conclusion: the schemas of the received messages should be unique across all BizTalk applications.

Identical schemas in different BizTalk applications

Now the two identical projects are placed in different BizTalk applications. (As you can see, I have broken the naming convention, which requires placing the BizTalk applications in different Visual Studio solutions. In real life there would be two Visual Studio solutions. Sorry for that.)

VSSolution
FirstApp.Schema1
SecondApp.Schema1

          The ports are the same as they are in the first sample.

When I dropped a test file, I got the same error as in the first sample. So the schemas are now placed in different applications, but this does not change the schema visibility: the schema from one application is visible from another application.

Conclusion: the schemas of the received messages should be unique even across different BizTalk applications. In other words, they should be unique across the whole BizTalk Server Group.

          Identical schemas imported by different schemas

Now let's ask a question: does the “schema uniqueness” rule apply to the imported schemas?

Here is a real-life example. When we consume WCF web-service metadata, it creates a schema set where we can see the Microsoft Serialization schema. It is the same schema in almost each and every WCF service. Do we have to take care of this schema if we consume several WCF services? Do we have to “fight” this schema and extract it into a Shared project, etc.?

This sample is the same as the previous sample with two additional schemas: Importing.xsd and Imported.xsd.
          VSSolution 

The two Importing schemas in the two applications are different: they have different Xml namespaces.
          ImportingSchema

But both Imported schemas are identical.
ImportedSchema

          The ports are the same as they were in the first sample.

          When I dropped a test file there was no error.

So the “root” schema namespaces, the namespaces which define the MessageType, were different, but the namespaces of the imported schemas were the same, and still the XMLReceive pipeline could successfully recognize the MessageType of the message.

Conclusion: the “schema uniqueness” rule works only for the schemas which define a MessageType of the received message; it does not apply to the imported schemas.

          Notes:

• Two parameters of the Imported schema are changed from their default values: AttributeFormDefault and ElementFormDefault. This forces the namespace prefixes of the Imported schema to be included in the Xml documents.

            ImportedQualifiedParam

A message with these parameters set to “Qualified”:
Xml.Qualified

A message with these parameters set to “Default”:
Xml.Unqualified
• Note the “typeOnlySchema” Root Name of the Imported schema. This schema does not have any root or any elements; it includes only types.
            TypeOnlySchema

• Note: the samples above are about the XMLReceive pipeline, but the same “schema uniqueness” rule also works for the XMLTransmit send pipeline.

          “Schema Uniqueness” Rule

The MessageType schema should be unique within the BizTalk Server Group if this schema is used by the XMLReceive or XMLTransmit pipelines.

          Is there a reason for the “Schema Uniqueness” rule?

          Is there a good reason for this rule?

It looks like the reason for this “schema uniqueness” rule is simple. The BizTalk application was introduced only in BizTalk Server 2006. At that time Microsoft decided not to change the XMLReceive and XMLTransmit pipelines to embrace the new application concept, and since then this feature has not been on the priority list of BizTalk development.

I would prefer the schema visibility to be limited to the BizTalk application. It would greatly simplify schema and service versioning.

Let's take an example where our partner adds a new version of a service. The service schemas themselves are different in the two versions (several nodes were removed, several nodes were added), but the schema namespaces and roots are the same in both. The new and old services work side-by-side, and we have to consume both service versions. If we could limit the schema visibility to the application boundaries, creating the BizTalk applications that consume both service versions would be simple. Now it is cumbersome. I am sure each experienced BizTalk developer can remember several real-life examples where this “schema uniqueness” rule was the reason a several-hour project took several days instead.


          BizTalk: Internals: the Partner Direct Ports and the Orchestration Chains

The Partner Direct Port is one of the BizTalk hidden gems. It opens simple ways to implement several messaging patterns.

This article is based on Kevin Lam's blog article. That article is pretty detailed, but it still leaves several unclear pieces. So I have created a sample and will show how it works from different perspectives.

          Requirements

We should create an orchestration chain where the messages are routed from the first stage to the second stage. The messages should not be modified. All messages have the same message type.

          Common artifacts

          Source code can be downloaded here.

Interestingly, all orchestrations use only one port type.
Schema1PortType

This is possible because all ports are one-way ports and use only one operation.

I have added a B orchestration. It helps to test the sample by showing all test messages in the channel.

          Orch.B

The Receive shape Filter is empty. The Receive Port (R_Shema1Direct) is a plain Direct Port. As you can see, the subscription expression of this direct port has only one part, the MessageType of our test schema:
B.Subscription
The Filter is empty but, as you know, the link from the Receive shape to the Port creates this MessageType expression.

I use only one physical File receive port to feed a message to all processes.

          Each orchestration outputs a Trace.WriteLine(“<Orchestration Name>”).

          Forward Binding

          This sample has three orchestrations: A_1, A_21 and A_22.

          A_1 is a sender, A_21 and A_22 are receivers.

Here is the subscription of the A_1 orchestration:
A_1.Subscription

It has two parts:

• A MessageType. The same as for the B orchestration.
• A ReceivePortID. There was no such parameter for the B orchestration. It appears because I have bound the orchestration port to the physical File receive port. This binding adds the ReceivePortID parameter to the subscription (a rough illustration of the resulting subscription follows this list).
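
Roughly, the resulting A_1 activation subscription looks like this (the schema namespace, root name, and port GUID below are made-up placeholder values):

    http://schemas.microsoft.com/BizTalk/2003/system-properties.MessageType == http://GLD.Samples.Test#Root
    And
    http://schemas.microsoft.com/BizTalk/2003/system-properties.ReceivePortID == {00000000-0000-0000-0000-000000000000}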

          All

          How to set up the ports?

1. All ports involved in the message exchange should be of the same port type. This forces us to use the same operation and the same message type for the bound ports.
2. This step is absolutely counter-intuitive. We have to set the Partner Orchestration parameter for the sending orchestration, A_1. The first strange thing is that it is not a partner orchestration we have to choose but an orchestration port. The strangest thing is that we have to choose exactly this orchestration and exactly this port: not a port of the partner (receiving) orchestrations, A_21 or A_22, but the A_1 orchestration and its S_SentFromA_1 port.
3. Now we have to set the Partner Orchestration parameter for the receiving orchestrations, A_21 and A_22. Nothing is strange here except the parameter name. We choose the port of the sender: the A_1 orchestration and its S_SentFromA_1 port.

As you can see, the Partner Orchestration parameter is the same for the sender and the receiver orchestrations.

          Testing

I dropped a test file into the file folder. There we go:
DebugView.Forward

1. The dropped file was received by B and by A_1.
2. A_1 sent a message forward.
3. The message was received by B, A_21, and A_22.

Let's look at the context of the message sent by A_1 at the second step:

• A MessageType part. It is quite expected.
• A PartnerService, a PartnerPort, and an Operation. All those parameters were set up by the Partner Orchestration parameter on both bound ports.
A-1.MessageSent.Context


Now let's see the subscription of the A_21 and A_22 orchestrations:
A_2x.Subscription

          Now it makes sense. That’s why we have chosen such a strange value for the Partner Orchestration parameter of the sending orchestration.

          Inverse Binding

          This sample has three orchestrations: A_11, A_12 and A_2.

          A_11 and A_12 are senders, A_2 is receiver.

          All

          How to set up the ports?

1. All ports involved in the message exchange should be of the same port type. This forces us to use the same operation and the same message type for the bound ports.
2. This step is absolutely counter-intuitive. We have to set the Partner Orchestration parameter for the receiving orchestration, A_2. The first strange thing is that it is not a partner orchestration we have to choose but an orchestration port. The strangest thing is that we have to choose exactly this orchestration and exactly this port: not a port of the partner (sending) orchestrations, A_11 or A_12, but the A_2 orchestration and its R_SentToA_2 port.
3. Now we have to set the Partner Orchestration parameter for the sending orchestrations, A_11 and A_12. Nothing is strange here except the parameter name. We choose the port of the receiver: the A_2 orchestration and its R_SentToA_2 port.

          Testing

I dropped a test file into the file folder. There we go:
DebugView.Inverse

1. The dropped file was received by B, A_11, and A_12.
2. A_11 and A_12 sent two messages forward.
3. The messages were received by B and A_2.

Let's see the context of a message sent by A_11 or A_12 at the second step:

• A MessageType part. It is quite expected.
• A PartnerService, a PartnerPort, and an Operation. All those parameters were set up by the Partner Orchestration parameter on both bound ports.
A-1x.MessageSent.Context

Here is the subscription of the A_2 orchestration:

          A-2.Subscription

          Models

I had a hard time trying to explain the Partner Direct Ports in simple terms. I ended up with this model:

          Forward Binding

          Publisher/Sender doesn't know Subscribers/Receivers. Subscribers/Receivers know a Publisher/Sender.
          1 Publisher –> M Subscribers

          Inverse Binding

          Publishers/Senders know a Subscriber/Receiver. Subscriber/Receiver doesn't know Publishers/Senders.
          M Publishers –> 1 Subscriber

          Notes


          Orchestration chain

It's worth noting that the Partner Direct Port binding creates a chain that is open on one side and closed on the other.

The Forward Binding: a new Receiver can be added at run time. The Sender cannot be changed without design-time changes in the Receivers.

The Inverse Binding: a new Sender can be added at run time. The Receiver cannot be changed without design-time changes in the Senders.