As many of my readers will know, I've been doing a lot of work around hybrid integration solutions over the last few years involving Windows Azure Service Bus and various other technologies. One of the challenges which comes up in any architecture is how to manage and implement logging. If you consider that we are now often building globally distributed applications across data centers which we own and data centers which we rent from cloud providers, this logging challenge becomes even harder.
With this architecture in mind and all of the possibilities that the cloud gives us, let's consider and play around with the idea of a globally capable logging solution. My first thought on this is that we want a few requirements:
- I would like to be able to publish audit events, which are things that need to be reliably delivered
- I would like to be able to publish logging events where I can live with the occasional loss of messages
- I would like to be able to configure the logging in my applications so that I can control how much is logged centrally
- I would like to be able to keep some logging just in the client application
- I want to offer an interoperable approach so that non .net applications would be able to log messages.
Let's get started
In this imaginary solution, let's consider a scenario where Acme is the main business and their business partner calls an API they host in the cloud, which uses Windows Azure Service Bus to bridge to their on premise BizTalk instance. BizTalk gets the message, transforms it and calls into the on premise line of business application, which processes an order and confirms to BizTalk that it's complete.
The solution will look something like this.
When we consider the logging requirements for this solution it will now look something like this.
You can see that the abstracted logging system means that all applications could have the ability to push logging information up to the logging system. This means that we want a logging system which exposes an interoperable way for applications to send it messages, and which is hosted in a place where it can be reached by both Acme and their partners.
How could we build the Logging/Auditing System
First we need a store for the logging and auditing events which is capable of holding a lot of data. A No-SQL type of database could be quite a good choice here. The data doesn't really need to be relational and it's a pretty simple structure. Since we want to accept messages from inside and outside of the organization we can host this in the cloud. Let's say for argument's sake we would like to use a PaaS offering, so let's choose an Azure Table Storage account. It's pretty cheap to store the data here, which is great.
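To make the table store idea a little more concrete, here is a sketch of what a log entity and its keys might look like. The key scheme below is an assumption on my part (a common Table Storage pattern, not something prescribed here): partition by application and day, and use an inverted timestamp in the row key so the newest events come back first.

```python
import uuid
from datetime import datetime, timezone

# Upper bound used to invert timestamps so that newer events sort first
# when Azure Table Storage returns rows in ascending RowKey order.
MAX_MILLIS = 10**14

def make_keys(application, logged_at=None):
    """Build a PartitionKey/RowKey pair for a log entity.

    PartitionKey groups events by application and day, keeping one day's
    logs for one app in a single partition; RowKey uses an inverted
    timestamp plus a GUID so the newest events sort first and keys
    never collide.
    """
    logged_at = logged_at or datetime.now(timezone.utc)
    millis = int(logged_at.timestamp() * 1000)
    partition_key = "%s_%s" % (application, logged_at.strftime("%Y%m%d"))
    row_key = "%014d_%s" % (MAX_MILLIS - millis, uuid.uuid4())
    return partition_key, row_key

def make_entity(application, level, message, correlation_id=None):
    """A minimal log entity; the property names are illustrative."""
    pk, rk = make_keys(application)
    return {
        "PartitionKey": pk,
        "RowKey": rk,
        "Level": level,
        "Message": message,
        "CorrelationId": correlation_id or "",
    }
```

The inverted-timestamp trick matters because Table Storage only sorts by key; with it, "show me the latest errors for this app" becomes a cheap range query instead of a full scan.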
Next we need to think about how to get messages into this data store. Well, we could just use a key and give applications access to the table directly via the REST API. Yep, that's easily doable, but it would make the rest of this article a bit boring and we would lose some control over the information we would like the client to send. Instead we will sit Windows Azure Service Bus in front of the table store. The clients will send messages to Windows Azure Service Bus and we will have something which can then process the messages from there into the Azure table.
The benefits of putting Windows Azure Service Bus in front of the table include:
- We will be able to offer a more interoperable interface for clients, supporting REST, AMQP, NetTcp and WCF
- We will be able to use a Topic to provide a level of filtering of messages. This could reduce the amount of data in the central store and could allow us to turn up and down what we accept centrally
- We can filter and route logging information to different data stores. For example we could send audit messages to one table and debug messages to another
- We can provide different security access for different applications. For example each application could submit to different queues
- Azure Service Bus will allow us to filter the different types of log messages to subscriptions which we could process at different speeds. For example we could process error and audit events from a priority queue with lots of processors to get the events into the database as quickly as possible, while debug events could be processed much more slowly
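To show what that topic-based filtering buys us, here is a small simulation of subscriptions with level-based rules routing messages by their properties. The subscription names and rules are made-up examples; a real topic would express these as SqlFilter rules on each subscription.

```python
# Each subscription owns a predicate over the message's user properties,
# standing in for the SqlFilter rules a real Service Bus topic would hold.
SUBSCRIPTIONS = {
    # High-priority lane: errors and audit events, processed fast
    "priority": lambda p: p.get("Level") in ("Error", "Audit"),
    # Slow lane for the debug/info noise
    "diagnostics": lambda p: p.get("Level") in ("Debug", "Info"),
    # Per-application routing example
    "partner-a": lambda p: p.get("Source") == "PartnerA",
}

def route(properties):
    """Return the subscriptions whose filter matches the message
    properties; like a real topic, one message can land in several
    subscriptions at once."""
    return sorted(name for name, rule in SUBSCRIPTIONS.items()
                  if rule(properties))
```

Turning central logging "up and down" then becomes a matter of changing subscription rules, with no redeployment of the client applications.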
Now that we have the log messages in Windows Azure Service Bus, the next question is how to get them to a permanent data store. A Windows Azure Worker Role would be a good choice of host for a background queue processing component. This worker role could poll a number of queues or subscriptions and then save messages into the Azure table store we described above. We may consider whether we store all messages in one table or store messages in different tables, such as an audit messages table and a logging messages table. Either way, the Azure Table Storage account can, with just a tick box, be geo-replicated, giving us the benefit of the data being backed up to another data center.
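The shape of that worker role's processing loop could be sketched roughly like this. The queue and table below are in-memory stand-ins so the loop's structure is visible; a real implementation would use the Service Bus and Table Storage SDKs, and would complete each brokered message only after the table write succeeds.

```python
import queue

def drain(subscription, table, batch_size=10):
    """Pull messages off a subscription-like queue and persist them as
    table entities; returns the number of messages stored. A worker
    role would run this in a loop per subscription, so the priority
    subscription could be drained by many instances while the debug
    subscription is drained by one.
    """
    stored = 0
    while stored < batch_size:
        try:
            message = subscription.get_nowait()
        except queue.Empty:
            break
        table.append(message)  # stand-in for an Azure Table insert
        stored += 1
    return stored
```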
After the core logging capability was in place we could then consider how we would manage and use the information we capture. There are really two sides to the information:
- Operations Data
- Analysis Data
In the space of operations we would be considering what kind of information could be used for troubleshooting and reactively responding to support queries. We could also be looking for information which could be used proactively to identify operational issues, with Azure Notification Hubs being a great way to get alerts out to the people who need to be aware. Building a custom dashboard hosted in an Azure Website or Web Role would be a good way to give your operators access to this data. Operators would be able to correlate log messages across applications and troubleshoot the flow of a specific transaction across systems.
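That cross-application correlation could be as simple as grouping stored events by a shared correlation id. A sketch, assuming each stored event carries CorrelationId and Timestamp properties (property names are my assumption):

```python
from collections import defaultdict

def transaction_view(entities):
    """Group log entities by CorrelationId and order each group by
    timestamp, giving an operator the end-to-end flow of a single
    transaction across the API, BizTalk and the LOB application."""
    groups = defaultdict(list)
    for entity in entities:
        groups[entity["CorrelationId"]].append(entity)
    for events in groups.values():
        events.sort(key=lambda e: e["Timestamp"])
    return dict(groups)
```

The catch, of course, is that every participating application has to stamp the same correlation id onto its log events, which is a design decision to make up front rather than retrofit.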
The next obvious capability would be around analysis and in particular how could I gain useful insights into this logging information. There are many evolving cloud based business intelligence tools and in a logging system like this you could potentially build up a lot of data over time. One of the big benefits of the cloud is that you have the expandable compute power to burst the analysis of large amounts of data so you would be able to ask deep probing questions of your logging and auditing data across your applications, potentially across your global enterprise and also your partners.
When we consider all of these capabilities, the solution for the logging system might look something like the below:
Another benefit of this model of hosting in the cloud is that we could have multiple Service Bus instances in different Azure data centers and let our applications or partners log to the one that is most convenient for them, and by using a partitioned queue they would have a resilient queue to send to.
Integrating the Applications into the Logging System
Once we have the conceptual logging system in place, with its capabilities to give us great insight into our hybrid solutions on a global scale, we now need to consider how we might integrate applications into it. As I mentioned earlier in the article, one of the benefits of using Azure Service Bus is the ability to expose a number of different standards based interfaces in addition to some optimized ones for .net. With Azure Service Bus we will have AMQP and REST interfaces which should support an easy way to interop with most applications. They would just need to send a correctly formatted message along with any appropriate headers.
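For a non .net client the REST route boils down to an HTTP POST with a valid token in the Authorization header. The shape of a Service Bus shared access signature is well documented, and a sketch of generating one looks like this (the namespace URI and key name are made-up example values):

```python
import base64
import hashlib
import hmac
import time
import urllib.parse

def make_sas_token(resource_uri, key_name, key, ttl_seconds=3600):
    """Build a Service Bus SharedAccessSignature token: an HMAC-SHA256
    signature over the URL-encoded resource URI and an expiry
    timestamp, signed with the shared access key."""
    expiry = int(time.time()) + ttl_seconds
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    string_to_sign = "%s\n%d" % (encoded_uri, expiry)
    signature = base64.b64encode(
        hmac.new(key.encode("utf-8"),
                 string_to_sign.encode("utf-8"),
                 hashlib.sha256).digest())
    return ("SharedAccessSignature sr=%s&sig=%s&se=%d&skn=%s"
            % (encoded_uri,
               urllib.parse.quote_plus(signature.decode("utf-8")),
               expiry,
               key_name))
```

The client would then POST its log message to the queue or topic's messages endpoint with this token in the Authorization header, carrying the log properties as broker properties or custom headers.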
For .net applications we then can integrate the use of the Service Bus SDK.
Cloud API & .net Applications
As I've just mentioned for a .net application you could integrate with the use of the Service Bus SDK, but many organizations use logging components like log4net in their custom developed solutions so I did another article where I experimented with the idea of writing a log4net appender which would be capable of publishing log events to Windows Azure Service Bus. If you're interested in the detail of that then please check out the article on my blog called Log4net Service Bus Appender.
Using this appender gives you an easy way to configure the log events which you would like your .net application to publish to the centralized logging system. Perhaps you want to publish all events, or perhaps just events of a specific logging level. It would also encapsulate the details of the logging behind the normal log4net interface and have the benefit of being able to optionally publish in an asynchronous fashion.
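The configuration for such an appender might look something like the fragment below. The appender type name and its properties here are my assumptions based on the description above, not the actual names from the blog post mentioned; the second appender shows the "keep some logging just in the client application" requirement alongside it.

```xml
<log4net>
  <!-- Hypothetical Service Bus appender: only WARN and above leave the
       application; names and properties are illustrative -->
  <appender name="ServiceBusAppender"
            type="Acme.Logging.ServiceBusAppender, Acme.Logging">
    <threshold value="WARN" />
    <connectionString value="Endpoint=sb://acme.servicebus.windows.net/;..." />
    <topicName value="logging" />
    <asynchronous value="true" />
  </appender>
  <!-- Everything, including DEBUG, still goes to a local rolling file -->
  <appender name="LocalFileAppender"
            type="log4net.Appender.RollingFileAppender">
    <file value="logs\app.log" />
    <appendToFile value="true" />
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%date %level %logger - %message%newline" />
    </layout>
  </appender>
  <root>
    <level value="DEBUG" />
    <appender-ref ref="ServiceBusAppender" />
    <appender-ref ref="LocalFileAppender" />
  </root>
</log4net>
```

Turning central logging up or down then becomes a one-line threshold change in configuration rather than a code change.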
If you weren't using the log4net interface you could simply publish your own message to the Azure Service Bus, and as long as it meets the serializable contract and contains the expected properties it would be processed fine.
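As an illustration, the message body could be a simple serializable shape like the following. The exact property names are an assumption; whatever contract the worker role deserializes against would define the real one.

```python
import json

def make_log_message(application, level, message, correlation_id):
    """A minimal log event body that any client, .net or otherwise,
    could serialize and send to the Service Bus endpoint. Property
    names here are illustrative, not a published contract."""
    return json.dumps({
        "Application": application,
        "Level": level,
        "Message": message,
        "CorrelationId": correlation_id,
    })
```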
In the BizTalk part of this demo solution you have a couple of choices when wanting to publish logging events to the central system. Many organizations who use BizTalk also use log4net in their solution so it would be possible to just implement this in the same way you would for a .net solution. Another option would be to publish messages to the message box with information that could be mapped to the audit event data type and then publish them to Azure Service Bus using the SB-Messaging adapter which comes with BizTalk.
The on premise line of business (LOB) application is most likely to be the difficult one to integrate into this solution. It really depends upon the capability to extend the application. In some applications you can add extension points to their workflow processes where you could perhaps make a REST call out to the Azure Service Bus to add information. Alternatively if you had less capability then you could just take advantage of using BizTalk to log before and after the interaction with the LOB application. This would mean you lose some insight into what is happening in the LOB application but you at least have options depending upon its capabilities.
What about devices?
If your solution includes devices, there is no reason you couldn't develop the ability to send background REST calls to something like Azure Mobile Services, which could then send information to the Azure Service Bus for you. Alternatively, you may already have used Azure Mobile Services as the application platform for your mobile development; in this case things get easier again and you could send messages from your mobile services API to the logging system. The below picture shows what this may look like:
What about costs?
One of the cool things about this kind of solution is the potential to control the cost per use. Using the configuration knobs in the log4net configuration and BizTalk configuration, your applications could be quite specific about what data you wanted to send to the centralized logging system based on the Level property. Even if you were streaming quite a lot of data you would still be able to keep control of the costs through the filter rules. You might use the subscription properties to only accept messages from certain applications or certain levels.
If you were to build the full solution, including the dashboard and various reporting options, I can imagine you would need to think a lot more about the cost aspect. But one of the big benefits of this approach is that with the log4net appender I mentioned earlier, a table storage account and a worker role component which wouldn't be difficult to implement, I bet you could very easily get a prototype of this solution working. I would also expect that from this quick-to-demo prototype you would get a lot of interest in your organization in the ability to get this holistic view of instrumentation across systems and business units. If you decided not to take it any further then you could just remove the log4net configuration from your applications and it's gone.
What about Application Insights?
If you are a follower of Visual Studio Online you will have seen the developing Application Insights capability which is currently in preview, and you may be wondering if this is the same thing. Based on what seems to be available in Application Insights at present I don't view these as the same thing, although there is some overlap. In Application Insights you will be using agents to push instrumentation based information through to your Visual Studio Online instance. This is really useful information about the performance of your application, but it comes from the lower level end: devices, servers and some detail about your application. In this type of solution I'm thinking about a slightly different angle, based on the following:
- I am thinking specifically about processes and transactions that span across applications. I want to bring this information about the process execution together to gain insights into it
- I am more interested in the human/business readable type of information such as "this key bit of logic did this" rather than "this is the level of the CPU usage"
I think that there is a small degree of overlap in how support operators could use a centralized logging capability along with what Application Insights will eventually offer, but that should be a complementary overlap which allows them to better support cross business hybrid solutions, which, let's face it, is a difficult thing to do.
Taking this approach further
In my fictitious example, if I have many applications and my integration platform all capable of logging audit and diagnostic messages into my central cloud logging store, then I can begin to get a good operational overview of how my applications are working. Taking that to the next level, I would be able to look at taking advantage of the data processing capabilities in the cloud to get some interesting insights into my application and business process data. Earlier I alluded to the options for using SQL Reporting and HDInsight to analyze some of the data, and thinking about it, if you went further and built a pretty interface and some good reporting you wouldn't be far from a Business Activity Monitoring solution. You could also build a pretty visualization of how the process and logging have flowed across applications, in something like the BizTalk 360 Graphical Message Viewer but at a level higher than just BizTalk.
Using the topics in Windows Azure Service Bus as described earlier for some of the inbound logging events you could even create some rules to push out business process notifications via Windows Azure Notification Hubs and start to think about complex event processing opportunities.
In conclusion, as we have developed more hybrid integration solutions, the challenge of how we support them has become greater. It now involves support teams from our organization and other business units around the world, and in many cases also support teams from our partners. This complex architecture makes it difficult for people to understand what is going on, and a centralized logging capability at a global scale becomes an obvious requirement. If we think through what we need, like I have tried to do above, then a logging capability also becomes a great candidate for a Business Activity Monitoring (BAM) solution. At the recent BizTalk Summit Microsoft announced their plans for a BAM offering on Azure at some future point, and that is something which excites me quite a lot. When the BAM offering comes out from Microsoft, we need it to have the following:
- It needs to be simple for all applications to plug into it not just BizTalk or BizTalk services
- We need to think about how we can have flexible processes where information can be brought together while also supporting the process changing. We don't want the business activity model tightly coupled to the implementation. This is where I hope the Hadoop and Business Intelligence capabilities will allow us to be much more flexible.
- We need to be able to store huge amounts of data and get data from all over the world
I have high hopes for the BAM module when it comes out but in the meantime hopefully this article provides some food for thought as to what you could do if you wanted to create a centralized logging system with the capabilities available on Azure today.