Last Monday, Microsoft used the TechEd keynote to announce their plans to include a Complex-Event Processing (CEP) engine in SQL Server 2008 R2. Technical previews are expected later this year, hopefully by summer (I have heard some doubts expressed about this), with the launch of the product planned sometime next year.
Richard Seroter reported that the session on the CEP engine was sparsely attended on Tuesday. I’m not surprised. Although well understood within certain verticals – perhaps most notably by financial institutions involved in algorithmic trading – the term ‘CEP’ has yet to make a broad impact across the IT industry. For my part, I believe the technology deserves wider attention, and may yet prove far more significant than many might imagine.
My inner geek longs to discuss the more arcane aspects of the technology. However, without access to the code, I only have a couple of papers from Microsoft Research to go on, plus some sight of more recent unpublished documentation. Between Richard (see http://seroter.wordpress.com/2009/05/12/teched-2009-day-2-session-notes-cep-first-look/) and Marc (see http://magmasystems.blogspot.com/2009/05/microsoft-cep-part-2.html), I think the basic shape of the technology has been described to the level of detail that is appropriate at this time. It will be very interesting to dig deep into the relationship between the LINQ programming approach supported by the engine and the underlying CEDR (Complex-Event Detection and Response) technology on which the engine is founded. However, that will have to wait. In this article, I want, instead, to discuss the nature of CEP, the role the engine could play within Microsoft’s platform, and how event processing and the detection of complex events relate to the more familiar territory of message brokerage, business process automation, rules and analytics. In short, I want to address the question of why BizTalk architects and developers might have a reason to take an interest in the forthcoming CEP engine.
A matter of value
This blog was always BizTalk-orientated, and BizTalk Server is an integration server based, in part, on a message broker. I want to explain CEP from the perspective of a long-time user of BizTalk Server. To do that, let me first explore notions of value associated with message interchange.
The most fundamental driver for selecting a message-broker technology is related to the business value we accord to each individual message that the broker is asked to handle. Message broker technology is an appropriate choice when that value is high and we must ensure that individual messages are delivered, validated, enriched, transformed and routed in a robust manner. If we have no reason to value the content of each individual message, the use of a message broker is unlikely to make much sense. Hence, the bulk of messages passing through BizTalk servers represent business entities such as purchase orders, financial transactions, expense forms, customer records, etc.
BizTalk Server integrates its message-brokerage capabilities with service orchestration used to define automated business processes. High-valued message content plays a central role in driving those processes. Processes are commonly built around the business entities they handle and the values contained within messages. Consider how common it is for conditional logic and state transitions to be driven by the value of message fields. Consider the centrality of mining and transforming message content within process flows.
BizTalk naturally lends itself to the publishing and consumption of service interfaces, and often plays a role in implementing service bus patterns within the enterprise. From a service-orientated perspective, messages remain vitally important and full of value. However, the emphasis shifts to the services that consume and/or return those messages, rather than the messages themselves. Message content represents the data on which the service acts. Depending on our chosen approach, core data entities may be wrapped by additional service enablement data (SOAP headers, certificates, claims tokens, etc.) representing the mechanics of service invocation. Alternatively, the RESTful approach maintains a clearer emphasis on the core message content itself, but shifts the semantics of the message towards the representation of component, application or service state, and away from the representation of business entities.
The value we associate with messages passed in service interchange typically remains high, but the role of the message is subsumed by the notion of a service interface and the semantic emphasis may change. We can therefore begin to see something of a spectrum emerging in which high-value business messages and service interchange messages appear close together but with some separation. It is its ability to handle this spectrum that makes BizTalk Server so much more than a message broker and so much more than a business process execution tool. It is this same spectrum that begins to explain the complementary nature of BizTalk Server and the WCF-based ‘Dublin’ application server that will be released in a few months’ time.
Extending the Message Interchange Spectrum
We can extend the message interchange spectrum in two directions. During the 1990s, emphasis was laid on the concept of object request brokerage. The two main competing technologies were CORBA (an OMG specification) and Microsoft’s DCOM (Distributed Component Object Model). Java’s RMI also deserves a mention in despatches. These technologies emphasise object behaviour and exploit message interchange for the purpose of invoking specific behaviours on distributed objects. Messages represent remote procedure calls, return values and errors. Hence, messages are used simply as an enabling medium. Value lies entirely in the object behaviour and state accessed via message interchange.
CORBA, RMI and DCOM pose a number of challenges in terms of interoperability, discoverability, etc. To a large extent, modern service-oriented architectures provide an attractive alternative to object brokerage, and have a much wider application. However, these older technologies continue to play a role within the enterprise. For example, the COM+ application server for DCOM is used to allow objects to participate in distributed transactions via Microsoft’s Distributed Transaction Coordinator (MSDTC).
Extending the spectrum in the other direction, we enter the territory of event handling, event-driven architecture and complex-event processing. It is here that we will consider the role of Microsoft’s new CEP engine.
‘Things that Happen’
An event is ‘something that happens’. It is located in time (either at a point in time, or during some time interval) and associated with some system or domain. It involves specific actors and is described by state. It may represent some kind of state transition, often in some external domain. We often use the term ‘event’ rather loosely as a synonym for messages that represent the actual events. These messages generally live for some time after the event has completed, and are used to communicate the historical record of that event to other systems or sub-systems.
In this part of the message interchange spectrum, the value lies principally within the event itself. Messages are used to convey records of events, together with any relevant event data. The value of those messages is intrinsically bound to the value we place on the events they describe.
Architectural Patterns for Message Interchange
Putting this all together, we can illustrate the spectrum of message interchange in the following diagram.
Different architectural patterns and approaches can be associated broadly with different parts of this spectrum. In the middle, around message and service handling, we are on familiar territory. Here, we build solutions that are based on well-understood integration (EAI) patterns. Increasingly, the architectural emphasis has shifted towards ‘service bus’ (ESB) patterns that subsume EAI within a common, distributed ‘fabric’ that centres on service-based interchange. We use message brokerage, adaptation, service orchestration, mediation and many other techniques to handle message interchange between a bewildering array of applications and systems, and to build service bus frameworks that connect all kinds of distributed services in an agile, scalable and manageable fashion.
The far right of the spectrum is the locus of distributed object models and object request brokerage built on architectures that exploit remote procedure call capabilities. As stated above, service orientation has resulted in the de-emphasis of these techniques and architectures, but they continue to be used in scenarios where a higher degree of coupling at the lower level of object representation offers benefits over service-orientated architectures. Tightly coupled control of distributed transactions is one common motivation for adopting architectures in this part of the spectrum.
The far left of the spectrum is where we will concentrate our attention in the remainder of this article. This part of the spectrum is the locus for two broad architectural concerns. The first has to do with observation of real-world events via event ‘sensors’. Events occur within some given domain. Very often, that domain lies beyond the natural boundaries of our systems or service buses, but events may also be generated within domains that we establish and control. Sensors are used to observe and detect events at the points in time and space where they occur. They are responsible for communicating the historical record of those events to other parts of our architecture. Hence, they generate event messages, which we loosely term ‘events’, and submit them to other systems.
Sensors may involve the use of specialised hardware and device drivers (e.g., bar code readers, RFID tag scanners, temperature sensors, etc.), or may consist of instrumentation that observes events within existing software systems (a familiar example is the event observation API included in BizTalk’s BAM technology). They may live at the edge of our domain or organisation, or may be found scattered orthogonally across the many different applications and systems within our domain. The relationship, therefore, between sensors and other components of our architecture can be quite intricate. However, event observation remains a distinct and separate architectural concern, and lives in its own part of the message interchange spectrum.
Sensor management may be an important requirement within our architectures. This tends to be true in scenarios where sensors operate at the ‘edge’ of our electronic universe and mediate between the real world and our computerised systems. We need to manage arrays and networks of sensors, adding and removing sensors in accordance with evolving requirements, and ensuring that events are observed in a resilient and scalable fashion. Microsoft includes sensor management as a central capability of its RFID Server which ships with BizTalk Server.
The second area of architectural concern is all about the processing of event streams and communication of events to other systems and applications. This is broadly associated with the concept of ‘event-driven architecture’ (EDA). Sensors observe events at the point they occur and generate streams of event messages. These streams must typically be processed and the occurrence of relevant events must be communicated to interested systems and services. Event stream processing (ESP) involves many different approaches and techniques, including event filtering, enrichment, capture, aggregation, visualisation, etc. The outcome of ESP typically includes clean and relevant event records that are then consumed by other systems and used to drive aspects of various systems within the overall business process. Event records may ‘drive’ reactive business processes directly, or they may be collected and stored for later analysis. In this second scenario, the results of analysis may feed back into a loop of continuous process improvement.
The Character of Events
Event messages are typically differentiated from other message categories in terms of several characteristics. The first of these has to do with the degree of non-determinism present in event-driven architectures. We generally do not know when events will occur or the order in which they will be observed, processed and passed on. Of course, non-determinism can occur in other contexts. For example, a message broker receiving batched input typically does not know how many messages it will receive in the batch, and may not know when or how often those batches will be received. However, non-determinism tends to be a more prominent characteristic of event processing than of other types of message-based architecture.
Another characteristic is the immutability of event records. This is directly related to the temporal nature of events. Events always occur at some point or interval of time. An event record is therefore reporting something that happened in the past. It makes little sense to allow change to that record with respect to data that describes the event itself. That would be somewhat akin to time travel and could lead, quite literally, to paradoxical (as well as inaccurate) states downstream. Event records are naturally immutable.
In reality, the immutability issue is rather more complex. It is common for event records to carry additional data which may be mutable. For example, an event message may contain some kind of time-to-live or expiry data. This does not describe the actual event, but may be subject to change. Microsoft’s CEDR technology, for example, will retrospectively alter the ‘Valid’ expiry time on previously processed event records as part of its speculative processing approach, effectively retracting event records.
Validity, itself, is another aspect of the temporal nature of events. Most event stream processing technologies, including Microsoft’s new CEP engine, process events in accordance with the notion of ‘time windows’. Time windows may be ‘batched’ (a contiguous series of time windows with fixed start and end times) or ‘sliding’ (a single window which maintains a constant time span between the start and current times). Event records are generally deemed valid only if their timestamps occur somewhere within the given time window. As event records age, their timestamps disappear beyond the horizon of the trailing edge of the time window, and they are discarded.
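To make the time-window idea concrete, here is a minimal sketch of a sliding window that discards event records as their timestamps age beyond the trailing edge. This is illustrative Python with hypothetical names (`SlidingWindow`, `add`), not the API of Microsoft’s engine, which exposes a LINQ-based programming model:

```python
from collections import deque

# Hypothetical sliding-window container: retains only event records whose
# timestamps fall within the last `span_seconds` of the most recent event.
class SlidingWindow:
    def __init__(self, span_seconds):
        self.span = span_seconds
        self.events = deque()  # (timestamp, payload) pairs, oldest first

    def add(self, timestamp, payload):
        self.events.append((timestamp, payload))
        # Discard records whose timestamps have aged beyond the trailing edge.
        horizon = timestamp - self.span
        while self.events and self.events[0][0] < horizon:
            self.events.popleft()
        return list(self.events)

window = SlidingWindow(span_seconds=10)
window.add(0, "a")
window.add(5, "b")
current = window.add(12, "c")   # "a" (t=0) now lies outside the 10-second span
```

A ‘batched’ window would instead close the window at fixed boundaries and start a fresh one, rather than continuously sliding the horizon forward with each arrival.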
Event processing often requires near-real-time detection and response to events. Arguably, the only real-time part of an event processing architecture is located at the sensors which observe events as they occur. Everything else happens after the event. However, the temporal nature of events means that event record validity is generally transient. In more challenging scenarios, we need to reduce the latency of event processing to near-zero. The textbook example is the processing of events within electronic trading systems, in which only the smallest latencies can be tolerated and the trading book could be lost if, for example, the garbage collector decides to do a gen2 sweep at just the wrong moment. ESP engines should reduce latency to an absolute minimum.
Unfortunately, event processing architectures must often deal with the double challenge of low latency and high volume. The volume of events, of course, depends on the scenario and the granularity at which events are observed and detected. However, events often prove to be cheap and plentiful. ESP engines should be able to scale effectively to handle 10s or 100s of thousands of events per second when required. It is important to note, however, that a great many scenarios involve far lower throughput.
High volume is often accompanied by high redundancy. There are many reasons why events may be discarded as being of no interest or relevance. For example, we may only be interested in events where some threshold value has been breached. In RFID event processing, it is common to have to deal with large numbers of redundant, duplicate reads of the same tag.
High levels of redundancy within event streams require careful filtering in order to identify only those events we are interested in. An event stream processing engine should be able to filter event records in a very lightweight, performant manner and should discard unwanted records at the earliest opportunity. The ability to match event records according to defined patterns is a fundamental capability of ESP engines, and suggests they can be regarded as a class of rule engine. However, the ability to simply filter individual event records within a given stream, although very useful, hardly begins to address the true complexities that emerge in ESP scenarios.
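As an illustration of this kind of early, lightweight filtering, the following sketch suppresses redundant duplicate reads of the same RFID tag within a short interval. The function and parameter names are hypothetical, chosen purely for this example:

```python
# Hypothetical stream filter: discards duplicate RFID tag reads observed
# within a short suppression interval, passing through only novel reads.
def filter_duplicates(reads, suppress_seconds=5):
    last_seen = {}   # tag id -> timestamp of last accepted read
    accepted = []
    for timestamp, tag in reads:
        if tag not in last_seen or timestamp - last_seen[tag] >= suppress_seconds:
            accepted.append((timestamp, tag))
            last_seen[tag] = timestamp
        # otherwise: a redundant re-read of the same tag; drop it early
    return accepted

raw = [(0, "TAG1"), (1, "TAG1"), (2, "TAG2"), (7, "TAG1")]
clean = filter_duplicates(raw)   # the re-read at t=1 is discarded
```

The point is that the filter does almost no work per record and drops unwanted records at the earliest opportunity, before they consume downstream resources.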
So far, we have only discussed the concept of event stream processing. Microsoft’s forthcoming engine, however, is billed as a ‘Complex-Event Processing’ (CEP) engine. The idea of complex events emerges from the observation that in some scenarios, value is not always associated with any specific event, or the record of that event. Instead, we accord value to the results of pattern matching across multiple events, often of different types arriving at different times as part of different streams. In short, we value the inferences we can make from processing diverse events.
CEP distinguishes between ‘primitive’ and complex events. A primitive event is observed directly by a sensor and represented using an event message. It is the raw material that we feed to a CEP engine. A complex event is inferred by detection of patterns across the various streams of primitive events. A CEP engine detects and outputs these complex events.
Complex events can be thought of as aggregations of primitive events. Each complex event collects a match-set of inner events. For performance reasons, and depending on the implementation of the engine, these inner events may be ‘consumed’ as soon as a complete match is found, preventing them from being included in other complex events. Alternatively, they could be retained for further matching. The choice of consumption mode takes us deep into the various pattern matching approaches that may be adopted by CEP engines, and is a subject I may well return to at a later date.
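A minimal sketch of the match-set and consumption idea, in illustrative Python rather than any real engine API (the names `match_pallet` and the event shapes are invented for this example):

```python
# Hypothetical matcher: infers a complex 'pallet-scanned' event once reads for
# every expected tag on a pallet have been observed. In 'consume' mode the
# matched primitive events are removed from the working set, so they cannot
# participate in any further match; otherwise they are retained.
def match_pallet(events, expected_tags, consume=True):
    matched = [e for e in events if e["tag"] in expected_tags]
    seen_tags = {e["tag"] for e in matched}
    if seen_tags != set(expected_tags):
        return None, events          # incomplete match-set: no complex event yet
    complex_event = {"type": "pallet-scanned", "inner": matched}
    remaining = [e for e in events if e not in matched] if consume else events
    return complex_event, remaining

stream = [{"tag": "A"}, {"tag": "B"}, {"tag": "C"}]
event, rest = match_pallet(stream, expected_tags=["A", "B"])
```

With `consume=False`, the two matched reads would remain in the working set and could contribute to a second, overlapping complex event — precisely the trade-off between throughput and matching completeness mentioned above.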
The inference of a complex event may be based on context as well as events. Consider the classic RFID scenario where a pallet of goods, each individually tagged, is scanned at a dock door. The RFID tag reader is a sensor which generates a stream of events over the tags it detects. The sensor may generate numerous repeat reads. Due to the physical limitations of the technology, it may fail to read some tags. Hence, the character of the event stream is quite different to the much cleaner nature of the batched messages that BizTalk Server developers are used to dealing with. The semantics of each event record represent the fact that the RFID reader detected a given tag at a particular time and place. That is all. The event record tells us nothing about the pallet as a whole, the order or assignment it is associated with, or whether we are dealing with goods-in or goods-out.
In this example, the inference of a complex event (e.g., a ‘goods-out’ event where the pallet leaves the warehouse bound for a given customer site) involves more than simple filtering of the single event stream. We need to take context into account. As part of the process, a human operator may operate a ‘traffic light’ system indicating if the pallet is in-bound or out-bound before, or after, placing the pallet under the reader. We may have additional context, such as information about the truck currently at the dock door. This additional context may itself be represented as events (e.g., a ‘traffic lights’ event indicating ‘goods-out’). We may rely on a chain of previous events associated with stock picks, etc, together with database lookups and reference data in order to make inferences. For example, although we cannot ensure that every tag has been read, we may still be able to infer which delivery or part-delivery the pallet is associated with by analysing the event information and other data available to us.
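A toy sketch of this kind of contextual inference, with entirely hypothetical names and event shapes, might combine a ‘traffic light’ context event with a set of tag reads to produce a higher-order event that no individual read could yield on its own:

```python
# Hypothetical contextual inference: combine a 'traffic light' context event
# with a set of raw tag reads to infer a goods-in/goods-out movement event,
# even though no individual tag read carries that information itself.
def infer_movement(tag_reads, context):
    direction = context.get("traffic_light")      # 'goods-in' or 'goods-out'
    if direction is None or not tag_reads:
        return None                               # insufficient context to infer
    return {
        "type": direction,
        "tags": sorted({r["tag"] for r in tag_reads}),  # de-duplicate repeat reads
    }

reads = [{"tag": "T1"}, {"tag": "T2"}, {"tag": "T1"}]   # note the repeat read
movement = infer_movement(reads, {"traffic_light": "goods-out"})
```

In a real system the ‘context’ would itself be drawn from other event streams, database lookups and reference data, but the shape of the inference is the same: primitive events plus context in, a single higher-order event out.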
Complex events may, themselves, be matched by additional rules. In CEP, this leads to the concept of event abstraction hierarchies. The primitive events observed by sensors are aggregated into higher-level events which may, themselves, be further aggregated into even higher-level events. At each level, we move naturally to a higher level of abstraction in terms of representing events and understanding how processes and systems are behaving. In the warehouse example, we move from raw tag-read events at the lowest layer to higher-order inferred events which represent stock receipt and dispatch. Further matching and aggregation may result in even higher-order events associated with the level of service we are receiving from our suppliers or providing to our customers. As we move up the event abstraction hierarchy, we are likely to use more sophisticated forms of analysis, and to draw inferences over ever-wider aggregations of different event types.
The Event Cloud
The concept of the event ‘cloud’ has become a central idea in CEP. This is not directly related to the idea of ‘cloud computing’, although I will return to this subject before the end of this article. Instead, the notion of the event ‘cloud’ deliberately de-emphasises the concept of event streams in order to highlight the vast depths of untapped resource that lie within myriad events and event types available to us. These events may come from a bewildering variety of sources, may be observed in many different ways, and may not always fit neatly into the concept of sequential streams. Events can be raised with regard to virtually anything that happens within our system or domain, and may well originate from beyond our domain. Within the event cloud, some events will be ordered with regard to others. Within any single event stream, events will typically (though not necessarily) be ordered in a sequential fashion according to their timestamps. However, all kinds of other ordering and association may be present within the cloud.
Extending our RFID example, the event cloud might contain a delivery event associated with multiple tag-read events for in-bound stock from which we infer that certain parts of our order have not been delivered. These higher-order non-delivery events may be analysed against outstanding purchase orders and recent stock pick events to infer the likelihood, or otherwise, of meeting the requirements of our customers. When a customer order cannot be fulfilled, the event we raise to indicate this failure may be directly tracked back to the chain of events which are deemed to be the cause of this failure, providing deeper insight into cause and effect across our entire operation. Our event cloud is, in technical terms, a partly ordered set (a ‘poset’) of events linked by causal relationships.
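The causal-chain idea can be sketched very simply. In the hypothetical structure below, each event record carries the ids of the events deemed to have caused it, forming a partially ordered set; walking the links back from a failure event recovers its causal history (all names and ids are invented for illustration):

```python
# Hypothetical causal trace over a poset of events: each event records the
# ids of the events that caused it. Walking the links backwards from a
# failure event recovers the chain of cause and effect.
def causal_chain(events, event_id):
    by_id = {e["id"]: e for e in events}
    chain, frontier = [], [event_id]
    while frontier:
        current = by_id[frontier.pop()]
        chain.append(current["id"])
        frontier.extend(current.get("causes", []))
    return chain

log = [
    {"id": "read-7", "causes": []},
    {"id": "non-delivery", "causes": ["read-7"]},
    {"id": "order-unfulfilled", "causes": ["non-delivery"]},
]
trace = causal_chain(log, "order-unfulfilled")
```

Note that nothing here depends on timestamps: causality, not clock order, is what links the events, which is why the cloud is only ever partly ordered.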
Ordering, or rather its absence, is an important theme in CEP. The event cloud is amorphous and messy, and is unlikely to deliver strong guarantees with regards to total ordering, especially when considering the ordering of different events related only by causality. Worse than that, relevant events may not even be present, or may arrive too late to be useful in real-time scenarios. CEP engines should be able to reason under uncertainty in order to infer higher-order events even where the raw event data is incomplete and disordered.
Event Processing Networks
The concepts of event abstraction hierarchies and event clouds imply that complex-event processing is likely to be distributed across the enterprise. Filtering and aggregation of primitive events is best carried out close to the observation point – i.e., in proximity to sensors or instrumented line-of-business applications. Inference over filtered or aggregated events may naturally belong within the middle-tier that connects various systems. The correlation and analysis of higher-order events in order to infer events in the upper reaches of our event abstraction hierarchy may best be done close to the centre in association with data cubes, sophisticated analytical tools and reporting facilities.
This leads naturally to the notion of Event Processing Networks (EPNs) which link multiple Event Processing Agents (EPAs). Each EPA is a node within the EPN, and is responsible for processing event input and communicating event output to other nodes, as required. An EPA will typically be an instance of an ESP or CEP engine. As we move up the event abstraction hierarchy, it may be appropriate to use additional types of rules-based and analytical engines as EPAs. For example, we might use Rete production systems, Bayesian engines, etc.
The concept of an EPN overlaps to a significant degree with the general concept of Enterprise Service Buses (ESBs). Each EPA can be thought of as providing a discrete service, and must be able to communicate with other EPAs within the EPN, as required. More importantly, the topology of the EPN is subject to change over time, suggesting that we should be able to deploy EPAs and manage event routing in an agile fashion across the distributed environment. EPNs should ideally be manageable through single-point administration, and should support monitoring and tracking. All of this suggests the use of some common container model for hosting EPAs and plugging them into the EPN.
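In the abstract, an EPN is just a composition of agents, each transforming an event list into another event list. The sketch below makes that concrete with two toy agents, a filter and an aggregator; the names and shapes are hypothetical, and a real network would of course be distributed rather than an in-process function chain:

```python
# Hypothetical event processing network: each agent (EPA) is modelled as a
# function from an event list to an event list; the network composes them.
def make_epn(*agents):
    def run(events):
        for agent in agents:
            events = agent(events)   # output of one EPA feeds the next
        return events
    return run

# Two toy agents: a filtering EPA and an aggregating EPA.
def drop_low(events):
    return [e for e in events if e["value"] >= 10]

def summarise(events):
    return [{"type": "summary", "total": sum(e["value"] for e in events)}]

network = make_epn(drop_low, summarise)
result = network([{"value": 3}, {"value": 12}, {"value": 30}])
```

The container model discussed above is essentially what lets each of these ‘functions’ be deployed, replaced and re-routed independently across the enterprise.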
CEP and BizTalk Server
We are now in a position to consider Microsoft’s recent announcements within the broader context of their enterprise platform. Their CEP engine, due to be launched in 2010, will help to fill a gap within Microsoft’s current technology stack. However, I believe its role will only really make sense when combined with the rest of that technology stack. The CEP engine will play a central role in enabling complex-event processing, but in reality, it is just one building block from which future EPNs will be constructed on the Microsoft platform.
We should take note of Microsoft’s existing support for event processing. Much of this centres on BizTalk Server. Three capabilities of this rich application immediately present themselves for consideration:
BAM: Originally distinct from BizTalk Server, Microsoft Business Activity Monitoring provides an event observation API for instrumentation of custom code, analyst tooling for defining KPIs in terms of milestones, data values and data aggregations, and an event-streaming architecture in which events can be written directly to provisioned data tables in SQL Server, or indirectly through a high-performance ‘event buffering’ database (the BizTalk Message Box is used for this purpose). BAM also offers additional capabilities for pumping captured event records into a data warehouse, and for optionally aggregating event data and maintaining cubes built in SQL Server Analysis Services.
RFID: A companion server included with BizTalk. BizTalk RFID’s central role is to provide a Device Service Provider Interface (DSPI) and associated management tools. This allows all kinds of devices including RFID tag readers, printers, bar code scanners, etc., to be plugged into a common fabric in which logical containers are mapped to physical devices. RFID manages the synchronous issuance of commands to devices and supports simulation of hardware devices for testing purposes. In addition, RFID provides an asynchronous event processing infrastructure based on a tree of ‘pipelines’ into which event handler components are plugged. Events from different sources enter the tree at the ‘leaf’ pipelines and are filtered and aggregated up towards the ‘root’ of the pipeline tree where events (including higher-order events) can be emitted to middleware (e.g., BizTalk Server) or captured in data stores. RFID ships with integration of the Business Rules Engine as an ‘event handler’. In the latest version of BizTalk Server, RFID ships with extended support for RFID standards and new capabilities in relation to mobile devices.
Business Rules Engine: A forward-chaining production system that implements the Rete algorithm. Rete engines provide strong support for heavy-duty inference over data ‘facts’ of many different types. However, the Rete algorithm does not lend itself naturally to the lightweight event consumption modes implemented by some ESP and CEP engines and MS BRE has no explicit support for temporal logic. The usefulness of the BRE in very low latency event processing (e.g., within the RFID framework) is therefore quite limited. It is chiefly designed to be used as a ‘business rules’ engine rather than an EPA, and is generally best used only at higher levels within event abstraction hierarchies.
As we can see, there is already a surprising degree of support for event processing with BizTalk Server. At the time of writing, I have no knowledge of detailed plans for the next version of the product. With the recent launch of BizTalk Server 2009, that version is unlikely to ship for another two or three years. However, there are all kinds of ways in which the new CEP engine could be integrated with these existing technologies in order to enhance their effectiveness and reach. The CEP engine has an obvious role to play in business activity monitoring and RFID. We could imagine, for example, using the BAM API to write event streams directly as input to the CEP engine, writing CEP output directly to the BAM event streams, or plugging the CEP engine into BizTalk RFID as an event handler. The last two scenarios will be easily implementable with the current version of BizTalk Server. More generally, we can imagine using the CEP engine in conjunction with the rules engine in order to implement different EPAs at different levels of an event abstraction hierarchy.
One further consideration surrounds the role BizTalk Server plays in the implementation of service bus design patterns on the Microsoft platform. Today, BizTalk Server forces all messages to pass through a set of persisted, transactional queues built using SQL Server. This provides a highly resilient and scalable model, but at the cost of introducing significant latencies. Today’s BizTalk Server would be of no use for building the fabric for a low-latency, high-throughput EPN. However, BizTalk Server increasingly offers direct support for the Windows Communication Foundation (WCF). WCF is a library contained within the .NET Framework. It provides the basis for all kinds of message-based communication between services, is transport-agnostic and highly extensible. It is the natural foundation for building ESB and EPN fabric. In a few months, Microsoft will launch ‘Dublin’ as a WCF application server, and incorporate it into the Windows platform. Doubtless we will see close integration of BizTalk Server and ‘Dublin’ in future editions. Microsoft has long talked about providing low-latency support within BizTalk Server, and it may be that future versions will provide a convincing platform for building EPNs.
CEP and Analytics
The CEP engine is being developed by the SQL Server group, and is expected to ship with SQL Server 2008 R2 (commercials and SKUs have not yet been finalised, so this is somewhat speculative at the current time). Apart from the obvious need for data management, a significant aspect of this association is that SQL Server ships with extensive support for analytics. SQL Server Analysis Services (SSAS) is a standards-compliant suite of technologies that provides comprehensive support for on-line analytical processing using ROLAP against SQL Server or MOLAP/HOLAP via a multidimensional data store. In addition, SSAS offers rich capabilities for data mining. These include a suite of algorithms for predictive analysis using decision trees, Naive Bayes, neural networks, time series regression, sequence clustering, etc. Additional algorithmic support is provided for association and segmentation. SSAS also offers extensive client-side capabilities and APIs for interacting with multidimensional data and an off-line ‘local cube’ facility.
SSAS is a good example of how larger software companies like Microsoft can drive technologies once considered ‘niche’ towards broad acceptance across the marketplace. By incorporating analytical services into their existing platform, and significantly lowering the price point, Microsoft did more than most to foster the growth of the mainstream OLAP market. CEP may travel the same route, and this is the real value of incorporating a CEP engine into an existing technology stack. Perhaps of more immediate importance is the fact that Microsoft’s CEP engine will live alongside an existing analytics capability with widespread adoption across the enterprise market. CEP needs analytics capabilities, and these come ready made with SQL Server.
CEP and the Presentation Tier
CEP involves the inference of higher-order events. Higher levels of the event abstraction hierarchy tend to be strongly aligned to business, rather than technical, viewpoints. The visualisation of these higher-order events and presentation of event-driven intelligence to business users is an important aspect of CEP capability.
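As a concrete (and invented) illustration of what ‘inference of higher-order events’ means: a stream of low-level technical events — say, individual temperature alarms from plant equipment — can be collapsed into a single business-level event. The event names, threshold and window below are assumptions for the sketch; Microsoft’s engine would express this pattern as a declarative LINQ query over a stream rather than imperative Python.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Event:
    source: str
    kind: str
    timestamp: float  # seconds since some epoch

def infer_overheating(events, window=60.0, threshold=3):
    """Derive a higher-order 'overheating' event whenever one source raises
    `threshold` low-level temperature alarms within a sliding `window`."""
    derived = []
    recent = defaultdict(list)  # source -> alarm timestamps still inside the window
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind != "temperature_alarm":
            continue  # other low-level events are ignored by this pattern
        recent[ev.source] = [t for t in recent[ev.source]
                             if ev.timestamp - t <= window] + [ev.timestamp]
        if len(recent[ev.source]) >= threshold:
            derived.append(Event(ev.source, "overheating", ev.timestamp))
            recent[ev.source] = []  # reset so one burst yields one business event
    return derived
```

The derived ‘overheating’ event sits one level up the abstraction hierarchy: it is the business-facing event a dashboard or operator would want to see, not the raw alarms beneath it.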
Currently, Microsoft’s premier presentation tool for monitoring and analytics is PerformancePoint Server, which incorporates technology from ProClarity, a company Microsoft acquired in 2006. Since the beginning of April, PerformancePoint Server has been withdrawn from new sales (it continues to be supported for existing customers) in anticipation of the inclusion of its monitoring and analytics functionality in the next version of SharePoint Server. SharePoint Server, together with Office System, is evolving to provide richer business analytics and intelligence capabilities, delivering relevant information and knowledge directly to business users through familiar interfaces. We can therefore expect that SharePoint Server and Office will be the locus for presentation and visualisation of complex event-related information. In addition, SQL Server provides business reporting capabilities which can be surfaced via the browser and within SharePoint sites.
Microsoft’s strategy has always been to provide a foundational platform on which others can build. They aim to enhance the value of their platform by encouraging a rich eco-system to build up around specific technologies. I would expect this strategy to be applied to CEP. We know that the CEP engine will ship with direct integration of development tools in Visual Studio. I would not, however, expect Microsoft to provide rich tooling aimed at specific verticals. I would be surprised, for example, to see Microsoft supply algorithmic trading reporting and visualisation tools aimed at financial institutions. This is an area where they will probably look to their ISV partners to build rich offerings on top of the Microsoft platform. Such is the Microsoft way.
The following illustration builds on the earlier message interchange spectrum diagram, in an attempt to place the CEP engine in context with regard to the rest of Microsoft’s enterprise platform.
The Application of CEP
All this brings us to some final considerations with regard to how CEP technologies will be applied. Is CEP an exotic curiosity of interest only within highly specialised environments, or does it have the capacity to grow into a mainstream technology that will radically change the face of IT?
I suspect the answer will lie somewhere between these two extremes. If I were a betting man, I would risk good money on the idea that CEP technologies will continue to grow in importance and sophistication over many years. I don’t see some sudden revolution, but I do see continued evolution in which CEP will play an increasingly important role. By itself, a CEP engine will not revolutionise the marketplace. Combined with the richness already present in several competitive platforms, including Microsoft’s platform, I believe that sophisticated event-driven architectures will become an important part of the IT landscape over the next decade.
Today, the application of CEP on the Microsoft platform is likely to be limited to those areas which are already centred heavily on the detection of, and response to, complex events. Automated trading systems, manufacturing systems, fraud detection, network monitoring and click-stream analysis are obvious candidates. From my BizTalk-centric perspective, the really interesting consideration is the role that CEP will play in relation to automated business process, business activity monitoring, business rule processing and analytics. Today, many of our customers remain indifferent to the potential locked up within the event clouds in their own domains, and do not envisage implementing the kind of real-time situation analysis that CEP enables. Implementation of automated business processes provides them with significant benefit in contrast to the more manual approaches they have relied on previously. However, addressing the orthogonal issues of complex-event detection is low on their list of priorities. Will this change?
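To make the fraud detection candidate concrete, here is a small sketch of one classic complex-event pattern: the same card used in two different cities within an implausibly short interval. Everything here — the field names, the ten-minute window, the tuple format — is an invented illustration of the temporal correlation involved, not anything drawn from Microsoft’s engine.

```python
def detect_impossible_travel(transactions, window=600.0):
    """transactions: iterable of (card, city, timestamp_in_seconds) tuples.
    Returns alerts where the same card appears in two different cities
    within `window` seconds -- a simple temporal complex-event pattern."""
    alerts = []
    last_seen = {}  # card -> (city, timestamp) of most recent transaction
    for card, city, ts in sorted(transactions, key=lambda t: t[2]):
        if card in last_seen:
            prev_city, prev_ts = last_seen[card]
            if city != prev_city and ts - prev_ts <= window:
                alerts.append((card, prev_city, city, ts))
        last_seen[card] = (city, ts)
    return alerts
```

The point is that no individual transaction is suspicious; only the *correlation* of two events across time and space carries the fraud signal — which is precisely the territory of CEP rather than of conventional record-at-a-time processing.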
I am convinced that it will. As I say, I do not expect this to be an overnight revolution. It will, I believe, be driven by three complementary and interlinked factors:
Increased Business Expectations: The business expectations laid upon IT grow inexorably over time. Consider how far things have progressed since the mid-1990s when concepts of ‘service orientation’ were unknown and 3-tier architecture was still regarded as an unobtainable nirvana or cynical marketing hype by many. Businesses are not directly interested in the architectures and technologies that we use. They want to see reduced costs, increased productivity, improved ROI, greater competitiveness etc. As more organisations discover the benefits of real-time situational awareness, we can expect to see adoption of CEP driven increasingly by the familiar competitive forces of the free market.
Cloud Computing: I’m amazed, frankly, that so little has been said to date about the relevance of cloud computing to CEP. We are moving rapidly into an era of cheap availability of massive compute resource. The momentum behind this is growing daily with all the major players now scrambling to get their strategies and technologies into place. While many remain sceptical, the major IT companies have long since been convinced that cloud computing will play a central role in tomorrow’s IT landscape.
It may be that cloud computing is still seen by many as simply a matter of consuming services like SalesForce.com. However, the reality is that cloud computing is also about connecting organisations and systems in ways that obliterate the impediment of strong domain boundaries. This, in turn, means that the size and richness of available event clouds is set to grow exponentially in future years. Companies that rapidly learn to exploit this vast resource will gain an early competitive edge. They will need to utilise the massive compute power available through cloud computing in order to gain that advantage.
Rich Platforms and Eco-Systems: This is the theme I have emphasised earlier in this article. CEP will never become a mainstream approach if it remains a matter of a few isolated and ‘exotic’ technologies that require significant investment of time and skills to exploit. There is a sense in which Microsoft’s announcement of a CEP engine is the least interesting aspect of this discussion. What excites me is the emergence of this capability within existing platforms that are already rich in complementary technologies, have deep market penetration and broad communities of developers and ISVs. This is what CEP needs in order to make a significant impact on the IT landscape.
Call to Action
My response to last Monday’s announcement has been an attempt to show why the BizTalk community should take note of the technology that Microsoft will release next year, and why I believe that CEP will represent an important capability within the Microsoft platform. What can you do to prepare?
· Consider reading David Luckham’s book ‘The Power of Events’. Luckham’s book, first published in 2002, introduced much of the terminology, including the concepts of event clouds and EPNs, and remains the best-known textbook on this subject. Also, look for Opher Etzion’s forthcoming book ‘Event Processing in Action’. His blog site is at http://epthinking.blogspot.com/feeds/posts/default.
· If you are geeky, take a look at the Microsoft Research material on CEDR. It is academic, but will give you an idea of the kind of rules-based technology represented by Microsoft’s CEP engine. Try the following links:
· If you are really geeky, take a look at some of the debate around how CEP relates to existing Rete rules technologies. The JBoss Rules team, for example, have a growing capability around CEP. Also, take a look at this paper on blending Rete and CEP inference networks - http://icep-fis08.fzi.de/papers/iCEP08_3.pdf.
· Finally, look out for CTPs of the Microsoft CEP engine later this year, possibly released with SQL Server 2008 R2 CTPs.