First up today was Gary Riley talking about how he introduced much greater performance into CLIPS 6.3. The talk was in more detail than last year’s. Gary explained that speed was not the chief priority for earlier versions of CLIPS, and that, when he first designed the engine, he was uncertain how beneficial certain features such as hash tables would be. Over the years, though, CLIPS had started to fall significantly behind the performance of Java engines. The improvements made in 6.3 have broadly allowed it to leapfrog most Java engines. Gary discussed his belief that Java code is unlikely to offer the same performance possibilities as C code, but also argued that you can’t directly compare a C code base with OO Java code. Now, I am not sure I totally agree with his position on this, but that is OK. Gary made it clear that the speed improvements have come about mainly through the adoption of prior art rather than some fundamental change in the algorithm.
The biggest improvement came from hashing the alpha memories. He spent some time fine-tuning the hash table size against a number of benchmarks (Waltz and Manners). Gary also hashed the beta memories. He had heard it suggested that hashing the beta memories does not yield much performance improvement, but felt that there was still a case for implementing the approach. Gary then went on to discuss salience groups on the agenda, a feature that reduces the number of agenda traversals. He then described the implementation of ‘Not’ nodes, which use a lazy evaluation approach with a kind of bookmarking to reduce the amount of work that needs to be done in determining existential quantification. Gary described his approach to ‘asymmetric’ retraction, which has been in CLIPS for some time. In classic Rete, retraction is symmetrical to assertion, but he has implemented an approach in which he uses additional links to retract partial matches from the bottom of the network back up to the point of change. Gary went on to explain a number of additional areas where he tidied up and improved CLIPS, including Exists, joins from the right for handling Nots with embedded Ands, and so on. He finished off with a discussion of benchmarks and benchmark results.
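For anyone unfamiliar with the idea, here is a minimal sketch of what hashing an alpha memory buys you: a join node can probe for facts by a join-relevant key rather than scanning the whole memory. The class and method names are mine, purely for illustration, and bear no relation to the actual CLIPS 6.3 internals.

```java
import java.util.*;

// Illustrative sketch only: hashing an alpha memory on a join-relevant key so
// that a join node can probe matching facts directly instead of scanning the
// whole memory. Class and method names are hypothetical, not CLIPS internals.
final class HashedAlphaMemory<K, F> {
    private final Map<K, List<F>> buckets = new HashMap<>();

    void add(K joinKey, F fact) {
        buckets.computeIfAbsent(joinKey, k -> new ArrayList<>()).add(fact);
    }

    void remove(K joinKey, F fact) {
        List<F> bucket = buckets.get(joinKey);
        if (bucket != null) {
            bucket.remove(fact);
            if (bucket.isEmpty()) {
                buckets.remove(joinKey);
            }
        }
    }

    // A join node probes with the key derived from the partial match (token),
    // touching only facts that can possibly satisfy the equality test.
    List<F> probe(K joinKey) {
        return buckets.getOrDefault(joinKey, Collections.emptyList());
    }
}
```

The same idea applies to beta memories: index the stored tokens on the values used in the equality tests so that each incoming fact only meets the tokens it could possibly join with.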
And so on to the next speaker of the day, who was... well, me, actually! My talk was entitled “A Survey of Complex-Event Processing Models” and was the third of the CEP talks given during the conference. I was able to gloss over the first few slides thanks to the previous two presentations from Paul and Edson, although I had a couple of points to make. In CEP, it is the events that are complex, not necessarily the processing (though it may be). One important aspect of CEP is the notion of event abstraction hierarchies and the way they can be used to present views of complex events to different personas. I also introduced the broad notion of a complex event processing agent and the way it detects and reifies complex events, and described the role of agents within event processing networks and wider event-driven architectures.
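To make that a little more concrete, here is one hypothetical shape such an agent might take: it consumes simple events and, when a pattern is detected, reifies a higher-level complex event for republication into the network. The interfaces are entirely illustrative and not drawn from any particular product.

```java
import java.util.Optional;

// Hypothetical interfaces only, to illustrate the idea of an event processing
// agent that detects a pattern over simple events and reifies it as a new,
// more abstract event that can be published onwards through the EPN.
interface Event {
    long timestamp();
}

interface EventProcessingAgent<E extends Event, C extends Event> {
    // Offer a simple event to the agent; if a complex event is detected,
    // it is reified and returned so it can be republished at a higher
    // level of the event abstraction hierarchy.
    Optional<C> onEvent(E simpleEvent);
}
```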
And so on to the heart of the talk. The CEP market today is largely defined by event stream processing technologies. These mainly use forms of dataflow. There is a lot of variation in implementation, but a useful perspective is to think in terms of stream-orientated processing (immediate forms of event-by-event processing with minimal event retention) and set-based processing of materialised views of event data. A typical engine may use aspects of both approaches, often with an emphasis towards one or the other. I described stream-orientated approaches in a bit more detail, together with the need to implement select-and-consume strategies based on contextual configurations. I then compared this with Rete. Rete uses a very specific type of dataflow involving two inner networks, and it combines stream-orientated and set-based processing side by side. It is clear that a Rete network can reasonably be used for complex event detection. However, there are issues. The ‘holistic’ nature of a Rete network (constructed over many rules) supports strong forms of redundancy elimination, but may be harder to maintain in a continuous query scenario. The degree of logical synchronisation required in a Rete network is a barrier to scalability through parallelisation. Event processing technologies tend to offer less redundancy elimination, but keep individual queries more separate, making them easier to maintain and easier to handle in a parallelised fashion. Event processing engines are the better bet for ultra-low-latency, high-throughput, time-critical event processing. Other issues concern having to add additional semantic and temporal processing support to traditional Rete implementations.
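A rough sketch of the contrast, with invented names and thresholds: the first class reacts to each event immediately and retains almost nothing, while the second maintains a materialised sliding window and evaluates over the whole set. Real engines are, of course, far more sophisticated and typically blend the two styles.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative contrast only (names and thresholds are invented):
// a stream-orientated agent reacts to each event as it arrives and retains
// almost nothing, while a set-based agent maintains a materialised window
// of recent events and evaluates queries over that set.
final class StreamVsSetSketch {

    record PriceTick(String symbol, double price, long timestamp) {}

    // Stream-orientated: immediate, event-by-event, minimal retention.
    static final class SpikeDetector {
        private double lastPrice = Double.NaN;

        boolean onTick(PriceTick tick) {
            boolean spike = !Double.isNaN(lastPrice)
                    && Math.abs(tick.price() - lastPrice) / lastPrice > 0.05;
            lastPrice = tick.price();
            return spike;
        }
    }

    // Set-based: a materialised sliding window queried as a whole.
    static final class WindowAverage {
        private final Deque<PriceTick> window = new ArrayDeque<>();
        private final long windowMillis;

        WindowAverage(long windowMillis) { this.windowMillis = windowMillis; }

        double onTick(PriceTick tick) {
            window.addLast(tick);
            while (window.peekFirst().timestamp() < tick.timestamp() - windowMillis) {
                window.removeFirst();
            }
            return window.stream().mapToDouble(PriceTick::price).average().orElse(tick.price());
        }
    }
}
```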
The last part of the talk looked at Rete engines as CEP agents, and explored various ways in which such agents might be combined with event stream processing technologies in an EPN, as well as models for building hybrid agents that combine aspects of Rete and event stream processing. Some engines have begun to implement aspects of certain models, and there are further avenues to explore. Rete is a better bet for downstream event processing, further away from event sources: it offers industrial-strength inferencing, is a natural technology for bridging between the EPN and other parts of the event-driven architecture, and offers great features for computing event abstraction hierarchies.
Next up was Mark Proctor, who claimed that Fair Isaac had tried to poison him (with alcohol, I assume) the night before :-) His talk was on distributed agent-based computing. He talked about Gregor Hohpe, responsible for the Enterprise Integration patterns, and the work he is doing on stateless conversation patterns. An agent is capable of acting in an environment and communicating with other agents; it may be able to reproduce itself, and its behaviour tends towards satisfying its objectives. Agents in multi-agent systems are considered to be autonomous. Mark contrasted this with distributed systems, which are designed as a set of independent subsystems that collaborate as a single system. I think this distinction is problematic, but there is certainly a distinction to be made between agent-based processing and other forms of distributed processing. A mobile agent can migrate from machine to machine across a heterogeneous network. Agents don’t have to be mobile; they may be stationary. Mobility can be used to create dynamically self-configuring topologies that reduce network bandwidth usage, latency, etc. Mobile agents can provide a robust, fault-tolerant environment.
Mobility may be based on code mobility, code-on-demand and mobility of agents. Instead of moving code, we may choose to move state between (more) static agents. Agents need not only to be able to communicate with each other; they need to interact with each other cooperatively. Agent-to-agent communication can be enabled through shared memory (e.g., via a ‘blackboard’) or direct message passing. Mark talked about the BDI model, agent communication models and languages (KQML, FIPA-ACL) and various ontological standards. He introduced the idea of ‘speech act’ theory involving locution, illocution, perlocution, etc. There is some debate about the application of speech act theory to computing. Mark talked in more detail about various standards and specifications, and showed us examples of the conversational approach. He talked about coordination using different models (e.g., direct supervision, standardisation, mutual adjustment, distributed search goals). He also introduced the concept of contract nets, involving recognition, bidding, awarding and expediting.
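For readers who haven’t met contract nets before, here is a purely illustrative sketch of the recognition / bidding / awarding / expediting cycle. All of the types and names below are invented for the purpose of illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// A purely illustrative sketch of the contract net idea: a manager announces
// a task (recognition), contractors bid, the manager awards the task to the
// best bid, and the winner expedites the work. Names are invented.
record Task(String description) {}
record Bid(String contractorId, double cost) {}

interface Contractor {
    String id();
    Optional<Bid> bidFor(Task task);          // bidding
    void perform(Task task);                  // expediting
}

final class Manager {
    void delegate(Task task, List<Contractor> contractors) {   // recognition
        contractors.stream()
                .map(c -> c.bidFor(task))
                .flatMap(Optional::stream)
                .min(Comparator.comparingDouble(Bid::cost))    // awarding
                .flatMap(best -> contractors.stream()
                        .filter(c -> c.id().equals(best.contractorId()))
                        .findFirst())
                .ifPresent(winner -> winner.perform(task));
    }
}
```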
And so on to the final presentation, by Dr. Charles Forgy. Charles developed the Rete algorithm back in the 1970s and must surely be the best-known name in this corner of IT. He has continued to work on rules engine implementation over the last three decades and is responsible for the fastest inferencing engine implementations currently available. His talk was on “Making Parallelism Available to Rule Developers”. Processor speeds are not increasing as fast as they were; the free ride has been over since about 2003/2004, and attention is now on multi-core computing. Rule-based systems will need to adapt to this new world. A canonical forward-chaining rule-based knowledge source uses a recognise-act cycle, and the amount of parallelisation available to these systems is limited. He described Amdahl’s Law. In the early days of Rete, conflict resolution took perhaps 90% of the total execution time. This is no longer true. In OPSJ, Dr. Forgy has worked to increase the rate of change of working memory and to make parallelism visible to the rule developer in a way that avoids error-prone forms of parallelisation.
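For reference, Amdahl’s Law says that if a fraction s of the work is inherently serial, the best speedup achievable on n processors is 1 / (s + (1 - s)/n). A trivial illustration (the numbers are mine, not from the talk):

```java
// Amdahl's Law: if a fraction s of the work is inherently serial, the best
// possible speedup on n processors is 1 / (s + (1 - s) / n).
// The figures below are purely illustrative, not benchmark results.
final class Amdahl {
    static double speedup(double serialFraction, int processors) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / processors);
    }

    public static void main(String[] args) {
        // Even with only 10% serial work, 16 cores give well under a 16x speedup.
        System.out.println(speedup(0.10, 16));   // ~6.4x
    }
}
```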
One approach to improving parallelisation is to allow the recognise-act cycle to perform more changes on each cycle. A second possibility is to allow multiple recognise-act cycles to act in parallel. From the beginning, rules have been tuple-orientated. Set-orientated rules allow conditions to match sets of objects. There is no reason, however, why rule languages need to be tuple-orientated; indeed, most modern rule languages support set-orientated approaches, allowing more work to be done in a single cycle. Dr. Forgy talked about immediate rules, which cause multiple rules to fire concurrently. He described ways to hint to an engine when it is safe to use parallelisation. He talked about the CONS keyword in OPSJ that is used to control fact consumption. CONS provides fine-grained refraction control. OPSJ also keeps a count of the number of rules that might match an object; when the count goes to zero, the fact is retracted. OPSJ can run multiple knowledge sources, but historically in a limited fashion. This has been improved with the sendPacket API for use in parallelised situations. The engine has been changed to allow knowledge sources to pause when the conflict set is empty; the arrival of a packet restarts processing. OPSJ has no pre-defined definition of packets: this is in the gift of the developer. OPSJ adds a Probe construct, which is a special type of rule that sends messages to its owning knowledge source. Dr. Forgy went on to describe Blackboard-style levels for working memory. Levels can be specified for insert operations, sendPacket operations and conditions.
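To illustrate the pause-and-restart behaviour in general terms (this is emphatically not the OPSJ API; every name below is invented), a knowledge source can simply block on an inbox when it has nothing to do and wake when a packet of facts arrives:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// A rough sketch of the pause/restart behaviour described for knowledge
// sources: when the conflict set is empty the knowledge source blocks, and
// the arrival of a packet of facts wakes it up again. This is NOT the OPSJ
// API; all names here are invented to illustrate the pattern.
final class SketchKnowledgeSource implements Runnable {
    private final BlockingQueue<List<Object>> inbox = new LinkedBlockingQueue<>();

    // Roughly analogous in spirit to sending a packet to a knowledge source.
    void send(List<Object> packet) {
        inbox.add(packet);
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                // Blocks while there is nothing to do (empty conflict set
                // and no pending packets).
                List<Object> packet = inbox.take();
                assertFacts(packet);
                runToQuiescence();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void assertFacts(List<Object> facts) { /* insert facts into working memory */ }
    private void runToQuiescence() { /* fire rules until the conflict set is empty */ }
}
```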
Dr. Forgy then went on to describe how multiple threads might be used in a single knowledge source. This is very problematic. Is there room for improvement? The issue, as always, is shared mutable data. He talked about the use of private working memory levels and other ideas. He then moved on to a discussion of parallelism over networks, where there are possibilities regarding shared distributed memory. He ended up talking about languages. Java is not as fast as C/C++. Modern processors are very complex, out-of-order engines that might have 100 in-flight instructions. Java, according to Dr. Forgy, does not take advantage of this properly and, as a result, IPC (instructions per cycle) is low. Sun’s HotSpot JIT compiler, for example, produces machine code that has too many sequential dependencies and fails to take advantage of modern architectures. There is room for improvement.
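One common way of dodging the shared-mutable-data problem, in the spirit of the private working memory levels mentioned above, is to give each thread its own private level alongside a read-only shared one. Again, this is just an illustrative sketch, not how OPSJ actually does it; every name is invented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: one way to sidestep shared mutable data is to give each
// worker thread a private working memory level for the facts it creates,
// alongside a read-only shared level. None of this reflects OPSJ internals.
final class PartitionedWorkingMemory<F> {
    private final List<F> sharedLevel;                       // read-only to workers
    private final ThreadLocal<List<F>> privateLevel =
            ThreadLocal.withInitial(ArrayList::new);

    PartitionedWorkingMemory(List<F> sharedFacts) {
        this.sharedLevel = Collections.unmodifiableList(new ArrayList<>(sharedFacts));
    }

    void insertPrivate(F fact) {
        privateLevel.get().add(fact);                        // no locking needed
    }

    List<F> visibleFacts() {
        List<F> all = new ArrayList<>(sharedLevel);
        all.addAll(privateLevel.get());
        return all;
    }
}
```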
A great conference. A million thanks to James, Greg, Chelanie, etc., for all their hard work in making this happen.