Design and implementation of a data stream management system with advanced complex event processing capabilities
Type : Master's thesis
Publication Status : unpublished
Access : restrictedAccess
The world has seen a proliferation of data stream applications in recent years. These applications include computer network monitoring, Radio Frequency Identification (RFID)-based supply chain and traffic management systems, e-trading, online financial transactions, web click-streams, some mobile communication applications, and civilian or military applications using sensor networks. All of these applications are considered "mission-critical" by the organizations that run them and require real-time stream processing to detect simple or complex events, so that strategic decisions can be made quickly. An emerging system architecture called the Data Stream Management System (DSMS) is well suited to the analysis needs of these applications. A DSMS forms the basis of our project and allows high-speed data streams to be processed with different continuous queries. In this thesis, we present the design and implementation details of a data stream management system with advanced Complex Event Processing (CEP) capabilities. Specifically, we add "online" Association Rule Mining (ARM) and testing capabilities on top of an open-source DSMS and demonstrate them over fast data streams. Our most important findings show that online ARM can generate (1) more unique rules, (2) with higher throughput, and (3) much sooner (with lower latency) than offline rule mining. In addition, we have found many interesting and realistic musical preference rules, such as "If a person listens to George Harrison, then s/he also listens to The Beatles". We demonstrate a sustained rate of 15K rows/sec per core. We hope that our findings can shed light on the design and implementation of other fast data analytics systems in the future.
Date : 2013-06