Aurora: A new model and architecture for data stream management
Sample Problem: Inspiration & Domain
Stream Processing Engines • HADP vs DAHP • Events & Triggers • Continuous Queries • Real-time processing • Transient data • Lossy information
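The DAHP (DBMS-active, human-passive) model and the continuous queries listed above can be illustrated with a small push-based sketch. The class name `ContinuousQuery`, its `push` method, and the bid tuples are hypothetical and not part of Aurora; the only point is that the query stays registered while data arrives, and a trigger fires on each match.

```python
from typing import Any, Callable, Dict, List

StreamTuple = Dict[str, Any]  # hypothetical tuple representation


class ContinuousQuery:
    """A standing query: registered once, evaluated on every arriving tuple."""

    def __init__(self,
                 predicate: Callable[[StreamTuple], bool],
                 on_match: Callable[[StreamTuple], None]) -> None:
        self.predicate = predicate
        self.on_match = on_match

    def push(self, item: StreamTuple) -> None:
        # The engine is active: data is pushed in and the trigger fires
        # without anyone issuing a query at this moment.
        if self.predicate(item):
            self.on_match(item)


alerts: List[StreamTuple] = []
query = ContinuousQuery(lambda t: t["bid"] > 100, alerts.append)
for bid in ({"item": "lamp", "bid": 80}, {"item": "lamp", "bid": 120}):
    query.push(bid)  # transient data: nothing is stored unless it matches
print(alerts)
```

In the HADP style the same data would first be stored and a user would later run a one-off query against it.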
Application Domains • Online Auctions • Network Traffic Management • Habitat Monitoring • Military Logistics • Immersive Environments • Road Traffic Monitoring • System Monitoring • Smart Energy Grid
Aurora Overview
The Topic • Aurora • The prototype • DBMS / SPE / DSMS • UI • The query language • The project • The authors
The Authors • M.I.T., Department of EECS and Laboratory of Computer Science • Michael Stonebraker • Brandeis University, Department of Computer Science • Daniel J. Abadi • Mitch Cherniack • Brown University, Department of Computer Science • Don Carney • Uğur Çetintemel • Christian Convey • Sangdon Lee • Nesime Tatbul • Stan Zdonik
Talk Overview • Stream Processing Engines • SQuAl • Runtime • Related work
Aurora SQuAl (Stream Query Algebra)
SQuAl Overview • Connection Points • Models • Continuous Query • View • Ad-hoc Query • Operators • Order-agnostic • Order-sensitive
SQuAl Operators • Order-agnostic • Filter • Map • Union • Order-sensitive • BSort • Aggregate • Join • Resample • Quirks!
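A rough sketch of how some of these operators behave, written as Python generators. This is not Aurora's implementation: the function names (`filter_box`, `map_box`, `union_box`, `bsort_box`) and the dict-based tuples are assumptions made for illustration. Filter is reduced to a single predicate, Union uses a round-robin interleave as one possible merge order, and BSort is modelled as a bounded-buffer approximate sort.

```python
import heapq
from itertools import count, zip_longest

_MISSING = object()  # sentinel marking an exhausted input stream


def filter_box(pred, stream):
    """Order-agnostic: pass through only the tuples that satisfy pred."""
    for t in stream:
        if pred(t):
            yield t


def map_box(fn, stream):
    """Order-agnostic: apply fn to every tuple."""
    for t in stream:
        yield fn(t)


def union_box(*streams):
    """Order-agnostic: merge several streams (round-robin here, by assumption)."""
    for group in zip_longest(*streams, fillvalue=_MISSING):
        for t in group:
            if t is not _MISSING:
                yield t


def bsort_box(stream, slack, key=lambda t: t):
    """Order-sensitive: approximate sort with a buffer of slack + 1 tuples.

    Whenever the buffer holds more than `slack` tuples, the smallest one is
    emitted; tuples that arrive later than `slack` positions late stay unsorted.
    """
    heap, seq = [], count()
    for t in stream:
        heapq.heappush(heap, (key(t), next(seq), t))  # seq breaks ties
        if len(heap) > slack:
            yield heapq.heappop(heap)[2]
    while heap:  # flush the buffer at end of input
        yield heapq.heappop(heap)[2]


a = [{"id": 1, "temp": 10}, {"id": 2, "temp": 30}]
b = [{"id": 3, "temp": 25}]
hot = map_box(lambda t: {"id": t["id"], "hot": True},
              filter_box(lambda t: t["temp"] > 20, union_box(a, b)))
out = list(hot)
print(out)
print(list(bsort_box([3, 1, 2, 5, 4], slack=1)))
```

Filter, Map and Union never look at arrival order, which is why they need no buffer; BSort needs one because it has to hold tuples back before it can decide what to emit.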
Resample (Ordered) • Based on RRDTool’s philosophy? • Paper: • Simple interpolation • Use The Force, Read The Source: • Average • Count • Sum • Max • Min • LastVal
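The "simple interpolation" attributed to the paper above can be sketched as plain linear interpolation between neighbouring samples. The function `resample_linear` and its `(timestamp, value)` input format are hypothetical, and skipping targets outside the sampled range is an assumption; the aggregate variants found in the source (Average, Count, Sum, Max, Min, LastVal) are not covered here.

```python
from bisect import bisect_left


def resample_linear(samples, targets):
    """Interpolate values at the target timestamps.

    samples: list of (timestamp, value) pairs sorted by timestamp.
    targets: timestamps at which a value is wanted.
    Targets outside the sampled range are skipped (assumption).
    """
    if not samples:
        return []
    times = [ts for ts, _ in samples]
    out = []
    for ts in targets:
        if ts < times[0] or ts > times[-1]:
            continue  # no bracketing pair of samples
        j = bisect_left(times, ts)
        if times[j] == ts:
            out.append((ts, float(samples[j][1])))  # exact hit
            continue
        (t0, v0), (t1, v1) = samples[j - 1], samples[j]
        frac = (ts - t0) / (t1 - t0)
        out.append((ts, v0 + frac * (v1 - v0)))
    return out


readings = [(0, 0.0), (10, 100.0), (20, 50.0)]
print(resample_linear(readings, [5, 10, 15, 25]))
```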
Aurora Runtime
Query Optimization • Dynamic Continuous Query Optimization • Inserting projections • Combining boxes • Reordering boxes • Ad-hoc query optimization
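To make "reordering boxes" concrete, here is a minimal sketch assuming each box has a known per-tuple cost and selectivity, and that the boxes commute. The `Box` dataclass, `chain_cost` and `reorder` are hypothetical names; the ranking used, (1 - selectivity) / cost in descending order, is a common heuristic for chains of filter-like operators and is shown as an assumption rather than as Aurora's exact rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    name: str
    cost: float         # per-tuple processing cost, must be > 0 (hypothetical units)
    selectivity: float  # expected output tuples per input tuple


def chain_cost(boxes):
    """Expected cost per input tuple of running the boxes in the given order."""
    total, rate = 0.0, 1.0
    for b in boxes:
        total += rate * b.cost   # only surviving tuples reach this box
        rate *= b.selectivity
    return total


def reorder(boxes):
    """Put cheap, highly selective boxes first (assumes the boxes commute)."""
    return sorted(boxes, key=lambda b: (1.0 - b.selectivity) / b.cost,
                  reverse=True)


plan = [Box("expensive_check", 10.0, 0.9), Box("cheap_check", 1.0, 0.1)]
better = reorder(plan)
print([b.name for b in better], chain_cost(plan), chain_cost(better))
```

Because the optimization is dynamic, the cost and selectivity figures would be statistics gathered at run time, so the ranking can change as the stream changes.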
Real-time Scheduling • Timestamped Tuples • Train scheduling • Interbox nonlinearities • Intrabox nonlinearities • Superboxes • Introspection • Static • Run-time
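Train scheduling can be read as batching: instead of making one scheduling decision per tuple, the scheduler lets a box process a whole "train" of queued tuples at once. The sketch below assumes that reading; `QueuedBox`, `run_train` and `drain` are hypothetical names, and pushing a train through several boxes in one go (superboxes) is not modelled.

```python
from collections import deque


class QueuedBox:
    """A box with an input queue of pending tuples."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn
        self.queue = deque()


def run_train(box, max_train):
    """One scheduling decision: process up to max_train queued tuples."""
    n = min(max_train, len(box.queue))
    return [box.fn(box.queue.popleft()) for _ in range(n)]


def drain(box, max_train):
    """Run the box until its queue is empty.

    Returns the outputs and the number of scheduling decisions taken.
    """
    if max_train < 1:
        raise ValueError("max_train must be at least 1")
    outputs, decisions = [], 0
    while box.queue:
        outputs.extend(run_train(box, max_train))
        decisions += 1
    return outputs, decisions


def make_box():
    box = QueuedBox("double", lambda x: 2 * x)
    box.queue.extend(range(10))
    return box


one_by_one = drain(make_box(), max_train=1)
in_trains = drain(make_box(), max_train=4)
print(one_by_one[1], in_trains[1])
```

Both runs produce the same outputs; only the number of scheduling decisions differs, which is where the per-tuple overhead is saved.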
Handling overload • QoS specifications • Response times • Tuple drops • Values produced • Load Shedding • Not Implemented at the time
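Since load shedding was not implemented at the time, the following is only a sketch of the basic idea: when tuples arrive faster than the system can process them, drop a fraction so the rest keep up. The functions `drop_fraction` and `shed` are hypothetical, the drops are spread evenly rather than chosen from QoS specifications, and a real policy would use those specifications to decide which tuples are cheapest to lose.

```python
def drop_fraction(arrival_rate, capacity):
    """Fraction of tuples to drop so the remaining load fits the capacity."""
    if arrival_rate <= capacity:
        return 0.0
    return 1.0 - capacity / arrival_rate


def shed(stream, fraction):
    """Drop roughly `fraction` of the tuples, spread evenly over the stream."""
    credit = 0.0
    for t in stream:
        credit += fraction
        if credit >= 1.0:
            credit -= 1.0
            continue  # this tuple is shed
        yield t


fraction = drop_fraction(arrival_rate=200, capacity=100)
kept = list(shed(range(10), fraction))
print(fraction, kept)
```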
Aurora Related work
Related work • STREAM • Stanford University, 2000-2006 • Telegraph • UC Berkeley, 2000-2007? • SASE • UC Berkeley / UMass Amherst, 2006-2008? • Cayuga • Cornell University, 2005-2007? • PIPES • University of Marburg, 2003-2007? • NiagaraCQ • University of Wisconsin-Madison, 1999-2002
Complex Event Processing Today • Oracle • Oracle CEP • Microsoft • MS SQL Server StreamInsight • Open Source • openPDC • Aleri • Coral8 • Truviso • StreamBase • Aurora’s Grandchild • IBM • SPADE • Active Middleware Technology