Parasol Seminar Fall 2009 | Parasol Laboratory Intranet


Parasol Seminar Fall 2009

Friday November 20, 2009. 4:00 pm
Room 302 HRBB


Sketching asynchronous data streams over sliding windows

Bojian Xu
Department of Computer Science and Engineering, Texas A&M University


Abstract

Much real-world data naturally arrives as streams. Examples include network traffic at a router and the sequence of accesses to a large database. Such streaming data must be monitored online for purposes such as anomaly detection, load balancing, and even supporting business decisions. However, because of the large size of these streams, conventional data processing methods, such as storing the data in a database and issuing offline SQL queries, are not feasible.

In this talk, I will introduce data stream processing and then propose a new model, asynchronous data streams, motivated by applications involving network data. I will present a sampling technique for sketching the recent data elements of an asynchronous data stream. The resulting small-space sketch returns provably error-bounded estimates for two basic aggregates over the relevant stream elements: sum and median. I will conclude the talk with a quick overview of follow-up work on more general time-decayed asynchronous data stream processing, as well as related open problems.


Biography

Bojian Xu received his B.E. in Computer Science and Engineering from Zhejiang University, China, in 2000. He worked for China Mobile Communications Corporation from 2000 to 2004. After spending the Fall 2004 semester as a master's student in the Computer Science Department of the University of Alabama, he joined the Department of Electrical and Computer Engineering at Iowa State University, where he will be graduating with a Ph.D. in Computer Engineering in December 2009. He is currently a Senior Research Associate in the Department of Computer Science and Engineering at Texas A&M University, working with Professor Jeffrey Scott Vitter. His research interests are in developing algorithms and systems for managing large data sets. He has worked on managing distributed massive streaming data under memory and energy constraints, and is now more focused on compressed data structures for indexing massive data sets.


