Project Details
Distributed querying the correlations of streaming time series in the cloud
Laboratory: LSIR | Semester / Master | Proposal
The volume of time-series data is increasing dramatically, driven by an explosion in the number of devices that produce it. The primary contributors are mobile phones, mobile sensors, servers, smart meters, and smart home appliances. Modern hardware lets these devices generate streaming time-series data at very high rates. This creates a natural demand for systems that can accommodate ever more time-series data and efficiently answer important and interesting queries over it.
One such important problem in large-scale time-series processing is the correlation threshold query: finding all pairs of time series whose correlation exceeds a given threshold. Some existing work addresses this query over static time-series data in a centralized or batch fashion. However, as the amount of streaming time series produced by various devices increases exponentially, the traditional approach of processing such queries on a stand-alone computer is clearly unsuitable.
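To make the query concrete, here is a minimal centralized sketch in Python. It assumes Pearson correlation as the correlation measure and equal-length series held in memory. The function names `pearson` and `correlated_pairs` are hypothetical and chosen for illustration only. It is the brute-force baseline that compares every pair, not the distributed method this project aims to develop.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed measure)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for constant series
    return cov / sqrt(var_x * var_y)


def correlated_pairs(series, threshold):
    """Return (name_a, name_b, r) for every pair with r >= threshold.

    `series` maps a series name to a list of values. All pairs are
    compared, so the cost grows quadratically with the number of series.
    """
    result = []
    for (name_a, a), (name_b, b) in combinations(sorted(series.items()), 2):
        r = pearson(a, b)
        if r >= threshold:
            result.append((name_a, name_b, r))
    return result


if __name__ == "__main__":
    data = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 3, 2, 1]}
    print(correlated_pairs(data, 0.9))
```

The quadratic number of pair comparisons is what makes a single machine a bottleneck once the number of streams grows.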
In this project, we aim to explore how to process the correlation threshold query over sliding windows of streaming time series in the cloud. We rely on Storm, a recent distributed stream processing system. On the algorithmic side, to let the system adapt to queries over sliding windows of different lengths, we will apply dimensionality reduction techniques (e.g., the discrete Fourier transform, DFT). Moreover, indexing techniques could be applied to reduce the search space.
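As an illustration of how DFT-based dimensionality reduction can prune candidate pairs, the following Python sketch keeps only the first k DFT coefficients of each normalized window. It relies on two standard facts: for windows normalized to zero mean and unit length, the correlation equals 1 - d²/2, where d is their Euclidean distance; and the unitary DFT preserves that distance, so a distance computed from a few coefficients can only underestimate d. The result is an upper bound on the correlation that never discards a truly correlated pair. All names (`SlidingWindow`, `dft_sketch`, `correlation_upper_bound`) are hypothetical, and this is one possible technique, not necessarily the one the project will adopt.

```python
import cmath
from collections import deque
from math import sqrt


class SlidingWindow:
    """Keeps the most recent `length` values of one stream."""

    def __init__(self, length):
        self.values = deque(maxlen=length)

    def push(self, value):
        self.values.append(value)  # the oldest value is dropped when full

    def is_full(self):
        return len(self.values) == self.values.maxlen


def unit_normalize(window):
    """Shift to zero mean and scale to unit Euclidean length."""
    n = len(window)
    mean = sum(window) / n
    centered = [v - mean for v in window]
    norm = sqrt(sum(c * c for c in centered))
    if norm == 0:
        return [0.0] * n  # constant window: no variation to correlate
    return [c / norm for c in centered]


def dft_sketch(window, k):
    """Coefficients 1..k of the unitary DFT of the normalized window.

    Requires k < len(window) / 2. Coefficient 0 is skipped because it is
    always zero after mean removal.
    """
    x = unit_normalize(window)
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n)) / sqrt(n)
        for f in range(1, k + 1)
    ]


def correlation_upper_bound(sketch_a, sketch_b):
    """Upper bound on the correlation, computed from two sketches."""
    # For real input, coefficients f and n - f are complex conjugates,
    # so each kept coefficient contributes twice to the squared distance.
    partial_d2 = 2 * sum(abs(a - b) ** 2 for a, b in zip(sketch_a, sketch_b))
    return 1 - partial_d2 / 2
```

A pair whose upper bound falls below the query threshold can be skipped without computing the exact correlation; the remaining pairs still need an exact check. This sketch recomputes the coefficients from scratch for every window, whereas a streaming implementation would typically update them incrementally as the window slides.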
- Motivation to work on a research-oriented project.
- Familiarity with basic query processing and optimization techniques in the database area.
- Programming skills in Java; experience with MapReduce and HBase is a plus.
Contacts
In case of any questions, please drop us an email or come to our offices: