Datasets

We maintain the following datasets, which are used in our research. The datasets are either produced by our lab or provided by collaborating data providers.

Energy Consumption Dataset :

A 10-year French national hourly energy demand dataset (2003-2012). The dataset has been curated and completed with public holiday and weather information (e.g., temperature, humidity). We also provide the raw weather data for the 8 biggest cities in France.

Crisis Tweets Collections :

  • Crisis Tweets Collections [More]

Finding timely and useful information during crises is critical for making potentially life-saving decisions. Social media is increasingly being used to broadcast useful information during such situations.

CrisisLex.org is a repository of crisis-related social media data and tools. Currently it includes collections of crisis tweets and a lexicon of crisis terms. It also includes tools to help you create these collections and lexicons.

Social network :

  • Tweets containing URLs collected over a period of three weeks [More]

The dataset records the propagation of URLs through the social network of Twitter, a popular microblogging site. We track 15 million URLs exchanged among 2.7 million users over a 300-hour period. Data analysis uncovers several statistical regularities in the user activity, the social graph, the structure of the URL cascades, and the communication dynamics.

  • Tweets related to LONDON collected over a period of three months [More]

This collection contains 12 million tweets related to LONDON, gathered over four months. It can serve as an important basis for fusing social and sensor data in the cloud.

Energy consumption :

The REDD dataset, published by MIT, contains several weeks of power data for 6 different homes, as well as high-frequency current/voltage data for the main power supply of two of these homes.

Sensor Data :

Data collected from 54 sensors deployed in the Intel Berkeley Research Lab between February 28th and April 5th, 2004.