Distributed Information Systems Laboratory LSIR

Infer user expertise from Online Social Network content

Project Details


Laboratory: LSIR
Semester / Master
Completed


Online Social Networks (OSNs) allow users to generate and share large amounts of content with their friends. We also observe that different groups of friends tend to share content about different topics (e.g., “traveling” vs. “computer science”). Hence, by analyzing the production, flow, and consumption of information inside OSNs, one could identify users' areas of interest from the content (e.g., tweets, comments, posts) that they produce or share.

The goal of the project is to identify the characteristic usage patterns of an active OSN user who is also a subject-matter expert. For our analysis we will use a large Twitter dataset, since Twitter is one of the most popular OSNs. Twitter users post and share (i.e., tweet/re-tweet) information about any topic, and follow others in order to receive their tweets. Based on the structure of the social graph and the activity in the network (e.g., tweet content, re-tweet patterns), we want to infer the area and degree of expertise of each Twitter user.
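To make the goal concrete, here is a minimal sketch of one possible baseline. It is not the project's actual method. It scores each user per topic by counting topic keywords in their tweets and weighting each tweet by its re-tweet count. The keyword lists, the `(user, text, retweet_count)` tuple layout, the function names, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical topic vocabularies; a real project would learn these from data.
TOPIC_KEYWORDS = {
    "traveling": {"flight", "hotel", "trip"},
    "computer science": {"algorithm", "compiler", "hadoop"},
}


def expertise_scores(tweets):
    """Return {user: {topic: score}} from (user, text, retweet_count) tuples."""
    scores = defaultdict(lambda: defaultdict(float))
    for user, text, retweets in tweets:
        words = set(text.lower().split())
        for topic, keywords in TOPIC_KEYWORDS.items():
            hits = len(words & keywords)
            if hits:
                # Assumption: re-tweets by others signal that the content is valued.
                scores[user][topic] += hits * (1 + retweets)
    return {user: dict(topics) for user, topics in scores.items()}


def top_topic(scores, user):
    """Return the highest-scoring topic for a user, or None if there is none."""
    topics = scores.get(user, {})
    return max(topics, key=topics.get) if topics else None


# Made-up sample data, for illustration only.
SAMPLE = [
    ("alice", "new compiler algorithm benchmarks", 4),
    ("alice", "booked a hotel", 0),
    ("bob", "cheap flight and hotel for our trip", 1),
]

if __name__ == "__main__":
    result = expertise_scores(SAMPLE)
    for user in sorted(result):
        print(user, top_topic(result, user), result[user])
```

A baseline like this ignores the follower graph entirely. Using the graph structure (for example, who re-tweets whom within a topic) would be a natural next step.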


Required

  • Strong fluency in Java and/or Python
  • Familiarity with SQL and basic analytical queries

Preferred, but not required

  • Exposure to MapReduce/Hadoop
  • Familiarity with machine learning tools (e.g., Mahout, WEKA)

Contact: Michele Catasta