Mining, ranking and social media exploration of news events

Date(s) - 17/06/2016
This talk mainly addresses the problem of ranking news events on a daily basis in large heterogeneous news corpora, an essential step towards reducing information overload. News ranking has been addressed in the literature before, but with individual news articles as the unit of ranking. We believe that a cluster of news articles representing an event is a better unit of ranking, as it provides a wide variety of cues such as popularity, source diversity and authority. Building on the observation that events are sometimes part of long-running topics, we devise features which encode the historical importance of an event. Our main contribution in this work is to blend these features into a novel technique for ranking news events. To this end, we propose event mining and feature generation approaches which provide better estimates of event importance. Finally, we conduct an extensive evaluation of our approaches on two large real-world news corpora, each spanning more than a year and containing up to tens of thousands of news articles per day, using a clean, human-curated ground truth from the Wikipedia Current Events Portal. Experimental comparison with a state-of-the-art news ranking technique based on textual features demonstrates the effectiveness of our approach.
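
To make the idea of event-level ranking concrete, here is a minimal sketch of how cues such as popularity, source diversity, authority and historical importance might be blended into one score per event. It is not the method presented in the talk: the `Event` schema, the `AUTHORITY` scores, the `WEIGHTS` and the simple weighted sum are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One day's cluster of articles about the same event (hypothetical schema)."""
    name: str
    sources: list[str]        # source of each article in the cluster
    past_topic_events: int    # earlier events in the same long-running topic


# Invented authority scores per source, in [0, 1].
AUTHORITY = {"wire_a": 0.9, "daily_b": 0.6, "blog_c": 0.2}

# Invented blend weights; the talk does not state how features are combined.
WEIGHTS = {"popularity": 0.4, "diversity": 0.2, "authority": 0.2, "history": 0.2}


def event_features(event, max_articles, all_sources, max_history):
    """Map an event to four features, each scaled to [0, 1]."""
    n = len(event.sources)
    return {
        # Cluster size relative to the largest cluster of the day.
        "popularity": n / max_articles if max_articles else 0.0,
        # Share of the day's distinct sources that covered this event.
        "diversity": len(set(event.sources)) / all_sources if all_sources else 0.0,
        # Mean authority of the reporting sources (unknown sources count as 0).
        "authority": sum(AUTHORITY.get(s, 0.0) for s in event.sources) / n if n else 0.0,
        # Length of the topic's history relative to the longest one seen today.
        "history": event.past_topic_events / max_history if max_history else 0.0,
    }


def rank_events(events):
    """Return the day's events sorted by blended score, highest first."""
    if not events:
        return []
    max_articles = max(len(e.sources) for e in events)
    all_sources = len({s for e in events for s in e.sources})
    max_history = max(e.past_topic_events for e in events)

    def score(e):
        f = event_features(e, max_articles, all_sources, max_history)
        return sum(WEIGHTS[k] * f[k] for k in WEIGHTS)

    return sorted(events, key=score, reverse=True)
```

In this sketch an event covered by many articles from several authoritative sources, and belonging to a topic with a long history, rises to the top of the daily ranking. A learned ranking model could replace the fixed weights.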

Furthermore, the talk will address the problem of exploring comments on news events from social media such as Reddit. These comments are often noisy and sparse, so identifying sub-topics within them in order to explore social media is a challenge. In this work, we develop an effective way to distill sub-topics from all the comments related to a textual query and apply diversification techniques to select comments.
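
The abstract does not say which diversification technique is applied. As one possible illustration, the sketch below uses Maximal Marginal Relevance (MMR), a standard greedy method, with Jaccard word overlap as a stand-in similarity measure. The function names, the `trade_off` parameter and the whitespace tokenizer are assumptions of this sketch, and the sub-topic distillation step is not modelled.

```python
def tokens(text):
    """Lowercased whitespace tokens as a set (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mmr_select(query, comments, k, trade_off=0.7):
    """Greedily pick up to k comments that are relevant to the query
    but dissimilar to the comments already picked (MMR)."""
    q = tokens(query)
    toks = [tokens(c) for c in comments]
    selected = []                       # indices of chosen comments, in pick order
    candidates = list(range(len(comments)))

    while candidates and len(selected) < k:
        def mmr(i):
            relevance = jaccard(q, toks[i])
            # Similarity to the closest already-selected comment.
            redundancy = max((jaccard(toks[i], toks[j]) for j in selected), default=0.0)
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)

    return [comments[i] for i in selected]
```

With `trade_off=1.0` the selection reduces to plain relevance ranking. Lower values penalise near-duplicate comments more strongly, which is the effect a diversification step is meant to have.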

Speaker: Vinay Setty

Short bio: Vinay Setty is a postdoctoral researcher in the Databases and Information Systems group, headed by Prof. Gerhard Weikum, at the Max Planck Institute for Informatics, Saarbruecken, Germany. Before that, he received a PhD from the University of Oslo, Norway, under the supervision of Prof. Roman Vitenberg and Prof. Maarten van Steen, on the topic of “Design, Analysis and resource provisioning of scalable Publish/Subscribe for social interaction applications”. Vinay also worked as a visiting scholar at Spotify in Sweden, analyzing their pub/sub infrastructure and workload, and has recently served as a PC member for the DEBS and ECML PKDD conferences. Before his PhD, Vinay obtained his Master’s degree from Saarland University, where he wrote his Master’s thesis on exploring interesting time points in versioned documents, also in Prof. Weikum’s group.