Scientific Paper: “Boosted seed oversampling for local community ranking”

Krasanakis, E., Schinas, E., Papadopoulos, S., Kompatsiaris, Y., & Symeonidis, A. (2019). Boosted seed oversampling for local community ranking. Information Processing & Management, 102053. https://doi.org/10.1016/j.ipm.2019.06.002

Abstract

Local community detection is an emerging topic in network analysis that aims to detect well-connected communities encompassing sets of a priori known seed nodes. In this work, we explore the similar problem of ranking network nodes based on their relevance to the communities characterized by seed nodes. However, seed nodes may not be central enough or numerous enough to produce high-quality ranks. To solve this problem, we introduce a methodology we call seed oversampling, which first runs a node ranking algorithm to discover more nodes that belong to the community and then reruns the same ranking algorithm with the expanded set of seed nodes. We formally discuss why this process improves the quality of calculated community ranks if the original set of seed nodes is small, and we introduce a boosting scheme that iteratively repeats seed oversampling to further improve rank quality when certain ranking algorithm properties are met. Finally, we demonstrate the effectiveness of our methods in improving community relevance ranks given only a few random seed nodes of real-world network communities. In our experiments, boosted and simple seed oversampling yielded better rank quality than the previous neighborhood inflation heuristic, which adds the neighborhoods of original seed nodes to seeds.
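
The two-pass idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes personalized PageRank as the node ranking algorithm, and it assumes that the expanded seed set consists of every node ranked at least as high as the lowest-ranked original seed. The function names, the damping factor, and the toy graph are hypothetical and chosen only for illustration. The neighborhood inflation baseline mentioned in the abstract is included for comparison; the boosting scheme is not covered.

```python
def personalized_pagerank(adj, seeds, alpha=0.85, iters=100):
    """Power iteration for personalized PageRank on an undirected graph.

    adj maps each node to the set of its neighbors. The restart
    distribution is uniform over the seed nodes.
    """
    seeds = set(seeds)
    restart = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in adj}
    ranks = dict(restart)
    for _ in range(iters):
        # Every node keeps its restart share, then receives rank from neighbors.
        nxt = {v: (1.0 - alpha) * restart[v] for v in adj}
        for u, neighbors in adj.items():
            if neighbors:
                share = alpha * ranks[u] / len(neighbors)
                for v in neighbors:
                    nxt[v] += share
        ranks = nxt
    return ranks


def seed_oversampling(adj, seeds, rank=personalized_pagerank):
    """Rank once, expand the seed set, then rank again with the new seeds."""
    first_pass = rank(adj, seeds)
    # Assumed selection rule: keep every node ranked at least as high
    # as the lowest-ranked original seed.
    threshold = min(first_pass[s] for s in seeds)
    new_seeds = {v for v, r in first_pass.items() if r >= threshold}
    return rank(adj, new_seeds), new_seeds


def neighborhood_inflation(adj, seeds, rank=personalized_pagerank):
    """Baseline heuristic: add the neighbors of each seed to the seed set."""
    inflated = set(seeds)
    for s in seeds:
        inflated |= adj[s]
    return rank(adj, inflated), inflated


# Toy graph (hypothetical): one hub with four leaves; the only seed is a leaf.
graph = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
    "d": {"hub"},
}
ranks, seeds_used = seed_oversampling(graph, {"a"})
baseline_ranks, baseline_seeds = neighborhood_inflation(graph, {"a"})
```

In this toy graph the single seed is peripheral, so the first pass ranks the hub above it and the assumed selection rule adds the hub to the seed set before the second pass. When the original seeds are already the highest-ranked nodes, the same rule adds nothing, which is why the choice of selection rule matters in practice.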