Discovering socially similar users in social media datasets based on their socially important locations

ÇELİK M., Dokuz A. S.

INFORMATION PROCESSING & MANAGEMENT, vol.54, no.6, pp.1154-1168, 2018 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 54 Issue: 6
  • Publication Date: 2018
  • Doi Number: 10.1016/j.ipm.2018.08.004
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus
  • Page Numbers: pp.1154-1168
  • Keywords: Spatial social media mining, Socially similar users, Mining user similarity, Socially important locations, Twitter, NETWORKS, TWITTER, PATTERNS
  • Erciyes University Affiliated: Yes


Socially similar social media users can be defined as users whose frequently visited locations in their social media histories are similar. Discovering socially similar social media users is important for several applications, such as community detection, friendship analysis, location recommendation, urban planning, and anomalous user and behavior detection. Discovering socially similar users is challenging due to dataset size and dimensionality, spam behaviors of social media users, spatial and temporal aspects of social media datasets, and location sparseness in social media datasets. In the literature, several studies have been conducted to discover similar social media users from social media datasets using spatial and temporal information. However, most of these studies rely on trajectory pattern mining methods or take into account the semantic information of social media datasets. Only a limited number of studies focus on discovering similar users based on their social media location histories. In this study, to discover socially similar users, the frequently visited, or socially important, locations of social media users are taken into account instead of all locations that users visited. A new interest measure based on Levenshtein distance was proposed to quantify user similarity in terms of socially important locations, and two algorithms were developed using the proposed method and interest measure. The algorithms were experimentally evaluated on a real-life Twitter dataset. The results show that the proposed algorithms could successfully discover similar social media users based on their socially important locations.
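The abstract does not give the definition of the interest measure, so the following is only a minimal sketch of how a Levenshtein-based similarity between two users' location lists might look. It assumes that each user's socially important locations are represented as an ordered list of location identifiers and that the edit distance is normalized by the longer list's length; both choices, along with the function names `levenshtein` and `location_similarity`, are illustrative assumptions and not the authors' method.

```python
from typing import Sequence


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Classic edit distance between two sequences of items."""
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def location_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Hypothetical similarity in [0, 1]: 1 minus the normalized edit distance.

    Normalizing by the longer list is an assumption made for this sketch.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty location lists are treated as identical
    return 1.0 - levenshtein(a, b) / longest


# Hypothetical location identifiers for two users.
user_a = ["home", "cafe", "office"]
user_b = ["home", "gym", "office"]
score = location_similarity(user_a, user_b)
```

In this sketch, two users whose lists differ in one of three locations receive a score of about 0.67, and identical lists receive 1.0. How locations are ordered and which ones count as socially important would have to follow the paper's own definitions.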