ECIS 2015 Completed Research Papers

Privacy on Reddit? Towards Large-scale User Classification

Benjamin Fabian, Humboldt University Berlin
Annika Baumann, Humboldt University Berlin
Marian Keil, Humboldt University Berlin

DOI

10.18151/7217310

Abstract

Reddit is a social news website that aims to protect user privacy by encouraging its users to adopt pseudonyms and by refraining from any kind of personal data collection. However, users are often unaware that a great deal of information about them can be gathered indirectly by analyzing their contributions and behaviour on the site. To investigate the feasibility of large-scale user classification with respect to the attributes social gender and citizenship, this article provides and evaluates several data mining techniques. First, a large text corpus is collected from Reddit, and annotations are derived using lexical rules. Then, a discriminative approach to classification using support vector machines is undertaken and extended by using topics generated by latent Dirichlet allocation as features. Based on supervised latent Dirichlet allocation, a new generative model is drafted and implemented that captures Reddit’s specific structure of organizing information exchange. Finally, the presented techniques for user classification are evaluated and compared in terms of classification performance and time efficiency. Our results indicate that large-scale user classification on Reddit is feasible, which may raise privacy concerns among its community.
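
The annotation step mentioned in the abstract (deriving labels from lexical rules) can be illustrated with a minimal sketch. The abstract does not list the actual rules, so the patterns, the label names, and the function `annotate_user` below are hypothetical stand-ins: they assume that users occasionally disclose an attribute in phrases such as "I am a guy", and that a label is kept only when it is unambiguous.

```python
import re
from collections import Counter

# Hypothetical self-disclosure patterns; the paper's real rules are not
# given in the abstract and may differ substantially.
RULES = {
    "gender": {
        "female": re.compile(r"\bi(?: am|'m) a (?:woman|girl)\b", re.IGNORECASE),
        "male": re.compile(r"\bi(?: am|'m) a (?:man|guy)\b", re.IGNORECASE),
    },
    "citizenship": {
        "US": re.compile(r"\bi(?: am|'m) (?:an )?american\b", re.IGNORECASE),
        "DE": re.compile(r"\bi(?: am|'m) (?:a )?german\b", re.IGNORECASE),
    },
}


def annotate_user(comments):
    """Return {attribute: label} for every attribute with an unambiguous label."""
    labels = {}
    for attribute, patterns in RULES.items():
        votes = Counter()
        # Count, per label, how many comments contain a matching phrase.
        for text in comments:
            for label, pattern in patterns.items():
                if pattern.search(text):
                    votes[label] += 1
        ranked = votes.most_common(2)
        # Keep a label only if it is the sole match or strictly beats the runner-up.
        if ranked and (len(ranked) == 1 or ranked[0][1] > ranked[1][1]):
            labels[attribute] = ranked[0][0]
    return labels
```

Under these assumptions, a user whose comments contain "I'm a guy" and "I am German" would be annotated as male with citizenship DE, while a user with contradictory statements receives no label for that attribute. Labels obtained this way could then serve as training targets for the classifiers the abstract describes.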

Recommended Citation

Fabian, Benjamin; Baumann, Annika; and Keil, Marian, "Privacy on Reddit? Towards Large-scale User Classification" (2015). ECIS 2015 Completed Research Papers. Paper 43.
ISBN 978-3-00-050284-2
https://aisel.aisnet.org/ecis2015_cr/43
