Location

Level 0, Open Space, Owen G. Glenn Building

Start Date

12-15-2014

Description

Mobile application development is an emerging, lucrative, and fast-growing market. As the number of apps in the repositories steadily grows, providers will inevitably need to refine the existing hierarchy of categories used to organize the apps. In this paper, we present a method to bootstrap the categorization process via topic modeling. We apply Latent Dirichlet Allocation (LDA) to the textual descriptions of iTunes apps in order to identify recurrent topics in the collection. We evaluate and discuss the results obtained from training the model on a set of almost 600,000 English-language app descriptions. Our results demonstrate that automated categorization via LDA-based topic modeling is a promising approach that can help to structure, analyze, and manage the content of app repositories. The topics produced complement the original iTunes categories, making them more concrete and extending them by providing insights into the underlying category content.
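
As a rough illustration of the general idea, the following minimal sketch fits LDA to a handful of app descriptions. It is not the implementation used in the paper: the abstract does not state the inference procedure or toolkit, so collapsed Gibbs sampling is assumed here, and the six descriptions, the stopword list, the function names, and the hyperparameters (`alpha`, `beta`, two topics) are all made up for the example.

```python
import random
from collections import Counter

# Hypothetical stopword list and toy app descriptions (not from the paper's data).
STOPWORDS = {"the", "and", "your", "with", "for", "from", "against"}
DESCRIPTIONS = [
    "Track your workouts and calories with daily fitness goals",
    "Fitness tracker for running workouts and calories burned",
    "Daily workouts and fitness plans for weight goals",
    "Solve puzzle levels and beat the high score",
    "Addictive puzzle game with hundreds of levels",
    "Beat your friends score in this arcade puzzle game",
]

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords and very short words."""
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return [w for w in cleaned.split() if w not in STOPWORDS and len(w) > 2]

def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (assumed here, not taken from the paper).

    Returns per-document topic proportions and the top words of each topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]        # topic counts per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # token counts per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed topic proportions for each document.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Up to five most frequent words per topic, ignoring zero counts.
    top_words = [
        [w for w, c in counts.most_common() if c > 0][:5] for counts in topic_word
    ]
    return theta, top_words

docs = [tokenize(text) for text in DESCRIPTIONS]
theta, top_words = fit_lda(docs, num_topics=2)

for k, words in enumerate(top_words):
    print(f"topic {k}: {', '.join(words)}")
for text, proportions in zip(DESCRIPTIONS, theta):
    print([round(p, 2) for p in proportions], text)
```

Each description receives a distribution over topics, and each topic is summarized by its most frequent words. Reading those word lists against the store's existing categories is one plausible way to see how topics could complement or subdivide a category; at the scale of the roughly 600,000 descriptions mentioned above, a dedicated topic-modeling library would be the more practical choice.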

Dec 15th, 12:00 AM

Enriching iTunes App Store Categories via Topic Modeling
