Track 4: AI-Empowered IS Development

Topic Classification for Short Texts

Dan Claudiu Neagu, Babeș-Bolyai University
Andrei Bogdan Rus, Cicada Technologies
Mihai Grec, Cicada Technologies
Mihai Augustin Boroianu, Cicada Technologies
Gheorghe Cosmin Silaghi, Babeș-Bolyai University

Abstract

In the context of TV and social media surveillance, constructing models to automate topic identification of short texts is a key task. This paper formalizes topic classification as a top-K multinomial classification problem and constructs models worth considering for practical use. We describe the full data processing pipeline, covering dataset selection, text preprocessing, feature extraction, and model selection and learning, including hyperparameter optimization. When computing time and resources are limited, we show that a classical model such as an SVM performs as well as an advanced deep neural network while requiring less training time.

Recommended Citation

Neagu, D. C., Rus, A. B., Grec, M., Boroianu, M. A., & Silaghi, G. C. (2022). Topic Classification for Short Texts. In R. A. Buchmann, G. C. Silaghi, D. Bufnea, V. Niculescu, G. Czibula, C. Barry, M. Lang, H. Linger, & C. Schneider (Eds.), Information Systems Development: Artificial Intelligence for Information Systems Development and Operations (ISD2022 Proceedings). Cluj-Napoca, Romania: Risoprint. ISBN: 978-973-53-2917-4. https://doi.org/10.62036/ISD.2022.50

Paper Type

Full Paper

DOI

10.62036/ISD.2022.50

