Data Analytics for Business and Societal Challenges
Paper Number: 1178
Paper Type: Completed
Description
Natural Language Processing (NLP)-based machine learning receives continuous attention in Information Systems (IS) research and practice. Despite the success of deep learning models, NLP feature engineering still plays a vital role in contexts where little annotated data is available and where explainability is a precondition for productive deployment. However, NLP feature engineering is labor-intensive and time-consuming, and there is still limited shared knowledge about the distinctive characteristics of NLP features from an interdisciplinary perspective. To address this gap, we draw on a systematic literature review and develop a five-dimensional NLP feature taxonomy based on 133 unique features from 211 scientific studies. The taxonomy helps IS researchers and practitioners classify, compare, and evaluate their NLP studies. Moreover, we use cluster heat mapping analysis to derive three clusters and several white spots that provide further guidance for developing and designing new NLP solutions in IS.
Recommended Citation
Wambsganss, Thiemo; Engel, Christian; and Fromm, Hansjörg, "Improving Explainability and Accuracy through Feature Engineering: A Taxonomy of Features in NLP-based Machine Learning" (2021). ICIS 2021 Proceedings. 1.
https://aisel.aisnet.org/icis2021/data_analytics/data_analytics/1