Data Analytics for Business and Societal Challenges

Paper Number

1178

Paper Type

Completed

Description

Natural Language Processing (NLP)-based machine learning receives continuous attention in Information Systems (IS) research and practice. Despite the success of deep learning models, NLP feature engineering still plays a vital role in contexts where little annotated data is available and where explainability is a precondition for productive deployment. However, NLP feature engineering is labor-intensive and time-consuming, and there is still limited shared knowledge about the distinctive characteristics of NLP features from an interdisciplinary perspective. To address this gap, we draw on a systematic literature review and develop a five-dimensional NLP feature taxonomy based on 133 unique features from 211 scientific studies. The taxonomy helps IS researchers and practitioners classify, compare, and evaluate their NLP studies. Moreover, we use cluster heat mapping analysis to derive three clusters and several white spots that provide further guidance for designing and developing new NLP solutions in IS.

Comments

14-Data

Dec 12th, 12:00 AM

Improving Explainability and Accuracy through Feature Engineering: A Taxonomy of Features in NLP-based Machine Learning
