Paper Type
Complete
Description
In information technology support/helpdesk transcripts, most of the data of interest, such as the issue of concern, issue severity, context, and system status, is not provided in a structured form. Moreover, the special traits of these transcripts (product issue orientation, implicit background knowledge, and off-topic dialogue) call for a domain-specialized approach to knowledge extraction. Accordingly, this study analyzes the specific domain requirements and proposes a novel solution based on a Natural Language Processing (NLP) approach. At its core, the approach uses an adapted term frequency-inverse document frequency (TF-IDF) algorithm that adds a new parameter reflecting the term's priority in the text. Experimental results show that the proposed NLP-based solution performs reasonably well in topic categorization, with an accuracy of 92.8%. In keyword extraction, the proposed approach achieves an accuracy of 93.4% and outperforms the classic TF-IDF method, which signifies the importance of extracting and accommodating domain-specific knowledge.
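The abstract does not define how the priority parameter is computed or how it enters the score, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the priority is a multiplicative boost applied to terms in a hypothetical domain vocabulary; the names `priority_tfidf`, `priority`, `boost`, and `top_keywords`, the smoothed IDF formula, and the sample transcripts are all illustrative assumptions.

```python
import math
from collections import Counter

PUNCT = ".,!?;:()\"'"


def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (word.strip(PUNCT).lower() for word in text.split())
    return [t for t in tokens if t]


def priority_tfidf(docs, priority=None, boost=2.0):
    """Score each term as tf * idf * weight.

    ASSUMPTION: 'priority' is a hypothetical set of domain terms and
    'boost' a hypothetical multiplier; the paper's actual parameter
    may be defined differently.
    """
    priority = priority or set()
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        doc_scores = {}
        for term, count in counts.items():
            tf = count / total
            # Smoothed IDF (one common variant, chosen for this sketch).
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            weight = boost if term in priority else 1.0
            doc_scores[term] = tf * idf * weight
        scores.append(doc_scores)
    return scores


def top_keywords(doc_scores, k=3):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Made-up transcript snippets, for illustration only.
    docs = [
        "The printer is offline and the printer driver fails",
        "Thanks, have a nice day",
        "VPN is offline again",
    ]
    scored = priority_tfidf(docs, priority={"printer", "vpn"})
    for doc, doc_scores in zip(docs, scored):
        print(doc, "->", top_keywords(doc_scores))
```

With `boost=1.0` or an empty `priority` set, the sketch reduces to plain TF-IDF, which makes it easy to compare the two rankings on the same transcripts.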
Paper Number
1318
Recommended Citation
Zhao, Gary Yu; El-Gayar, Omar; and Tu, Cindy Zhiling, "A Knowledge Extraction Approach for IT Tech-support Transcripts" (2023). AMCIS 2023 Proceedings. 7.
https://aisel.aisnet.org/amcis2023/sig_odis/sig_odis/7
A Knowledge Extraction Approach for IT Tech-support Transcripts
SIG ODIS