Practitioner Track

Paper Type

Pract

Paper Number

1383

Description

Natural language processing (NLP) helps to extract data from digitized documents. An interesting use case arose with the International Financial Reporting Standard (IFRS) 16. Following design science research in information systems, the objective of this article is to lay out design guidelines for automating data extraction from physical leasing contracts by applying NLP. Taking a leading international technology group in the areas of specialty glass and glass-ceramics as our case study, we propose six guidelines: (1) The data format of the receiving information system (IS) is a sine qua non requirement of the project. Thus, set up the NLP process starting from the end of the project, that is, from the target data format. (2) Evaluate the machine readability of the input documents before preprocessing. List the most frequent extraction issues in a manual early in the process. (3) Cluster documents by structure and content beforehand. Then, apply a specifically trained machine learning (ML) algorithm to each cluster. (4) A trainer should guide the machine. Use recall and precision as measures. (5) Design an intuitive user interface that offers both parallel windows and a highlighting feature, so that users can quickly compare extracted data with the source even for complex contract documents. (6) Project iterations are worthwhile until a stable process is achieved.
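To make guideline (4) concrete, the following minimal sketch shows one way recall and precision could be computed at the field level for a single contract. It is an illustration only, not the method used in the case study: the function name `precision_recall`, the field names, and the sample values are all hypothetical, and a match is assumed to mean exact string equality with the value confirmed by the trainer.

```python
from typing import Dict, Tuple


def precision_recall(extracted: Dict[str, str], gold: Dict[str, str]) -> Tuple[float, float]:
    """Field-level precision and recall for one contract (hypothetical helper)."""
    # A true positive is an extracted field whose value equals the trainer's value.
    true_pos = sum(1 for field, value in extracted.items() if gold.get(field) == value)
    # Precision: share of extracted fields that are correct.
    precision = true_pos / len(extracted) if extracted else 0.0
    # Recall: share of the trainer's fields that were correctly extracted.
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical trainer-confirmed values for one leasing contract.
gold = {
    "lessor": "ACME Leasing",
    "start_date": "2019-01-01",
    "monthly_payment": "1200.00",
    "term_months": "36",
}
# Hypothetical NLP output: one wrong value, one missing field.
extracted = {
    "lessor": "ACME Leasing",
    "start_date": "2019-01-01",
    "monthly_payment": "1020.00",
}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up example, two of the three extracted fields are correct and two of the four expected fields are recovered, so low precision would point to wrong values while low recall would point to fields the extraction missed.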

Comments

23-Practice

Best Paper Nominee badge
 
Dec 14th, 12:00 AM

Towards Natural Language Processing: An Accounting Case Study

