Text Mining, Public Accounts History, Court of Accounts, Audit
Information systems that support the daily activities of the public sector generate large data sets. Because a large proportion of the data in these sets is text, Text Mining can play an important role in deriving potentially useful and previously unknown information. The overall goal of this paper is to evaluate the performance and quality of three text mining classification algorithms applied to detecting irregularities in public sector records. To evaluate the algorithms, a tool was designed and a case study was carried out at the Court of Accounts of Sergipe. The following performance and quality metrics were evaluated: mean execution time, accuracy, precision, coverage and F-measure. The results show that the Multinomial Naive Bayes algorithm using inverse document frequency was the best approach for finding evidence of travel reimbursement irregularities.
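As a rough illustration of the quality metrics named in the abstract, the sketch below computes accuracy, precision, coverage and F-measure for a binary classifier. It assumes that "coverage" corresponds to what is more commonly called recall and that the F-measure is the balanced F1 score; the abstract does not define either term. The function name `classification_metrics` and the example labels are hypothetical and are not taken from the paper or its tool.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, coverage (assumed to mean recall) and F-measure.

    y_true and y_pred are equal-length sequences of labels;
    `positive` marks the class of interest (e.g. "irregular").
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    coverage = tp / (tp + fn) if (tp + fn) else 0.0  # assumption: coverage == recall
    # Assumption: balanced F-measure (harmonic mean of precision and coverage).
    denom = precision + coverage
    f_measure = 2 * precision * coverage / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "coverage": coverage,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Made-up labels for illustration only: 1 = irregular record, 0 = regular record.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(classification_metrics(y_true, y_pred))
```

Mean execution time, the remaining metric in the abstract, would be measured separately by timing repeated runs of each classifier.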
Santos, Breno Santana; Colaço, Methanias Júnior; da Paixão, Bruno Cruz; Santos, Rafael M.; Nascimento, André Vinicius Rodrigues P; dos Santos, Hallan Cosmo; Filho, Wallace H. L.; and de Medeiros, Arquimedes S. L., "Comparing Text Mining Algorithms for Predicting Irregularities in Public Accounts" (2015). Proceedings of the XI Brazilian Symposium on Information Systems (SBSI 2015). 12.