Document Type


Publication Date



Text Mining, Public Accounts History, Court of Accounts, Audit


Information systems that support the daily activities of the public sector generate large data sets. Because a large proportion of the data in these sets is text, text mining can play an important role in deriving potentially useful and previously unknown information. The overall goal of this paper is to evaluate the performance and quality of three text mining classification algorithms applied to detecting irregularities in public sector records. To evaluate the algorithms, a tool was designed and a case study was carried out at the Court of Accounts of Sergipe. Performance and quality metrics were evaluated: mean execution time, accuracy, precision, coverage and F-measure. The results show that the multinomial naive Bayes algorithm using inverse document frequency was the best approach for finding evidence of travel reimbursement irregularities.
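
To make the named approach concrete, the following is a minimal sketch of a multinomial naive Bayes classifier over IDF-weighted term counts, together with the quality metrics listed in the abstract. It is not the tool built for the case study. The smoothed IDF formula, the Laplace smoothing constant `alpha`, the whitespace tokenizer, the toy English records, and the reading of "coverage" as recall are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not specified.
    return text.lower().split()


def idf_weights(docs):
    # Assumption: smoothed IDF, ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def train(docs, labels, alpha=1.0):
    # Multinomial naive Bayes fitted on tf * idf weights instead of raw counts.
    idf = idf_weights(docs)
    vocab = sorted(idf)
    weight = defaultdict(Counter)  # class -> term -> summed tf * idf
    for doc, y in zip(docs, labels):
        for t, tf in Counter(doc).items():
            weight[y][t] += tf * idf[t]
    model = {"idf": idf, "prior": {}, "loglik": {}}
    for y, n_y in Counter(labels).items():
        total = sum(weight[y].values()) + alpha * len(vocab)
        model["prior"][y] = math.log(n_y / len(labels))
        model["loglik"][y] = {
            t: math.log((weight[y][t] + alpha) / total) for t in vocab
        }
    return model


def predict(model, doc):
    # Score each class by log prior plus weighted log likelihoods; unseen terms are skipped.
    scores = {}
    for y, prior in model["prior"].items():
        score = prior
        for t, tf in Counter(doc).items():
            if t in model["idf"]:
                score += tf * model["idf"][t] * model["loglik"][y][t]
        scores[y] = score
    return max(scores, key=scores.get)


def evaluate(y_true, y_pred, positive):
    # Assumption: "coverage" is treated as recall for the positive class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    coverage = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + coverage
    f_measure = 2 * precision * coverage / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "coverage": coverage,
        "f_measure": f_measure,
    }


# Hypothetical records, invented for illustration only.
train_texts = [
    "travel reimbursement paid without ticket receipt",
    "travel reimbursement paid twice same trip",
    "purchase of office supplies with invoice",
    "payment of electricity bill with invoice",
]
train_labels = ["irregular", "irregular", "regular", "regular"]
test_texts = ["reimbursement paid without receipt", "office electricity invoice"]
test_labels = ["irregular", "regular"]

model = train([tokenize(t) for t in train_texts], train_labels)
predictions = [predict(model, tokenize(t)) for t in test_texts]
metrics = evaluate(test_labels, predictions, positive="irregular")
```

On this toy data the two test records are separated cleanly, so the example only shows how the pieces fit together; it says nothing about the results reported in the paper.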


This paper is in Portuguese (Análise Comparativa de Algoritmos de Mineração de Texto Aplicados a Históricos de Contas Públicas).