Data Analytics for Business and Societal Challenges
Paper Number
2423
Paper Type
short
Description
Fraud is a significant issue for insurers. Previous literature has mainly used supervised learning to detect insurance fraud. However, supervised learning faces significant difficulties in fraud detection: very few cases are labeled as fraud, and models can overfit to the outcomes of pre-existing fraud detection systems, which can lead to overlooking new fraud patterns. Unsupervised learning methods that produce anomaly scores could be a remedy to improve insurance fraud detection systems. However, unsupervised learning must identify anomalies that are conceptually meaningful for fraud. In this paper, we suggest a theoretical framework for choosing features to include in fraud detection models. We evaluate this framework using isolation forests for anomaly detection based on more than 32,000 automobile insurance claims. We further evaluate textual information based on concepts from deception detection in computational linguistics, using straightforward clustering methods and state-of-the-art transformers.
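To illustrate how an isolation forest turns unlabeled claims into anomaly scores, the following is a minimal sketch in plain Python. It is not the authors' implementation. The two features (claim amount, days since policy start), the synthetic data, and all function names are assumptions made for illustration only. The sketch follows the general isolation forest idea: points that random splits separate after only a few steps receive scores closer to 1.

```python
import math
import random

def c_factor(n):
    """Average path length normalizer for a sample of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Split rows on a random feature at a random threshold until isolated."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:  # simplification: stop instead of trying another feature
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }

def path_length(row, node, depth=0):
    """Depth at which row lands in a leaf, adjusted for unsplit leaf size."""
    if "feature" not in node:
        return depth + c_factor(node["size"])
    child = "left" if row[node["feature"]] < node["split"] else "right"
    return path_length(row, node[child], depth + 1)

def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of rows."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    return trees, sample_size

def anomaly_score(row, trees, sample_size):
    """Score in (0, 1]; higher means the row is isolated more quickly."""
    mean_path = sum(path_length(row, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(sample_size))

# Synthetic, hypothetical claims: (claim_amount, days_since_policy_start).
data_rng = random.Random(42)
claims = [(data_rng.gauss(2000, 500), data_rng.gauss(400, 100))
          for _ in range(200)]
claims.append((25000.0, 3.0))  # one deliberately unusual claim

trees, sample_size = fit_forest(claims)
scores = [anomaly_score(c, trees, sample_size) for c in claims]
```

In this sketch, the last entry of `scores` belongs to the deliberately unusual claim and can be compared against the scores of the other claims. A high score only marks a claim as unusual with respect to the chosen features; whether that is meaningful for fraud depends on feature selection, which is the question the paper's framework addresses.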
Recommended Citation
Debener, Jörn; Heinke, Volker; and Kriebel, Johannes, "Insurance Fraud and Isolation Forests" (2021). ICIS 2021 Proceedings. 15.
https://aisel.aisnet.org/icis2021/data_analytics/data_analytics/15
Insurance Fraud and Isolation Forests
Comments
14-Data