Fraud Detection with Machine Learning in Property Insurance Policy Requests

Authors

DOI:

https://doi.org/10.5281/zenodo.8071015

Keywords:

Machine Learning, Property Insurance, Fraud Prediction, Semi-Supervised Learning

Abstract

This study aims to predict in advance whether policy claims in the property branch of the insurance industry are made with abusive intent, in order to reduce claim payments and prevent financial losses. Because only a small share of the anomaly class in the data set was labeled, two semi-supervised machine learning approaches, Label Spreading and Self Training, were used to label the data and detect abnormal cases. After labeling, fraud prediction was carried out with supervised machine learning models, and the two semi-supervised approaches were compared on the performance they yielded. The datasets comprise policy requests from 2017 and 2018 together with weather information for the corresponding locations. Accuracy, precision, recall, specificity, and F1 score were used as measures of model success; by these measures, the Self Training approach labeled the data with higher performance than the Label Spreading approach.
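The five success measures named in the abstract all follow from the counts in a binary confusion matrix. The following is a minimal sketch of that arithmetic in plain Python, assuming fraud is encoded as class 1 and non-fraud as class 0; the helper names (`confusion_counts`, `evaluate`) and the toy labels are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, specificity, and F1 score."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),  # true negative rate
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Toy labels for illustration only (1 = fraud, 0 = non-fraud).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With an imbalanced anomaly class, accuracy alone can look high even when few fraud cases are caught, which is presumably why recall, specificity, and F1 score are reported alongside it.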

References

S. M. Palacio, “Abnormal Pattern Prediction: Detecting Fraudulent Insurance Property Claims with Semi-Supervised Machine-Learning,” Data Science Journal, vol. 18, no. 1, Art. no. 1, Jul. 2019.

S. Panigrahi and B. Palkar, “Comparative Analysis on Classification Algorithms of Auto-Insurance Fraud Detection based on Feature Selection Algorithms,” International Journal of Computer Sciences and Engineering, vol. 6, pp. 72–77, Sep. 2018.

I. Bouzgarne, Y. Mohamed, O. Bouattane, and Q. Mohamed, “Composition of Feature Selection Methods and Oversampling Techniques for Banking Fraud Detection with Artificial Intelligence,” International Journal of Engineering Trends and Technology, vol. 69, pp. 216–226, Nov. 2021.

B. Baesens, S. Höppner, and T. Verdonck, “Data engineering for fraud detection,” Decision Support Systems, vol. 150, p. 113492, Nov. 2021.

R. A. Bauder, M. Herland, and T. M. Khoshgoftaar, “Evaluating Model Predictive Performance: A Medicare Fraud Detection Case Study,” in 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI), Los Angeles, CA, USA, Jul. 2019, pp. 9–14.

A. Westerski, R. Kanagasabai, E. Shaham, A. Narayanan, J. Wong, and M. Singh, “Explainable anomaly detection for procurement fraud identification—lessons from practical deployments,” International Transactions in Operational Research, vol. 28, no. 6, pp. 3276–3302, 2021.

L. Settipalli and G. R. Gangadharan, “Healthcare fraud detection using primitive sub peer group analysis,” Concurrency and Computation: Practice and Experience, vol. 33, no. 23, p. e6275, 2021.

S. Höppner, B. Baesens, W. Verbeke, and T. Verdonck, “Instance-dependent cost-sensitive learning for detecting transfer fraud,” European Journal of Operational Research, vol. 297, no. 1, pp. 291–300, Feb. 2022.

C. Gomes, Z. Jin, and H. Yang, “Insurance fraud detection with unsupervised deep learning,” Journal of Risk and Insurance, vol. 88, no. 3, pp. 591–624, 2021.

M. K. Severino and Y. Peng, “Machine learning algorithms for fraud prediction in property insurance: Empirical evidence using real-world microdata,” Machine Learning with Applications, vol. 5, p. 100074, Sep. 2021.

Ö. Şahin, S. Ayvaz, and E. Çalımfidan, “Sigorta Sektöründe Sahte Hasarların Tahmini İçin Geliştirilen Makine Öğrenmesi Modellerinin Kıyaslanması,” Bilişim Teknolojileri Dergisi, vol. 13, pp. 479–489, Oct. 2020.

M. E. Irarrázaval, S. Maldonado, J. Pérez, and C. Vairetti, “Telecom traffic pumping analytics via explainable data science,” Decision Support Systems, vol. 150, p. 113559, Nov. 2021.

Y. Kang, N. Jia, R. Cui, and J. Deng, “A graph-based semi-supervised reject inference framework considering imbalanced data distribution for consumer credit scoring,” Applied Soft Computing, vol. 105, p. 107259, Jul. 2021.

K. K. Tsiptsis and A. Chorianopoulos, Data Mining Techniques in CRM: Inside Customer Segmentation, 1st edition. Wiley, 2009.

J. Wang, Encyclopedia of Data Warehousing and Mining, Second Edition, 2nd edition. Hershey: Information Science Reference, 2008.

X. Zhu and A. B. Goldberg, “Introduction to Semi-Supervised Learning,” Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 3, no. 1, pp. 1–130, Jan. 2009.

Published

17-06-2022

How to Cite

Ürgenç, S., Kaplan, H., & Çakmak Pehlivanlı, A. (2022). Fraud Detection with Machine Learning in Property Insurance Policy Requests. AINTELIA Science Notes Journal, 1(1). https://doi.org/10.5281/zenodo.8071015