Fraud Detection with Machine Learning in Property Insurance Policy Requests
DOI:
https://doi.org/10.5281/zenodo.8071015
Keywords:
Machine Learning, Property Insurance, Fraud Prediction, Semi Supervised Learning
Abstract
This study aims to predict in advance whether policy claims in the property branch of the insurance industry are abusive, in order to reduce claim payments and prevent financial losses. Because the anomaly class had too few labeled examples in the data set, the Label Spreading and Self Training semi-supervised machine learning approaches were used to detect abnormal cases. After the data labeling process, fraud prediction was carried out with supervised machine learning models, and the two semi-supervised approaches were compared to determine which performed better. The data sets cover the policy requests made in 2017 and 2018 together with weather information for the corresponding locations. Accuracy, precision, recall, specificity, and F1 score were used as model performance measures; according to these results, the Self Training approach labeled the data with higher performance than the Label Spreading approach.
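The five measures named in the abstract can all be derived from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions and is not the authors' code; the names `ConfusionCounts`, `count_outcomes`, and `evaluate`, the choice of `1` as the fraud label, and the toy label vectors are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # fraud cases predicted as fraud
    fp: int  # legitimate cases predicted as fraud
    tn: int  # legitimate cases predicted as legitimate
    fn: int  # fraud cases predicted as legitimate


def count_outcomes(y_true, y_pred, positive=1):
    """Tally confusion-matrix counts; `positive` marks the fraud class (assumed 1)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def evaluate(c):
    """Compute accuracy, precision, recall, specificity, and F1 from the counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Toy labels (not from the study): 3 fraud cases among 10 requests.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    for name, value in evaluate(count_outcomes(y_true, y_pred)).items():
        print(f"{name}: {value:.3f}")
```

Reporting specificity and recall alongside accuracy matters for imbalanced data such as fraud labels, where a model that predicts "legitimate" for every request can still score a high accuracy.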
License
Copyright (c) 2022 AINTELIA Science Notes Journal

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.