Objective: Financial fraud is a major concern for organizations across industries; billions of dollars are lost to it every year. Businesses therefore employ data mining techniques to address this persistent and growing problem. This paper reviews research studies conducted over one decade that use data mining tools to detect financial fraud, and communicates the current trends to academic scholars and industry practitioners.
Method: Various combinations of keywords were used to identify the pertinent articles. The majority of the articles were retrieved from Science Direct, but the search also spanned other online databases (e.g., Emerald, Elsevier, World Scientific, IEEE, and Routledge - Taylor and Francis Group). Our search yielded a sample of 65 relevant articles (58 peer-reviewed journal articles and 7 conference papers). One-fifth of the articles were found in Expert Systems with Applications (ESA), while about one-tenth were found in Decision Support Systems (DSS).
Results: A total of 41 data mining techniques were used to detect fraud across different financial applications, such as health insurance and credit cards. The logistic regression model appeared to be the leading data mining tool for detecting financial fraud, accounting for 13% of usage. In general, supervised learning tools have been used more frequently than unsupervised ones. Financial statement fraud and bank fraud are the two most investigated financial applications in this area, together accounting for about 63% of the sample (41 of the 65 reviewed articles). Also, the two primary journal outlets for this topic are ESA and DSS.
Conclusion: This review provides a fast and easy-to-use source for both researchers and professionals, classifies financial fraud applications into a high-level and a detailed-level framework, identifies the most significant data mining techniques in this domain, and reveals the countries most exposed to financial fraud.
Abstract: The development and application of computational data mining techniques in financial fraud detection and business failure prediction have recently become a popular cross-disciplinary research area involving financial economists, forensic accountants and computational modellers. Some of the computational techniques popularly used in financial fraud detection and business failure prediction can also be effectively applied to the detection of fraudulent insurance claims and can therefore be of immense practical value to the insurance industry. We provide a comparative analysis of the prediction performance of a battery of data mining techniques using real-life automotive insurance fraud data. While the data used in our paper is US-based, the computational techniques we have tested can be adapted and applied to detect similar insurance fraud in other countries where an organized automotive insurance industry exists.