An overview of data science for fraud detection
Applying data science to fraud detection is a burgeoning field. This article gives an overview of the business challenges data scientists address when working on fraud detection.
Highlight: fraud detection, the machine learning series
I have written a full series of articles on fraud detection, based on a credit card fraud case study:
- Exploratory Data Analysis - fraud
- Using t-SNE, as dimensionality reduction for fraud detection
- Techniques to handle imbalanced dataset
- Comparing ensemble approaches to improve model performance
Fraud: a definition
Fraud is defined as wrongful or criminal deception intended to result in financial or personal gain. In today's organizations, fraud attempts can be classified as:
- Accounting fraud
- Asset misappropriation
- Bribery and Corruption
- Business misconduct
- Consumer fraud
- Cybercrime
- Human resources fraud
- Internal fraud
- Money laundering
- Procurement fraud
- Third party fraud (from agents, vendors, shared service providers and customers)
Fraud in the financial services
Financial services is one of the industries most exposed to fraud, with the following types of fraud attempt, in declining order:
- Consumer fraud
- Asset misappropriation
- Cybercrime
- Business misconduct
- Money laundering
Cyber-attack golden age
We have entered an era in which cyber-attack fraud is highly professionalized and industrialized, leading to a variety of threats:
- Asset misappropriation (including chargeback fraud, purchases of goods with stolen credit cards, or card-not-present fraud)
- Extortion
- Disruption of business processes
- Insider trading
- Intellectual property theft
- Political attacks
- Procurement fraud
The aims of cyber hackers are to:
- Commercialize critical stolen customer data
- Commoditize stolen data with bots and data marketplaces, so as to generate revenue faster and in greater amounts from large-scale fraud
- Specialize with additional revenue streams (buy, sell, use; bundles; packaged fraud technology solutions/products, verticals; training)
- Offer Fraud-as-a-Service
The cyber fraud landscape is characterised by:
- Fraudulent Account Creation
- Account Takeover Fraud
- Transaction/Payment Fraud
- Risk-Based Authentication
- Limited Data Collaboration
Fraud impacts
The consequences of fraud for an organisation can be multiple: lower employee morale, impeded business relations, a damaged reputation or brand, tainted relationships with policy makers and regulators, and a devalued share price. When fraud is detected, its impact goes beyond the direct monetary losses or the cost of compensating customers, and usually includes secondary costs: the company may have to enter an enforced remediation programme or spend years addressing costly feedback from a regulatory inspection.
History of fraud detection
The first fraud detection initiatives originated in at-risk sectors such as telecoms, insurance and banking. The retail sector, a particular victim of internal fraud and fraud related to point-of-sale (POS) data, followed.
In our evolving, hyper-connected economy, new fraud threats have surfaced, since fraud is an adaptive crime:
- New digital products are launched without the protections and transparency needed to protect consumers and the targeted public. Some go further and aid or host criminal activities under cover of a mainstream purpose.
- Payment systems are in transition and are being developed by new entrants far removed from any anti-fraud, anti-money laundering or financial services knowledge or experience.
- Hacking has entered an era of sophisticated, coordinated multichannel attacks (brute force attacks, network scanning, malware, phishing, pharming, vishing and smishing), feeding a booming parallel industry of fraudsters on the dark web.
- After major data losses and leaks, identity markers (date of birth, security questions, state-issued identity numbers) are no longer a safe way to identify an individual. At the same time, the digital divide makes it impossible to move everyone to more advanced ID systems (biometrics, digital ID).
A lower tolerance to fraud
As of 2018, the general public and regulators, viewing the private and public sectors through the prism of social media, online journalism and cyber activism, strongly favor holding senior and top managers fully accountable, and punishing them, for any fraudulent activity, especially from within their own company. Sanctions against corporate misconduct or white-collar crime are making the headlines, and standards of transparency have changed thanks to a more mainstream whistle-blower culture ('leaks papers').
Fraud detection strategy and mitigation techniques
To put an anti-fraud strategy in place, a company should evolve its own policies, procedures, staff training, rewards and systems so that they integrate the risk of fraud. These measures can be reinforced in the workplace by enforcing anti-competitive/anti-trust schemes, anti-bribery/anti-corruption (ABAC) legislation, anti-money laundering (AML) rules, or sanctions and export controls. The strategy is not complete unless it also takes into account cybercrime and online fraudulent activities that damage the company, its revenue and its reputation. Lacking a cyber response plan to shield against cyber-attack vulnerabilities may, in addition, undermine the company's efforts to develop and grow, because insider information can be exposed and used for the benefit of cybercriminals. In practice, this involves the following approach:
- Fraud risk assessment. As part of a larger Enterprise Risk Management strategy, this covers specific initiatives such as securing the support and involvement of top management in order to organise an annual stress test and to assess fraud-related risk, either in an annual workshop or as a routine part of strategic business processes and internal controls, and documenting and agreeing on risk levels as reported in an FMEA (failure mode and effects analysis) or through consultations with operations and field teams. Where needed, in very critical business lines, a comprehensive, neutral audit can also shed light on underestimated fraud risks. Developing an open company culture of honesty is also key, with a visible open-door policy or the possibility of reaching out to a dedicated hotline. Finally, switching from passive crisis management to anticipation is the domain of knowledge discovery in databases and of predictive and preventive analytics.
- Fraud recognition. This can be done by educating staff at every level of the organisation to spot any fraud attempt, internal or external, and by protecting and even rewarding any lead that may help expose a fraud, through whistle-blower or no-blame/no-shame programmes. For internal fraud, monitoring at-risk groups, including unethical decision makers or senior managers, should be considered. When it comes to data science, this is where event-based anomaly detection algorithms, or persona and geo signal extraction, designed to process large streams of data in real time, play a key role. So do clustering and classification techniques that expose hidden patterns, and adaptive behavioral analytics that infer associations among groups of data and similar customer behavior in order to assign customers to the proper risk category.
- Global and dynamic fraud risk monitoring. This is a shift from siloed, fragmented fraud information systems and culture inside the organization to unified global dashboards, including red flag systems. It means being able to run time-series analysis on time-dependent, device-changing data and to draw insights from it through relevant metrics, so as to limit chargebacks.
- Leverage technology. Acknowledge tech vulnerability. Companies' dependence on technology for their operations makes them vulnerable to potential threats, as there is no airtight way of doing business or of eliminating blind spots. Technology can also be their protector, as long as it is developed to assess, spot and reduce fraud. Advanced predictive analytics, Natural Language Processing (NLP) or Generation (NLG), voice recognition, machine learning models and artificial intelligence techniques can be used to spot and anticipate fraud risk.
- Generate revenue from your in-house fraud tech. Whether it takes the form of fast processing pipelines for unstructured data, monitoring of communication streams, routine analysis, watching transaction flows, anomaly detection, dashboard development, regulation-triggered flags, or pattern recognition and review, combating fraud has a tangible impact, one that can be linked directly to the bottom line by lowering large operational costs, and it can even be valued as a consultancy service. In business-to-consumer markets such as financial services, fraud detection should ideally be tied to a tangible cause, to avoid losing legitimate customers. Tech implementation and forensic analytics have to be designed to strike a balance between raising red flags and wrongly alienating a volatile customer base. This is achieved by ensuring smooth customer service while the fraud investigation is in progress, so as to identify false positives quickly and avoid jumping to devastating conclusions. Carefully designed data pre-processing, data mining, and overall data science and artificial intelligence programmes can help fine-tune an efficient fraud programme and reduce friction, provided they are managed and staffed by the right talent, informed by business understanding, people-focused, and human-driven, that is, based on human expertise and not only on rules.
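The balance between raising red flags and wrongly alienating legitimate customers can be illustrated with a small sketch. It assumes a model that already outputs a fraud score between 0 and 1 for each transaction; the scores, labels, thresholds and the names `ThresholdReport` and `evaluate_threshold` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThresholdReport:
    threshold: float
    fraud_caught: int      # fraudulent transactions that were flagged
    fraud_missed: int      # fraudulent transactions that were not flagged
    wrongly_flagged: int   # legitimate transactions flagged (false positives)


def evaluate_threshold(scores, is_fraud, threshold):
    """Count outcomes when every score >= threshold raises a red flag."""
    caught = missed = wrong = 0
    for score, fraud in zip(scores, is_fraud):
        flagged = score >= threshold
        if flagged and fraud:
            caught += 1
        elif flagged and not fraud:
            wrong += 1
        elif fraud:
            missed += 1
    return ThresholdReport(threshold, caught, missed, wrong)


# Hypothetical model scores and ground-truth labels (illustration only).
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95]
is_fraud = [False, False, False, False, True, False, True, False, True, True]

for threshold in (0.3, 0.5, 0.7):
    print(evaluate_threshold(scores, is_fraud, threshold))
```

On this toy data, lowering the threshold catches more fraud but flags more legitimate customers, and raising it does the opposite. Where to set it is a business decision, not a purely technical one.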
Conclusion: from Credit Bureau to Risk Bureau
With new entrants in the field of data science-driven fraud prevention, specifically in financial services, we are observing a paradigm shift in which historical players are being spurred on by data science and artificial intelligence. The approach is multilevel: device ID, user behavior, identity data and network data, combined through the development of compound confidence and rating scores for omnichannel and multichannel commerce.
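As a rough illustration of a compound score, the sketch below combines four per-level risk signals into a single number using a weighted average. This is only one possible way to build such a score. The weights, the signal names and the function `compound_risk_score` are assumptions made for the example, not a description of how any particular provider computes its ratings.

```python
# Hypothetical weights for each level of the multilevel approach.
SIGNAL_WEIGHTS = {
    "device_id": 0.2,
    "user_behavior": 0.3,
    "identity_data": 0.3,
    "network_data": 0.2,
}


def compound_risk_score(signals, weights=SIGNAL_WEIGHTS):
    """Weighted average of per-level risk signals, each expected in [0, 1]."""
    missing = set(weights) - set(signals)
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    for name in weights:
        if not 0.0 <= signals[name] <= 1.0:
            raise ValueError(f"signal {name!r} must be between 0 and 1")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * signals[name] for name in weights)
    return weighted_sum / total_weight


# Hypothetical signals for one transaction (illustration only).
example_signals = {
    "device_id": 0.1,      # known device
    "user_behavior": 0.8,  # unusual purchasing pattern
    "identity_data": 0.2,  # identity checks mostly pass
    "network_data": 0.5,   # IP address seen on a few other accounts
}
print(round(compound_risk_score(example_signals), 2))
```

A weighted average keeps each level's contribution easy to explain to an investigator, which matters when a score leads to a customer being challenged. A production system would more likely learn the combination from labelled data.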