Graphs to Graph Neural Networks — Setting the New Gold Standard to Detect Fraud in Financial Services

By: Abhishek Mehta and Greg Steck

February 11, 2022

Fraud detection and prevention methods for electronic payment systems are overburdened by "false positives" or "false alerts." It is imperative to reduce these alarms when over 50% of a company’s business comes from existing customers.

It does not matter whether you are an established financial institution or a challenger bank: Customer Lifetime Value (LTV) is a crucial KPI (Key Performance Indicator) because the cost of losing customers is very high. Industry trends show that only one in five blocked transactions is actually fraudulent; the other four are false positives. This has led to a total of $118 billion of blocked sales in the U.S. alone. Of that total, 40% (about $47B) comes from users attempting a transaction worth more than $250.

"The effect of this on LTV, the hard cost of fraud from chargebacks and operational costs, and the regulatory risks give companies multiple reasons to improve their existing fraud detection models. The most noticeable among trends is the use of Graph Intelligence. These approaches can provide 15 to 20% improvement in fraud detection rates compared to traditional machine learning methods. That’s in addition to the 50 to 90% improvement that traditional machine learning approaches provide relative to rules-based approaches. For credit risk use cases where the KS score is used to measure accuracy, financial institutions have reported improvements between 4 and 10 KS points, which results in millions of dollars of savings in net charge-offs."

Graph enabled in-depth analysis

In the payment processing industry, potential fault lines emerge in the trade-off between how quickly a transaction must be approved (a time window of a few seconds) and how thoroughly its attributes can be scrutinized. The best examples are “on the spot” loan applications, electronic transactions, and credit card approvals.

A customer or merchant sends personal details from a device, and those details must be verified against existing information. This verification process is a race between time and technology. In-depth analysis isn’t something that two-dimensional data stores such as RDBMS and DataFrames are optimized for. Finding out whether the incoming device is linked to any previously suspected account or transaction is a complicated task, especially with millions of transactions per day that add up to billions over a few months. This complexity increases exponentially with the number of checks that must be performed across the attributes received from PoS (Point of Sale) systems.

For graph workloads, in-depth analysis is not an issue. For example, consider a lookup for a node that has the same device number; the system only needs to follow all outgoing edges to detect a blocking condition. With a graph this only requires a few memory hops rather than huge scans and painfully slow joins.
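The device lookup described above can be sketched with a plain adjacency map. This is a minimal illustration, not a platform API: the node names, the `suspected` set, and the `linked_to_suspect` helper are all hypothetical, and a production graph engine would run the same traversal over its own storage layer.

```python
from collections import defaultdict

# Hypothetical toy data: each pair links two entities seen together.
edges = [
    ("device:D1", "account:A1"),
    ("device:D1", "account:A2"),
    ("account:A2", "txn:T9"),
    ("device:D2", "account:A3"),
]
suspected = {"txn:T9"}  # assumed list of previously flagged nodes

# Build an adjacency map; links are treated as undirected for lookups.
graph = defaultdict(set)
for src, dst in edges:
    graph[src].add(dst)
    graph[dst].add(src)


def linked_to_suspect(start, max_hops=2):
    """Breadth-first search; True if a suspected node is within max_hops."""
    frontier, seen = {start}, {start}
    for _ in range(max_hops):
        # Expand one hop, skipping nodes that were already visited.
        frontier = {n for node in frontier for n in graph[node]} - seen
        if frontier & suspected:
            return True
        seen |= frontier
    return False


print(linked_to_suspect("device:D1"))
print(linked_to_suspect("device:D2"))
```

Each hop only touches the neighbors of the current frontier, so the cost depends on the local neighborhood size rather than on the total number of stored transactions.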

Feature engineering to the newer depths

There is a wider trend of adopting graph technology to improve the accuracy of existing machine learning (ML) models by taking a depth-first approach to feature extraction.

"The approach consists of representing the problem in its graphical form, computing the network features, and using this information to enrich the dataset from which the ML algorithm learns. Hence, this approach has the potential to boost the model performance without compromising the imperative of interpretability (Lina Faik)."

This image shows the execution of an alert that detects a money transfer from one account to many smurf accounts bound for a country listed by the Financial Action Task Force (FATF). The rule performs an all-paths traversal between two nodes that are four hops apart. The traversal also involves path finding and computation along with filters (for example, date, time, and amount). Such a query executes in under a second and adds “new features” to your machine learning models. It also makes better use of your existing data for feature extraction by overcoming the challenges of limited data availability and third-party data enrichment. This rule is just one example among possibilities that are limited only by our imagination.
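A filtered all-paths traversal of the kind described above might look like the following sketch. The account names, amounts, dates, and thresholds are invented for illustration; the only ideas taken from the text are the four-hop limit and the date and amount filters.

```python
from datetime import date

# Hypothetical transfers: (sender, receiver, amount, date).
transfers = [
    ("acct_src", "smurf_1", 9000, date(2022, 1, 3)),
    ("acct_src", "smurf_2", 8500, date(2022, 1, 3)),
    ("smurf_1", "mule_1", 8900, date(2022, 1, 4)),
    ("smurf_2", "mule_1", 8400, date(2022, 1, 4)),
    ("mule_1", "exchange_1", 17000, date(2022, 1, 5)),
    ("exchange_1", "acct_dst", 16900, date(2022, 1, 6)),
    ("acct_src", "acct_dst", 50, date(2021, 6, 1)),  # fails both filters
]


def all_paths(source, target, max_hops=4, min_amount=1000,
              since=date(2022, 1, 1)):
    """Return every simple path of at most max_hops edges that passes the filters."""
    # Keep only the edges that satisfy the amount and date filters.
    out = {}
    for sender, receiver, amount, day in transfers:
        if amount >= min_amount and day >= since:
            out.setdefault(sender, []).append(receiver)

    paths = []

    def walk(node, path):
        if node == target:
            paths.append(path)
            return
        if len(path) - 1 == max_hops:  # hop budget used up
            return
        for nxt in out.get(node, []):
            if nxt not in path:  # avoid revisiting a node (no cycles)
                walk(nxt, path + [nxt])

    walk(source, [source])
    return paths


paths = all_paths("acct_src", "acct_dst")
# The number of qualifying paths can be stored as a new model feature.
path_count_feature = len(paths)
print(path_count_feature, paths)
```

The path count, or aggregates computed along each path, can then be joined back to the training table as additional columns.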

Gaining uplift with graph algorithm sequencing

Some of the world’s most successful companies have graph algorithms at their core, such as Alphabet, Meta, LinkedIn, and Twitter. These algorithms span supervised, unsupervised, and deep learning approaches, including PageRank, Social Network Analysis, Louvain, Connected Components, Graph Neural Networks (GNNs), and many more. Each algorithm is powerful on its own, and sequencing these algorithms together opens up new possibilities for financial services.

The image above is an example of a credit card issuer trying to find the most important merchants on which to focus its sales and marketing spend. Let’s take a look at an approach for identifying the most popular merchants. First, we infer connections, represented as new graph edges, between merchants based on a minimum threshold of customer overlap (common credit and debit card usage). These new edges link merchant nodes directly to other merchant nodes. Then, using the PageRank algorithm, we can find the most popular merchants based on the highest PageRank scores, similar to how Google lists the most relevant results at the top. This approach is more sophisticated at finding high-value targets than aggregations, standard deviations, or simple row-and-column statistical analysis of revenue.
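The two steps above, inferring merchant-to-merchant edges from shared cards and then running PageRank, can be sketched as follows. The card data, the `min_overlap` threshold, and the simplified power-iteration PageRank are assumptions made for illustration; a graph platform would supply its own tuned implementation.

```python
from itertools import combinations

# Hypothetical input: which merchants each card was used at.
card_merchants = {
    "card1": {"grocer", "cafe", "fuel"},
    "card2": {"grocer", "cafe"},
    "card3": {"grocer", "fuel"},
    "card4": {"cafe", "grocer"},
    "card5": {"bookshop", "grocer"},
}


def infer_merchant_edges(card_merchants, min_overlap=2):
    """Link two merchants when at least min_overlap cards used both."""
    overlap = {}
    for merchants in card_merchants.values():
        for a, b in combinations(sorted(merchants), 2):
            overlap[(a, b)] = overlap.get((a, b), 0) + 1
    return [pair for pair, count in overlap.items() if count >= min_overlap]


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected graph."""
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by isolated nodes is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not neighbors[n])
        new_rank = {}
        for n in nodes:
            incoming = sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            new_rank[n] = ((1 - damping) / len(nodes)
                           + damping * (incoming + dangling / len(nodes)))
        rank = new_rank
    return rank


merchants = sorted(set().union(*card_merchants.values()))
merchant_edges = infer_merchant_edges(card_merchants)
ranks = pagerank(merchants, merchant_edges)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 3))
```

Raising `min_overlap` makes the inferred merchant graph sparser, which trades recall of weak relationships for more reliable edges.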

Tens of thousands of small communities of merchants can be identified from data properties such as products sold, zip codes, daily average sales, and more by using Louvain or another graph-based clustering algorithm. To take this a step further and assess the damage caused by a fraud ring, these communities can be sorted by their average PageRank values to find the disruption to the most popular communities. Alternatively, this can also be used to analyze the fraud rings themselves, because birds of a feather flock together.
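Sequencing the two algorithms comes down to joining their outputs. The sketch below assumes that community labels (for example from Louvain) and PageRank scores have already been computed; the labels and scores shown are made up for illustration.

```python
from collections import defaultdict

# Hypothetical outputs of two earlier algorithm runs.
community_of = {"grocer": "c1", "cafe": "c1", "fuel": "c2",
                "bookshop": "c2", "kiosk": "c3"}
pagerank_score = {"grocer": 0.40, "cafe": 0.20, "fuel": 0.15,
                  "bookshop": 0.05, "kiosk": 0.20}


def rank_communities(community_of, score):
    """Sort communities by the mean PageRank of their members, highest first."""
    member_scores = defaultdict(list)
    for node, community in community_of.items():
        member_scores[community].append(score[node])
    averages = {c: sum(v) / len(v) for c, v in member_scores.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranked = rank_communities(community_of, pagerank_score)
print(ranked)
```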

There are hundreds of graph algorithms to choose from, and they can be optimized and sequenced to run in a distributed fashion on a graph intelligence platform. Another factor to keep in mind is that graph algorithms can be compute intensive, memory intensive, or both. Apart from distributed execution, performance can be significantly improved by selecting a platform that can choose the most effective data partitioning policies during the execution of the program.

GNNs: the new frontier

In addition to improving feature engineering and in-depth analysis, graph data structures also enable new deep learning frameworks. Graph Neural Networks are a collection of new deep learning models that operate on graph data structures. At their core, GNNs propagate information across the graph, learning new, relationship-based, features about the data that cannot be captured in traditional machine learning or deep learning models.

For identifying fraud in payments, this means that GNNs can better identify fraudulent transactions by learning the graph structure of a fraudulent transaction and determining if an incoming transaction has a similar graph structure. GNNs learn these features and graph structures by applying neural networks directly on the graph.
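To make the idea of propagating information across the graph concrete, here is a minimal sketch of one message-passing step with mean aggregation. It is not the model used in this article: the node names, feature vectors, and identity weight matrices are placeholders, and in a real GNN the weights would be learned during training.

```python
def matvec(matrix, vector):
    """Multiply a small dense matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gnn_layer(features, neighbors, w_self, w_neigh):
    """One step: h_v' = ReLU(W_self h_v + W_neigh mean(h_u for u in N(v)))."""
    updated = {}
    for node, h in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            mean = [sum(features[u][i] for u in nbrs) / len(nbrs)
                    for i in range(len(h))]
        else:
            mean = [0.0] * len(h)  # isolated node: no neighbor message
        combined = [a + b for a, b in
                    zip(matvec(w_self, h), matvec(w_neigh, mean))]
        updated[node] = [max(0.0, x) for x in combined]  # ReLU
    return updated


# Hypothetical transaction nodes with two features each.
features = {"t1": [1.0, 0.0], "t2": [0.0, 1.0], "t3": [0.0, 1.0]}
neighbors = {"t1": ["t2", "t3"], "t2": ["t1"], "t3": ["t1"]}
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned weights

hidden = gnn_layer(features, neighbors, identity, identity)
print(hidden)
```

After one step each node's representation mixes in its direct neighbors; stacking k such layers lets information from k hops away reach the node.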

GNNs can be configured to do many different tasks, including node-level predictions such as identifying fraudulent transactions and subgraph-level predictions such as money laundering networks. This makes them a versatile tool for several fraud detection use cases.

GNNs and explainable AI

Because GNNs can operate directly on a graph structure, they provide some of the most explainable AI methods. In the example here, the GNN predicted that the incoming transaction marked with a red “X” is a fraudulent transaction. An investigator could clearly see in the graph that the transaction was flagged as suspicious because of the buyer’s history of using those particular payment tokens and email addresses for fraudulent transactions. A t-SNE plot, which projects high-dimensional features from the graph into two-dimensional space for easier rendering as an image, can also show that the transaction in question lies close to other fraudulent transactions.

https://arxiv.org/pdf/2011.12193.pdf

GNN performance

Using a real-world dataset, we can see the value that GNNs provide in identifying fraudulent transactions. For example, the Elliptic dataset is a Bitcoin transaction dataset in which accounts are labeled as either illicit (for example, money laundering and terrorist financing) or licit. The task is to predict whether or not an account is illicit using the transaction network (represented as a graph structure) and some features about the account.

In the first model, we use only the account features (local features). In the second model, we add some features from the graph, including graph algorithms such as Louvain and Connected Components. In the third model, we add in the features generated by the GNN. We can see the improvement that comes in model accuracy when we add in the graph features.
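The step from the first model to the second, adding graph-derived features to the local ones, can be sketched as below. The feature values and edges are invented, and only two simple graph features are shown (connected-component size and degree); the article's actual feature set and the Elliptic data are not reproduced here.

```python
def connected_components(nodes, edges):
    """Union-find: map every node to a representative of its component."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)
    return {n: find(n) for n in nodes}


def enrich(local_features, edges):
    """Append [component_size, degree] to each node's local feature list."""
    component = connected_components(local_features, edges)
    size = {}
    for root in component.values():
        size[root] = size.get(root, 0) + 1
    degree = {n: 0 for n in local_features}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return {n: feats + [size[component[n]], degree[n]]
            for n, feats in local_features.items()}


# Hypothetical accounts with a single local feature each.
local_features = {"a": [0.1], "b": [0.7], "c": [0.3], "d": [0.9]}
edges = [("a", "b"), ("b", "c")]

enriched = enrich(local_features, edges)
print(enriched)
```

The enriched rows can be fed to the same classifier as before, which keeps the comparison between the local-only and graph-enriched models straightforward.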

Conclusion

Graphs enable context awareness. This brings enormous value to hundreds of financial use cases that were previously ignored because of technical limitations or prohibitive TCO (total cost of ownership).

Now is also the time for the financial industry to meet this challenge head-on with technologies like Katana Graph’s graph intelligence platform. Katana’s platform is built for on-demand heterogeneous scalability. It can integrate directly with existing Machine Learning workflows and provides native support for Graph Algorithms and Graph Neural Networks. It is hardened with decades of DARPA-funded projects at terabytes of scale and can run on up to 256 machines.

Feel free to contact us to learn more about the possibilities of Katana Graph’s platform and about the uses of graphs in financial services. An hour of investment can turn into a career-changing event.

This blogpost was co-authored by Abhishek Mehta, Head of Field Engineering at Katana Graph, and Greg Steck, Solutions Architect at Katana Graph, and originally posted at Medium.
