Banks are losing more than USD 442 billion every year to fraud according to the LexisNexis True Cost of Fraud Study. Traditional rule-based systems are failing to keep up, and Gartner reports that they miss more than 50% of new fraud patterns as attackers adapt faster than the rules can update. At the same time, false positives continue to rise. Aite-Novarica found that almost 90% of declined transactions are actually legitimate, which frustrates customers and increases operational costs. Fraud is also becoming more coordinated. Feedzai recorded a 109% increase in fraud ring activity within a single year.
To stay ahead, banks need models that understand relationships across users, merchants, devices, and transactions. This is why we are building a next-generation fraud detection system powered by Graph Neural Networks and Neo4j. Instead of treating transactions as isolated events, this system analyzes the full network and uncovers complex fraud patterns that traditional ML often misses.
First, let’s try to understand why do we need to migrate towards this new approach. Most fraud detection systems use traditional ML models that isolate the transactions to analyze.
Below is a very standard rule-based fraud detection system:
def detect_fraud(transaction):
if transaction.amount > 1000:
return "FRAUD"
if transaction.hour in [0, 1, 2, 3]:
return "FRAUD"
if transaction.location != user.home_location:
return "FRAUD"
return "LEGITIMATE"
The problems here are pretty straightforward:
Random Forest and XGBoost were better but are still analyzing each transaction independently. They may not realize! User_A, User_B, and User_C are all compromised accounts, they are all controlled by one fraudulent ring, they all appear to be targeting the same questionable merchant in the span of minutes.
Important insight: Fraud is relational. Fraudsters are not working alone: they work as networks. They share resources. And their patterns only become visible when observed across relationships between entities.
Specifically built for learning from networked data, Graph Neural Networks analyze the entire graph structure where the transactions form a relationship between users and merchants, and additional nodes would represent devices, IP addresses and more, rather than analyzing one transaction at a time.

In our framework, we represent the fraud problem with a graph structure, with the following nodes and edges:
Nodes:
Edges:

This representation allows us to observe patterns like:
Our system has five main components that form a complete pipeline:
Technology stack:


Neo4j is a native graph database that stores relationships as first-class citizens. Here’s how we model our entities:
CREATE (u:User {
user_id: 'U0001',
age: 42,
account_age_days: 1250,
credit_score: 720,
avg_transaction_amount: 245.50
})
CREATE (m:Merchant {
merchant_id: 'M001',
name: 'Electronics Store',
category: 'Electronics',
risk_score: 0.23
})
CREATE (t:Transaction {
transaction_id: 'T00001',
amount: 125.50,
timestamp: datetime('2024-06-15T14:30:00'),
hour: 14,
is_fraud: 0
})
CREATE (u)-[:MADE_TRANSACTION]->(t)-[:AT_MERCHANT]->(m)

Using the embedded fraud patterns, we generate synthetic but realistic data:
class FraudDataGenerator:
def generate_transactions(self, users_df, merchants_df):
transactions = []
# Create fraud ring (coordinated attackers)
fraud_users = random.sample(list(users_df['user_id']), 50)
fraud_merchants = random.sample(list(merchants_df['merchant_id']), 10)
for i in range(5000):
is_fraud = np.random.random() < 0.15 # 15% fraud rate
if is_fraud:
# Fraud pattern: high amounts, odd hours, fraud ring
user_id = random.choice(fraud_users)
merchant_id = random.choice(fraud_merchants)
amount = np.random.uniform(500, 2000)
hour = np.random.choice([0, 1, 2, 3, 22, 23])
else:
# Normal pattern: business hours, typical amounts
user_id = random.choice(list(users_df['user_id']))
merchant_id = random.choice(list(merchants_df['merchant_id']))
amount = np.random.lognormal(4, 1)
hour = np.random.randint(8, 22)
transactions.append({
'transaction_id': f'T{i:05d}',
'user_id': user_id,
'merchant_id': merchant_id,
'amount': round(amount, 2),
'hour': hour,
'is_fraud': 1 if is_fraud else 0
})
return pd.DataFrame(transactions)
This function helps us in generating 5,000 transactions with 15% fraud rate, including realistic patterns like fraud rings and time-based anomalies.
We have chosen the GraphSAGE or Graph Sample and Aggregate Method for our GNN architecture as it not only scales well but handles new nodes without retraining as well. Here’s how we’ll implement it:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv
class FraudGNN(nn.Module):
def __init__(self, num_features, hidden_dim=64, num_classes=2):
super(FraudGNN, self).__init__()
# Three graph convolutional layers
self.conv1 = SAGEConv(num_features, hidden_dim)
self.conv2 = SAGEConv(hidden_dim, hidden_dim)
self.conv3 = SAGEConv(hidden_dim, hidden_dim)
# Classification head
self.fc = nn.Linear(hidden_dim, num_classes)
# Dropout for regularization
self.dropout = nn.Dropout(0.3)
def forward(self, x, edge_index):
# Layer 1: Aggregate from 1-hop neighbors
x = self.conv1(x, edge_index)
x = F.relu(x)
x = self.dropout(x)
# Layer 2: Aggregate from 2-hop neighbors
x = self.conv2(x, edge_index)
x = F.relu(x)
x = self.dropout(x)
# Layer 3: Aggregate from 3-hop neighbors
x = self.conv3(x, edge_index)
x = F.relu(x)
x = self.dropout(x)
# Classification
x = self.fc(x)
return F.log_softmax(x, dim=1)
What’s happening here:
We normalize all features to [0, 1] range for stable training:
def prepare_features(users, merchants):
# User features (4 dimensions)
user_features = []
for user in users:
features = [
user['age'] / 100.0, # Age normalized
user['account_age_days'] / 3650.0, # Account age (10 years max)
user['credit_score'] / 850.0, # Credit score normalized
user['avg_transaction_amount'] / 1000.0 # Average amount
]
user_features.append(features)
# Merchant features (padded to match user dimensions)
merchant_features = []
for merchant in merchants:
features = [
merchant['risk_score'], # Pre-computed risk
0.0, 0.0, 0.0 # Padding
]
merchant_features.append(features)
return torch.FloatTensor(user_features + merchant_features)
Here’s our training loop:
def train_model(model, x, edge_index, train_indices, train_labels, epochs=100):
optimizer = torch.optim.Adam(
model.parameters(),
lr=0.01, # Learning rate
weight_decay=5e-4 # L2 regularization
)
for epoch in range(epochs):
model.train()
optimizer.zero_grad()
# Forward pass
out = model(x, edge_index)
# Calculate loss on training nodes only
loss = F.nll_loss(out[train_indices], train_labels)
# Backward pass
loss.backward()
optimizer.step()
if epoch % 10 == 0:
print(f"Epoch {epoch:3d} | Loss: {loss.item():.4f}")
return model
Training dynamics:
After running the complete pipeline, here are our results:

Classification Report:

Let’s try to breakdown the results to understand it well.
What worked well:
What needs improvement:
The following confusion matrix shows how the model classified all transactions as legitimate in this particular run:

The ROC curve demonstrates strong discriminative ability (AUC = 0.961), meaning the model is learning fraud patterns even if the threshold needs adjustment:


The analysis we made was able to show unmistakable trends:
Temporal trends:
Amount distribution:
Network trends:
This approach is Ideal for:
This approach is not a good one for scenario like:
Graph Neural Networks change the game for fraud detection. Instead of treating the transactions as isolated events, companies can now model them as a network and this way more complex fraud schemes can be detected which are missed by the traditional ML.
The progress of our work proves that this way of thinking is not just interesting in theory but it’s useful in practice. GNN-based fraud detection with the figures of 91% accuracy, 0.961 AUC, and capability to detect fraud rings and coordinated attacks provides real value to the business.
All the code is available on GitHub, so feel free to modify it for your specific fraud detection issues and use cases.
A. GNNs capture relationships between users, merchants, and devices—uncovering fraud rings and networked behaviors that traditional ML or rule-based systems miss by analyzing transactions independently.
A. Neo4j stores and queries graph relationships natively, making it easy to model and traverse user–merchant–transaction connections essential for real-time fraud pattern detection.
A. The model reached 91% accuracy and an AUC of 0.961, successfully identifying coordinated fraud rings while keeping false positives low.