
Real-time fraud detection in banking transactions
A top-five European retail bank was losing an estimated EUR 47 million annually to transaction fraud. Their existing rule-based detection system — a set of 2,400 handcrafted rules maintained by a team of 8 analysts — caught obvious patterns but consistently missed sophisticated attacks, particularly synthetic identity fraud where criminals combine real and fabricated information to create new, seemingly legitimate identities.
Why rules weren't enough
The rule-based system had three fundamental limitations:
- Static thresholds — Rules like "flag transactions over EUR 10,000" were trivially circumvented by structuring payments just below the limit
- No behavioral context — Rules evaluated each transaction in isolation, missing patterns that only emerge over time (e.g., a gradual ramp-up of transaction amounts over weeks)
- Maintenance burden — Each new fraud pattern required manual rule creation and testing, with a 6–8 week lead time from detection to deployment
By the time a new rule was deployed, the fraudsters had already moved on to a different technique. We were always fighting the last war.
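To make the first two limitations concrete, here is a minimal sketch contrasting a static per-transaction rule with a check that looks at a trailing window. The EUR 10,000 limit comes from the example rule above; the 24-hour window, the function names, and the sample payments are illustrative assumptions, not the bank's actual rules.

```python
from datetime import datetime, timedelta

LIMIT_EUR = 10_000  # static threshold from the example rule above


def static_rule(amount: float) -> bool:
    """Flag a single transaction that exceeds the limit."""
    return amount > LIMIT_EUR


def rolling_sum_rule(history, window=timedelta(hours=24)):
    """Flag if the total over the trailing window exceeds the limit.

    `history` is a list of (timestamp, amount) tuples sorted by time.
    The 24-hour window is an assumption chosen for illustration.
    """
    latest = history[-1][0]
    total = sum(amt for ts, amt in history if latest - ts <= window)
    return total > LIMIT_EUR


# Hypothetical structured payments: three transfers just below the limit.
start = datetime(2024, 1, 1, 9, 0)
payments = [(start + timedelta(hours=i), 9_900.0) for i in range(3)]

static_flags = [static_rule(amt) for _, amt in payments]
window_flag = rolling_sum_rule(payments)
```

Each payment passes the static rule on its own, while the windowed total of EUR 29,700 is flagged. A fraudster can still adapt to a fixed window, which is why the approach described next relies on many windows and a learned model rather than a single handcrafted check.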
The streaming architecture
We replaced the batch-oriented rule engine with a real-time streaming pipeline built on Apache Kafka and Apache Flink:
    pipeline:
      ingestion:
        source: kafka
        topic: raw-transactions
        throughput: 12,000 events/sec
        format: avro
      feature_engineering:
        engine: flink
        windows:
          - type: sliding
            size: 1h
            slide: 5min
            features: [tx_count, tx_sum, unique_merchants, avg_amount]
          - type: session
            gap: 30min
            features: [session_duration, session_tx_count, geo_spread]
          - type: tumbling
            size: 24h
            features: [daily_volume, new_merchant_ratio, cross_border_ratio]
      scoring:
        model: gradient_boosted_ensemble
        latency_budget: 50ms
        fallback: rule_engine_v2
      action:
        - {threshold: 0.92, action: block_transaction}
        - {threshold: 0.75, action: flag_for_review}
        - {threshold: 0.50, action: enhanced_monitoring}
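The action thresholds in this configuration amount to a simple ordered lookup on the model score. The sketch below shows one way to express it. The threshold values and action names are taken from the configuration; the inclusive comparison and the "allow" default for scores below 0.50 are assumptions, since the configuration does not state either.

```python
# Thresholds and action names mirror the `action` section of the config.
ACTIONS = [
    (0.92, "block_transaction"),
    (0.75, "flag_for_review"),
    (0.50, "enhanced_monitoring"),
]


def decide(score: float) -> str:
    """Return the action for the highest threshold the score reaches."""
    for threshold, action in ACTIONS:  # ordered from strictest to mildest
        if score >= threshold:
            return action
    return "allow"  # hypothetical default; the config names no action below 0.50


decisions = [decide(s) for s in (0.95, 0.80, 0.50, 0.10)]
```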
The feature engineering layer is the core innovation. For each incoming transaction, Flink computes 147 features across multiple time windows, from 5-minute micro-patterns to 30-day behavioral baselines. This gives the model the temporal context that the rule engine, which evaluated each transaction in isolation, could not provide.
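As a rough illustration of what one such window produces, the sketch below computes the four sliding-window features named in the configuration (tx_count, tx_sum, unique_merchants, avg_amount) for a single account. It is plain Python meant to show the semantics, not Flink's API, and the tuple layout and sample events are assumptions.

```python
from datetime import datetime, timedelta


def window_features(transactions, now, size=timedelta(hours=1)):
    """Compute the four sliding-window features for one account.

    `transactions` is an iterable of (timestamp, amount, merchant_id) tuples.
    Only events in the half-open interval (now - size, now] are counted.
    """
    in_window = [t for t in transactions if now - size < t[0] <= now]
    amounts = [amount for _, amount, _ in in_window]
    tx_count = len(amounts)
    tx_sum = sum(amounts)
    return {
        "tx_count": tx_count,
        "tx_sum": tx_sum,
        "unique_merchants": len({m for _, _, m in in_window}),
        "avg_amount": tx_sum / tx_count if tx_count else 0.0,
    }


# Hypothetical events for a single account.
now = datetime(2024, 1, 1, 12, 0)
events = [
    (now - timedelta(minutes=90), 40.0, "m1"),  # outside the 1h window
    (now - timedelta(minutes=30), 20.0, "m1"),
    (now - timedelta(minutes=10), 60.0, "m2"),
]
features = window_features(events, now)
```

With a 5-minute slide, a function like this would be re-evaluated every five minutes per account. A streaming engine maintains the aggregates incrementally instead of rescanning the event list as this sketch does.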
The model
We use a gradient-boosted ensemble (XGBoost) rather than deep learning for two reasons: interpretability requirements from the bank's compliance team, and the strict 50ms latency budget that rules out heavier architectures.
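    import xgboost as xgb

    # X_train, y_train, X_val, y_val: feature matrices and fraud labels prepared upstream.
    model = xgb.XGBClassifier(
        n_estimators=350,
        max_depth=8,
        learning_rate=0.05,
        subsample=0.8,
        colsample_bytree=0.7,
        scale_pos_weight=580,  # severe class imbalance: 1 fraud per 580 legit
        tree_method="hist",
        eval_metric="aucpr",
        early_stopping_rounds=20,  # constructor argument since XGBoost 1.6; removed from fit() in 2.0
    )

    # Training stops once validation AUC-PR has not improved for 20 rounds.
    model.fit(
        X_train, y_train,
        eval_set=[(X_val, y_val)],
    )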
The scale_pos_weight parameter is critical — fraud is extremely rare (0.17% of transactions), and without proper handling of class imbalance, the model would learn to predict "legitimate" for everything and still achieve 99.83% accuracy.
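The arithmetic behind these numbers is easy to check. The sketch below derives the weight as the ratio of negative to positive examples, which is the usual starting point suggested in the XGBoost documentation, and computes the accuracy of a model that always predicts "legitimate". The toy label vector is an assumption built to match the 1-in-580 ratio quoted above.

```python
def scale_pos_weight(labels):
    """Ratio of negative (legitimate) to positive (fraud) examples."""
    positives = sum(labels)
    negatives = len(labels) - positives
    return negatives / positives


def majority_baseline_accuracy(labels):
    """Accuracy of a model that predicts 'legitimate' (0) for every row."""
    return labels.count(0) / len(labels)


# Toy label vector with the ratio quoted above: 1 fraud per 580 legitimate.
labels = [1] + [0] * 580

weight = scale_pos_weight(labels)
baseline = majority_baseline_accuracy(labels)
fraud_share = 1 - baseline
```

Here weight is 580.0, the baseline accuracy rounds to 99.83%, and the fraud share rounds to 0.17%. This is also why the training code evaluates with aucpr rather than accuracy.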
Catching synthetic identities
The breakthrough came from a set of graph-based features we engineered from the bank's transaction network. Synthetic identities often share characteristics that are invisible at the individual level but emerge when you look at the network:
- Multiple new accounts sharing the same device fingerprint or IP address
- Transaction patterns that mirror each other too closely (automated behavior)
- Sudden appearance of a "well-established" credit history (fabricated through data manipulation)
We computed these features using a lightweight graph analysis running alongside the main pipeline, updating relationship scores every 5 minutes.
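The first signal in the list above, several accounts sharing one device fingerprint, can be sketched with a simple grouping. This is an illustrative simplification under stated assumptions, not the bank's graph analysis: the account-to-device mapping, the function name, and the sample data are all hypothetical.

```python
from collections import defaultdict


def shared_device_counts(accounts):
    """For each account, count the other accounts using the same device.

    `accounts` maps account_id -> device_fingerprint. Both field names
    are hypothetical placeholders for whatever identifiers the bank logs.
    """
    by_device = defaultdict(set)
    for account_id, device in accounts.items():
        by_device[device].add(account_id)
    return {
        account_id: len(by_device[device]) - 1
        for account_id, device in accounts.items()
    }


# Hypothetical data: three accounts share one device, one stands alone.
accounts = {"acc1": "devA", "acc2": "devA", "acc3": "devA", "acc4": "devB"}
counts = shared_device_counts(accounts)
```

A count like this could be joined to each transaction as one more model feature. Restricting it to recently opened accounts, as the list above suggests, would additionally require an account-age field.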
Results
After a 6-month deployment with parallel running (new system alongside the old one for validation):
- Fraud detection rate: 89.4% (up from 62.1% with rules alone)
- False positive rate: 0.031% (down from 0.089% — fewer legitimate customers inconvenienced)
- Average detection latency: 38ms (vs. 12–36 hours with the batch system)
- Estimated annual savings: EUR 31 million in prevented fraud losses
- Rule maintenance team: reduced from 8 analysts to 2, with the others redeployed to strategic fraud intelligence
The system processes over 1 billion transactions per month and continues to improve as the model is retrained weekly with confirmed fraud labels.
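For readers comparing these figures with their own systems, the sketch below shows how the two headline rates are conventionally computed from confusion-matrix counts. It assumes "fraud detection rate" means recall and "false positive rate" means the share of legitimate transactions flagged; the source does not spell out either definition. The counts are made up for illustration and are not the bank's data.

```python
def detection_rate(tp: int, fn: int) -> float:
    """Share of actual fraud cases that were caught (recall)."""
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of legitimate transactions that were wrongly flagged."""
    return fp / (fp + tn)


# Made-up counts for illustration only; these are not the bank's figures.
tp, fn, fp, tn = 90, 10, 5, 9_995

recall = detection_rate(tp, fn)
fpr = false_positive_rate(fp, tn)
```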


