Fraud Detection System
An AI-powered system that catches 84% of fraud while keeping false alarms under 0.05%, with inference latency under 50 ms
$2.7M
Annual Savings
83.8%
Fraud Caught
75.2%
Alert Accuracy
What I Built
A fraud detection system built to test feature performance across multiple algorithms and optimize for the highest fraud detection rate.
The dataset revealed the core challenge: while credit card fraud costs businesses $32 billion annually, only 0.17% of transactions are actually fraudulent.
This extreme imbalance makes some traditional approaches ineffective: most systems either miss fraud or drown analysts in false alarms.
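To see why the imbalance matters, consider a baseline that never flags anything. The sketch below is illustrative only: it assumes a hypothetical sample of 100,000 transactions at the 0.17% fraud rate quoted above.

```python
# Hypothetical sample: 100,000 transactions at a 0.17% fraud rate.
total = 100_000
frauds = round(total * 0.0017)
legit = total - frauds

# Baseline "model" that labels every transaction as legitimate.
true_negatives = legit
true_positives = 0

accuracy = (true_negatives + true_positives) / total
recall = true_positives / frauds

print(f"accuracy = {accuracy:.2%}")  # very high, yet no fraud is caught
print(f"recall   = {recall:.2%}")
```

Accuracy looks excellent while recall is zero, which is why the rest of this write-up reports recall and precision instead.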
Selecting the right approach
Instead of jumping straight to testing algorithms, I started by asking: 'What makes a transaction suspicious?' This human-centered question shaped everything that followed.
Data Analysis
Analyzed 284K transactions to uncover risk patterns
Feature engineering
Created 21 custom features combining domain knowledge with statistical methods
Algorithm testing
Compared three algorithms and selected XGBoost
Business results
Calculated $2.7M annual value and performed segment analysis to translate model performance into business terms
Data Analysis
Transactions
Analyzed 284K transactions over 2 days
Outliers
Discovered that Isolation Forest outliers had 217× the baseline fraud concentration
High risk
Identified night transactions as carrying 3× higher risk
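A concentration figure like "217×" or "3×" can be read as a lift: the fraud rate inside a flagged group divided by the overall fraud rate. The sketch below shows that calculation on made-up toy data; `fraud_lift` is a hypothetical helper name, not a function from the project.

```python
def fraud_lift(flagged, is_fraud):
    """Fraud rate among flagged transactions divided by the overall fraud rate."""
    if len(flagged) != len(is_fraud):
        raise ValueError("inputs must have the same length")
    n_flagged = sum(flagged)
    n_fraud = sum(is_fraud)
    if n_flagged == 0 or n_fraud == 0:
        raise ValueError("need at least one flagged and one fraudulent transaction")
    overall_rate = n_fraud / len(is_fraud)
    flagged_rate = sum(f and y for f, y in zip(flagged, is_fraud)) / n_flagged
    return flagged_rate / overall_rate


# Toy data (made up): 10 transactions, 2 frauds, 2 flagged, 1 flagged fraud.
flagged = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
is_fraud = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

lift = fraud_lift(flagged, is_fraud)
print(f"lift = {lift:.1f}x")
```

The same function applies whether the flag comes from an outlier detector or from a simple rule such as "transaction happened at night".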
Feature engineering
Created 21 custom features in 3 tiers. The top engineered feature (pca_magnitude) became the most important feature overall (34.5% of model weight).
Statistical
Domain Specific
Advanced
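As an illustration of what two such features might look like, here is a minimal sketch. The definitions are assumptions: the write-up names `pca_magnitude` but does not define it, so it is taken here as the Euclidean norm of a transaction's PCA components, and the night window (22:00 to 06:00, with the time column assumed to start at midnight) is hypothetical.

```python
import math


def pca_magnitude(components):
    """Euclidean norm of a transaction's PCA components (assumed definition)."""
    return math.sqrt(sum(c * c for c in components))


def is_night(seconds_since_start, night_start=22, night_end=6):
    """Flag hours inside an assumed night window; assumes time starts at midnight."""
    hour = int(seconds_since_start // 3600) % 24
    return hour >= night_start or hour < night_end


# Toy transaction (made up): 25 hours into the data, three PCA components.
row = {"time": 90_000, "components": [3.0, -4.0, 0.0]}

features = {
    "pca_magnitude": pca_magnitude(row["components"]),
    "is_night": is_night(row["time"]),
}
print(features)
```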
Algorithm testing
Compared 3 algorithms and selected XGBoost, which reached 83.8% recall despite the extreme class imbalance
| Algorithm | Recall | Precision | ROC-AUC | Status |
|---|---|---|---|---|
| Logistic Regression | 79.4% | 63.2% | 0.951 | Lower recall |
| Random Forest | 81.7% | 71.8% | 0.963 | Slower inference |
| XGBoost | 83.8% | 75.2% | 0.968 | Selected |
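The write-up does not say how the class imbalance was handled. One common option in XGBoost is the `scale_pos_weight` parameter, often set to the ratio of negative to positive examples. The sketch below only computes that ratio from the approximate counts quoted here (about 284K transactions, 492 frauds); the other parameter values are illustrative assumptions, not the project's configuration.

```python
# Approximate counts from this write-up: about 284K transactions, 492 frauds.
n_total = 284_000
n_fraud = 492
n_legit = n_total - n_fraud

# Common heuristic: weight positives by the negative-to-positive ratio.
scale_pos_weight = n_legit / n_fraud

# Hypothetical parameter dict; only scale_pos_weight is derived from the data.
params = {
    "objective": "binary:logistic",
    "eval_metric": "aucpr",
    "scale_pos_weight": round(scale_pos_weight, 1),
}
print(params)
```

A dict like this could be passed to an XGBoost training call; whether the weighting helps in practice depends on the threshold chosen afterwards.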
Business results
Real-time performance dashboard
Cost-Benefit Breakdown
How catching fraud affects revenue, and how much the system can save.
Without a System
All 492 frauds succeed → $3.3M lost per year
With a System
Fraud Prevented: 413 frauds → $2.77M saved
Missed: 79 frauds → $535K loss
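The arithmetic behind this breakdown can be checked in a few lines. The sketch uses only the figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the breakdown above (USD per year).
saved = 2_770_000   # 413 frauds prevented
missed = 535_000    # 79 frauds missed

total_exposure = saved + missed           # loss with no system in place
share_prevented = saved / total_exposure  # fraction of fraud value stopped
avg_loss_per_fraud = total_exposure / (413 + 79)

print(f"total exposure:     ${total_exposure:,.0f}")
print(f"share prevented:    {share_prevented:.1%}")
print(f"avg loss per fraud: ${avg_loss_per_fraud:,.0f}")
```

The total comes to roughly the $3.3M quoted for the no-system case. This sketch does not subtract the cost of reviewing false alarms, since no such cost is given here.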
Technical Performance
Comprehensive performance metrics and technical achievements of the fraud detection system.
Recall
Catches 413 out of 492 frauds
Precision
3 out of 4 alerts are real fraud
ROC-AUC
Near-perfect discrimination
False Alarm Rate
Only 41 false positives per 85K transactions
Latency
Real-time capable
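For reference, the metrics above all derive from confusion-matrix counts. The sketch below shows the definitions on illustrative counts; these are not the project's actual confusion matrix, and `detection_metrics` is a hypothetical helper.

```python
def detection_metrics(tp, fp, fn, tn):
    """Recall, precision, and two false-alarm measures from confusion counts."""
    total = tp + fp + fn + tn
    return {
        "recall": tp / (tp + fn),              # share of frauds caught
        "precision": tp / (tp + fp),           # share of alerts that are fraud
        "false_positive_rate": fp / (fp + tn), # false alarms per legitimate txn
        "false_alarms_per_txn": fp / total,    # false alarms per any txn
    }


# Illustrative counts only (made up for this example).
m = detection_metrics(tp=80, fp=20, fn=20, tn=99_880)
for name, value in m.items():
    print(f"{name}: {value:.4%}")
```

A figure such as "41 false positives per 85K transactions" corresponds to the last measure, which divides by all transactions rather than only legitimate ones.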
Segment Analysis (Honest Assessment)
The main challenge was balancing recall (catching fraud) against precision (minimizing false alarms) without business context. I solved it by calculating cost-benefit tradeoffs at different thresholds.
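One way to make a cost-benefit threshold search concrete is sketched below. It is an assumption about the general approach, not the project's code: each alert is charged a hypothetical flat review cost, each missed fraud costs its transaction amount, and the threshold with the lowest total cost wins. All data values are made up.

```python
def expected_cost(scores, labels, amounts, threshold, review_cost):
    """Total cost at a threshold: review cost per alert plus missed fraud amounts."""
    cost = 0.0
    for score, is_fraud, amount in zip(scores, labels, amounts):
        if score >= threshold:
            cost += review_cost   # every alert is reviewed by an analyst
        elif is_fraud:
            cost += amount        # a missed fraud costs its full amount
    return cost


def best_threshold(scores, labels, amounts, candidates, review_cost=5.0):
    """Candidate threshold with the lowest total cost."""
    return min(
        candidates,
        key=lambda t: expected_cost(scores, labels, amounts, t, review_cost),
    )


# Toy data (made up): model scores, fraud labels, transaction amounts in USD.
scores = [0.95, 0.80, 0.60, 0.40, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0]
amounts = [900.0, 300.0, 50.0, 8.0, 40.0, 25.0]
candidates = [0.3, 0.5, 0.7, 0.9]

chosen = best_threshold(scores, labels, amounts, candidates)
print(f"chosen threshold: {chosen}")
```

In this toy case the cheapest threshold lets one small fraud through, because reviewing the extra alerts would cost more than the fraud itself.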
Strengths
High-value fraud (>$500): 94% recall
Medium transactions ($100-$500): 89% recall
Night transactions: 91% recall
Isolation Forest for feature creation: outlier scores had 217× fraud concentration
Weaknesses
Micro-transactions (<$10): 78% recall
Very small frauds are likely card-testing patterns
What Worked Well
Feature engineering over algorithm choice
Business-driven threshold optimization
Segment analysis
Isolation Forest for feature creation

Technologies Used