jessicaromero-ctrl/Big-Data-Applications

Big Data Applications

Real-Time Insurance Fraud Detection Pipeline

Overview

An end-to-end, real-time fraud detection system for insurance claims processing. It combines streaming data ingestion, in-stream ML inference, and low-latency storage to support sub-second decision making.

Architecture

Claims Stream → Apache Kafka → PyFlink (XGBoost inference)
                                      ↓
                             Redis (feature cache)
                                      ↓
                            PostgreSQL (audit log)
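
To make the data flow in the diagram concrete, here is a minimal, standard-library-only sketch of one pass through the pipeline. It is not the repository's code. Every name in it (`Claim`, `score_claim`, `run_pipeline`, `make_audit_db`, the `audit_log` columns, the 0.5 threshold) is hypothetical. A list of JSON strings stands in for the Kafka topic, a plain dict for Redis, an in-memory SQLite database for PostgreSQL, and a hand-written rule for the XGBoost model. The diagram does not say when the cache is read versus written, so the read-before-score order here is an assumption.

```python
import json
import sqlite3
from dataclasses import dataclass, asdict


@dataclass
class Claim:
    # Hypothetical claim schema; the real event fields are not documented here.
    claim_id: str
    policy_id: str
    amount: float


def score_claim(features: dict) -> float:
    """Placeholder for model inference: returns a pseudo-probability in [0, 1]."""
    # Assumed rule: amounts far above the policy's running average look riskier.
    ratio = features["amount"] / max(features["avg_amount"], 1.0)
    return min(ratio / 10.0, 1.0)


def make_audit_db() -> sqlite3.Connection:
    """Create an in-memory audit table (stand-in for the PostgreSQL audit log)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE audit_log "
        "(claim_id TEXT, policy_id TEXT, score REAL, flagged INTEGER)"
    )
    return db


def run_pipeline(raw_events, feature_cache, audit_db, threshold=0.5):
    """Deserialize each claim, score it, update cached features, log the decision."""
    decisions = []
    for raw in raw_events:  # stand-in for a Kafka consumer loop
        claim = Claim(**json.loads(raw))
        # Read per-policy running totals from the cache (stand-in for Redis).
        stats = feature_cache.get(claim.policy_id, {"count": 0, "total": 0.0})
        avg = stats["total"] / stats["count"] if stats["count"] else claim.amount
        score = score_claim({"amount": claim.amount, "avg_amount": avg})
        flagged = score >= threshold
        # Write the updated features back so the next claim sees them.
        feature_cache[claim.policy_id] = {
            "count": stats["count"] + 1,
            "total": stats["total"] + claim.amount,
        }
        audit_db.execute(
            "INSERT INTO audit_log VALUES (?, ?, ?, ?)",
            (claim.claim_id, claim.policy_id, score, int(flagged)),
        )
        decisions.append((claim.claim_id, flagged))
    audit_db.commit()
    return decisions
```

With two claims on the same policy, `run_pipeline([json.dumps(asdict(Claim("c1", "p1", 100.0))), json.dumps(asdict(Claim("c2", "p1", 900.0)))], {}, make_audit_db())` flags only the second, because its amount is nine times the cached average. In a real deployment each stand-in would be replaced by the corresponding client or operator.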

Key Components

| Component    | Role                                     |
|--------------|------------------------------------------|
| Apache Kafka | Event streaming & ingestion              |
| PyFlink      | Stream processing & feature computation  |
| XGBoost      | Real-time fraud scoring                  |
| Redis        | Sub-millisecond feature caching          |
| PostgreSQL   | Persistent audit and results storage     |
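
As an illustration of the "feature computation" and "feature caching" roles listed above, the sketch below computes a per-policy claim count over a trailing time window and stores it with an expiry. It is a plain-Python approximation under stated assumptions, not PyFlink or Redis code. `SlidingWindowCounter` and `TTLCache` are hypothetical names, timestamps are assumed to arrive in order, and the window and TTL lengths are arbitrary.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts events per key within the trailing `window_s` seconds of event time."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self._events = defaultdict(deque)  # key -> timestamps, oldest first

    def add(self, key: str, ts: float) -> int:
        """Record an event at time `ts` and return the in-window count for `key`."""
        q = self._events[key]
        q.append(ts)
        # Evict timestamps outside the half-open window (ts - window_s, ts].
        while q and q[0] <= ts - self.window_s:
            q.popleft()
        return len(q)


class TTLCache:
    """Tiny in-process stand-in for a key-value cache with per-entry expiry."""

    def __init__(self, ttl_s: float):
        self.ttl_s = ttl_s
        self._data = {}  # key -> (value, expires_at)

    def set(self, key, value, now: float) -> None:
        self._data[key] = (value, now + self.ttl_s)

    def get(self, key, now: float, default=None):
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            self._data.pop(key, None)  # drop expired entries lazily
            return default
        return entry[0]
```

A caller might run `count = counter.add(policy_id, event_ts)` for each claim and then `cache.set(policy_id, count, now=event_ts)`, so that the scoring step can look up a recent claim count without recomputing it. Passing time in explicitly, rather than reading the system clock, keeps the sketch deterministic and easy to test.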

Master's in Artificial Intelligence · Big Data · Universidad Politécnica Metropolitana de Hidalgo
