Industry-grade computer vision for dense face detection, precise 5-point landmark localisation, and one-to-many identity verification — production-ready and deployed live.
RetinaFace Pro is a high-performance computer vision pipeline designed for dense face detection and identity verification. Unlike simple research scripts, this project is built as a modular ML system wrapping state-of-the-art models into a professional engineering architecture.
- ✅ Production Wrapper: Clean, documented Python API (src/detector.py).
- ✅ ML Monitoring: Integrated MLflow tracking for inference latency and confidence drift.
- ✅ Reliability: 100% testable logic with Pytest and automated CI/CD.
- ✅ Portability: Fully Dockerized for zero-friction deployment on Hugging Face Spaces.
Tested on: NVIDIA GTX 1650 GPU | 16GB RAM | Intel i7
| Task | Latency (ms) | Throughput (FPS) |
|---|---|---|
| Single Face Detection | ~45 | ~22 |
| Multi-Face (5+) Detection | ~65 | ~15 |
| Identity Verification | ~110 | ~9 |
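
The throughput column follows from the latency column as FPS ≈ 1000 / latency (ms). A minimal sketch of how figures like these could be measured is shown here; `benchmark` and `latency_to_fps` are hypothetical helpers, not part of this repository, and the optional `log_metric` callback is assumed to accept `(key, value, step)`, which matches the positional form of `mlflow.log_metric`.

```python
import statistics
import time


def latency_to_fps(latency_ms):
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def benchmark(fn, warmup=3, runs=20, log_metric=None):
    """Time fn() and return (median latency in ms, throughput in FPS).

    log_metric is an optional callback taking (key, value, step),
    for example mlflow.log_metric inside an active run (assumption).
    """
    for _ in range(warmup):  # warm-up calls are executed but not timed
        fn()
    latencies_ms = []
    for step in range(runs):
        start = time.perf_counter()
        fn()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        latencies_ms.append(elapsed_ms)
        if log_metric is not None:
            log_metric("latency_ms", elapsed_ms, step)
    median_ms = statistics.median(latencies_ms)
    return median_ms, latency_to_fps(median_ms)


if __name__ == "__main__":
    # Stand-in workload; replace with a real detector call.
    median_ms, fps = benchmark(lambda: time.sleep(0.005), runs=5)
    print(f"median latency: {median_ms:.1f} ms, throughput: {fps:.1f} FPS")
```

Reporting the median rather than the mean keeps a single slow frame (for example, a first-call GPU initialisation that slipped past warm-up) from skewing the result.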
We prioritized engineering trade-offs over blind model selection. Here is the reasoning behind each choice:
While two-stage detectors (such as Faster R-CNN) are accurate, they are often too slow for real-time inference. We chose RetinaFace because it is a single-stage detector: its Feature Pyramid Network (FPN) handles faces at multiple scales (tiny to large) in one forward pass, avoiding the overhead of a separate Region Proposal Network (RPN).
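
To make the single-pass, multi-scale idea concrete, the sketch below counts the anchors a pyramid evaluates in one forward pass. The strides (8, 16, 32) and the two anchor sizes per level are assumptions taken from commonly used MobileNet-0.25 RetinaFace configurations; this repository's actual settings may differ.

```python
import math

# Assumed pyramid configuration (not confirmed for this repository):
# one stride per FPN level, and two square anchor sizes per level.
STRIDES = (8, 16, 32)
ANCHOR_SIZES = ((16, 32), (64, 128), (256, 512))


def anchors_per_level(height, width):
    """Return the number of anchors on each pyramid level for one image."""
    counts = []
    for stride, sizes in zip(STRIDES, ANCHOR_SIZES):
        rows = math.ceil(height / stride)  # feature-map height at this level
        cols = math.ceil(width / stride)   # feature-map width at this level
        counts.append(rows * cols * len(sizes))
    return counts


counts = anchors_per_level(640, 640)
print(counts, sum(counts))  # [12800, 3200, 800] 16800
```

The fine level (stride 8) carries most anchors and covers small faces, while the coarse level (stride 32) covers large ones; all of them are scored in the same pass, with no proposal stage in between.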
Plain softmax loss struggles with face verification because it does not explicitly optimize for compact, well-separated embeddings. We implemented ArcFace (Additive Angular Margin Loss) because it adds an angular margin between each embedding and its class centre on the hypersphere. This pulls embeddings of the same identity closer together and yields better identity separability than training with plain softmax.
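
The additive angular margin can be illustrated in a few lines. This is a simplified sketch, not the training code of this project: it takes the cosines between one normalised embedding and each class centre, adds the margin only to the target-class angle, and rescales. The defaults `margin=0.5` and `scale=64.0` are the values commonly cited for ArcFace and are assumptions here; real implementations also handle the case where the angle plus the margin exceeds π.

```python
import math


def arcface_logits(cosines, target, margin=0.5, scale=64.0):
    """Replace cos(theta) with cos(theta + margin) for the target class only."""
    logits = []
    for idx, cos_theta in enumerate(cosines):
        if idx == target:
            theta = math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp for safety
            cos_theta = math.cos(theta + margin)
        logits.append(scale * cos_theta)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


# Cosine similarity of one embedding to three class centres; class 0 is correct.
cosines = [0.8, 0.3, 0.1]
plain = cross_entropy(arcface_logits(cosines, 0, margin=0.0), 0)
with_margin = cross_entropy(arcface_logits(cosines, 0), 0)
print(f"loss without margin: {plain:.2e}, with margin: {with_margin:.2e}")
```

With the margin, an already well-classified sample still incurs a larger loss, so training keeps pushing it towards its class centre. That pressure is what produces the tighter clusters.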
To achieve our ~45ms inference latency on a GTX 1650, we used the MobileNet-0.25 backbone. Its small compute footprint keeps the system responsive even in dense multi-face scenarios, without requiring expensive A100 GPUs.
Verification accuracy drops by ~15-20% if faces are not aligned. Our pipeline performs an explicit 5-point landmark transformation (eyes, nose, mouth corners) before generating embeddings, ensuring that the ArcFace backbone receives spatially consistent input.
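
The alignment step can be sketched as a least-squares similarity transform (rotation, uniform scale, translation) that maps the five detected landmarks onto a fixed template. The template below is the set of reference coordinates commonly used for 112x112 ArcFace crops; whether this repository uses the same template is an assumption, and `estimate_similarity` and `apply_transform` are hypothetical helper names.

```python
import math

# Reference landmark positions for a 112x112 crop (assumed template).
ARCFACE_TEMPLATE = [
    (38.2946, 51.6963),  # left eye
    (73.5318, 51.5014),  # right eye
    (56.0252, 71.7366),  # nose tip
    (41.5493, 92.3655),  # left mouth corner
    (70.7299, 92.2041),  # right mouth corner
]


def estimate_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst.

    Returns a 2x3 matrix [[a, -b, tx], [b, a, ty]].
    """
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = norm = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        px, py, qx, qy = px - sx, py - sy, qx - dx, qy - dy  # centre both sets
        dot += px * qx + py * qy
        cross += px * qy - py * qx
        norm += px * px + py * py
    a, b = dot / norm, cross / norm  # a = s*cos(angle), b = s*sin(angle)
    tx = dx - (a * sx - b * sy)
    ty = dy - (b * sx + a * sy)
    return [[a, -b, tx], [b, a, ty]]


def apply_transform(matrix, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


if __name__ == "__main__":
    # Simulate a detected face: the template rotated 30 degrees, scaled 2x, shifted.
    angle = math.radians(30)
    detected = [
        (2 * (math.cos(angle) * x - math.sin(angle) * y) + 10,
         2 * (math.sin(angle) * x + math.cos(angle) * y) - 5)
        for x, y in ARCFACE_TEMPLATE
    ]
    matrix = estimate_similarity(detected, ARCFACE_TEMPLATE)
    print([tuple(round(v, 2) for v in apply_transform(matrix, p)) for p in detected])
```

The resulting 2x3 matrix has the layout expected by affine warping routines such as OpenCV's `cv2.warpAffine`, which would then produce the aligned crop passed to the embedding model.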
Ratina_Face/
├── src/ # Core logic — FaceDetector wrapper
├── tests/ # Automated test suite (pytest)
├── .github/workflows/ # CI/CD pipelines (GitHub Actions)
├── app.py # Streamlit web application
├── main.py # Command-line tool
├── Dockerfile # Production container setup
└── requirements.txt # Managed dependencies
git clone https://github.com/ImdataScientistSachin/RetinaFace-Detection
cd RetinaFace-Detection
python -m pip install -r requirements.txt

To view inference logs and latency benchmarks:

mlflow ui

docker build -t retinaface-pro .
docker run -p 7860:7860 retinaface-pro

Sachin Paunikar | LinkedIn | GitHub