# DeepGuard

A production-grade, end-to-end MLOps pipeline for detecting AI-generated (deepfake) human faces. This project demonstrates industry-standard practices including data versioning, experiment tracking, containerization, CI/CD automation, and cloud deployment.
## Table of Contents

- Project Overview
- Key Features
- Model Performance
- Application Interface
- MLOps Architecture
- Technology Stack
- Project Structure
- Installation and Usage
- CI/CD Pipeline
- AWS Infrastructure
- Live Demo
- Model Limitations
- License
## Project Overview

With the rise of Generative Adversarial Networks (GANs) and diffusion models, distinguishing between real and synthetic media has become a critical challenge. DeepGuard leverages a Deep Convolutional Neural Network (CNN) trained on the GenImage dataset to classify images as either "REAL" or "FAKE" (AI-generated).
The project implements a complete machine learning lifecycle:
- Data Management: Automated ingestion, preprocessing, and versioning
- Model Development: Transfer learning with experiment tracking
- Deployment: Containerized application with CI/CD automation
- Infrastructure: Cloud-native deployment on AWS
## Key Features

| Feature | Description |
|---|---|
| Robust Detection Model | Trained on 140,000+ images achieving >95% validation accuracy |
| DVC Pipeline | 6-stage reproducible ML pipeline with data versioning |
| Experiment Tracking | MLflow + DagsHub integration for metrics and artifacts |
| Containerization | Docker image with optimized TensorFlow runtime |
| CI/CD Automation | GitHub Actions for testing, building, and deployment |
| Cloud Storage | AWS S3 for DVC remote, ECR for container registry |
| Dual Deployment | Flask API + Hugging Face Gradio interface |
| FFT Analysis | Frequency domain visualization for GAN artifact detection |
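The FFT analysis listed above can be sketched with NumPy (a simplified illustration of the idea, not the project's actual implementation):

```python
import numpy as np

def fft_spectrum(gray_image: np.ndarray) -> np.ndarray:
    """Return the log-magnitude frequency spectrum, DC component centered.

    Upsampling layers in GANs often leave periodic artifacts that show up
    as bright off-center peaks in this spectrum, even when the image looks
    natural in the spatial domain.
    """
    freq = np.fft.fftshift(np.fft.fft2(gray_image))
    return np.log1p(np.abs(freq))
```

For an uploaded image, the spectrum can be rendered as a heatmap next to the prediction, e.g. `fft_spectrum(np.asarray(img.convert("L"), dtype=float))` for a Pillow image.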
## Model Performance

The model uses Transfer Learning (Xception architecture) fine-tuned for deepfake artifact detection.
| Metric | Value |
|---|---|
| Training Accuracy | ~99% |
| Validation Accuracy | ~95% |
| Test Accuracy | ~88% |
| Dataset Size | 140,000 images |
| Image Sources | Stable Diffusion, Midjourney, DALL-E |
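A minimal sketch of this transfer-learning setup in Keras (layer choices and hyperparameters are illustrative assumptions, not the project's exact configuration):

```python
import tensorflow as tf
from tensorflow import keras

def build_model(input_shape=(299, 299, 3), weights="imagenet") -> keras.Model:
    # Xception backbone without its ImageNet classification head
    base = keras.applications.Xception(
        include_top=False, weights=weights,
        input_shape=input_shape, pooling="avg",
    )
    base.trainable = False  # freeze backbone for the first training phase

    inputs = keras.Input(shape=input_shape)
    x = keras.applications.xception.preprocess_input(inputs)
    x = base(x, training=False)
    x = keras.layers.Dropout(0.3)(x)
    outputs = keras.layers.Dense(1, activation="sigmoid")(x)  # FAKE probability

    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(1e-4),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

A common second phase is to unfreeze the top blocks of `base` and fine-tune at a much lower learning rate.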
## Application Interface

The application provides a clean, user-friendly interface for real-time deepfake analysis.
The model provides a confidence score and prediction label for every uploaded image.
The model distinguishes between high-quality AI-generated faces and authentic photographs.
*(Side-by-side example: AI-generated (deepfake) face vs. real photograph.)*
## MLOps Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                    DeepGuard MLOps Pipeline                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│  │   GitHub     │───>│   GitHub     │───>│     AWS      │       │
│  │  Repository  │    │   Actions    │    │     ECR      │       │
│  └──────────────┘    └──────────────┘    └──────────────┘       │
│         │                   │                   │               │
│         ▼                   ▼                   ▼               │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│  │     DVC      │───>│    Docker    │───>│   AWS EKS    │       │
│  │   Pipeline   │    │    Build     │    │  Kubernetes  │       │
│  └──────────────┘    └──────────────┘    └──────────────┘       │
│         │                                       │               │
│         ▼                                       ▼               │
│  ┌──────────────┐                        ┌──────────────┐       │
│  │   AWS S3     │                        │  Prometheus  │       │
│  │   (Data)     │                        │  + Grafana   │       │
│  └──────────────┘                        └──────────────┘       │
│         │                                                       │
│         ▼                                                       │
│  ┌──────────────┐                                               │
│  │   DagsHub    │                                               │
│  │   MLflow     │                                               │
│  └──────────────┘                                               │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
The ML pipeline is defined in `dvc.yaml` with six reproducible stages:

| Stage | Script | Output |
|---|---|---|
| 1. Data Ingestion | `src/data/data_ingestion.py` | `data/raw/` |
| 2. Preprocessing | `src/data/data_preprocessing.py` | `data/processed/` |
| 3. Feature Engineering | `src/features/feature_engineering.py` | `data/features/` |
| 4. Model Training | `src/model/model_building.py` | `models/` |
| 5. Evaluation | `src/model/model_evaluation.py` | `reports/` |
| 6. Registration | `src/model/register_model.py` | MLflow Registry |
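For context, a `dvc.yaml` for a pipeline like this typically looks as follows (a hedged sketch: the dependency lists and parameter keys are illustrative, not copied from the repository):

```yaml
stages:
  data_ingestion:
    cmd: python src/data/data_ingestion.py
    deps:
      - src/data/data_ingestion.py
    outs:
      - data/raw
  model_building:
    cmd: python src/model/model_building.py
    deps:
      - data/features
      - src/model/model_building.py
    params:
      - model_building.epochs
    outs:
      - models
  model_evaluation:
    cmd: python src/model/model_evaluation.py
    deps:
      - models
    metrics:
      - reports/metrics.json:
          cache: false
```

`dvc repro` walks this dependency graph and re-runs only the stages whose inputs changed.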
## Technology Stack

- Framework: TensorFlow 2.x, Keras
- Architecture: Xception (Transfer Learning)
- Data Processing: NumPy, Pandas, Pillow
- Data Versioning: DVC (Data Version Control)
- Experiment Tracking: MLflow, DagsHub
- Pipeline Orchestration: DVC Pipelines
- Containerization: Docker
- CI/CD: GitHub Actions
- Container Registry: AWS ECR
- Orchestration: AWS EKS (Kubernetes)
- Web Framework: Flask, Gradio
- Storage: AWS S3
- Compute: AWS EKS
- Monitoring: Prometheus, Grafana
## Project Structure

```
DeepGuard-MLOps-Pipeline/
├── .github/
│   └── workflows/
│       └── ci.yaml          # GitHub Actions CI/CD
├── data/
│   ├── raw/                 # Raw dataset (DVC tracked)
│   ├── processed/           # Preprocessed data
│   └── features/            # Feature engineered data
├── flask_app/
│   ├── app.py               # Flask application
│   ├── model/               # Production model
│   ├── samples/             # Sample images for demo
│   ├── static/              # CSS styles
│   ├── templates/           # HTML templates
│   └── requirements.txt     # Flask dependencies
├── models/                  # Trained model checkpoints
├── notebooks/               # Experiment notebooks
├── reports/
│   ├── figures/             # Evaluation plots
│   └── metrics.json         # Model metrics
├── src/
│   ├── data/                # Data processing modules
│   ├── features/            # Feature engineering
│   ├── model/               # Model training and evaluation
│   └── logger/              # Logging utilities
├── tests/                   # Test scripts
├── Dockerfile               # Container build instructions
├── dvc.yaml                 # DVC pipeline definition
├── params.yaml              # Pipeline parameters
└── requirements.txt         # Project dependencies
```
## Installation and Usage

Prerequisites:

- Python 3.10+
- Git
- Docker (optional, for containerized deployment)
```bash
# Clone the repository
git clone https://github.com/HarshTomar1234/DeepGuard-MLOps-Pipeline.git
cd DeepGuard-MLOps-Pipeline

# Create and activate a virtual environment
python -m venv atlas
# Windows
atlas\Scripts\activate
# Linux/Mac
source atlas/bin/activate

# Install dependencies and start the Flask app
pip install -r requirements.txt
python flask_app/app.py
```

Access the application at http://localhost:5000.
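The Flask app's prediction endpoint can be sketched roughly as follows (the route name, upload field name, and placeholder score are assumptions for illustration; see `flask_app/app.py` for the real implementation):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    file = request.files.get("image")
    if file is None:
        return jsonify(error="no image uploaded"), 400
    # Hypothetical: preprocess the image and run the Keras model here.
    score = 0.5  # placeholder for the model's FAKE probability
    label = "FAKE" if score >= 0.5 else "REAL"
    return jsonify(prediction=label, confidence=score)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```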
```bash
# Pull data from DVC remote
dvc pull

# Run the entire pipeline
dvc repro
```

```bash
# Build the image
docker build -t deepguard-app:latest .

# Run the container
docker run -p 8888:5000 deepguard-app:latest
```

Access at http://localhost:8888.
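The image built above could be produced by a Dockerfile along these lines (illustrative; the repository's actual Dockerfile may differ, e.g. in base image or TensorFlow runtime optimizations):

```dockerfile
FROM python:3.10-slim
WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY flask_app/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY flask_app/ .
EXPOSE 5000
CMD ["python", "app.py"]
```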
## CI/CD Pipeline

The project uses GitHub Actions for automated testing, building, and deployment.
```
Trigger: Push to main branch
│
├── Job 1: Test
│   ├── Checkout code
│   ├── Setup Python 3.11
│   ├── Install dependencies
│   └── Run tests
│
└── Job 2: Build & Push (if tests pass)
    ├── Configure AWS credentials
    ├── Login to Amazon ECR
    ├── Build Docker image
    └── Push to ECR registry
```

The workflow requires the following repository secrets:

| Secret | Description |
|---|---|
| `AWS_ACCESS_KEY_ID` | IAM user access key |
| `AWS_SECRET_ACCESS_KEY` | IAM user secret key |
| `AWS_REGION` | AWS region (e.g., `us-east-1`) |
| `AWS_ACCOUNT_ID` | 12-digit AWS account ID |
| `ECR_REPOSITORY` | ECR repository name |
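An illustrative sketch of what `.github/workflows/ci.yaml` might contain (job names and steps are assumptions based on the flow above, not the repository's exact workflow):

```yaml
name: ci
on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest tests/

  build-and-push:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}
      - uses: aws-actions/amazon-ecr-login@v2
        id: ecr
      - run: |
          docker build -t ${{ steps.ecr.outputs.registry }}/${{ secrets.ECR_REPOSITORY }}:latest .
          docker push ${{ steps.ecr.outputs.registry }}/${{ secrets.ECR_REPOSITORY }}:latest
```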
## AWS Infrastructure

| Service | Purpose | Cost |
|---|---|---|
| S3 | DVC data storage | ~$0.02/GB/month |
| ECR | Docker image registry | ~$0.10/GB/month |
| EKS | Kubernetes cluster | ~$0.10/hr + EC2 |
| EC2 | Prometheus/Grafana | ~$0.04/hr |
```
┌─────────────────────────────────────────────────────────┐
│                       AWS Cloud                         │
├─────────────────────────────────────────────────────────┤
│                                                         │
│   ┌──────────────┐          ┌──────────────┐            │
│   │     S3       │          │     ECR      │            │
│   │  (DVC Data)  │          │ (Docker Img) │            │
│   └──────────────┘          └──────────────┘            │
│                                    │                    │
│                                    ▼                    │
│   ┌──────────────────────────────────────────────┐      │
│   │                 EKS Cluster                  │      │
│   │   ┌─────────────┐      ┌─────────────┐       │      │
│   │   │    Pod 1    │      │    Pod 2    │       │      │
│   │   │  Flask App  │      │  Flask App  │       │      │
│   │   └─────────────┘      └─────────────┘       │      │
│   │          │                    │              │      │
│   │          └─────────┬──────────┘              │      │
│   │                    ▼                         │      │
│   │            ┌─────────────┐                   │      │
│   │            │LoadBalancer │                   │      │
│   │            └─────────────┘                   │      │
│   └──────────────────────────────────────────────┘      │
│                        │                                │
│   ┌──────────────┐     │      ┌──────────────┐          │
│   │  Prometheus  │<────┴─────>│   Grafana    │          │
│   │  (Metrics)   │            │ (Dashboard)  │          │
│   └──────────────┘            └──────────────┘          │
│                                                         │
└─────────────────────────────────────────────────────────┘
```
The application was successfully deployed to AWS EKS behind a LoadBalancer service:
| Component | Details |
|---|---|
| Cluster | deepguard-cluster |
| Node Type | t3.small (1 node) |
| Memory Limit | 1.5Gi (TensorFlow requirement) |
| Service Type | LoadBalancer |
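The deployment above corresponds roughly to a manifest like this (resource names and replica count are illustrative; the 1.5Gi memory limit matches the table above):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepguard-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: deepguard
  template:
    metadata:
      labels:
        app: deepguard
    spec:
      containers:
        - name: flask-app
          image: <account-id>.dkr.ecr.<region>.amazonaws.com/deepguard:latest
          ports:
            - containerPort: 5000
          resources:
            limits:
              memory: "1.5Gi"   # TensorFlow needs headroom on t3.small nodes
---
apiVersion: v1
kind: Service
metadata:
  name: deepguard-service
spec:
  type: LoadBalancer
  selector:
    app: deepguard
  ports:
    - port: 80
      targetPort: 5000
```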
For a detailed deployment guide, see `docs/EKS_DEPLOYMENT.md`.
The project includes observability setup with Prometheus for metrics collection and Grafana for visualization.
Prometheus Metrics:
| Metric | Type | Description |
|---|---|---|
| `up` | Gauge | Service availability |
| `prometheus_http_requests_total` | Counter | Total HTTP requests |
| `process_resident_memory_bytes` | Gauge | Memory usage |
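A minimal scrape job for collecting these metrics might look like this in `prometheus.yml` (the target address is a placeholder for the app's LoadBalancer endpoint):

```yaml
scrape_configs:
  - job_name: "deepguard"
    scrape_interval: 15s
    static_configs:
      - targets: ["<app-loadbalancer-dns>:5000"]
```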
Grafana Dashboard Examples:
*(Dashboard panels: service status, HTTP requests, and memory usage.)*
For the detailed monitoring setup, see `docs/MONITORING.md`.
## Live Demo

Try the model on Hugging Face Spaces:
## Model Limitations

This model is trained specifically on the GenImage dataset and may not generalize to:
- Out-of-distribution images (different generators, styles)
- Heavy compression or low-resolution images
- Images with significant post-processing
For production use, consider:
- Continuous monitoring for distribution drift
- Regular retraining with new synthetic image generators
- Ensemble methods for improved robustness
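Distribution-drift monitoring can start with something as simple as comparing the model's score distribution across time windows. A sketch using the population stability index (`population_stability_index` is a hypothetical helper introduced for illustration, not part of this codebase):

```python
import numpy as np

def population_stability_index(expected: np.ndarray,
                               actual: np.ndarray,
                               bins: int = 10) -> float:
    """PSI between a baseline and a recent score distribution.

    Rule of thumb: PSI > 0.2 indicates significant drift and is a
    reasonable trigger for investigating retraining.
    """
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_counts, _ = np.histogram(expected, bins=edges)
    a_counts, _ = np.histogram(actual, bins=edges)
    # Clip to avoid log(0) / division by zero on empty bins
    e = np.clip(e_counts / e_counts.sum(), 1e-6, None)
    a = np.clip(a_counts / a_counts.sum(), 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))
```

Computed weekly over the confidence scores the app emits, this gives a cheap early-warning signal before accuracy visibly degrades.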
See `docs/MODEL_LIMITATIONS.md` for a detailed analysis.
All experiments are tracked on DagsHub:
## Documentation

| Document | Description |
|---|---|
| ARCHITECTURE.md | System architecture and design |
| QUICKSTART.md | Quick start guide |
| SETUP.md | DagsHub/MLflow setup |
| EKS_DEPLOYMENT.md | Kubernetes deployment guide |
| MONITORING.md | Prometheus & Grafana setup |
## Contributing

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
## License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
## Acknowledgments

- GenImage Dataset for training data
- DVC for data versioning
- DagsHub for MLflow hosting
- Hugging Face for Gradio hosting








.png)