AITextGuard

AI-Generated Text Detection using Machine Learning & Production-Ready MLOps

AITextGuard is an end-to-end NLP system designed to detect AI-generated text using a stacked machine learning approach.
The project demonstrates how to design, train, evaluate, containerize, monitor, and deploy a production-style ML system using modern MLOps practices.

The entire system can be launched locally with a single command using Docker Compose.

🚀 Live Demo & Source Code

Hugging Face Space:
https://huggingface.co/spaces/Sahil4818/ai-text-guard
GitHub Repository:
https://github.com/Sip4818/AICheatTextGuard

📌 Problem Statement

With the increasing use of large language models, distinguishing between human-written and AI-generated text has become critical for:

Academic integrity
Content moderation
Plagiarism detection
Information reliability

AITextGuard explores a machine learning-based approach that combines statistical text features with transformer sentence embeddings to classify text as human-written or AI-generated.

🏗️ System Architecture

The system runs as a multi-service Docker setup:

User → Streamlit UI → FastAPI Backend → Redis (Cache)
                             ↓
                  Prometheus (Monitoring)

🔄 Training Pipeline

Data ingestion from Google Cloud Storage (GCS)
Schema validation
Train–test split
Feature engineering:
- Statistical text features
- Transformer-based sentence embeddings
Model training:
- Level-1: Logistic Regression, XGBoost
- Level-2: Meta Logistic Regression (Stacking)
Hyperparameter tuning (Optuna)
Experiment tracking (MLflow)
Model evaluation (ROC-AUC)
Approved model stored in cloud storage

Training is reproducible using DVC.
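
The pipeline above does not say how the Level-2 model's training inputs are produced. One common way to train a stacked model without label leakage is out-of-fold prediction: each Level-1 model predicts only on rows it was not fitted on, and those predictions become the meta-model's features. The sketch below illustrates that idea in plain Python. The names `kfold_indices`, `out_of_fold_predictions`, and the `fit_predict` callable are hypothetical and are not taken from the repository.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical signature: fit on (train_X, train_y), return scores for valid_X.
FitPredict = Callable[[Sequence, Sequence[int], Sequence], List[float]]


def kfold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds as (train_idx, valid_idx) pairs."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        valid = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        folds.append((train, valid))
        start = stop
    return folds


def out_of_fold_predictions(
    X: Sequence, y: Sequence[int], fit_predict: FitPredict, k: int = 5
) -> List[float]:
    """Return one prediction per row, each made by a model that never saw that row."""
    oof = [0.0] * len(X)
    for train_idx, valid_idx in kfold_indices(len(X), k):
        preds = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in valid_idx],
        )
        for i, p in zip(valid_idx, preds):
            oof[i] = p
    return oof
```

In a real run the data would normally be shuffled (or stratified) before folding, and `fit_predict` would wrap a Level-1 model such as Logistic Regression or XGBoost.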

⚡ Inference Pipeline

Text submitted via Streamlit UI
FastAPI backend processes request
Feature transformation applied
Stacked model predicts probability
Result returned via REST API
Prediction cached in Redis (10-minute TTL)
Prometheus collects performance metrics
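
As a rough illustration of the caching step, the sketch below keys each prediction on a hash of the input text and stores it with a 10-minute TTL. It is an assumption about how such a cache could work, not the project's actual code: `InMemoryTTLCache` is a hypothetical in-memory stand-in that mimics only the `get` and `setex` calls of a Redis client, and the `prediction:` key prefix and `cached_predict` helper are invented for the example.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 600  # 10-minute TTL, as stated in the pipeline above


class InMemoryTTLCache:
    """Hypothetical stand-in for a Redis client; supports get() and setex() only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries behave like a miss
            return None
        return value

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def cache_key(text: str) -> str:
    """Hash the text so keys have a fixed length (the prefix is an assumption)."""
    return "prediction:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def cached_predict(text: str, predict: Callable[[str], float], cache) -> float:
    """Return a cached probability if present; otherwise predict and cache it."""
    key = cache_key(text)
    hit = cache.get(key)
    if hit is not None:
        return float(hit)
    probability = predict(text)
    cache.setex(key, CACHE_TTL_SECONDS, repr(probability))
    return probability
```

Identical text submitted within the TTL window skips feature extraction and model inference entirely, which matters here because computing transformer embeddings is the expensive part of a request.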

✨ Key Features

End-to-end ML pipeline
Stacked ensemble learning
Transformer-based embeddings
Reproducible training with DVC
Experiment tracking with MLflow
FastAPI inference API
Redis caching
Prometheus monitoring
Fully containerized architecture
CI/CD with GitHub Actions
Automated Docker image builds with model injection

🛠️ Tech Stack

Programming

Python

Machine Learning & NLP

Scikit-learn
XGBoost
Sentence Transformers

Data & MLOps

Pandas
NumPy
Optuna
MLflow
DVC

Backend & API

FastAPI

Monitoring

Prometheus

Caching

Redis

Deployment & DevOps

Docker
Docker Compose
GitHub Actions
Docker Hub

Cloud

Google Cloud Storage (Model artifacts)
Hugging Face Spaces (Live demo)

🧠 Model Overview

Features

Text statistics
Punctuation distribution
Sentence embeddings
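
To make the first two feature groups concrete, here is a minimal sketch of what text statistics and a punctuation distribution might look like. The specific features and the function name `text_statistics` are assumptions chosen for illustration; the project's real feature set may differ. Sentence embeddings are omitted because they require a pretrained model.

```python
import string
from collections import Counter
from typing import Dict


def text_statistics(text: str) -> Dict[str, float]:
    """Compute a few simple statistical features for one document."""
    words = text.split()
    n_chars = len(text)
    n_words = len(words)

    # Count every ASCII punctuation character in the text.
    punct_counts = Counter(ch for ch in text if ch in string.punctuation)
    n_punct = sum(punct_counts.values())

    features = {
        "n_chars": float(n_chars),
        "n_words": float(n_words),
        "avg_word_len": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "punct_ratio": n_punct / n_chars if n_chars else 0.0,
    }

    # Share of all punctuation taken by a handful of common marks.
    for mark, name in ((",", "comma"), (".", "period"), ("!", "exclaim"), ("?", "question")):
        share = punct_counts[mark] / n_punct if n_punct else 0.0
        features[f"punct_share_{name}"] = share
    return features
```

Calling `text_statistics("Hello, world! How are you?")` yields a flat dictionary of floats that can be concatenated with an embedding vector before training.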

Architecture

Level-1: Logistic Regression + XGBoost
Level-2: Meta Logistic Regression
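
At inference time, a logistic-regression meta-model over two Level-1 outputs reduces to a weighted sum passed through a sigmoid. The sketch below shows that arithmetic. The weights and bias are hypothetical placeholders, not learned values from this project, and it assumes the meta-model's only inputs are the two Level-1 probabilities.

```python
import math
from typing import Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def stacked_probability(
    p_logreg: float,
    p_xgb: float,
    weights: Tuple[float, float] = (2.0, 2.0),  # placeholder values, not trained
    bias: float = -2.0,                          # placeholder value, not trained
) -> float:
    """Level-2 logistic regression applied to the two Level-1 probabilities."""
    w_logreg, w_xgb = weights
    return sigmoid(w_logreg * p_logreg + w_xgb * p_xgb + bias)
```

With these placeholder parameters, two uncertain Level-1 models (both at 0.5) produce a stacked probability of exactly 0.5, and agreement in either direction pushes the result toward 0 or 1.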

Evaluation Metric

ROC-AUC
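
ROC-AUC can be read as the probability that a randomly chosen AI-generated sample receives a higher score than a randomly chosen human-written one. The sketch below computes it directly from that definition, assuming label 1 means AI-generated and 0 means human-written (a labeling convention assumed for this example). It is quadratic in the number of samples and meant only for illustration; in practice `sklearn.metrics.roc_auc_score` is the usual choice.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))
```

Because the metric depends only on how samples are ranked, it is unaffected by the choice of decision threshold, which makes it a reasonable fit for a model that returns a probability rather than a hard label.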

🔄 CI/CD Pipeline

On every push to main:

Authenticate with Google Cloud
Download trained model from GCS
Build backend image (model included)
Build UI image
Push images to Docker Hub

▶️ Run Locally

Requirements

Docker
Docker Compose

Start Everything

git clone https://github.com/Sip4818/AICheatTextGuard.git
cd AICheatTextGuard
docker compose up
