Hassanibrar632/Pose_Classification

🏋️‍♂️ Pose-Based Action Recognition API

A robust, production-ready pipeline for classifying human exercises like HammerCurl, DeadLift, LegExtension, and ChestFlyMachine from video input using pose landmarks, CNNs, and LSTM-based models.


🚀 Features

  • 🎥 Upload video and get back predicted action and confidence.
  • 🧍‍♂️ Pose estimation powered by MediaPipe.
  • 🧠 Deep Learning models using ConvLSTM, LSTM, and CNNs.
  • 🔌 Built as a Flask API for easy integration.
  • 📊 Inference via pose, image, or both (configurable).
  • 🗂️ Modular structure for future expansion (e.g. live webcam input).

📁 Repository Structure

Pose_Classification/
├── api/                      # Core Flask API
│   ├── api.py                # Flask server and endpoints
│   ├── Poseclassifier.py     # Pose-based action classification logic
│   ├── detect_pose.py        # MediaPipe wrapper for pose extraction
│   ├── results/              # Stores temporary video uploads and outputs
│   └── README.md             # Detailed API usage
│
├── models/                   # Pretrained Keras/TensorFlow models
│   ├── final/                # Combined (pose + image) models
│   ├── img/                  # CNN models for raw frame-based classification
│   ├── pose/                 # LSTM models for pose-only classification
│   └── README.md             # Explanation of model formats and limitations
│
├── requirements.txt          # Python dependencies
└── README.md                 # (You're here!)

🔧 Setup Guide

🐍 Step 1: Clone the Repository

git clone https://github.com/<your-username>/Pose_Classification.git
cd Pose_Classification

🧪 Step 2: Create a Conda Environment

conda create --name pose-env python=3.10 -y
conda activate pose-env

📦 Step 3: Install Dependencies

pip install -r requirements.txt

Optional: on some Linux distros, if mediapipe fails to install or import, reinstall without the cache:

pip install mediapipe --no-cache-dir

🧪 Running the Flask API

▶️ Step 4: Run the Server

cd api
python api.py

The server will start at:

http://127.0.0.1:5000

🎯 API Usage

📬 Endpoint: /api/infer

Method: POST
Content-Type: multipart/form-data

✅ Parameters:

| Key         | Type   | Required | Description                              |
|-------------|--------|----------|------------------------------------------|
| video       | file   | ✅       | Input video file (.mp4, .avi)            |
| orientation | string | ✅       | One of PORTRAIT, LANDSCAPE-LEFT, etc.    |

🧪 Sample cURL Request:

curl -X POST -F "video=@test.mp4" -F "orientation=LANDSCAPE-LEFT" http://localhost:5000/api/infer

🟢 Sample Response:

{
  "action": "DeadLift",
  "confidence": [0.02, 0.93, 0.03, 0.02]
}
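Beyond cURL, the endpoint can be called from Python. Below is a minimal client sketch: the endpoint URL and form-field names come from this README, and the confidence vector is assumed to follow the label_map order listed under Supported Classes; `infer` and `top_prediction` are illustrative helpers, not part of the repo.

```python
LABELS = ['HammerCurl', 'DeadLift', 'LegExtension', 'ChestFlyMachine']

def top_prediction(response_json, labels=LABELS):
    """Map the confidence vector back to the most likely class label."""
    scores = response_json["confidence"]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], scores[best]

def infer(video_path, orientation="LANDSCAPE-LEFT",
          url="http://127.0.0.1:5000/api/infer"):
    """POST a video to /api/infer and return the parsed JSON response."""
    import requests  # third-party: pip install requests
    with open(video_path, "rb") as f:
        resp = requests.post(url, files={"video": f},
                             data={"orientation": orientation})
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(top_prediction(infer("test.mp4")))
```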

Refer to the api/README.md for a detailed explanation of inference flow, frame processing, and pose detection.


🤖 Model Architecture

  • Models are trained using sequences of pose landmarks or video frames.
  • Based on LSTM, BiLSTM, ConvLSTM2D, and CNN architectures.
  • Cannot be converted to ONNX or TFLite due to dynamic time-step layers.

Refer to the models/README.md for full model breakdown.
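As a rough illustration of the pose-sequence input these models consume: MediaPipe Pose emits 33 landmarks per frame, each with x, y, z coordinates, and an LSTM expects a (timesteps, features) tensor per clip. The sequence length below is an assumed value; the exact feature layout used by this repo's models is described in models/README.md.

```python
import numpy as np

FRAMES = 30        # assumed sequence length per clip (illustrative)
LANDMARKS = 33     # MediaPipe Pose returns 33 body landmarks
COORDS = 3         # x, y, z per landmark (visibility omitted here)

# One clip of pose landmarks: (frames, landmarks, coords)
clip = np.random.rand(FRAMES, LANDMARKS, COORDS)

# LSTMs expect (timesteps, features), so flatten each frame's
# landmarks into one feature vector of length 33 * 3 = 99.
sequence = clip.reshape(FRAMES, LANDMARKS * COORDS)
print(sequence.shape)
```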


✅ Supported Classes

label_map = ['HammerCurl', 'DeadLift', 'LegExtension', 'ChestFlyMachine']
The confidence vector in API responses follows this label order.

Add more by retraining models (training pipeline not included here).


💡 Tips

  • 📸 The API shows live OpenCV video frames while processing. Press q to exit early.
  • 🧼 Videos are auto-deleted or renamed based on prediction for minimal disk usage.
  • 🧠 Switch between mode='pose', mode='action', and mode='both' in Poseclassifier.py.
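The mode switch mentioned above can be pictured as a simple dispatcher over input types. This is a hypothetical sketch (function and key names are illustrative, not the repo's actual API): 'pose' feeds landmark sequences only, 'action' feeds raw frames, and 'both' fuses the two.

```python
def select_inputs(mode, pose_seq, frames):
    """Hypothetical: choose which inputs to feed the model for a given mode."""
    if mode == "pose":
        return {"pose": pose_seq}
    if mode == "action":
        return {"frames": frames}
    if mode == "both":
        return {"pose": pose_seq, "frames": frames}
    raise ValueError(f"unknown mode: {mode!r}")
```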

📚 Requirements Summary

Main packages (full list in requirements.txt):

  • tensorflow
  • mediapipe
  • flask
  • opencv-python
  • numpy

📌 To-Do / Future Work

  • Add training pipeline and scripts
  • Add support for live webcam input
  • Extend to more activities and datasets
  • Convert models to ONNX-friendly formats with simplified architectures

📜 License

This project is provided under an open license. Modify and use freely, but attribution is appreciated.


🤝 Contributing

Feel free to fork and submit pull requests. Create issues for bugs or enhancement requests.


🧑‍💻 Author

Made with ❤️ by M. Hassan Ibrar