
📦 Pose Classification API

This module is a Flask-based video action classification API that uses pose detection and a pretrained model to predict exercises such as DeadLift and HammerCurl from video input.


📁 Folder Structure

```
Pose_Classification/api/
├── api.py                 # Main Flask API server
├── Poseclassifier.py      # Pose-based action classification logic
├── detect_pose.py         # MediaPipe-based pose landmark extractor
├── results/               # Output directory for saved/renamed videos
```

🚀 Functionality Overview

The API:

  1. Receives a video.
  2. Optionally rotates the video based on device orientation.
  3. Extracts pose landmarks from frames using MediaPipe.
  4. Feeds the landmarks to a deep learning model for classification.
  5. Returns the predicted action class and confidence scores.
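The five steps above can be outlined as a dependency-free sketch. Every name here (`extract_landmarks`, `classify`, `infer`) and every stub body is illustrative, not the module's real API:

```python
# Dependency-free sketch of the inference pipeline described above.
# All names and stub bodies are illustrative, not the real API.
LABELS = ["HammerCurl", "DeadLift", "LegExtension", "ChestFlyMachine"]

def extract_landmarks(frame):
    # Stand-in for the MediaPipe step: one flat keypoint vector per
    # frame (33 landmarks x x/y/z/visibility = 132 values).
    return [0.0] * 132

def classify(pose_sequence):
    # Stand-in for the deep learning model: uniform probabilities.
    probs = [1.0 / len(LABELS)] * len(LABELS)
    return LABELS[probs.index(max(probs))], probs

def infer(frames):
    poses = [extract_landmarks(f) for f in frames]
    if len(poses) < 30:  # the model expects a 30-frame pose sequence
        return {"action": "", "confidence": [],
                "error": "No pose detected in the video"}
    action, probs = classify(poses[:30])
    return {"action": action, "confidence": probs}
```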

🔧 Setup & Running

1. Install Dependencies

```bash
pip install flask numpy opencv-python mediapipe
```

2. Run the API Server

```bash
cd Pose_Classification/api
python api.py
```

You’ll see Flask start at: 👉 http://127.0.0.1:5000


📥 API Endpoint

POST /api/infer

Classifies action from uploaded video.

Request

  • video: The MP4 video file.

  • orientation (optional): One of

    • PORTRAIT
    • LANDSCAPE-LEFT (default/no rotation)
    • LANDSCAPE-RIGHT
    • PORTRAIT-UPSIDEDOWN
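One way to picture the orientation parameter is as a lookup from the orientation string to an OpenCV rotation constant. The specific rotation directions below are an assumption for illustration, not the server's documented behaviour:

```python
# Hypothetical mapping from the orientation parameter to an OpenCV
# rotation constant name. The exact directions are an assumption;
# LANDSCAPE-LEFT means "already upright", so no rotation is applied.
ROTATIONS = {
    "PORTRAIT": "ROTATE_90_CLOCKWISE",
    "LANDSCAPE-LEFT": None,
    "LANDSCAPE-RIGHT": "ROTATE_180",
    "PORTRAIT-UPSIDEDOWN": "ROTATE_90_COUNTERCLOCKWISE",
}

def rotation_for(orientation):
    # Unknown values fall back to no rotation (the default behaviour).
    return ROTATIONS.get(orientation)
```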

Example using curl:

```bash
curl -X POST -F "video=@your_video.mp4" -F "orientation=LANDSCAPE-LEFT" http://127.0.0.1:5000/api/infer
```

Response

```json
{
  "action": "HammerCurl",
  "confidence": [0.95, 0.03, 0.01, 0.01]
}
```
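As a quick client-side sanity check, the response can be parsed and the confidence vector mapped back onto labels. The assumption here is that the confidence order matches the label map listed in the Notes section:

```python
import json

# Parse the sample response and pair each confidence with a label.
# Assumption: confidence order matches the label map in the Notes.
LABELS = ["HammerCurl", "DeadLift", "LegExtension", "ChestFlyMachine"]

raw = '{"action": "HammerCurl", "confidence": [0.95, 0.03, 0.01, 0.01]}'
resp = json.loads(raw)
scores = dict(zip(LABELS, resp["confidence"]))
best = max(scores, key=scores.get)  # label with the highest confidence
```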

🧠 How It Works (Internal Flow)

🔸 api.py — Flask Server

  • Handles video upload & orientation rotation.

  • Uses OpenCV to read & display video.

  • Calls:

    • detect_pose.mediapipe_detection() → to get pose landmarks.
    • pose_model.predict() → to classify action from pose sequence.
  • Returns predicted label and scores as JSON.

  • Saves processed video to results/.


🔸 Poseclassifier.py — Pose Classification Module

Defines the Pose_Classifier class, which handles all model-related logic:

✅ Key Methods:

  • __init__(self, label_map, mode='both') Initializes the classifier with a label list and mode:

    • 'pose': Only pose-based classification
    • 'action': Only image-based (frame) classification
    • 'both': Combines both
  • predict(self, img_sequence=None, pose_sequence=None) Takes a sequence of 30 pose vectors (and/or a frame sequence, depending on mode) and returns:

    • predicted_probs: class probabilities
    • predicted_label: class with highest probability
  • __get_class_labels(self, probs) Converts model logits into a label.

  • __draw_action_list(self, image, predicted_action) Overlays predicted action + menu on video frame.

  • concat_frames_horizontally(self, img, predicted_action) Merges frames side-by-side with label overlay (for UI/visualization).

🔁 This file wraps a trained model and makes predictions given processed sequences.
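The label-selection step described for __get_class_labels can be approximated as a softmax followed by an argmax over the label map. This is a sketch of the idea only; the method's actual internals are not shown in this README:

```python
import math

# Illustrative softmax + argmax over the label map; a sketch, not the
# real __get_class_labels implementation.
def get_class_label(logits, label_map):
    shifted = [v - max(logits) for v in logits]  # numerical stability
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    probs = [e / total for e in exps]
    return label_map[probs.index(max(probs))], probs
```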


🔸 detect_pose.py — Pose Detection Utility

Contains the DetectPose class, responsible for detecting human keypoints.

✅ Key Method:

  • mediapipe_detection(image, draw=True) Uses MediaPipe to detect human pose landmarks in a given frame. Returns:

    • img: Frame with drawn pose
    • pose: A list/vector of keypoints for use in classification
    • visibility: Keypoint visibility score (optional)

🧠 This is the feature extractor that converts raw video frames to pose sequences.
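The shape of that output can be illustrated with a small helper. MediaPipe's Pose solution reports 33 landmarks, each with x, y, z and a visibility score; the tuple-based input below is a simplification of MediaPipe's landmark objects:

```python
# Flatten (x, y, z, visibility) landmark tuples into the two outputs
# described above: a flat keypoint vector and a visibility list.
def flatten_landmarks(landmarks):
    pose = [coord for (x, y, z, _v) in landmarks for coord in (x, y, z)]
    visibility = [v for (_x, _y, _z, v) in landmarks]
    return pose, visibility
```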


🧪 Output

After classification:

  • The predicted action is returned in the response.

  • The input video is saved to:

    results/{PredictedAction}_{RandomID}.mp4
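A minimal sketch of that naming scheme follows; the RandomID format (here a short uuid4 prefix) is an assumption, not the server's actual scheme:

```python
import os
import uuid

# Build results/{PredictedAction}_{RandomID}.mp4; the random-ID
# format is an assumption, not the server's actual scheme.
def result_path(predicted_action, results_dir="results"):
    random_id = uuid.uuid4().hex[:8]
    return os.path.join(results_dir, f"{predicted_action}_{random_id}.mp4")
```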
    

📝 Notes

  • A valid prediction requires ~30 frames with a detectable pose.

  • The current label map supports:

    ['HammerCurl', 'DeadLift', 'LegExtension', 'ChestFlyMachine']
  • If pose detection fails, the API responds with:

    ```json
    {
      "action": "",
      "confidence": [],
      "error": "No pose detected in the video"
    }
    ```

📌 Tips

  • Keep input videos short (~2–3 seconds).
  • Good lighting and body visibility improve accuracy.
  • You can expand the label_map and model to include more actions.