@SiddharthRiot
@ArokyaMatthew
@Devnil434
Please assign this issue to me under GSSoC'26.
Problem Statement
Currently, Eagle performs object detection, tracking, and reasoning effectively, but it has no temporal understanding of activities that span multiple frames.
The system cannot detect actions such as:
- fighting
- running
- loitering
- suspicious movement patterns
Adding temporal action recognition would give Eagle activity-level understanding: the reasoning stage could act on what a tracked person is doing, not only on where they are.
Proposed Solution
Implement a lightweight temporal action recognition module using OpenCV + PyTorch.
Suggested implementation:
- Maintain frame-history buffers for tracked persons
- Use temporal models such as CNN+LSTM or MoViNet
- Predict actions from frame sequences
- Integrate action labels into the reasoning pipeline
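The first step above, a frame-history buffer per tracked person, could be sketched roughly as follows. This is only an illustration: the class name `TrackClipBuffer`, the `clip_len` default of 16, and the `push`/`drop` methods are hypothetical and not part of Eagle's current code. Frames are treated as opaque objects, so the same buffer would work for full frames or per-person crops.

```python
from collections import defaultdict, deque


class TrackClipBuffer:
    """Keeps the most recent `clip_len` frames (or crops) for each track ID.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """

    def __init__(self, clip_len=16):
        if clip_len < 1:
            raise ValueError("clip_len must be at least 1")
        self.clip_len = clip_len
        # One bounded deque per track; old frames fall off automatically.
        self._frames = defaultdict(lambda: deque(maxlen=clip_len))

    def push(self, track_id, frame):
        """Store one frame for a track.

        Returns the current clip as a list once `clip_len` frames are
        available, otherwise None.
        """
        buf = self._frames[track_id]
        buf.append(frame)
        if len(buf) == self.clip_len:
            return list(buf)
        return None

    def drop(self, track_id):
        """Forget a track, e.g. when the tracker reports it as lost."""
        self._frames.pop(track_id, None)
```

A caller would push each tracked person's crop once per frame and pass any returned clip to the action model. Dropping lost tracks keeps memory bounded when many people pass through the scene.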
Example output:
{
  "track_id": 12,
  "action": "running",
  "confidence": 0.93
}
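One possible way to produce a record of this shape is a small dataclass serialized with the standard `json` module. The field names come from the example output above; the class name `ActionPrediction`, the `to_json` method, and the range check on `confidence` are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ActionPrediction:
    """One action label for one tracked person (hypothetical schema)."""

    track_id: int
    action: str
    confidence: float

    def __post_init__(self):
        # Assumption: confidence is a probability in [0, 1].
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def to_json(self):
        """Serialize to the JSON shape shown in the example output."""
        return json.dumps(asdict(self))
```

Validating at construction time means a malformed prediction fails near the model rather than deep inside the reasoning pipeline.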
Affected Component
Tracking (ByteTrack / DeepSORT — services/tracking/)
Estimated Difficulty
🔴 Advanced — Spans multiple services or needs ML expertise
Alternatives Considered
Simple frame-by-frame analysis was considered, but a single frame carries no motion information, so temporal models are better suited to recognizing activities and motion patterns that unfold across multiple frames.
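To illustrate why multiple frames matter, here is a minimal sketch that estimates a track's average speed from its centroid history. It is a hand-written heuristic, not one of the temporal models proposed in this issue, and the function name `mean_speed` and its inputs (pixel centroids and a known frame rate) are assumptions. A single centroid gives no speed at all, while a short history already helps separate a running person from one standing still.

```python
import math


def mean_speed(centroids, fps):
    """Average speed in pixels per second over consecutive (x, y) centroids.

    Hypothetical helper: `centroids` is one track's per-frame box centers,
    oldest first, and `fps` is the video frame rate.
    """
    if len(centroids) < 2:
        # One frame (or none) carries no motion information.
        return 0.0
    # Sum the distances between each consecutive pair of centroids.
    travelled = sum(math.dist(a, b) for a, b in zip(centroids, centroids[1:]))
    elapsed_frames = len(centroids) - 1
    return travelled * fps / elapsed_frames
```

Pixel speed depends on camera distance and perspective, so any thresholds built on it would need per-camera calibration. That limitation is one reason a learned temporal model is the proposed direction here.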
Additional Context
Models:
- CNN + LSTM
- MoViNet
- SlowFast
Tech Stack:
Contribution
Checklist