Problem Statement
Eagle currently supports models only in the .pt and .onnx formats. Both work well, but neither is fully optimized for high-performance GPU inference on NVIDIA devices.
For real-time applications such as surveillance and violence detection, inference speed and low latency are critical. Without TensorRT .engine support, users may experience:
- Higher inference latency
- Lower FPS during video processing
- Increased GPU memory usage
- Reduced deployment efficiency on NVIDIA GPUs
Adding TensorRT engine support would make Eagle more suitable for production and edge AI deployments.
Proposed Solution
Add support for loading and running TensorRT .engine models alongside the existing .pt and .onnx formats.
Suggested implementation:
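    # Pick a loader from the file extension (loader names are placeholders)
    if model_path.endswith(".engine"):
        model = load_tensorrt_model(model_path)
    elif model_path.endswith(".onnx"):
        model = load_onnx_model(model_path)
    elif model_path.endswith(".pt"):
        model = load_pytorch_model(model_path)
    else:
        raise ValueError(f"Unsupported model format: {model_path}")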
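The same dispatch can also be written as a lookup table, which keeps the list of supported formats in one place. This is only a sketch: the three loader functions below are hypothetical stand-ins that return a label, not Eagle's actual loaders, and `load_model` is an assumed name.

```python
from pathlib import Path
from typing import Callable, Dict


# Hypothetical stand-ins for the real loaders. Each returns a label so the
# dispatch logic can be exercised without any GPU libraries installed.
def load_tensorrt_model(path: str) -> str:
    return f"tensorrt:{path}"


def load_onnx_model(path: str) -> str:
    return f"onnx:{path}"


def load_pytorch_model(path: str) -> str:
    return f"pytorch:{path}"


# Map each supported file extension to the loader that handles it.
LOADERS: Dict[str, Callable[[str], str]] = {
    ".engine": load_tensorrt_model,
    ".onnx": load_onnx_model,
    ".pt": load_pytorch_model,
}


def load_model(model_path: str) -> str:
    """Choose a loader from the file extension (case-insensitive)."""
    suffix = Path(model_path).suffix.lower()
    try:
        loader = LOADERS[suffix]
    except KeyError:
        supported = ", ".join(sorted(LOADERS))
        raise ValueError(
            f"Unsupported model format '{suffix}'; expected one of: {supported}"
        ) from None
    return loader(model_path)
```

With this shape, supporting another format would mean adding one entry to `LOADERS` rather than another `elif` branch.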
Additional improvements:
- Add TensorRT inference utilities
- Add model conversion documentation
- Support FP16 optimization
- Add benchmark comparison between .pt, .onnx, and .engine
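For the benchmark comparison, a small timing harness along these lines might be a starting point. It is a sketch under assumptions: `benchmark` is a hypothetical helper, and `infer` is any zero-argument callable that runs one inference for a given backend. For GPU backends the callable would need to wait for the device to finish (for example with `torch.cuda.synchronize()` in PyTorch) before returning, otherwise the timings only measure kernel launch.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(infer: Callable[[], object], warmup: int = 5, runs: int = 50) -> Dict[str, float]:
    """Time repeated calls to `infer` and report latency and throughput."""
    # Warm-up calls are excluded from the measurements, since the first
    # few inferences often include one-time setup costs.
    for _ in range(warmup):
        infer()

    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.fmean(latencies_ms)
    return {
        "mean_ms": mean_ms,
        "median_ms": statistics.median(latencies_ms),
        # Frames per second implied by the mean latency of a single call.
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


# Placeholder workload; a real comparison would pass one callable per
# backend (.pt, .onnx, .engine), all using the same input frame.
stats = benchmark(lambda: sum(range(10_000)), warmup=2, runs=10)
print(sorted(stats))
```

Running the same harness once per format, on the same hardware and input, would give directly comparable latency and FPS figures.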
Example conversion:
    trtexec --onnx=model.onnx --saveEngine=model.engine --fp16
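If the conversion were scripted, for instance as part of the conversion documentation, the command could be assembled in Python. This is a sketch only: `build_trtexec_command` is a hypothetical helper, and it uses just the three flags from the command above. TensorRT engines are generally tied to the GPU and TensorRT version they were built with, so the conversion would normally run on the target device.

```python
import shlex
from typing import List


def build_trtexec_command(onnx_path: str, engine_path: str, fp16: bool = True) -> List[str]:
    """Build the trtexec argument list for converting ONNX to a TensorRT engine."""
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        # Ask trtexec to build the engine with FP16 precision enabled.
        cmd.append("--fp16")
    return cmd


cmd = build_trtexec_command("model.onnx", "model.engine")
print(shlex.join(cmd))
# prints: trtexec --onnx=model.onnx --saveEngine=model.engine --fp16

# To actually run the conversion (requires trtexec on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```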
Affected Component
Detection (YOLOv8/v9 — services/detection/)
Estimated Difficulty
🟡 Intermediate — Requires understanding of one service
Alternatives Considered
The current workaround is to use:
- PyTorch .pt
- ONNX Runtime with .onnx
However, these do not provide the same level of optimization and low-latency inference as TensorRT engines on NVIDIA GPUs.
Additional Context
On NVIDIA GPUs, TensorRT typically provides:
- Faster GPU inference
- Better CUDA optimization
- Reduced latency
- Lower memory usage
- FP16 / INT8 acceleration
This feature would improve Eagle’s real-time AI performance and deployment capabilities significantly.