
[Feature]: Add TensorRT .engine Model Support for Faster Inference #94

@kashviporwal-byte

Problem Statement

Eagle currently supports only the .pt and .onnx model formats. These work well, but neither is fully optimized for high-performance inference on NVIDIA GPUs.

For real-time applications like surveillance and violence detection, inference speed and low latency are very important. Without TensorRT .engine support, users may experience:

  • Higher inference latency
  • Lower FPS during video processing
  • Increased GPU memory usage
  • Reduced deployment efficiency on NVIDIA GPUs

Adding TensorRT engine support would make Eagle more suitable for production and edge AI deployments.

Proposed Solution

Add support for TensorRT .engine model loading and inference alongside existing .pt and .onnx formats.

Suggested implementation:

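def load_model(model_path):
    # Dispatch on file extension; the three loaders are placeholders to be implemented.
    if model_path.endswith(".engine"):
        return load_tensorrt_model(model_path)
    elif model_path.endswith(".onnx"):
        return load_onnx_model(model_path)
    elif model_path.endswith(".pt"):
        return load_pytorch_model(model_path)
    # Fail loudly instead of silently returning None for unknown formats.
    raise ValueError(f"Unsupported model format: {model_path}")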
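
Since TensorRT would presumably be an optional dependency, the extension dispatch could be paired with an availability check so that users without TensorRT installed get a clear error rather than an import failure. The following is a minimal sketch under that assumption; `BACKENDS`, `resolve_backend`, and `backend_available` are hypothetical names for illustration and are not part of Eagle's current code.

```python
import importlib.util
from pathlib import Path

# Hypothetical table: file suffix -> (backend label, Python module it depends on).
BACKENDS = {
    ".engine": ("tensorrt", "tensorrt"),
    ".onnx": ("onnx", "onnxruntime"),
    ".pt": ("pytorch", "torch"),
}


def resolve_backend(model_path):
    """Return (backend label, required module) for a model file path."""
    suffix = Path(model_path).suffix.lower()  # match ".ENGINE" as well as ".engine"
    if suffix not in BACKENDS:
        supported = ", ".join(sorted(BACKENDS))
        raise ValueError(
            f"Unsupported model format {suffix!r}; expected one of: {supported}"
        )
    return BACKENDS[suffix]


def backend_available(module_name):
    """Check whether a module can be imported, without actually importing it."""
    return importlib.util.find_spec(module_name) is not None
```

A loader could call `resolve_backend(model_path)` first and then `backend_available(...)` on the returned module name before importing the heavy dependency, so the .pt and .onnx paths keep working on machines without TensorRT.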

Additional improvements:

  • Add TensorRT inference utilities
  • Add model conversion documentation
  • Support FP16 optimization
  • Add benchmark comparison between .pt, .onnx, and .engine
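
For the benchmark comparison, a small backend-agnostic timing harness might be enough to start with. This is a sketch, not a finished tool: `benchmark` and its `infer` argument are hypothetical, and `infer` is assumed to be a zero-argument callable that runs one forward pass on whichever backend is being measured.

```python
import statistics
import time


def benchmark(infer, warmup=5, iters=50):
    """Time a zero-argument inference callable and summarize latency."""
    # Warm-up runs are excluded so one-time setup costs do not skew the numbers.
    for _ in range(warmup):
        infer()

    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        # For GPU backends, infer() must block until the result is ready
        # (e.g. by synchronizing the device), otherwise only launch time is measured.
        infer()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.fmean(latencies_ms)
    # Approximate 95th percentile by index into the sorted samples.
    ordered = sorted(latencies_ms)
    p95_ms = ordered[min(len(ordered) - 1, int(0.95 * len(ordered)))]
    fps = 1000.0 / mean_ms if mean_ms > 0 else float("inf")
    return {"mean_ms": mean_ms, "p95_ms": p95_ms, "fps": fps}
```

Running the same harness over the .pt, .onnx, and .engine loaders with identical input frames would give directly comparable latency and FPS figures.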

Example conversion:

trtexec --onnx=model.onnx --saveEngine=model.engine --fp16
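
The same conversion could be scripted from Python so it can live next to the detection service. The sketch below only uses the `trtexec` flags shown above; `build_trtexec_command` and `convert` are hypothetical helper names, and it assumes `trtexec` is on the PATH. Note that a TensorRT engine is generally tied to the GPU and TensorRT version it was built with, so conversion would likely need to run on the deployment machine.

```python
import shutil
import subprocess
from pathlib import Path


def build_trtexec_command(onnx_path, fp16=True):
    """Build the trtexec argument list for converting an ONNX model to an engine."""
    onnx_path = Path(onnx_path)
    # Write the engine next to the ONNX file, swapping only the extension.
    engine_path = onnx_path.with_suffix(".engine")
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        cmd.append("--fp16")
    return cmd


def convert(onnx_path, fp16=True):
    """Run the conversion; raises if trtexec is missing or exits with an error."""
    if shutil.which("trtexec") is None:
        raise RuntimeError("trtexec not found on PATH")
    subprocess.run(build_trtexec_command(onnx_path, fp16), check=True)
```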

Affected Component

Detection (YOLOv8/v9 — services/detection/)

Estimated Difficulty

🟡 Intermediate — Requires understanding of one service

Alternatives Considered

The current workaround is to use:

  • PyTorch .pt
  • ONNX Runtime with .onnx

However, these do not provide the same level of optimization and low-latency inference as TensorRT engines on NVIDIA GPUs.

Additional Context

TensorRT provides:

  • Faster GPU inference
  • Better CUDA optimization
  • Reduced latency
  • Lower memory usage
  • FP16 / INT8 acceleration

This feature would improve Eagle’s real-time AI performance and deployment capabilities significantly.

Contribution

  • I would like to implement this feature and submit a PR.

Checklist

  • I have searched existing issues and this is not a duplicate.
  • I have read the CONTRIBUTING.md guidelines.
