
Localized-Perception-Constrained-Vision-Systems

Implementation companion to Resource Efficient Perception for Vision Systems. The work targets high-resolution vision under tight GPU memory: images are processed in patches, combined with a global context, and fed to downstream heads so models can be trained and deployed on large fields of view—including on resource-constrained hardware (e.g. Jetson-class devices). Results are reported across seven benchmarks spanning classification, object detection, and segmentation.
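The core recipe above — tile a large image into patches and pair them with a cheap global summary — can be sketched as follows. This is an illustrative numpy sketch, not the repo's code; the names `patchify` and `global_context` are assumptions, not the project's API.

```python
import numpy as np

def patchify(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) image into an (n_rows, n_cols, patch, patch, C) grid."""
    H, W, C = image.shape
    assert H % patch == 0 and W % patch == 0, "image must tile evenly"
    grid = image.reshape(H // patch, patch, W // patch, patch, C)
    return grid.transpose(0, 2, 1, 3, 4)

def global_context(image: np.ndarray, size: int) -> np.ndarray:
    """Cheap global summary: average-pool the full image down to (size, size, C)."""
    H, W, C = image.shape
    return image.reshape(size, H // size, size, W // size, C).mean(axis=(1, 3))

img = np.random.rand(512, 512, 3).astype(np.float32)
patches = patchify(img, patch=128)   # (4, 4, 128, 128, 3): local detail
ctx = global_context(img, size=8)    # (8, 8, 3): coarse global view
```

Only one patch (plus the small context tensor) needs to reside on the GPU at a time, which is what makes large fields of view fit in tight memory budgets.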

Repository layout

| Path | Description |
| --- | --- |
| `classification_detection_patchgd/` | PatchGD for classification (`patchGD.py`, `oursPatchGDv1.py`, …) and detection (`detection.py`, `fcos.py`). See `classification_detection_patchgd/README.md`. |
| `segmentation_patchgd/` | PatchGD-style segmentation experiments. |

Install

1. Python 3 (3.8+ recommended).

2. PyTorch — install the wheel that matches your CUDA or CPU from pytorch.org.

3. Per subproject

  • Classification & detection (this repo’s PatchGD scripts):

    cd classification_detection_patchgd
    pip install -r requirements.txt

    Dependencies covered: torch/torchvision, Pillow, numpy, opencv-python, matplotlib, torchmetrics, fvcore, pytorch-warmup, ultralytics. (Install torch first as above.)

  • Segmentation subfolder (pinned stack used there):

    pip install -r segmentation_patchgd/requirements.txt

Before training, set basePath in classification_detection_patchgd/constants.py to your data root.
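The file's exact contents are not reproduced here; the setting it expects is shaped roughly like this, with the path below being a placeholder for your own data root:

```python
# classification_detection_patchgd/constants.py (illustrative value only)
basePath = "/path/to/data"  # directory containing your datasets
```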

Experiments

  1. Image classification (PatchGD, UltraMNIST, PANDA, ImageFolder datasets such as AID)
  2. Object detection (FCOS + patch-based features)
  3. Image segmentation (see segmentation_patchgd/)
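The PatchGD idea behind these experiments — maintain a latent grid over all patch positions, refresh only a sampled subset of cells per inner step, and let the head see the whole grid — can be sketched in a hedged, toy form. This is not the repo's implementation; the linear "encoder" and "head" below are stand-ins, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_cols, d = 4, 4, 16          # latent grid over a 4x4 patch layout
W_enc = rng.standard_normal((128 * 128 * 3, d)) * 0.01   # toy patch encoder
W_head = rng.standard_normal((n_rows * n_cols * d, 10))  # toy 10-class head

def encode_patch(p: np.ndarray) -> np.ndarray:
    """Map one (128, 128, 3) patch to a d-dim feature."""
    return p.reshape(-1) @ W_enc

def patchgd_step(patches: np.ndarray, Z: np.ndarray, k: int = 4) -> np.ndarray:
    """Refresh k randomly sampled grid cells in-place, then score from Z."""
    idx = rng.choice(n_rows * n_cols, size=k, replace=False)
    for i in idx:
        r, c = divmod(i, n_cols)
        Z[r, c] = encode_patch(patches[r, c])   # only k patches touched per step
    return Z.reshape(-1) @ W_head               # head sees the whole latent grid

patches = rng.random((n_rows, n_cols, 128, 128, 3), dtype=np.float32)
Z = np.zeros((n_rows, n_cols, d))
for _ in range(4):                              # inner steps gradually fill Z
    logits = patchgd_step(patches, Z)
```

In the real method the encoder and head are trained networks and gradients flow through the sampled patches each inner step; the sketch only shows the memory-bounded fill-and-score loop.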

Citation

@article{subramanyam2024resource,
  title={Resource Efficient Perception for Vision Systems},
  author={Subramanyam, A V and Singal, Niyati and Verma, Vinay K},
  journal={arXiv preprint arXiv:2405.07166},
  year={2024}
}

Conclusion

This repository studies localized, memory-bounded perception: patch-based computation preserves fine structure while global context stabilizes semantics, enabling competitive accuracy on large images within practical memory budgets.