This repository contains the code for *Learning from Human Gaze: Human-like Robot Social Navigation in Dense Crowds* (AAAI 2026). Gaze2Nav learns from egocentric RGB video, human gaze, pedestrian instance masks, and 2D trajectories to produce more human-like navigation in dense crowds.
Gaze2Nav follows a three-stage pipeline (a minimal data-flow sketch follows this list):
- Gaze Predictor: predicts where a human would look in the current egocentric frame, given recent RGB frames and previous gaze maps.
- Semantic Saliency Matching: matches predicted gaze peaks against tracked pedestrian masks and keeps recently attended pedestrians in a sliding window.
- Motion Planner: predicts future waypoints from the RGB history plus either gaze maps or salient pedestrian masks.
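To make the data flow concrete, here is a minimal sketch of the three stages. All function names, shapes, and the sliding-window length are illustrative assumptions, not the repository's actual API; the placeholder bodies stand in for the learned networks.

```python
import numpy as np

WINDOW = 5  # sliding-window length for recently attended pedestrians (assumed)

def predict_gaze(rgb_history, gaze_history):
    """Stage 1 (illustrative): predict a gaze heatmap for the current frame."""
    h, w = rgb_history[-1].shape[:2]
    heatmap = np.random.rand(h, w)          # stands in for the gaze network's output
    return heatmap / heatmap.sum()

def match_salient_ids(gaze_map, masks_by_id, attended):
    """Stage 2 (illustrative): match the gaze peak to tracked pedestrian masks."""
    py, px = np.unravel_index(np.argmax(gaze_map), gaze_map.shape)
    for pid, mask in masks_by_id.items():
        if mask[py, px]:                     # gaze peak falls inside this person's mask
            attended.append(pid)
    return sorted(set(attended[-WINDOW:]))   # recently attended pedestrian IDs

def plan_motion(rgb_history, attention_maps):
    """Stage 3 (illustrative): predict future 2D waypoints."""
    return np.zeros((8, 2))                  # stands in for the planner's waypoint output
```

A real rollout would call these three functions once per frame, threading the attended-ID window through time.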
The code also includes ViNT/GNM baselines and gaze/person-aware variants.
| Path | Description |
| --- | --- |
| `configs/` | Training and detector/tracker configs |
| `gaze2nav/data/` | Dataset loaders and preprocessing scripts |
| `gaze2nav/models/` | Gaze, saliency-matching, and action models |
| `gaze2nav/training/` | Losses, training loops, logging, visualization |
| `detector/`, `deep_sort/` | Mask R-CNN and Deep SORT tracking components |
| `tracking_utils/` | Shared detector/tracker drawing, IO, and logging helpers |
| `run_deep_sort.py` | Batch tracking script that exports mask CSV files |
| `train.py` | Main train/generate entrypoint |
| `evaluate_salient_ids.py` | Evaluation for generated salient-person IDs |
| `create_data_splits.py` | Utility for creating train/test trajectory splits |
Each trajectory folder should contain frame-level data (a loading sketch follows the layout):

```
data/<trajectory_name>/
    0.jpg, 1.jpg, ...
    0.csv, 1.csv, ...   # first row: tracked person IDs; following rows: flattened masks
    traj_data.pkl       # 2D trajectory dataframe
    fixations.pkl       # fixation dataframe, x/y per frame
    person_ids.pkl      # generated from the mask CSVs
    select_ids.pkl      # ground-truth salient IDs
```
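As a concrete example, a minimal loader for one frame's mask CSV might look like the sketch below. It assumes one row-major flattened binary mask per row after the ID row, plus a hypothetical frame size; check the preprocessing scripts in `gaze2nav/data/` for the exact layout.

```python
import csv
import numpy as np

H, W = 480, 640  # hypothetical frame size; use your recording's resolution

def load_frame_masks(csv_path):
    """Read one frame's mask CSV: first row = tracked person IDs,
    each following row = one person's flattened binary mask (assumed row-major)."""
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    person_ids = [int(float(v)) for v in rows[0]]
    masks = [np.asarray(row, dtype=float).reshape(H, W) > 0 for row in rows[1:]]
    return dict(zip(person_ids, masks))  # {person_id: (H, W) boolean mask}
```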
Split folders each contain a `traj_names.txt` file:

```
data_splits/train/traj_names.txt
data_splits/test/traj_names.txt
```
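Each `traj_names.txt` is expected to list one trajectory folder name per line (an assumption based on the layout above; the names here are placeholders):

```
traj_000
traj_001
traj_002
```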
Create splits with:

```bash
python create_data_splits.py --data_dir /path/to/data --data_splits_dir /path/to --dataset_name data_splits
```

Parts of the navigation model structure and training workflow are inspired by [robodhruv/visualnav-transformer](https://github.com/robodhruv/visualnav-transformer). The tracking stack includes Deep SORT-derived components together with Mask R-CNN detection code.