Skip to content

zerotonin/stag

Repository files navigation

STAG — Sensor-based Tracking and Analysis of Gait

License: MIT Python 3.10+ DOI

An unsupervised machine-learning pipeline for classifying farmed red deer (Cervus elaphus) behaviour from wearable tri-axial accelerometer data.

STAG discovers prototypical movement patterns directly from sensor streams using k-means clustering, chains them into higher-order behavioural sequences via a Hidden Markov Model, and runs on a 16 MHz microcontroller at over 4 × 10⁸ classifications per second — no GPU, cloud link, or labelled training data required at inference time.


Pipeline overview

Stage Module Description
1 stag.sync Synchronise head & ear accelerometer streams via calibration-drop events
2 stag.database Ingest synchronised .h5 files into a SQLite database (SQLAlchemy ORM)
3 stag.gps Compute ground speed and path tortuosity from GPS fixes (NZMG projection)
4 stag.clustering GPU-accelerated k-means with contiguous leave-out stability analysis
5 stag.analysis Transition matrices, bout statistics, and HMM super-prototypes

Repository structure

stag/
├── stag/                   # Python package
│   ├── sync/               # Sensor synchronisation
│   ├── database/           # SQLAlchemy ORM & database construction
│   ├── gps/                # GPS trajectory analysis & plotting
│   ├── clustering/         # k-means clustering & meta-analysis
│   ├── analysis/           # Label analysis, HMM, preprocessing
│   └── utils/              # Logging, filename generation, helpers
├── scripts/                # Runnable entry-point scripts
├── slurm/                  # HPC job submission scripts (NeSI / Aoraki)
├── notebooks/              # Jupyter notebooks
│   ├── data_merging/       # Per-deer data merging notebooks
│   └── exploratory/        # Exploratory analysis & tortuosity
├── data/                   # Deer code CSVs and auxiliary data
│   └── deer_codes/         # Animal identification lookup tables
├── docs/                   # Sphinx documentation source
├── tests/                  # Unit tests (to be expanded)
├── CITATION.cff            # Machine-readable citation metadata
├── LICENSE                 # MIT License
├── pyproject.toml          # Build system & dependencies
└── environment.yml         # Conda environment specification

Installation

With conda (recommended for GPU clustering)

conda env create -f environment.yml
conda activate stag
pip install -e .

With pip (CPU-only, no RAPIDS)

pip install -e .

For GPU-accelerated clustering you additionally need RAPIDS cuML installed in your environment.

Quick start

1. Synchronise sensor data

from stag.sync.data_sync import BetterDataSync

syncer = BetterDataSync(
    deer_id="R1_D1",
    head_data=head_df,
    ear_data=ear_df,
    window_dict={"start": 0, "end": 50000},
)
syncer.run_synchronization()

2. Run clustering

python scripts/run_clustering.py \
    -t deer8 -nc 8 -ds 0 -dp 0 -rs 0 \
    -df data/clust_data_deer8.npy \
    -sd results/

Or submit a SLURM sweep:

sbatch slurm/run_slurm_main_clustering.sh

3. Analyse behavioural sequences

from stag.analysis.label_analysis import LabelAnalyser

analyser = LabelAnalyser("results/labels.npy", fps=50)
analyser.main(cutoff=2, save_path="results/label_analysis.json")

Hardware deployment

The trained nearest-centroid classifier was benchmarked on embedded microcontrollers:

Processor Classifications / sec
Arduino Uno (16 MHz ATmega328P) 4.3 × 10⁸
Arduino Micro 4.3 × 10⁸
Intel i7-13700H (single core) 2.6 × 10⁵

See also CITATION.cff for machine-readable citation metadata.

License

This project is licensed under the MIT License — see LICENSE for details.

Authors

  • Alexander R. H. Matthews — Department of Zoology, University of Otago, Dunedin, New Zealand
  • Lindsay R. Matthews — Matthews Research International LP, New Zealand
  • Bart R. H. Geurten — Department of Zoology, University of Otago, Dunedin, New Zealand (corresponding author: bart.geurten@otago.ac.nz)

Development history

This pipeline was developed at the Department of Zoology, University of Otago. Early development took place under github.com/alexrhmatthews/headshake_project; the codebase was reorganised and renamed to STAG for publication.

About

STAG — Sensor-based Tracking and Analysis of Gait. An unsupervised ML pipeline for classifying farmed red deer behaviour from wearable accelerometers. k-means clustering → HMM super-prototypes → real-time on-animal deployment at 4×10⁸ classifications/sec on a 16 MHz microcontroller.

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors