
πŸ–±οΈ Cursor Trajectory

Learn how humans move — then generate it.

A research system that captures real cursor movement at per-pixel resolution, decomposes it into mathematical primitives, and trains a Neural ODE to generate naturalistic trajectories between any two points.



CursorCapture records every pixel your cursor visits · TrajectoryGen learns your movement patterns · Generate realistic paths on demand


🎯 What is this?

Most trajectory generation uses straight lines or Bézier curves. Real human cursor movement is far more complex — it has acceleration phases, micro-corrections, overshoots, and a rhythmic quality unique to each person.

This project takes a different approach:

  1. Record every pixel the cursor visits during daily computer use (per-pixel, μs timestamps)
  2. Decompose trajectories into mathematical building blocks (motion primitives) using SIREN neural networks
  3. Learn the vocabulary of how you move using VQ-VAE
  4. Generate new trajectories using Latent ODEs that are statistically indistinguishable from real movement

The result: given any two points on screen, output a continuous, naturalistic trajectory in under 500ms — with no seams, no straight lines, and no unnatural transitions.
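
For orientation, the eventual interface is deliberately simple: two points in, a timed path out. A minimal sketch, assuming a hypothetical generate helper (the real entry point is trajectory_gen/models/inference.py, whose API may differ):

# Hypothetical interface (illustrative only); the actual end-to-end
# pipeline lives in trajectory_gen/models/inference.py.
start, end = (120.0, 340.0), (980.0, 615.0)   # any two on-screen points, in pixels
# path = generate(start, end)                 # assumed output: [(x, y, t_us), ...]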

Domain-agnostic by design. Cursor movement is the test domain. The same architecture generalizes to game NPC movement, animation, handwriting synthesis, and robotics.


📦 CursorCapture — Data Collection

A 1.6MB Rust binary that silently records cursor movement. Install once, forget forever.

Why per-pixel?

Most recorders sample at a fixed rate (e.g. 60Hz = every 16ms). This loses data — if your cursor moves 200 pixels in 16ms, you only see the start and end, missing nearly 200 intermediate points of the actual trajectory.

CursorCapture takes a different approach: record every distinct pixel the cursor visits. No time-based throttling. The OS reports a new position → we record it. Period.

Each event includes a microsecond-precision timestamp so you can compute velocity, acceleration, and jerk from the data without any interpolation guesswork.
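
To make the loss concrete, here is a small Python sketch (illustrative; it assumes a local CursorCapture output file named cursor_events.jsonl) that replays per-pixel events through a simulated 60Hz sampler and counts what survives:

import json

# Per-pixel events: {"x":..., "y":..., "t": microseconds}, one JSON object per line.
with open("cursor_events.jsonl") as f:
    events = [json.loads(line) for line in f]

# A fixed-rate 60Hz recorder keeps at most one event per ~16.7ms bin.
BIN_US = 1_000_000 // 60
kept, seen = [], set()
for e in events:
    b = e["t"] // BIN_US
    if b not in seen:
        seen.add(b)
        kept.append(e)

print(f"per-pixel events: {len(events)}, surviving at 60Hz: {len(kept)}")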

Features

| Feature | Detail |
| --- | --- |
| 🔬 Per-pixel capture | Records every distinct pixel position — zero spatial data loss |
| ⏱️ Microsecond timestamps | μs-precision timing for velocity & acceleration analysis |
| 💾 Efficient storage | JSONL format, hourly file rotation, auto-gzip after 24h |
| 🔒 Privacy first | Position + timestamp only. No screenshots, keystrokes, or window titles |
| 🔄 Auto-start | Runs on every login — macOS LaunchAgent / Windows Startup |
| 🛡️ Self-healing | Crash recovery via KeepAlive (macOS) / auto-restart |
| 📊 Storage cap | 500MB limit, oldest compressed files auto-deleted |
| 🪶 Tiny footprint | < 5MB RAM, single static binary, zero dependencies |

Quick Install

🍎 macOS (Apple Silicon & Intel)

Option 1: One-click installer

  1. Download the latest .tar.gz from Releases
  2. Extract and double-click install_mac.command
  3. Grant Accessibility permission when the settings window opens
  4. Done ✓ — runs in the background forever

Option 2: Manual

# Download and extract
tar -xzf cursor_capture-macos-*.tar.gz
cd dist

# Install (registers auto-start, opens permission dialog)
./cursor_capture install

Note: macOS requires Accessibility permission for cursor monitoring. The installer opens the settings panel automatically — just add and enable cursor_capture.

🪟 Windows

Option 1: One-click installer

  1. Download the latest .zip from Releases
  2. Extract and double-click install_win.bat
  3. Done ✓ — no special permissions needed on Windows

Option 2: Manual

# Just run the exe — it auto-installs on first launch
.\cursor_capture.exe

🔧 Build from source

git clone https://github.com/Ramcharan747/cursor-trajectory.git
cd cursor-trajectory/cursor_capture
cargo build --release

# Binary at: target/release/cursor_capture
./target/release/cursor_capture install

Usage

cursor_capture              # Smart default: auto-install if needed, then run
cursor_capture status       # Check if running, data size, etc.
cursor_capture uninstall    # Remove auto-start (preserves data)

Data Format

Each line is a distinct pixel position the cursor visited, with a microsecond timestamp:

{"x":1024.0,"y":768.0,"t":1714857600123456}
{"x":1025.0,"y":769.0,"t":1714857600124012}
{"x":1026.0,"y":770.0,"t":1714857600124589}
| Field | Type | Description |
| --- | --- | --- |
| x | f64 | Horizontal position (integer pixels) |
| y | f64 | Vertical position (integer pixels) |
| t | i64 | Microseconds since Unix epoch (μs precision) |

Why microseconds? At per-pixel capture rates, consecutive events can be <1ms apart. Microsecond precision lets you compute instantaneous velocity and acceleration without rounding artifacts.
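
As a worked example, a minimal NumPy sketch (assuming the same hypothetical filename as above) that computes instantaneous velocity and acceleration straight from consecutive events:

import json
import numpy as np

with open("cursor_events.jsonl") as f:           # hypothetical filename
    events = [json.loads(line) for line in f]

x = np.array([e["x"] for e in events])
y = np.array([e["y"] for e in events])
t = np.array([e["t"] for e in events], dtype=np.int64)   # μs since Unix epoch

dt = np.diff(t) / 1e6                             # seconds between consecutive pixels
speed = np.hypot(np.diff(x), np.diff(y)) / dt     # px/s, no interpolation guesswork
accel = np.diff(speed) / dt[1:]                   # px/s² from the same events
print(f"median speed: {np.median(speed):.0f} px/s")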

Storage estimates (8 hours active use/day):

  • Standard mouse (125Hz): ~160MB/day raw → ~30MB compressed (125 events/s × 8h ≈ 3.6M JSONL lines at roughly 45 bytes each)
  • 500MB cap ≈ 2-3 weeks of continuous collection
  • Auto-compresses files older than 24h, auto-deletes oldest when cap reached

🧠 TrajectoryGen — ML Pipeline

Status: Model architecture complete. Training begins after data collection.

Architecture Overview

Raw Data ──→ Segmentation ──→ SIREN INR ──→ VQ-VAE ──→ Latent ODE ──→ Trajectory
 (x,y,μs)     micro-cuts      per-segment    primitive    sequence      continuous
  per-pixel    at velocity     weight vector  library      generator    output
               dips, turns     compression    512 codes    ODE-RNN      (x,y,t)

Model Details

All model implementations are paper-verified against the original research:

Stage 1: Segmentation

Cut continuous per-pixel recordings at meaningful boundaries (a sketch of the first two heuristics follows this list):

  • Direction reversals (>45° angle change between velocity vectors)
  • Velocity dips (<5% of local peak speed — near-pauses)
  • Curvature inflection points (2nd derivative magnitude > 3σ)
  • Segments shorter than 50ms merged, longer than 3s split
Stage 2: SIREN INR Fitting

Paper: Implicit Neural Representations with Periodic Activation Functions (Sitzmann et al., 2020)

Each micro-segment (0.05–3s) is compressed into a SIREN network:

| Parameter | Value | Source |
| --- | --- | --- |
| Layers | 1 input + 3 hidden + 1 linear output | — |
| Hidden width | 64 neurons | — |
| Activation | sin(ω₀ · (Wx + b)) | Paper Eq. 4 |
| ω₀ (first layer) | 30.0 | Paper Section 3.2 |
| ω₀ (hidden layers) | 30.0 | Supplement Section 1.5 |
| Init (first layer) | U(-1/fan_in, 1/fan_in) | Paper Section 3.2 |
| Init (hidden) | U(-√(6/n)/ω₀, √(6/n)/ω₀) | Supplement Theorem 1.8 |
| Total params | ~12,738 per segment | — |
| Optimizer | Adam, lr=1e-4 | Paper Section 3.2 |
| Training | 500 iterations | — |

Key property: Derivatives of SIRENs are SIRENs (since d/dx sin = cos = sin(· + π/2)), so velocity and acceleration are computed analytically — no finite differences.
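
A minimal PyTorch sketch matching the table (the repo's version is trajectory_gen/models/siren.py; bias init is left at the PyTorch default here):

import math
import torch
import torch.nn as nn

class SirenLayer(nn.Module):
    """Linear layer followed by sin(ω₀ · (Wx + b))."""
    def __init__(self, d_in, d_out, w0=30.0, first=False):
        super().__init__()
        self.w0, self.linear = w0, nn.Linear(d_in, d_out)
        # Weight init per the table: U(-1/fan_in, 1/fan_in) for the first layer,
        # U(-√(6/fan_in)/ω₀, √(6/fan_in)/ω₀) for hidden layers.
        bound = 1.0 / d_in if first else math.sqrt(6.0 / d_in) / w0
        nn.init.uniform_(self.linear.weight, -bound, bound)

    def forward(self, x):
        return torch.sin(self.w0 * self.linear(x))

# t (scalar time) → (x, y): 1 input + 3 hidden sine layers + 1 linear output.
siren = nn.Sequential(
    SirenLayer(1, 64, first=True),
    SirenLayer(64, 64), SirenLayer(64, 64), SirenLayer(64, 64),
    nn.Linear(64, 2),
)
print(sum(p.numel() for p in siren.parameters()))   # 12738, as in the table

# Exact velocity via autograd (no finite differences):
t = torch.linspace(0, 1, 100).unsqueeze(1).requires_grad_(True)
vx = torch.autograd.grad(siren(t)[:, 0].sum(), t, create_graph=True)[0]  # dx/dt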

Stage 3: VQ-VAE Primitive Library

Paper: Neural Discrete Representation Learning (van den Oord et al., 2017)

Compresses SIREN weight vectors into a discrete codebook of motion primitives:

| Parameter | Value | Source |
| --- | --- | --- |
| Encoder | 12738 → 2048 → ReLU → 1024 → ReLU → 512 → ReLU → 256 | — |
| Codebook (K) | 512 entries × 256 dims | — |
| Decoder | 256 → 512 → ReLU → 1024 → ReLU → 2048 → ReLU → 12738 | — |
| Gradient | Straight-through estimator | Paper Section 3.2 |
| Codebook updates | EMA, γ=0.99 | Appendix A.1, Eq. 6-8 |
| Commitment cost (β) | 0.25 | Paper Section 3.2 |
| Loss | MSE + β·commitment | Paper Eq. 3 |

Each codebook entry = a reusable motion primitive (arc, line, hook, correction, etc.)
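
A minimal sketch of the quantization step with the straight-through estimator and commitment loss (Paper Eq. 3); the EMA codebook update (Appendix A.1) is omitted for brevity:

import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    def __init__(self, k=512, d=256, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(k, d)   # K motion primitives, 256-d each
        self.beta = beta

    def forward(self, z_e):                  # z_e: (batch, d) encoder outputs
        # Nearest codebook entry by Euclidean distance.
        idx = torch.cdist(z_e, self.codebook.weight).argmin(dim=1)
        z_q = self.codebook(idx)

        # Codebook + commitment terms of the VQ-VAE loss (Paper Eq. 3).
        loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())

        # Straight-through estimator: copy gradients from decoder input to encoder.
        z_q = z_e + (z_q - z_e).detach()
        return z_q, idx, loss

vq = VectorQuantizer()
z_q, idx, loss = vq(torch.randn(8, 256))     # 8 SIREN-weight encodings in

With EMA updates enabled, the codebook term is replaced by the moving-average update from Appendix A.1, leaving only the commitment term in the loss.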

Stage 4: Latent ODE Generator

Papers: Neural Ordinary Differential Equations (Chen et al., 2018) · Latent ODEs for Irregularly-Sampled Time Series (Rubanova et al., 2019)

Sequences motion primitives using continuous-time dynamics:

| Parameter | Value | Source |
| --- | --- | --- |
| Encoder | ODE-RNN (backwards in time) | Rubanova Eq. 8, Algorithm 1 |
| Recognition dim | 256 (> latent dim) | Supplement Section 5 |
| Latent dim | 64 | — |
| ODE function | 4-layer MLP × 512 hidden, Tanh activation | Supplement Section 4 |
| ODE solver | dopri5 (adaptive RK 4/5) | Supplement Section 4 |
| Tolerances | rtol=1e-3, atol=1e-4 | Supplement Section 4 |
| Training | ELBO with KL annealing (coeff 0.99) | Supplement Section 6 |
| Memory | O(1) via adjoint method | Chen et al. Section 2 |
| Optimizer | Adamax, lr=0.01, decay=0.999 | Supplement Section 6 |

Why Tanh? From Rubanova supplement: "Tanh activation constrains the output and prevents the ODE gradients from taking large values... we do not recommend using ReLU."
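
A minimal sketch of the decode-side dynamics using the torchdiffeq package, which supplies dopri5 and the adjoint method; dimensions and tolerances follow the table above:

import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint   # O(1)-memory backprop

class ODEFunc(nn.Module):
    """dz/dt = f(z): 4 hidden layers × 512 units, Tanh (see the quote above)."""
    def __init__(self, d_latent=64, d_hidden=512):
        super().__init__()
        dims = [d_latent] + [d_hidden] * 4
        layers = []
        for a, b in zip(dims[:-1], dims[1:]):
            layers += [nn.Linear(a, b), nn.Tanh()]
        layers.append(nn.Linear(d_hidden, d_latent))
        self.net = nn.Sequential(*layers)

    def forward(self, t, z):                 # torchdiffeq calls f(t, state)
        return self.net(z)

z0 = torch.randn(16, 64)                     # initial latent state from the encoder
ts = torch.linspace(0.0, 1.0, 20)            # seq_len 20 time points
zs = odeint(ODEFunc(), z0, ts, rtol=1e-3, atol=1e-4, method="dopri5")
print(zs.shape)                              # torch.Size([20, 16, 64])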

Training Strategy

Optimized for Google Colab (16GB VRAM, 4-hour sessions); the loop skeleton sketched after this list ties these together:

  • Checkpoints every 10 minutes (survives disconnects)
  • Mixed precision (fp16) for ~2× speedup on T4
  • Gradient accumulation (4 steps) for larger effective batch
  • VQ-VAE: batch 4096, effective 16384 with grad accumulation (~3-4GB)
  • Latent ODE: batch 256, seq_len 20, adjoint method (~10-12GB)
  • Precomputed SIREN weight vectors (no redundant INR fitting during training)
  • Checkpoint storage on HuggingFace Hub

πŸ—οΈ Architecture

CursorCapture (Rust)

┌──────────────────────┐                     ┌───────────────────┐
│  Listener Thread     │    mpsc channel     │  Writer Thread    │
│                      │ ──────────────────→ │                   │
│  rdev::listen()      │   CursorEvent       │  BufWriter<File>  │
│  • Per-pixel capture │   {x, y, t_μs}      │  • Batch writes   │
│  • Pixel dedup only  │                     │  • File rotation  │
│  • No time throttle  │                     │  • Hourly gzip    │
└──────────────────────┘                     │  • 500MB cap      │
         │                                   └───────────────────┘
  ┌──────┴──────┐
  │  Watchdog   │  Detects missing permissions
  │  Thread     │  Periodic health logging
  └─────────────┘

TrajectoryGen Pipeline

┌───────────┐   ┌─────────────┐   ┌────────────┐   ┌────────────┐   ┌────────────┐
│  Record   │──→│  Segment    │──→│  SIREN     │──→│  VQ-VAE    │──→│  Latent ODE│
│  Per-pixel│   │  Direction  │   │  3×64      │   │  512 codes │   │  ODE-RNN   │
│  (x,y,μs) │   │  Velocity   │   │  sin(ω₀x)  │   │  EMA+CL    │   │  Adjoint   │
│           │   │  Curvature  │   │ ~12.7K wts │   │ 256d embed │   │            │
└───────────┘   └─────────────┘   └────────────┘   └────────────┘   └────────────┘

Research Papers

The model architecture is built on these four papers:

| Paper | Year | Role in Pipeline | Key Contribution |
| --- | --- | --- | --- |
| SIREN | 2020 | Trajectory segment encoding | Periodic activations + principled init (ω₀=30) |
| VQ-VAE | 2017 | Motion primitive codebook | Discrete latent learning + EMA codebook |
| Neural ODE | 2018 | Continuous dynamics backbone | Adjoint method for O(1) memory |
| Latent ODE | 2019 | Sequence generation | ODE-RNN encoder + VAE framework |

πŸ“ Project Structure

cursor-trajectory/
├── cursor_capture/              # Rust data collection daemon
│   ├── src/
│   │   ├── main.rs              # CLI + smart auto-install
│   │   ├── recorder.rs          # Per-pixel capture, no time throttle
│   │   ├── storage.rs           # JSONL writer, rotation, compression
│   │   └── platform.rs          # Cross-platform auto-start
│   ├── install_mac.command      # macOS one-click installer
│   ├── install_win.bat          # Windows one-click installer
│   ├── Cargo.toml
│   └── README.md
├── trajectory_gen/              # ML pipeline
│   ├── models/
│   │   ├── siren.py             # SIREN INR (Sitzmann et al. 2020)
│   │   ├── vqvae.py             # VQ-VAE codebook (van den Oord et al. 2017)
│   │   ├── latent_ode.py        # Latent ODE generator (Rubanova et al. 2019)
│   │   └── inference.py         # End-to-end generation pipeline
│   ├── data/
│   │   ├── preprocessing.py     # JSONL loading, idle filtering
│   │   └── segmentation.py      # Direction/velocity/curvature cuts
│   ├── training/
│   │   └── colab_trainer.py     # Colab-optimized training loop
│   └── requirements.txt         # Python dependencies
├── papers/                      # Source research papers
├── .github/workflows/build.yml  # CI: auto-build Mac + Windows
└── README.md                    # ← You are here

🚀 Quick Start (ML Pipeline)

# Install dependencies
pip install -r trajectory_gen/requirements.txt

# Load and segment your data
python -c "
from trajectory_gen.data.preprocessing import load_all_recordings
from trajectory_gen.data.segmentation import segment_trajectory

x, y, t = load_all_recordings()
segments = segment_trajectory(x, y, t)
print(f'Found {len(segments)} segments from {len(x)} points')
"

# Fit SIRENs to segments (example)
python -c "
from trajectory_gen.models.siren import SIREN, SIRENFitter
fitter = SIRENFitter()
print(f'SIREN parameters: {SIREN().num_parameters}')
"

🤝 Contributing

This is a research project in active development. Contributions welcome:

  • Data collection improvements β€” Multi-monitor support, click events
  • Segmentation algorithms β€” New cut-point heuristics
  • Model architecture β€” Alternative primitive representations
  • Platform support β€” Linux support, system tray UI
  • Training recipes β€” Hyperparameter tuning, new datasets

📄 License

MIT License — see LICENSE for details.


If this project is useful to you, consider giving it a ⭐

Built with 🦀 Rust and 🔥 PyTorch
