Learn how humans move, then generate it.
A research system that captures real cursor movement at per-pixel resolution, decomposes it into mathematical primitives, and trains a Neural ODE to generate naturalistic trajectories between any two points.
CursorCapture records every pixel your cursor visits · TrajectoryGen learns your movement patterns · Generate realistic paths on demand
Most trajectory generation uses straight lines or Bézier curves. Real human cursor movement is far more complex: it has acceleration phases, micro-corrections, overshoots, and a rhythmic quality unique to each person.
This project takes a different approach:
- Record every pixel the cursor visits during daily computer use (per-pixel, μs timestamps)
- Decompose trajectories into mathematical building blocks (motion primitives) using SIREN neural networks
- Learn the vocabulary of how you move using VQ-VAE
- Generate new trajectories using Latent ODEs that are statistically indistinguishable from real movement
The result: given any two points on screen, output a continuous, naturalistic trajectory in under 500ms, with no seams, no straight lines, and no unnatural transitions.
Domain-agnostic by design. Cursor movement is the test domain. The same architecture generalizes to game NPC movement, animation, handwriting synthesis, and robotics.
A 1.6MB Rust binary that silently records cursor movement. Install once, forget forever.
Most recorders sample at a fixed rate (e.g. 60Hz = every 16ms). This loses data: if your cursor moves 200 pixels in 16ms, you only see the start and end, missing 198 points of the actual trajectory.
CursorCapture takes a different approach: record every distinct pixel the cursor visits, with no time-based throttling. The OS reports a new position, we record it. Period.
Each event includes a microsecond-precision timestamp so you can compute velocity, acceleration, and jerk from the data without any interpolation guesswork.
| Feature | Detail |
|---|---|
| 🔬 Per-pixel capture | Records every distinct pixel position, zero spatial data loss |
| ⏱️ Microsecond timestamps | μs-precision timing for velocity & acceleration analysis |
| 💾 Efficient storage | JSONL format, hourly file rotation, auto-gzip after 24h |
| 🔒 Privacy first | Position + timestamp only. No screenshots, keystrokes, or window titles |
| 🚀 Auto-start | Runs on every login via macOS LaunchAgent / Windows Startup |
| 🛡️ Self-healing | Crash recovery via KeepAlive (macOS) / auto-restart |
| 📊 Storage cap | 500MB limit, oldest compressed files auto-deleted |
| 🪶 Tiny footprint | < 5MB RAM, single static binary, zero dependencies |
🍎 macOS (Apple Silicon & Intel)

Option 1: One-click installer

- Download the latest `.tar.gz` from Releases
- Extract and double-click `install_mac.command`
- Grant Accessibility permission when the settings window opens
- Done ✅: runs in the background forever

Option 2: Manual

```bash
# Download and extract
tar -xzf cursor_capture-macos-*.tar.gz
cd dist

# Install (registers auto-start, opens permission dialog)
./cursor_capture install
```

Note: macOS requires Accessibility permission for cursor monitoring. The installer opens the settings panel automatically; just add and enable `cursor_capture`.
🪟 Windows

Option 1: One-click installer

- Download the latest `.zip` from Releases
- Extract and double-click `install_win.bat`
- Done ✅: no special permissions needed on Windows

Option 2: Manual

```
# Just run the exe: it auto-installs on first launch
.\cursor_capture.exe
```

🔧 Build from source
```bash
git clone https://github.com/Ramcharan747/cursor-trajectory.git
cd cursor-trajectory/cursor_capture
cargo build --release

# Binary at: target/release/cursor_capture
./target/release/cursor_capture install
```

Usage:

```bash
cursor_capture            # Smart default: auto-install if needed, then run
cursor_capture status    # Check if running, data size, etc.
cursor_capture uninstall # Remove auto-start (preserves data)
```

Each line is a distinct pixel position the cursor visited, with a microsecond timestamp:
```
{"x":1024.0,"y":768.0,"t":1714857600123456}
{"x":1025.0,"y":769.0,"t":1714857600124012}
{"x":1026.0,"y":770.0,"t":1714857600124589}
```

| Field | Type | Description |
|---|---|---|
| `x` | `f64` | Horizontal position (integer pixels) |
| `y` | `f64` | Vertical position (integer pixels) |
| `t` | `i64` | Microseconds since Unix epoch (μs precision) |
Why microseconds? At per-pixel capture rates, consecutive events can be <1ms apart. Microsecond precision lets you compute instantaneous velocity and acceleration without rounding artifacts.
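For instance, instantaneous speed falls out of two consecutive events directly. A minimal sketch in plain Python, using the sample lines above:

```python
import json

# Three consecutive capture events (the sample lines from the schema above)
lines = [
    '{"x":1024.0,"y":768.0,"t":1714857600123456}',
    '{"x":1025.0,"y":769.0,"t":1714857600124012}',
    '{"x":1026.0,"y":770.0,"t":1714857600124589}',
]
events = [json.loads(line) for line in lines]

# Instantaneous speed between consecutive events, in pixels/second.
# t is in microseconds, hence the 1e6 factor.
velocities = []
for a, b in zip(events, events[1:]):
    dt = (b["t"] - a["t"]) / 1e6                 # 556μs and 577μs here
    dx, dy = b["x"] - a["x"], b["y"] - a["y"]
    velocities.append((dx**2 + dy**2) ** 0.5 / dt)

print([round(v) for v in velocities])            # ~2500 px/s for each step
```

With millisecond timestamps both gaps would round to the same value and the two speeds would be indistinguishable.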
Storage estimates (8 hours active use/day):
- Standard mouse (125Hz): ~160MB/day raw → ~30MB compressed
- 500MB cap → 2-3 weeks of continuous collection
- Auto-compresses files older than 24h, auto-deletes oldest when cap reached
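These figures check out on the back of an envelope. A sketch; the 125 events/s rate and the ~5× gzip ratio are assumptions read off the bullets above:

```python
# Back-of-envelope check of the storage estimates above (assumptions: a 125Hz
# mouse emits ~125 events/s while active, 8h active use/day, and the ~5x gzip
# ratio implied by the 160MB -> 30MB figure).
sample_line = '{"x":1024.0,"y":768.0,"t":1714857600123456}\n'
events_per_day = 125 * 3600 * 8                       # 3.6M events/day
raw_mb_per_day = events_per_day * len(sample_line) / 1e6
compressed_mb_per_day = raw_mb_per_day * 30 / 160
days_until_cap = 500 / compressed_mb_per_day
print(round(raw_mb_per_day), round(days_until_cap))   # ~158 MB/day, ~17 days
```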
Status: Model architecture complete. Training begins after data collection.
```
Raw Data ──▶ Segmentation ──▶ SIREN INR ──▶ VQ-VAE ──▶ Latent ODE ──▶ Trajectory
(x,y,μs)     micro-cuts       per-segment   primitive   sequence       continuous
per-pixel    at velocity      weight vector library     generator      output
             dips, turns      compression   512 codes   ODE-RNN        (x,y,t)
```
All model implementations are paper-verified against the original research:
Stage 1: Segmentation
Cut continuous per-pixel recordings at meaningful boundaries:
- Direction reversals (>45° angle change between velocity vectors)
- Velocity dips (<5% of local peak speed, i.e. near-pauses)
- Curvature inflection points (2nd derivative magnitude > 3σ)
- Segments shorter than 50ms are merged; segments longer than 3s are split
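The first two heuristics can be sketched in a few lines of NumPy. This is a hypothetical `cut_points` helper with the thresholds above, not the repo's `segmentation.py`:

```python
import numpy as np

def cut_points(x, y, t, angle_deg=45.0, dip_frac=0.05):
    """Sketch of the first two cut heuristics: direction reversals
    and velocity dips (hypothetical helper)."""
    x, y, t = map(np.asarray, (x, y, t))
    dt = np.diff(t) / 1e6                                    # μs -> seconds
    v = np.stack([np.diff(x), np.diff(y)], axis=1) / dt[:, None]
    speed = np.linalg.norm(v, axis=1)

    # Direction reversal: angle between consecutive velocity vectors > 45°
    dots = (v[:-1] * v[1:]).sum(axis=1)
    cos_a = dots / np.maximum(speed[:-1] * speed[1:], 1e-12)
    reversal = np.arccos(np.clip(cos_a, -1.0, 1.0)) > np.deg2rad(angle_deg)

    # Velocity dip: speed below 5% of the running peak (a stand-in for the
    # "local peak" the real algorithm would track over a window)
    peak = np.maximum.accumulate(speed)
    dip = speed[1:] < dip_frac * peak[1:]

    return np.where(reversal | dip)[0] + 1                   # event indices

# Toy trajectory: sweep right, near-pause, then sweep back left
t = np.arange(9) * 1000.0                                    # 1 event/ms, in μs
x = np.array([0.0, 5, 10, 15, 15.1, 15.1, 15, 10, 5])
y = np.zeros(9)
print(cut_points(x, y, t))                                   # cuts at [3 4 5]
```

The near-pause around the turn is caught by the velocity-dip rule, which is why reversals through a full stop still segment correctly.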
Stage 2: SIREN INR Fitting
Paper: Implicit Neural Representations with Periodic Activation Functions (Sitzmann et al., 2020)
Each micro-segment (0.05-3s) is compressed into a SIREN network:
| Parameter | Value | Source |
|---|---|---|
| Layers | 1 input + 3 hidden + 1 linear output | – |
| Hidden width | 64 neurons | – |
| Activation | `sin(ω₀ · (Wx + b))` | Paper Eq. 4 |
| ω₀ (first layer) | 30.0 | Paper Section 3.2 |
| ω₀ (hidden layers) | 30.0 | Supplement Section 1.5 |
| Init (first layer) | `U(-1/fan_in, 1/fan_in)` | Paper Section 3.2 |
| Init (hidden) | `U(-√(6/n)/ω₀, √(6/n)/ω₀)` | Supplement Theorem 1.8 |
| Total params | ~12,738 per segment | – |
| Optimizer | Adam, lr=1e-4 | Paper Section 3.2 |
| Training | 500 iterations | – |
Key property: derivatives of SIRENs are themselves SIRENs (since d/dx sin = cos = sin(· + π/2)), so velocity and acceleration are computed analytically, with no finite differences.
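This property is easy to verify numerically. A sketch of a single randomly initialized SIREN first layer in NumPy (not the repo's `siren.py`), comparing the analytic derivative against a finite difference:

```python
import numpy as np

rng = np.random.default_rng(0)
w0 = 30.0                        # ω₀ from the paper
fan_in = 1                       # input is scalar time t
# First-layer init: U(-1/fan_in, 1/fan_in)
W = rng.uniform(-1 / fan_in, 1 / fan_in, size=(64, 1))
b = rng.uniform(-1.0, 1.0, size=(64,))

def layer(t):
    """SIREN first layer: sin(w0 * (W t + b))."""
    return np.sin(w0 * (W @ t + b[:, None]))

def layer_dt(t):
    """Analytic derivative: d/dt sin(w0(Wt+b)) = w0 * W * cos(w0(Wt+b)),
    i.e. another sinusoid, phase-shifted by π/2."""
    return w0 * W * np.cos(w0 * (W @ t + b[:, None]))

t = np.array([[0.3]])
fd = (layer(t + 1e-6) - layer(t - 1e-6)) / 2e-6   # central finite difference
print(np.max(np.abs(fd - layer_dt(t))))           # tiny: derivatives agree
```

In the full model this means velocity and acceleration of a fitted segment come from a forward pass, not from differencing noisy samples.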
Stage 3: VQ-VAE Primitive Library
Paper: Neural Discrete Representation Learning (van den Oord et al., 2017)
Compresses SIREN weight vectors into a discrete codebook of motion primitives:
| Parameter | Value | Source |
|---|---|---|
| Encoder | 12738 → 2048 → ReLU → 1024 → ReLU → 512 → ReLU → 256 | – |
| Codebook (K) | 512 entries × 256 dims | – |
| Decoder | 256 → 512 → ReLU → 1024 → ReLU → 2048 → ReLU → 12738 | – |
| Gradient | Straight-through estimator | Paper Section 3.2 |
| Codebook updates | EMA, γ=0.99 | Appendix A.1, Eq. 6-8 |
| Commitment cost (β) | 0.25 | Paper Section 3.2 |
| Loss | MSE + β·commitment | Paper Eq. 3 |
Each codebook entry = a reusable motion primitive (arc, line, hook, correction, etc.)
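The lookup and EMA update can be sketched in a few lines of NumPy with random data (a sketch, not the repo's `vqvae.py`; the straight-through gradient itself needs autograd and is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
K, D = 512, 256                         # codebook size and embedding dim
codebook = rng.normal(size=(K, D))
ema_counts = np.ones(K)                 # EMA cluster sizes
ema_sums = codebook.copy()              # EMA sums of assigned vectors
gamma = 0.99                            # EMA decay (Appendix A.1)

def quantize(z):
    """Nearest-codebook lookup: code indices and quantized vectors."""
    d = ((z[:, None, :] - codebook[None]) ** 2).sum(-1)   # (B, K) sq. dists
    idx = d.argmin(1)
    return idx, codebook[idx]

def ema_update(z, idx):
    """EMA codebook update (Eq. 6-8): decay counts/sums, refresh entries."""
    global ema_counts, ema_sums, codebook
    onehot = np.eye(K)[idx]                                # (B, K)
    ema_counts = gamma * ema_counts + (1 - gamma) * onehot.sum(0)
    ema_sums = gamma * ema_sums + (1 - gamma) * onehot.T @ z
    codebook = ema_sums / ema_counts[:, None]

z = rng.normal(size=(8, D))             # stand-in for encoder outputs
idx, zq = quantize(z)
ema_update(z, idx)
print(idx.shape, zq.shape)              # (8,) (8, 256)
```

During training, gradients bypass the non-differentiable argmin via the straight-through estimator (the decoder sees `z + stop_gradient(zq - z)`); the EMA update above replaces a codebook loss term, as in the paper's appendix.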
Stage 4: Latent ODE Generator
Papers:
- Neural Ordinary Differential Equations (Chen et al., 2018)
- Latent ODEs for Irregularly-Sampled Time Series (Rubanova et al., 2019)
Sequences motion primitives using continuous-time dynamics:
| Parameter | Value | Source |
|---|---|---|
| Encoder | ODE-RNN (backwards in time) | Rubanova Eq. 8, Algorithm 1 |
| Recognition dim | 256 (> latent dim) | Supplement Section 5 |
| Latent dim | 64 | – |
| ODE function | 4-layer MLP × 512 hidden, Tanh activation | Supplement Section 4 |
| ODE solver | `dopri5` (adaptive RK 4/5) | Supplement Section 4 |
| Tolerances | rtol=1e-3, atol=1e-4 | Supplement Section 4 |
| Training | ELBO with KL annealing (coeff 0.99) | Supplement Section 6 |
| Memory | O(1) via adjoint method | Chen et al. Section 2 |
| Optimizer | Adamax, lr=0.01, decay=0.999 | Supplement Section 6 |
Why Tanh? From Rubanova supplement: "Tanh activation constrains the output and prevents the ODE gradients from taking large values... we do not recommend using ReLU."
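To make the generator concrete, here is a toy latent rollout with a Tanh MLP as the ODE function. Random untrained weights stand in for the learned dynamics, and fixed-step RK4 replaces adaptive `dopri5`; a sketch, not the repo's `latent_ode.py`:

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT = 64

# ODE function f(z): MLP with Tanh hidden activations (bounded, so the
# dynamics stay stable). Layer sizes follow the table above; weights are
# random stand-ins for trained parameters.
sizes = [LATENT, 512, 512, 512, LATENT]
Ws = [rng.normal(0, 0.05, (a, b)) for a, b in zip(sizes, sizes[1:])]

def f(z):
    for W in Ws[:-1]:
        z = np.tanh(z @ W)
    return z @ Ws[-1]                   # linear output layer

def rk4_step(z, h):
    """One fixed-step RK4 update (the real model uses adaptive dopri5
    with the adjoint method for O(1)-memory backprop)."""
    k1 = f(z); k2 = f(z + h/2 * k1); k3 = f(z + h/2 * k2); k4 = f(z + h * k3)
    return z + h/6 * (k1 + 2*k2 + 2*k3 + k4)

z = rng.normal(size=LATENT)             # z0, as sampled from the recognition net
traj = [z]
for _ in range(20):                     # decode at 20 time points
    z = rk4_step(z, h=0.05)
    traj.append(z)
traj = np.array(traj)
print(traj.shape)                       # (21, 64): latent path to decode to (x,y,t)
```

A decoder would then map each latent state to a motion-primitive embedding, giving a trajectory that is continuous in time by construction.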
Optimized for Google Colab (16GB VRAM, 4-hour sessions):
- Checkpoints every 10 minutes (survives disconnects)
- Mixed precision (fp16) for ~2× speedup on T4
- Gradient accumulation (4 steps) for larger effective batch
- VQ-VAE: batch 4096, effective 16384 with grad accumulation (~3-4GB)
- Latent ODE: batch 256, seq_len 20, adjoint method (~10-12GB)
- Precomputed SIREN weight vectors (no redundant INR fitting during training)
- Checkpoint storage on HuggingFace Hub
```
┌──────────────────────┐                   ┌───────────────────┐
│   Listener Thread    │   mpsc channel    │   Writer Thread   │
│                      │ ────────────────▶ │                   │
│   rdev::listen()     │   CursorEvent     │  BufWriter<File>  │
│ • Per-pixel capture  │   {x, y, t_μs}    │ • Batch writes    │
│ • Pixel dedup only   │                   │ • File rotation   │
│ • No time throttle   │                   │ • Hourly gzip     │
└──────────────────────┘                   │ • 500MB cap       │
           │                               └───────────────────┘
    ┌──────┴───────┐
    │   Watchdog   │   Detects missing permissions
    │    Thread    │   Periodic health logging
    └──────────────┘
```

```
┌──────────┐   ┌─────────────┐   ┌────────────┐   ┌────────────┐   ┌──────────┐
│  Record  │──▶│   Segment   │──▶│   SIREN    │──▶│   VQ-VAE   │──▶│  Latent  │
│ Per-pixel│   │  Direction  │   │   3×64     │   │ 512 codes  │   │ ODE-RNN  │
│ (x,y,μs) │   │  Velocity   │   │ sin(ω₀x)   │   │  EMA+CL    │   │ Adjoint  │
│          │   │  Curvature  │   │ ~12.7K wts │   │ 256d embed │   │          │
└──────────┘   └─────────────┘   └────────────┘   └────────────┘   └──────────┘
```
The model architecture is built on these four papers:
| Paper | Year | Role in Pipeline | Key Contribution |
|---|---|---|---|
| SIREN | 2020 | Trajectory segment encoding | Periodic activations + principled init (ω₀=30) |
| VQ-VAE | 2017 | Motion primitive codebook | Discrete latent learning + EMA codebook |
| Neural ODE | 2018 | Continuous dynamics backbone | Adjoint method for O(1) memory |
| Latent ODE | 2019 | Sequence generation | ODE-RNN encoder + VAE framework |
```
cursor-trajectory/
├── cursor_capture/              # Rust data collection daemon
│   ├── src/
│   │   ├── main.rs              # CLI + smart auto-install
│   │   ├── recorder.rs          # Per-pixel capture, no time throttle
│   │   ├── storage.rs           # JSONL writer, rotation, compression
│   │   └── platform.rs          # Cross-platform auto-start
│   ├── install_mac.command      # macOS one-click installer
│   ├── install_win.bat          # Windows one-click installer
│   ├── Cargo.toml
│   └── README.md
├── trajectory_gen/              # ML pipeline
│   ├── models/
│   │   ├── siren.py             # SIREN INR (Sitzmann et al. 2020)
│   │   ├── vqvae.py             # VQ-VAE codebook (van den Oord et al. 2017)
│   │   ├── latent_ode.py        # Latent ODE generator (Rubanova et al. 2019)
│   │   └── inference.py         # End-to-end generation pipeline
│   ├── data/
│   │   ├── preprocessing.py     # JSONL loading, idle filtering
│   │   └── segmentation.py      # Direction/velocity/curvature cuts
│   ├── training/
│   │   └── colab_trainer.py     # Colab-optimized training loop
│   └── requirements.txt         # Python dependencies
├── papers/                      # Source research papers
├── .github/workflows/build.yml  # CI: auto-build Mac + Windows
└── README.md                    # ← You are here
```
```bash
# Install dependencies
pip install -r trajectory_gen/requirements.txt

# Load and segment your data
python -c "
from trajectory_gen.data.preprocessing import load_all_recordings
from trajectory_gen.data.segmentation import segment_trajectory
x, y, t = load_all_recordings()
segments = segment_trajectory(x, y, t)
print(f'Found {len(segments)} segments from {len(x)} points')
"

# Fit SIRENs to segments (example)
python -c "
from trajectory_gen.models.siren import SIREN, SIRENFitter
fitter = SIRENFitter()
print(f'SIREN parameters: {SIREN().num_parameters}')
"
```

This is a research project in active development. Contributions welcome:
- Data collection improvements: multi-monitor support, click events
- Segmentation algorithms: new cut-point heuristics
- Model architecture: alternative primitive representations
- Platform support: Linux support, system tray UI
- Training recipes: hyperparameter tuning, new datasets

MIT License. See LICENSE for details.
If this project is useful to you, consider giving it a ⭐
Built with 🦀 Rust and 🔥 PyTorch