
MR_RL — Reinforcement Learning for Magnetic Micro-Robot Control

A physics-based simulation environment and RL training framework for controlling magnetic micro-robots (MRs). The robot is steered by an external rotating magnetic field — the agent learns to choose the field frequency and angle at each timestep to navigate to a target position.

Two control approaches are implemented:

  • DDPG (Deep Deterministic Policy Gradient) — model-free actor-critic RL with continuous actions
  • Gaussian Process learning — model identification from circular trajectories, then GP-guided control

How the Robot Moves

The MR's motion is governed by:

ẋ = a₀ · f · cos(α)
ẏ = a₀ · f · sin(α)

where f is the field frequency (Hz) and α is the field angle (rad). a₀ is a robot-specific mobility constant that must be identified from data. The simulator integrates these equations using SciPy's RK45 solver at dt = 30 ms per timestep (matching the experimental sensing rate).
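A single integration step can be sketched as follows, assuming the nominal defaults from main.py (a₀ = 1.5, dt = 30 ms); the function names here are illustrative, not the simulator's actual API:

```python
import numpy as np
from scipy.integrate import solve_ivp

A0 = 1.5    # nominal mobility constant (a0_def in main.py)
DT = 0.030  # 30 ms timestep, matching the experimental sensing rate

def mr_dynamics(t, state, f, alpha):
    """xdot = a0*f*cos(alpha), ydot = a0*f*sin(alpha)."""
    return [A0 * f * np.cos(alpha), A0 * f * np.sin(alpha)]

def step(state, f, alpha):
    """Advance the MR one timestep with SciPy's RK45 solver."""
    sol = solve_ivp(mr_dynamics, (0.0, DT), state, method="RK45",
                    args=(f, alpha))
    return sol.y[:, -1]

pos = step(np.array([0.0, 0.0]), f=4.0, alpha=np.pi / 4)
```

Because the dynamics are piecewise constant within a timestep, RK45 here reproduces the closed-form displacement a₀·f·dt in the direction α.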

Project Structure

MR_RL/
├── MR_simulator.py       # Physics engine — RK45 integration of MR dynamics
├── MR_env.py             # OpenAI Gym environment wrapping the simulator
├── MR_viewer.py          # Matplotlib-based visualizer for trajectories
├── MR_data.py            # Experiment data loader
├── MR_experiment.py      # Interface to pass actions to real hardware and read outputs
├── Learning_module.py    # GP model learning (1D, learns a0 + drift correction)
├── Learning_module_2d.py # GP model learning (2D variant)
├── main.py               # GP learning pipeline: generate data → learn → test
├── main_2d.py            # 2D variant of main
├── utils.py              # Shared utilities: run_sim, plotting helpers
├── RL/
│   ├── MR_ddpg.py        # DDPG agent (actor, critic, replay buffer, OU noise)
│   ├── MR_ppo_keras_rl.py  # PPO variant using keras-rl
│   ├── CustomTrackerV10.py # Training callback for logging metrics
│   ├── evaluate_learning.py  # Post-training evaluation script
│   └── read_data.py      # Loads saved experiment histories
├── _experiments/         # Saved experiment results from DDPG training runs
├── h5f_files/            # Saved Keras model weights (.h5)
├── old/                  # Archived earlier implementations (DQN, Xbox controller, etc.)
└── lib/                  # Reference Gym environments (gridworld, blackjack, cliff walking)

Gym Environment (MR_env.py)

MR_Env is an OpenAI Gym-compatible environment that wraps the simulator.

Property            Value
------------------  -------------------------------------------------------------
Action space        Box([0, 0], [20, 2π]) — frequency (Hz) and field angle (rad)
Observation space   (x, y, x_target, y_target, distance)
Max timesteps       50 per episode
Goal                Reach within 30 units of the target
Boundaries          ±510 units in x and y

The environment supports both simulation mode (using MR_simulator.py) and real-hardware mode (using MR_experiment.py).
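The success and termination conditions in the table above can be sketched as follows; `observe` and `is_done` are illustrative helpers, not the environment's actual methods:

```python
import numpy as np

# Illustrative constants taken from the table above
GOAL_RADIUS = 30.0   # success threshold (units)
BOUND = 510.0        # workspace half-width in x and y
MAX_STEPS = 50       # episode length cap

def observe(pos, target):
    """Observation vector: (x, y, x_target, y_target, distance)."""
    d = float(np.linalg.norm(target - pos))
    return np.array([pos[0], pos[1], target[0], target[1], d])

def is_done(pos, target, t):
    """Episode ends on success, leaving the workspace, or timeout."""
    reached = np.linalg.norm(target - pos) < GOAL_RADIUS
    out = bool(np.any(np.abs(pos) > BOUND))
    return bool(reached or out or t >= MAX_STEPS)
```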

GP Learning Pipeline (main.py)

The Gaussian Process approach identifies the robot's a₀ and residual disturbances from data, then uses the learned model to compute optimal control inputs online.

Steps:

  1. Estimate drift — hold the robot still for ~3 seconds; fit a GP to the measured drift velocity (Dx, Dy)
  2. Collect training data — drive the robot in 3 circles over 60 seconds at a fixed frequency; record (α, vx, vy) pairs
  3. Learn a₀ — fit GPs for residual x/y dynamics; solve for a₀ that minimises prediction error
  4. Test — given a desired velocity vector, solve:
min_α  (a₀·f·cos(α) + GP_x(α) + Dx − v_desired_x)²
      + (a₀·f·sin(α) + GP_y(α) + Dy − v_desired_y)²
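The minimisation in step 4 can be sketched with SciPy; the zero-returning `gp_x`/`gp_y` stubs below stand in for the fitted GP residual models, and the constants are illustrative:

```python
import numpy as np
from scipy.optimize import minimize_scalar

a0, f = 1.5, 4.0       # learned mobility constant and field frequency (Hz)
Dx, Dy = 0.05, -0.02   # drift estimates from the stationary phase

# Stand-ins for the fitted GP residual models GP_x(alpha), GP_y(alpha)
gp_x = lambda alpha: 0.0
gp_y = lambda alpha: 0.0

def control_angle(v_des):
    """Choose the field angle that minimises the predicted velocity error."""
    def cost(alpha):
        ex = a0 * f * np.cos(alpha) + gp_x(alpha) + Dx - v_des[0]
        ey = a0 * f * np.sin(alpha) + gp_y(alpha) + Dy - v_des[1]
        return ex**2 + ey**2
    return minimize_scalar(cost, bounds=(0.0, 2 * np.pi), method="bounded").x

alpha_star = control_angle(np.array([3.0, 4.0]))
```

With zero GP residuals this reduces to pointing the field along the drift-compensated desired velocity, i.e. α = atan2(v_y − Dy, v_x − Dx).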

To run:

python main.py

Key parameters at the top of main.py:

Variable    Default   Description
---------   -------   ----------------------------------------------
freq        4 Hz      Field frequency for training and testing
a0_def      1.5       Nominal a₀ used to generate simulated data
dt          0.030 s   Timestep (30 ms)
noise_var   0.5       Noise added to simulate model mismatch
cycles      3         Number of training circles
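With these defaults, the circular training trajectory of step 2 could be generated roughly as follows (a sketch, not main.py's actual code; noise_var is treated here as a standard deviation for simplicity):

```python
import numpy as np

freq, a0, dt = 4.0, 1.5, 0.030   # defaults from the table above
cycles, duration = 3, 60.0       # three circles over 60 seconds
noise_std = 0.5                  # model-mismatch noise (noise_var)

t = np.arange(0.0, duration, dt)
alpha = 2 * np.pi * cycles * t / duration  # field angle sweeps three full turns
vx = a0 * freq * np.cos(alpha) + np.random.normal(0.0, noise_std, t.size)
vy = a0 * freq * np.sin(alpha) + np.random.normal(0.0, noise_std, t.size)
# (alpha, vx, vy) tuples form the GP training set
```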

DDPG Agent (RL/MR_ddpg.py)

A standard DDPG implementation for continuous control:

  • Actor — fully connected network: state (5) → 400 → 300 → action (2) with tanh output
  • Critic — state (5) + action (2) → 400 → 300 → Q-value (1)
  • Replay buffer — circular deque storing (s, a, r, done, s')
  • Ornstein-Uhlenbeck noise — temporally correlated exploration noise (θ=0.15, σ=0.3)
  • Target networks — soft updates (τ=0.001) for training stability
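A minimal sketch of the OU exploration noise with the parameters listed above (θ = 0.15, σ = 0.3); the class name and interface are illustrative, not the agent's actual code:

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""
    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.3):
        self.mu = mu * np.ones(size)
        self.theta, self.sigma = theta, sigma
        self.reset()

    def reset(self):
        """Start each episode back at the mean."""
        self.state = self.mu.copy()

    def sample(self):
        """Mean-reverting drift toward mu plus Gaussian diffusion."""
        self.state = self.state + self.theta * (self.mu - self.state) \
                     + self.sigma * np.random.randn(self.state.size)
        return self.state

noise = OUNoise(size=2)        # one channel per action dimension (f, alpha)
exploration = noise.sample()   # added to the actor's output each step
```

Unlike i.i.d. Gaussian noise, consecutive OU samples are correlated, which produces smoother exploratory trajectories for physical systems.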

To train:

cd RL
python MR_ddpg.py

Model weights are saved to h5f_files/; training histories are saved to _experiments/.

Requirements

pip install numpy scipy gym tensorflow tflearn scikit-learn matplotlib

Note: RL/MR_ddpg.py and RL/MR_ppo_keras_rl.py use TensorFlow 1.x / TFLearn. If running on TF2, use compatibility mode (tf.compat.v1).
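A common shim for running TF1-style graph code under TF2 (whether tflearn itself works under this shim depends on the installed versions):

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()  # restore TF1 graph-mode semantics
```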

Related Work

  • MRs_suhail — MATLAB pipeline using RRT* + nonlinear MPC for the same robot
  • DQN4MRs — DQN grid-world prototype with STREL spec-guided RL extension
