
flappygolf/cheese-defender

Cat and Mouse MARL: POMDP Grid World

CI | License: MIT | Python 3.10 | PettingZoo

A bilingual, open-source research repository for a partially observable cat-and-mouse grid world built on the PettingZoo Parallel API, with recurrent MARL baselines, diagnostics, and reproducible training utilities.

Highlights

  • Partially observable PettingZoo Parallel API environment
  • Three agents: cat_0, mouse_0, mouse_1
  • Asymmetric actions, local observations, shared mouse policies, and recurrent actor-critic baselines
  • Mainline training scripts for MAPPO, IPPO, and QMIX
  • Diagnostics for task events, recording, regression checks, and visualization
  • Bilingual documentation for public release
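
The interaction pattern behind the highlights above can be sketched with a toy stand-in. This is a minimal sketch, not the repository's environment: ToyCatMouseEnv, RandomPolicy, num_actions, the action counts, and the 9-value observation are all hypothetical, chosen only to mirror the shape of the PettingZoo Parallel API (reset returns observations and infos; step takes a dict of actions and returns five dicts keyed by agent). It also shows one way the two mice could share a single policy object while the cat uses its own.

```python
import random

AGENTS = ["cat_0", "mouse_0", "mouse_1"]


class ToyCatMouseEnv:
    """Hypothetical stand-in that mimics the Parallel API call shape only."""

    def __init__(self, max_steps=5):
        self.possible_agents = list(AGENTS)
        self.agents = []
        self.max_steps = max_steps
        self._t = 0
        self._rng = random.Random()

    def num_actions(self, agent):
        # Assumed asymmetric action sets; real PettingZoo environments expose
        # action_space(agent) instead of this illustrative helper.
        return 5 if agent.startswith("cat") else 4

    def _observe(self):
        # Placeholder local observation: nine integers per live agent.
        return {a: [self._rng.randint(0, 3) for _ in range(9)] for a in self.agents}

    def reset(self, seed=None, options=None):
        self._rng = random.Random(seed)
        self.agents = list(self.possible_agents)
        self._t = 0
        return self._observe(), {a: {} for a in self.agents}

    def step(self, actions):
        self._t += 1
        live = list(self.agents)
        observations = self._observe()
        rewards = {a: 0.0 for a in live}
        terminations = {a: False for a in live}
        truncations = {a: self._t >= self.max_steps for a in live}
        infos = {a: {} for a in live}
        if self._t >= self.max_steps:
            self.agents = []  # an empty agent list ends the episode loop
        return observations, rewards, terminations, truncations, infos


class RandomPolicy:
    """Uniform random policy that counts how often it is queried."""

    def __init__(self, num_actions, seed):
        self.num_actions = num_actions
        self.rng = random.Random(seed)
        self.calls = 0

    def act(self, observation):
        self.calls += 1
        return self.rng.randrange(self.num_actions)


def run_episode(seed=0):
    env = ToyCatMouseEnv()
    cat_policy = RandomPolicy(env.num_actions("cat_0"), seed)
    mouse_policy = RandomPolicy(env.num_actions("mouse_0"), seed + 1)
    # Both mice point at the same policy object (parameter sharing).
    policies = {"cat_0": cat_policy, "mouse_0": mouse_policy, "mouse_1": mouse_policy}

    observations, infos = env.reset(seed=seed)
    steps = 0
    while env.agents:
        actions = {a: policies[a].act(observations[a]) for a in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        steps += 1
    return steps, cat_policy.calls, mouse_policy.calls
```

In this toy loop the shared mouse policy is queried twice per step, once per mouse, which is the practical consequence of sharing parameters across mouse_0 and mouse_1. A recurrent policy would additionally carry a per-agent hidden state between calls; that is omitted here.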

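Chinese Introduction

This is a cat-and-mouse grid world project for multi-agent reinforcement learning research.

  • The environment is a POMDP built on the PettingZoo Parallel API
  • The participating agents are cat_0, mouse_0, and mouse_1
  • Supports local observations, RNN memory, asymmetric actions, and adversarial training
  • Provides mainline implementations of MAPPO, IPPO, and QMIX
  • Earlier exploratory work on game-theoretic algorithms is archived to keep the main repository structure clean

Chinese entry: docs/README.md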

English Overview

This project studies multi-agent pursuit, evasion, navigation, and resource collection under partial observability.

English entry: README_EN.md

Quick Start

conda env create -f environment.yml
conda activate cat_mouse_marl

Run a regression check:

python -m unittest tests.test_grid_core_regressions

Train a baseline:

python scripts/train_mappo.py --run-dir runs/quick_mappo --device cuda

Evaluate a checkpoint:

python scripts/evaluate_task_events.py --mouse-policy model --mouse-checkpoint runs/quick_mappo/checkpoints/mappo_final.pt --cat-policy random --device cuda

Record a match:

python scripts/play_match.py --cat-checkpoint runs/quick_mappo/checkpoints/mappo_final.pt --mouse-checkpoint runs/quick_mappo/checkpoints/mappo_final.pt
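
The four commands above can also be assembled from a single run directory, which helps keep the checkpoint path consistent between training, evaluation, and recording. The helper below is a hypothetical convenience, not part of the repository. It only reuses the script names and flags shown above, and it assumes that training writes its final checkpoint to checkpoints/mappo_final.pt under the run directory, as the example paths suggest.

```python
from pathlib import PurePosixPath


def quick_start_commands(run_dir="runs/quick_mappo", device="cuda"):
    """Build the Quick Start commands as argument lists (hypothetical helper)."""
    # Assumed checkpoint location, inferred from the example paths in this README.
    checkpoint = str(PurePosixPath(run_dir) / "checkpoints" / "mappo_final.pt")
    return {
        "regression": ["python", "-m", "unittest", "tests.test_grid_core_regressions"],
        "train": [
            "python", "scripts/train_mappo.py",
            "--run-dir", run_dir,
            "--device", device,
        ],
        "evaluate": [
            "python", "scripts/evaluate_task_events.py",
            "--mouse-policy", "model",
            "--mouse-checkpoint", checkpoint,
            "--cat-policy", "random",
            "--device", device,
        ],
        "record": [
            "python", "scripts/play_match.py",
            "--cat-checkpoint", checkpoint,
            "--mouse-checkpoint", checkpoint,
        ],
    }
```

Each list can be passed to subprocess.run(cmd, check=True) from the repository root, for example subprocess.run(quick_start_commands()["train"], check=True).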

Documentation

Repository Layout

  • envs/: environment core and PettingZoo wrapper
  • algorithms/: recurrent MARL implementations
  • scripts/: training, evaluation, recording, and diagnostics
  • utils/: policies, logging, and visualization helpers
  • docs/: tracked public documentation
  • tests/: regression checks and API tests
  • archive/: archived exploratory work outside the current mainline
  • runs/: local experiment outputs, ignored by git by default
  • demos/: local recorded videos, ignored by git by default

Release Notes

  • The repository is organized so source code, docs, and tests are tracked in git.
  • Large or fast-changing experiment artifacts stay local under runs/ and demos/.
  • Artifact indexes that should appear on GitHub are mirrored under docs/.

License

MIT. See LICENSE.
