From 87226cc11cefdef72d9bfc86fe424dc90f8561b2 Mon Sep 17 00:00:00 2001 From: Tomas Laurenzo Date: Sat, 9 May 2026 20:26:39 -0300 Subject: [PATCH] Update project status audit --- STATUS.md | 464 +++++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 373 insertions(+), 91 deletions(-) diff --git a/STATUS.md b/STATUS.md index 13d7165..6502a19 100644 --- a/STATUS.md +++ b/STATUS.md @@ -1,159 +1,441 @@ # BatLLM Status -Last updated: 2026-05-09 18:35 +Last updated: 2026-05-09 23:25 ## Project Purpose -BatLLM is a Python/Kivy research, education, and game project focused on AI-mediated gameplay, prompt quality, LLM behaviour, and local-model workflows. The repository provides a playable turn-based app and a read-only analyser for replaying and inspecting saved sessions. +BatLLM is a Python/Kivy research, education, and game project for exploring AI-mediated play, prompt quality, LLM behaviour, and local-model workflows. The repository currently contains a playable local desktop game, a standalone read-only Game Analyzer, local Ollama lifecycle and model-management helpers routed through `modelito`, release-bundle tooling, Homebrew formula generation, generated API reference artefacts, and maintained user/developer documentation. + +The project should remain practical, critical, and educational. Destructive or expensive local-model actions must stay explicit because BatLLM can start and stop a real Ollama service, download or delete models, and save user-created sessions. ## Setup And Run Instructions -macOS/Linux: +### Supported Runtime + +- Python: `>=3.10` and `<3.13` enforced by the launcher compatibility helper. +- Main UI framework: Kivy `2.3.1` plus KivyMD `1.2.0`. +- LLM/runtime integration: Ollama through `modelito==1.4.0` and `ollama==0.5.3`. +- Default shipped model: `smollm2` with first-run `last_served_model` intentionally blank. +- Repository version: `0.3.5`. 
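The supported Python range listed above can be written as a small guard. This is a minimal sketch under stated assumptions, not the launcher's actual compatibility helper: the function name `is_supported_python` and both constants are hypothetical, and only the `>=3.10` and `<3.13` bounds come from this report.

```python
import sys

# Bounds taken from the "Supported Runtime" list; the names are illustrative only.
MIN_SUPPORTED = (3, 10)   # inclusive lower bound (">=3.10")
MAX_EXCLUSIVE = (3, 13)   # exclusive upper bound ("<3.13")


def is_supported_python(version_info=None):
    """Return True when the (major, minor) version falls in the supported range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor < MAX_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the supported range"
    print(f"Python {current} is {status}")
```

Comparing `(major, minor)` tuples avoids string comparisons, which would wrongly sort `3.9` after `3.10`.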
+ +### macOS/Linux ```bash git clone https://github.com/krahd/BatLLM.git cd BatLLM python3 -m venv .venv_BatLLM source .venv_BatLLM/bin/activate +python -m pip install --upgrade pip pip install -r requirements.txt python run_batllm.py ``` -Windows: +### Windows PowerShell ```powershell git clone https://github.com/krahd/BatLLM.git cd BatLLM py -m venv .venv_BatLLM .\.venv_BatLLM\Scripts\Activate.ps1 +python -m pip install --upgrade pip pip install -r requirements.txt python run_batllm.py ``` -Standalone analyser: +### Standalone Game Analyzer ```bash python run_game_analyzer.py ``` +### Test Runner + +```bash +python -m pytest -q +python run_tests.py core +python run_tests.py full +``` + +`run_tests.py full` requires `.venv_BatLLM` and may start/stop a real local Ollama service. Use it only when local Ollama state is safe to exercise. + +### Useful Environment Variables + +- `BATLLM_HOME`: redirects mutable config and saved-session data away from the repository or package install location. +- `PYTHONPATH=src`: needed when running modules directly without the root launchers. +- `KIVY_WINDOW=mock`, `KIVY_NO_ARGS=1`, `KIVY_NO_CONSOLELOG=1`: useful for headless CI/test runs. +- `BATLLM_RUN_OLLAMA_SMOKE=1`: enables gated Ollama smoke tests. + +## Thorough Audit Snapshot + +This status update followed a repository-wide audit on 2026-05-09. The audit inspected tracked files, top-level project structure, maintained documentation, source modules, tests, packaging tools, CI workflows, configuration defaults, generated documentation artefacts, and current git state. + +### Repository Inventory + +- Tracked files: 611. +- Tracked source/test/application files under `src/`: 82. +- Tracked documentation files and generated API artefacts under `docs/`: 505. +- Test files: 13 `test_*.py` files with 153 collected test functions by static scan. 
+- Key top-level launchers and tooling: `run_batllm.py`, `run_game_analyzer.py`, `run_tests.py`, `create_release_bundles.py`, `create_homebrew_formula.py`, `validate_packaging_smoke.py`, `start_ollama.sh`, `stop_ollama.sh`, `scripts/cmr-r`, and `tools/ollama_mock_server.py`. +- CI workflows present: `.github/workflows/multiplatform.yml` and `.github/workflows/publish-homebrew-tap.yml`. +- Packaging subtree present: `packaging/homebrew/README.md` and `packaging/homebrew/requirements.txt`. + +### Notable Audit Findings + +- `STATUS.md` previously used external SVG image links for diagrams. This update replaces that with inline SVG diagrams to match repository agent instructions. +- `src/configs/config-llama.yaml` still contains two stale `src/headers/system_instructions/...` paths for augmented modes. The shipped default `src/configs/config.yaml` points to existing `src/assets/system_instructions/...` files. +- `src/configs/config.yaml.bak` is tracked and appears to be a mutable configurator backup with a different model (`mistral-small:latest`) and altered gameplay values. It should be reviewed before release because backup/config artefacts are normally risky to keep in source control. +- A tracked top-level file named `sdf` appears to contain captured Ollama/model-pull terminal output with ANSI control sequences and model warm-up text. It is not referenced by the audited launchers, docs, or tests and should be reviewed for removal or archival. +- `run_tests.py` defines `OLLAMA_HELPER = ROOT / "src" / "ollama_service.py"`, but no such file exists and the constant is unused in the inspected runner. +- `docs/ROADMAP.md` still describes the "current 0.2.x line" even though `VERSION` is `0.3.5`; it should be refreshed in a future documentation pass. 
+- `docs/images/architecture-modelito.svg` still says "Modelito 1.2.2 Cleanup" in its title while the dependency is `modelito==1.4.0`; the inline diagram below uses current wording, but the standalone image should be updated or removed later. +- Generated Doxygen output under `docs/code/` is large and tracked. It appears intentional, but it dominates repository size and should be regenerated only as part of deliberate API-documentation updates. +- The current git worktree was clean before this status update. + ## Current Implementation State -- Main launcher: `run_batllm.py` -- Standalone analyser launcher: `run_game_analyzer.py` -- Source tree under `src/` with Kivy UI, gameplay logic, model orchestration, history, and replay support -- Local Ollama workflow for install/start/stop, model selection, model download/delete -- Saved-session v2 envelope with per-round gameplay snapshots and saved LLM metadata snapshot support -- Cross-platform release-bundle generation and Homebrew packaging support -- Maintained docs in `docs/` +### Main Application -The codebase is currently aligned on `modelito==1.4.0` and now uses structured readiness handling plus configurable warmup timeout in the local Ollama startup path. +- `run_batllm.py` bootstraps `src/` onto `sys.path`, enforces the supported Python range, imports `main.main`, and starts the desktop app. +- `src/main.py` builds the KivyMD application, registers Kivy resource paths, loads KV screens, manages the screen manager, handles startup Ollama flow, and supports guarded shutdown behaviour. +- `src/view/home_screen.py`, `src/view/settings_screen.py`, `src/view/history_screen.py`, and `src/view/ollama_config_screen.py` provide the main gameplay, settings, history, and model-management screens. +- `src/game/game_board.py`, `src/game/bot.py`, and `src/game/bullet.py` implement the live arena, bots, movement, shooting, shields, turns, rounds, and rendering behaviour. 
+- `src/game/ollama_connector.py` manages per-bot/shared prompt history, builds modelito-compatible messages, resolves request options, invokes the LLM provider, and handles timeout errors. -## Architecture Overview +### Game Analyzer + +- `run_game_analyzer.py` bootstraps `src/`, enforces Python compatibility, imports `analyzer_main.main`, and starts the standalone analyzer. +- `src/analyzer_main.py` provides the KivyMD analyzer app shell. +- `src/analyzer_model.py` loads validated saved-session payloads and exposes game/round/turn navigation data. +- `src/view/analyzer_load_screen.py`, `src/view/analyzer_review_screen.py`, and `src/view/analyzer_board.py` provide recent-session loading, replay navigation, read-only board rendering, metadata inspection, and playback controls. + +### Session, History, And Replay + +- `src/game/history_manager.py` records games, rounds, turns, prompt/response histories, state snapshots, winners, and saved-session exports. +- `src/game/session_schema.py` builds and validates the saved-session v2 envelope and rejects unsupported legacy sessions. +- `src/game/replay_engine.py` is a pure replay helper layer used by both gameplay tests and the analyzer. It normalises bot state, parses commands, applies movement/rotation/shooting, resolves shots, and compares replayed state with captured state. +- Saved sessions include gameplay settings snapshots and LLM metadata snapshots so analyzer review can explain the model/runtime context used by a session. 
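The replay layer's last step, comparing replayed state with captured state, can be illustrated with a small helper. This is a hedged sketch only: `diff_states`, the tolerance value, and the field names in the usage note are assumptions made for illustration, and they are not taken from `src/game/replay_engine.py` or the saved-session schema.

```python
import math


def diff_states(replayed, captured, tol=1e-6):
    """Return {field: (replayed_value, captured_value)} for every mismatching field.

    Numeric fields are compared with an absolute tolerance so that tiny
    floating-point drift during replay is not reported as a divergence.
    """
    mismatches = {}
    for key in sorted(set(replayed) | set(captured)):
        a, b = replayed.get(key), captured.get(key)
        both_numeric = (
            isinstance(a, (int, float)) and isinstance(b, (int, float))
            and not isinstance(a, bool) and not isinstance(b, bool)
        )
        if both_numeric:
            same = math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
        else:
            same = a == b
        if not same:
            mismatches[key] = (a, b)
    return mismatches
```

With a hypothetical snapshot such as `{"x": 0.5, "y": 0.25, "rot": 90.0, "health": 30}`, an empty result means the replay reproduced the captured state within tolerance. A non-empty result names the fields that diverged.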
-- Launchers: `run_batllm.py`, `run_game_analyzer.py` -- Main app entrypoint: `src/main.py` -- LLM/runtime orchestration: `src/llm/service.py` -- Ollama config UI: `src/view/ollama_config_screen.py` and KV layout -- Session export and schema: `src/game/history_manager.py`, `src/game/session_schema.py` -- Analyzer model/inspector UI: `src/analyzer_model.py`, `src/view/analyzer_review_screen.py` -- Tests: `src/tests/` +### Configuration And Mutable State -### Architecture Diagram +- `src/configs/config.yaml` is the shipped default configuration. +- `src/configs/app_config.py` loads hard-coded defaults, overlays shipped config, and optionally overlays a mutable user config resolved through `BATLLM_HOME`. +- `src/util/paths.py` centralises repository paths, asset/view paths, user-writable `BATLLM_HOME` paths, and saved-session directory resolution. +- `src/configs/configurator.py` is a separate Kivy configuration GUI with YAML editing, snapshots, Ollama controls, model utilities, and a console panel. It is present but not wired through the main root launcher. -![BatLLM architecture](docs/images/architecture-modelito.svg) +### Ollama And Modelito Integration -### Runtime Flow Diagram +- `src/llm/service.py` is the central facade over `modelito.ollama_service` and `modelito.providers.ollama`. +- The facade handles config loading, endpoint construction, service state inspection, local model listing, remote model listing, downloads, deletes, start/stop, warm-up, timeout resolution, common model timeout defaults, metadata snapshots, and CLI lifecycle commands. +- The Ollama config screen uses structured readiness results and configurable warm-up timeout to report startup phases and errors more clearly. +- Homebrew packaging intentionally keeps mutable config and saved sessions outside the install cellar by using `BATLLM_HOME`. + +### Packaging And Release Tooling + +- `create_release_bundles.py` creates versioned source and platform archives under `dist/releases/`. 
+- `create_homebrew_formula.py` renders a source-based `batllm` formula, supports worktree archive generation, and can target tags or branches. +- `src/util/packaging_smoke.py` and `validate_packaging_smoke.py` validate expected release artefacts, required launcher members, and optional installer/Homebrew smoke paths. +- `.github/workflows/multiplatform.yml` runs Linux, Windows, and macOS tests/build checks, a Homebrew dry run, and mock-Ollama smoke coverage. +- `.github/workflows/publish-homebrew-tap.yml` publishes the formula to `krahd/homebrew-tap` for version tags or manual dispatch when `HOMEBREW_TAP_TOKEN` is configured. + +## Architecture Overview -![BatLLM runtime flow](docs/images/request-flow-modelito.svg) 
+### Inline Architecture Diagram + +[Inline SVG diagram, text content only. Title: BatLLM Current Architecture. Description: Architecture diagram showing launchers, Kivy surfaces, game logic, replay logic, configuration, the BatLLM LLM service facade, modelito, Ollama, packaging, tests, and documentation. Boxes: Root launchers (run_batllm.py, run_game_analyzer.py); Gameplay Kivy app (main.py, home/settings/history, game_board.py, bot.py, bullet.py, ollama_config_screen.py); Standalone analyzer (analyzer_main.py, analyzer_model.py, load/review/board screens); Config and state (config.yaml, app_config.py, BATLLM_HOME overlays, saved sessions, prompt/system assets); LLM service facade (src/llm/service.py; timeouts, readiness, metadata; start/stop/download/delete); Runtime (modelito 1.4.0, local Ollama, models/server); Replay/session layer (history_manager.py, session_schema.py, replay_engine.py); Docs and API reference (docs/*.md, docs/code generated output); Tests and smoke tools (src/tests/*.py, ollama_mock_server.py, packaging_smoke.py); Release packaging (release bundles, Homebrew formula, GitHub Actions).] + 
+### Inline Runtime Flow Diagram + +[Inline SVG diagram, text content only. Title: BatLLM Runtime Flow. Description: Runtime flow from configuration and model selection through model readiness, prompt execution, turn resolution, persistence, and analyzer replay. Nodes: Load configuration (defaults, shipped YAML, optional BATLLM_HOME overlay); Select model (Ollama screen or defaults; local or remote inventory); decision: Local model available?; Ensure model readiness (resolve timeout and warm-up, record last_served_model, report structured errors); Download remote model (stream progress through modelito, refresh local inventory); Submit player prompt (build command context, use independent/shared history); Model provider call (OllamaConnector to modelito, assistant response text); Resolve turn (parse move/rotate/shoot/shield, update board and history); Save or analyse (v2 session payload, replay in analyzer).] + 
 ## Important Files And Directories -- `AGENTS.md`: canonical agent instructions for this repository -- `STATUS.md`: complete project status report -- `requirements.txt`: root Python dependency pins -- `packaging/homebrew/requirements.txt`: Homebrew runtime dependency subset -- `src/configs/`: shipped and runtime configuration defaults -- `docs/README.md`, `docs/USER_GUIDE.md`, `docs/CONTRIBUTING.md`: maintained user and developer docs - -## Recent Changes - -- Re-ran the full automatable release-readiness pass on 2026-05-09; all non-manual validation and packaging steps completed successfully. -- Release-preparation work is functionally complete for the automatable path: release bundles build, Homebrew formula generation works, and packaging smoke validation passes. -- The repository remains on version `0.3.5`; the earlier `1.0.0` notes are retained only as draft reference material and are not the active release line. -- Root launchers and `run_tests.py` were stabilised so repository-root execution works without ad-hoc `PYTHONPATH` setup. 
-- `src/llm/service.py` now keeps explicit install-command mapping deterministic for tests while preserving modelito runtime auto-detection when no platform override is supplied. -- `.github/workflows/multiplatform.yml` is aligned with the Homebrew dry-run job's headless Kivy/PYTHONPATH requirements. -- Mock-Ollama smoke coverage now tolerates the expected CI state where the service can be reachable with `installed=False` and no version string. -- GitHub branch protection on `main` now uses the live required check names published by GitHub Actions: `ubuntu-latest`, `windows-latest`, `macos-latest`, `Homebrew dry-run`, and `Smoke: Ollama integration`. -- Maintained release documents were updated to reflect the current `0.x` hold pending manual maintainer sign-off. +- `AGENTS.md`: canonical operating instructions for coding agents in this repository. +- `STATUS.md`: this complete project status report; must be updated with any project-state change. +- `VERSION`: active repository version (`0.3.5`). +- `requirements.txt`: root development/runtime dependency pins. +- `pytest.ini`: pytest path and discovery configuration. +- `.github/workflows/`: CI and Homebrew tap publication workflows. +- `run_batllm.py`: main application launcher. +- `run_game_analyzer.py`: standalone Game Analyzer launcher. +- `run_tests.py`: cross-platform core/full test runner. +- `src/`: application, game, analyzer, utility, and test source. +- `src/app.kv` and `src/view/*.kv`: Kivy layout definitions. +- `src/assets/`: images, prompts, sounds, and system instructions. +- `src/configs/`: shipped/default config, alternate config, app config loader, and configurator GUI. +- `src/game/`: live game state, bot/bullet primitives, history, LLM connector, replay engine, and session schema. +- `src/llm/service.py`: central Ollama/modelito service facade. +- `src/tests/`: automated tests and smoke helpers. +- `src/util/`: compatibility, path, packaging-smoke, version, and UI utility helpers. 
+- `src/view/`: Kivy screen classes and UI helpers. +- `docs/`: maintained user/developer docs, screenshots, diagrams, and generated API docs. +- `packaging/homebrew/`: Homebrew distribution docs and pinned formula requirements. +- `tools/ollama_mock_server.py`: local mock server for Ollama integration smoke tests. + +## Documentation State + +- `docs/README.md` is the canonical project overview and includes setup, Homebrew install notes, concepts, compatibility, troubleshooting, and glossary material. +- `docs/USER_GUIDE.md` is the user-facing manual for gameplay, commands, screens, settings, analyzer use, sessions, and troubleshooting. +- `docs/CONTRIBUTING.md` is the developer manual for setup, architecture, tests, release workflow, docs workflow, coding conventions, and troubleshooting. +- `docs/ROADMAP.md` describes 1.0 local desktop hardening and 2.0 networked-play direction, but its opening version wording needs a minor refresh from `0.2.x` to the current `0.3.x` line. +- `docs/RELEASE_CRITERIA_1_0.md` defines CI, reliability, UX, bundle, and documentation gates for a future 1.0 candidate. +- `docs/CHANGELOG.md` keeps active unreleased notes on the `0.x` hold and draft 1.0 notes. +- `docs/FIRST_RUN_RELEASE_CHECKLIST.md` and `docs/UI_UNIFICATION_PLAN_1_0.md` remain release-preparation references. +- `docs/code/` contains generated Doxygen HTML/LaTeX output and should be treated as generated documentation. ## Tests And Verification Status -Recent verified commands: +### Latest Commands Run For This Audit -- `/Users/tom/devel/ml-llm/llm/BatLLM/.venv_BatLLM/bin/python -m pytest -q` -> `151 passed, 2 skipped`. -- `/Users/tom/devel/ml-llm/llm/BatLLM/.venv_BatLLM/bin/python run_tests.py full` -> core smoke `4 passed`; full suite `153 passed`; live-Ollama lifecycle start/stop verified. 
-- `/Users/tom/devel/ml-llm/llm/BatLLM/.venv_BatLLM/bin/python create_release_bundles.py` -> generated `dist/releases/BatLLM-v0.3.5-{source.zip,source.tar.gz,windows.zip,macos.zip,linux.tar.gz}`. -- `/Users/tom/devel/ml-llm/llm/BatLLM/.venv_BatLLM/bin/python create_homebrew_formula.py --create-worktree-archive /tmp/BatLLM-homebrew-source.tar.gz --formula-out /tmp/batllm.rb` -> completed successfully and wrote `/tmp/batllm.rb`. -- `/Users/tom/devel/ml-llm/llm/BatLLM/.venv_BatLLM/bin/python -m pytest -q src/tests/test_homebrew_packaging.py` -> `7 passed`. -- `/Users/tom/devel/ml-llm/llm/BatLLM/.venv_BatLLM/bin/python validate_packaging_smoke.py` -> `Packaging smoke validation passed.` -- `gh api repos/krahd/BatLLM/branches/main/protection --jq '.required_status_checks.contexts'` -> confirmed required checks: `ubuntu-latest`, `windows-latest`, `macos-latest`, `Homebrew dry-run`, `Smoke: Ollama integration`. +- `pwd && rg --files -g 'AGENTS.md' -g '!**/.git/**' -g '!**/__pycache__/**' && git status --short` -> passed; confirmed repository path, only root `AGENTS.md`, and initially clean worktree. +- `find . -maxdepth 2 -type f ...` plus `rg --files -g '*.py' ...` -> passed; inventoried top-level files and Python files. +- Documentation/source inspection commands using `sed`, `find`, `git ls-files`, and AST parsing -> passed; informed this status report. +- `rg '^def test_' src/tests -c | awk ...` -> passed; statically counted 153 test functions across 13 test files. +- `python -m pytest -q` -> failed during collection in this container because the default `python` is Python 3.14.4 and does not have required dependencies installed (`ModuleNotFoundError: No module named 'yaml'`). +- `python3.12 -m compileall -q src run_batllm.py run_game_analyzer.py run_tests.py create_release_bundles.py create_homebrew_formula.py validate_packaging_smoke.py` -> passed; source and launcher files compile under Python 3.12. 
+- `python - <<'PY' ...` timestamp-format check -> passed; top and bottom `Last updated` lines match and use the required format. + +### Recent Previously Recorded Validation + +The previous status report recorded these successful checks from the same release-hardening period. They remain useful historical evidence but were not rerun as part of this documentation-only audit unless listed above. + +- `python -m pytest -q` -> `151 passed, 2 skipped`. - `python run_tests.py full` -> core smoke `4 passed`; full suite `153 passed`; live-Ollama lifecycle start/stop verified. -- `pytest -q` -> `151 passed, 2 skipped`. -- `pytest -q src/tests/test_homebrew_packaging.py` -> `7 passed`. -- `pytest -q src/tests/test_multiplatform_support.py::test_install_command_for_current_platform_is_platform_specific src/tests/test_ollama_config_screen_logic.py::test_build_ollama_install_command_is_platform_specific src/tests/test_homebrew_packaging.py` -> `9 passed`. -- `python create_release_bundles.py` -> generated expected platform and source archives under `dist/releases/`. -- `python create_homebrew_formula.py --create-worktree-archive /tmp/BatLLM-homebrew-source.tar.gz --formula-out /tmp/batllm.rb` -> completed successfully and wrote `/tmp/batllm.rb`. +- `python create_release_bundles.py` -> generated expected `BatLLM-v0.3.5` source and platform archives under `dist/releases/`. +- `python create_homebrew_formula.py --create-worktree-archive /tmp/BatLLM-homebrew-source.tar.gz --formula-out /tmp/batllm.rb` -> completed successfully. +- `python -m pytest -q src/tests/test_homebrew_packaging.py` -> `7 passed`. 
- `python validate_packaging_smoke.py` -> `Packaging smoke validation passed.` -- `python validate_packaging_smoke.py --run-installer-smoke` -> `Packaging smoke validation passed.` -- `python validate_packaging_smoke.py --skip-release-bundles --skip-homebrew --run-homebrew-install-smoke --homebrew-install-timeout 1800` -> `Packaging smoke validation passed.` -- `python -m py_compile run_batllm.py run_game_analyzer.py run_tests.py create_release_bundles.py create_homebrew_formula.py validate_packaging_smoke.py src/llm/service.py src/util/packaging_smoke.py` -> completed successfully (no syntax errors). -- `doxygen docs/code/dox_config.properties` -> completed successfully; generated docs align with version `0.3.5`. -- `gh api repos/krahd/BatLLM/branches/main/protection` -> verified the live required-check configuration on `main`. +- GitHub branch-protection API check previously confirmed required check names: `ubuntu-latest`, `windows-latest`, `macos-latest`, `Homebrew dry-run`, and `Smoke: Ollama integration`. -## Known Issues, Risks, And Limitations +### Validation Not Run In This Audit -- `src/llm/service.py` still owns BatLLM-specific overlay logic and timeout policy; avoid re-expanding it into a generic provider abstraction already handled by `modelito`. -- Homebrew install-level smoke depends on temporary local tap setup because current Homebrew rejects direct file-based formula installs outside a tap. -- First-run checklist execution is complete for command-level macOS validation but still requires manual Linux and Windows first-run pass/sign-off. +- The Kivy desktop app was not launched interactively with `python run_batllm.py` in this non-interactive environment. +- The standalone analyzer was not launched interactively with `python run_game_analyzer.py` in this non-interactive environment. +- Live Ollama lifecycle tests were not run during this audit to avoid mutating local model/service state. 
+- Release bundle generation and Homebrew install smoke tests were not rerun during this audit because this change updates only `STATUS.md`. -## Recurring Tasks +## Known Issues, Risks, And Limitations -- Keep hygiene checks current as packaging and release scripts evolve. -- Keep STATUS and maintained docs aligned with any additional runtime or packaging changes. -- Keep maintenance-level monitoring in place for future `modelito` and Homebrew policy changes. +- The project is still on `0.3.5`; 1.0 materials are release-planning/draft references, not an active shipped 1.0 release line. +- Local Ollama operations are inherently stateful. Starting/stopping the service, warming models, downloading models, or deleting models can affect real user state. +- GUI validation is limited in headless/non-interactive environments; many UI paths rely on Kivy event-loop behaviour and manual spot checks. +- `run_tests.py full` can affect a real Ollama service and should be run only with explicit maintainer intent. +- `src/configs/config-llama.yaml` contains stale augmented-system-instruction paths under `src/headers/...` that do not match the current asset tree. +- Tracked `src/configs/config.yaml.bak` appears to be mutable local backup data and should be reviewed before any release freeze. +- Tracked `sdf` appears to be a captured Ollama terminal log and is likely accidental or at least unexplained repository material. +- `run_tests.py` has an unused `OLLAMA_HELPER` constant pointing to a missing `src/ollama_service.py` path. +- `docs/ROADMAP.md` has stale wording that describes the current line as `0.2.x`. +- Standalone diagram files under `docs/images/` still include modelito 1.2.2 wording even though current requirements pin modelito 1.4.0. +- Generated API docs under `docs/code/` may become stale when source changes unless regenerated deliberately. +- Homebrew distribution remains source-based and macOS/Apple-Silicon oriented. 
+- The saved-session v2 schema is the supported path; unsupported legacy sessions are intentionally rejected by schema helpers. ## Pending Tasks -- Complete manual first-run checklist execution on Linux and Windows hosts and update `docs/FIRST_RUN_RELEASE_CHECKLIST.md` sign-off to fully complete. -- Final maintainer release actions: freeze scope on a release branch, rerun required CI checks, and tag `v0.3.5` only after the manual platform sign-offs are complete. +### High Priority Before Release Freeze -## Next Steps — Remaining Before Tagging v0.3.5 +- Decide whether to remove, ignore, or document the tracked `sdf` terminal-capture file. +- Decide whether tracked `src/configs/config.yaml.bak` is intentional; remove it or document why it is shipped. +- Fix stale `src/configs/config-llama.yaml` augmented-system-instruction paths. +- Remove or update the unused `OLLAMA_HELPER` constant in `run_tests.py`. +- Refresh `docs/ROADMAP.md` opening version wording from `0.2.x` to the current `0.3.x` line. +- Update or retire standalone `docs/images/*modelito.svg` files whose titles mention modelito 1.2.2. -Automatable release-preparation work already completed: +### Validation Pending -1. Release-bundle creation, Homebrew formula rendering, packaging smoke checks, and full automated test coverage have been exercised successfully. -2. Active repository versioning, Doxygen metadata, and maintained release documents are aligned on `0.3.5` rather than a `1.0` tag. -3. Branch protection on `main` now uses the live GitHub Actions check names and strict up-to-date enforcement. -4. Launcher, runtime-install, and mock-smoke regressions identified during CI review have been fixed and revalidated. +- Rerun `python -m pytest -q` in a supported Python environment with `requirements.txt` installed; the container default Python 3.14 environment lacks required dependencies. 
+- Optionally rerun `python validate_packaging_smoke.py` if release artefacts are expected to remain valid in the current environment. +- Rerun `python run_tests.py full` only in an environment where live Ollama start/stop is acceptable. +- Perform manual GUI smoke checks for `python run_batllm.py` and `python run_game_analyzer.py` before release tagging. -Remaining external maintainer actions (not fully automatable from this workspace): +### Documentation Pending -1. Complete Linux and Windows manual first-run checklist execution and finalise checklist sign-off as complete. -2. Perform final release branch freeze/tag workflow (`v0.3.5`) once the manual platform sign-offs are complete. +- Keep `STATUS.md` current after every project-state change. +- Ensure `docs/README.md`, `docs/USER_GUIDE.md`, and `docs/CONTRIBUTING.md` stay aligned with UI labels, release workflow, and config defaults. +- Keep `docs/CHANGELOG.md` clear that 1.0 notes remain draft until an actual `v1.0.0` tag is prepared. +- Regenerate `docs/code/` only when API documentation updates are intentional. -No additional automatable next steps remain open in this workspace. +## Next Steps -## Longer-Term Steps +1. Run the narrow validation for this documentation update: `python -m pytest -q`. +2. Review and address the repository-hygiene findings (`sdf`, `config.yaml.bak`, stale alternate config paths, unused test-runner constant). +3. Refresh stale documentation/diagram wording identified by the audit. +4. Run full non-live CI-equivalent checks across source, tests, packaging smoke, and Homebrew formula generation. +5. Schedule a maintainer-owned live Ollama validation pass before any release candidate. -1. Decide GUI direction for web-app surface versus Kivy-only roadmap. Analysis: this is the highest-leverage product decision because multiplayer, prompt sharing, and deployment choices all depend on whether Kivy remains the only client. 
Kivy and a web surface can coexist, but the added complexity yields no clear gain before 1.0. Decision: defer a web surface until after 1.0; revisit only if cross-device or teacher-mode use cases demand it. Comparable projects (Jan, GPT4All) gained traction without a web surface first. -2. LAN multiplayer support. Analysis: best first networking milestone; deterministic replay must be preserved by introducing an authoritative turn-order protocol and a schema-stable, versioned event log before any socket code is written. The replay schema must be frozen (versioned and migration-tested) as a hard precondition — any multiplayer session that references an unstable schema is fragile. Multiplayer will also require new GUI modes. -3. Internet multiplayer support. Analysis: should follow LAN once identity, matchmaking, and anti-abuse controls are defined. Additional precondition not in earlier steps: once strangers can play together, prompt injection into the LLM path becomes a meaningful attack surface and must be explicitly scoped and mitigated before public exposure. -4. Prompt library/repository and examples. Analysis: the session schema/version governance precondition remains correct, but this item is strategically more important than its position implies. A shared prompt library is the primary educational centrepiece of the project — the mechanism by which players compare model behaviours across scenarios. Promptfoo, DeepEval, and Langfuse build their community value around shared test cases; BatLLM's equivalent is game-framed prompts. Schema stability is a precondition, not the goal. -5. Additional computer-controlled player modes. Analysis: lowest infrastructure risk; can progress in parallel with 1.x maintenance. Educational value is higher than it appears: distinct bot personalities (aggressive, defensive, random, rules-only) give players a direct way to observe and compare model behavioural differences, which ties directly to the core learning loop. 
GUI impact is manageable within the existing screen structure. -6. Experiment/Evaluation Mode. Analysis: batch-run scenarios with fixed seeds across model and prompt configurations and produce a scored outcome log. This is the highest-differentiating capability BatLLM could offer that no comparable project provides in a game context. A scoped first version is achievable before 1.0 because seed-based replay and outcome logging are already partially implied by the existing replay engine. Precondition: deterministic seed propagation must be verified across the full game loop. -7. GUI improvements and redesigns (two separate tracks). Track A — incremental polish: can happen continuously within the 1.x cycle without a discrete decision point. Track B — UI/UX redesign: a discrete strategic decision, best deferred to 1.1 alongside any webapp or multiplayer work. A reasonable sequencing is: reach 1.0 with Kivy-only, polished incrementally; then plan 1.1 around a unified rethink of UI/UX, webapp surface, and multiplayer simultaneously. +## Longer-Term Steps ---- +- Complete 1.0 local-desktop release hardening: UX consistency, first-run reliability, failure recovery, docs alignment, and platform bundle verification. +- Preserve deterministic replay and saved-session compatibility while improving gameplay and analyzer polish. +- Continue reducing Kivy-bound coupling in core game/session/replay logic to prepare for the planned 2.0 networked architecture. +- Design the 2.0 server contract before adding web or repository-backed prompt/game sharing. +- Add broader tests for malformed model responses, slow startup, missing models, session compatibility, analyzer edge cases, and packaged first-run behaviour. -Last updated: 2026-05-09 18:35 +Last updated: 2026-05-09 23:25