AGENTS.md

This file gives working instructions for AI-assisted development of molsemble.

Project summary

molsemble is a Python/CLI package for explicit, restartable molecular ensemble workflows that preserve computational chemistry artifacts and export structured computational-result and provenance tables for ML-oriented chemistry work.

The first alpha focuses on CSV + SMILES input, RDKit conformer generation, ORCA geometry optimization, ORCA single-point calculation stages, filesystem provenance, restart/resume, and structured computational-product export. Descriptor-rich exports are a central post-alpha goal, to be pursued once Multiwfn or equivalent postprocessing is added.

Required working mode

Before implementation work, read the active project context:

  1. project/PROJECT.md
  2. project/DESIGN.md
  3. project/ARCHITECTURE.md
  4. project/STATUS.md
  5. project/BACKLOG.md
  6. the active task file in planning/tasks/, when one exists

Work from explicit task files in planning/tasks/. project/BACKLOG.md is planning context, not authorization to implement broad work.

Do not read historical context by default. Completed tasks, completed sessions, archived decisions, superseded ADRs, and old milestone notes are historical context. Read them only when the active task references them, when investigating a specific past decision, or when the user asks for project-history or process analysis.

Do not silently change project scope, design semantics, architecture, alpha boundaries, or dependency policy. If a design or architecture change is needed, stop the current implementation path, update the appropriate project document, and record the decision in project/decisions.md. Use ADRs for large architectural choices.

Respect each task's Scope and Non-scope sections. Do not implement adjacent features merely because they are convenient.

Task, branch, and commit discipline

Development should proceed task by task. The default branch model is:

  • main is the last stable or approved branch.
  • dev is the main development integration branch.
  • task branches normally start from dev and are named task-XXX-short-slug.
  • reviewed task branches are normally squash-merged into dev as one task-level commit.
  • promotion from dev to main is done manually through a GitHub pull request and release gate.
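
As a minimal sketch of the task-branch naming convention above, a branch name could be checked with a regular expression. This file does not define what XXX stands for, so the three-digit task number and the lowercase hyphenated slug are assumptions, and is_task_branch is a hypothetical helper, not part of molsemble.

```python
import re

# Assumption: "XXX" is a three-digit task number and the slug is made of
# lowercase alphanumeric words joined by single hyphens.
TASK_BRANCH_RE = re.compile(r"task-\d{3}-[a-z0-9]+(?:-[a-z0-9]+)*")


def is_task_branch(name: str) -> bool:
    """Return True if name looks like task-XXX-short-slug."""
    return TASK_BRANCH_RE.fullmatch(name) is not None
```

Under these assumptions, is_task_branch("task-012-record-models") is accepted, while "dev", "main", and "task-12-foo" are not.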

The default task workflow is:

  1. Select or create a task file in planning/tasks/.
  2. Verify the current branch and base branch before changing files.
  3. Create a task-scoped branch from dev, unless the user explicitly chooses a different branch workflow.
  4. Implement only the active task.
  5. Run the task's required validation commands.
  6. Update project/STATUS.md, the active task file, and other relevant project-control files when appropriate.
  7. Suggest a final task-level commit message.
  8. Merge, squash, tag, release, push, or open pull requests only after explicit user instruction.

During private solo pre-alpha development, the user may explicitly choose a chat-led direct-to-dev workflow for small or architecture-forming tasks. In that mode, keep the same task discipline, but treat the final integration as one atomic user-made commit on dev or an amend to that task commit. Do not use direct-to-dev mode for main, public release promotion, or multi-agent work unless the user explicitly updates the workflow.

The preferred final commit style is Conventional Commits:

type(optional-scope): imperative summary

Examples:

chore: add package skeleton
feat(core): add record models
feat(config): load workflow YAML
fix(storage): detect missing artifacts
docs(orca): document wrapper setup
test(workflow): cover fan-out dependencies

Common types:

  • feat: user-visible functionality;
  • fix: bug fix;
  • docs: documentation-only change;
  • test: tests only;
  • refactor: code restructuring without behavior change;
  • chore: maintenance/setup work;
  • build: packaging/build-system changes;
  • ci: CI configuration;
  • perf: performance improvement.

The summary should be imperative and should complete the phrase "This commit will ...". Prefer "feat(core): add product records", not "feat(core): adds product records".
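
The commit style above can be illustrated with a small subject-line check. This is a sketch, not a project tool: check_subject is a hypothetical helper, the scope character set is an assumption, and rejecting types outside the "Common types" list is stricter than this file requires. It does not try to detect imperative mood.

```python
import re

# Types listed in this file under "Common types".
COMMIT_TYPES = (
    "feat", "fix", "docs", "test", "refactor", "chore", "build", "ci", "perf",
)

# Shape: type(optional-scope): summary
# Assumption: scopes use lowercase letters, digits, "_" and "-".
SUBJECT_RE = re.compile(
    r"(?P<type>[a-z]+)(?:\((?P<scope>[a-z0-9_-]+)\))?: (?P<summary>\S.*)"
)


def check_subject(subject: str) -> list[str]:
    """Return a list of problems found in a commit subject line."""
    match = SUBJECT_RE.fullmatch(subject)
    if match is None:
        return ["subject does not match 'type(optional-scope): summary'"]
    problems = []
    if match["type"] not in COMMIT_TYPES:
        problems.append(f"unknown type: {match['type']}")
    return problems
```

An empty list means the subject has the expected shape; for example, check_subject("feat(core): add record models") returns no problems, while a subject without a type prefix is reported.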

AI assistants should not rewrite Git history, squash commits, create final commits, merge branches, rebase branches, reset branches, create tags, create releases, push branches, or create/close pull requests unless explicitly asked. If not committing directly, they should include a suggested final commit message in the completion summary.

Git-operation authorization

AI assistants may inspect Git state with read-only commands such as git status, git diff, git log, and git branch when useful. They may also prepare file changes requested by the user.

AI assistants must not do any of the following without an explicit direct user command for that operation:

  • work directly on main;
  • merge a task branch into dev;
  • promote dev to main;
  • squash commits;
  • rebase branches;
  • reset branches;
  • rewrite history;
  • force-push;
  • create or delete tags;
  • create releases;
  • push branches;
  • open, close, or merge pull requests.

If the repository is on main and implementation work is requested, stop before changing implementation files unless the user explicitly authorizes work on main. The normal recommendation should be to start from dev and use a task branch.

Project-control update policy

During implementation, update the smallest project-control surface that accurately reflects the change. Do not let implementation diverge silently from project-control docs.

Required updates:

  • Update the active task file when task status, scope, validation results, completion notes, or known limitations change.
  • Update project/STATUS.md at the end of each completed task, and earlier if the repository enters a blocked or partially completed state.
  • Update planning/sessions/ notes for substantial AI-assisted sessions, especially when work is paused, transferred between chats or agents, or left incomplete.
  • Update project/BACKLOG.md only when roadmap order, planned phases, or follow-up work changes.
  • Update project/PROJECT.md only when project purpose, scope, alpha boundary, target users, or license stance changes.
  • Update project/DESIGN.md when accepted conceptual semantics change.
  • Update project/ARCHITECTURE.md when module layout, component boundaries, storage architecture, backend contracts, or execution architecture change.
  • Update project/decisions.md for lightweight accepted decisions.
  • Add or update an ADR for major decisions with meaningful alternatives or long-term consequences.
  • Update README or user docs only for user-facing behavior that exists now, not for planned behavior.

If implementation reveals that project-control docs are wrong, record the design, architecture, dependency, or governance change before continuing the affected implementation path.

When producing user-facing archives or patch bundles, exclude generated caches and local execution artifacts such as __pycache__/, .pytest_cache/, .ruff_cache/, build outputs, virtual environments, and local run directories. Prefer building archives from tracked or explicitly selected files.
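
One way to apply the archive rule above is to filter by directory name while zipping. The sketch below is illustrative only: build_archive is a hypothetical helper, and the names used for build outputs and virtual environments ("build", "dist", ".venv") are assumptions, since this file does not name them. Local run directories are project-specific and are not covered here.

```python
import zipfile
from pathlib import Path

# The first three names come from this file; "build", "dist", and ".venv"
# are assumed names for build outputs and virtual environments.
EXCLUDED_DIRS = {
    "__pycache__", ".pytest_cache", ".ruff_cache", "build", "dist", ".venv",
}


def build_archive(root: Path, archive: Path) -> list[str]:
    """Zip files under root, skipping files inside excluded directories.

    Returns the archived paths relative to root, in sorted order.
    """
    root = root.resolve()
    archive = archive.resolve()
    included = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path == archive:
            continue
        rel = path.relative_to(root)
        # Skip the file if any parent directory name is excluded.
        if EXCLUDED_DIRS.intersection(rel.parts[:-1]):
            continue
        included.append(rel.as_posix())
    with zipfile.ZipFile(archive, "w") as zf:
        for rel in included:
            zf.write(root / rel, arcname=rel)
    return included
```

Name-based filtering is only a fallback. Building the file list from tracked files (for example, the output of git ls-files) matches the stated preference more directly.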

Task completion and process lessons

At the end of a task, assistants must separate implementation work from project-control bookkeeping.

Implementation work includes code, tests, documentation, and validation required by the task itself. Project-control bookkeeping includes updating the active task file, the active session note when one exists, and project/STATUS.md.

If a task reveals a reusable lesson about task formulation, validation commands, testing style, review criteria, or AI-assisted development workflow, record the lesson in the task or session notes and promote the general rule to the relevant planning template or README.

Do not let process logging become feature scope creep. General process-policy changes should be small, clearly labeled, and limited to planning/control files unless the user explicitly requests broader governance changes. Historical session and task notes are not authoritative current instructions. Important current knowledge must be promoted into active project-control files before old notes are archived.

Context lifecycle and archiving policy

Project-control files are divided into active context and historical context.

Active context normally includes:

  • AGENTS.md;
  • project/PROJECT.md;
  • project/DESIGN.md;
  • project/ARCHITECTURE.md;
  • project/STATUS.md;
  • project/BACKLOG.md;
  • the active task file.

Historical context includes:

  • completed task files;
  • completed session notes;
  • archived decisions;
  • superseded ADRs;
  • old milestone notes.

Historical files should not be read by default during normal implementation. Read them only when explicitly referenced by active context, needed to investigate a specific past decision, or requested by the user.

Important current knowledge must be promoted from sessions, completed tasks, and archived decisions into authoritative active files before those historical files are archived. Archived files must not be treated as current instructions unless explicitly reactivated by a task or user command.

project/STATUS.md should describe the current repository state, not become a chronological development log. Completed session notes and completed task files may be archived after their important current information has been promoted.

Current development phase

Pre-alpha, design-controlled implementation. The initial implementation must build the core architecture before chemistry-specific features dominate the package.

Coding and tooling conventions

  • Target Python 3.11 or newer.
  • Use a src/ package layout.
  • Prefer absolute imports, for example from molsemble.core....
  • Keep package import side effects minimal. Importing molsemble must not import RDKit, launch executables, read user config, or create files.
  • Use Google-style docstrings for public modules, classes, functions, and methods.
  • Public functions/classes should document Args, Returns, and Raises where applicable.
  • Avoid non-ASCII characters in Python source files unless strictly necessary.
  • Use Ruff as the initial lint/style tool.
  • Add pytest tests for implemented behavior.
  • Keep implementation changes small and reviewable.
  • Prefer readable, boring Python over clever abstractions.
  • If a task reveals a design problem, stop and record the design issue instead of patching around it silently.
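
For reference, a Google-style docstring with Args, Returns, and Raises sections might look like the sketch below. parse_energy is a hypothetical function used only to show the layout; it is not part of molsemble.

```python
def parse_energy(line: str) -> float:
    """Parse the final numeric field from one line of text.

    Hypothetical example used to illustrate docstring layout.

    Args:
        line: A line of the form "<label> <value>".

    Returns:
        The last whitespace-separated field converted to float.

    Raises:
        ValueError: If the line is empty or the last field is not numeric.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    return float(fields[-1])
```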

Architecture rules

  • Do not make ORCA-specific assumptions part of the workflow engine.
  • Do not make RDKit-specific assumptions part of core records, storage, or workflow planning.
  • RDKit becomes a standard dependency by the first alpha release, but RDKit imports must stay localized to RDKit-specific modules and tests.
  • Keep the separation between StageKind, Backend, and BackendOperation.
  • Backends consume and produce typed Products and Artifacts through shared contracts.
  • The workflow engine owns planning, restart/reuse, invalidation, provenance policy, and downstream blocking behavior.
  • Backends own tool-specific input preparation, execution, artifact discovery, parsing, and error/status mapping.
  • Parser failure is not the same as backend execution failure.
  • Export logic should consume product registries, not raw backend files directly.
  • Directory paths are locations; manifests and registries are identity/provenance records.
  • A run directory is single-writer in alpha. Do not design concurrent mutation of the same run directory unless a concurrency/storage design is explicitly added.
  • External commands should be launched without shell=True by default. Wrapper scripts are allowed, but command construction must avoid shell injection risks.
  • Do not introduce a new StageKind casually. New StageKinds require design review because they may change planning, expansion, and dependency semantics.
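
Two of the rules above, launching external commands without shell=True and keeping parser failure separate from execution failure, can be sketched together. All names below (ExecutionStatus, ParseStatus, JobOutcome, run_and_parse) are hypothetical and do not describe molsemble's actual contracts; the "parser" is a stand-in that reads one float from stdout.

```python
import subprocess
import sys
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExecutionStatus(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"


class ParseStatus(Enum):
    PARSED = "parsed"
    PARSE_FAILED = "parse_failed"
    NOT_ATTEMPTED = "not_attempted"


@dataclass(frozen=True)
class JobOutcome:
    execution: ExecutionStatus
    parse: ParseStatus
    value: Optional[float] = None


def run_and_parse(executable: str, args: list[str]) -> JobOutcome:
    """Run a command as an argument list and parse one float from stdout."""
    # The command is a list and shell=True is not used, so no shell
    # interprets the arguments.
    completed = subprocess.run(
        [executable, *args], capture_output=True, text=True, check=False
    )
    if completed.returncode != 0:
        return JobOutcome(ExecutionStatus.FAILED, ParseStatus.NOT_ATTEMPTED)
    try:
        value = float(completed.stdout.strip())
    except ValueError:
        # The tool itself ran successfully; only parsing failed.
        return JobOutcome(ExecutionStatus.SUCCEEDED, ParseStatus.PARSE_FAILED)
    return JobOutcome(ExecutionStatus.SUCCEEDED, ParseStatus.PARSED, value)


if __name__ == "__main__":
    # The Python interpreter stands in for an external executable.
    print(run_and_parse(sys.executable, ["-c", "print(1.5)"]))
```

Keeping the two statuses as separate fields means a successful run with an unparseable output can be reported as such, instead of being folded into a generic failure.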

Dependency rules

  • Do not add dependencies opportunistically.
  • Dependency additions require an updated pyproject.toml, relevant project-document updates, an entry in project/decisions.md when the dependency affects architecture or user installation, and tests/import checks when applicable.
  • RDKit is the standard alpha chemistry dependency because CSV + SMILES -> RDKit conformers is the primary alpha path. It should be added no later than the RDKit backend phase or alpha release packaging.
  • Importing molsemble, molsemble.core, molsemble.workflow, or molsemble.storage must not import RDKit.
  • ORCA, xTB, Multiwfn, Gaussian, CREST, and similar programs are external executables, not Python package dependencies. They are configured through executable or wrapper paths.
  • Pandas may be used later if molsemble table I/O/export functionality justifies it, but do not add pandas merely because RDKit is present.
  • Do not create target-architecture modules before the active task needs them.

Storage and concurrency rules

  • Alpha storage is filesystem-based with manifests and JSONL registries.
  • Alpha assumes a single molsemble process writes to a run directory at a time.
  • Do not implement concurrent run-directory writers without a concurrency/storage design update.
  • Job attempts must not be overwritten. Reruns create new attempts.
  • Restart/resume is same-run continuation after interruption or partial failure.
  • Derived/follow-up runs are new runs consuming previous-run products; they are post-alpha as a user-facing feature.
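
The attempt and registry rules above can be sketched with plain filesystem operations. This is an illustration under assumptions: the attempt-NNN directory naming, the record fields, and the helper names are hypothetical and do not describe molsemble's actual layout. The sketch also relies on the single-writer assumption and does no locking.

```python
import json
import re
from pathlib import Path

# Assumed naming scheme for attempt directories: attempt-001, attempt-002, ...
ATTEMPT_RE = re.compile(r"attempt-(\d+)")


def new_attempt_dir(job_dir: Path) -> Path:
    """Create and return the next attempt directory under job_dir."""
    job_dir.mkdir(parents=True, exist_ok=True)
    numbers = [
        int(m[1])
        for p in job_dir.iterdir()
        if (m := ATTEMPT_RE.fullmatch(p.name))
    ]
    attempt = job_dir / f"attempt-{max(numbers, default=0) + 1:03d}"
    # exist_ok=False: fail loudly instead of reusing an existing attempt.
    attempt.mkdir(exist_ok=False)
    return attempt


def append_record(registry: Path, record: dict) -> None:
    """Append one JSON object as a single line; earlier lines are untouched."""
    with registry.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def read_records(registry: Path) -> list[dict]:
    """Read all records from a JSONL registry, skipping blank lines."""
    with registry.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

A rerun calls new_attempt_dir again and appends a new registry line, so the record of the earlier attempt stays in place.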

Testing rules

Normal tests must not require ORCA, xTB, Multiwfn, Gaussian, Slurm, or other external quantum chemistry programs or schedulers.

Required testing strategy:

  • Use fake backend tests for workflow-engine behavior.
  • Use parser fixtures for backend parsing.
  • Mark real external backend tests explicitly, for example external_orca.
  • Test restart, invalidation, status layering, artifact registration, and product fan-out independently of real QC software.
  • Test that importing molsemble does not import RDKit or ORCA-specific modules.
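
The import-isolation rule in the last bullet can be checked in a fresh interpreter, so that modules already loaded by the test session do not affect the result. The sketch below is one possible approach: modules_loaded_by_import is a hypothetical helper, and the test function assumes molsemble is installed in the test environment. ORCA-specific module names are not given in this file, so only "rdkit" is listed.

```python
import subprocess
import sys


def modules_loaded_by_import(package: str, candidates: list[str]) -> set[str]:
    """Import package in a fresh interpreter and report which candidates loaded."""
    code = (
        "import importlib, sys\n"
        f"importlib.import_module({package!r})\n"
        f"loaded = [name for name in {candidates!r} if name in sys.modules]\n"
        "print('\\n'.join(loaded))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True, check=True
    )
    return set(result.stdout.split())


def test_import_molsemble_does_not_load_rdkit():
    # Assumes molsemble is importable; add ORCA-specific module names to the
    # candidate list once they exist.
    assert modules_loaded_by_import("molsemble", ["rdkit"]) == set()
```

Because the check runs in a subprocess, it still works when another test in the same session has already imported RDKit.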

Documentation rules

  • Keep README concise.
  • Reflect major behavior changes in project docs.
  • Do not document planned features as implemented features.
  • Backend setup guides should describe installation/execution assumptions and smoke tests.
  • User-facing docs should distinguish alpha, post-alpha, and future functionality.

Alpha non-goals

The first alpha does not include:

  • SDF input.
  • General XYZ or OpenXYZ input.
  • xTB, CREST, Multiwfn, Gaussian, Psi4, or PySCF.
  • ORCA frequencies.
  • .orcacosmo generation.
  • Native Slurm submission.
  • Database storage.
  • Dedicated molsemble-level pruning.
  • Connectivity/stereochemistry validation.
  • Derived/follow-up runs as a user-facing feature.
  • Automatic protonation, tautomer, stereomer, or spin-state decision making.
  • Reaction networks or transition-state search.

Definition of done for implementation tasks

A task is not complete unless:

  • Scope and non-scope were respected.
  • Required tests were added or updated.
  • Relevant project docs were updated if behavior changed.
  • project/STATUS.md reflects the new repository state.
  • Any design/architecture/dependency/governance changes were recorded in project/decisions.md or an ADR.
  • Important current knowledge from sessions or completed task notes was promoted into active project-control files.
  • Reusable process lessons, if any, were recorded and promoted to planning docs or templates.
  • No planned/future functionality is presented as implemented.