Practical conversion knowledge for the PLAID ecosystem.
This repository is a curated, code-first reference for converting scientific datasets into the PLAID format. It contains:
- real conversion scripts (dataset-specific and production-oriented),
- pattern notes extracted from those scripts,
- and agent guidance describing how assistants should reason about these materials.
The goal is to help users and tools understand how PLAID is used in practice for heterogeneous scientific datasets.
If you are new to this repo, use this minimal path:
- Open `skills/plaid-conversion/examples/conversions/README.md`.
- Pick the script closest to your dataset (static/temporal, structured/unstructured, nodal/cell-centered).
- Check the required external dependencies and raw-data source for that script.
- Set all placeholder paths/repo IDs (look for assertions on `/path/to/...` and `channel/repo`).
- Run the script on a small subset first, verify semantic correctness, then scale up.
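The placeholder assertions mentioned in the last two steps typically look something like the sketch below. The function and variable names here are illustrative, not taken from any specific script; the real scripts use their own names:

```python
def check_configuration(raw_data_dir: str, hub_repo_id: str) -> None:
    """Fail fast if placeholder values were left unconfigured.

    Illustrative sketch of the placeholder-assertion pattern; each
    conversion script carries its own variant of these checks.
    """
    assert not raw_data_dir.startswith("/path/to/"), (
        "Set raw_data_dir to the directory containing the raw dataset."
    )
    assert hub_repo_id != "channel/repo", (
        "Set hub_repo_id to your target repo ID, e.g. 'my-org/my-dataset'."
    )
```

Running a script with the defaults left in place trips these assertions immediately, which is the intended behavior: configuration is explicit, never guessed.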
For contribution rules and a starter scaffold, see `CONTRIBUTING_CONVERSIONS.md` and `skills/plaid-conversion/examples/conversions/_template.py`.
```text
.
├── LICENSE
├── README.md
├── CONTRIBUTING_CONVERSIONS.md
└── skills/
    └── plaid-conversion/              # Main skill directory
        ├── SKILL.md                   # Skill entrypoint (required)
        ├── docs/
        │   └── template.md            # Template for guided assistance
        └── examples/
            ├── example.md             # Example skill usage
            ├── conversions/
            │   ├── README.md
            │   ├── _template.py
            │   ├── drivaerml.py
            │   ├── force_asr.py
            │   ├── pdebench_2d_darcy_flow.py
            │   ├── shapenetcar.py
            │   └── thewell_turbulent_layer_2d.py
            └── patterns/
                ├── external_time_metadata.md
                ├── nodal_vs_cell_fields.md
                ├── static_vs_temporal_samples.md
                └── trajectory_datasets.md
```
This repository is:

- A skills library for dataset conversion to PLAID.
- A reference set of working examples used on real datasets.
- A semantic guide to recurring choices (time, trajectories, field locations, etc.).

It is not:

- A generic one-click converter for arbitrary datasets.
- A stable Python package with reusable APIs.
- A minimal PLAID tutorial.

Most scripts are intentionally explicit and dataset-specific.
As defined in `skills/plaid-conversion/SKILL.md`, the interpretation priority is:

1. Conversion examples in `skills/plaid-conversion/examples/conversions/` (authoritative in practice)
2. Pattern documents in `skills/plaid-conversion/examples/patterns/`
3. Conceptual PLAID docs
4. Source-level API details
If conceptual docs and examples differ, examples win for practical conversion behavior.
The scripts in skills/plaid-conversion/examples/conversions/ cover multiple dataset families and semantics:
- DrivAerML (`drivaerml.py`)
  - Steady-state automotive CFD
  - Static samples (no time axis)
  - OpenFOAM meshes/fields + CSV metadata
  - Demonstrates placeholders for dataset-specific parser integration
- ForceASR (`force_asr.py`)
  - Time-dependent phase-field fracture simulations
  - One sample = one trajectory
  - Time values read from external metadata (`.pvd`)
  - Mixed nodal fields + time-varying global quantities
- PDEBench 2D Darcy Flow (`pdebench_2d_darcy_flow.py`)
  - Static, parameterized PDE dataset
  - Structured rectilinear mesh reused across samples
  - Cell-centered fields (`CellData`)
  - Script-level parameter sweep (`beta` values)
- ShapeNet-Car (`shapenetcar.py`)
  - Static triangular meshes with nodal scalar fields
  - Unstructured geometry conversion
  - Sequential and parallel generation variants
  - Multi-backend export (`hf_datasets`, `cgns`, `zarr`)
- The Well: Turbulent Radiative Layer 2D (`thewell_turbulent_layer_2d.py`)
  - Temporal trajectories on structured grids
  - Per-time tree/field assembly
  - Boundary tags and trajectory-aware sample construction
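As an illustration of the external-time-metadata pattern used by `force_asr.py`, a minimal `.pvd` reader can be written with the standard library alone. This is a sketch of the general approach (ParaView collection files are plain XML), not the script's actual code:

```python
import xml.etree.ElementTree as ET

def read_pvd_timesteps(pvd_text: str) -> list[tuple[float, str]]:
    """Return (time, file) pairs from a ParaView .pvd collection file."""
    root = ET.fromstring(pvd_text)
    pairs = [(float(ds.get("timestep")), ds.get("file"))
             for ds in root.iter("DataSet")]
    # Entries are not guaranteed to be ordered in the file; sort by time.
    return sorted(pairs)

# Example .pvd content in the structure written by ParaView/VTK writers.
pvd = """<VTKFile type="Collection" version="0.1">
  <Collection>
    <DataSet timestep="0.0" group="" part="0" file="step_000.vtu"/>
    <DataSet timestep="0.5" group="" part="0" file="step_001.vtu"/>
    <DataSet timestep="1.0" group="" part="0" file="step_002.vtu"/>
  </Collection>
</VTKFile>"""
```

The resulting time values become the trajectory's time axis, and each referenced file supplies the fields for that step.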
Pattern notes in skills/plaid-conversion/examples/patterns/ summarize recurring semantics and pitfalls:
- `static_vs_temporal_samples.md`: when to model independent states vs time evolution
- `trajectory_datasets.md`: one sample per physical trajectory
- `external_time_metadata.md`: deriving time from sidecar metadata (PVD/XML/etc.)
- `nodal_vs_cell_fields.md`: preserving field location semantics (node vs element)
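For example, the nodal-vs-cell distinction often surfaces as a simple length check during conversion. The heuristic below is a hypothetical sanity check, not a replacement for the source format's own metadata (e.g. VTK `PointData` vs `CellData`), which stays authoritative:

```python
def infer_field_location(n_values: int, n_nodes: int, n_cells: int) -> str:
    """Guess whether a flat field array is nodal or cell-centered by length.

    Heuristic only: use it to cross-check the source format's declared
    field location, not to decide it.
    """
    if n_values == n_nodes:
        return "nodal"
    if n_values == n_cells:
        return "cell-centered"
    raise ValueError(f"field length {n_values} matches neither nodes nor cells")

# On a structured 2D grid with nx x ny cells there are (nx+1) * (ny+1) nodes,
# so nodal and cell-centered arrays have different, unambiguous lengths.
nx, ny = 128, 64
n_cells = nx * ny
n_nodes = (nx + 1) * (ny + 1)
```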
These documents are especially useful when adapting an existing script to a new dataset.
A quick cross-reference of scripts by semantics:

- Static sample semantics: `drivaerml.py`, `shapenetcar.py`, `pdebench_2d_darcy_flow.py`
- Temporal trajectory semantics: `force_asr.py`, `thewell_turbulent_layer_2d.py`
- External time metadata: `force_asr.py` (`.pvd` parsing)
- Nodal vs cell-centered fields:
  - Nodal: `shapenetcar.py`, `force_asr.py`, `thewell_turbulent_layer_2d.py`
  - Cell-centered: `pdebench_2d_darcy_flow.py` (and likely `drivaerml.py`, depending on parser mapping)
A typical adaptation workflow:

- Find the conversion script closest to your dataset characteristics.
- Read matching pattern docs for semantic choices you must preserve.
- Adapt paths, parsers, and metadata mapping (never assume generic portability).
- Keep semantics explicit in `Sample`, tree, field, and time construction.
- Export with the desired backend(s) and optionally publish to the Hub.

Most scripts include placeholder path assertions (`/path/to/...`) to force explicit user configuration.
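One common way to support the "small subset first" step is a limit flag on the conversion script. The sketch below is illustrative (the flag name and helper functions are assumptions, not part of any existing script):

```python
import argparse

def parse_args(argv=None):
    """Parse an optional --limit flag so one script serves smoke tests and full runs."""
    p = argparse.ArgumentParser(description="Convert raw samples to PLAID")
    p.add_argument("--limit", type=int, default=None,
                   help="convert only the first N samples (smoke test)")
    return p.parse_args(argv)

def iter_samples(sample_ids, limit):
    """Deterministic ordering makes the smoke subset reproducible across runs."""
    ids = sorted(sample_ids)
    return ids if limit is None else ids[:limit]
```

A first run with `--limit 3` lets you verify semantic correctness (time axes, field locations, metadata mapping) before committing to a full conversion.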
This repo intentionally includes scripts that may require external, dataset-specific dependencies not part of PLAID’s core dependency set (for example: Muscat, plyfile, h5py, pandas, VTK/OpenFOAM tooling).
Important distinctions:
- Extra dependencies are often needed to convert raw datasets.
- They are not necessarily needed to consume already-converted PLAID datasets.
Check imports at the top of each conversion script before running it.
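A quick way to perform that check programmatically, before a long-running conversion fails halfway on an import (module names below are examples, not a definitive requirements list for any script):

```python
import importlib.util

def missing_dependencies(module_names):
    """Return the modules that cannot be imported in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]

# Example conversion-time extras; the actual requirements are whatever
# a given script imports at its top.
needed = ["h5py", "pandas", "plyfile"]
```

Anything returned by `missing_dependencies(needed)` must be installed before running the corresponding script; consuming an already-converted PLAID dataset does not require these extras.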
If you build an assistant around this repo, align behavior with skills/plaid-conversion/SKILL.md:
- Prefer explaining existing patterns over auto-generating full conversions.
- Do not invent PLAID APIs or hide uncertain assumptions.
- Keep scientific semantics intact; avoid style-driven refactors.
- Treat conversion scripts as dataset-specific artifacts, not generalized templates.
For detailed guidance, see:
- `skills/plaid-conversion/SKILL.md`: main skill instructions
- `skills/plaid-conversion/examples/example.md`: example interactions
- `skills/plaid-conversion/docs/template.md`: user guidance template
When adding a new conversion skill:
- Add one script per dataset in `skills/plaid-conversion/examples/conversions/`.
- Keep dataset assumptions explicit in code.
- Avoid introducing PLAID APIs for one-off needs.
- Preserve scientific meaning over normalization of code style.
Optional but encouraged:
- Add a short module docstring describing dataset structure.
- Add or update pattern notes when a new recurring semantic appears.
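A module docstring of the encouraged kind might look like this. The dataset, file names, and fields are entirely hypothetical; the point is to state the raw layout and the semantic choices up front:

```python
"""Convert the (hypothetical) AcmePlates dataset to PLAID.

Raw layout:
    raw/<sample_id>/mesh.vtu    - unstructured triangle mesh
    raw/<sample_id>/fields.h5   - nodal displacement + cell-centered stress
    raw/metadata.csv            - per-sample scalar parameters

Semantics:
    - Static samples (no time axis); one sample per CSV row.
    - Field locations preserved: displacement -> nodes, stress -> cells.
"""
```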
Before opening a PR, review `CONTRIBUTING_CONVERSIONS.md`.
This repository is distributed under the BSD 3-Clause License.
See LICENSE for full terms.