From b6cbfa40905a63b42c015ca8f1708eed84344450 Mon Sep 17 00:00:00 2001 From: "George G. Vega Yon" Date: Thu, 5 Mar 2026 17:00:44 -0700 Subject: [PATCH 1/4] Adding the plan --- plan.md | 160 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 160 insertions(+) create mode 100644 plan.md diff --git a/plan.md b/plan.md new file mode 100644 index 0000000..2da9929 --- /dev/null +++ b/plan.md @@ -0,0 +1,160 @@ +# Development plan + +From Olivia's comments: + +```yaml +export default defineModel({ + metadata: { + title: "Measles Outbreak Cost Calculator", + description: "Estimates the economic cost of a measles outbreak across three outbreak-size scenarios.", + authors: [ + { name: "Jane Doe", email: "jane@example.org" }, + { name: "John Smith" } + ], + introduction: ` +## Background +Measles is a highly contagious viral disease ... + +## Assumptions +All wage figures are in 2024 USD and are taken +from the Bureau of Labor Statistics. +` + }, + + parameters: { + cost_hosp: integer({ + label: "Cost of measles hospitalization", + description: "Average direct medical cost per hospitalised measles case (USD).", + default: 31168, + min: 0, + max: 500000, + unit: "USD", + references: [ + "Ortega-Sanchez et al. (2014). Vaccine, 32(34)." + ] + }), + + prop_hosp: number({ + label: "Proportion of cases hospitalised", + description: "Fraction of confirmed cases requiring hospital admission.", + default: 0.20, + min: 0, + max: 1, + unit: "proportion", + references: [ + "CDC Measles surveillance data 2019." + ] + }), + + /* ... */ + + equations: { + eq_hosp: equation({ + label: "Hospitalisation cost", + unit: "USD", + output: "integer", + compute: ({ n_cases, prop_hosp, cost_hosp }) => + n_cases * prop_hosp * cost_hosp + }), + + /* ... 
*/ + }, + + table: table({ + scenarios: [ + { id: "s_22", label: "22 Cases", vars: { n_cases: 22 } }, + { id: "s_100", label: "100 Cases", vars: { n_cases: 100 } }, + { id: "s_803", label: "803 Cases", vars: { n_cases: 803 } } + ], + + rows: [ + { label: "Hospitalisation cost", value: "eq_hosp" }, + { label: "Lost productivity", value: "eq_lost_prod" }, + { label: "Contact tracing cost", value: "eq_tracing" }, + { label: "TOTAL", value: "eq_total", emphasis: "strong" } + ] + }), + + ## Extras + figures: figure({ + title: "My figure", + alt-text: "Some text", + py-code: " + import matplotlib as mp + mp.plot(...) + " + }) +}); +``` + +## Tasks + +- Define what the structure is + - Metadata + - Title of the model + - Description + - Author of the model + - Report structure (markdown with some placeholders). These placeholders would be automatically replaced when rendering the report (For instance `{{ table:table1 }}`). + + ```md + # Title of the report + + ## Sub title + + some text, some number {{ equation:value1 }} + + {{ table:table1 }} + + Some more text + + {{ table:table2 }} + + And a pretty figure + + {{ figure:fig1 }} + ``` + - Assumptions (another markdown document) + - Paramteres + - Equations + - Tables + - Current Parameters (in the case that the user wants to save the current state of the model.) + - A way to describe what defines a column, for instance, a column could be number of cases, number of days of isolation. + +- Function to validate the yaml file: + 1. Validate the yaml file (need to write a yaml schema + validator?). If not, then just check the dictionary. + 2. Ensure that the Python code is not malicious. + 3. Eq. validation also checks for recursive calls of values + the right order of execution. + 4. Validate the units (for instance, money or whatever) + +- Function to generate the menu with the options from the yaml + - Build the menu based on the parameters. 
+ - Populate the values using the defauls, unless the `current_parameters` are in the model file. + - This should trigger the warnings associated with the guardrails. + +- Function to watch the iteractivity with the model parameters: + - Essentially to check the boundaries of `safe_min` and `safe_max` when the user makes changes. + +- Function to run the model. + - Validate the ranges/breaks (how many cases.). Then for each case do: + 1. Compute the equations (for which you need to ensure that you are doing the proper order) -> generate values for the table + 2. This returns the tables as dictionaries. + - This returns a list of dictionaries (the columns for the table) + +- Function to generate the report + - Write/process the markdown document (mostly the text attached to the yaml). + - Write the tables resulting from the calculations (inserting them into the `{{}}` placeholders). + - Generate the figures, also base on placeholders `{{}}` + - Render the report as an HTML file. + +- Function to save as a pdf. + +- Functionality to temporarily store the models, so the user can go back. + - Could be a little window or list that shows some history or related. + +- Function to save the current model: + - Take the current yaml file. + - Attach the current parameters. + - Save it as a yml file to the disk. + +- Setup the framework for running on the browser with [`stlite`](https://stlite.net/) + From 4935d9f68695830144e5ebdaa768b4cd05c36190 Mon Sep 17 00:00:00 2001 From: "George G. Vega Yon" Date: Thu, 5 Mar 2026 17:02:21 -0700 Subject: [PATCH 2/4] Adding the plan --- plan.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/plan.md b/plan.md index 2da9929..8f3a1b3 100644 --- a/plan.md +++ b/plan.md @@ -1,5 +1,9 @@ # Development plan +Based on the meeting of March 5th + the discussion on GitHub ([link](https://github.com/EpiForeSITE/epiworldPythonStreamlit/discussions/2)), here is a list of tasks/things that need to be built for the project. 
+ +## Example of yaml doc + From Olivia's comments: ```yaml From 74a71f30c469c9cc925d4907fa003ca931908f23 Mon Sep 17 00:00:00 2001 From: Olivia Banks Date: Fri, 6 Mar 2026 07:21:21 -0700 Subject: [PATCH 3/4] [plan] move from JS to YAML for sample (#6) --- plan.md | 160 +++++++++++++++++++++++++++----------------------------- 1 file changed, 78 insertions(+), 82 deletions(-) diff --git a/plan.md b/plan.md index 8f3a1b3..5eb05df 100644 --- a/plan.md +++ b/plan.md @@ -7,88 +7,84 @@ Based on the meeting of March 5th + the discussion on GitHub ([link](https://git From Olivia's comments: ```yaml -export default defineModel({ - metadata: { - title: "Measles Outbreak Cost Calculator", - description: "Estimates the economic cost of a measles outbreak across three outbreak-size scenarios.", - authors: [ - { name: "Jane Doe", email: "jane@example.org" }, - { name: "John Smith" } - ], - introduction: ` -## Background -Measles is a highly contagious viral disease ... - -## Assumptions -All wage figures are in 2024 USD and are taken -from the Bureau of Labor Statistics. -` - }, - - parameters: { - cost_hosp: integer({ - label: "Cost of measles hospitalization", - description: "Average direct medical cost per hospitalised measles case (USD).", - default: 31168, - min: 0, - max: 500000, - unit: "USD", - references: [ - "Ortega-Sanchez et al. (2014). Vaccine, 32(34)." - ] - }), - - prop_hosp: number({ - label: "Proportion of cases hospitalised", - description: "Fraction of confirmed cases requiring hospital admission.", - default: 0.20, - min: 0, - max: 1, - unit: "proportion", - references: [ - "CDC Measles surveillance data 2019." - ] - }), - - /* ... */ - - equations: { - eq_hosp: equation({ - label: "Hospitalisation cost", - unit: "USD", - output: "integer", - compute: ({ n_cases, prop_hosp, cost_hosp }) => - n_cases * prop_hosp * cost_hosp - }), - - /* ... 
*/ - }, - - table: table({ - scenarios: [ - { id: "s_22", label: "22 Cases", vars: { n_cases: 22 } }, - { id: "s_100", label: "100 Cases", vars: { n_cases: 100 } }, - { id: "s_803", label: "803 Cases", vars: { n_cases: 803 } } - ], - - rows: [ - { label: "Hospitalisation cost", value: "eq_hosp" }, - { label: "Lost productivity", value: "eq_lost_prod" }, - { label: "Contact tracing cost", value: "eq_tracing" }, - { label: "TOTAL", value: "eq_total", emphasis: "strong" } - ] - }), - - ## Extras - figures: figure({ - title: "My figure", - alt-text: "Some text", - py-code: " - import matplotlib as mp - mp.plot(...) - " - }) -}); +model: + metadata: + title: "Measles Outbreak Cost Calculator" + description: "Estimates the economic cost of a measles outbreak across three outbreak-size scenarios." + authors: + - name: "Jane Doe" + email: "jane@example.org" + - name: "John Smith" + introduction: | + ## Background + Measles is a highly contagious viral disease ... + + ## Assumptions + All wage figures are in 2024 USD and are taken + from the Bureau of Labor Statistics. + + parameters: + cost_hosp: + type: integer + label: "Cost of measles hospitalization" + description: "Average direct medical cost per hospitalised measles case (USD)." + default: 31168 + min: 0 + max: 500000 + unit: "USD" + references: + - "Ortega-Sanchez et al. (2014). Vaccine, 32(34)." + + prop_hosp: + type: number + label: "Proportion of cases hospitalised" + description: "Fraction of confirmed cases requiring hospital admission." + default: 0.20 + min: 0 + max: 1 + unit: "proportion" + references: + - "CDC Measles surveillance data 2019." 
+ + equations: + eq_hosp: + label: "Hospitalisation cost" + unit: "USD" + output: "integer" + compute: "n_cases * prop_hosp * cost_hosp" + + table: + scenarios: + - id: "s_22" + label: "22 Cases" + vars: + n_cases: 22 + - id: "s_100" + label: "100 Cases" + vars: + n_cases: 100 + - id: "s_803" + label: "803 Cases" + vars: + n_cases: 803 + + rows: + - label: "Hospitalisation cost" + value: "eq_hosp" + - label: "Lost productivity" + value: "eq_lost_prod" + - label: "Contact tracing cost" + value: "eq_tracing" + - label: "TOTAL" + value: "eq_total" + emphasis: "strong" + + figures: + - title: "My figure" + alt-text: "Some text" + py-code: | + import matplotlib as mp + mp.plot(...) ``` ## Tasks From 7f85aee25350b8552bc61c4aa0d43b964bb999b3 Mon Sep 17 00:00:00 2001 From: Copilot <198982749+Copilot@users.noreply.github.com> Date: Fri, 6 Mar 2026 15:52:18 -0700 Subject: [PATCH 4/4] Restructure plan.md: function signatures, resource links, dependency diagram, and project scaffolding tasks (#7) * Initial plan * Update plan.md: function signatures, olivia-banks suggestions, new tasks Co-authored-by: gvegayon <893619+gvegayon@users.noreply.github.com> * Clarify YAML validation step wording in plan.md Co-authored-by: gvegayon <893619+gvegayon@users.noreply.github.com> * Add function summary table and Mermaid dependency diagram to plan.md Co-authored-by: gvegayon <893619+gvegayon@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: gvegayon <893619+gvegayon@users.noreply.github.com> --- plan.md | 212 ++++++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 160 insertions(+), 52 deletions(-) diff --git a/plan.md b/plan.md index 5eb05df..fbb062c 100644 --- a/plan.md +++ b/plan.md @@ -87,74 +87,182 @@ model: mp.plot(...) 
``` -## Tasks +## YAML document structure -- Define what the structure is - - Metadata - - Title of the model - - Description - - Author of the model - - Report structure (markdown with some placeholders). These placeholders would be automatically replaced when rendering the report (For instance `{{ table:table1 }}`). +The YAML document must define the following top-level sections: - ```md - # Title of the report +- **Metadata:** Title, description, author(s), optional introduction text, and a Markdown report template with placeholders (e.g., `{{ table:table1 }}`, `{{ figure:fig1 }}`). An `assumptions` field (Markdown string) may also be included. +- **Parameters:** Named numeric parameters with type, label, description, default, min/max bounds, unit, and optional references. +- **Equations:** Named expressions (as safe Python arithmetic strings) with label, unit, and output type. +- **Tables:** Scenario columns (each defining a set of variable overrides) and rows (each pointing to an equation result). +- **Figures:** A list of figures, each with a title, alt-text, and a small Python snippet to generate the plot. +- **Current Parameters** *(optional):* A snapshot of parameter values representing the saved state of the model. - ## Sub title +Example report template: - some text, some number {{ equation:value1 }} +```md +# Title of the report - {{ table:table1 }} +## Sub title - Some more text +some text, some number {{ equation:value1 }} - {{ table:table2 }} +{{ table:table1 }} - And a pretty figure +Some more text - {{ figure:fig1 }} - ``` - - Assumptions (another markdown document) - - Paramteres - - Equations - - Tables - - Current Parameters (in the case that the user wants to save the current state of the model.) - - A way to describe what defines a column, for instance, a column could be number of cases, number of days of isolation. +{{ table:table2 }} -- Function to validate the yaml file: - 1. 
Validate the yaml file (need to write a yaml schema + validator?). If not, then just check the dictionary. - 2. Ensure that the Python code is not malicious. - 3. Eq. validation also checks for recursive calls of values + the right order of execution. - 4. Validate the units (for instance, money or whatever) +And a pretty figure -- Function to generate the menu with the options from the yaml - - Build the menu based on the parameters. - - Populate the values using the defauls, unless the `current_parameters` are in the model file. - - This should trigger the warnings associated with the guardrails. +{{ figure:fig1 }} +``` + +## Task dependency diagram + +The diagram below shows how the infrastructure tasks and function-level tasks relate to each other. Infrastructure tasks (top block) must be finalized before implementation work can proceed; solid arrows indicate data/output dependencies and dashed arrows indicate that a function triggers another in the normal app flow. + +```mermaid +flowchart TD + subgraph INFRA["🏗️ Infrastructure — finalize first"] + direction LR + BP["Branch protection rules"] + AG["AGENTS.md"] + DC["Devcontainer"] + GH_CI["GitHub Actions: CI testing (uv)"] + GH_AG["GitHub Actions: agent environment"] + DC --> GH_CI + DC --> GH_AG + end + + subgraph FUNC["💻 Function implementation"] + STLITE["Setup stlite framework"] + VY["validate_yaml()"] + BM["build_menu()"] + WP["watch_parameters()"] + RM["run_model()"] + GR["generate_report()"] + PDF["save_as_pdf()"] + SMS["store_model_state()"] + SCM["save_current_model()"] + + STLITE --> VY + VY --> BM + BM -.->|"user values"| WP + VY -->|"model_dict"| WP + WP --> RM + RM --> GR + GR --> PDF + WP --> SMS + WP --> SCM + end + + INFRA ==> FUNC +``` + +## Function summary + +| Function | Input → Output | +|---|---| +| `validate_yaml(yaml_content: str)` | raw YAML string → validated `dict` | +| `build_menu(model_dict: dict)` | model dict → Streamlit widgets (side-effects) | +| `watch_parameters(model_dict, 
current_values)` | model dict + user values → validated dict + warnings | +| `run_model(model_dict, parameters)` | model dict + params → `list[dict]` (per-scenario results) | +| `generate_report(model_dict, results)` | model dict + results → HTML string | +| `save_as_pdf(html_content: str)` | HTML string → PDF bytes | +| `store_model_state(model_dict, parameters)` | model + params → persisted state (side-effects) | +| `save_current_model(model_dict, current_parameters)` | model + params → YAML string on disk | + +## Tasks + +### `validate_yaml(yaml_content: str) -> dict` + +- **Input:** Raw YAML document as a string (e.g., file contents read from disk or uploaded by the user). +- **Output:** A validated Python dictionary representing the model, or raises a descriptive error on failure. +- **Steps:** + 1. Parse the YAML string into a dictionary and validate its structure against the expected schema (required keys, value types, etc.). See the [SO reference on YAML validation in Python](https://stackoverflow.com/questions/3262569/validating-a-yaml-document-in-python/22231372#22231372) for schema-based approaches; if a full schema validator is too heavy, at minimum walk the parsed dictionary and check each required field manually. + 2. Ensure that any embedded Python code (e.g., figure snippets) is not malicious. Use the [CPython AST module](https://docs.python.org/3/library/ast.html) (`ast.walk`) to inspect the parse tree, whitelist allowed node types/names, and reject anything outside that set. + 3. Check equations for recursive references and determine a safe execution order (topological sort). + 4. Validate units for consistency (e.g., ensure monetary values are not mixed with proportions without explicit conversion). + +### `build_menu(model_dict: dict) -> None` + +- **Input:** Validated model dictionary (output of `validate_yaml`). +- **Output:** No return value; renders the Streamlit sidebar/parameter panel with appropriate input widgets for each parameter. 
+- **Steps:** + - Build input widgets from the `parameters` section. + - Populate values using the `default` fields, unless `current_parameters` are present in the model dictionary (in which case those values take precedence). + - Trigger guardrail warnings when parameter values approach `safe_min` / `safe_max` boundaries. + +### `watch_parameters(model_dict: dict, current_values: dict) -> dict` + +- **Input:** Validated model dictionary and a dictionary of current parameter values entered by the user. +- **Output:** A dictionary of validated parameter values, with warning messages attached for any values that fall outside `safe_min` / `safe_max` bounds. +- **Steps:** + - For each parameter, check that its current value lies within `[safe_min, safe_max]`. + - Return the validated values along with any triggered warnings. + +### `run_model(model_dict: dict, parameters: dict) -> list[dict]` + +- **Input:** Validated model dictionary and a validated parameter dictionary (output of `watch_parameters`). +- **Output:** A list of dictionaries, one per scenario column, where each dictionary maps equation/row names to computed numeric values. +- **Steps:** + - Validate scenario ranges and column definitions. + - For each scenario: + 1. Merge scenario-specific variable overrides into the base parameters. + 2. Evaluate equations in topologically-sorted order to produce row values. + - Return the list of per-scenario result dictionaries. + +### `generate_report(model_dict: dict, results: list[dict]) -> str` + +- **Input:** Validated model dictionary and the list of scenario results (output of `run_model`). +- **Output:** An HTML string representing the full rendered report. +- **Steps:** + - Process the Markdown report template, replacing `{{ equation:* }}`, `{{ table:* }}`, and `{{ figure:* }}` placeholders with computed values, formatted tables, and rendered figures respectively. 
+ - **Figures:** Consider using [Streamlit's built-in charting functions](https://docs.streamlit.io/develop/api-reference/charts) in preference to raw `matplotlib` calls, to simplify dependencies and keep the interface consistent with the Streamlit app. + - Render the final document as an HTML string. + +### `save_as_pdf(html_content: str) -> bytes` + +- **Input:** HTML string (output of `generate_report`). +- **Output:** PDF file as bytes, ready to be offered as a download. +- **Notes:** + - Since the app targets WASM (via `stlite`), dependencies that require native binaries (e.g., most headless-browser or Chromium-based libraries) are not available. + - Potential approaches to evaluate: + - Call a REST API service for HTML-to-PDF conversion. + - Emit an intermediate TeX document and use a TeX-to-PDF pipeline where available. + - Use browser-native APIs (e.g., `window.print()` / the `print` CSS media query) to trigger a client-side PDF save — this has been seen in production Streamlit apps and may be the most WASM-friendly option. + - ReportLab and similar direct-to-PDF Python libraries may require paid licenses or native extensions; evaluate licensing before adopting. + +### `store_model_state(model_dict: dict, parameters: dict) -> None` + +- **Input:** Validated model dictionary and the current parameter values. +- **Output:** No return value; persists the model state so the user can navigate back to it. +- **Notes:** + - `localStorage` (or `sessionStorage`) via Streamlit's JavaScript component API is the most likely mechanism in a WASM context, since there is no server-side filesystem. Investigate whether Streamlit exposes an API to this effect or whether a custom component is needed. + - The UI could surface this as a small history panel or list of previously visited states. -- Function to watch the iteractivity with the model parameters: - - Essentially to check the boundaries of `safe_min` and `safe_max` when the user makes changes. 
+### `save_current_model(model_dict: dict, current_parameters: dict) -> str` -- Function to run the model. - - Validate the ranges/breaks (how many cases.). Then for each case do: - 1. Compute the equations (for which you need to ensure that you are doing the proper order) -> generate values for the table - 2. This returns the tables as dictionaries. - - This returns a list of dictionaries (the columns for the table) +- **Input:** Validated model dictionary and the current parameter values. +- **Output:** A YAML string (the original model with the `current_parameters` section populated) saved to disk or offered as a file download. +- **Steps:** + - Merge the current parameter values into the model dictionary under `current_parameters`. + - Serialise back to a YAML string. + - Write to disk or trigger a browser download. -- Function to generate the report - - Write/process the markdown document (mostly the text attached to the yaml). - - Write the tables resulting from the calculations (inserting them into the `{{}}` placeholders). - - Generate the figures, also base on placeholders `{{}}` - - Render the report as an HTML file. +### Setup `stlite` framework -- Function to save as a pdf. +- Configure the project to run entirely in the browser using [`stlite`](https://stlite.net/). +- Verify that all dependencies (pure-Python or available as Pyodide wheels) are compatible with the WASM runtime. -- Functionality to temporarily store the models, so the user can go back. - - Could be a little window or list that shows some history or related. +## Other tasks (non-function) -- Function to save the current model: - - Take the current yaml file. - - Attach the current parameters. - - Save it as a yml file to the disk. 
+These are project-level tasks that need to be addressed but do not map directly to a single function: -- Setup the framework for running on the browser with [`stlite`](https://stlite.net/) +- **Branch protection rules:** Update the repository's branch protection rules to require all changes to be submitted via pull requests (no direct pushes to the main branch). +- **`AGENTS.md` file:** Draft a simple `AGENTS.md` file that describes the autonomous agents involved in the project, their roles, and the conventions they should follow. +- **GitHub Actions workflow — agent environment:** Create a GitHub Actions workflow that sets up the environment required by the agent (tools, credentials, runtime dependencies). +- **GitHub Actions workflow — CI testing:** Create a GitHub Actions workflow that installs project dependencies using [`uv`](https://github.com/astral-sh/uv) and runs the test suite on every push/PR. +- **Devcontainer environment:** Create a `.devcontainer` configuration (e.g., `devcontainer.json` + Dockerfile or feature list) so contributors can open the project in a fully configured, reproducible development container.
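Review note on the `validate_yaml` and `run_model` tasks in PATCH 4/4: the plan asks for three things that fit together. Equations must be checked for recursive references, they must be evaluated in topologically sorted order, and embedded expressions must be restricted to safe arithmetic. Below is a minimal sketch of how those pieces could work, assuming that `compute` strings contain only names, numeric literals, and `+ - * /`. The names `equation_order`, `run_scenarios`, and `_eval_node` are hypothetical and do not appear in the plan. The sketch also assumes the dictionary passed in is the content under the top-level `model:` key of the sample YAML. It relies on the standard-library `ast` and `graphlib` modules (`graphlib` requires Python 3.9 or later).

```python
import ast
import operator
from graphlib import CycleError, TopologicalSorter

# Assumed whitelist: only plain arithmetic operators are accepted.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def _eval_node(node, env):
    """Recursively evaluate a whitelisted arithmetic AST node against env."""
    if isinstance(node, ast.Expression):
        return _eval_node(node.body, env)
    if isinstance(node, ast.Constant):
        # Accept int/float literals only (bool is a subclass of int, so exclude it).
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    if isinstance(node, ast.Name):
        if node.id not in env:
            raise ValueError(f"Unknown name: {node.id}")
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left = _eval_node(node.left, env)
        right = _eval_node(node.right, env)
        return _BIN_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand, env))
    # Calls, attribute access, subscripts, etc. all end up here.
    raise ValueError(f"Disallowed expression element: {type(node).__name__}")


def equation_order(equations):
    """Return equation ids in dependency order; raise ValueError on cycles."""
    graph = {}
    for name, spec in equations.items():
        tree = ast.parse(spec["compute"], mode="eval")
        names = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
        # Only other equations are dependencies; parameters are leaves.
        graph[name] = names.intersection(equations)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"Circular equation reference: {exc.args[1]}") from exc


def run_scenarios(model, parameters):
    """Evaluate every equation once per scenario, returning one dict per scenario."""
    equations = model["equations"]
    order = equation_order(equations)
    trees = {n: ast.parse(equations[n]["compute"], mode="eval") for n in order}
    results = []
    for scenario in model["table"]["scenarios"]:
        # Scenario-specific overrides take precedence over base parameters.
        env = {**parameters, **scenario.get("vars", {})}
        row = {}
        for name in order:
            env[name] = row[name] = _eval_node(trees[name], env)
        results.append(row)
    return results


if __name__ == "__main__":
    # `eq_total` is a made-up formula; the plan elides its real definition.
    demo = {
        "equations": {
            "eq_total": {"compute": "eq_hosp + 1000"},
            "eq_hosp": {"compute": "n_cases * prop_hosp * cost_hosp"},
        },
        "table": {"scenarios": [{"id": "s_22", "vars": {"n_cases": 22}}]},
    }
    print(run_scenarios(demo, {"prop_hosp": 0.20, "cost_hosp": 31168}))
```

Evaluating the parsed tree with a small interpreter, rather than calling `eval` after a whitelist check, means a rejected node type can never run. This sketch does not cover the figure snippets (`py-code`), which contain statements and imports and would need a separate, broader policy. It also ignores the `output: "integer"` field; rounding could be applied to each row value after evaluation.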