babs-status: replace pandas with dataclasses#359
asmacdo wants to merge 2 commits into PennLINC:main from
Conversation
Force-pushed from 563f239 to d27680b.
> sources every run. The CSV persists two bits of state destroyed by later operations: `submitted` (the scheduler forgets completed jobs) and `has_results` (branches are deleted after merge). Everything else is a convenience artifact.
> Is this the right contract, or should we rethink?
Per discussion, the question boils down to whether this constitutes some form of "interface", and thus an "interface data model", which we should not change.
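To make the contract concrete, here is a minimal sketch of what persisting only those two bits of state looks like with the stdlib `csv` module. The column names are illustrative assumptions, not the actual `job_status.csv` schema:

```python
import csv
import io

# Only `submitted` and `has_results` are true state; everything else in
# the CSV can be recomputed from the scheduler and the branches.
rows = [
    {'sub_id': 'sub-01', 'submitted': 'True', 'has_results': 'True'},
    {'sub_id': 'sub-02', 'submitted': 'True', 'has_results': 'False'},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=['sub_id', 'submitted', 'has_results'])
writer.writeheader()
writer.writerows(rows)

# Reading the file back recovers exactly the persisted contract.
restored = list(csv.DictReader(io.StringIO(buf.getvalue())))
assert restored == rows
```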
Force-pushed from 4714f37 to 9ee19ab.
Codecov Report ❌ Patch coverage is
Additional details and impacted files:

```
@@ Coverage Diff @@
##             main     #359      +/-   ##
==========================================
+ Coverage   79.36%   79.51%   +0.15%
==========================================
  Files          16       17       +1
  Lines        1846     1953     +107
  Branches      311      330      +19
==========================================
+ Hits         1465     1553      +88
- Misses        267      279      +12
- Partials      114      121       +7
```
Drop pandas from the status path. DataFrames become dataclasses + dicts, the csv module replaces pandas I/O, magic strings become enums, and is_failed is derived in one place (a property) instead of three.

Key changes:
- New babs/status.py: JobStatus dataclass, SchedulerState enum, CSV read/write, update_from_branches, update_from_scheduler
- run_squeue returns raw text; parsing moves to status.py
- run_squeue job_id is now required (prevents accidental full-queue queries and crashes from non-array jobs)
- Strict validation of squeue array job format (jobid_taskid)
- Uses dataclasses.replace to reduce boilerplate in update functions

Bug fix: a completed job with results is no longer counted as failed when squeue is empty (the ghost-job dtype bug).

CSV format is unchanged — no migration needed for existing projects.

See design/status-without-pandas.md for full rationale and data flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
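A minimal sketch of the dataclass + enum shape described above, assuming illustrative field names and states (see `babs/status.py` for the real definitions). The point is that `submitted` and `is_failed` become derived properties rather than stored columns, and `dataclasses.replace` handles updates:

```python
from dataclasses import dataclass, replace
from enum import Enum


class SchedulerState(Enum):
    """Illustrative scheduler states; the real enum lives in babs/status.py."""
    NOT_SUBMITTED = 'not_submitted'
    PENDING = 'PD'
    RUNNING = 'R'
    COMPLETED = 'completed'


@dataclass(frozen=True)
class JobStatus:
    sub_id: str
    job_id: int = -1
    state: SchedulerState = SchedulerState.NOT_SUBMITTED
    has_results: bool = False

    @property
    def submitted(self) -> bool:
        # Derived, not stored: any real job_id means the job was submitted.
        return self.job_id >= 0

    @property
    def is_failed(self) -> bool:
        # Derived in one place: submitted, gone from the queue, no results.
        return (
            self.submitted
            and self.state not in (SchedulerState.PENDING, SchedulerState.RUNNING)
            and not self.has_results
        )


# dataclasses.replace keeps update functions terse and immutable:
job = JobStatus(sub_id='sub-01', job_id=1234, state=SchedulerState.COMPLETED)
done = replace(job, has_results=True)
```

With this shape, the ghost-job bug cannot recur: `is_failed` is computed from the same fields everywhere, instead of three separately maintained copies drifting apart.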
Force-pushed from 9ee19ab to 4f9e308.
test_babs_init_raw_bids[subject] often exceeded the default 300s timeout in CI. Increase to 450s. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
One additional manual test we did: on a dataset with 2 subjects, we checked the final CSV state:
Issues raised by Codex review:

1) High: Removing all inclusion rows can leave a stale `job_status.csv` on disk
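One way to close this gap, sketched with illustrative column names (the real schema is in `babs/status.py`): always rewrite the file, even with zero rows, so that removing every inclusion row truncates the old file instead of leaving stale state behind.

```python
import csv
import tempfile
from pathlib import Path

# Columns are illustrative assumptions, not the actual job_status.csv schema.
FIELDNAMES = ['sub_id', 'submitted', 'has_results']


def write_job_status(path: Path, rows: list[dict]) -> None:
    """Always rewrite the CSV, even with zero rows, so removing every
    inclusion row leaves a header-only file rather than stale state."""
    with path.open('w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)


# An empty row list still truncates whatever was on disk before:
csv_path = Path(tempfile.mkdtemp()) / 'job_status.csv'
write_job_status(csv_path, [])
```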
2) Medium: `ses_id` truthiness checks can crash or mis-key on NA values. Several places use `if ses_id` to decide whether the key is subject-level or session-level:
```python
ses_id = row.get('ses_id') if 'ses_id' in merged_zip_df.columns else None
key = (row['sub_id'], ses_id) if ses_id else (row['sub_id'],)
if key in statuses:
    statuses[key] = replace(statuses[key], has_results=True)
3) Medium: New strict scheduler-state parsing can break on valid non-covered Slurm states
Backward-compat angle:
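One tolerant-parsing option, sketched with an illustrative enum (the real one is in `babs/status.py`): map any valid but unmodeled Slurm state code to a catch-all member instead of raising, so new or rare states (CG, S, ST, ...) degrade gracefully.

```python
from enum import Enum


class SchedulerState(Enum):
    PENDING = 'PD'
    RUNNING = 'R'
    OTHER = 'other'  # catch-all for valid but uncovered Slurm state codes


def parse_state(code: str) -> SchedulerState:
    """Map a squeue state code to the enum, tolerating codes we do not
    model explicitly rather than raising on them."""
    try:
        return SchedulerState(code)
    except ValueError:
        return SchedulerState.OTHER
```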
4) Low: Internal helper API compatibility changed. The `report_job_status` signature changed from DataFrame-based inputs to a dict-based status model:
CLI behavior is updated, but downstream code importing `babs.scheduler.report_job_status` directly may break.
Claude-assisted design doc in this PR: see design/status-without-pandas.md for full context and planning.
- no changes to `job_status.csv` (see more in design doc on scope). TODO: how badly do we want to keep it as is?
- `pandas` is still in requirements, as it is used for other components
- changed:
Similar previous effort
Drop DataFrames from the babs status pipeline. Job state is now represented by a SchedulerState enum and a JobStatus dataclass, with is_failed and submitted as derived properties instead of stored values.
TODOs: