Summary
JOB_TEMPLATE in src/hpc/job.py injects scheduler-bookkeeping
output and error directives in Slurm form
(--output=..., --error=...) for both Slurm and PJM jobs.
PJM (pjsub) uses -o <path> and -e <path> instead, so the
emitted lines are not valid PJM directives.
Repro
Set [cluster] scheduler = "pjm" in hpc.toml, run
hpc submit "echo hi", and inspect the rendered job script
(uploaded under <workdir>/.hpc/runs/<run_id>/job.sh):
#PJM -L node=12
...
#PJM --output=/.../job-%j.out
#PJM --error=/.../job-%j.err
The last two lines are Slurm-style; PJM's pjsub does not honor
them. Job stdout / stderr therefore land at PJM's default
locations rather than under
<workdir>/.hpc/runs/<run_id>/job-<job_id>.out, which breaks
hpc job-output <run_id> (it reads from the hpc-tracked path).
Expected
For PJM, the bookkeeping lines should be emitted as #PJM -o <path>
and #PJM -e <path>, with the job-id placeholder in the path
adjusted to PJM conventions. %j is the Slurm job-id token; PJM
accepts %j in some configurations, but its job-id substitution
differs across PJM versions, so the correct token needs
investigation.
Notes
- This was identified during research for a separate issue but kept
out of scope to keep that fix minimal.
- The fix likely needs a per-scheduler hook for emitting output /
error directives, since the syntax and substitution conventions
diverge.
- hpc job-output resolution
(get_job_output in src/hpc/job.py) assumes the hpc-controlled
path layout, so this needs to stay aligned with whatever PJM
actually writes.
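
One possible shape for the per-scheduler hook mentioned in the notes above is sketched here. This is a minimal illustration, not the existing code in src/hpc/job.py: the names OUTPUT_DIRECTIVE_HOOKS, slurm_output_directives, pjm_output_directives and render_output_directives are hypothetical, and the #SBATCH prefix for the Slurm case is assumed. The job-id placeholder is deliberately left to the caller, since the correct PJM token is still an open question.

```python
from typing import Callable, Dict, List

# Hypothetical hook signature: (out_path, err_path) -> directive lines.
DirectiveHook = Callable[[str, str], List[str]]


def slurm_output_directives(out_path: str, err_path: str) -> List[str]:
    """Slurm long-option form (the style the template emits today)."""
    return [f"#SBATCH --output={out_path}", f"#SBATCH --error={err_path}"]


def pjm_output_directives(out_path: str, err_path: str) -> List[str]:
    """PJM short-option form, as described in the Expected section."""
    return [f"#PJM -o {out_path}", f"#PJM -e {err_path}"]


# Hypothetical registry keyed by the [cluster] scheduler value in hpc.toml.
OUTPUT_DIRECTIVE_HOOKS: Dict[str, DirectiveHook] = {
    "slurm": slurm_output_directives,
    "pjm": pjm_output_directives,
}


def render_output_directives(scheduler: str, out_path: str, err_path: str) -> List[str]:
    """Return the output/error directive lines for the given scheduler.

    The paths are passed through unchanged; any job-id placeholder they
    contain must already be in the form the target scheduler expects.
    """
    try:
        hook = OUTPUT_DIRECTIVE_HOOKS[scheduler]
    except KeyError:
        raise ValueError(f"unsupported scheduler: {scheduler!r}") from None
    return hook(out_path, err_path)
```

With a registry like this, JOB_TEMPLATE would no longer hard-code the directive syntax, and get_job_output could derive its expected path from the same inputs so the two stay aligned.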