submit: verify scheduler last-occurrence-wins for user-vs-config directive conflicts on real clusters #18

@ultimatile

Summary

The directive-hoisting logic added for hpc submit -s script.sh
emits user-supplied #SBATCH / #PJM directives after the
config-supplied ones in the rendered job-script prologue. When
the same option appears in both, the user's value takes effect
only if the scheduler resolves duplicate directives as "last
occurrence wins". The implementation relies on that behavior.

This assumption follows the usual option-parser convention and
is plausible for both sbatch and pjsub, but it has not been
empirically verified on actual clusters.
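
For reference, the convention being assumed can be seen in
Python's argparse, where a repeated option with the default
"store" action keeps the last value. This only illustrates the
convention; it says nothing about how sbatch or pjsub actually
parse duplicate directives, which is what this issue is meant to
check.

```python
import argparse

# Illustration only: a generic option parser, not sbatch or pjsub.
parser = argparse.ArgumentParser()
parser.add_argument("--partition")

# The same option is given twice; the default "store" action
# overwrites the earlier value with the later one.
args = parser.parse_args(["--partition=A", "--partition=B"])
print(args.partition)  # B
```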

What to verify

For each target scheduler, submit a job whose [<scheduler>.options]
config and user-script directive disagree on a non-trivial option
(e.g. partition / resource group, wall time, memory), and confirm
which value the scheduler honors.

  • Slurm:
    config has partition = "<A>",
    user script has #SBATCH --partition=<B>.
    Confirm via scontrol show job <jobid> that the running job
    uses <B>.
  • PJM: same idea with the equivalent option pair (e.g.
    [pjm].options has [["-L", "rscgrp=<A>"]],
    user script has #PJM -L "rscgrp=<B>").
    Confirm that the job runs in resource group <B>.
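
A minimal sketch of how the Slurm check could be scripted,
assuming the scontrol show job output contains a Partition=<name>
field. The helper names partition_from_scontrol and job_partition
are hypothetical and not part of this repository. No PJM
equivalent is sketched here.

```python
import re
import subprocess
from typing import Optional


def partition_from_scontrol(output: str) -> Optional[str]:
    """Extract the value of the Partition= field, or None if absent."""
    match = re.search(r"\bPartition=(\S+)", output)
    return match.group(1) if match else None


def job_partition(jobid: str) -> Optional[str]:
    """Run `scontrol show job <jobid>` and return the job's partition."""
    result = subprocess.run(
        ["scontrol", "show", "job", jobid],
        capture_output=True,
        text=True,
        check=True,
    )
    return partition_from_scontrol(result.stdout)
```

On a cluster, job_partition("<jobid>") returning <B> would
indicate that the user's directive won; returning <A> would
indicate first-occurrence-wins.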

Expected resolution

  • If both schedulers are last-occurrence-wins: implementation is
    correct as-is, close this issue.
  • If a scheduler is first-occurrence-wins on duplicates: invert
    the emit order in _render_job_script for that scheduler so
    user directives are emitted before config ones.
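
If the order does need to differ per scheduler, one way to
express it is a small policy table consulted at render time. This
is a sketch under assumptions: LAST_WINS and order_directives are
hypothetical names, and the real _render_job_script may be
structured differently.

```python
from typing import Dict, List

# Hypothetical policy table: True means the scheduler resolves
# duplicate directives as last-occurrence-wins. The values below
# are the current unverified assumption, not measured results.
LAST_WINS: Dict[str, bool] = {"slurm": True, "pjm": True}


def order_directives(
    scheduler: str, config: List[str], user: List[str]
) -> List[str]:
    """Place user directives where they take precedence on conflict."""
    if LAST_WINS[scheduler]:
        # Last occurrence wins: user directives go after config ones.
        return [*config, *user]
    # First occurrence wins: user directives go before config ones.
    return [*user, *config]
```

Flipping a single entry in the table to False would then invert
the emit order for that scheduler only.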

Why it's deferred

The fix that introduced the hoisting was scoped to a single PR; a
real-cluster probe was outside that scope and depends on having
access to the target environments.
