fix: HyperparameterTuner to pass enable_managed_spot_training flag to training jobs #5597

Open
toddstep wants to merge 1 commit into aws:master from toddstep:fix_hypertuner_managed_spot_350

toddstep commented Mar 4, 2026

HyperparameterTuner._build_training_job_definition() does not copy the parameters needed for managed spot training:

  • enable_managed_spot_training
  • max_wait_time_in_seconds

As a result, the tuner's training jobs launch on on-demand instances even when managed spot training is requested.

Issue #5584

Description of changes:

  • Include the additional parameters in the job definition
  • Add a unit test
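
For readers unfamiliar with the failure mode, here is a minimal standalone sketch of the pattern being fixed. It is not the SDK's code: `TrainerConfig` and `build_training_job_definition` are hypothetical names invented for illustration. Only the output keys (`EnableManagedSpotTraining`, `StoppingCondition.MaxWaitTimeInSeconds`) are meant to follow the shape of the SageMaker `CreateHyperParameterTuningJob` training job definition.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TrainerConfig:
    """Hypothetical stand-in for the settings the tuner reads from a trainer."""

    max_runtime_in_seconds: int = 3600
    enable_managed_spot_training: Optional[bool] = None
    max_wait_time_in_seconds: Optional[int] = None


def build_training_job_definition(config: TrainerConfig) -> Dict[str, Any]:
    """Build a simplified job definition, carrying over the spot settings.

    Hypothetical helper: a version that omitted the two optional copies
    below would silently produce an on-demand job definition.
    """
    stopping: Dict[str, Any] = {
        "MaxRuntimeInSeconds": config.max_runtime_in_seconds,
    }
    # Copy the spot wait time only when the caller set it.
    if config.max_wait_time_in_seconds is not None:
        stopping["MaxWaitTimeInSeconds"] = config.max_wait_time_in_seconds

    definition: Dict[str, Any] = {"StoppingCondition": stopping}
    # Copy the spot flag only when the caller set it.
    if config.enable_managed_spot_training is not None:
        definition["EnableManagedSpotTraining"] = (
            config.enable_managed_spot_training
        )
    return definition


spot_definition = build_training_job_definition(
    TrainerConfig(
        max_runtime_in_seconds=3600,
        enable_managed_spot_training=True,
        max_wait_time_in_seconds=7200,
    )
)
default_definition = build_training_job_definition(TrainerConfig())
```

A unit test in the same spirit as the one this PR adds could assert that `spot_definition` contains both spot fields and that `default_definition` omits them, so unset values never reach the request.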

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…ot parameters

HyperparameterTuner._build_training_job_definition() was not copying
parameters needed for managed spot training:
  - enable_managed_spot_training
  - max_wait_time_in_seconds
This caused training jobs to launch with on-demand instances.

- Include the additional parameters in the job definition
- Add a unit test