Merged
2 changes: 1 addition & 1 deletion orchestrator.py
@@ -722,7 +722,7 @@ def job_calibrate_model():
job_bug_checker,
CronTrigger(hour=10, minute=0, timezone="America/Los_Angeles"),
id="bug_checker",
-misfire_grace_time=0, # never fire if restarted after 10 AM — prevents 8 PM surprises
+misfire_grace_time=1, # skip if >1s late (effectively skip on restart) — prevents 8 PM surprises

medium

Setting misfire_grace_time=1 is extremely restrictive. A 1-second window is susceptible to minor system jitter or event-loop delays, which could cause this daily health check to be skipped even during normal operation. Since the goal is to prevent runs that are hours late (e.g., "8 PM surprises"), a more robust value such as 30 or 60 would be safer while still skipping the job if the service was down for a significant period. Note that the global job_defaults already sets a 30-second grace time (line 93), so you could also remove this parameter and rely on that default.

Suggested change
-misfire_grace_time=1, # skip if >1s late (effectively skip on restart) — prevents 8 PM surprises
+misfire_grace_time=30, # skip if >30s late (prevents stale runs on restart) — prevents 8 PM surprises
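
To make the trade-off concrete, here is a minimal sketch of the lateness check in plain Python. It is a simplified model of the rule described above (skip a run that starts more than the grace time after its scheduled time), not the scheduler library's actual code. The helper `would_skip` and the timestamps are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def would_skip(scheduled: datetime, started: datetime,
               grace_seconds: Optional[int]) -> bool:
    """Return True if a run this late would be treated as a misfire.

    Simplified model (an assumption, not the library's code): a run is
    skipped when it starts more than `grace_seconds` after its scheduled
    time; None means "never skip".
    """
    if grace_seconds is None:
        return False
    return (started - scheduled) > timedelta(seconds=grace_seconds)


scheduled = datetime(2024, 1, 1, 10, 0, 0)    # hypothetical 10:00 AM run
jittery = scheduled + timedelta(seconds=2)    # normal run with 2 s of jitter
restarted = scheduled + timedelta(hours=10)   # service comes back at 8 PM

for grace in (1, 30):
    print(grace,
          would_skip(scheduled, jittery, grace),
          would_skip(scheduled, restarted, grace))
```

Under this model, a grace time of 1 skips both the jittery run and the 8 PM run, while a grace time of 30 keeps the jittery run and still skips the 8 PM one.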

coalesce=True, # fire once max even if multiple misfires stacked
)
