Silent Celery Beat Fallback Hides Missing Schedule Config #215

@leostar0412

Problem

get_beat_schedule() in boost_collector_runner/schedule_config.py returns an empty dict when the YAML schedule file is missing or unparseable, logging only a warning. The system starts, serves health checks, and runs zero collections.

More critically, when a collector within a group fails, the runner exits the entire group: downstream collectors in the same group are skipped for that cycle with no log entry. In a domain where silent data gaps are the primary operational fear, this fail-open behavior means an operator may not realize collections have stopped until someone queries a report and finds missing data.

Acceptance Criteria

  • Change get_beat_schedule() to raise a clear error (or emit log.error + set an unhealthy flag) when the YAML file is missing or unparseable in production mode, rather than returning {} with only a warning
  • Add per-collector outcome logging in the group runner: when a collector is skipped due to a predecessor's failure, log a warning with the skipped collector's name and the reason
  • Add a startup health check that verifies the schedule YAML is present and parseable, and includes the loaded schedule summary in startup logs
  • Add tests: (a) missing YAML raises/logs appropriately; (b) group-exit logs the skipped collectors
  • Consider adding a --strict flag to run_scheduled_collectors that fails hard on any schedule misconfiguration
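
A minimal sketch of the fail-closed loading behavior described in the first and third criteria. This is not the existing get_beat_schedule() code: the names load_beat_schedule, ScheduleConfigError, the strict parameter, and the injectable parser are all hypothetical, and treating an empty schedule as an error in strict mode is an assumption rather than something the issue specifies.

```python
import logging
from pathlib import Path
from typing import Any, Callable

log = logging.getLogger("schedule_config")


class ScheduleConfigError(RuntimeError):
    """Hypothetical error raised when the schedule cannot be loaded in strict mode."""


def _parse_yaml(text: str) -> Any:
    # PyYAML is imported lazily so this module still imports without it installed.
    import yaml

    return yaml.safe_load(text)


def load_beat_schedule(
    path: str,
    strict: bool,
    parser: Callable[[str], Any] = _parse_yaml,
) -> dict:
    """Load the schedule; raise in strict mode instead of returning {}."""
    try:
        data = parser(Path(path).read_text(encoding="utf-8"))
        # Assumption: an empty or non-mapping file is also a misconfiguration.
        if not isinstance(data, dict) or not data:
            raise ValueError("schedule file is empty or not a mapping")
    except Exception as exc:  # missing file, permission error, parse error
        message = f"cannot load beat schedule from {path}: {exc}"
        if strict:
            log.error(message)
            raise ScheduleConfigError(message) from exc
        # Non-strict mode keeps the current fail-open behavior.
        log.warning("%s; no collections will run", message)
        return {}
    # Startup summary so operators can see what was actually loaded.
    log.info("loaded beat schedule with %d entries: %s", len(data), list(data))
    return data
```

In this sketch, strict=True would correspond to production mode or a --strict flag, so that a missing file stops startup instead of producing a healthy-looking process that collects nothing.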

Implementation Notes

The get_beat_schedule() function is at schedule_config.py:381-404. The group execution logic is in the collector runner's task dispatch. The simplest fix for the group-exit issue is to wrap each collector invocation in its own try/except block and continue to the next collector on failure, logging the error. This changes the semantics from "fail-fast group" to "best-effort group" — document the behavioral change. If fail-fast is intentional for some groups, add a fail_strategy: fast|continue option per group in the YAML schema.
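
The per-collector try/except approach and the fail_strategy option could look roughly like the sketch below. It is written under assumptions: run_group, the (name, callable) collector pairs, and the outcome strings are hypothetical and do not reflect the actual runner's task dispatch code.

```python
import logging
from typing import Callable, Dict, List, Optional, Tuple

log = logging.getLogger("collector_runner")


def run_group(
    group_name: str,
    collectors: List[Tuple[str, Callable[[], None]]],
    fail_strategy: str = "continue",
) -> Dict[str, str]:
    """Run collectors in order and return an outcome per collector name."""
    if fail_strategy not in ("fast", "continue"):
        raise ValueError(f"unknown fail_strategy: {fail_strategy!r}")

    outcomes: Dict[str, str] = {}
    failed_by: Optional[str] = None  # set only when a fail-fast group has failed

    for name, collect in collectors:
        if failed_by is not None:
            # Fail-fast: still record and log every collector that did not run.
            log.warning(
                "group %s: skipping collector %s because %s failed",
                group_name, name, failed_by,
            )
            outcomes[name] = "skipped"
            continue
        try:
            collect()
        except Exception:
            # log.exception includes the traceback of the failing collector.
            log.exception("group %s: collector %s failed", group_name, name)
            outcomes[name] = "failed"
            if fail_strategy == "fast":
                failed_by = name
        else:
            outcomes[name] = "ok"
    return outcomes
```

With fail_strategy="continue" a failure is logged and the next collector still runs; with "fast" the remaining collectors are skipped, but each skip now produces a warning naming the collector and the predecessor that failed.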
