restart notify, probe_success_rate, diff sort by henry0816191 · Pull Request #44 · cppalliance/paperscout

henry0816191 · 2026-05-20T17:39:47Z

Summary

Restart-aware seed notifications — On a cold first deploy, the first poll after seed() stays silent (no notify_callback). On restart (existing last_poll or discovered URLs), recent probe hits from the seed run_cycle() are filtered, matched via watchlists, and passed to notify_callback on the first poll. Redundant mark_discovered in seed() was removed; ISOProber.run_cycle() already persists hits with last_modified.
Health JSON — Renamed probe_hit_rate → probe_success_rate and fixed the ratio to (hit_recent + hit_old + hit_no_lm) / (hits + miss + error), excluding skipped URLs.
Diff ordering — updated_papers in diff_snapshots() are now sorted like new_papers (date descending, then id).

Changes

monitor.py
SeedResult; seed() captures had_prior_state before work; first poll_once() branch for restart notify; shared sort key for new/updated papers
main.py
probe_success_rate in /health extra fields
tests/test_monitor.py
Cold start vs restart notify tests; updated_papers sort test; test_seed_marks_discovered updated for run_cycle-only discovery
tests/test_health.py
Fixture key renamed to probe_success_rate

Breaking change (operators): /health JSON field probe_hit_rate is now probe_success_rate with a corrected definition. Update any dashboards or alerts that read the old name.

Test plan

uv run pytest tests/test_monitor.py tests/test_health.py -q (50 passed)
uv run pytest tests/ -q (318 passed, 2 skipped)
Manual: restart with populated DB (last_poll / discovered_urls); confirm first poll notifies only for recent seed probe hits, not cold deploy noise

Related issues

close #30

Summary by CodeRabbit

Improvements
- Health endpoint now reports probe_success_rate (hits/attempted) instead of probe_hit_rate.
- Initial scheduler run now provides recent probe data when prior state exists, improving visibility after restarts.
- Updated sorting for detected/updated items to be deterministic by date then ID.
Tests
- Tests updated/added to validate the new health metric, seeding/initial-run behavior, sorting, and notification logic.

coderabbitai · 2026-05-20T17:39:58Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 89807f73-7ceb-4228-a02b-989150acbfc4

📥 Commits

Reviewing files that changed from the base of the PR and between 4457889 and 0caf394.

📒 Files selected for processing (1)

src/paperscout/monitor.py

📝 Walkthrough

Walkthrough

Refactors scheduler seeding to return probe hits plus a prior-state flag and conditionally emit seed notifications on restart; renames health metric to probe_success_rate; ensures deterministic ordering for both new and updated paper snapshots.

Changes

Seed restart notifications, metrics, and ordering bugfix bundle

Layer / File(s)	Summary
SeedResult data contract and seed() signature `src/paperscout/monitor.py`	New `SeedResult` dataclass carries `probe_hits` list and `had_prior_state` flag. `Scheduler.seed()` signature changes to `async def seed() -> SeedResult`, with `had_prior_state` computed from existing poll or discovered URLs before seeding begins.
Scheduler.seed() implementation refactor `src/paperscout/monitor.py`	`seed()` now captures ISO probe hits into a local list, completes seeding, logs seed outcome including `had_prior_state`, and returns `SeedResult(probe_hits=..., had_prior_state=...)` instead of only internal state mutation.
Poll-once first-run notification conditionals `src/paperscout/monitor.py`	`poll_once()` on first run now calls `seed()` and conditionally emits notifications: if `had_prior_state=true`, recent probe hits are filtered, per-user watchlist matches are computed via blocking DB call, and `notify_callback` is invoked; if `had_prior_state=false`, an empty `PollResult` is returned silently (cold start).
Paper snapshot ordering and health metric naming `src/paperscout/monitor.py`, `src/paperscout/__main__.py`	`diff_snapshots` applies stable sort by date descending then id to both `new_papers` and `updated_papers`. Health endpoint renames `probe_hit_rate` to `probe_success_rate`, computing hits from scheduler stats and setting the rate to `hits / attempted` when nonzero (else `None`).
Test validation for seeding, notifications, ordering, and metrics `tests/test_monitor.py`, `tests/test_health.py`	Adds tests for cold-start silent behavior, restart notification emissions filtering old hits, seed result composition, `updated_papers` ordering by date descending, and health field `probe_success_rate`. Updates `_make_scheduler` fixture to mock `prober.snapshot_stats`.

Sequence Diagram(s)

sequenceDiagram
  participant PollOnce
  participant Scheduler.seed
  participant ProbeState
  participant DB
  participant notify_callback
  PollOnce->>Scheduler.seed: call seed()
  Scheduler.seed->>ProbeState: read last_poll & discovered_present?
  Scheduler.seed->>Scheduler.seed: prober.run_cycle() -> probe_hits
  Scheduler.seed-->>PollOnce: return SeedResult(probe_hits, had_prior_state)
  alt had_prior_state == false
    PollOnce-->>PollOnce: return empty PollResult (cold start)
  else had_prior_state == true
    PollOnce->>DB: matches_for_users(recent_hits) (blocking)
    PollOnce->>notify_callback: invoke with PollResult(probe_hits=recent_hits, per_user_matches=...)
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

cppalliance/paperscout#29: Both PRs modify the scheduler/health "extra fields" logic derived from scheduler stats; this PR renames the metric to probe_success_rate and updates tests.

Suggested labels

bug

Suggested reviewers

wpak-ai

Poem

🐰 A seed now blooms with facts in tow,
Did we have state before? Now we know!
Fresh notifs spring when paths are known,
While papers sort in steady tone.
Metrics gleam with clearer rate—first-run silence: smart and great!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 18.18% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title concisely references the three main changes: restart notification behavior, probe_success_rate metric rename, and diff ordering for updated papers.
Description check	✅ Passed	The description covers all required template sections, including a detailed summary of changes, comprehensive test plan with results, and a link to the related issue `#30`.
Linked Issues check	✅ Passed	All acceptance criteria from issue `#30` are met: seed() returns SeedResult with had_prior_state; poll_once() implements restart notify logic; probe_success_rate metric is corrected; updated_papers are now sorted; comprehensive tests added.
Out of Scope Changes check	✅ Passed	All changes are directly aligned with issue `#30` objectives: seed notification behavior, health JSON metric correction, and diff ordering. No unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch bugfix/seed-notify-health-diff

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/paperscout/monitor.py`:
- Around line 170-173: The code updates `_last_successful_poll` and
`_last_probe_stats` immediately after `seed_result = await self.seed()` which
can mark a poll successful even if subsequent steps like
`self.matches_for_users(...)` or `self.notify_callback(...)` fail; move the
bookkeeping (assignments to `_last_successful_poll` and `_last_probe_stats`) to
the very end of the restart-first-poll path—i.e., after `matches_for_users` and
`notify_callback` have completed without error—to ensure only fully completed
polls are recorded; apply the same relocation in the other restart-first-poll
occurrences (the blocks around the other two sites referenced) so all three code
paths set those fields only after successful notification and matching.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 76ad56dd-a2d4-4f5e-9aa2-0012f80529c1

📥 Commits

Reviewing files that changed from the base of the PR and between dc319cd and 4457889.

📒 Files selected for processing (4)

src/paperscout/__main__.py
src/paperscout/monitor.py
tests/test_health.py
tests/test_monitor.py

seed/restart notify, probe_success_rate, diff sort

fde4d60

henry0816191 self-assigned this May 20, 2026

henry0816191 requested a review from wpak-ai as a code owner May 20, 2026 17:39

fixed lint errors

4457889

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

Comment thread src/paperscout/monitor.py

addressed ai review

0caf394

wpak-ai approved these changes May 20, 2026

View reviewed changes

wpak-ai merged commit d4e5053 into develop May 20, 2026
10 checks passed

wpak-ai deleted the bugfix/seed-notify-health-diff branch May 20, 2026 19:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

restart notify, probe_success_rate, diff sort#44

restart notify, probe_success_rate, diff sort#44
wpak-ai merged 3 commits into
developfrom
bugfix/seed-notify-health-diff

henry0816191 commented May 20, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 20, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

henry0816191 commented May 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Test plan

Related issues

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

henry0816191 commented May 20, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 20, 2026 •

edited

Loading