1. scripts/xgb_k_training.py — Replaced with v2
Feature alignment: trains on sv_era, sv_k_pct, sv_bb_pct, sv_whiff_pct (matches xgb_k_layer.py inference names exactly). Adds 4 previously-zero features: l3_ks, l3_ip, l5_ip, days_rest.
Recent-season weighting: 2026=4×, 2025=2×, 2024=1.5×, 2022-2023=1×. Historical data is no longer weighted equally — corrects for league K-rate drift since 2022.
opp_whiff added to hit model: Pitcher SwStr% was missing from hit model training data — now populated from FanGraphs.
Ledger-first training: Loads real PropIQ graded legs from bet_ledger as primary source (3× weight bonus over Statcast); falls back to pybaseball only when ledger has <500 rows. After 500+ K legs accumulate, model trains primarily on actual PropIQ outcomes.
DB persistence: Models saved to xgb_model_store (base64-encoded PKL) — survives Railway restarts.
New --status flag: python scripts/xgb_k_training.py --status shows Brier scores + blend recommendations.
2. update_blend_weights.py — New file (repo root)
Reads models/model_metrics.json after training and automatically patches xgb_k_layer.py blend weights based on actual Brier scores.
Usage: python update_blend_weights.py (preview) or python update_blend_weights.py --apply
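The schedule described above could be sketched as follows. The weight pairs (70/30, 80/20, 90/10, 95/5) come from this summary, but the Brier cutoff values and the function name below are illustrative placeholders, not the actual logic in update_blend_weights.py:

```python
# Hypothetical sketch of a Brier-score-driven blend schedule. The Brier
# cutoffs here are placeholder assumptions; only the weight pairs are
# taken from the PR summary.
def blend_weights_for_brier(brier: float) -> tuple[float, float]:
    """Map a model's Brier score to (xgb_weight, fallback_weight)."""
    if brier < 0.18:          # placeholder cutoff: very well calibrated
        return (0.95, 0.05)
    if brier < 0.20:          # placeholder cutoff
        return (0.90, 0.10)
    if brier < 0.22:          # placeholder cutoff
        return (0.80, 0.20)
    return (0.70, 0.30)       # conservative default blend

print(blend_weights_for_brier(0.19))  # → (0.9, 0.1)
```

Lower Brier means better calibration, so better-calibrated XGBoost models earn a larger share of the blend.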
3. marcel_layer.py — Replaced with production version
Replaces the stub (which was importing MarcelLayer, marcel_adjustment — classes that never existed — and returning 0.0).
New version implements Marcel regression-to-mean (Tango Tiger 2004):
get_marcel_k_rate(k_pct, season_bf) — pitcher K% regressed to league mean; 250 BF regression constant
get_marcel_hit_rate(avg, season_pa) — batter hit rate; 600 PA regression constant
get_marcel_xba(xba, season_pa) — xBA; 200 PA regression constant (stabilises faster)
enrich_prop_with_marcel(prop, hub) — top-level call that mutates sv_k_pct / sv_xba proportional to regression strength
Example: pitcher with 35% K-rate through 80 BF → Marcel regresses to ~27% (heavy). Same pitcher at 600 BF → ~27.5% (barely any change). In May (80-200 BF per starter), Marcel meaningfully corrects small-sample noise.
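The regression behind these helpers can be sketched as below. The 250 BF ballast matches the constant described above, but the league-average K% here is an assumed illustrative value (the real constant lives in marcel_layer.py), so the exact outputs differ from the example numbers quoted above:

```python
# Sketch of Marcel-style regression-to-mean for pitcher K%.
# LEAGUE_K_PCT is an assumed league-average rate, not the repo's constant.
LEAGUE_K_PCT = 22.0
K_REGRESSION_BF = 250  # batters faced of league-average "ballast"

def get_marcel_k_rate(k_pct: float, season_bf: int) -> float:
    """Blend observed K% with the league mean, weighted by sample size."""
    if season_bf <= 0:
        return LEAGUE_K_PCT
    return (k_pct * season_bf + LEAGUE_K_PCT * K_REGRESSION_BF) / (
        season_bf + K_REGRESSION_BF
    )

# Small samples are pulled hard toward the mean; large samples barely move.
print(round(get_marcel_k_rate(35.0, 80), 1))   # → 25.2 with this assumed mean
print(round(get_marcel_k_rate(35.0, 600), 1))  # → 31.2 with this assumed mean
```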
4. prop_enrichment_layer.py — Marcel call site updated
Old: _get_marcel_adj(player, prop_type, is_pitcher) → returned 0.0 (dead code)
New: enrich_prop_with_marcel(prop, hub) → mutates sv_k_pct/sv_xba directly
The adjusted sv_k_pct/sv_xba values flow into the XGBoost blend at inference time in tasklets.py. _get_marcel_adj stub retained as no-op for backward compat; _MARCEL_LAYER global removed.
Summary by cubic
Upgrades K-training to XGBoost v2, adds Marcel regression, and auto-tunes blend weights so inference adapts to measured model quality. Improves early-season stability, aligns training/inference features, and persists models across deploys.
New Features
XGBoost v2 training: recent-season weights (2026×4, 2025×2, 2024×1.5), fixed feature alignment (sv_era, sv_k_pct, sv_bb_pct, sv_whiff_pct), new features (l3_ks, l3_ip, l5_ip, days_rest), ledger-first training with 3× bonus, opp_whiff added to hit model, DB persistence to xgb_model_store, and --status for Brier + blend tips.
Data‑driven blend weights: update_blend_weights.py reads models/model_metrics.json and patches xgb_k_layer.py using Brier thresholds (70/30, 80/20, 90/10, 95/5); run with --apply to write.
Marcel regression (production): implements regression-to-mean for K% (250 BF), hit rate (600 PA), and xBA (200 PA); enrich_prop_with_marcel(prop, hub) mutates sv_k_pct/sv_xba; call site updated in prop_enrichment_layer.py (old stub kept as no-op).
Written for commit 19cfa4d. Summary will update on new commits.
We reviewed changes in 76f309b...19cfa4d on this pull request. Below is the summary for the review, and you can see the individual issues we found as inline review comments.
Unused argument 'hub'
An unused argument can lead to confusion and should be removed. If this variable is necessary, name it _ or start the name with unused or _unused.
Lambda may not be necessary
A lambda that calls a function without modifying any of its parameters is unnecessary. Python functions are first-class objects and can be passed around in the same way as the resulting lambda. It is recommended to remove the lambda and use the function directly.
Lambda may not be necessary
A lambda that calls a function without modifying any of its parameters is unnecessary. Python functions are first-class objects and can be passed around in the same way as the resulting lambda. It is recommended to remove the lambda and use the function directly.
Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.
Unused argument 'model_name'
An unused argument can lead to confusion and should be removed. If this variable is necessary, name it _ or start the name with unused or _unused.
Unused variable 'original'
An unused variable takes up space in the code and can lead to confusion; it should be removed. If this variable is necessary, name it _ to indicate it is intentionally unused, or start the name with unused or _unused.
Code Review
This pull request overhauls the Marcel projection system, transitioning from a standalone probability adjustment layer to a regression-to-the-mean mechanism that directly mutates input features for the XGBoost models. Key changes include the introduction of a new marcel_layer.py with sample-size-dependent regression formulas, updated training scripts for XGBoost models incorporating recency weighting, and a new utility to automatically update model blend weights based on Brier scores. Feedback focuses on a critical units mismatch where absolute rates are assigned to probability delta fields, potentially causing extreme probability distortions. Additionally, the review identifies a "double regression" logic error that over-penalizes extreme stats, a discrepancy between the implementation and documentation of weighted historical averages, and risks associated with using truthy fallbacks for zero-count statistics.
This assignment introduces a critical units mismatch. _marcel_adj was previously a probability delta (e.g., ±0.018), but it is now being set to an absolute projected rate (e.g., 25.0). The consumer in tasklets.py (_BaseAgent._model_prob) multiplies this value by 100.0 and adds it to the win probability, which will result in massive adjustments (e.g., +2500pp) and peg almost all probabilities to the 95% cap. Since Marcel influence is now primarily handled via feature mutation (sv_k_pct/sv_xba), this legacy nudge should either be removed or converted back to a small probability delta.
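A quick numeric illustration of the mismatch. The function below paraphrases the consumer described in the comment (the actual _model_prob code in tasklets.py may differ in names and scale):

```python
# Illustrates the units mismatch: the consumer multiplies _marcel_adj by
# 100.0 before adding it to the (capped) win probability.
PROB_CAP = 95.0  # win probability cap, in percentage points (assumed scale)

def apply_marcel_nudge(win_prob_pct: float, marcel_adj: float) -> float:
    """Paraphrase of the consumer: win_prob + marcel_adj * 100, then cap."""
    return min(PROB_CAP, win_prob_pct + marcel_adj * 100.0)

print(apply_marcel_nudge(55.0, 0.018))  # → 56.8: a probability delta is sane
print(apply_marcel_nudge(55.0, 25.0))   # → 95.0: an absolute K% pegs the cap
```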
The _weighted_hist function implementation does not match its docstring. The docstring describes a "Three-year weighted average (current season × 5, prev × 4, prev-prev × 3)", but the code only uses the first two weights (weights[0] and weights[1]), effectively performing a 2-point weighted average. If a 3-year system is intended, the logic needs to handle a list of historical values or a pre-weighted historical aggregate.
Using the or operator for fallbacks here is risky because a valid 0.0 rate (e.g., a pitcher with zero strikeouts in a very small early-season sample) will be treated as falsy and overwritten by the league average. This unintentionally biases the regression toward the mean for players with zero-count stats.
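The pitfall can be demonstrated in two lines (variable names here are illustrative, not from the repo):

```python
# A genuine 0.0 rate is falsy, so `or` silently replaces it with the
# league average; an explicit None check preserves it.
league_avg_k_pct = 22.0
observed_k_pct = 0.0  # e.g., zero strikeouts in a tiny early-season sample

bad = observed_k_pct or league_avg_k_pct
good = observed_k_pct if observed_k_pct is not None else league_avg_k_pct

print(bad)   # → 22.0: the real zero is lost
print(good)  # → 0.0: the zero survives
```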
This logic implements a form of "double regression" that may be mathematically unsound. marcel_k is already a regressed value (it has been pulled toward the league mean based on sample size inside get_marcel_k_rate). Blending it again with raw_k_pct using regression_strength (which is also sample-size dependent) applies the regression penalty twice, resulting in an overly conservative estimate that is pulled too hard toward the league average.
# Use the Marcel projection directly as it already incorporates regression
if abs(marcel_k - raw_k_pct) > 1.5:
    prop["sv_k_pct"] = round(marcel_k, 2)
    logger.debug(
        "[Marcel] K-rate: raw=%.1f%% Marcel=%.1f%% (BF=%d)",
        raw_k_pct, marcel_k, season_bf
    )
While sv_k_pct and sv_xba are mutated with their regressed values, sv_whiff_pct (or sv_swstr_pct) is not. Since whiff rate is a key feature in the XGBoost model and is highly susceptible to small-sample noise early in the season, it should also be regressed and mutated to ensure consistency across the feature vector.
PR #566: XGBoost v2 training + Marcel regression + data-driven blend weights