
PR #566: XGBoost v2 training + Marcel regression + data-driven blend weights #435

Merged

jaayslaughter-cpu merged 1 commit into main from pr-566-xgb-v2-marcel on May 15, 2026
Conversation

@jaayslaughter-cpu (Owner) commented May 15, 2026

PR #566: XGBoost v2 training + Marcel regression + data-driven blend weights

4 changes

1. scripts/xgb_k_training.py — Replaced with v2

  • Recent-season weighting: 2026=4×, 2025=2×, 2024=1.5×, 2022-2023=1×. Historical data no longer weighted equally — corrects for league K-rate drift since 2022.
  • Feature alignment fixed: Training now uses sv_era, sv_k_pct, sv_bb_pct, sv_whiff_pct (matches xgb_k_layer.py inference names exactly). Adds 4 previously-zero features: l3_ks, l3_ip, l5_ip, days_rest.
  • opp_whiff added to hit model: Pitcher SwStr% was missing from hit model training data — now populated from FanGraphs.
  • Ledger-first training: Loads real PropIQ graded legs from bet_ledger as primary source (3× weight bonus over Statcast); falls back to pybaseball only when ledger has <500 rows. After 500+ K legs accumulate, model trains primarily on actual PropIQ outcomes.
  • DB persistence: Models saved to xgb_model_store (base64-encoded PKL) — survives Railway restarts.
  • --status flag: python scripts/xgb_k_training.py --status shows Brier scores + blend recommendations.
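
The recency and ledger weighting above can be sketched as per-row sample weights fed to XGBoost. This is a minimal sketch under stated assumptions — the constant names and the `build_sample_weights` helper are illustrative, not the PR's actual code:

```python
import numpy as np

# Season multipliers and ledger bonus as described in the PR text.
SEASON_WEIGHTS = {2026: 4.0, 2025: 2.0, 2024: 1.5, 2023: 1.0, 2022: 1.0}
LEDGER_BONUS = 3.0  # graded bet_ledger legs count 3x vs Statcast rows

def build_sample_weights(seasons, from_ledger):
    """Per-row training weight = season recency multiplier x ledger bonus."""
    w = np.array([SEASON_WEIGHTS.get(s, 1.0) for s in seasons], dtype=float)
    bonus = np.where(np.asarray(from_ledger, dtype=bool), LEDGER_BONUS, 1.0)
    return w * bonus

weights = build_sample_weights([2026, 2022], [True, False])
# A 2026 ledger leg weighs 12.0 vs 1.0 for a 2022 Statcast row;
# weights are then passed via model.fit(X, y, sample_weight=weights).
```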

2. update_blend_weights.py — New file (repo root)

Reads models/model_metrics.json after training and automatically patches xgb_k_layer.py blend weights based on actual Brier scores.

Blend schedule:

  • Brier < 0.23 → 70/30 (strong edge)
  • 0.23 ≤ Brier < 0.25 → 80/20 (marginal edge — current default)
  • 0.25 ≤ Brier < 0.27 → 90/10 (at or worse than null — reduce XGB weight)
  • Brier ≥ 0.27 → 95/5 (actively hurting)

Usage: python update_blend_weights.py (preview) or python update_blend_weights.py --apply
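
A minimal sketch of the threshold logic the schedule implies — the function name and the (statistical, XGB) return order are assumptions:

```python
def blend_for_brier(brier: float) -> tuple[float, float]:
    """Map a test-set Brier score to (statistical, xgb) blend weights.

    Bands are checked worst-first so they stay disjoint; the null model
    (always predict 50%) scores Brier = 0.25.
    """
    if brier >= 0.27:    # actively hurting
        return (0.95, 0.05)
    if brier >= 0.25:    # at or worse than null — reduce XGB weight
        return (0.90, 0.10)
    if brier < 0.23:     # strong edge
        return (0.70, 0.30)
    return (0.80, 0.20)  # marginal edge, current default

print(blend_for_brier(0.24))  # (0.8, 0.2)
```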

3. marcel_layer.py — Replaced with production version

Replaces the stub (which was importing MarcelLayer, marcel_adjustment — classes that never existed — and returning 0.0).

New version implements Marcel regression-to-mean (Tango Tiger 2004):

  • get_marcel_k_rate(k_pct, season_bf) — pitcher K% regressed to league mean; 250 BF regression constant
  • get_marcel_hit_rate(avg, season_pa) — batter hit rate; 600 PA regression constant
  • get_marcel_xba(xba, season_pa) — xBA; 200 PA regression constant (stabilises faster)
  • enrich_prop_with_marcel(prop, hub) — top-level call that mutates sv_k_pct / sv_xba proportional to regression strength

Example: pitcher with 35% K-rate through 80 BF → Marcel regresses to ~27% (heavy). Same pitcher at 600 BF → ~27.5% (barely any change). In May (80-200 BF per starter), Marcel meaningfully corrects small-sample noise.
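
The regression step can be sketched as classic Marcel shrinkage: add the regression constant's worth of league-average batters faced to the observed sample. The league-mean constant here is an assumption (the real value lives in marcel_layer.py, so the exact figures in the example above will differ):

```python
LEAGUE_K_PCT = 22.5     # assumed league-average K%
K_REGRESSION_BF = 250   # regression constant from the PR description

def get_marcel_k_rate(k_pct: float, season_bf: int) -> float:
    """Shrink an observed K% toward the league mean, weighted by sample size."""
    regressed = (season_bf * k_pct + K_REGRESSION_BF * LEAGUE_K_PCT) / (
        season_bf + K_REGRESSION_BF
    )
    return max(5.0, min(40.0, regressed))  # clamp to a sane range, as in the PR

print(round(get_marcel_k_rate(35.0, 80), 1))   # 25.5 — small sample, heavy pull
print(round(get_marcel_k_rate(35.0, 600), 1))  # 31.3 — full season, light pull
```

The pull toward the mean scales with K_REGRESSION_BF / (season_bf + K_REGRESSION_BF), which is why early-season samples move far more than full-season ones.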

4. prop_enrichment_layer.py — Marcel call site updated

Old: _get_marcel_adj(player, prop_type, is_pitcher) → returned 0.0 (dead code)
New: enrich_prop_with_marcel(prop, hub) → mutates sv_k_pct/sv_xba directly

The adjusted sv_k_pct/sv_xba values flow into the XGBoost blend at inference time in tasklets.py.
_get_marcel_adj stub retained as no-op for backward compat; _MARCEL_LAYER global removed.


Summary by cubic

Upgrades K-training to XGBoost v2, adds Marcel regression, and auto-tunes blend weights so inference adapts to measured model quality. Improves early-season stability, aligns training/inference features, and persists models across deploys.

  • New Features
    • XGBoost v2 training: recent-season weights (2026×4, 2025×2, 2024×1.5), fixed feature alignment (sv_era, sv_k_pct, sv_bb_pct, sv_whiff_pct), new features (l3_ks, l3_ip, l5_ip, days_rest), ledger-first training with 3× bonus, opp_whiff added to hit model, DB persistence to xgb_model_store, and --status for Brier + blend tips.
    • Data‑driven blend weights: update_blend_weights.py reads models/model_metrics.json and patches xgb_k_layer.py using Brier thresholds (70/30, 80/20, 90/10, 95/5); run with --apply to write.
    • Marcel regression (production): implements regression-to-mean for K% (250 BF), hit rate (600 PA), and xBA (200 PA); enrich_prop_with_marcel(prop, hub) mutates sv_k_pct/sv_xba; call site updated in prop_enrichment_layer.py (old stub kept as no-op).

Written for commit 19cfa4d. Summary will update on new commits.

…arcel regression layer + data-driven blend weight updater
@coderabbitai Bot commented May 15, 2026

Warning

Rate limit exceeded

@jaayslaughter-cpu has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 58 seconds before requesting another review, then trigger one with the @coderabbitai review command as a PR comment, or push new commits to this PR.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ddfa33a1-5187-4b71-8ba9-d04e3258bcb1

📥 Commits

Reviewing files that changed from the base of the PR and between 76f309b and 19cfa4d.

📒 Files selected for processing (4)
  • marcel_layer.py
  • prop_enrichment_layer.py
  • scripts/xgb_k_training.py
  • update_blend_weights.py

@ecc-tools Bot (Contributor) commented May 15, 2026

ECC bundle files are already tracked in this repository. Skipping generation of another bundle PR.

@deepsource-io Bot commented May 15, 2026

DeepSource Code Review

We reviewed changes in 76f309b...19cfa4d on this pull request. Below is the summary for the review, and you can see the individual issues we found as inline review comments.

See full review on DeepSource ↗

PR Report Card: Overall Grade, Security, Reliability, Complexity, Hygiene (grade badges not captured).

Code Review Summary

Analyzer     Updated (UTC)              Details
Docker       May 15, 2026, 2:46 a.m.   Review ↗
JavaScript   May 15, 2026, 2:46 a.m.   Review ↗
Python       May 15, 2026, 2:46 a.m.   Review ↗
SQL          May 15, 2026, 2:46 a.m.   Review ↗
Secrets      May 15, 2026, 2:46 a.m.   Review ↗

Important

AI Review is run only on demand for your team. We're only showing results of static analysis review right now. To trigger AI Review, comment @deepsourcebot review on this thread.

@jaayslaughter-cpu jaayslaughter-cpu merged commit 42b06e0 into main May 15, 2026
7 of 9 checks passed
@codacy-production

Not up to standards ⛔

🔴 Issues 2 critical · 9 high · 2 medium

Alerts:
⚠ 13 issues (≤ 0 issues of at least minor severity)

Results:
13 new issues

Category     Results
ErrorProne   9 high
Security     2 critical, 2 medium

View in Codacy

🟢 Metrics

Metric       Results
Complexity   59

View in Codacy


Comment thread marcel_layer.py

import json
import logging
import os

Unused import os


An object has been imported but is not used anywhere in the file.
It should either be used or the import should be removed.

Comment thread marcel_layer.py
from datetime import datetime, timezone

import requests
from functools import lru_cache

Unused lru_cache imported from functools


An object has been imported but is not used anywhere in the file.
It should either be used or the import should be removed.

Comment thread marcel_layer.py
return max(5.0, min(40.0, regressed))


def enrich_prop_with_marcel(prop: dict, hub: dict) -> dict:

Unused argument 'hub'


An unused argument can lead to confusions. It should be removed. If this variable is necessary, name the variable _ or start the name with unused or _unused.

Comment thread marcel_layer.py
# (label, current, sample_n, hist, expected_direction, func)
("K% elite early (35%, 80 BF)",
35.0, 80, None, "< 30",
lambda c, n, h: get_marcel_k_rate(c, n, h)),

Lambda may not be necessary


A lambda that calls a function without modifying any of its parameters is unnecessary. Python functions are first-class objects and can be passed around in the same way as the resulting lambda. It is recommended to remove the lambda and use the function directly.

Comment thread marcel_layer.py
lambda c, n, h: get_marcel_k_rate(c, n, h)),
("K% elite full season (28%, 600 BF)",
28.0, 600, None, "25-28",
lambda c, n, h: get_marcel_k_rate(c, n, h)),

Lambda may not be necessary


A lambda that calls a function without modifying any of its parameters is unnecessary. Python functions are first-class objects and can be passed around in the same way as the resulting lambda. It is recommended to remove the lambda and use the function directly.

Comment thread scripts/xgb_k_training.py
try:
import shap, pickle as _pkl
with open(model_path, "rb") as f:
model = _pkl.load(f)

Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.


The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.

Comment thread scripts/xgb_k_training.py
print(f" {key:<10} {b_str:>8} {a_str:>8} {n_test:>8} {status}")

print(f"\n Null model Brier: {null_brier} (always predict 50%)")
print(f" Target Brier: <0.23 to justify current blend weights")

`f-string` used without any expression


It is wasteful to use f-string mechanism if there are no expressions to be extrapolated. It is recommended to use regular strings instead.

Comment thread scripts/xgb_k_training.py

print(f"\n Null model Brier: {null_brier} (always predict 50%)")
print(f" Target Brier: <0.23 to justify current blend weights")
print(f"\n Blend recommendations:")

`f-string` used without any expression


It is wasteful to use f-string mechanism if there are no expressions to be extrapolated. It is recommended to use regular strings instead.

Comment thread update_blend_weights.py
NULL_BRIER = 0.25 # null model always predicts 50%


def _get_blend_weight(brier: float | None, model_name: str) -> tuple[float, float, str]:

Unused argument 'model_name'


An unused argument can lead to confusions. It should be removed. If this variable is necessary, name the variable _ or start the name with unused or _unused.

Comment thread update_blend_weights.py
return False

content = XGB_LAYER.read_text()
original = content

Unused variable 'original'


An unused variable takes up space in the code, and can lead to confusion, and it should be removed. If this variable is necessary, name the variable _ to indicate that it will be unused, or start the name with unused or _unused.

@gemini-code-assist Bot left a comment

Code Review

This pull request overhauls the Marcel projection system, transitioning from a standalone probability adjustment layer to a regression-to-the-mean mechanism that directly mutates input features for the XGBoost models. Key changes include the introduction of a new marcel_layer.py with sample-size-dependent regression formulas, updated training scripts for XGBoost models incorporating recency weighting, and a new utility to automatically update model blend weights based on Brier scores. Feedback focuses on a critical units mismatch where absolute rates are assigned to probability delta fields, potentially causing extreme probability distortions. Additionally, the review identifies a "double regression" logic error that over-penalizes extreme stats, a discrepancy between the implementation and documentation of weighted historical averages, and risks associated with using truthy fallbacks for zero-count statistics.

Comment thread prop_enrichment_layer.py
prop = _emp(prop, hub)
except Exception:
pass
prop["_marcel_adj"] = prop.get("_marcel_k_pct") or prop.get("_marcel_hit_rate") or 0.0

critical

This assignment introduces a critical units mismatch. _marcel_adj was previously a probability delta (e.g., ±0.018), but it is now being set to an absolute projected rate (e.g., 25.0). The consumer in tasklets.py (_BaseAgent._model_prob) multiplies this value by 100.0 and adds it to the win probability, which will result in massive adjustments (e.g., +2500pp) and peg almost all probabilities to the 95% cap. Since Marcel influence is now primarily handled via feature mutation (sv_k_pct/sv_xba), this legacy nudge should either be removed or converted back to a small probability delta.

Suggested change
prop["_marcel_adj"] = prop.get("_marcel_k_pct") or prop.get("_marcel_hit_rate") or 0.0
prop["_marcel_adj"] = 0.0 # Marcel influence now handled via feature mutation

Comment thread marcel_layer.py
Comment on lines +110 to +120
def _weighted_hist(current: float, hist: Optional[float],
weights=(5, 4, 3)) -> float:
"""
Parse FanGraphs percentage field.
Handles both string format ("22.0 %") and decimal float (0.22 or 22.0).
Returns a decimal fraction (0.22, not 22).
Three-year weighted average (current season × 5, prev × 4, prev-prev × 3).
Uses available data — if hist not provided, current season dominates.
"""
if val is None:
return 0.0
if isinstance(val, (int, float)):
v = float(val)
return v / 100.0 if v > 1.0 else v
s = str(val).strip().rstrip("%").strip()
try:
v = float(s)
return v / 100.0 if v > 1.0 else v
except ValueError:
return 0.0


def _parse_float(val, default: float = 0.0) -> float:
"""Safe float parse from any type."""
if val is None:
return default
try:
return float(val)
except (ValueError, TypeError):
return default


# ---------------------------------------------------------------------------
# FanGraphs data fetcher
# ---------------------------------------------------------------------------

def _fetch_fg_data(stats: str, season_start: int, season_end: int) -> list[dict]:
"""
Fetch multi-year leaderboard from FanGraphs JSON API.
if hist is None:
return current
# hist is a single prior-season value (could represent 1 or 2 seasons)
total_w = weights[0] + weights[1]
return (weights[0] * current + weights[1] * hist) / total_w

medium

The _weighted_hist function implementation does not match its docstring. The docstring describes a "Three-year weighted average (current season × 5, prev × 4, prev-prev × 3)", but the code only uses the first two weights (weights[0] and weights[1]), effectively performing a 2-point weighted average. If a 3-year system is intended, the logic needs to handle a list of historical values or a pre-weighted historical aggregate.

Comment thread marcel_layer.py

# ── K props — Marcel pitcher K-rate ───────────────────────────────────────
if prop_type in ("strikeouts", "pitching_outs", "pitcher_strikeouts"):
raw_k_pct = float(prop.get("sv_k_pct") or prop.get("fg_kpct") or LEAGUE_AVG["k_pct"])

medium

Using the or operator for fallbacks here is risky because a valid 0.0 rate (e.g., a pitcher with zero strikeouts in a very small early-season sample) will be treated as falsy and overwritten by the league average. This unintentionally biases the regression toward the mean for players with zero-count stats.

Suggested change
raw_k_pct = float(prop.get("sv_k_pct") or prop.get("fg_kpct") or LEAGUE_AVG["k_pct"])
raw_k_pct = float(
    prop.get("sv_k_pct") if prop.get("sv_k_pct") is not None
    else prop.get("fg_kpct") if prop.get("fg_kpct") is not None
    else LEAGUE_AVG["k_pct"]
)

Comment thread marcel_layer.py
Comment on lines +245 to 253
regression_strength = min(1.0, max(0.0, 1.0 - season_bf / 250))
if regression_strength > 0.3 and abs(marcel_k - raw_k_pct) > 1.5:
# Blend raw and Marcel proportional to regression strength
blended_k = (1 - regression_strength) * raw_k_pct + regression_strength * marcel_k
prop["sv_k_pct"] = round(blended_k, 2)
logger.debug(
"[Marcel] K-rate: raw=%.1f%% Marcel=%.1f%% → blended=%.1f%% (BF=%d reg=%.0f%%)",
raw_k_pct, marcel_k, blended_k, season_bf, regression_strength * 100,
)

medium

This logic implements a form of "double regression" that may be mathematically unsound. marcel_k is already a regressed value (it has been pulled toward the league mean based on sample size inside get_marcel_k_rate). Blending it again with raw_k_pct using regression_strength (which is also sample-size dependent) applies the regression penalty twice, resulting in an overly conservative estimate that is pulled too hard toward the league average.

        # Use the Marcel projection directly as it already incorporates regression
        if abs(marcel_k - raw_k_pct) > 1.5:
            prop["sv_k_pct"] = round(marcel_k, 2)
            logger.debug(
                "[Marcel] K-rate: raw=%.1f%% Marcel=%.1f%% (BF=%d)",
                raw_k_pct, marcel_k, season_bf
            )

Comment thread marcel_layer.py
Comment on lines +255 to +258
raw_whiff = float(prop.get("sv_whiff_pct") or prop.get("sv_swstr_pct") or LEAGUE_AVG["whiff_pct"])
season_p = season_bf * 3 # rough pitch count from BF
marcel_whiff = get_marcel_whiff_pct(raw_whiff, season_p)
prop["_marcel_whiff_pct"] = marcel_whiff

medium

While sv_k_pct and sv_xba are mutated with their regressed values, sv_whiff_pct (or sv_swstr_pct) is not. Since whiff rate is a key feature in the XGBoost model and is highly susceptible to small-sample noise early in the season, it should also be regressed and mutated to ensure consistency across the feature vector.
