
New predictive templates: subscriber_retention + demand_forecasting #49

Merged
cafzal merged 10 commits into main from cross-template-predictives on May 4, 2026
Conversation

Collaborator
@cafzal cafzal commented Apr 28, 2026

Summary

Adds two net-new Predictive (GNN) templates, completing the reasoner-coverage gap left by PR #48. Both ran end-to-end against jqb21724 and have customer-ready READMEs with real captured output.

Two templates

subscriber_retention (Telecommunications)

Multi-reasoner per-subscriber churn-risk pipeline (Graph → Rules → Predictive).

  • Concepts: Subscriber (graph node, denormalized plan attrs), Call (edge intermediary)
  • Stage 1 — Graph: PageRank on directed Subscriber→Subscriber call graph → Subscriber.pagerank continuous feature
  • Stage 2 — Rules: Subscriber.outgoing_calls / incoming_calls derivation properties → integer features
  • Stage 3 — Predictive: regression GNN on CHURN_RISK_SCORE (continuous 0–1 risk score). 20 epochs CPU. Stratified 70/15/15 split by SEGMENT.
  • Stage 4 — Reporting: top-5 highest-predicted-risk subscribers per segment, plus test-set RMSE.
  • Verified output: Test-set RMSE = 0.1386, top-5 per segment captured. Predictions cluster around segment mean (~0.24) on the synthetic data — README documents this and explains real-data feature requirements.
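The Stage 1 PageRank signal is produced by the Graph reasoner inside the template; as a rough illustration of what the Subscriber.pagerank feature captures, here is a power-iteration sketch on a toy call graph. This is pure numpy, and the damping value and function name are illustrative, not the reasoner's implementation:

```python
import numpy as np

def pagerank(adj, damping=0.85, iters=50):
    """Power-iteration PageRank on a dense adjacency matrix.
    adj[i, j] = 1.0 means an edge i -> j (subscriber i called j).
    Rows with no outgoing calls spread their rank uniformly."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    # Row-stochastic transition matrix; dangling rows become uniform
    trans = np.where(out_deg > 0, adj / np.maximum(out_deg, 1), 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1 - damping) / n + damping * rank @ trans
    return rank

# Toy call graph: subscribers 0 and 1 both call 2; 2 calls no one
adj = np.array([[0, 0, 1],
                [0, 0, 1],
                [0, 0, 0]], dtype=float)
scores = pagerank(adj)  # subscriber 2 accumulates the most rank
```

Subscribers who receive many calls accumulate rank, which is the kind of continuous centrality signal the GNN then consumes as a feature.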

demand_forecasting (Retail)

Per-(store, item, date) demand-regression GNN over a Store / Item / ItemFamily / Sale knowledge graph.

  • Concepts: Store, Item, ItemFamily (graph nodes), Sale (prediction unit, identify_by sale_id)
  • Heterogeneous graph: Sale → Store, Sale → Item, Item → ItemFamily
  • Predictive: regression GNN on Sale.unit_sales. 20 epochs CPU. Temporal split: last 60 days = test, previous 60 = val, rest = train (split done in pandas).
  • Reporting: weekly per-(city, item family) forecast with actual vs predicted side-by-side, plus per-Sale and per-(city, family, week) RMSE.
  • Verified output: Per-Sale RMSE = 7.2792, per-(city,family,week) RMSE = 150.8997, weekly forecast table captured. GNN cleanly learns base demand + weekday/weekend rhythm; December holiday spike under-shot (see SDK note below).
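The pandas-level temporal split described above can be sketched as follows (a toy frame; the shipped script's column names and helper structure may differ):

```python
import pandas as pd

# Toy daily sales frame; the cutoffs mirror the template's split rule:
# last 60 days = test, previous 60 = val, everything earlier = train.
sales = pd.DataFrame({
    "sale_id": range(365),
    "date": pd.date_range("2025-01-01", periods=365, freq="D"),
})

cutoff_test = sales["date"].max() - pd.Timedelta(days=60)
cutoff_val = cutoff_test - pd.Timedelta(days=60)

train = sales[sales["date"] <= cutoff_val]
val = sales[(sales["date"] > cutoff_val) & (sales["date"] <= cutoff_test)]
test = sales[sales["date"] > cutoff_test]
```

Because the boundaries are date cutoffs rather than random sampling, evaluation is always on the future relative to training, which is what makes the split valid for forecasting.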

Customer onboarding paths (each README, two-path framing)

Each template ships with two onboarding paths: a bundled, lightweight demo dataset and a full / bring-your-own-data option:

  • subscriber_retention: Bundled (light), ships in ZIP: data/telco_mini/ — 4 CSVs (~1.2K subscribers, 6K calls, 1.2K plans, 0.7K billing events). Synthetic, derived from internal DEMO_TELCO.RAW with PII dropped. Full / public-dataset link: no public call-graph telco dataset exists; IBM Telco Customer Churn is tabular only (no calls), so the GNN graph path needs real CDR data or a synthetic generator. The README documents the Snowflake-loading pattern for your own data.
  • demand_forecasting: Bundled (light), ships in ZIP: data/favorita_mini/ — 3 CSVs (3 stores × 25 items × 365 days = 27,375 rows). Synthetic, generated by data/generate_favorita_mini.py with seasonality, promotions, and Poisson noise. Full / public-dataset link: Kaggle's Corporación Favorita Grocery Sales Forecasting (~125M rows). The README has a full Snowflake-load + GPU walkthrough, including the optional oil, holidays_events, and transactions tables.
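The generator's recipe (per-store/item base rates, weekday/weekend seasonality, a December spike, ~5% promotions, Poisson noise) can be approximated in a few lines. The multipliers below are illustrative guesses, not the values in data/generate_favorita_mini.py:

```python
import numpy as np

rng = np.random.default_rng(42)

def daily_demand(base_rate, weekday, december, promo_rate=0.05):
    """Illustrative re-creation of the favorita_mini shape: a base rate
    with a weekend lift, a December spike, ~5% promotions, and Poisson
    noise. The multipliers are guesses, not the repo's actual values."""
    lam = base_rate * np.where(np.isin(weekday, (5, 6)), 1.4, 1.0)
    lam = lam * np.where(december, 1.8, 1.0)       # holiday spike
    promo = rng.random(lam.shape) < promo_rate     # ~5% promo days
    lam = lam * np.where(promo, 2.0, 1.0)
    return rng.poisson(lam), promo

weekdays = np.arange(365) % 7        # 0=Mon ... 6=Sun (toy calendar)
december = np.arange(365) >= 334     # last 31 days stand in for December
units, on_promo = daily_demand(10.0, weekdays, december)
```

Sampling from a Poisson whose rate already carries the seasonal structure is what lets the GNN learn the base demand and weekday/weekend rhythm from the bundled data.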

Schema-permission setup (one-time)

Both templates expose EXP_DATABASE / EXP_SCHEMA constants near the top of the script so customers can repoint to a database they own. Required one-time setup as ACCOUNTADMIN (DDL spelled out in each README's Prerequisites):

CREATE DATABASE IF NOT EXISTS <YOUR_DB>;
CREATE SCHEMA IF NOT EXISTS <YOUR_DB>.EXPERIMENTS;
GRANT USAGE ON DATABASE <YOUR_DB> TO APPLICATION RELATIONALAI;
GRANT ALL PRIVILEGES ON SCHEMA <YOUR_DB>.EXPERIMENTS TO APPLICATION RELATIONALAI;

Defaults: subscriber_retention → TELCO_ENRICHMENT.EXPERIMENTS; demand_forecasting → FAVORITA_MINI.EXPERIMENTS. Customers must repoint to a database they own and rerun the DDL.

SDK gotcha applied to demand_forecasting

PyRel 1.0.x has a server-side DateTime/VString signature mismatch when has_time_column=True is paired with a date column at scale (also documented in dev_temp/predictive_enrichment_archive/handoff_full_paysim_cpu.md for fraud-detection at full PaySim scale). Reproduced for demand_forecasting (~27K rows, daily date column) — workaround applied:

  • has_time_column=False
  • Date kept as a plain datetime feature in PropertyTransformer (not time_col)
  • temporal_strategy removed
  • Train/Val/Test relationships time-stripped (f"{Sale} has {Any:value}" instead of f"{Sale} at {Any:date} has {Any:value}")
  • Temporal split is preserved at the pandas level — train still on the past, eval on the future

The README's "Customize this template" section spells out the steps to re-enable temporal indexing once the SDK fix lands.

Test plan

  • python -m py_compile on both scripts: ✅ passes
  • ruff check v1/subscriber_retention v1/demand_forecasting: ✅ clean
  • python scripts/generate_version_indexes.py: ✅ index up to date
  • Synthetic data generator (generate_favorita_mini.py) runs end-to-end: ✅ 27,375 rows generated
  • subscriber_retention.py end-to-end against Snowflake: ✅ test-set RMSE 0.1386
  • demand_forecasting.py end-to-end against Snowflake: ✅ per-Sale RMSE 7.28, per-(city,family,week) RMSE 151
  • READMEs filled in with real captured output: ✅ both populated
  • Customer onboarding paths (bundled light + full public) documented: ✅ both READMEs, with explicit dataset links
  • Troubleshooting blocks (schema permissions, worker not ready, stale experiment, has_time_column at scale): ✅ both READMEs

Configuration / SDK learnings (for rai-agent-skills)

Real-run gotchas captured during this session and proposed for the relevant skills (rai-setup, rai-health, rai-predictive-training, rai-predictive-modeling):

  • L-1 experiment-schema setup DDL + grants
  • L-2 worker-not-ready recovery via SUSPEND_REASONER + RESUME_REASONER_ASYNC
  • L-3 SDK matches train jobs to existing experiments by Model("...") name; bump on re-run
  • L-4 has_time_column=True workaround at scale
  • L-5 CREATE_GNN_SERVICE() is legacy / not the right escalation for QUEUED predictive jobs
  • L-6 RELATIONALAI.API.JOBS history rolls; SDK can poll forever on a vanished job
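L-6 in particular (job history rolls over, so the SDK can poll forever) has a generic client-side guard: poll with a deadline and treat a vanished job as an error. The sketch below is independent of the actual SDK; fetch_status is a stand-in callable, not a real PyRel function:

```python
import time

def poll_with_deadline(fetch_status, timeout_s=600, interval_s=5.0):
    """Poll fetch_status() until it returns a terminal state, but give
    up after timeout_s so a vanished job cannot hang the client.
    fetch_status is any callable returning 'QUEUED' / 'RUNNING' /
    'DONE' / 'FAILED', or None when the job no longer appears in
    the job-history view."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status is None:
            raise RuntimeError("job vanished from history; stop polling")
        if status in ("DONE", "FAILED"):
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"job not terminal after {timeout_s}s")
```

Surfacing the vanished-job case as an explicit error is the behavior the L-6 learning argues for, instead of an indefinite wait.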

Branch predictive-config-learnings (in rai-agent-skills repo, based on PR #30's branch) carries the SKILL.md edits — pushed but no PR opened yet, awaiting review.

Portfolio impact (after both PRs land)

Reasoner                Pre-PR48   Post-PR48   Post-this-PR
Prescriptive (single)       17         17           17
Graph (single)               7          5            5
Rules (single)               1          2            2
Predictive (single)          0          0            1
Multi-reasoner               7          7            8
Scaffold / starter           1          1            1
Total active                33         32           34

demand_forecasting is the new single-reasoner Predictive entry. subscriber_retention is multi-reasoner (Graph + Predictive): PageRank on the call graph is a real Graph-reasoner usage, the GNN is the Predictive head.

🤖 Generated with Claude Code


github-actions Bot commented Apr 28, 2026

The docs preview for this pull request has been deployed to Vercel!

✅ Preview: https://relationalai-docs-a3auhcyup-relationalai.vercel.app/build/templates
🔍 Inspect: https://vercel.com/relationalai/relationalai-docs/AJ5mAnUvhvdwUh8BiHFNALRRVJg8

@cafzal cafzal changed the title from "Cross-template predictives: subscriber_retention + demand_forecasting (scaffold)" to "New predictive templates: subscriber_retention + demand_forecasting" on Apr 29, 2026
cafzal added 3 commits April 29, 2026 14:48
…casting)

Initial scaffolds for the two net-new predictive (GNN) templates:

- subscriber_retention (Predictive, Telecommunications) — telco churn
  node-classification GNN over a directed call-pattern graph, sourced
  directly from DEMO_TELCO.RAW Snowflake tables.

- demand_forecasting (Predictive, Retail) — retail demand regression GNN
  over a Store / Item / ItemFamily / SalesEvent knowledge graph. Mirror
  retail_planning_local.py: bundled CPU-tractable Favorita subset under
  data/favorita_mini/.

Each skeleton script has the canonical section structure
(Configure inputs / Define semantic model & load data / Configure features /
Train / Generate predictions / Validate) with TODO comments referencing
the rai-predictive-modeling and rai-predictive-training skill patterns.

Implementation deferred — pending:
- DEMO_TELCO.RAW table FQN inspection
- exp_database/exp_schema decision for experiment artifacts
- predictive engine sizing in raiconfig.yaml
- Favorita subset extraction
- Multi-key task-table pattern confirmation for (store, item, date) keys
Fully implemented per fraud_detection_local pattern:
- Loads bundled DEMO_TELCO.RAW exports under data/telco_mini/
  (subscribers 1.2K, call_detail_records 6K, plans_contracts 1.2K,
  billing_events 0.7K). PII columns (FIRST_NAME, LAST_NAME, EMAIL,
  PHONE) dropped before bundling.
- Concepts: Subscriber (graph node, with denormalized plan attrs),
  Call (edge intermediary, no identify_by).
- Stage 1 (Graph): PageRank on directed Subscriber -> Subscriber call
  graph; bound to Subscriber.pagerank as a continuous feature.
- Stage 2 (Rules): Subscriber.outgoing_calls and incoming_calls
  derivation properties via count(...).per(...).where(...).
- Stage 3 (Predictive): regression GNN on CHURN_RISK_SCORE; CPU,
  20 epochs, lr=0.005. Stratified 70/15/15 train/val/test split by
  SEGMENT to keep risk-score distribution stable.
- Stage 4 (Reporting): top-N highest-predicted-risk subscribers per
  SEGMENT with actual vs predicted side-by-side, plus test-set RMSE.
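The stratified split in Stage 3 keeps each SEGMENT's risk-score distribution stable across train/val/test. A minimal pandas sketch of the pattern (the function and toy columns are illustrative, not the shipped helper):

```python
import pandas as pd

def stratified_split(df, strat_col, seed=42, frac=(0.70, 0.15, 0.15)):
    """Shuffle within each stratum, then cut 70/15/15 so every segment
    is proportionally represented in train/val/test. Illustrative only:
    the shipped script may differ in names and details."""
    parts = {"train": [], "val": [], "test": []}
    for _, grp in df.groupby(strat_col):
        grp = grp.sample(frac=1.0, random_state=seed)  # shuffle the stratum
        n_train = int(len(grp) * frac[0])
        n_val = int(len(grp) * frac[1])
        parts["train"].append(grp.iloc[:n_train])
        parts["val"].append(grp.iloc[n_train:n_train + n_val])
        parts["test"].append(grp.iloc[n_train + n_val:])
    return {k: pd.concat(v, ignore_index=True) for k, v in parts.items()}

# Toy frame: two segments, 10 subscribers each
df = pd.DataFrame({
    "sub_id": range(20),
    "SEGMENT": ["consumer"] * 10 + ["business"] * 10,
})
splits = stratified_split(df, "SEGMENT")
```

Cutting inside each stratum rather than globally guarantees that no segment is starved out of the smaller val/test partitions.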

Cannot run end-to-end yet — relationalai 1.0.14 in this environment
ships without the predictive subreasoner module. py_compile + ruff
pass. Will execute and capture expected output for the README once
predictive is enabled.

Customers adapting this template would swap CHURN_RISK_SCORE for a
binary churn outcome by changing task_type='binary_classification'
and the target type — README "Customize this template" describes the
swap.
…a_mini

Fully implemented per retail_planning_local pattern:
- Bundled synthetic Favorita-shaped dataset under data/favorita_mini/
  generated by data/generate_favorita_mini.py: 3 stores x 25 items x
  365 days = 27.4K daily sales rows. Embeds weekday/weekend
  seasonality, December holiday spike, per-store/item base rates,
  Poisson noise, and ~5% promotion rate.

- Concepts: Store, Item, ItemFamily (graph nodes), Sale (prediction
  unit, identify_by sale_id, with date/unit_sales/onpromotion).

- Heterogeneous GNN graph: Sale -> Store, Sale -> Item, Item ->
  ItemFamily so signal propagates across the store and item
  hierarchies.

- Regression-with-time GNN on Sale.unit_sales; 20 epochs CPU,
  lr=0.005, temporal_strategy='last'. Sale.unit_sales explicitly
  dropped from PT features (target leakage prevention).

- Temporal train/val/test split: last 60 days = test, previous 60 =
  val, rest = train (forecasting is forward-looking by construction).

- Reporting: per-Sale predictions aggregated to weekly per-(city,
  item family) forecast for the test window with actual vs predicted
  side by side, plus per-Sale and per-(city, family, week) RMSE.
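The two RMSE granularities in this reporting stage behave differently because the weekly number scores summed totals, so a consistent per-row bias compounds. A toy pandas illustration (the numbers are made up, not the captured run):

```python
import numpy as np
import pandas as pd

# Toy per-Sale predictions; the grouping columns mirror the reporting
# dimensions (city, item family, week) but the data is invented.
preds = pd.DataFrame({
    "city": ["Quito"] * 4 + ["Guayaquil"] * 4,
    "family": ["GROCERY"] * 8,
    "week": [1, 1, 2, 2] * 2,
    "actual": [10.0, 12.0, 11.0, 9.0, 20.0, 22.0, 19.0, 21.0],
    "predicted": [9.0, 11.0, 10.0, 8.0, 19.0, 21.0, 18.0, 20.0],
})

# Per-Sale RMSE: error on every individual (store, item, date) row
per_sale_rmse = np.sqrt(((preds.actual - preds.predicted) ** 2).mean())

# Per-(city, family, week) RMSE: sum to weekly totals first, then score
weekly = preds.groupby(["city", "family", "week"], as_index=False)[
    ["actual", "predicted"]].sum()
weekly_rmse = np.sqrt(((weekly.actual - weekly.predicted) ** 2).mean())
```

Here every row is under-predicted by 1 unit, so the weekly metric doubles the per-Sale one; at the template's scale the same compounding explains a weekly RMSE far above the per-Sale figure.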

Cannot run end-to-end yet — same predictive-module gap as
subscriber_retention. py_compile + ruff pass. Will execute and capture
expected output for the README once predictive is enabled.

Customers adapting this template would replace the synthetic CSVs with
real retail data matching the schema (stores.csv, items.csv,
sales.csv) by overwriting the files under data/favorita_mini/. The
generator script is left in place as documentation.
@cafzal cafzal force-pushed the cross-template-predictives branch from 29c3cd0 to 816abab Compare April 29, 2026 21:49
Both templates ran end-to-end on jqb21724 against the predictive reasoner.

subscriber_retention: Test-set RMSE 0.1386, top-N per segment captured.
demand_forecasting: Test-set RMSE 7.28 per-Sale (151 per-(city,family,week)),
weekly forecast table captured.

Changes:
- EXP_DATABASE / EXP_SCHEMA constants near the top of both scripts so
  customers can repoint to a writable database they own. Defaults:
  TELCO_ENRICHMENT.EXPERIMENTS / FAVORITA_MINI.EXPERIMENTS.
- demand_forecasting: has_time_column=False with date kept as a plain
  datetime feature, train/val/test relationships time-stripped. The
  PyRel 1.0.x SDK has a server-side DateTime/VString signature mismatch
  with has_time_column=True at this dataset shape; documented in the
  README under Customize this template / Troubleshooting so the temporal
  index can be re-enabled when the SDK fix lands.
- README bodies filled in per dev-templates-review checklist with real
  captured output, schema-permission setup DDL, troubleshooting blocks
  for worker-not-ready and stale-experiment-matching SDK gotchas, and
  related-template links.
- v1/README.md regenerated from front matter.
@cafzal cafzal marked this pull request as ready for review May 1, 2026 02:49
Per customer-onboarding review: each template now has an explicit path
for moving from the bundled-CSV demo to a real / full dataset.

- subscriber_retention: 'Run on your own Snowflake data' subsection with
  Snowpark loading pattern, plus a note that no widely-known public
  call-graph telco dataset exists (IBM Telco churn is tabular only) so
  the GNN graph path needs real CDR data to exercise.
- demand_forecasting: 'Run on the full public Favorita dataset'
  subsection with Kaggle source, Snowflake load, GPU switch, and the
  optional Favorita-side tables (oil, holidays, transactions) that
  customers may want to fold in.
The data is synthetic demo data; reframe accordingly.

- README: remove 'RelationalAI's internal DEMO_TELCO.RAW demo schema'
  reference; describe the bundled CSVs as 'Demo data' and the four
  unused columns as 'identifier columns' rather than 'PII columns'.
- README (Run on your own Snowflake data): reframe step 2 to drop
  'unused identifier columns' and add a separate note about PII at
  the SQL level for real customer sources.
- Script docstring: drop the DEMO_TELCO.RAW.SUBSCRIBERS reference;
  describe CHURN_RISK_SCORE as 'an analyst-facing churn-risk estimate'.
- Script comment: 'Drop PII columns ...' -> 'Drop unused identifier
  columns -- not useful as features.'
… reasoner label)

The count aggregates (outgoing_calls / incoming_calls) are real derived
properties but they are feature engineering for the GNN, not a substantive
rules-based reasoning stage in the sense rai-rules-authoring covers
(classification / segmentation / compliance / alerting). PageRank, on the
other hand, is a real Graph reasoner usage. Reframe accordingly.

- README front matter: reasoning_types Graph + Predictive (drop
  Rules-based); description rewritten to match.
- README body: 'What you'll build' folds the count features into the
  Predictive bullet; pipeline-stages diagram drops the Stage 2 Rules
  banner; section header for the count derivations renamed.
- Script: docstring stage list collapses to Graph / Predictive /
  Reporting; section comments drop 'Stage N: ...' banners.
- v1/README.md: regenerated from front matter.

Casual usage of 'rule' in body text describing the count-aggregate
derivations is preserved -- the change is to the reasoner-level label,
not to natural language describing what the derivations do.
@cafzal cafzal requested review from Tellili and pkouki May 1, 2026 16:56
Contributor

@ifountalis ifountalis left a comment

Strong PR overall — the SDK-workaround disclosure is honest, RMSE numbers are captured from real runs, and both templates ship customer-ready READMEs. A few small items worth tightening before merge.

HIGH — small doc-truthfulness edits

  • v1/subscriber_retention/subscriber_retention.py:30 — module docstring says "top-15 highest-predicted-risk subscribers per segment" but TOP_N_PER_SEGMENT = 5 (line 53) and the README + actual print say "Top 5". Change to "top-N" or "top-5".

  • v1/subscriber_retention/README.md:118 — the Expected output block shows

    ============================================================
    Stage 3: Predictive -- subscriber churn-risk regression GNN (CPU)
    ============================================================
    

    but the script (line 242) actually prints Predictive: subscriber churn-risk regression GNN (CPU) — the Stage 3: prefix is fabricated. Either drop it from the README or add a stage label to the print.

  • v1/subscriber_retention/subscriber_retention.py:4–5 — top-of-module docstring claims "no Snowflake data loading", but the template needs Snowflake for EXP_DATABASE = "TELCO_ENRICHMENT" (the GNN writes experiment artifacts there, and the README's Prerequisites spells out the schema-permission DDL). Tighten to "no source-data Snowflake loading; the GNN reasoner still uses Snowflake for experiment artifacts (see README Prerequisites)."

  • v1/demand_forecasting/demand_forecasting.py:13–14 — module docstring says "Train a regression GNN with a time column on Sale.date predicting unit_sales", but line 218 sets has_time_column=False and the inline NOTE (lines 128–135) explains the SDK workaround. The README is honest about this; the docstring should be too.

MEDIUM

  • v1/subscriber_retention/subscriber_retention.py:159–164 — the PropertyTransformer.drop= list has sub_id and postal_code but not churn_risk_score (the regression target). The target isn't in any feature list either, so it shouldn't auto-include, but adding it to drop makes the no-leakage invariant explicit and protects future adapters:

    drop=[
        Subscriber.sub_id,
        Subscriber.postal_code,
        Subscriber.churn_risk_score,  # target — never a feature
    ],
  • PR description ↔ front matter mismatch on reasoner taxonomy. The "Portfolio impact" table classifies both new templates as Predictive (single), but v1/subscriber_retention/README.md front matter declares reasoning_types: [Graph, Predictive] (multi-reasoner) — consistent with the README narrative "wires a call-graph signal into the model… Graph reasoner (PageRank)… Predictive reasoner trains a GNN regression head". Either fix the PR description's portfolio counts (1 single-reasoner predictive + 1 multi-reasoner) or change the front matter to [Predictive]. The README narrative argues for the front matter being right.

  • Hardcoded model names (subscriber_retention.py:68, demand_forecasting.py:65). Both READMEs have a "Re-running with a stale experiment causes training job failed" troubleshooting block telling customers to bump the model name on re-run, but the defaults (subscriber_retention_local, demand_forecasting_local) are version-less. Adding a _v1 suffix would model the bump pattern visually so customers see the convention before they hit the failure.

  • v1/subscriber_retention/README.md:147 — billing_events.csv is bundled (681 lines) and called out as "available for customization", but no concrete worked example exists in Customize this template. Either drop the CSV (keep the bundle lean) or add a 5-line example wiring late_payment_count as an integer feature.

On has_time_column=False — confirmed valid

Verified that no time/date column appears in any task table:

  • TrainTable: [sale_id, unit_sales]
  • ValTable: [sale_id, unit_sales]
  • TestTable: [sale_id]

The relationships correspondingly omit at {Any:date} (lines 184–194), time_col= is unset in PropertyTransformer, and Sale.date flows through as a regular datetime feature on the Sale node. The four pieces — task-table columns, relationship signatures, PropertyTransformer fields, GNN parameter — form an internally consistent set, and the configuration matches the workaround already documented in v1/retail_planning/README.md:377–382. The trade-off (no GNN temporal aggregation; December holiday spike under-shot) is honestly disclosed, and the "Re-enable temporal indexing" recipe in Customize this template gives the four-step path back when the SDK fix lands.

Optional follow-up (not blocking): the SDK bug fires at this scale (~18K train rows), but retail_planning_local works fine at ~7.6K train rows with has_time_column=True. Shrinking the bundled demand_forecasting dataset below the SDK-bug threshold would let the bundled run demonstrate the "happy-path" with time column on, while the README's "Run on full Favorita" section explains when the workaround becomes necessary at scale. Worth raising as a separate discussion.

Strengths

  • Honest SDK-workaround disclosure with the full four-step re-enable recipe in Customize this template.
  • Real captured RMSE numbers from end-to-end runs (0.1386 / 7.28 / 151) — not placeholders.
  • Stratified split (subscriber_retention, by SEGMENT) and temporal split (demand_forecasting, last 60 / previous 60 days) — both correct for their respective tasks; no leakage.
  • Schema-permission DDL spelled out per template — saves a customer support ticket.
  • Related templates cross-references between siblings.
  • Front matter complete (title, description, experience_level, industry, reasoning_types, tags, featured) and alphabetic insertion in v1/README.md correct for both entries.
  • Reproducibility: SEED=42 set in the script, the PropertyTransformer, and the data generator.

…example

Targeted edits responding to a doc-truthfulness review:

HIGH
- subscriber_retention.py docstring: 'top-15' -> 'top-N (N = TOP_N_PER_SEGMENT)'.
  TOP_N_PER_SEGMENT defaults to 5 and the runtime print + README both
  show 'Top 5'; the docstring was the only outlier.
- subscriber_retention/README.md Expected output: drop the fabricated
  'Stage 3:' prefix; the script prints 'Predictive: ...' (without a
  Stage banner) since the prior reframe to Graph + Predictive.
- subscriber_retention.py docstring: 'no Snowflake data loading' is
  too strong. The GNN reasoner uses Snowflake for experiment artifacts
  even on the bundled-CSV path. Tighten to 'no Snowflake source-data
  loading' and point at the README's schema-permission DDL.
- demand_forecasting.py docstring: drop the 'Train a regression GNN
  with a time column on Sale.date' claim. has_time_column=False is
  the configured behavior (SDK workaround is documented inline and in
  the README); the docstring should match.

MEDIUM
- subscriber_retention PropertyTransformer: add Subscriber.churn_risk_score
  to drop=[...] (script + README) so the no-leakage invariant is
  explicit. The target was already not in any feature list, so this is
  a documentation-of-intent change with no behavior impact.
- Bump default Model() name to '{template}_local_v1' so customers who
  hit the stale-experiment failure documented in troubleshooting can
  see the version-bump pattern in the default before they need it.
- billing_events.csv worked example: replace the one-line
  'available for customization' bullet with a concrete BillingEvent
  concept + Subscriber.late_payment_count rule snippet so the bundled
  CSV is actually usable as a starter, not a dead reference.
@cafzal cafzal removed the request for review from pkouki May 1, 2026 17:46
@cafzal cafzal removed the request for review from Tellili May 1, 2026 17:46
Collaborator

@somacdivad somacdivad left a comment

Looks good, thanks!

Collaborator

@somacdivad somacdivad left a comment
@cafzal I already approved, but please make the following changes before merging.

---
title: "Demand Forecasting"
description: "Forecast next-period unit sales per (store, item, day) with a regression GNN over a heterogeneous retail knowledge graph: sales transactions linked to stores, items, and item families so the GNN propagates signal through the store and product hierarchies."
featured: false
Collaborator

Suggested change:
- featured: false
+ featured: false
+ private: true

This will filter it from the public site

Comment thread v1/subscriber_retention/README.md
Co-authored-by: David Amos <somacdivad@gmail.com>
@cafzal cafzal merged commit fdbe437 into main May 4, 2026
3 checks passed
@cafzal cafzal deleted the cross-template-predictives branch May 4, 2026 16:10