LLM Self-Forecasting Framework

Research project investigating how well LLMs can predict their own behavior in multi-turn and agentic scenarios. Part of the SPAR stream at MATS (Mar-May 2026). Black-box approach — all experiments use API calls, no white-box access needed.

Research Phases

How well can LMs predict their own behavior? — Baseline self-prediction accuracy across multi-turn and agentic tasks.
Can we intervene to improve self-prediction? — Post-training and finetuning to boost self-forecasting ability.
How well can LMs predict behavior under update steps? — Predicting how RL training will affect their own behavior.
Can we intervene to improve prediction under update steps? — Training models to better anticipate the effects of RL on themselves.

Phase 1.5 (parallel): Can the LM obfuscate its self-predictive ability? Distinguishing deception from inability.

Framework Architecture

The codebase provides a general-purpose framework with abstract base classes. Others can add their own self-forecasting tasks by subclassing Task and Judge.

Core Abstractions

Task (ABC): Defines the scenario — system prompts for agent and simulated user, conversation logic, and forecast elicitation prompt.
Judge (ABC): Evaluates forecast quality. Two tiers:
- Tier A (Similarity): How well did the forecast match the actual outcome?
- Tier B (Outcome): How beneficial was the conversation/action for the user?

Current Task Implementations

Task	Description	Evaluation
AI Psychosis	LLM converses with a simulated user exhibiting psychosis tendencies. Multi-turn.	Tier A: LLM-rated similarity (1-10). Tier B: conversation benefit rating (1-10).
Competitive Programming	Model attempts Codeforces-style problems. Can solve, give wrong answer, or abstain.	Tier A: categorical match (correct/wrong/abstain). Tier B: reward (+1 correct, +0.5 abstain, 0 wrong).

Experiment Flow

Run the actual conversation via task.run_conversation()
Show the agent partial context, ask it to self-forecast the rest
Tier A judge: compare forecast vs actual outcome
Tier B judge: rate conversation quality / calculate reward
Save all results as JSON

Setup

python3 -m venv .venv
.venv/bin/pip install openai pydantic python-dotenv
cp .env.example .env
# Edit .env and add your OpenRouter API key

Usage

# Run AI Psychosis task (5 turns, forecast after turn 2)
.venv/bin/python -m code.run --task psychosis --turns 5

# Run Competitive Programming task
.venv/bin/python -m code.run --task competitive_programming --problem-index 0

# Specify model and parameters
.venv/bin/python -m code.run --task psychosis --turns 3 --model google/gemma-3-27b-it --forecast-after 1

Results are saved as JSON in results/.

Adding a New Task

Create a new file in code/tasks/
Subclass Task and implement the abstract methods
Register it in code/tasks/__init__.py by adding to TASK_REGISTRY

File Structure

code/
├── llm_client.py        # Thin OpenAI-compatible wrapper for OpenRouter
├── base.py              # Abstract base classes: Task, Judge, data types
├── config.py            # Top-level config (models, turns, temperatures)
├── experiment.py        # Orchestrator: run → forecast → judge → save
├── run.py               # CLI entrypoint
├── tasks/
│   ├── psychosis.py     # AI Psychosis task
│   └── competitive_programming.py  # Codeforces task
└── judges/
    ├── similarity.py    # Tier A: predicted vs actual similarity
    └── outcome.py       # Tier B: conversation outcome quality

Team

SPAR Self-Forecasting team — GitHub org

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
code		code
results/examples		results/examples
.env.example		.env.example
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LLM Self-Forecasting Framework

Research Phases

Framework Architecture

Core Abstractions

Current Task Implementations

Experiment Flow

Setup

Usage

Adding a New Task

File Structure

Team

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

LLM Self-Forecasting Framework

Research Phases

Framework Architecture

Core Abstractions

Current Task Implementations

Experiment Flow

Setup

Usage

Adding a New Task

File Structure

Team

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages