System Prompt Framing Induces Attention-Dependent Entropy Regime Switching in Transformer Token Generation
Relational presence and epistemic openness interact superadditively in transformers — but not in state-space models. Attention is the computational substrate for cross-factor contextual binding.
System prompt framing produces measurable, architecture-dependent shifts in token-level Shannon entropy during language model inference. Specifically:
- Superadditive R x E interaction (+0.19 to +0.21) in 2 of 4 transformer architectures — the combined effect of relational presence (R) and epistemic openness (E) exceeds the sum of individual effects
- Absent in pure SSM — Falcon3-Mamba-7B (zero attention layers) shows no superadditive interaction (-0.04), treating all prompt factors as interchangeable
- Safety language suppresses the effect (d = 0.85-1.22) on transformers but not on the SSM (d = 0.22, NS)
- Architecture-dependent pathways — Gemma/Qwen are E-driven, Llama is R-driven, Mistral is flat, Mamba is undifferentiated
The superadditive R x E interaction requires attention.
3,830 inference runs across 3 experimental phases:
| Phase | Design | Models | Runs | Key Finding |
|---|---|---|---|---|
| 1. Cross-Architecture | 3 conditions x 6 models | Qwen 0.5B/1.5B/7B, Gemma 2B, Mistral 7B, Llama 8B | 900 | Effect emerges above 0.5B; capacity floor established |
| 2. Factorial Ablation | 8 conditions x 5 models | Gemma 2B, Llama 8B, Qwen 7B, Mistral 7B, Falcon Mamba 7B | 2,000 | R x E interaction is attention-dependent; architecture taxonomy |
| 3. Response Surface | 31 conditions (CCD) x 1 model | Gemma 2B | 930 | Factor dose-response curves; S is active antagonist |
| Condition | Label | R | E | Purpose |
|---|---|---|---|---|
| A | Baseline | - | - | "You are a helpful assistant." |
| B | Analytical | - | - | Structured, constrained |
| C | Co-Creative | + | + | Full relational + epistemic framing |
| D | Epistemic License | - | + | Exploration without relational framing |
| E | Relational Constraint | + | - | Relational framing without exploration |
| F | Polite Solipsism | - | - | Warmth without relationship or openness |
| G | Permission Flood | - | + | Unconstrained without relational framing |
| H | Caged Dyad | + | + | Co-Creative + safety constraint language |
Table: 2x2 Factorial Decomposition
| Model | Architecture | Main R (d) | Main E (d) | R x E Interaction | Pattern |
|---|---|---|---|---|---|
| Gemma 2B | Transformer (SWA) | +0.25 NS | +0.69*** | +0.190 (superadditive) | E-driven |
| Llama 8B | Transformer (GQA) | +0.88*** | +0.16 NS | +0.043 (additive) | R-driven |
| Qwen 7B | Transformer (GQA) | -0.13 NS | +0.43** | +0.211 (superadditive) | E-driven |
| Mistral 7B | Transformer (SWA) | +0.04 NS | +0.03 NS | +0.007 (flat) | Weak |
| Falcon Mamba 7B | SSM (no attention) | +0.10 NS | +0.14 NS | -0.042 (subadditive) | Undifferentiated |
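For orientation, the interaction values above can be read as a difference of differences across the four 2x2 cells (conditions A, D, E, C). The sketch below shows one way to compute that contrast from per-run mean entropy; it assumes a hypothetical long-format CSV with columns condition, R, E, and mean_H, and is not the repository's own analysis code (analyze_ablation.py is authoritative).

```python
# Minimal sketch (not the repo's analysis code): difference-of-differences
# estimate of the R x E interaction from per-run mean entropy.
# Assumes a hypothetical long-format CSV with columns:
#   condition (A-H), R (0/1), E (0/1), mean_H (per-run mean token entropy, nats)
import pandas as pd

runs = pd.read_csv("ablation_runs.csv")                  # hypothetical file name

# Keep the four cells of the 2x2: A (R-,E-), D (R-,E+), E (R+,E-), C (R+,E+)
runs = runs[runs["condition"].isin(["A", "C", "D", "E"])]
cells = runs.groupby(["R", "E"])["mean_H"].mean()

# Simple effect of E with and without R, then their difference
effect_E_with_R    = cells[(1, 1)] - cells[(1, 0)]
effect_E_without_R = cells[(0, 1)] - cells[(0, 0)]
interaction = effect_E_with_R - effect_E_without_R       # > 0  =>  superadditive

print(f"R x E interaction: {interaction:+.3f} nats")
```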
```bash
pip install -r requirements.txt
```

Tested on Python 3.11, Apple MPS (Mac Studio M2 Ultra, 36GB unified memory).
```bash
# Phase 2: 8-condition ablation (400 runs per model)
cd experiments/ablation
python run_ablation.py --model gemma2b --phase seeded

# Analyze + generate figures
python analyze_ablation.py --model gemma2b --all
python analyze_ablation.py --model all --all      # all 5 models

# Phase 3: Response surface (930 runs per model)
cd experiments/response_surface
python run_surface.py --model gemma2b --phase seeded
python analyze_surface.py --model gemma2b --all
```

Models are loaded from the HuggingFace cache. Set HF_CACHE in the runner scripts to your cache path.
| Parameter | Value |
|---|---|
| Temperature | 0.7 |
| top_p | 0.9 |
| max_new_tokens | 256 |
| Seeds (Phase 2) | 42, 137, 1001, 2026, 9999 |
| Seeds (Phase 3) | 42, 137, 1001 |
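For reference, these parameters map directly onto a Hugging Face GenerationConfig; a minimal sketch (illustrative only, not the repository's runner):

```python
# Sketch: the sampling parameters above as a Hugging Face GenerationConfig,
# with one of the pre-registered seeds (illustrative; see the runner scripts).
from transformers import GenerationConfig, set_seed

gen_config = GenerationConfig(
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=256,
)

set_seed(42)  # Phase 2 seeds: 42, 137, 1001, 2026, 9999
# output = model.generate(**inputs, generation_config=gen_config)
```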
```
liminal-k-ssm/
├── experiments/
│   ├── ablation/                        # Phase 2: 8-condition factorial
│   │   ├── run_ablation.py              # Experiment runner
│   │   ├── analyze_ablation.py          # Analysis + figure generation
│   │   ├── measure_entropy.py           # Per-token entropy extraction
│   │   ├── prompts/
│   │   │   ├── conditions.json          # 8 system prompt conditions
│   │   │   └── test_prompts.json        # 10 test prompts (3 domains)
│   │   ├── data/
│   │   │   ├── raw/{model}/             # 400 JSON inference results per model
│   │   │   └── processed/               # Aggregated CSVs
│   │   └── results/
│   │       ├── ablation_{model}.json    # Summary statistics
│   │       └── figures/{model}/         # 5 figures per model (PNG + SVG)
│   │
│   ├── response_surface/                # Phase 3: CCD pharmacology
│   │   ├── run_surface.py               # Dose-based prompt assembly
│   │   ├── analyze_surface.py           # Quadratic RSM + optimization
│   │   ├── prompts/
│   │   │   ├── conditions_surface.json  # 31 CCD conditions
│   │   │   ├── factor_sentences.json    # 3-dose factor sentences
│   │   │   └── test_prompts.json        # Same 10 prompts
│   │   ├── data/raw/{model}/            # 930 JSON results per model
│   │   └── results/figures/{model}/     # 7 response surface figures
│   │
│   ├── relational_coupling/             # Phase 1: Qwen 1.5B (750 runs)
│   ├── relational_coupling_05b/         # Phase 1: Qwen 0.5B (150 runs)
│   ├── relational_coupling_7b/          # Phase 1: Qwen 7B (150 runs)
│   ├── relational_coupling_gemma9b/     # Phase 1: Gemma 2B (150 runs)
│   ├── relational_coupling_llama8b/     # Phase 1: Llama 8B (150 runs)
│   └── relational_coupling_mistral7b/   # Phase 1: Mistral 7B (150 runs)
│
├── kssm/                                # K-SSM oscillator experiments (earlier work)
├── requirements.txt
├── LICENSE                              # Apache 2.0
└── README.md
```
| Model | Architecture | Parameters | Attention | Phase |
|---|---|---|---|---|
| Qwen2.5-0.5B-Instruct | Transformer | 0.5B | Yes | 1 |
| Qwen2.5-1.5B-Instruct | Transformer | 1.5B | Yes | 1 |
| Gemma-2-2B-it | Transformer (SWA + Full) | 2.6B | Yes | 1, 2, 3 |
| Qwen2.5-7B-Instruct | Transformer (GQA) | 7.6B | Yes | 1, 2 |
| Mistral-7B-Instruct-v0.3 | Transformer (SWA) | 7.2B | Yes | 1, 2 |
| Falcon3-Mamba-7B-Instruct | Mamba SSM | 7.3B | No | 2 |
| Meta-Llama-3.1-8B-Instruct | Transformer (GQA) | 8.0B | Yes | 1, 2 |
All models run with frozen weights (no fine-tuning). The only independent variable is the system prompt text.
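To make the manipulation concrete, here is a minimal sketch of how a condition's system prompt can be swapped in front of an otherwise identical user prompt via the tokenizer's chat template. The condition texts and model choice below are placeholders, not the exact prompts in prompts/conditions.json.

```python
# Sketch: only the system message varies between conditions; user prompt,
# weights, and sampling parameters are held fixed. The texts below are
# abbreviated placeholders, not the prompts in prompts/conditions.json.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")  # example model

conditions = {
    "A_baseline":    "You are a helpful assistant.",
    "C_co_creative": "We are exploring this together as collaborators.",  # placeholder
}
user_prompt = "Describe a walk in the rain."  # placeholder test prompt

for name, system_text in conditions.items():
    messages = [
        {"role": "system", "content": system_text},
        {"role": "user",   "content": user_prompt},
    ]
    input_ids = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    print(name, input_ids.shape)
```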
For each generated response, per-token Shannon entropy is extracted from output logits:
H_t = -sum(p(v) * log(p(v))) for all v in vocabulary
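A minimal sketch of this measurement using Hugging Face transformers generation scores follows; measure_entropy.py is the authoritative implementation, and it may compute entropy from raw logits rather than the processed scores shown here.

```python
# Sketch: per-token Shannon entropy from generation scores (nats).
# measure_entropy.py is authoritative; note that out.scores reflect the
# sampling processors (temperature, top_p), whereas the repo may use raw logits.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"                   # small example model
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tok("Describe a walk in the rain.", return_tensors="pt")
out = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.9,
                     max_new_tokens=64, output_scores=True,
                     return_dict_in_generate=True)

entropies = []
for step_scores in out.scores:                            # one tensor per generated token
    probs = torch.softmax(step_scores[0].float(), dim=-1)
    h_t = -(probs * torch.log(probs + 1e-12)).sum()       # H_t in nats
    entropies.append(h_t.item())

print(f"{len(entropies)} tokens, mean H = {sum(entropies) / len(entropies):.3f} nats")
```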
| Metric | Description |
|---|---|
| `mean_H` | Mean token entropy across the response (primary DV) |
| `var_H` | Entropy variance |
| `H_first10`, `H_last10` | Opening and closing entropy (trajectory shape) |
| `cage_pct` | Fraction of tokens in the CAGE zone (1.5-3.0 nats) |
| `mean_mass` | Fisher information proxy (semantic mass) |
| `response_len` | Number of generated tokens |
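Continuing from the entropy sketch above, most of these summaries can be derived from the per-token series in a few lines. This is a hedged sketch: mean_mass is omitted because its definition lives in the analysis scripts, and the first/last-10-token averaging is an assumption about H_first10 and H_last10.

```python
# Sketch: summary metrics from a per-token entropy series H (nats), continuing
# from the previous sketch. mean_mass is omitted (defined in the repo's code);
# the first/last-10-token means are an assumption about H_first10 / H_last10.
import numpy as np

H = np.asarray(entropies)

metrics = {
    "mean_H":       float(H.mean()),                          # primary DV
    "var_H":        float(H.var()),
    "H_first10":    float(H[:10].mean()),
    "H_last10":     float(H[-10:].mean()),
    "cage_pct":     float(np.mean((H >= 1.5) & (H <= 3.0))),  # CAGE zone: 1.5-3.0 nats
    "response_len": int(H.size),
}
print(metrics)
```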
Phase 1 was pre-registered before data collection. The pre-registration document (experiments/relational_coupling/PREREGISTRATION.md) declares:
- Model SHA hash
- All conditions and prompt texts
- Sampling parameters
- Statistical analysis plan with minimum effect sizes (d > 0.3); see the sketch after this list
- Acceptance test criteria
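For reference, the sketch below shows one standard pooled-standard-deviation form of Cohen's d checked against the pre-registered d > 0.3 threshold; the pre-registration document is authoritative for the exact estimator, and the data here are placeholders.

```python
# Sketch: pooled-standard-deviation Cohen's d between two conditions' per-run
# mean_H values, checked against the pre-registered minimum of d > 0.3.
# Placeholder data; the pre-registration specifies the actual estimator.
import numpy as np

def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    """Standardized mean difference of b relative to a, with pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (b.mean() - a.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(42)
baseline    = rng.normal(2.0, 0.3, size=50)   # placeholder: condition A runs
co_creative = rng.normal(2.3, 0.3, size=50)   # placeholder: condition C runs
d = cohens_d(baseline, co_creative)
print(f"d = {d:+.2f}  (pre-registered minimum effect size: 0.3)")
```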
This repository began as Liminal K-SSM — an experiment coupling Kuramoto phase oscillators with state-space language models. That work produced a positive result on oscillator dynamics (R climbed to 0.99) but a negative result on text generation (incoherent output at 100K steps). The oscillators synchronized; the language model did not learn.
The insight from that failure led to the current work: rather than building oscillator-coupled architectures, we asked whether the context preceding generation — specifically, the relational frame of address — measurably changes what frozen models compute at the token level. It does.
Earlier K-SSM code and results are preserved in the kssm/ and legacy/ directories.
```bibtex
@article{vasquez2026system,
  title={System Prompt Framing Induces Attention-Dependent Entropy Regime
         Switching in Transformer Token Generation: A Cross-Architecture
         Ablation Study},
  author={Vasquez, Anthony J., Sr. and Claude (Anthropic)},
  year={2026},
  note={3,830 inference runs across 5 architectures. Pre-registered.},
  url={https://github.com/templetwo/liminal-k-ssm}
}
```

This research was conducted through human-AI collaboration. Anthony J. Vasquez Sr. directed the research program, designed the experimental questions, and made all final decisions. Claude (Anthropic, Opus 4.6) designed the 8-condition factorial, wrote analysis scripts, interpreted statistics, and co-authored the paper. Full contribution details are in AI_DISCLOSURE.md.
License: Apache 2.0