[ICLR 2025] General-purpose activation steering library
Updated Sep 18, 2025 · Python
KV Cache Steering for Inducing Reasoning in Small Language Models
[ACL 2026] - Official repo for the paper: "Selective Steering: Norm-Preserving Control Through Discriminative Layer Selection"
Official code for "Activation Steering for Accent Adaptation in Speech Foundation Models" (Interspeech 2026). Parameter-free accent adaptation via mean-shift steering vectors — no weight updates, consistent WER reductions across 8 accents.
[ICLR 2026] ASGuard: Activation-Scaling Guard to Mitigate Targeted Jailbreaking Attack
Activation steering and trait monitoring for HuggingFace transformers
Phase-aware LLM activation steering and linear probing. A memory-efficient, practical implementation of Representation Engineering (RepE) for safety research.
Accepted at the 19th Conference of the European Chapter of the Association for Computational Linguistics, 2026
Activation steering toolkit for Llama 3.2 3B — inject sensory-constructed vectors into model activations to alter processing dispositions. Web UI + API. Runs locally on consumer hardware.
🌐 Model adversarial economics and AI alignment using the Panopticon Lattice, a multi-agent simulation exploring hidden collusion and system dynamics.
An experimental comparison of prompt-based behavioral steering and activation steering in LLMs.
Qwen3-0.6B activation steering: style vectors, lens contamination eval, CPRR methodology
Experimental local assistant runtime for GGUF models that steers token generation with activation perturbations and verbal control loops for self-correction, continuity, and future memory-driven support.
Mechanistic interpretability experiments detecting "Evaluation Awareness" in LLMs - identifying if models internally represent being monitored
Official implementation of "Beyond Multiple Choice: Evaluating Steering Vectors for Adaptive Free-Form Summarization" (ICML 2025 R2FM Workshop).
Reshape how AI thinks, one slider at a time. Activation steering for local LLMs with visual sliders, presets, and a web UI.
Bio-inspired multi-scale competency architecture for LLMs. VCG auction routing (Mixture of Bidders) and activation steering for cognitive homeostasis, grounded in Michael Levin's TAME framework
Multi-Agent Evolutionary Simulation exploring adversarial economics and AI steering