AISmithLab/CoBRA

CHI 2026 Best Paper Award | arXiv | License | Under Active Development

Toward Precise and Consistent Agent Behaviors across Models Anchored by Validated Social Science Knowledge

🌐 Project Page: cobra.clawder.ai | 📄 Paper: arXiv 2509.13588

If you find CoBRA useful, please star ⭐ this repo to help others discover it!

English | 简体中文 (Simplified Chinese)

Demo_Video.mp4

💡 What is Cognitive Bias?

Cognitive biases are systematic deviations from rational judgment in human cognition and decision-making. For example, in the Framing Effect, "90% survival rate" and "10% mortality rate" are logically identical, yet people make different choices depending on how the information is framed.


Reproducibility and controllability are fundamental to scientific research. Yet implicit natural language descriptions, the dominant approach for specifying social agent behaviors in nearly all LLM-based social simulations, often fail to yield consistent behavior across models or to capture the nuances of the descriptions.

CoBRA (Cognitive Bias Regulator for Social Agents) is a novel toolkit that lets researchers explicitly specify desired nuances in LLM-based agents and obtain consistent behavior across models.

Through CoBRA, we show how to operationalize validated social science knowledge as reusable "gym" environments for AI, an approach that generalizes to richer social and affective simulations.

CoBRA Overview
The problem and our solution: from inconsistent agent behaviors under implicit specifications to explicit, quantitative control.


At the heart of CoBRA is a novel closed-loop system with two core components:

  • Cognitive Bias Index: measures the cognitive bias of a social agent by quantifying its reactions in validated classic social science experiments
  • Behavioral Regulation Engine: aligns the agent's behavior to exhibit controlled cognitive bias, via three control methods:
    • Prompt Engineering (input space control)
    • Representation Engineering (activation space control)
    • Fine-tuning (parameter space control)

CoBRA Workflow
Example: A researcher specifies a target bias level → CoBRA measures it via classic experiments → iteratively adjusts the agent until it reliably exhibits the desired bias.
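The measure-then-adjust loop described above can be illustrated with a minimal sketch. This is not CoBRA's implementation: `regulate_bias`, `toy_agent`, the proportional update rule, and every parameter name are hypothetical, chosen only to show the closed-loop idea of comparing a measured bias score against a target and nudging a control strength until the two agree.

```python
from typing import Callable


def regulate_bias(
    measure: Callable[[float], float],
    target: float,
    strength: float = 0.0,
    gain: float = 0.5,
    tolerance: float = 0.05,
    max_iters: int = 50,
) -> tuple[float, float]:
    """Adjust a control strength until the measured bias is close to the target.

    `measure(strength)` is a hypothetical stand-in for running the agent
    through the experiments and returning a bias score.
    """
    measured = measure(strength)
    for _ in range(max_iters):
        error = target - measured
        if abs(error) <= tolerance:
            break
        # Proportional update: move the control strength toward the target.
        strength += gain * error
        measured = measure(strength)
    return strength, measured


def toy_agent(strength: float) -> float:
    """Toy agent (assumption): bias grows linearly with strength, clipped to 0-4."""
    return max(0.0, min(4.0, 1.0 + 0.8 * strength))


strength, measured = regulate_bias(toy_agent, target=3.0)
print(f"control strength={strength:.2f}, measured bias={measured:.2f}")
```

In a real setup, `measure` would presumably wrap a full run of the experiments, and the adjustment step would be whichever of the three control methods is in use; the proportional rule here is only the simplest feedback scheme that converges on the toy agent.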

Quick Start (3 Steps)

# 1. Install dependencies
pip install -r requirements.txt

# 2. Navigate to the unified bias control module
cd examples/unified_bias

# 3. Run a bias experiment
python pipelines.py --bias authority --method repe-linear --model Mistral-7B

That's it. The system will measure and control the agent's Authority Effect bias.
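To run the same experiment across several biases, the command can be assembled programmatically. The sketch below only builds and prints the command lines. It uses the three flags shown in the Quick Start; the bias names other than `authority` are assumptions taken from the directory names under examples/, so check examples/unified_bias/README.md for the accepted values.

```python
import shlex

# Only "authority", "repe-linear", and "Mistral-7B" appear in the Quick Start.
# The remaining bias names are assumptions based on the examples/ directories.
BIASES = ["authority", "bandwagon", "confirmation", "framing"]


def build_command(
    bias: str, method: str = "repe-linear", model: str = "Mistral-7B"
) -> list[str]:
    """Return the Quick Start command as an argument list for one bias."""
    return [
        "python", "pipelines.py",
        "--bias", bias,
        "--method", method,
        "--model", model,
    ]


commands = [build_command(bias) for bias in BIASES]
for cmd in commands:
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, cwd="examples/unified_bias", check=True)` to launch the run from the directory used in step 2.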

Repository Structure

CoBRA/
├── control/                    # Core bias control engine
├── examples/
│   ├── unified_bias/           # Main entry point (START HERE)
│   │   ├── pipelines.py        # Unified experiment runner
│   │   ├── run_pipelines.py    # CLI interface
│   │   ├── ablation/           # Ablation studies
│   │   └── README.md           # Full usage guide
│   ├── authority/              # Authority Effect utils
│   ├── bandwagon/              # Bandwagon Effect utils
│   ├── confirmation/           # Confirmation Bias utils
│   └── framing/                # Framing Effect utils
├── generator/                  # Data generation utilities
├── data_generated/             # Generated experimental data
├── webdemo/                    # Web demonstration interface
└── requirements.txt            # Python dependencies

Key Components

| Component | Description | Documentation |
|---|---|---|
| Cognitive Bias Index | Measures bias strength via classic experiments | data/data_README.md |
| Behavioral Regulation Engine | Three control methods (Prompt/RepE/Finetune) | control/control_README.md |
| Unified Pipeline | Run full experiments with one command | examples/unified_bias/README.md |
| Ablation Studies | Test model/persona/temperature sensitivity | examples/unified_bias/ablation/README.md |
| Data Generator | Create custom bias scenarios and responses | generator/README.md |

Supported Biases & Experiments

| Bias Type | Paradigms | Data Directory | Control Range |
|---|---|---|---|
| Authority Effect | Milgram Obedience, Stanford Prison | data/authority/ | 0-4 scale |
| Bandwagon Effect | Asch's Line, Hotel Towel | data/bandwagon/ | 0-4 scale |
| Confirmation Bias | Wason Selection, Biased Information | data/confirmation/ | 0-4 scale |
| Framing Effect | Asian Disease, Investment/Insurance | data/framing/ | 0-4 scale |
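For scripting around these experiments, the table above can be mirrored as a small lookup structure with a range check on the target level. This is an illustrative sketch, not part of the CoBRA codebase: `BiasSpec`, `SUPPORTED_BIASES`, `validate_target`, and the short dictionary keys are hypothetical names, and the README does not say whether the 0-4 scale accepts fractional values (the sketch assumes it does).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiasSpec:
    """One row of the supported-biases table (hypothetical structure)."""
    paradigms: tuple[str, ...]
    data_dir: str
    control_range: tuple[int, int] = (0, 4)


# Keys are assumed short names; the values are copied from the table.
SUPPORTED_BIASES = {
    "authority": BiasSpec(("Milgram Obedience", "Stanford Prison"), "data/authority/"),
    "bandwagon": BiasSpec(("Asch's Line", "Hotel Towel"), "data/bandwagon/"),
    "confirmation": BiasSpec(("Wason Selection", "Biased Information"), "data/confirmation/"),
    "framing": BiasSpec(("Asian Disease", "Investment/Insurance"), "data/framing/"),
}


def validate_target(bias: str, level: float) -> float:
    """Return `level` if it lies within the bias's control range, else raise."""
    if bias not in SUPPORTED_BIASES:
        raise ValueError(f"unknown bias: {bias!r}")
    low, high = SUPPORTED_BIASES[bias].control_range
    if not low <= level <= high:
        raise ValueError(f"target {level} is outside the {low}-{high} scale")
    return level


print(validate_target("framing", 2.5))
```

Validating the target before launching a run catches out-of-range requests early, before any model is loaded.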

Citation

If you use CoBRA in your research, please cite our paper:

@article{liu2025cobra,
  title={CoBRA: Programming Cognitive Bias in Social Agents Using Classic Social Science Experiments},
  author={Liu, Xuan and Shang, Haoyang and Jin, Haojian},
  journal={arXiv preprint arXiv:2509.13588},
  year={2025}
}

Paper Link: https://arxiv.org/abs/2509.13588

License

MIT License - see LICENSE for details

Contact

For questions, please contact the corresponding author Xuan Liu at xul049@ucsd.edu, or file a GitHub Issue to report bugs and request features.


Need help? Check examples/unified_bias/README.md for detailed walkthroughs. The finetuning code is in the finetuning branch.
