DALA — Dynamic Auction-based Language Agent

Reference implementation of Cost-Effective Communication: Auction-based Language Agent Interaction (Fan et al., 2025).

Multi-agent LLM systems tend to over-communicate: every agent talks every round, every message costs tokens, and the bill grows super-linearly with the number of agents. DALA reframes inter-agent communication as an economic resource-allocation problem:

Each agent drafts a candidate message and submits a bid proportional to the estimated informational utility of that message.
A central auctioneer runs a (first-price / second-price / softmax) auction and picks the winning speaker(s) for the round.
Losing messages are suppressed: their completion tokens are never charged to the global token budget, and the only cost they incur is the prompt tokens needed to generate the bid. The winner is additionally charged for its completion tokens.
The loop repeats until the budget is exhausted or max_rounds elapses.
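
The round described above can be sketched in a few lines. This is a minimal illustration of a single-winner second-price round, not the code in `auction/mechanism.py`: the `Bid` dataclass, the `run_second_price_round` function, and the example numbers are all hypothetical, and how DALA actually uses the clearing price is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    agent: str              # hypothetical agent name
    amount: float           # bid value submitted to the auctioneer
    prompt_tokens: int      # tokens spent producing the bid
    completion_tokens: int  # tokens the drafted message would cost


def run_second_price_round(bids, remaining_budget):
    """Pick the highest bidder and return (winner, clearing_price, budget_left)."""
    if not bids:
        raise ValueError("need at least one bid")
    ranked = sorted(bids, key=lambda b: b.amount, reverse=True)
    winner = ranked[0]
    # Second-price rule: the clearing price is the runner-up's bid.
    clearing_price = ranked[1].amount if len(ranked) > 1 else winner.amount
    # Every bidder pays its prompt tokens; only the winner pays completion tokens.
    charged = sum(b.prompt_tokens for b in bids) + winner.completion_tokens
    return winner, clearing_price, remaining_budget - charged


bids = [
    Bid("generalist", 0.4, 30, 120),
    Bid("math", 0.9, 30, 200),
    Bid("code", 0.6, 30, 150),
]
winner, price, left = run_second_price_round(bids, remaining_budget=2048)
print(winner.agent, price, left)  # math 0.6 1758
```

In this sketch the two losing drafts (120 and 150 completion tokens) never touch the budget, which is where the savings come from.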

A small learnable head (one bid-aggressiveness coefficient α per agent) is tuned with a REINFORCE-style signal on the validation reward, defined as accuracy − λ · tokens/budget, matching the ablation in §4.3 of the paper.
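
To make the reward and the update concrete, here is a hedged sketch of one REINFORCE-style step on a single α. It assumes a Gaussian exploration policy around α with fixed standard deviation `sigma`, a scalar `baseline`, and λ = 0.5; none of these choices are taken from the repository or the paper, and the function names are hypothetical.

```python
def reward(accuracy, tokens, budget, lam=0.5):
    """Reward from the text: accuracy - lambda * tokens / budget (lam is assumed)."""
    return accuracy - lam * tokens / budget


def reinforce_step(alpha, sampled_alpha, r, baseline, sigma=0.1, lr=0.01):
    """One score-function update for the mean of a Gaussian policy over alpha.

    sampled_alpha is the coefficient actually used during the episode,
    drawn from N(alpha, sigma^2) in this sketch.
    """
    # d/d(alpha) of log N(sampled_alpha; alpha, sigma^2)
    grad_log_prob = (sampled_alpha - alpha) / sigma**2
    # Move alpha toward samples that beat the baseline, away from those that do not.
    return alpha + lr * (r - baseline) * grad_log_prob


r = reward(accuracy=0.75, tokens=512, budget=2048)   # 0.75 - 0.5 * 0.25
new_alpha = reinforce_step(alpha=1.0, sampled_alpha=1.1, r=r, baseline=0.5)
print(r, round(new_alpha, 4))
```

A sampled α above the current mean that earns an above-baseline reward pulls α upward, so the agent bids more aggressively; a below-baseline reward pushes it the other way.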

📐 Repository layout

DALA/
├── src/dala/
│   ├── agents/        # bidder agent + auctioneer dialogue driver
│   │   ├── base.py
│   │   ├── bidder.py
│   │   └── auctioneer.py
│   ├── auction/       # mechanisms, budget tracker, utility estimators
│   │   ├── mechanism.py   # FirstPrice / SecondPrice / Softmax
│   │   ├── budget.py      # BudgetTracker + BudgetExhausted
│   │   └── utility.py     # Logprob / Entropy / Heuristic
│   ├── llm/           # pluggable backends
│   │   ├── base.py
│   │   ├── mock_backend.py        # deterministic, for CI / smoke runs
│   │   ├── openai_backend.py      # OpenAI chat-completions
│   │   ├── openrouter_backend.py  # OpenRouter (any upstream model)
│   │   ├── hf_backend.py          # transformers + apply_chat_template
│   │   └── registry.py
│   ├── data/          # Lightning DataModules for MMLU, GSM8K, HumanEval
│   ├── models/        # DALALightningModule
│   ├── metrics/       # accuracy + token-cost + cost-efficiency
│   ├── utils/         # seeding, logging, answer parsing
│   └── config.py      # YAML → DataModule + LightningModule
├── configs/           # default.yaml, mmlu.yaml, gsm8k.yaml, humaneval.yaml,
│                      # budget_sweep.yaml, openrouter.yaml
├── scripts/           # train.py, eval.py, run_experiment.py, budget_sweep.py
├── tests/             # 33 unit + end-to-end tests
└── experiments/       # smoke.sh

🚀 Quick start

pip install -e .[dev]
pytest -q                                                # 33 tests, all green
bash experiments/smoke.sh                                # 4-example smoke run
python scripts/run_experiment.py --config configs/default.yaml

Everything above runs fully offline on the built-in MockBackend — no API keys required. That's what makes CI deterministic and lets you iterate on the auction logic without spending a cent.

Train the bidding policy

python scripts/train.py --config configs/mmlu.yaml

This fits the per-agent α coefficients on the training split and reports val/accuracy, val/mean_tokens, and val/cost_efficiency.

Budget-vs-accuracy sweep (Figure 3)

python scripts/budget_sweep.py --config configs/budget_sweep.yaml \
                               --output outputs/budget_sweep.json

Runs DALA at budgets {256, 512, 1024, 1536, 2048} tokens/example and dumps the accuracy-cost curve to JSON.

🔌 LLM backends

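| Backend     | YAML key     | Install                    | Auth                 |
|-------------|--------------|----------------------------|----------------------|
| Mock        | `mock`       | (built-in)                 | -                    |
| OpenAI      | `openai`     | `pip install dala[openai]` | `OPENAI_API_KEY`     |
| OpenRouter  | `openrouter` | `pip install dala[openai]` | `OPENROUTER_API_KEY` |
| HuggingFace | `hf`         | `pip install dala[hf]`     | -                    |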

OpenRouter

OpenRouter is a single OpenAI-compatible endpoint that proxies dozens of upstream providers (Anthropic, OpenAI, Mistral, Together, Groq, DeepSeek, …). One API key, one line of YAML, any model:

model:
  backend: openrouter
  backend_kwargs:
    model: anthropic/claude-3.5-sonnet      # or openai/gpt-4o-mini, meta-llama/llama-3.1-70b-instruct, ...
    http_referer: https://github.com/waltstephen/Cost-Effective-Communication
    app_title: DALA

export OPENROUTER_API_KEY=sk-or-...
python scripts/run_experiment.py --config configs/openrouter.yaml

When an upstream provider doesn't return usage metadata, the backend falls back to a word-count token estimate, so every call is still charged against the budget. The estimate is approximate, since word counts and tokenizer counts differ.
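
The fallback can be pictured roughly as follows. This is a sketch under assumptions, not the code in `openrouter_backend.py`: `count_tokens` is a hypothetical helper, and `usage` is assumed to be a plain dict shaped like the `usage` object of an OpenAI-compatible chat-completions response.

```python
def count_tokens(usage, text):
    """Return the provider-reported token total, or a word-count estimate.

    usage: dict such as {"prompt_tokens": 12, "completion_tokens": 30,
           "total_tokens": 42}, or None when the provider omits it.
    text:  the prompt plus completion text, used only for the fallback.
    """
    if usage and usage.get("total_tokens") is not None:
        return int(usage["total_tokens"])
    # Fallback: whitespace-delimited word count. For English text this is
    # usually lower than a subword tokenizer's count.
    return len(text.split())


print(count_tokens({"total_tokens": 42}, "ignored when usage is present"))  # 42
print(count_tokens(None, "the quick brown fox"))                            # 4
```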

Switching backends from code

from dala.llm import get_backend, list_backends

print(list_backends())            # ['hf', 'mock', 'openai', 'openrouter']
b = get_backend("openrouter", model="openai/gpt-4o-mini")

🧩 Configuration

Every experiment is a single YAML file composed of three sections:

data:
  name: mmlu                       # mmlu | gsm8k | humaneval
  max_examples: 64
  factory:
    use_hf: true                   # flip to false for offline fallback

model:
  n_agents: 3
  mechanism: second_price          # first_price | second_price | softmax
  max_rounds: 4
  top_k: 1                         # speakers per round
  budget: 2048                     # global token budget per example
  backend: mock                    # mock | openai | openrouter | hf
  specialties: [generalist, math, code]
  alpha_init: 1.0
  lr: 0.01

trainer:                           # forwarded to pl.Trainer
  max_epochs: 3
  accelerator: cpu

📊 Metrics

DALA tracks three things per run:

Accuracy — task-specific (MCQ letter match for MMLU, numeric match for GSM8K, return-presence heuristic for HumanEval smoke runs).
Token cost — total prompt+completion tokens charged to the BudgetTracker, broken down per agent.
Cost efficiency — the headline number: accuracy / (mean_tokens / 1000).
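
As a worked example of the cost-efficiency formula, the snippet below evaluates it for two made-up runs. The function name and the numbers are illustrative only and are not results from the paper or this repository.

```python
def cost_efficiency(accuracy, mean_tokens):
    """accuracy / (mean_tokens / 1000): accuracy per thousand tokens spent."""
    if mean_tokens <= 0:
        raise ValueError("mean_tokens must be positive")
    return accuracy / (mean_tokens / 1000)


# Illustrative inputs: a cheaper run can win on efficiency despite lower accuracy.
expensive = cost_efficiency(accuracy=0.80, mean_tokens=1600)
cheap = cost_efficiency(accuracy=0.75, mean_tokens=1000)
print(expensive, cheap)  # 0.5 0.75
```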

🧪 Testing

pytest -q                          # 33 tests, ~1s on CPU
pytest --cov=dala --cov-report=term-missing

Tests cover the auction mechanisms (winner selection, clearing price, reserve price, top-k, softmax determinism), the budget tracker (charging, exhaustion, reset), utility estimators, the mock backend's determinism, the answer parsers, each dataset's offline fallback, the backend registry, and an end-to-end auctioneer → Lightning-module smoke path.

📄 Citation

@article{fan2025dala,
  title   = {Cost-Effective Communication: Auction-based Language Agent Interaction},
  author  = {Fan, Yijia and Zhang, Jusheng and Cai, Kaitong and Yang, Jing
             and Tang, Chengpei and Wang, Jian and Wang, Keze},
  journal = {arXiv preprint arXiv:2511.13193},
  year    = {2025}
}

📜 License

MIT — see LICENSE.
