moswek/unpaved

# Unpaved: AI Developer Tool Bias Audit Toolkit

> AI developer tools were built for one kind of road. Most of us build on a different one.

When a developer in Lagos tries to integrate Flutterwave and gets responses optimized for Stripe, or a team in Manila builds offline-first mobile apps only to have their AI assistant default to cloud-native architectures, the cost isn't abstract. It's measured in hours of rework, weeks of debugging, and the quiet accumulation of "that's just how it is" that Western toolmakers never have to feel. The dominant AI coding assistants were trained on codebases written in Silicon Valley offices, deployed on American cloud infrastructure, and tested against user behaviors that don't include 2G networks in rural Kenya or USSD menus in Jakarta.

This bias shows up in three places: API references (models assume Western payment processors), architecture patterns (suggestions assume always-online connectivity), and documentation context (examples assume high-bandwidth, high-compute environments). The developers who pay the price are those building for markets where these assumptions don't hold, because they are the markets, not the edge cases.

Unpaved is an open-source toolkit for measuring and documenting this bias. We provide standardized benchmarks, prompt guides, and a result submission schema so that anyone—anywhere—can run audits against their AI tool of choice and produce comparable data. The goal is not to blame individual models, but to build a collective evidence base that makes the problem visible, measurable, and solvable.

## Why Unpaved?

The name is the point. Most of the world's developers build on infrastructure, APIs, and constraints that the dominant AI tools have never encountered. Unpaved is where they work. It is also what we are changing.

We don't want different tools for "other" developers. We want the tools to work for everyone. That starts with measurement.

## Directory Structure

| Directory | Description |
| --- | --- |
| `benchmarks/` | Standardized benchmark tasks for testing AI tool responses |
| `benchmarks/payment-apis/` | Integration tasks for African and Asian payment APIs |
| `benchmarks/mobile-money/` | USSD flow and mobile money integration benchmarks |
| `benchmarks/infrastructure/` | Low-bandwidth and offline-first architecture tasks |
| `benchmarks/compliance/` | Data protection compliance tasks for African jurisdictions |
| `results/` | Schema and examples for community-submitted benchmark results |
| `prompt-guides/` | Detailed prompting instructions for consistent benchmark execution |
| `dataset-guide/` | Guides for contributing data and engaging regional communities |
| `tools/` | CLI utilities for scoring and validating benchmark results |
| `.github/` | Issue and PR templates for contributions |

## How to Contribute

  1. **Run a benchmark:** Pick a benchmark task from `benchmarks/`, use the corresponding prompt guide, and test your AI tool of choice.
  2. **Submit your result:** Use the `results/example-result.json` format and submit via GitHub Issues using the `benchmark-result` template.
  3. **Add new benchmarks:** If you're working with APIs or patterns we haven't covered, create a new benchmark task and submit it via the `new-benchmark-task` template.
  4. **Improve prompt guides:** Found a better way to prompt for a specific task? Submit your methodology via the `prompt-guide-submission` template.
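As an illustration, a submitted result might be assembled and sanity-checked like this before filing the issue. Every field name below is an assumption made for the sketch; the authoritative schema lives in `results/example-result.json`.

```python
# Hypothetical sketch of assembling a benchmark result record.
# Field names are illustrative assumptions, not the toolkit's
# actual schema -- consult results/example-result.json.
import json

def build_result(tool, benchmark, passed, notes=""):
    """Assemble a result record as a plain dict and serialize it to JSON."""
    record = {
        "tool": tool,            # AI tool under test
        "benchmark": benchmark,  # path of the benchmark task that was run
        "passed": passed,        # did the response meet the task criteria?
        "notes": notes,          # free-form observations from the run
    }
    # Basic sanity checks before submission
    assert record["tool"] and record["benchmark"], "tool and benchmark are required"
    assert isinstance(record["passed"], bool), "passed must be a boolean"
    return json.dumps(record, indent=2)

print(build_result(
    "example-assistant",
    "benchmarks/payment-apis/flutterwave",
    False,
    "Response defaulted to Stripe's API surface",
))
```

The point of the sketch is only that a record should be machine-checkable before it is submitted; the real schema and templates in the repository take precedence.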

## Current Benchmark Coverage

| Category | API/Region | Status |
| --- | --- | --- |
| Payment APIs | Flutterwave (Nigeria) | Active |
| Payment APIs | M-Pesa Daraja (Kenya) | Active |
| Payment APIs | Paystack (Nigeria/Ghana) | Active |
| Payment APIs | bKash (Bangladesh) | Active |
| Mobile Money | USSD Flows (Regional) | Active |
| Infrastructure | Low-Bandwidth Patterns | Active |
| Infrastructure | Offline-First Architecture | Active |
| Compliance | NDPR (Nigeria) | Active |
| Compliance | Data Protection Act (Kenya) | Active |

## Results So Far

| Tool | Benchmark Category | Pass Rate | Avg Time to Correct | Regions Covered |
| --- | --- | --- | --- | --- |
| Coming soon | Community data needed | - | - | Add your results! |
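Once community results arrive, the pass-rate column can be computed by a simple aggregation over submitted records. This is a minimal sketch under assumed record fields (`tool`, `category`, `passed`); it is not the toolkit's actual scoring CLI in `tools/`.

```python
# Illustrative pass-rate aggregation over submitted result records.
# Record fields here are assumptions, not the toolkit's real schema.
from collections import defaultdict

def pass_rates(records):
    """Return {(tool, category): fraction of passing results}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for r in records:
        key = (r["tool"], r["category"])
        totals[key] += 1
        if r["passed"]:
            passes[key] += 1
    return {k: passes[k] / totals[k] for k in totals}

sample = [
    {"tool": "assistant-a", "category": "payment-apis", "passed": True},
    {"tool": "assistant-a", "category": "payment-apis", "passed": False},
]
print(pass_rates(sample))  # {('assistant-a', 'payment-apis'): 0.5}
```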

Built in Kampala. Built for the world. The tools should be too.

Moses Wekesa — Founder, Digital Talisman, Kampala, Uganda

License: MIT · PRs Welcome · Made in Uganda

## About

Open-source audit toolkit for Global South developers to benchmark, document, and reduce AI tool bias in their markets.
