Exa Search Benchmarks

Open benchmarks for evaluating search APIs.

Benchmarks

Benchmark	Queries	Tracks	Description
WebCode	~840	Contents, Highlights, RAG, E2E	Code docs extraction, query-aware highlights, long-context QA
People Search	1,400	Retrieval	Find profiles of people by role, location, and seniority
Company Search	~800	Retrieval + RAG	Find companies by name, industry, geography, funding

WebCode Results

Contents — extraction fidelity against golden markdown (250 URLs)

Searcher	Completeness	Accuracy	Structure	Signal	Code Recall	Table Recall	ROUGE-L
Exa	82.8	89.3	81.8	94.5	96.7	91.9	83.2
Parallel	74.2	89.2	80.8	77.6	94.1	92.2	73.7
Claude	59.8	81.1	75.1	55.1	82.4	82.0	66.8
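For readers unfamiliar with the ROUGE-L column, here is a minimal sketch of one common formulation: an F1 score over the longest common subsequence (LCS) of tokens shared by the golden markdown and the extracted text. The function names, the lowercase whitespace tokenization, and the choice of F1 are assumptions for illustration. The benchmark's own scorer may tokenize or aggregate differently, and the table appears to report scores on a 0-100 scale, whereas this sketch returns a value between 0 and 1.

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                # Extend the best subsequence that ended before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbours.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L as an F1 score over whitespace tokens (assumed variant)."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)  # share of the candidate that matches
    recall = lcs / len(ref_tokens)      # share of the reference that is covered
    return 2 * precision * recall / (precision + recall)
```

An extraction that drops half of the reference tokens but adds nothing spurious scores a precision of 1.0 and a recall of 0.5, so it is penalised for missing content even though everything it returned is correct.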

Highlights — in-document retrieval given a URL + query (250 queries)

Searcher	Groundedness	Correctness	Avg Tokens
Exa	94.8	93.2	696
Parallel	85.6	86.4	858
Claude	81.5	85.9	319

RAG — full-web retrieval + synthesis (307 queries)

Searcher	Groundedness	Avg Tokens	Citation Prec.
Exa	79.4	688	0.259
Brave	76.3	1229	0.328
Parallel	75.3	622	0.168
Perplexity	64.6	754	0.220
Tavily	61.1	464	0.159

See webcode-benchmark/ for details and the blog post.

People Search Results

Searcher	R@1	R@10	Precision	Queries
exa	72.0%	94.5%	63.3%	1399
brave	44.4%	77.9%	30.2%	1373
parallel	20.8%	74.7%	26.9%	1387
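The R@1, R@10, and Precision columns can be read with the usual retrieval definitions in mind. The sketch below assumes R@k is the fraction of queries with at least one relevant result in the top k, and Precision is the fraction of all returned results judged relevant. These definitions, the function names, and the boolean judgment lists are assumptions for illustration; the benchmark relies on LLM grading and may define or aggregate the metrics differently.

```python
def recall_at_k(judgments: list[list[bool]], k: int) -> float:
    """Fraction of queries with at least one relevant result in the top k.

    judgments[i][j] is True when the j-th ranked result for query i
    was judged relevant.
    """
    if not judgments:
        return 0.0
    hits = sum(any(flags[:k]) for flags in judgments)
    return hits / len(judgments)


def precision(judgments: list[list[bool]]) -> float:
    """Fraction of all returned results that were judged relevant."""
    total = sum(len(flags) for flags in judgments)
    if total == 0:
        return 0.0
    relevant = sum(sum(flags) for flags in judgments)
    return relevant / total


# Hypothetical judgments for three queries, three ranked results each.
example = [
    [True, False, False],   # relevant result at rank 1
    [False, True, False],   # relevant result at rank 2
    [False, False, False],  # nothing relevant returned
]
```

With these example judgments, one query in three has a hit at rank 1 and two in three have a hit within the top 3, while two of the nine returned results are relevant.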

Company Search Results

Two tracks designed to separate retrieval from fact extraction.

Retrieval Track — Ranked lists of companies matching criteria (named lookup, attribute filtering, funding queries, composite constraints, semantic descriptions).

Searcher	R@1	R@5	R@10	Precision
exa	61.8%	90.6%	94.2%	65.9%
brave	35.9%	61.8%	72.9%	39.2%
parallel	36.6%	66.3%	78.6%	40.4%

RAG Track — Extract specific facts (founding year, employee count, funding rounds, founders). Static facts use exact-match; dynamic facts get ±20% tolerance.

Searcher	Accuracy
exa	79%
brave	65%
parallel	66%
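The RAG-track grading rule described above (exact match for static facts, ±20% tolerance for dynamic facts) can be sketched as follows. The function name, the case-insensitive string comparison, and measuring the tolerance relative to the expected value are assumptions; the benchmark's own grader may normalize answers differently. Treating founding year as static and employee count as dynamic is likewise an illustrative guess, since the text does not say which facts fall in which group.

```python
def grade_fact(expected, predicted, dynamic: bool, tolerance: float = 0.20) -> bool:
    """Return True when a predicted fact is accepted.

    Static facts must match exactly (after trimming and case folding).
    Dynamic facts are numeric and may deviate by up to `tolerance`
    relative to the expected value.
    """
    if dynamic:
        try:
            exp, pred = float(expected), float(predicted)
        except (TypeError, ValueError):
            return False  # non-numeric answer for a numeric fact
        if exp == 0:
            return pred == 0
        return abs(pred - exp) <= tolerance * abs(exp)
    return str(expected).strip().casefold() == str(predicted).strip().casefold()


# Illustrative values only, not taken from the benchmark data.
founding_year_ok = grade_fact("2016", " 2016 ", dynamic=False)
employee_count_ok = grade_fact(1200, 1000, dynamic=True)   # off by about 17%
employee_count_bad = grade_fact(1200, 900, dynamic=True)   # off by 25%
```

The tolerance exists because values such as headcount drift over time, so a slightly stale but plausible answer is still counted as correct.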

Quick Start

git clone https://github.com/exa-labs/benchmarks.git
cd benchmarks

WebCode Benchmark

cd webcode-benchmark
uv sync

export EXA_API_KEY="your-key"
export OPENAI_API_KEY="your-key"

python -m evals.contents --searchers exa tavily parallel --limit 20
python -m evals.highlights --searchers exa tavily parallel --limit 20
python -m evals.rag --searchers exa brave perplexity --limit 20
python -m evals.e2e --info

People Benchmark

cd simple-people-benchmark
uv sync

export EXA_API_KEY="your-key"
export OPENAI_API_KEY="your-key"

pbench --limit 50

Company Benchmark

cd simple-company-benchmark
uv sync

export EXA_API_KEY="your-key"
export OPENAI_API_KEY="your-key"

cbench --limit 50
cbench --track retrieval
cbench --track rag

Implementing Your Own Searcher

All benchmarks use the same Searcher interface:

from shared.searchers import Searcher, SearchResult

# my_api below is a placeholder for your own async search client.

class MySearcher(Searcher):
    name = "my-search"

    async def search(self, query: str, num_results: int = 10) -> list[SearchResult]:
        # Query the backend and map each hit onto a SearchResult.
        response = await my_api.search(query, limit=num_results)
        return [
            SearchResult(url=r.url, title=r.title, text=r.snippet)
            for r in response.results
        ]

    async def extract(self, url: str, query: str | None = None) -> list[SearchResult]:
        # Fetch the content of a single URL; this example does not use query.
        content = await my_api.extract(url)
        return [SearchResult(url=url, text=content)]

The retrieval and RAG evals call the search method. The contents and highlights evals call the extract method for URL-based extraction.

Requirements

Python 3.11+
OpenAI API key (for LLM grading)
Search API credentials

License

MIT
