SpecAuditor: Generating Audit Specifications for LLM-Driven Bug Detection

SpecAuditor is an end-to-end framework for generating audit specifications to guide LLM-driven bug detection.

The main motivation behind SpecAuditor is that direct LLM auditing is often ineffective and expensive. Without explicit guidance, LLMs tend to focus on familiar surface patterns and struggle to reason about project-specific semantics and rarer bug behaviors. SpecAuditor addresses this by automatically constructing audit specifications that tell the model:

where to audit
how to check
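As a rough illustration of what such a specification might carry, the sketch below pairs a target (where to audit) with a predicate (how to check) and renders both into an auditing prompt. The `AuditSpec` class, its field names, the prompt wording, and the `kmalloc` example are all assumptions made for illustration; they are not SpecAuditor's actual data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditSpec:
    """Hypothetical container for one audit specification."""
    target: str     # where to audit: the entity (e.g. an API function) to look for
    predicate: str  # how to check: the condition code using the target must satisfy

    def to_prompt(self, code: str) -> str:
        """Render the specification and a code snippet into an auditing prompt."""
        return (
            f"Audit every use of `{self.target}` in the code below.\n"
            f"Check: {self.predicate}\n"
            f"--- code ---\n{code}"
        )


# Illustrative values only; not taken from the SpecAuditor dataset.
spec = AuditSpec(
    target="kmalloc",
    predicate="the return value is checked for NULL before it is dereferenced",
)
prompt = spec.to_prompt("p = kmalloc(n, GFP_KERNEL);\np->len = n;")
print(prompt)
```

Keeping the target and the predicate as separate fields means the target alone can drive code search, while the predicate is only needed once a candidate has been found.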

At a high level, SpecAuditor follows a three-stage design:

extract seed specifications from bug patches
generalize the seed specifications, retrieve semantically related entities from documentation, and generate new concrete specifications
use the generated specifications to drive code search and LLM-driven bug auditing
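The first two stages can be sketched on toy data as follows. This is a minimal sketch under stated assumptions: the `Spec` class, the `{target}` placeholder, the function names, and the sample documentation are hypothetical, and plain word overlap stands in for the semantic retrieval the real pipeline performs over documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    target: str     # entity to audit
    predicate: str  # check to apply; may contain the placeholder "{target}"


def generalize(seed: Spec) -> Spec:
    """Replace the concrete target in the predicate with a placeholder."""
    return Spec("{target}", seed.predicate.replace(seed.target, "{target}"))


def related_entities(seed_target: str, docs: dict[str, str], min_overlap: int = 3) -> list[str]:
    """Toy retrieval: rank documented entities by word overlap with the seed's description."""
    seed_words = set(docs[seed_target].lower().split())
    scored = [
        (len(seed_words & set(text.lower().split())), name)
        for name, text in docs.items()
        if name != seed_target
    ]
    return [name for score, name in sorted(scored, reverse=True) if score >= min_overlap]


def instantiate(general: Spec, entities: list[str]) -> list[Spec]:
    """Generate one concrete specification per related entity."""
    return [Spec(e, general.predicate.replace("{target}", e)) for e in entities]


# Hypothetical documentation snippets and seed specification.
docs = {
    "alloc_buf": "allocates a buffer and returns NULL on failure",
    "alloc_page": "allocates a page and returns NULL on failure",
    "log_msg": "writes a message to the log",
}
seed = Spec("alloc_buf", "the result of alloc_buf is checked for NULL before use")

general = generalize(seed)
new_specs = instantiate(general, related_entities(seed.target, docs))
print(new_specs)
```

Here the seed extracted for `alloc_buf` carries over to `alloc_page` because their descriptions are similar, while `log_msg` is filtered out.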

Pipeline

Stage	Goal	Main scripts
Seed extraction	Extract a target (entity) and a predicate from a given bug patch and validate the resulting seed specification	`scripts/spec_extract.py`, `scripts/spec_validator.py`
Generalization and generation	Generalize the seed specification, retrieve related entities, and generate new concrete specifications	`scripts/spec_generalize.py`, `scripts/similar_target_search.py`, `scripts/spec_generation.py`, `scripts/format_spec_generation_results.py`
Guided bug detection	Localize relevant code and audit it with the generated specifications	`scripts/bug_detection_threaded.py`
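The guided bug detection stage can be pictured as a localization step followed by concurrent auditing. The sketch below is an assumption-laden illustration, not the logic of `scripts/bug_detection_threaded.py`: `localize` and `audit` are hypothetical helpers, the function bodies are made up, and `audit` is a trivial stand-in for the LLM call.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def localize(target: str, functions: dict[str, str]) -> list[str]:
    """Return the names of functions whose body calls `target`."""
    call = re.compile(rf"\b{re.escape(target)}\s*\(")
    return [name for name, body in functions.items() if call.search(body)]


def audit(body: str) -> bool:
    """Stand-in for the LLM audit: flag code that never compares against NULL."""
    return "NULL" not in body


# Hypothetical function bodies keyed by function name.
functions = {
    "init_dev": "p = alloc_page(); p->n = 0;",
    "init_safe": "p = alloc_page(); if (p == NULL) return -1; p->n = 0;",
    "log_state": 'log_msg("ok");',
}

candidates = localize("alloc_page", functions)

# Audit candidates concurrently; pool.map preserves the input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    verdicts = list(pool.map(lambda name: audit(functions[name]), candidates))

flagged = [name for name, bad in zip(candidates, verdicts) if bad]
print(flagged)
```

Localizing first keeps the number of audit calls proportional to the code that actually uses the target rather than to the size of the whole code base.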

Repository Map

Path	Purpose
`scripts/`	Main SpecAuditor pipeline
`prompts/`	Prompt templates used by all stages
`get_docs/`	Offline documentation cache and prebuilt Chroma store
`artifact/`	Scripts, datasets, and reference outputs for artifact evaluation

Installation

We provide a Dockerfile to set up the environment. Please refer to INSTALL.md for details.

Testing and Artifact Evaluation

Please refer to AE.md for detailed instructions.

Entry	Purpose
`artifact/functional/run.sh`	End-to-end functional run on one seed patch
`artifact/reproduced_generation/run.sh`	Reproduced extraction/generalization/generation run on a subset
`artifact/reproduced_bug_detection/run.sh`	Reproduced bug-detection benchmark with integrated candidate localization and bug auditing
`artifact/reproduced_bug_detection/run_localization_check.sh`	Localization-only check that tests whether the expected buggy functions can be automatically surfaced before auditing
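One way to summarize a localization-only check is the fraction of expected buggy functions that appear among the surfaced candidates. The helper below is a hypothetical sketch of that metric with made-up function names; it does not describe what `run_localization_check.sh` actually reports.

```python
def localization_recall(expected: set[str], surfaced: set[str]) -> float:
    """Fraction of expected buggy functions that localization surfaced."""
    if not expected:
        return 1.0  # nothing to find, so nothing was missed
    return len(expected & surfaced) / len(expected)


# Hypothetical function names for illustration.
expected = {"init_dev", "open_dev"}
surfaced = {"init_dev", "init_safe"}

recall = localization_recall(expected, surfaced)
missed = sorted(expected - surfaced)
print(recall, missed)
```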

