AAAIM — Auto-Annotator via AI for Modeling

AAAIM is an LLM-powered tool that annotates biosimulation models (SBML) with standardized ontology terms from ChEBI, NCBI Gene, UniProt, and KEGG.

Installation

Requirements: Python 3.12

pip install -r requirements.txt

Set at least one LLM API key (in your shell or a .env file):

OPENAI_API_KEY=<your-openai-key>          # gpt-4o-mini, gpt-4.1-nano
OPENROUTER_API_KEY=<your-openrouter-key>  # llama-3.3-70b (free tier available)
LLAMA_API_KEY=<your-llama-key>            # Llama-3.3-70B-Instruct
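
A quick way to confirm that at least one key is visible to Python before running AAAIM is to inspect the environment. The sketch below is not part of AAAIM: the helper name `configured_keys` is hypothetical, and it only sees variables already exported in the shell (it does not read a .env file).

```python
import os

# Key names taken from the list above.
KEY_NAMES = ("OPENAI_API_KEY", "OPENROUTER_API_KEY", "LLAMA_API_KEY")


def configured_keys(env=None):
    """Return the names of the API keys that are set to a non-empty value.

    Hypothetical helper, not part of AAAIM.
    """
    env = os.environ if env is None else env
    return [name for name in KEY_NAMES if env.get(name, "").strip()]


if __name__ == "__main__":
    found = configured_keys()
    if found:
        print("Configured keys:", ", ".join(found))
    else:
        print("No LLM API key found; set at least one of:", ", ".join(KEY_NAMES))
```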

Quick Start

from core import annotate_model

# Annotate all species — entity types are detected automatically
recommendations_df, metrics = annotate_model(
    model_file="path/to/model.xml",
    entity_type="auto",               # detects chemical / gene / protein / complex
    database=["chebi", "uniprot"]     # databases to search
)

recommendations_df.to_csv("recommendations.csv", index=False)

Run the bundled example (uses a test SBML model):

python examples/simple_example.py

For models with existing annotations (curation/validation workflow):

from core import curate_model

curations_df, metrics = curate_model(
    model_file="path/to/model.xml",
    entity_type="chemical",
    database="chebi"
)
print(f"Accuracy: {metrics['accuracy']:.1%}")

Applying Annotation Recommendations

After reviewing the output CSV, edit the update_annotation column for each row:

Value	Effect
`add`	Add the recommended annotation
`delete`	Remove the existing annotation
`ignore` / `keep`	Leave unchanged

Then write the updated model:

from core.update_model import update_annotation

update_annotation(
    original_model_path="model.xml",
    recommendation_table="recommendations.csv",
    new_model_path="model_updated.xml"
)
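
Before calling update_annotation, it may help to check that every edited row uses one of the values from the table above. The following is a minimal sketch using only the standard library. The helper `find_invalid_actions` is hypothetical and not part of AAAIM. It assumes values must match the table exactly (lowercase) and flags empty cells, which may be stricter than AAAIM itself.

```python
import csv

# Values documented in the table above; anything else is flagged.
ALLOWED_ACTIONS = {"add", "delete", "ignore", "keep"}


def find_invalid_actions(csv_path, column="update_annotation"):
    """Return (row_position, value) pairs for rows with an unrecognized action.

    Hypothetical helper, not part of AAAIM. Row positions count the header
    as 1, so the first data row is 2.
    """
    invalid = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if column not in (reader.fieldnames or []):
            raise KeyError(f"column {column!r} not found in {csv_path}")
        for position, row in enumerate(reader, start=2):
            value = (row[column] or "").strip()
            if value not in ALLOWED_ACTIONS:
                invalid.append((position, value))
    return invalid
```

An empty result means every row carries a recognized value; otherwise the returned positions point at the rows to fix in the CSV.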

Optional: RAG-based Search

By default, AAAIM uses direct dictionary matching (method="direct"). For semantic (embedding-based) search, use method="rag"; the vector index must be built first.

One-time setup (builds embeddings for all databases; this may take a while, depending on database size):

python setup_rag.py                        # all databases, human (tax_id=9606)
python setup_rag.py --databases chebi      # ChEBI only
python setup_rag.py --tax_id 10090         # mouse

Then pass method="rag" to annotate_model() or curate_model().
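
One way to keep the choice of search mode in a single place is a thin wrapper around the call. In the sketch below, only the method values "direct" and "rag" and the annotate_model parameters shown in Quick Start come from this README. The helpers `annotate_kwargs` and `annotate_with_rag` are hypothetical, and combining entity_type="auto" with method="rag" is an assumption.

```python
VALID_METHODS = ("direct", "rag")  # the two modes named in this README


def annotate_kwargs(model_file, database, method="direct", entity_type="auto"):
    """Build keyword arguments for annotate_model(), rejecting unknown methods.

    Hypothetical helper, not part of AAAIM.
    """
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {VALID_METHODS}, got {method!r}")
    return {
        "model_file": model_file,
        "entity_type": entity_type,
        "database": database,
        "method": method,
    }


def annotate_with_rag(model_file, database):
    """Call annotate_model() with method="rag".

    Requires the vector index built by setup_rag.py.
    """
    # Imported lazily so annotate_kwargs() can be used without AAAIM installed.
    from core import annotate_model

    return annotate_model(**annotate_kwargs(model_file, database, method="rag"))
```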

Full Documentation

See docs/README.md for:

All parameters for annotate_model / curate_model
Per-database annotation examples
Evaluation utilities
Data file descriptions
Supported embedding models for RAG
