SDK-89: Validate test LLM and judge LLM can be accessed from HuggingFace by benglewis · Pull Request #214 · Hirundo-io/hirundo-python-sdk

benglewis · 2026-02-11T21:24:33Z

User description

Codex generated this pull request, but encountered an unexpected error after generation. This is a placeholder PR message.

Codex Task

Note

Medium Risk
Adds new outbound HuggingFace API calls on critical workflows (LlmModel.create, eval run launch), which can introduce latency or new failure modes if HF is unavailable or tokens are misconfigured.

Overview
Adds pre-flight HuggingFace access validation for both LLM creation and LLM behavior eval runs, surfacing clearer HirundoError messages for gated/private/missing/unauthorized models and skipping validation when the judge model is a local path.

Extends model source outputs to carry an optional HuggingFace token, introduces a new _model_access.py helper built on huggingface_hub, updates several Pydantic models’ model_config to protect model_validate/model_dump, and adds unit tests plus the new huggingface-hub dependency.
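The overview mentions that validation is skipped when the judge model is a local path. A minimal sketch of how such a check might look follows; the function names and the path heuristic are assumptions for illustration and are not taken from `_model_access.py`.

```python
from pathlib import Path


def looks_like_local_path(model_ref: str) -> bool:
    """Heuristic (an assumption, not the SDK's rule): treat the reference as
    local if it is written like a filesystem path or already exists on disk."""
    if model_ref.startswith(("/", "./", "../", "~")):
        return True
    return Path(model_ref).expanduser().exists()


def needs_hub_validation(model_ref: str) -> bool:
    """Only HuggingFace-style IDs such as 'org/name' need a pre-flight check."""
    return not looks_like_local_path(model_ref)
```

With this shape, a caller would run the HuggingFace access check only when `needs_hub_validation(judge_model)` is true, so local checkpoints never trigger an outbound request.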

Written by Cursor Bugbot for commit d236474. This will update automatically on new commits.

Generated description

Below is a concise technical summary of the changes proposed in this PR:
Validate HuggingFace-hosted LLMs and judge models before use by reusing the new _model_access helper during LlmModel.create and LlmBehaviorEval.launch_eval_run, surfacing clearer HirundoError messages when gated, private, or unauthorized models are encountered. Update environment helpers so feature-gated tests rely on get_env_bool and centralize boolean flags while adding the huggingface-hub dependency for the new API calls.

Topic Details

Env Flags & Tests

Leverage get_env_bool for the shared QA/eval tests and document pytest-only guidance so long-running flows gate on consistent boolean flags instead of raw os.getenv calls.
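For readers unfamiliar with the pattern, a boolean environment helper could look roughly like the sketch below. Only the name `get_env_bool` comes from this PR; the signature, the accepted spellings, and the error on unknown values are assumptions, and the real implementation in `hirundo/_env.py` may differ.

```python
import os

# Accepted spellings are an assumption for this sketch.
_TRUTHY = {"1", "true", "yes", "on"}
_FALSY = {"0", "false", "no", "off", ""}


def get_env_bool(name: str, default: bool = False) -> bool:
    """Read an environment variable as a boolean flag."""
    raw = os.getenv(name)
    if raw is None:
        # Unset variables fall back to the caller's default.
        return default
    value = raw.strip().lower()
    if value in _TRUTHY:
        return True
    if value in _FALSY:
        return False
    raise ValueError(f"Environment variable {name}={raw!r} is not a recognised boolean")
```

A test module could then gate a long-running flow with something like `get_env_bool("RUN_LONG_TESTS")` (a hypothetical variable name) instead of comparing raw `os.getenv` strings in each file.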

Modified files (5)

AGENTS.md
hirundo/_env.py
tests/dataset_qa_shared.py
tests/llm-behavior-eval/llm_behavior_eval_test.py
tests/unlearning-llm/unlearn_llm_behavior_test.py

Latest Contributors(1)

User	Commit	Date
blewis@hirundo.io	SDK-87: Migrate to `uv...	February 11, 2026

HF Access Validation

Validate HuggingFace model access for LLM creation and behavior eval workflows by wiring validate_huggingface_model_access/validate_judge_model_access into LlmModel, LlmBehaviorEval, and their supporting Pydantic configs, plus covering the new logic with targeted unit tests and the huggingface-hub dependency.

Modified files (10)

hirundo/_llm_sources.py
hirundo/_model_access.py
hirundo/llm_behavior_eval.py
hirundo/llm_behavior_eval_results.py
hirundo/unlearning_llm.py
pyproject.toml
tests/test_llm_behavior_eval_model_access.py
tests/test_model_access.py
tests/test_unlearning_llm_model_create.py
uv.lock

Latest Contributors(1)

User	Commit	Date
blewis@hirundo.io	SDK-79: Add LLM behavi...	February 04, 2026

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

baz-reviewer · 2026-03-12T09:40:23Z

Spec Reviewer Report 📪 ✅

All 2 Identified Requirements Met for Ticket:

Validate LLM and judge model can be accessed publicly / using token provided

2 met requirements

#	Requirement	Explanation	Evidence
1	Validate HF access before LLM creation/eval	LLM creation and behavior eval launches now invoke helpers that call HuggingFace model_info and raise HirundoError when access cannot be confirmed, preventing the operation.	hirundo/unlearning_llm.py:53-63 (checks HuggingFace access before create); hirundo/llm_behavior_eval.py:157-171 (validates judge/LLM HF models before run); hirundo/_model_access.py:64-124 (model_info call raises HirundoError when inaccessible)
2	Explain why HuggingFace token is required in access errors	The new HuggingFace access validator raises HirundoError messages that use gated/not-found/unauthorized hints and fall back to a generic guidance string, ensuring users understand when a token or different ID is needed.	hirundo/_model_access.py:15-55 (gated/private/unauthorized hint builder); hirundo/_model_access.py:64-113 (validator maps HF errors to those hints with token awareness); hirundo/llm_behavior_eval.py:157-249 (run launch validates judge and LLM access before API call); hirundo/unlearning_llm.py:53-75 (LLM creation now checks HuggingFace access before posting); tests/test_model_access.py:32-73 (unit tests assert gated/private/unauthorized messaging)
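To illustrate the "hint builder with token awareness" described in the report, here is a minimal sketch. The function names, hint strings, and message wording are hypothetical, and `HirundoError` is redefined as a stand-in so the example is self-contained. In the PR itself the hint is reportedly derived from errors raised by the huggingface_hub `model_info` call. The sketch has no separate "private" hint, in line with the commit note on this PR that the `hint="private"` branch was unreachable.

```python
from typing import Optional


class HirundoError(Exception):
    """Stand-in for the SDK's HirundoError so the sketch is self-contained."""


def build_access_message(model_id: str, hint: Optional[str], has_token: bool) -> str:
    """Turn a coarse failure hint into guidance the user can act on."""
    base = f"Cannot access HuggingFace model '{model_id}'."
    if hint == "gated":
        if has_token:
            return f"{base} The model is gated and the provided token has not been granted access."
        return f"{base} The model is gated; provide a HuggingFace token that has been granted access."
    if hint == "not_found":
        if has_token:
            return f"{base} No such model is visible to the provided token; check the model ID."
        return f"{base} It may not exist or may be private; check the model ID or provide a token."
    if hint == "unauthorized":
        return f"{base} The HuggingFace token was rejected; check that it is valid."
    # Generic fallback when the failure cannot be classified.
    return f"{base} Check the model ID and, for gated or private models, provide a token."


def raise_access_error(model_id: str, hint: Optional[str], has_token: bool) -> None:
    """Raise the SDK-style error carrying the actionable message."""
    raise HirundoError(build_access_message(model_id, hint, has_token))
```

Keeping the message construction separate from the network call makes the wording easy to unit test without contacting HuggingFace.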

Note: Some optional integrations are missing, so it might not be possible to check some of the requirements.
For best results, make sure the following are integrated: Figma

Used resources:
Hash: dc4df91

To rerun the Spec Reviewer, comment "baz rerun spec review".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce40b0a1ab

orr-hirundo

LGTM

SDK-89: Use huggingface_hub for model access validation

356754b

benglewis added the codex label Feb 11, 2026 — with ChatGPT Codex Connector

benglewis changed the title ~~Codex-generated pull request~~ SDK-89: Validate test LLM and judge LLM can be accessed from HuggingFace Feb 11, 2026

cursor Bot reviewed Feb 11, 2026

Comment thread hirundo/llm_behavior_eval.py

benglewis self-assigned this Feb 12, 2026

Merge branch 'main' into codex/2026-02-11/linear-mention-sdk-89-validate-llm-and-judge-model-can-be

b1033e3

Conflicts: pyproject.toml, uv.lock

cursor Bot reviewed Feb 12, 2026

Comment thread hirundo/_model_access.py Outdated

benglewis added 3 commits February 12, 2026 17:15

Bump huggingface-hub version to minimum 1.0.0 (for compatibility issues)

4d84127

Fix agent PR comments

5bb0211

Hardcoded token=None ignores model's stored token & Unreachable hint="private" message branch is dead code

Fix Pydantic model configuration

d236474

cursor Bot reviewed Feb 15, 2026

Comment thread hirundo/_model_access.py

Update authlib to 1.6.7 to fix vulnerability

dc4df91

Thank you Dependabot

baz-reviewer Bot reviewed Mar 12, 2026

Comment thread tests/test_model_access.py Outdated

Comment thread tests/test_model_access.py Outdated

benglewis added 2 commits March 12, 2026 12:43

Add docstring to new validation functions

1e5c8ca

Change test_llm_behavior_eval_model_access.py and `test_model_access.py` to use `pytest` instead of `unittest`

9c6bc59

baz-reviewer Bot reviewed Mar 12, 2026

Comment thread tests/test_model_access.py Outdated

Fix Ruff error

ce40b0a

benglewis marked this pull request as ready for review March 13, 2026 16:33

benglewis requested review from a team as code owners March 13, 2026 16:33

chatgpt-codex-connector Bot reviewed Mar 13, 2026

Comment thread hirundo/_model_access.py Outdated

Fix Codex's PR comment about model error

b09582d

baz-reviewer Bot reviewed Mar 13, 2026

Comment thread hirundo/unlearning_llm.py Outdated

Comment thread hirundo/llm_behavior_eval.py

benglewis mentioned this pull request Mar 13, 2026

Bump authlib from 1.6.6 to 1.6.7 in the uv group across 1 directory #215

Closed

orr-hirundo previously approved these changes Mar 17, 2026

Fix Baz comment about validation when setting model to run LLM behavior eval with a model ID

8287c9e

benglewis dismissed orr-hirundo’s stale review via 8287c9e March 29, 2026 22:20

Refactor to replace boolean environment variable checks with a function & allow skipping HuggingFace validation

8616eab

baz-reviewer Bot reviewed Mar 30, 2026

Comment thread hirundo/_env.py

baz-reviewer Bot added the baz approved label Apr 7, 2026

baz-reviewer Bot approved these changes Apr 7, 2026

benglewis wants to merge 12 commits into main from codex/2026-02-11/linear-mention-sdk-89-validate-llm-and-judge-model-can-be

2 participants
