Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions docs/hirundo.cli_dataset_qa.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
.. meta::
:http-equiv=Content-Security-Policy: default-src 'self', frame-ancestors 'none'

hirundo.cli_dataset_qa module
=============================

.. automodule:: hirundo.cli_dataset_qa
:members:
:undoc-members:
:show-inheritance:
10 changes: 10 additions & 0 deletions docs/hirundo.cli_eval.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
.. meta::
:http-equiv=Content-Security-Policy: default-src 'self', frame-ancestors 'none'

hirundo.cli_eval module
=======================

.. automodule:: hirundo.cli_eval
:members:
:undoc-members:
:show-inheritance:
10 changes: 10 additions & 0 deletions docs/hirundo.cli_unlearning.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
.. meta::
:http-equiv=Content-Security-Policy: default-src 'self', frame-ancestors 'none'

hirundo.cli_unlearning module
=============================

.. automodule:: hirundo.cli_unlearning
:members:
:undoc-members:
:show-inheritance:
73 changes: 73 additions & 0 deletions hirundo/_cli_common.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
import re
import sys
from collections.abc import Callable
from enum import Enum
from typing import Any

import typer
from rich.console import Console
from rich.table import Table

_RUN_ID_RE = re.compile(r"^[a-zA-Z0-9_-]+$")

docs = "sphinx" in sys.modules
hirundo_epilog = (
None
if docs
else "Made with ❤️ by Hirundo. Visit https://www.hirundo.io for more information."
)

console = Console()


def make_app(name: str, help_text: str) -> typer.Typer:
return typer.Typer(
name=name,
no_args_is_help=True,
rich_markup_mode="rich",
epilog=hirundo_epilog,
help=help_text,
)


def validate_run_id(run_id: str) -> str:
if not _RUN_ID_RE.fullmatch(run_id):
console.print(
f"[red]Invalid run ID '{run_id}'. "
"Run IDs may only contain alphanumeric characters, hyphens, and underscores.[/red]"
)
raise typer.Exit(code=1) from None
Comment on lines +33 to +39
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validate_run_id() uses match() which can accept trailing newlines — should we switch to fullmatch(), and add unit tests in tests/test_cli_common.py covering valid/invalid inputs?

Finding types: Breaking Changes Add tests for behavior | Severity: 🟢 Low


Prompt for AI Agents:

In hirundo/_cli_common.py around lines 29-35, in the validate_run_id(run_id: str)
function: 1. Replace the `_RUN_ID_RE.match` call with `fullmatch` so the entire string
must consist of exactly the allowed characters, preventing trailing newlines or other
whitespace from slipping through. Ensure the error path still prints the same red error
message and that `validate_run_id` returns the original `run_id` only when it is
strictly canonical. 2. Add a focused unit test in `tests/test_cli_common.py` that
verifies: a valid run ID (containing only letters/numbers/hyphens/underscores) returns
the input unchanged, and an invalid run ID (e.g., containing a space, newline, or other
disallowed character) raises `typer.Exit` with code 1 and that the console output
contains `Invalid run ID '<value>'` and the "may only contain…" text. Use
whatever test utilities the project already uses to capture Rich/console output so the
assertion is deterministic.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit 67dc58a addressed this comment by switching validate_run_id to rely on _RUN_ID_RE.fullmatch, ensuring trailing newlines no longer slip through; the requested tests for valid/invalid inputs still need to be added.

return run_id


def validate_enum(value: str, enum_cls: type[Enum], label: str) -> Any:
try:
return enum_cls(value.upper())
except ValueError:
valid = ", ".join(member.value for member in enum_cls)
console.print(f"[red]Invalid {label} '{value}'. Valid options: {valid}[/red]")
raise typer.Exit(code=1) from None


def wait_or_notify(
run_id: str, check_fn: Callable[[str], Any], cmd_name: str, wait: bool
) -> Any:
if wait:
return check_fn(run_id)
console.print(
f"Use [bold]hirundo {cmd_name} check[/bold] [italic]<run_id>[/italic] to monitor progress."
)
return None
Comment on lines +52 to +60
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait_or_notify changed visible behavior, but the diff lacks focused tests for the wait=True/wait=False branches (including the emitted prompt), so should we add a fast unit test that covers both cases and asserts the return propagation/message?

Finding type: Add tests for behavior | Severity: 🟢 Low


Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In
hirundo/_cli_common.py around lines 52-60, the `wait_or_notify` helper introduces new
visible behavior for both `wait=True` (returns `check_fn(run_id)`) and `wait=False`
(prints the “Use hirundo … check …” guidance and returns None), but there are no
focused tests covering these branches. Add a fast unit test that calls `wait_or_notify`
directly with a stub `check_fn` and asserts the return value is propagated when
`wait=True`, and when `wait=False` it asserts the exact printed message (or a stable
substring) is emitted and the return is None. Reuse the repo’s existing test/CLI
helper utilities and rich `console` capturing mechanism if available (e.g., a fixture
that captures stdout/console output), and keep the test isolated from network/IO.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit 9a4feff addressed this comment by adding focused tests for wait_or_notify, covering the wait=True return propagation and the wait=False console prompt/hint output.



def print_runs_table(
title: str,
columns: tuple[str, ...],
rows: list[tuple[str | None, ...]],
) -> None:
table = Table(title=title, expand=True)
for col in columns:
table.add_column(col, overflow="fold")
for row in rows:
table.add_row(*row)
console.print(table)
55 changes: 12 additions & 43 deletions hirundo/cli.py
Original file line number Diff line number Diff line change
@@ -1,23 +1,16 @@
import os
import re
import sys
from pathlib import Path
from typing import Annotated
from urllib.parse import urlparse

import typer
from rich.console import Console
from rich.table import Table

from hirundo._cli_common import docs, hirundo_epilog, validate_run_id
from hirundo._env import API_HOST, EnvLocation

docs = "sphinx" in sys.modules
hirundo_epilog = (
None
if docs
else "Made with ❤️ by Hirundo. Visit https://www.hirundo.io for more information."
)

from hirundo.cli_dataset_qa import dataset_qa_app, dataset_qa_list
from hirundo.cli_eval import eval_app
from hirundo.cli_unlearning import unlearning_app

app = typer.Typer(
name="hirundo",
Expand All @@ -26,6 +19,10 @@
epilog=hirundo_epilog,
)

app.add_typer(eval_app, name="eval")
app.add_typer(dataset_qa_app, name="dataset-qa")
app.add_typer(unlearning_app, name="unlearning")


def _upsert_env(dotenv_filepath: str | Path, var_name: str, var_value: str):
"""
Expand Down Expand Up @@ -197,45 +194,17 @@ def check_run(
"""
from hirundo.dataset_qa import QADataset

results = QADataset.check_run_by_id(run_id)
print(f"Run results saved to {results.cached_zip_path}")
results = QADataset.check_run_by_id(validate_run_id(run_id))
if results is not None:
print(f"Run results saved to {results.cached_zip_path}")


@app.command("list-runs", epilog=hirundo_epilog)
def list_runs():
"""
List all runs available.
"""
from hirundo.dataset_qa import QADataset

runs = QADataset.list_runs()

console = Console()
table = Table(
title="Runs:",
expand=True,
)
cols = (
"Dataset name",
"Run ID",
"Status",
"Created At",
"Run Args",
)
for col in cols:
table.add_column(
col,
overflow="fold",
)
for run in runs:
table.add_row(
str(run.name),
str(run.id),
str(run.status),
run.created_at.isoformat(),
run.run_args.model_dump_json() if run.run_args else None,
)
console.print(table)
dataset_qa_list(archived=False)


typer_click_object = typer.main.get_command(app)
Expand Down
80 changes: 80 additions & 0 deletions hirundo/cli_dataset_qa.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
from typing import Annotated

import typer

from hirundo._cli_common import (
console,
hirundo_epilog,
make_app,
print_runs_table,
validate_run_id,
wait_or_notify,
)

dataset_qa_app = make_app("dataset-qa", "Launch and monitor Dataset QA runs.")


@dataset_qa_app.command("run", epilog=hirundo_epilog)
def dataset_qa_run(
dataset_id: Annotated[int, typer.Argument(help="ID of the dataset to run QA on.")],
wait: Annotated[
bool,
typer.Option(
"--wait/--no-wait", help="Wait for the run to complete and stream progress."
),
] = True,
):
"""
Launch a Dataset QA run on the dataset with the given ID.
"""
from hirundo.dataset_qa import QADataset

run_id = QADataset.launch_qa_run(dataset_id)
console.print(f"Dataset QA run started. Run ID: [bold]{run_id}[/bold]")

results = wait_or_notify(run_id, QADataset.check_run_by_id, "dataset-qa", wait)
if results is not None:
console.print(f"Run results saved to {results.cached_zip_path}")


@dataset_qa_app.command("list", epilog=hirundo_epilog)
def dataset_qa_list(
archived: Annotated[
bool,
typer.Option("--archived/--no-archived", help="Include archived runs."),
] = False,
):
"""
List Dataset QA runs.
"""
from hirundo.dataset_qa import QADataset

runs = QADataset.list_runs(archived=archived)
print_runs_table(
"Dataset QA Runs:",
("Dataset Name", "Run ID", "Status", "Created At", "Run Args"),
[
Comment on lines +52 to +56
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataset_qa_list duplicates QADataset.list_runs + table formatting from hirundo/cli.py:list_runs, should we reuse the existing table helper or delegate to list_runs for a single rendering source of truth?

Finding type: Code Dedup and Conventions | Severity: 🟢 Low

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit 9a4feff addressed this comment by importing dataset_qa_list into hirundo/cli and delegating the top-level list command to dataset_qa_list(archived=False), ensuring the Dataset QA table rendering now comes from a single shared helper.

(
str(run.name),
str(run.run_id),
str(run.status),
run.created_at.isoformat(),
run.run_args.model_dump_json() if run.run_args else None,
)
for run in runs
],
)


@dataset_qa_app.command("check", epilog=hirundo_epilog)
def dataset_qa_check(
run_id: Annotated[str, typer.Argument(help="The run ID to check.")],
):
"""
Check the status of a Dataset QA run and stream progress.
"""
from hirundo.dataset_qa import QADataset

results = QADataset.check_run_by_id(validate_run_id(run_id))
if results is not None:
console.print(f"Run results saved to {results.cached_zip_path}")
Loading
Loading