Merged
4 changes: 4 additions & 0 deletions .github/workflows/lint.yaml
@@ -10,6 +10,8 @@ concurrency:
 jobs:
   ruff:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
     strategy:
       matrix:
         python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
@@ -31,6 +33,8 @@ jobs:

   pyright:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
     strategy:
       matrix:
         python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
20 changes: 11 additions & 9 deletions .github/workflows/pytest-full.yaml
@@ -8,22 +8,24 @@ on:
 jobs:
   pytest_full:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
     strategy:
       max-parallel: 2
       matrix:
-        optimization-test:
+        data-qa-test:
           - test: classification/classification_aws_test.py
-            env: RUN_CLASSIFICATION_AWS_OPTIMIZATION
+            env: RUN_CLASSIFICATION_AWS_DATA_QA
           - test: classification/classification_gcp_test.py
-            env: RUN_CLASSIFICATION_GCP_OPTIMIZATION
+            env: RUN_CLASSIFICATION_GCP_DATA_QA
           - test: classification/sanity_gcp_test.py
-            env: RUN_CLASSIFICATION_GCP_SANITY_OPTIMIZATION
+            env: RUN_CLASSIFICATION_GCP_SANITY_DATA_QA
           - test: object-detection/od_aws_test.py
-            env: RUN_AWS_OD_OPTIMIZATION
+            env: RUN_AWS_OD_DATA_QA
           - test: object-detection/od_git_test.py
-            env: RUN_OD_GIT_OPTIMIZATION
+            env: RUN_OD_GIT_DATA_QA
           - test: tests/object-detection/sama_coco_test.py
-            env: RUN_COCO_OD_GCP_SANITY_OPTIMIZATION
+            env: RUN_COCO_OD_GCP_SANITY_DATA_QA
     steps:
       - uses: actions/checkout@v4
       - name: Set up Python
@@ -38,7 +40,7 @@ jobs:
           source .venv/bin/activate
           pip install -r requirements/dev.txt -r requirements/polars.txt
       - name: Run PyTest
-        run: .venv/bin/pytest tests/${{ matrix.optimization-test['test'] }}
+        run: .venv/bin/pytest tests/${{ matrix.data-qa-test['test'] }}
         env:
           API_HOST: ${{ secrets.API_HOST }}
           API_KEY: ${{ secrets.API_KEY }}
@@ -47,4 +49,4 @@ jobs:
           AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
           HUGGINGFACE_ACCESS_TOKEN: ${{ secrets.HUGGINGFACE_ACCESS_TOKEN }}
           UNIQUE_ID: ${{ github.ref }}-${{ github.run_number }}
-          ${{ matrix.optimization-test['env'] }}: true
+          ${{ matrix.data-qa-test['env'] }}: true
2 changes: 2 additions & 0 deletions .github/workflows/pytest-sanity.yaml
@@ -14,6 +14,8 @@ concurrency:
 jobs:
   pytest_sanity:
     runs-on: ${{ matrix.os }}
+    permissions:
+      contents: read
     strategy:
       max-parallel: 4
       matrix:
2 changes: 2 additions & 0 deletions .github/workflows/safety-scan.yml
@@ -15,6 +15,8 @@ concurrency:
 jobs:
   safety-scan:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
     steps:
       - name: Checkout hirundo-client
         uses: actions/checkout@v4
16 changes: 8 additions & 8 deletions README.md
@@ -1,8 +1,8 @@
 # Hirundo

-This package exposes access to Hirundo APIs for dataset optimization for Machine Learning.
+This package exposes access to Hirundo APIs for dataset QA for Machine Learning.

-Dataset optimization is currently available for datasets labelled for classification and object detection.
+Dataset QA is currently available for datasets labelled for classification and object detection.

 Support dataset storage configs include:

@@ -73,7 +73,7 @@ Classification example:
 from hirundo import (
     HirundoCSV,
     LabelingType,
-    OptimizationDataset,
+    QADataset,
     StorageGCP,
     StorageConfig,
     StorageTypes,
@@ -84,7 +84,7 @@ gcp_bucket = StorageGCP(
     project="Hirundo-global",
     credentials_json=json.loads(os.environ["GCP_CREDENTIALS"]),
 )
-test_dataset = OptimizationDataset(
+test_dataset = QADataset(
     name="TEST-GCP cifar 100 classification dataset",
     labeling_type=LabelingType.SINGLE_LABEL_CLASSIFICATION,
     storage_config=StorageConfig(
@@ -99,7 +99,7 @@ test_dataset = OptimizationDataset(
     classes=cifar100_classes,
 )

-test_dataset.run_optimization()
+test_dataset.run_qa()
 results = test_dataset.check_run()
 print(results)
 ```
@@ -111,7 +111,7 @@ from hirundo import (
     GitRepo,
     HirundoCSV,
     LabelingType,
-    OptimizationDataset,
+    QADataset,
     StorageGit,
     StorageConfig,
     StorageTypes,
@@ -124,7 +124,7 @@ git_storage = StorageGit(
     ),
     branch="main",
 )
-test_dataset = OptimizationDataset(
+test_dataset = QADataset(
     name="TEST-HuggingFace-BDD-100k-validation-OD-validation-dataset",
     labeling_type=LabelingType.OBJECT_DETECTION,
     storage_config=StorageConfig(
@@ -140,7 +140,7 @@ test_dataset = OptimizationDataset(
     ),
 )

-test_dataset.run_optimization()
+test_dataset.run_qa()
 results = test_dataset.check_run()
 print(results)
 ```
4 changes: 2 additions & 2 deletions docs/hirundo.dataset_optimization.rst
@@ -1,10 +1,10 @@
 .. meta::
    :http-equiv=Content-Security-Policy: default-src 'self', frame-ancestors 'none'

-hirundo.dataset\_optimization module
+hirundo.dataset\_qa module
 ====================================

-.. automodule:: hirundo.dataset_optimization
+.. automodule:: hirundo.dataset_qa
    :members:
    :undoc-members:
    :show-inheritance:
2 changes: 1 addition & 1 deletion docs/hirundo.rst
@@ -11,7 +11,7 @@ Submodules
    :maxdepth: 4

    hirundo.cli
-   hirundo.dataset_optimization
+   hirundo.dataset_qa
    hirundo.enum
    hirundo.git
    hirundo.storage
18 changes: 11 additions & 7 deletions hirundo/__init__.py
@@ -3,13 +3,15 @@
     LabelingType,
     StorageTypes,
 )
-from .dataset_optimization import (
+from .dataset_qa import (
+    ClassificationRunArgs,
+    Domain,
     HirundoError,
-    OptimizationDataset,
+    ObjectDetectionRunArgs,
+    QADataset,
     RunArgs,
-    VisionRunArgs,
 )
-from .dataset_optimization_results import DatasetOptimizationResults
+from .dataset_qa_results import DatasetQAResults
 from .git import GitPlainAuth, GitRepo, GitSSHAuth
 from .labeling import (
     COCO,
@@ -40,9 +42,11 @@
     "KeylabsObjDetVideo",
     "KeylabsObjSegImages",
     "KeylabsObjSegVideo",
-    "OptimizationDataset",
+    "QADataset",
+    "Domain",
     "RunArgs",
-    "VisionRunArgs",
+    "ClassificationRunArgs",
+    "ObjectDetectionRunArgs",
     "DatasetMetadataType",
     "LabelingType",
     "GitPlainAuth",
@@ -54,7 +58,7 @@
     # "StorageAzure", TODO: Azure storage is coming soon
     "StorageGit",
     "StorageConfig",
-    "DatasetOptimizationResults",
+    "DatasetQAResults",
     "load_df",
     "load_from_zip",
 ]
2 changes: 1 addition & 1 deletion hirundo/_constraints.py
@@ -11,7 +11,7 @@

 if TYPE_CHECKING:
     from hirundo._urls import HirundoUrl
-    from hirundo.dataset_optimization import LabelingInfo
+    from hirundo.dataset_qa import LabelingInfo
     from hirundo.storage import (
         ResponseStorageConfig,
         StorageConfig,
8 changes: 4 additions & 4 deletions hirundo/cli.py
@@ -198,9 +198,9 @@ def check_run(
     """
     Check the status of a run.
     """
-    from hirundo.dataset_optimization import OptimizationDataset
+    from hirundo.dataset_qa import QADataset

-    results = OptimizationDataset.check_run_by_id(run_id)
+    results = QADataset.check_run_by_id(run_id)
     print(f"Run results saved to {results.cached_zip_path}")


@@ -209,9 +209,9 @@ def list_runs():
     """
     List all runs available.
     """
-    from hirundo.dataset_optimization import OptimizationDataset
+    from hirundo.dataset_qa import QADataset

-    runs = OptimizationDataset.list_runs()
+    runs = QADataset.list_runs()

     console = Console()
     table = Table(