
feat: Add InputPath / OutputPath types to tesseract_core.runtime.experimental, deprecate InputFileReference + OutputFileReference#555

Merged
nmheim merged 45 commits into main from nh/path-references on Apr 21, 2026

@nmheim
Contributor

@nmheim nmheim commented Apr 6, 2026

Description of changes

  • Adds InputPath / OutputPath to support both files and directories (previously file-only). The old *FileReference names still work but emit a DeprecationWarning.
  • examples/file_io/ — replaces the old filereference example
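One way a renamed type can keep its old name alive while emitting a DeprecationWarning is sketched below. This is only an illustration under stated assumptions, not the code from this PR: the `InputPath` class here is a bare stand-in, `deprecated_alias` is a hypothetical helper, and warning at instantiation time (rather than at import or schema definition) is a guess.

```python
import warnings
from pathlib import Path


class InputPath:
    """Hypothetical stand-in for the new type; wraps a file or directory path."""

    def __init__(self, path):
        self.path = Path(path)


def deprecated_alias(new_cls, old_name):
    """Build a subclass of new_cls that warns whenever it is instantiated."""

    class Alias(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                f"{old_name} is deprecated; use {new_cls.__name__} instead.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not at this helper
            )
            super().__init__(*args, **kwargs)

    Alias.__name__ = Alias.__qualname__ = old_name
    return Alias


# The old name keeps working but nudges callers toward the new one.
InputFileReference = deprecated_alias(InputPath, "InputFileReference")
```

Because the alias subclasses the new type, existing `isinstance` checks against `InputPath` would still pass for objects created through the old name.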

Testing done

  • tesseract_core/runtime/testing/regression.py: TestSpec gains a skip_output_path_checks flag so that test cases with *PathReference outputs can skip output schema validation. Without it, OutputSchema tries to validate files that do not exist yet.
  • tests/endtoend_tests/test_examples.py — updated to use a tmp_path output directory for the path-reference example (avoids polluting the example tree).
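The motivation for the skip flag can be illustrated with a small sketch. Everything here is hypothetical: `SketchTestSpec` and `missing_expected_outputs` are invented names, and only the flag name `skip_output_path_checks` comes from the description above. The real TestSpec in regression.py may look quite different.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SketchTestSpec:
    """Hypothetical, simplified test spec: output names mapped to relative paths."""

    expected_outputs: dict = field(default_factory=dict)
    skip_output_path_checks: bool = False


def missing_expected_outputs(spec, output_dir):
    """Return the names of expected outputs whose paths do not exist yet."""
    if spec.skip_output_path_checks:
        # Paths are produced later by the run itself, so do not check them here.
        return []
    return [
        name
        for name, rel_path in spec.expected_outputs.items()
        if not (Path(output_dir) / rel_path).exists()
    ]


# Before the run, the expected output has not been written to the output directory.
with tempfile.TemporaryDirectory() as out_dir:
    strict = SketchTestSpec({"result": "out/result.json"})
    lenient = SketchTestSpec({"result": "out/result.json"}, skip_output_path_checks=True)
    strict_missing = missing_expected_outputs(strict, out_dir)
    lenient_missing = missing_expected_outputs(lenient, out_dir)
```

In this sketch the strict spec reports the output as missing, while the spec with the flag set skips the check entirely.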

@PasteurBot
Contributor

PasteurBot commented Apr 6, 2026

CLA signatures confirmed

All contributors have signed the Contributor License Agreement.
Posted by the CLA Assistant Lite bot.

@codecov

codecov bot commented Apr 6, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.14%. Comparing base (8cd5d69) to head (9435c61).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| tesseract_core/runtime/experimental.py | 74.13% | 12 Missing and 3 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #555      +/-   ##
==========================================
+ Coverage   76.95%   77.14%   +0.18%     
==========================================
  Files          32       32              
  Lines        4409     4450      +41     
  Branches      730      735       +5     
==========================================
+ Hits         3393     3433      +40     
+ Misses        716      714       -2     
- Partials      300      303       +3     

@nmheim nmheim force-pushed the nh/path-references branch from 9b11ec5 to 8cde6f2 Compare April 6, 2026 16:27
…utputPathReference

Generalizes file references to accept any existing filesystem path (file or
directory). Removes the is_file() constraint from input validation; existence
check is preserved. Output validator is unchanged.
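
The validation change described in this commit message could look roughly like the following. This is a minimal sketch, not the validator from experimental.py: `validate_input_path` and its `base_dir` parameter are hypothetical names chosen for illustration.

```python
import tempfile
from pathlib import Path


def validate_input_path(path, base_dir):
    """Resolve path against base_dir and require only that it exists.

    A file-only validator would additionally check resolved.is_file();
    dropping that check is what lets directories through.
    """
    resolved = (Path(base_dir) / path).resolve()
    if not resolved.exists():
        raise ValueError(f"Input path does not exist: {resolved}")
    return resolved


with tempfile.TemporaryDirectory() as base:
    (Path(base) / "mesh.vtk").write_text("dummy")
    (Path(base) / "snapshots").mkdir()

    file_ok = validate_input_path("mesh.vtk", base).is_file()
    dir_ok = validate_input_path("snapshots", base).is_dir()

    try:
        validate_input_path("missing", base)
        missing_rejected = False
    except ValueError:
        missing_rejected = True
```

Files and directories are both accepted here, while a path that does not exist is still rejected.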
@PasteurBot
Contributor

PasteurBot commented Apr 6, 2026

Benchmark Results

Benchmarks use a no-op Tesseract to measure pure framework overhead.

🚀 0 faster, ⚠️ 0 slower, ✅ 36 unchanged

✅ No significant performance changes detected.

Full results
| Benchmark | Baseline | Current | Change | Status |
|---|---|---|---|---|
| api/apply_1,000 | 0.398ms | 0.397ms | -0.2% | ✅ |
| api/apply_100,000 | 0.396ms | 0.400ms | +0.9% | ✅ |
| api/apply_10,000,000 | 0.400ms | 0.396ms | -1.0% | ✅ |
| cli/apply_1,000 | 1546.323ms | 1563.886ms | +1.1% | ✅ |
| cli/apply_100,000 | 1580.993ms | 1574.373ms | -0.4% | ✅ |
| cli/apply_10,000,000 | 1637.420ms | 1626.263ms | -0.7% | ✅ |
| decoding/base64_1,000 | 0.028ms | 0.028ms | +1.4% | ✅ |
| decoding/base64_100,000 | 0.826ms | 0.828ms | +0.2% | ✅ |
| decoding/base64_10,000,000 | 118.681ms | 118.658ms | -0.0% | ✅ |
| decoding/binref_1,000 | 0.165ms | 0.167ms | +0.9% | ✅ |
| decoding/binref_100,000 | 0.259ms | 0.257ms | -0.7% | ✅ |
| decoding/binref_10,000,000 | 20.388ms | 20.186ms | -1.0% | ✅ |
| decoding/json_1,000 | 0.090ms | 0.093ms | +3.0% | ✅ |
| decoding/json_100,000 | 8.469ms | 8.528ms | +0.7% | ✅ |
| decoding/json_10,000,000 | 1056.158ms | 1050.526ms | -0.5% | ✅ |
| encoding/base64_1,000 | 0.034ms | 0.034ms | +0.1% | ✅ |
| encoding/base64_100,000 | 0.188ms | 0.187ms | -0.7% | ✅ |
| encoding/base64_10,000,000 | 48.868ms | 48.276ms | -1.2% | ✅ |
| encoding/binref_1,000 | 0.227ms | 0.228ms | +0.4% | ✅ |
| encoding/binref_100,000 | 0.380ms | 0.387ms | +1.8% | ✅ |
| encoding/binref_10,000,000 | 23.577ms | 23.112ms | -2.0% | ✅ |
| encoding/json_1,000 | 0.118ms | 0.118ms | +0.2% | ✅ |
| encoding/json_100,000 | 11.233ms | 10.836ms | -3.5% | ✅ |
| encoding/json_10,000,000 | 1217.632ms | 1210.218ms | -0.6% | ✅ |
| http/apply_1,000 | 2.873ms | 2.954ms | +2.8% | ✅ |
| http/apply_100,000 | 8.988ms | 9.425ms | +4.9% | ✅ |
| http/apply_10,000,000 | 770.503ms | 778.131ms | +1.0% | ✅ |
| roundtrip/base64_1,000 | 0.071ms | 0.071ms | -0.6% | ✅ |
| roundtrip/base64_100,000 | 1.199ms | 1.200ms | +0.1% | ✅ |
| roundtrip/base64_10,000,000 | 167.584ms | 167.509ms | -0.0% | ✅ |
| roundtrip/binref_1,000 | 0.410ms | 0.409ms | -0.1% | ✅ |
| roundtrip/binref_100,000 | 0.645ms | 0.650ms | +0.7% | ✅ |
| roundtrip/binref_10,000,000 | 44.280ms | 44.070ms | -0.5% | ✅ |
| roundtrip/json_1,000 | 0.220ms | 0.220ms | -0.1% | ✅ |
| roundtrip/json_100,000 | 18.594ms | 18.020ms | -3.1% | ✅ |
| roundtrip/json_10,000,000 | 2250.802ms | 2253.071ms | +0.1% | ✅ |

  • Runner: Linux 6.17.0-1010-azure x86_64

@nmheim nmheim force-pushed the nh/path-references branch from 8cde6f2 to 1b804e4 Compare April 6, 2026 16:42
@jpbrodrick89
Contributor

Should we not keep InputFileReference for backwards compatibility?

@nmheim nmheim force-pushed the nh/path-references branch from d3737ac to a11ef26 Compare April 7, 2026 11:50
@nmheim nmheim changed the title Turn FileReferences into PathReferences feat: Add *DirectoryReferences and enforce is_file() in OutputFileReference Apr 7, 2026
Contributor

@jpbrodrick89 jpbrodrick89 left a comment

Cool, I like this. I'd approve, but as it's still a draft I will wait until you mark it ready before I give the full green light. Thanks!

Comment thread examples/file_io/tesseract_api.py
@nmheim nmheim force-pushed the nh/path-references branch 2 times, most recently from 48111c6 to 5d05532 Compare April 7, 2026 16:29
@nmheim nmheim force-pushed the nh/path-references branch from 5d05532 to e63438c Compare April 7, 2026 16:32
@nmheim
Contributor Author

nmheim commented Apr 7, 2026

> Cool, I like this. I'd approve, but as it's still a draft I will wait until you mark it ready before I give the full green light. Thanks!

Yes, this was still quite drafty. After a chat with @xalelax I have now settled on *PathReferences while keeping the *FileReferences around. IMO we could remove those though, given that they were in the experimental module.

Comment thread tesseract_core/runtime/schema_generation.py Outdated
@nmheim nmheim changed the title feat: Add *DirectoryReferences and enforce is_file() in OutputFileReference feat: Automatic path resolution in InputSchema/OutputSchema Apr 8, 2026
Comment thread docs/content/examples/building-blocks/pathreference.md Outdated
Comment thread docs/content/examples/building-blocks/pathreference.md Outdated
Comment thread docs/content/examples/building-blocks/pathreference.md Outdated
Comment thread examples/pathreference/tesseract_config.yaml Outdated
Comment thread tests/endtoend_tests/test_examples.py
Comment thread tesseract_core/runtime/experimental.py Outdated
Comment thread tesseract_core/runtime/testing/regression.py Outdated
Comment thread examples/pathreference/test_tesseract.py Outdated
Comment thread tests/endtoend_tests/test_examples.py Outdated
Contributor

@dionhaefner dionhaefner left a comment

Approved while we decide what to do with #569

Alternative design for #304
@nmheim nmheim enabled auto-merge (squash) April 21, 2026 12:31
@nmheim nmheim changed the title feat: Allow *PathReferences in apply schemas feat: Add InputPath / OutputPath schemas Apr 21, 2026
@dionhaefner dionhaefner changed the title feat: Add InputPath / OutputPath schemas feat: Add InputPath / OutputPath types to tesseract_core.runtime.experimental Apr 21, 2026
@dionhaefner dionhaefner changed the title feat: Add InputPath / OutputPath types to tesseract_core.runtime.experimental feat: Add InputPath / OutputPath types to tesseract_core.runtime.experimental, deprecate InputFileReference + OutputFileReference Apr 21, 2026
@nmheim nmheim merged commit 1ba136a into main Apr 21, 2026
54 checks passed
@nmheim nmheim deleted the nh/path-references branch April 21, 2026 12:47
@pasteurlabs pasteurlabs locked and limited conversation to collaborators Apr 21, 2026

6 participants