Skip to content

Add confidence to pretranslations#892

Merged
Enkidu93 merged 3 commits intomainfrom
add_confidence_to_pretranslations
Mar 25, 2026
Merged

Add confidence to pretranslations#892
Enkidu93 merged 3 commits intomainfrom
add_confidence_to_pretranslations

Conversation

@Enkidu93
Copy link
Copy Markdown
Collaborator

@Enkidu93 Enkidu93 commented Mar 17, 2026

Fixes #879.

I'll want to test this end-to-end once sillsdev/machine.py#268 is complete and a docker image is created.

Also, made a few small typo-ish fixes.


This change is Reviewable

@Enkidu93 Enkidu93 requested a review from ddaspit March 17, 2026 15:40
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.52%. Comparing base (b548e70) to head (078a21c).

Files with missing lines Patch % Lines
...hared/Services/ServalTranslationEngineServiceV1.cs 0.00% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #892      +/-   ##
==========================================
+ Coverage   67.42%   67.52%   +0.10%     
==========================================
  Files         385      385              
  Lines       21096    21107      +11     
  Branches     2751     2748       -3     
==========================================
+ Hits        14223    14253      +30     
+ Misses       5889     5870      -19     
  Partials      984      984              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ddaspit reviewed 13 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on Enkidu93).


src/Serval/src/Serval.Translation/Models/Pretranslation.cs line 18 at r1 (raw file):

    public IEnumerable<string>? TranslationTokens { get; init; }
    public IReadOnlyList<AlignedWordPair>? Alignment { get; init; }
    public double Confidence { get; init; }

This should be optional. Old pretranslations won't have confidence set.

Copy link
Copy Markdown
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Enkidu93 made 1 comment.
Reviewable status: 11 of 13 files reviewed, 1 unresolved discussion (waiting on ddaspit).


src/Serval/src/Serval.Translation/Models/Pretranslation.cs line 18 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

This should be optional. Old pretranslations won't have confidence set.

Done. Out of curiosity, why not just have a default value here? Why do we rather make it optional here and then provide an explicit default at the DTO level?

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@ddaspit reviewed 2 files and all commit messages, made 2 comments, and resolved 1 discussion.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).


src/Serval/src/Serval.Translation/Models/Pretranslation.cs line 18 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Done. Out of curiosity, why not just have a default value here? Why do we rather make it optional here and then provide an explicit default at the DTO level?

I don't have a problem with providing a default value here. We just need to do it somewhere.

@Enkidu93 Enkidu93 force-pushed the add_confidence_to_pretranslations branch from 550d505 to 078a21c Compare March 24, 2026 14:45
Copy link
Copy Markdown
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I was mistaken. machine.py is using 2 beams on production. I failed to noticed that for docker-compose jobs, it defaults to using one beam in the settings.yaml . I have updated the E2E test to specify 2 beams explicitly; also I had to switch to using the 600m model (rather than the random model) and up the steps in order to get non-empty translations. If you'd prefer, I can revert these changes and remove the confidence check in the E2E test (if you're satisfied with other testing).

@Enkidu93 made 1 comment.
Reviewable status: 9 of 13 files reviewed, all discussions resolved (waiting on ddaspit).

@Enkidu93 Enkidu93 requested a review from ddaspit March 24, 2026 15:17
Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much does using NLLB impact how long it takes to run the E2E tests?

@ddaspit reviewed 4 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).

@Enkidu93
Copy link
Copy Markdown
Collaborator Author

How much does using NLLB impact how long it takes to run the E2E tests?

@ddaspit reviewed 4 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).

Good question! Just tested: About a minute (it's just the one E2E test that I updated).

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds fine.

@ddaspit made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).

@Enkidu93 Enkidu93 merged commit a4e5a8e into main Mar 25, 2026
2 checks passed
@Enkidu93 Enkidu93 deleted the add_confidence_to_pretranslations branch March 25, 2026 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Save model confidence to Pretranslation model

3 participants