Skip to content

Support daughter projects#886

Merged
Enkidu93 merged 5 commits intomainfrom
support_daughter_projects
Mar 19, 2026
Merged

Support daughter projects#886
Enkidu93 merged 5 commits intomainfrom
support_daughter_projects

Conversation

@Enkidu93
Copy link
Copy Markdown
Collaborator

@Enkidu93 Enkidu93 commented Mar 3, 2026

Fixes #854

I figured I'd push what I have and see what you think. There are a lot of options regarding how to divvy work among the classes. Unfortunately, scanning for the parent project is going to be a little ugly regardless of how you do it. I'd still like to add a couple tests.


This change is Reviewable

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit messy with the current design. I think we should seriously consider refactoring ParallelCorpusPreprocessingService, TextCorpusService, and ScriptureDataFileService. TextCorpusService seems pretty redundant at this point. Maybe we could merge ParallelCorpusPreprocessingService and TextCorpusService into a single CorpusService that performs various operations on ParallelCorpus and MonolingualCorpus instances. This means that you would need to map the context-specific models into the service toolkit models. This is no big deal in the Machine engine. We are already doing that. We will need to do the mapping in Serval. The new service could abstract all of the special Machine classes that handle Paratext projects and USFM (USFM updating, quotation convention detecting, preprocessing, versification warnings, getting chapters from a Scripture range, etc.). Let me know if you want to discuss more.

@ddaspit reviewed 3 files and made 1 comment.
Reviewable status: 3 of 21 files reviewed, all discussions resolved.

Copy link
Copy Markdown
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I really don't care for the way the logic required to support this makes a mess of all these classes. What I really don't like is that we are re-scanning the corpora for parent projects. I think we should either save that mapping in the client once via a call to one of these services and then pass the specific parent location or settings to each further call or else reimagine the singleton service class as either a scoped service or just a normal class and pass the corpora all at once, let the instance establish the relationships once, and then it could implicitly load the parent in further calls. What do you think is better?

Regardless, yes, I think it makes sense to merge those classes.

@Enkidu93 made 1 comment.
Reviewable status: 3 of 21 files reviewed, all discussions resolved (waiting on ddaspit).

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like where you are going with this. Maybe we could create a class that captures all of the data and relationships that we can pass as an argument to this new service. The class could be a scoped service, but a normal class might be easier. I don't know if it makes sense, but we could use some type of a factory pattern to create the state object. We should play around with the design a bit to see what works best.

@ddaspit made 1 comment.
Reviewable status: 3 of 21 files reviewed, all discussions resolved.

Copy link
Copy Markdown
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this has taken me a while! I still need to write some new tests and polish a couple things, but I thought I'd push what I have so you can see the direction I'm going. Basic idea is that the CorpusBundle (open to alternative names 😅) handles all of the text corpus creation, settings creation, parent project finding, etc. - basically anything SIL.Machine besides the filtering and alignment. This removes the need for the ITextCorpusService and the IScriptureDataFileService and simplifies the code in the preprocessing service a lot. It also means that we do not need to repeatedly parse settings files etc. I've renamed the ParallelCorpusPreprocessingService to ParallelCorpusService since its role is a little more generic (even though its functions are still only used in preprocessing). (Is there a reason we can't remove the idea of this class as a service and just make it a utility class with all static methods?) Let me know what you think.

@Enkidu93 made 1 comment.
Reviewable status: 2 of 40 files reviewed, all discussions resolved (waiting on ddaspit).

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the approach. I will finish reviewing it once you've finished the refactoring.

@ddaspit partially reviewed 17 files and made 1 comment.
Reviewable status: 19 of 40 files reviewed, all discussions resolved.

@Enkidu93 Enkidu93 force-pushed the support_daughter_projects branch from 8873edc to 92aa7da Compare March 10, 2026 15:42
@Enkidu93
Copy link
Copy Markdown
Collaborator Author

Alright, I believe that this is ready for re-review. Tests are passing locally. The nuget publishing action in Machine is failing though: https://github.com/sillsdev/machine/actions/runs/22910752599/job/66484569655. I think we may need to refresh the API key. I don't believe I have access in nuget. Could you refresh it?

@Enkidu93 Enkidu93 requested a review from ddaspit March 10, 2026 15:58
@Enkidu93 Enkidu93 force-pushed the support_daughter_projects branch from 92aa7da to 97ccd3b Compare March 11, 2026 15:57
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 88.00662% with 145 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.42%. Comparing base (1f17a38) to head (0037c67).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...L.ServiceToolkit/Services/ParallelCorpusService.cs 88.09% 52 Missing and 18 partials ⚠️
...erval.Translation/Services/CorpusMappingService.cs 87.24% 29 Missing and 8 partials ⚠️
...rval.Machine.Shared/Services/PreprocessBuildJob.cs 56.66% 11 Missing and 2 partials ⚠️
...it/src/SIL.ServiceToolkit/Services/CorpusBundle.cs 94.73% 5 Missing and 3 partials ⚠️
...l/src/Serval.Translation/Services/EngineService.cs 92.15% 1 Missing and 3 partials ⚠️
...rval.Translation/Services/PretranslationService.cs 92.98% 2 Missing and 2 partials ⚠️
...src/Serval.WordAlignment/Services/EngineService.cs 0.00% 4 Missing ⚠️
...Shared/Services/WordAlignmentPreprocessBuildJob.cs 71.42% 2 Missing ⚠️
...ServiceToolkit/Models/MissingParentProjectError.cs 0.00% 2 Missing ⚠️
...ared/Configuration/IServiceCollectionExtensions.cs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #886      +/-   ##
==========================================
+ Coverage   67.01%   67.42%   +0.40%     
==========================================
  Files         384      385       +1     
  Lines       21037    21096      +59     
  Branches     2734     2751      +17     
==========================================
+ Hits        14098    14223     +125     
+ Misses       5963     5889      -74     
- Partials      976      984       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Enkidu93
Copy link
Copy Markdown
Collaborator Author

Alright, I think this is ready for re-review, @ddaspit!

@Enkidu93
Copy link
Copy Markdown
Collaborator Author

Alright - once again, I think this ready for review 👍

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ddaspit partially reviewed 53 files and all commit messages, and made 3 comments.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on Enkidu93).


src/Serval/src/Serval.Translation/Configuration/IServalBuilderExtensions.cs line 13 at r6 (raw file):

        builder.AddDataFileOptions(builder.Configuration.GetSection(DataFileOptions.Key));

        builder.Services.AddSingleton<IParallelCorpusService, ParallelCorpusService>();

You should use the AddParallelCorpusService method here.


src/Serval/src/Serval.Translation/Services/ICorpusMappingService.cs line 6 at r7 (raw file):

{
    IReadOnlyList<SIL.ServiceToolkit.Models.ParallelCorpus> Map(Build build, Engine engine);
    string GetFilePath(string filename);

Does this method need to be exposed publicly?


src/Serval/src/Serval.Translation/Services/PretranslationService.cs line 52 at r7 (raw file):

    )
    {
        Engine? engine = await _engines.GetAsync(engineId, cancellationToken);

We should throw EntityNotFoundException if engine is null.

Copy link
Copy Markdown
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Enkidu93 made 3 comments.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on ddaspit).


src/Serval/src/Serval.Translation/Configuration/IServalBuilderExtensions.cs line 13 at r6 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

You should use the AddParallelCorpusService method here.

Done.


src/Serval/src/Serval.Translation/Services/ICorpusMappingService.cs line 6 at r7 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Does this method need to be exposed publicly?

It does not! Thank you - sorry I forgot about that


src/Serval/src/Serval.Translation/Services/PretranslationService.cs line 52 at r7 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

We should throw EntityNotFoundException if engine is null.

Done. Added a test.

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@ddaspit reviewed 5 files and all commit messages, made 1 comment, and resolved 3 discussions.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).

Enkidu93 and others added 5 commits March 19, 2026 17:57
Initial pass at daughter project support

Refactor paratext project parallel corpus handling (unfinished)

Remove unnecessary classes

Progress toward finished tests

Passing tests
- add update USFM methods to ParallelCorpusService
…ll typos; progress toward passing tests

Fix pretranslation service bug

Passing translation engine tests

Move all mapping to corpus mapping service

Remove commented out code
@Enkidu93 Enkidu93 force-pushed the support_daughter_projects branch from da07573 to 0037c67 Compare March 19, 2026 21:58
@Enkidu93 Enkidu93 merged commit b548e70 into main Mar 19, 2026
1 of 2 checks passed
@Enkidu93 Enkidu93 deleted the support_daughter_projects branch March 19, 2026 22:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Load daughter/back translation versification from the parent translation

3 participants