Move model download to a foreground WorkManager worker #22
Merged
ivan-digital pushed a commit that referenced this pull request May 10, 2026

… retries

Two existing deletions were destroying the partial-download state we want to keep:

1. ensureModels() opens by walking the models dir and deleting every .tmp. The intent was to clean up after process crashes, but it also nukes the in-progress .tmp from a previous ModelDownloadWorker invocation that returned Result.retry(), meaning every WorkManager retry restarted that file from byte 0. Range resume can't help if there's nothing on disk to resume from. Replace with a comment explaining why we keep them; stale .tmp from an old MODEL_VERSION is still wiped by the version-mismatch path above.

2. downloadFile() deleted the .tmp file when its 5-attempt retry loop was exhausted before throwing. Same problem: a transient network failure that outlasts those 5 in-loop retries should still leave resumable bytes for the worker's next try. Drop the deletion.

Net effect: a 1.2 GB download that hits a flaky network now keeps every byte that made it to disk. Observed during emulator verification of PR #22, where a worker retry dropped ~2.5 MB of partial parakeet-encoder bytes for no good reason.

Test: invert 'cleans up tmp file after all retries fail' to assert the .tmp persists after exhaustion, using DISCONNECT_DURING_RESPONSE_BODY to get a realistic mid-stream failure path.
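To make the resume behaviour above concrete, here is a minimal sketch of a Range-based download that deliberately leaves its .tmp on disk when it fails. It is an illustration under stated assumptions, not the project's code: the source says ModelManager uses OkHttp, while this sketch swaps in java.net.HttpURLConnection so it needs no dependencies, and the names resumeRange and downloadResumable are hypothetical.

```kotlin
import java.io.File
import java.io.FileOutputStream
import java.io.IOException
import java.net.HttpURLConnection
import java.net.URL

/** Range header value for resuming into [tmp], or null when there is nothing on disk to resume from. */
fun resumeRange(tmp: File): String? =
    if (tmp.isFile && tmp.length() > 0) "bytes=${tmp.length()}-" else null

/**
 * Downloads [url] into [dest] through a sibling .tmp file (hypothetical helper).
 * Any IOException propagates to the caller and the .tmp is NOT deleted,
 * so a later attempt can continue from the bytes already written.
 */
fun downloadResumable(url: URL, dest: File) {
    val tmp = File(dest.parentFile, dest.name + ".tmp")
    val conn = url.openConnection() as HttpURLConnection
    resumeRange(tmp)?.let { conn.setRequestProperty("Range", it) }
    try {
        // 206 means the server honoured the Range header, so append.
        // Any other successful status is a full body, so start the file over.
        val append = conn.responseCode == HttpURLConnection.HTTP_PARTIAL
        conn.inputStream.use { input ->
            FileOutputStream(tmp, append).use { out -> input.copyTo(out) }
        }
    } finally {
        conn.disconnect()
    }
    // Only a completed download is promoted to its final name.
    if (!tmp.renameTo(dest)) throw IOException("Could not move ${tmp.name} into place")
}
```

The key point is what the sketch does not do: there is no cleanup branch that removes the .tmp on failure, which is what lets a later Result.retry() pick up where the previous attempt stopped.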
added 3 commits May 10, 2026 12:22
ModelManager.ensureModels was called from lifecycleScope in the demo activities, so the 1.2 GB initial download was bound to the foreground Activity. Backgrounding the app cancelled the coroutine mid-stream; partial files were retained (the existing Range: resume logic in ModelManager handles that), but progress visibly stalled and any fresh utterance had to start from the byte where the user left off.

Add ModelDownloadWorker: a CoroutineWorker that wraps ensureModels, calls setForeground() with a progress notification (FOREGROUND_SERVICE_TYPE_DATA_SYNC on API 34+), reports per-file progress via setProgress, and returns the resolved model directory in outputData. MainActivity / DictationActivity now enqueue the worker via WorkManager.enqueueUniqueWork(KEEP) and observe state through getWorkInfoByIdLiveData. Existing Activity-driven UI updates (status text, progress bar) wire up to the worker's progress instead.

Also pulls in androidx.work:work-runtime-ktx and androidx.core:core-ktx in :sdk, the same work-runtime in :app, and POST_NOTIFICATIONS in the demo manifest.

Verified locally: ./gradlew :sdk:testDebugUnitTest passes 20/20 (no behavioral change to ModelManager itself).
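The worker shape described in this commit could be sketched roughly as follows. This is an assumption-laden illustration rather than the merged ModelDownloadWorker: the source does not show the signature of ModelManager.ensureModels, so ensureModelsStandIn is a hypothetical placeholder, and the class name, channel id, notification id, and all key names other than KEY_ERROR are invented for the example.

```kotlin
import android.app.NotificationChannel
import android.app.NotificationManager
import android.content.Context
import android.content.pm.ServiceInfo
import android.os.Build
import androidx.core.app.NotificationCompat
import androidx.work.CoroutineWorker
import androidx.work.ForegroundInfo
import androidx.work.WorkerParameters
import androidx.work.workDataOf
import java.io.File
import java.io.IOException
import kotlinx.coroutines.CancellationException

class ModelDownloadWorkerSketch(context: Context, params: WorkerParameters) :
    CoroutineWorker(context, params) {

    override suspend fun doWork(): Result = try {
        val modelDir = ensureModelsStandIn { file, index, total ->
            // Keep the notification and the observable progress in step.
            setForeground(foregroundInfo(file, index, total))
            setProgress(workDataOf(KEY_FILE to file, KEY_INDEX to index, KEY_TOTAL to total))
        }
        Result.success(workDataOf(KEY_MODEL_DIR to modelDir.absolutePath))
    } catch (e: CancellationException) {
        throw e // a stopped worker should stay cancelled, not be reported as failed
    } catch (e: IOException) {
        Result.retry() // transient network failure: let WorkManager reschedule
    } catch (t: Throwable) {
        Result.failure(workDataOf(KEY_ERROR to (t.message ?: t.javaClass.simpleName)))
    }

    // Hypothetical stand-in: the real worker would call ModelManager.ensureModels here.
    private suspend fun ensureModelsStandIn(
        onProgress: suspend (file: String, index: Int, total: Int) -> Unit,
    ): File {
        onProgress("example.bin", 1, 1)
        return File(applicationContext.filesDir, "models")
    }

    private fun foregroundInfo(file: String, index: Int, total: Int): ForegroundInfo {
        val manager =
            applicationContext.getSystemService(Context.NOTIFICATION_SERVICE) as NotificationManager
        if (Build.VERSION.SDK_INT >= 26) { // channels exist from API 26
            manager.createNotificationChannel(
                NotificationChannel(CHANNEL_ID, "Model download", NotificationManager.IMPORTANCE_LOW)
            )
        }
        val notification = NotificationCompat.Builder(applicationContext, CHANNEL_ID)
            .setSmallIcon(android.R.drawable.stat_sys_download)
            .setContentTitle("Speech models — $file $index/$total")
            .setProgress(total, index, false)
            .setPriority(NotificationCompat.PRIORITY_LOW)
            .setOnlyAlertOnce(true)
            .setOngoing(true)
            .build()
        // Pass the data-sync type explicitly on API 34+, as the commit describes.
        return if (Build.VERSION.SDK_INT >= 34) {
            ForegroundInfo(NOTIFICATION_ID, notification, ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC)
        } else {
            ForegroundInfo(NOTIFICATION_ID, notification)
        }
    }

    companion object {
        const val KEY_FILE = "file"
        const val KEY_INDEX = "index"
        const val KEY_TOTAL = "total"
        const val KEY_MODEL_DIR = "modelDir"
        const val KEY_ERROR = "error"
        private const val CHANNEL_ID = "model_download"
        private const val NOTIFICATION_ID = 1001
    }
}
```

On the activity side, enqueueing and observing might look like the snippet below. The extension function startModelDownload is hypothetical. One thing to keep in mind with ExistingWorkPolicy.KEEP: if work with the same unique name is already pending, the new request is ignored, so observing by the new request's id may never deliver a WorkInfo.

```kotlin
import androidx.activity.ComponentActivity
import androidx.work.ExistingWorkPolicy
import androidx.work.OneTimeWorkRequestBuilder
import androidx.work.WorkInfo
import androidx.work.WorkManager

fun ComponentActivity.startModelDownload(onReady: (modelDir: String) -> Unit) {
    // No Constraints are set on the request (no NetworkType.CONNECTED).
    val request = OneTimeWorkRequestBuilder<ModelDownloadWorkerSketch>().build()
    val workManager = WorkManager.getInstance(applicationContext)
    workManager.enqueueUniqueWork(
        "audio.soniqo.speech.modelDownload", ExistingWorkPolicy.KEEP, request)
    workManager.getWorkInfoByIdLiveData(request.id).observe(this) { info ->
        if (info != null && info.state == WorkInfo.State.SUCCEEDED) {
            info.outputData.getString(ModelDownloadWorkerSketch.KEY_MODEL_DIR)?.let(onReady)
        }
    }
}
```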
Two issues surfaced during on-emulator verification of the worker:

1. WorkManager 2.9.x's bundled manifest declares SystemForegroundService without a foregroundServiceType. On API 34+, startForeground() with FOREGROUND_SERVICE_TYPE_DATA_SYNC then fails with IllegalArgumentException 'foregroundServiceType 0x00000001 is not a subset of foregroundServiceType attribute 0x00000000 in service element of manifest file'. Override the service entry in the SDK manifest with android:foregroundServiceType="dataSync" + tools:replace so the worker is compatible everywhere without forcing consumers to bump WorkManager.

2. NetworkType.CONNECTED maps to a JobInfo network request that requires the VALIDATED capability, which JobScheduler may hold unsatisfied for long periods on flaky networks (captive portal probe failure, transient DNS), even when the device has working internet. Drop the constraint entirely; OkHttp surfaces network failures as IOException which the worker already translates into Result.retry().
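As a configuration sketch of the first fix, the SDK manifest override could look something like the fragment below. The service class name is the one WorkManager ships; the permission entries follow the PR's file summary. Treat the exact merge attribute as the author's choice as described in the commit: the Android documentation for long-running workers shows tools:node="merge" for the same purpose, and which one the merged manifest actually needs is not something this page confirms.

```xml
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools">

    <uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
    <uses-permission android:name="android.permission.FOREGROUND_SERVICE_DATA_SYNC" />

    <application>
        <!-- Give WorkManager's foreground service a type that matches
             FOREGROUND_SERVICE_TYPE_DATA_SYNC, so consumers need no override of their own. -->
        <service
            android:name="androidx.work.impl.foreground.SystemForegroundService"
            android:foregroundServiceType="dataSync"
            tools:replace="android:foregroundServiceType" />
    </application>
</manifest>
```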
Six tests using TestListenableWorkerBuilder with mockkObject(ModelManager) so the worker contract can be exercised on the JVM without touching the network or the file system:

- doWork_success_returnsModelDirInOutputData
- doWork_ioException_returnsRetry: pins the transient-failure path
- doWork_genericThrowable_returnsFailureWithMessage: pins KEY_ERROR shape
- doWork_invalidPrecisionInput_defaultsToInt8: guards against bad input
- doWork_missingPrecisionInput_defaultsToInt8
- request_buildsRequestWithPrecisionInputDataAndNoNetworkConstraint: pins the no-CONSTRAINT_CONNECTIVITY decision

Adds androidx.work:work-testing:2.9.1 to :sdk testImplementation.

Local: ./gradlew :sdk:testDebugUnitTest passes 26/26 (6 new + 5 service + 15 ModelManager).
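For readers unfamiliar with TestListenableWorkerBuilder, the sketch below shows the general shape of such a test. It does not reproduce the PR's tests: because the signature of ModelManager.ensureModels is not shown on this page, the example uses a hypothetical FlakyWorker that always throws IOException instead of mocking ModelManager, and it assumes Robolectric plus androidx.test:core are on the test classpath.

```kotlin
import android.content.Context
import androidx.test.core.app.ApplicationProvider
import androidx.work.CoroutineWorker
import androidx.work.ListenableWorker
import androidx.work.WorkerParameters
import androidx.work.testing.TestListenableWorkerBuilder
import java.io.IOException
import kotlinx.coroutines.runBlocking
import org.junit.Assert.assertEquals
import org.junit.Test
import org.junit.runner.RunWith
import org.robolectric.RobolectricTestRunner

/** Hypothetical worker that always hits a transient network error. */
class FlakyWorker(context: Context, params: WorkerParameters) :
    CoroutineWorker(context, params) {
    override suspend fun doWork(): Result = try {
        throw IOException("simulated network drop")
    } catch (e: IOException) {
        Result.retry()
    }
}

@RunWith(RobolectricTestRunner::class)
class FlakyWorkerTest {
    @Test
    fun doWork_ioException_returnsRetry() = runBlocking {
        val context = ApplicationProvider.getApplicationContext<Context>()
        // Builds the worker directly, without scheduling it through WorkManager.
        val worker = TestListenableWorkerBuilder<FlakyWorker>(context).build()
        assertEquals(ListenableWorker.Result.retry(), worker.doWork())
    }
}
```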
Stacked on #21. Fixes the issue where backgrounding the demo app
mid-download visibly stalls the model fetch.
The issue
ModelManager.ensureModels(...) was called inside the activity's lifecycleScope (MainActivity.kt:211, DictationActivity.kt:170). When the user backgrounded the app:

1. The activity reaches onStop and is eventually destroyed if Android needs RAM
2. lifecycleScope cancels the coroutine
3. .tmp files persist, so on next launch ensureModels resumes from the saved offset (ModelManager.kt:140-142), but only after the user re-enters the app

This works, but feels broken: a 1.2 GB first install pauses indefinitely the moment the screen is off.
The fix (3 commits)
1. ffbeeb6 Move model download to a foreground worker

A CoroutineWorker that wraps ensureModels and runs as a foreground service:

- setForeground(...) with FOREGROUND_SERVICE_TYPE_DATA_SYNC (API 34+ requirement)
- Progress notification ("Speech models — <file> 7/16") with PRIORITY_LOW and setOnlyAlertOnce(true)
- Per-file progress via setProgress(workDataOf(...)) so observers can drive the UI without parsing the notification
- Returns the resolved modelDir in outputData
- IOException → Result.retry(), anything else → Result.failure() with the message in outputData
- enqueueUniqueWork("audio.soniqo.speech.modelDownload", KEEP, ...) so concurrent enqueues are deduped

The activities just observe state via getWorkInfoByIdLiveData and proceed to initPipeline(modelDir) on SUCCEEDED. No more direct ensureModels call from the foreground.

2. 562ae25 Runtime fixes from emulator verification

Two issues surfaced when actually running the worker on an arm64 emulator:

- WorkManager 2.9.x's bundled manifest declares SystemForegroundService without a foregroundServiceType. When we call startForeground(FOREGROUND_SERVICE_TYPE_DATA_SYNC), Android rejects with IllegalArgumentException because the service element doesn't advertise a matching type. Fix: declare an override in the SDK manifest with tools:replace="android:foregroundServiceType" so consumers don't have to.
- CONNECTIVITY constraint stuck. NetworkType.CONNECTED maps to a JobInfo network request requiring the VALIDATED capability, which JobScheduler can hold unsatisfied for long periods on flaky / captive networks even when the device has working internet. Drop the constraint; OkHttp surfaces network failures as IOException which the worker already retries.

3. ff89a10 Robolectric tests for the worker (6 tests)

Using TestListenableWorkerBuilder + mockkObject(ModelManager) so the worker contract is exercised on the JVM without network or disk:

- doWork_success_returnsModelDirInOutputData
- doWork_ioException_returnsRetry: pins the transient-failure path
- doWork_genericThrowable_returnsFailureWithMessage: pins KEY_ERROR shape
- doWork_invalidPrecisionInput_defaultsToInt8
- doWork_missingPrecisionInput_defaultsToInt8
- request_buildsRequestWithPrecisionInputDataAndNoNetworkConstraint

Files
- sdk/.../ModelDownloadWorker.kt: CoroutineWorker (~150 LOC)
- sdk/.../ModelDownloadWorkerTest.kt
- sdk/build.gradle.kts: +work-runtime-ktx:2.9.1, +core-ktx:1.13.1, +work-testing (test)
- sdk/src/main/AndroidManifest.xml: FOREGROUND_SERVICE / FOREGROUND_SERVICE_DATA_SYNC perms; SystemForegroundService foregroundServiceType="dataSync" override
- app/.../MainActivity.kt: loadPipeline → enqueue worker; new initPipeline(modelDir)
- app/.../DictationActivity.kt
- app/build.gradle.kts: work-runtime-ktx (app needs WorkManager types on its compile classpath)
- app/src/main/AndroidManifest.xml: POST_NOTIFICATIONS

Test plan
- ./gradlew :sdk:testDebugUnitTest passes 26/26 (15 ModelManager + 5 SpeechRecognitionService + 6 ModelDownloadWorker)
- ./gradlew :sdk:assembleDebug is green
- ./gradlew :app:assembleDebug is green
- Force-stop the app mid-download (adb shell am force-stop ...), relaunch: the download should resume from the last .tmp byte.

Notes
- Switching SpeechRecognitionService to use the worker (so Gboard's first invocation doesn't synchronously wait on the download) is done in stacked PR "Worker rollout follow-ups: WM 2.11, .tmp resume, service uses worker" #23.
- Users who have not granted POST_NOTIFICATIONS will not see the progress notification, but the foreground service still runs (Android shows a default "App is running in background" indicator).
- ModelManager deletes the partial .tmp on retry exhaustion, so a worker-level Result.retry() loses the partial bytes for that file. Fixed in stacked PR "Worker rollout follow-ups: WM 2.11, .tmp resume, service uses worker" #23.