[api-minor] Replace the `CMapReaderFactory`, `StandardFontDataFactory`, and `WasmFactory` API options with a single factory/option by Snuffleupagus · Pull Request #20919 · mozilla/pdf.js

Snuffleupagus · 2026-03-19T15:16:50Z

[api-minor] Replace the CMapReaderFactory, StandardFontDataFactory, and WasmFactory API options with a single
factory/option

Currently we have no less than three different, but very similar, factories for reading built-in CMap files, standard font files, and wasm files on the main-thread.[1]
These factories were added at different points in time, since I cannot imagine that we'd add essentially three copies of the same code otherwise.

Nowadays these factories are often not even used[2], since worker-thread fetching is used whenever possible to improve performance. In particular, they will only be used when either:
- The PDF.js library runs in Node.js environments.
- The user manually sets useWorkerFetch = false when calling getDocument.
- The user provides custom CMapReaderFactory, StandardFontDataFactory, and/or WasmFactory instances when calling getDocument.
Thanks to the use of import maps the impact of these separate factories on the total bundle-size is mostly avoided in the Firefox PDF Viewer, however there's still some amount of effectively dead code present there.
In order to reduce the number of getDocument options a little bit, this patch replaces the existing options with a single new BinaryDataFactory option instead.

A question is obviously if the option of providing custom factories to replace the default ones is actually utilized[3], by third-party PDF.js users, however that's impossible to know and outright removing the API option is probably too risky?

Please note: Any existing third-party code using the old factories should still work[4], since we (for now) fallback to use those if provided.

[1] Any new functionality could easily lead to more such factories being added in the future, which wouldn't be great.

[2] Note that the Firefox PDF Viewer will never use these factories, since it "forcibly" sets useWorkerFetch = true during building.

[3] Originally I recall that we only provided browser versions of these factories, and thus Node.js users needed to implement custom ones. However, this hasn't been the case for many years.

[4] Barring any bugs in the patch, of course.
Remove explicit name/filename validation in various BaseBinaryDataFactory methods

Given that the BaseBinaryDataFactory class is only used from the "FetchBinaryData" message handler, the name/filename parameters should never actually be missing here.
Besides, a missing name/filename would result in a "nonsense" URL and the actual data fetching would then fail instead. Hence it just doesn't seem necessary to keep this extra bit of validation now.
Simplify the main-thread reading of "raw" built-in CMap files

It's not entirely clear to me why we need to special-case reading of these, given that:
- The bcmap format has been used for over a decade now, see PR CMaps binary packing #4470.
- The bcmap files have not been updated even once, since their introduction.
- The "raw" CMap files only contain regular ASCII characters.
- Fetching the bytes-data directly should just work, which I've confirmed by successfully testing this patch with test/pdfs/mao.pdf and the "raw" CMap files in PR Adds built in CMaps and unifies the glyph mapping. #4259.

codecov-commenter · 2026-03-19T15:19:59Z

Codecov Report

❌ Patch coverage is 76.47059% with 48 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.59%. Comparing base (ff1af5a) to head (ed12eda).

Files with missing lines	Patch %	Lines
src/display/binary_data_factory.js	73.94%	37 Missing ⚠️
src/display/api.js	80.95%	8 Missing ⚠️
src/core/evaluator.js	84.61%	2 Missing ⚠️
src/core/jpx.js	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #20919      +/-   ##
==========================================
- Coverage   62.65%   62.59%   -0.06%     
==========================================
  Files         173      171       -2     
  Lines      121383   121284      -99     
==========================================
- Hits        76049    75922     -127     
- Misses      45334    45362      +28

Flag	Coverage Δ
fonttest	`7.66% <0.00%> (ø)`
unittestcli	`62.57% <76.47%> (-0.06%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…`, and `WasmFactory` API options with a single factory/option Currently we have no less than three different, but very similar, factories for reading built-in CMap files, standard font files, and wasm files on the main-thread.[1] These factories were added at different points in time, since I cannot imagine that we'd add essentially three copies of the same code otherwise. Nowadays these factories are often not even used[2], since worker-thread fetching is used whenever possible to improve performance. In particular, they will *only* be used when either: - The PDF.js library runs in Node.js environments. - The user manually sets `useWorkerFetch = false` when calling `getDocument`. - The user provides custom `CMapReaderFactory`, `StandardFontDataFactory`, and/or `WasmFactory` instances when calling `getDocument`. Thanks to the use of import maps the impact of these separate factories on the total bundle-size is *mostly* avoided in the Firefox PDF Viewer, however there's still some amount of effectively dead code present there. In order to reduce the number of `getDocument` options a little bit, this patch replaces the existing options with a single new `BinaryDataFactory` option instead. A question is obviously if the option of providing *custom factories* to replace the default ones is actually utilized[3], by third-party PDF.js users, however that's impossible to know and outright removing the API option is probably too risky? *Please note:* Any existing third-party code using the old factories should still work[4], since we (for now) fallback to use those if provided. --- [1] Any new functionality could easily lead to more such factories being added in the future, which wouldn't be great. [2] Note that the Firefox PDF Viewer will never use these factories, since it "forcibly" sets `useWorkerFetch = true` during building. [3] Originally I recall that we only provided browser versions of these factories, and thus Node.js users needed to implement custom ones. However, this hasn't been the case for many years. [4] Barring any bugs in the patch, of course.

…taFactory` methods Given that the `BaseBinaryDataFactory` class is only used from the "FetchBinaryData" message handler, the `name`/`filename` parameters should never actually be missing here. Besides, a missing `name`/`filename` would result in a "nonsense" URL and the actual data fetching would then fail instead. Hence it just doesn't seem necessary to keep this extra bit of validation now.

It's not entirely clear to me why we need to special-case reading of these, given that: - The `bcmap` format has been used for over a decade now, see PR 4470. - The `bcmap` files have not been updated even once, since their introduction. - The "raw" CMap files only contain regular ASCII characters. - Fetching the bytes-data directly should just work, which I've confirmed by successfully testing this patch with `test/pdfs/mao.pdf` and the "raw" CMap files in PR 4259.

This way we can build the complete `filename` directly in the worker-thread, which simplifies the CMap fetching in the `BaseBinaryDataFactory` class and improves overall consistency there.

Snuffleupagus added the core label Mar 19, 2026

Snuffleupagus temporarily deployed to code-coverage March 19, 2026 15:16 — with GitHub Actions Inactive

This comment was marked as off-topic.

Sign in to view

Snuffleupagus force-pushed the BinaryDataFactory branch from d9cdc86 to ee90931 Compare March 19, 2026 20:40

Snuffleupagus temporarily deployed to code-coverage March 19, 2026 20:40 — with GitHub Actions Inactive

Snuffleupagus force-pushed the BinaryDataFactory branch from ee90931 to 20cc07c Compare March 19, 2026 21:47

Snuffleupagus temporarily deployed to code-coverage March 19, 2026 21:47 — with GitHub Actions Inactive

mozilla deleted a comment from moz-tools-bot Mar 20, 2026

Snuffleupagus force-pushed the BinaryDataFactory branch from 20cc07c to 6ec05d5 Compare March 20, 2026 09:12

Snuffleupagus temporarily deployed to code-coverage March 20, 2026 09:12 — with GitHub Actions Inactive

Snuffleupagus temporarily deployed to code-coverage March 20, 2026 11:07 — with GitHub Actions Inactive

Snuffleupagus had a problem deploying to code-coverage March 20, 2026 11:07 — with GitHub Actions Failure

Snuffleupagus temporarily deployed to code-coverage March 20, 2026 11:07 — with GitHub Actions Inactive

Snuffleupagus force-pushed the BinaryDataFactory branch from f92a46f to ef15e5b Compare March 20, 2026 11:08

Snuffleupagus temporarily deployed to code-coverage March 20, 2026 11:09 — with GitHub Actions Inactive

Snuffleupagus had a problem deploying to code-coverage March 20, 2026 11:09 — with GitHub Actions Failure

Snuffleupagus temporarily deployed to code-coverage March 20, 2026 11:09 — with GitHub Actions Inactive

Snuffleupagus force-pushed the BinaryDataFactory branch from ef15e5b to 83b3ae5 Compare March 20, 2026 11:37

Snuffleupagus had a problem deploying to code-coverage March 20, 2026 11:38 — with GitHub Actions Failure

Snuffleupagus temporarily deployed to code-coverage March 20, 2026 11:38 — with GitHub Actions Inactive

Snuffleupagus added 3 commits March 20, 2026 12:40

Snuffleupagus force-pushed the BinaryDataFactory branch from 83b3ae5 to 1a5877f Compare March 20, 2026 11:40

Snuffleupagus had a problem deploying to code-coverage March 20, 2026 11:40 — with GitHub Actions Failure

Snuffleupagus temporarily deployed to code-coverage March 20, 2026 11:40 — with GitHub Actions Inactive

Snuffleupagus force-pushed the BinaryDataFactory branch from 1a5877f to 58aa162 Compare March 20, 2026 12:06

Snuffleupagus temporarily deployed to code-coverage March 20, 2026 12:06 — with GitHub Actions Inactive

Send the cMapPacked API option to the worker-thread

ed12eda

This way we can build the complete `filename` directly in the worker-thread, which simplifies the CMap fetching in the `BaseBinaryDataFactory` class and improves overall consistency there.

Snuffleupagus force-pushed the BinaryDataFactory branch from 58aa162 to ed12eda Compare March 20, 2026 12:25

Snuffleupagus temporarily deployed to code-coverage March 20, 2026 12:25 — with GitHub Actions Inactive

Snuffleupagus closed this Mar 20, 2026

Snuffleupagus deleted the BinaryDataFactory branch March 23, 2026 13:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[api-minor] Replace the `CMapReaderFactory`, `StandardFontDataFactory`, and `WasmFactory` API options with a single factory/option#20919

[api-minor] Replace the `CMapReaderFactory`, `StandardFontDataFactory`, and `WasmFactory` API options with a single factory/option#20919
Snuffleupagus wants to merge 4 commits intomozilla:masterfrom
Snuffleupagus:BinaryDataFactory

Snuffleupagus commented Mar 19, 2026 •

edited

Loading

Uh oh!

codecov-commenter commented Mar 19, 2026 •

edited

Loading

Uh oh!

This comment was marked as off-topic.

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Snuffleupagus commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov-commenter commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

This comment was marked as off-topic.

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Snuffleupagus commented Mar 19, 2026 •

edited

Loading

codecov-commenter commented Mar 19, 2026 •

edited

Loading