Skip to content

Fix Form XObject with extreme BBox values causing blank white page in Chromium#20876

Open
pengkunbin wants to merge 1 commit intomozilla:masterfrom
pengkunbin:fix/form-xobject-extreme-bbox-whitepage
Open

Fix Form XObject with extreme BBox values causing blank white page in Chromium#20876
pengkunbin wants to merge 1 commit intomozilla:masterfrom
pengkunbin:fix/form-xobject-extreme-bbox-whitepage

Conversation

@pengkunbin
Copy link
Contributor

Fixes #20872.

Problem

Some PDFs contain Form XObjects with astronomically large BBox values (e.g. ±8.988e76), intended to express "no clipping / cover the entire page". paintFormXObjectBegin passed these raw coordinates directly to Path2D.rect():

e.g.
<< /Filter /FlateDecode /Type /XObject /Subtype /Form /FormType 1 /BBox [-89884656743115785407263711865852178399035283762922498299458738401578630390014269380294779316383439085770229476757191232117160663444732091384233773351768758493024955288275641038122745045194664472037934254227566971152291618451611474082904279666061674137398913102072361584369088590459649940625202013092062429184 -89884656743115785407263711865852178399035283762922498299458738401578630390014269380294779316383439085770229476757191232117160663444732091384233773351768758493024955288275641038122745045194664472037934254227566971152291618451611474082904279666061674137398913102072361584369088590459649940625202013092062429184 89884656743115785407263711865852178399035283762922498299458738401578630390014269380294779316383439085770229476757191232117160663444732091384233773351768758493024955288275641038122745045194664472037934254227566971152291618451611474082904279666061674137398913102072361584369088590459649940625202013092062429184 89884656743115785407263711865852178399035283762922498299458738401578630390014269380294779316383439085770229476757191232117160663444732091384233773351768758493024955288275641038122745045194664472037934254227566971152291618451611474082904279666061674137398913102072361584369088590459649940625202013092062429184]

Chromium/Skia processes Path2D coordinates as Float32 internally. Values exceeding Float32.MAX (∼3.4e38) overflow to ±Infinity, silently invalidating the clip path and clearing the entire canvas, resulting in a blank white page.

Fix

Mirror the approach already used in beginGroup: transform the bbox to canvas (device) space via Util.axialAlignedBoundingBox, then clamp to the actual canvas bounds with Util.intersect(bounds, canvasBounds). For extreme BBox values this produces a clip rect equal to the full canvas, semantically equivalent and numerically safe.

Before / After

Before:
Clipboard_Screenshot_1773548299

After:
Clipboard_Screenshot_1773548158

File:

issue20872.pdf

the commits have been squashed into a single one @Snuffleupagus

@codecov-commenter
Copy link

codecov-commenter commented Mar 15, 2026

Codecov Report

❌ Patch coverage is 95.31250% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.60%. Comparing base (8f7a615) to head (f772d90).
⚠️ Report is 8 commits behind head on master.

Files with missing lines Patch % Lines
src/display/canvas.js 95.31% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #20876      +/-   ##
==========================================
+ Coverage   62.58%   62.60%   +0.02%     
==========================================
  Files         173      173              
  Lines      121275   121297      +22     
==========================================
+ Hits        75894    75936      +42     
+ Misses      45381    45361      -20     
Flag Coverage Δ
fonttest 7.66% <ø> (ø)
unittestcli 62.57% <95.31%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…flow

When a Form XObject has extreme BBox values (e.g. very large coordinates),
Skia's Float32 arithmetic can overflow causing a blank white page in
Chromium-based browsers. Fix by clamping the BBox to the canvas bounds in
device space before building the clip path, and use the clamped bounds
array indices when calling recordClipBox.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pengkunbin pengkunbin force-pushed the fix/form-xobject-extreme-bbox-whitepage branch from dfd1cab to f772d90 Compare March 15, 2026 13:17
@Snuffleupagus
Copy link
Collaborator

Snuffleupagus commented Mar 15, 2026

Have you actually written this yourself, or is it completely AI generated?

@pengkunbin
Copy link
Contributor Author

Have you actually written this yourself, or is it completely AI generated?

My goal was to solve the white screen issue that my users encountered when using the PDF viewer. I used AI to quickly familiarize myself with the rendering pipeline and traced the root cause to a Skia rendering boundary issue. I now fully understand the entire rendering flow. This was accomplished collaboratively with AI.

@calixteman
Copy link
Contributor

Have you actually written this yourself, or is it completely AI generated?

My goal was to solve the white screen issue that my users encountered when using the PDF viewer. I used AI to quickly familiarize myself with the rendering pipeline and traced the root cause to a Skia rendering boundary issue. I now fully understand the entire rendering flow. This was accomplished collaboratively with AI.

There are several problems with this patch:

  • the pdf is useless: it just draw a white image over a white background so it doesn't show the problem you want to fix
  • the problem with this AI-driven patch is that you're fixing an issue without understanding what you're doing
  • we (pdf.js maintainters) spend some time to check the patchs, review them, ... and we don't have the time to teach to your AI how to write a good patch.

So more generally, please file bugs with useful test cases. If you've any ideas on how the problem can be fixed, please propose them in the bug itself and we can chat with you.
So I'm closing this PR, please update the bug you filed with a useful test case.

@calixteman calixteman closed this Mar 15, 2026
@pengkunbin
Copy link
Contributor Author

pengkunbin commented Mar 15, 2026

I think I've made myself quite clear:

  1. I already familiar with the entire rendering pipeline — these changes were made under my own informed judgment.
  2. The PDF files are included in both the original post and the code — it is not 'a white image over a white background'.
  3. I have also passed all existing tests in the codebase.

If this is how you treat contributors, I wish you the best of luck continuing to maintain pdf.js.

@Snuffleupagus @calixteman

@calixteman
Copy link
Contributor

I've been wrong: the pdf contains some data but none of the viewers I tested were able to render it (Chrome, Firefox, Preview, Acrobat, Foxit) and I, too quickly, looked at the pdf let me think that it was just a white image.
The patch itself is maybe not that bad, I mean just in reading it, it looks like it did what it should, but I'm not super excited by the idea to waste few microsecs for fixing an issue in an ill-formatted pdf that none of the well-know viewers is able to render correctly.
If you're the creator of the pdf, it's something which has to be fixed on your side, if you aren't, you should file a bug on the bug tracker of the software you used. If none of these 2 solutions is acceptable, please explain us why we should accept this.

@calixteman calixteman reopened this Mar 15, 2026
@pengkunbin
Copy link
Contributor Author

I've been wrong: the pdf contains some data but none of the viewers I tested were able to render it (Chrome, Firefox, Preview, Acrobat, Foxit) and I, too quickly, looked at the pdf let me think that it was just a white image. The patch itself is maybe not that bad, I mean just in reading it, it looks like it did what it should, but I'm not super excited by the idea to waste few microsecs for fixing an issue in an ill-formatted pdf that none of the well-know viewers is able to render correctly. If you're the creator of the pdf, it's something which has to be fixed on your side, if you aren't, you should file a bug on the bug tracker of the software you used. If none of these 2 solutions is acceptable, please explain us why we should accept this.

I maintain a PDF viewer built on pdf.js with over a million daily active users. These unusual PDFs were reported to me by real users.

I respect every single one of my users. When they encounter issues reading a PDF, I always look for the cause in the code. Telling them "your PDF is broken" — don't you think that's a bit too arrogant?

From a behavioral standpoint, the canvas receives invalid data before rendering, which causes the entire render to fail. Yet other layers (e.g. the text layer) render just fine. This inconsistency is confusing and feels like a bug.

This change simply adds data validation to ensure rendering stays within canvas's capability bounds. If you feel the performance aspect needs optimization, I'm happy to discuss or you're welcome to modify my code directly.

Anyway, I'm here to make pdf.js better.

@calixteman
Copy link
Contributor

I respect every single one of my users. When they encounter issues reading a PDF, I always look for the cause in the code.
Nice but don't you think we do the same ?
FYI, in the last months we've a bit too much AI generated patches: we spent/wasted some time to review them, test them and with nobody behind to take into account our feedback... So, just speaking for me: it can be boring.

Telling them "your PDF is broken" — don't you think that's a bit too arrogant?

Over the years, we spent a lot of time to fix some issues because of broken PDFs. I don't see any problem to tell you that the pdf is broken when as a matter of fact, the pdf is broken. And sometimes, as explained in my previous comment, it's a better idea to try to fix the root cause of the problem (i.e. fix the producer instead of fixing the reader).
Unfortunately, if we check whatever number we've everywhere because it can be wrong... the rendering will take forever.

Anyway, I don't want to have whatever argument with you or any people, I've more interesting to do.
I'm sorry to have closed your PR, but I reopen it, I explained that I made something wrong, what do you want I do ? Don't you make any errors ? So if your goal is to blame me... please forget !
It's my last comment about that: either we just speak code or nothing.

And to speak about code, I'm not opposed to figure out a solution, I'd really prefer avoiding to do that when drawing.

@pengkunbin
Copy link
Contributor Author

OK Here's my rationale:

  1. Following the layered architecture, this is a rendering engine adaptation concern, so I intentionally kept the logic out of the data parsing layer.
  2. The intent of the code is to ensure data stays within the canvas bounds — rather than hardcoding a magic number for the check, this approach is more accurate.
  3. I've seen similar defensive programming patterns already present in the rendering layer (e.g., // See https://bugzilla.mozilla.org/show_bug.cgi?id=726227, // See bug 1820511 (Windows specific bug).).

If you think this code change is too heavy, I can modify it to:

const MAX_CLIP_COORD = 3.4e38;                                                                 
  function clampCoord(v) {
    return Math.max(-MAX_CLIP_COORD, Math.min(MAX_CLIP_COORD, v));
  }

What do you think? Looking forward to your guidance. @calixteman

@pengkunbin
Copy link
Contributor Author

Cut to the chase @calixteman

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Form XObject with extreme BBox values causes blank white page in Chromium-based browsers

5 participants