LDEV-2968 v3 openpdftohtml#62
Open
zspitzer wants to merge 44 commits into
- optimize: remove bookmarks, metadata, JS, attachments, thumbnails, comments, forms, links
- sanitize: security-focused removal of dangerous elements (JS, attachments, metadata, link actions)
- addStamp: delegates to watermark for image-based stamps
- srcfile/src without tag body now triggers rendering via doEndTag
- getBaseUrl() handles Lucee Resources, not just java.io.File
- Empty body no longer overrides srcfile content
- Encryption constants are now distinct; AES-128 uses setPreferAES
- Page ranges like "3-" resolve actual page count instead of -1
- "printing" permission maps to ALLOW_PRINTING, remove dead duplicate
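
To illustrate the open-ended page range fix, here is a minimal sketch of resolving a spec such as "3-" against the real page count. The class and method names (`PageRangeSketch`, `resolve`) are hypothetical and the extension's actual parser is not shown in this PR text; the validation rules are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not the extension's actual class. */
final class PageRangeSketch {

    /** Expands a spec such as "1,3-5,8-" into 1-based page numbers. */
    static List<Integer> resolve(String spec, int pageCount) {
        List<Integer> pages = new ArrayList<>();
        for (String part : spec.split(",")) {
            String p = part.trim();
            if (p.isEmpty()) continue;
            int dash = p.indexOf('-');
            int from;
            int to;
            if (dash < 0) {
                from = to = Integer.parseInt(p);
            } else {
                String left = p.substring(0, dash).trim();
                String right = p.substring(dash + 1).trim();
                // An open start ("-4") is assumed to mean "from page 1".
                from = left.isEmpty() ? 1 : Integer.parseInt(left);
                // An open end ("3-") resolves to the real last page, not a -1 sentinel.
                to = right.isEmpty() ? pageCount : Integer.parseInt(right);
            }
            if (from < 1 || to > pageCount || from > to) {
                throw new IllegalArgumentException(
                    "invalid page range [" + p + "] for a document with " + pageCount + " pages");
            }
            for (int i = from; i <= to; i++) pages.add(i);
        }
        return pages;
    }
}
```

With a five-page document, `PageRangeSketch.resolve("3-", 5)` yields pages 3, 4 and 5.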
- Cache getInfo() result to avoid re-parsing PDF on every struct access
- Close source PDDocuments in concat(), use try-with-resources in toImage()
- Fix InputStream leak in PDFForm.loadPDDocument() on error
- Remove pd4ml.jar, ss_css2.jar, .flattened-pom.xml
- Consolidate handlePageNumbers() into processPageVariables()
- Remove dead getMultipleHF() and multi-render loop
- Remove empty writeImages() stub
- Require action attribute on cfpdf (was silent no-op)
- Implement setFilter() for directory merge glob patterns
- Fix FONT_EMBED_SELECCTIVE typo
- Fix setMimetype() discarding normalised value
- Update fontdirectory TLD description
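
For the glob filter on directory merges, one possible way to turn a glob into a regular expression is sketched below. `GlobSketch` is a hypothetical name, and supporting only the `*` and `?` wildcards is an assumption; the extension's real setFilter() code is not shown here. Literal runs are escaped with `Pattern.quote` so that characters such as `.` or `+` in a file name are not treated as regex operators.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not the extension's actual code. */
final class GlobSketch {

    /** Converts a simple glob ('*' and '?' only) into a regex meant for full-string matching. */
    static Pattern toPattern(String glob) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*' || c == '?') {
                flushLiteral(literal, regex);
                // '*' matches any run of characters, '?' exactly one character.
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        flushLiteral(literal, regex);
        return Pattern.compile(regex.toString());
    }

    /** Appends the pending literal run, quoted so regex metacharacters stay literal. */
    private static void flushLiteral(StringBuilder literal, StringBuilder regex) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    /** True when the whole file name matches the glob. */
    static boolean matches(String glob, String fileName) {
        return toPattern(glob).matcher(fileName).matches();
    }
}
```

Because the dot is quoted, `GlobSketch.matches("*.pdf", "reportxpdf")` is false while `GlobSketch.matches("*.pdf", "report.pdf")` is true.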
Routes render logging through Lucee's pdf log when defined. No-op if the log isn't configured in admin.
cfdocument src with proxyserver/proxyport now works.
Verifies that an invalid proxy raises an error on remote fetch and is ignored for local content.
- Prevent XXE in PDFForm XML parsing (disable DTDs and external entities)
- Sanitise saveAsName in Content-Disposition header (strip CR/LF/quotes)
- Fix LuceeLogHandler accumulation (only attach once per JVM)
- Escape extracted text in XML output (extractText type=xml)
- Fix setFilter glob-to-regex using Pattern.quote for literal chars
- Map thumbnail scale (1-100) to DPI (3-300) instead of using raw value
- Remove dead PDF2Image.java
- Enable checkFileLocation on cfdocument filename attribute
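
For the XXE item, a common way to disable DTDs and external entities with the JDK's built-in parser is sketched below. `SafeXmlSketch` and its methods are hypothetical names; the exact set of features applied in PDFForm is not shown in this PR text, so treat this as one reasonable configuration rather than the change itself.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper for illustration; not the extension's actual class. */
final class SafeXmlSketch {

    /** Builds a DOM parser that refuses DOCTYPE declarations and external entities. */
    static DocumentBuilder newHardenedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Rejecting any DOCTYPE closes off most XXE vectors in one step.
        f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defence in depth in case DOCTYPEs are ever re-enabled.
        f.setFeature("http://xml.org/sax/features/external-general-entities", false);
        f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        f.setXIncludeAware(false);
        f.setExpandEntityReferences(false);
        return f.newDocumentBuilder();
    }

    /** Returns the root element name, or null when the parser rejects the input. */
    static String rootNameOrNull(String xml) {
        try {
            Document doc = newHardenedBuilder().parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return null;
        }
    }
}
```

With this configuration, a plain document parses normally, while any input carrying a `<!DOCTYPE ...>` declaration is rejected before entities are resolved.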
- DocumentSection.setMimetype() now passes normalised value (was discarded)
- PDFPageMark.getHtmlTemplate() delegates to getHtml() for bounds safety
- ApplicationSettings.init is now volatile
- Remove FontsJarExtractor.main() debug method
- Replace e.printStackTrace() calls with comments
…onts.jar
- Strip path components from PDF attachment filenames to prevent directory traversal
- Set 15s timeout on JSoup URL fetching
- Remove dead fonts.jar from res/ (2.4MB, loaded from classloader not res/)
Use OpenHTMLToPDF's native <bookmarks> element for PDF outline generation, replacing the post-render hack that pointed all bookmarks to section start pages. Bookmarks now resolve to exact rendered page positions for explicit bookmarks, HTML headings, and section names. cfpdf merge now preserves and remaps bookmarks from all source PDFs with correct page offsets, and filters out bookmarks for excluded pages.
Renders content onto a larger page proportional to the scale factor, then uses PDFBox to scale pages back to target dimensions.
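
The arithmetic behind this can be sketched as follows, assuming the scale is a percentage and the intermediate page is the target size divided by scale/100 (that reading of "proportional to the scale factor" is an assumption, and `ScaleSketch` is a hypothetical name). The PDFBox step that shrinks the rendered page is left out; only the dimensions are computed.

```java
/** Hypothetical helper for illustration; the PDFBox scaling step is omitted. */
final class ScaleSketch {

    /**
     * Returns {renderWidth, renderHeight, shrinkFactor} for a target page size in points
     * and a scale in percent. Content laid out on the larger render page and then shrunk
     * by shrinkFactor ends up at the target page size.
     */
    static double[] plan(double targetWidthPt, double targetHeightPt, int scalePercent) {
        if (scalePercent <= 0) {
            throw new IllegalArgumentException("scale must be greater than 0");
        }
        double factor = scalePercent / 100.0;
        return new double[] { targetWidthPt / factor, targetHeightPt / factor, factor };
    }
}
```

For example, a 595 x 842 pt page at scale 50 would be laid out on a 1190 x 1684 pt page and then shrunk by a factor of 0.5.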
IsPDFArchive now validates pdfaid:part in XMP metadata instead of just checking if the file is a valid PDF. getInfo() includes PDFAVersion key. Register IsPDFArchive as a standalone function in function.fld.
Allow self-closing unknown HTML tags for compatibility with real-world HTML
Accepts a Component (with onResourceFetch method) or UDF to intercept image/CSS/resource fetching. Returns content or null to fall through to default. Wired into both src fetching and OpenHTMLToPDF rendering.
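
The fall-through contract could look roughly like this when expressed in Java. The real handler is a CFML component or UDF; `ResourceHandlerSketch`, `ResourceFetchSketch` and the `defaultFetcher` parameter are hypothetical names used only to show the "content or null" behaviour.

```java
import java.util.function.Function;

/** Hypothetical Java stand-in for the CFML handler; returns content, or null to decline. */
interface ResourceHandlerSketch {
    byte[] onResourceFetch(String url);
}

/** Hypothetical dispatcher for illustration; not the extension's actual wiring. */
final class ResourceFetchSketch {

    /** Asks the handler first and falls through to the default fetcher when it returns null. */
    static byte[] fetch(String url, ResourceHandlerSketch handler, Function<String, byte[]> defaultFetcher) {
        if (handler != null) {
            byte[] content = handler.onResourceFetch(url);
            if (content != null) {
                return content; // handler supplied the resource
            }
        }
        return defaultFetcher.apply(url); // null from the handler means "use the default"
    }
}
```

A handler that only serves stylesheets would return bytes for URLs ending in `.css` and null for everything else, leaving images to the default fetcher.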
Bumps test.bat to jdk-11.0.30 / jdk-21.0.10 and 7.1/snapshot/light.
Adds -fs-table-paginate: paginate to the default OpenHTMLToPDF stylesheet so tables break across pages instead of dropping rows.
The handler now receives (url, parsedUrl) where parsedUrl is a struct with protocol, host, port, path, query, fragment. CFC handlers can still declare onResourceFetch(url) and ignore the second arg.
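
As a rough picture of what parsedUrl might contain, the sketch below splits a URL into the six keys named above using `java.net.URI`. How the extension actually builds the struct is not shown in this PR text; `ParsedUrlSketch` is a hypothetical name, and reporting -1 for a missing port is simply `URI.getPort()` behaviour, not a documented property of the handler.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not the extension's actual code. */
final class ParsedUrlSketch {

    /** Splits a URL into protocol, host, port, path, query and fragment. */
    static Map<String, Object> parse(String url) {
        URI u = URI.create(url); // throws IllegalArgumentException on malformed input
        Map<String, Object> parsed = new LinkedHashMap<>();
        parsed.put("protocol", u.getScheme());
        parsed.put("host", u.getHost());
        parsed.put("port", u.getPort()); // -1 when the URL has no explicit port
        parsed.put("path", u.getPath());
        parsed.put("query", u.getQuery());
        parsed.put("fragment", u.getFragment());
        return parsed;
    }
}
```

For `https://example.com:8443/img/logo.png?v=2#top` this gives protocol `https`, host `example.com`, port `8443`, path `/img/logo.png`, query `v=2` and fragment `top`.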
setScale now throws for values <= 0 (was < 0), matching the documented 1-100 range. Adds TODO note where mergeDocuments() can NPE on form fields with null font names — PDFBOX-5963, fixed in 3.0.8.
DocumentRendering.cfc — 24 specs covering basic rendering, HTML entities, unicode, CSS, page breaks, page-size dimensions, images, HTML-to-AcroField conversion, and error handling. One skip for the checkbox case waiting on PDFBOX-5963.
Both files were previously fully skipped because they needed a PDF with AcroFields. The fixture is now generated from a cfdocument with <input type=text>, so all 9 populate specs and 6 read specs run.
Tests:
- PDFWatermark: verifies watermark image is actually embedded via extractImage. removeWatermark stub is documented + skipped.
- PDFThumbnail: verifies scale produces different-sized images.
- PDFExtractImages: checks imagePrefix is honoured + extracted file is a valid image, multi-image case.
- PDFRemoveAttachments: verifies attachments are gone via extractAttachments round-trip.

Java:
- Extract path-traversal sanitizer to PDFUtil.sanitizeFilename, add null-byte rejection. Used by extractAttachments.
- Fix InputStream leak in PDFForm.loadFromResource: switch to try-with-resources, the buffer copies bytes but never owns the stream.
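
A sanitizer along the lines described for PDFUtil.sanitizeFilename might look like the sketch below. The real method body is not shown in this PR text, so `FilenameSanitizerSketch` is a hypothetical stand-in, and rejecting empty, `.` and `..` results is an added assumption.

```java
/** Hypothetical stand-in for illustration; not the actual PDFUtil.sanitizeFilename. */
final class FilenameSanitizerSketch {

    /** Reduces an attachment name to its last path segment and rejects null bytes. */
    static String sanitizeFilename(String name) {
        if (name == null || name.indexOf('\0') >= 0) {
            throw new IllegalArgumentException("invalid attachment filename");
        }
        // Drop everything up to the last separator, handling both '/' and '\'.
        int cut = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        String base = name.substring(cut + 1).trim();
        if (base.isEmpty() || base.equals(".") || base.equals("..")) {
            throw new IllegalArgumentException("invalid attachment filename");
        }
        return base;
    }
}
```

An embedded name such as `../../etc/passwd` is reduced to `passwd`, so the extracted file stays inside the target directory.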
… resourceHandler docs
https://luceeserver.atlassian.net/browse/LDEV-2968