Support image previews by grighakobian · Pull Request #747 · readium/swift-toolkit

grighakobian · 2026-03-18T14:16:42Z

Summary

This pull request partially addresses the feature request submitted on readium/mobile#10.

This PR adds TargetElement metadata to PointerEvent, enabling apps to detect when the user interacts with an image element (<img>, <svg>) and to access its frame, source URL, and alt text. The TestApp demonstrates this with a fullscreen image preview viewer.

Navigator changes

JavaScript: Added extractTargetElement() in gestures.js that extracts metadata (bounding rect, tag, src, alt) from the nearest image element under the pointer. Added findNearestImageElement() to walk up the DOM tree.
PointerEvent.TargetElement: New struct with tag, src, alt, and frame properties, available on every PointerEvent. The src URL is relativized against the publication base URL so it can be used directly with Publication.get().
Frame coordinates are converted through the full coordinate space chain (JS → spread view → navigator view).
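
As a rough illustration of the bridge described above, here is a minimal Swift sketch that decodes a JSON payload from the JavaScript layer and shifts the frame into the navigator view's coordinate space. The payload key names, the `TargetElementPayload` type, the `makeTargetElement` helper, and the offset parameters are all assumptions made for this example, and the conversion is reduced to a plain translation; the PR's actual wire format and coordinate chain may differ.

```swift
import Foundation

/// Hypothetical stand-in for the PR's `PointerEvent.TargetElement`.
struct TargetElement: Equatable {
    var tag: String
    var src: String?
    var alt: String?
    var frame: CGRect
}

/// Hypothetical JSON payload sent by the JavaScript layer.
/// The key names are assumptions, not the PR's wire format.
struct TargetElementPayload: Decodable {
    struct Rect: Decodable {
        var x: Double
        var y: Double
        var width: Double
        var height: Double
    }

    var tag: String
    var src: String?
    var alt: String?
    var rect: Rect
}

/// Decodes the payload and translates the frame by the given offsets,
/// which stand in for the spread view's position in the navigator view.
/// Returns nil when the payload cannot be decoded.
func makeTargetElement(fromJSON json: String, offsetX: Double, offsetY: Double) -> TargetElement? {
    guard let payload = try? JSONDecoder().decode(TargetElementPayload.self, from: Data(json.utf8)) else {
        return nil
    }
    let frame = CGRect(
        x: payload.rect.x + offsetX,
        y: payload.rect.y + offsetY,
        width: payload.rect.width,
        height: payload.rect.height
    )
    return TargetElement(tag: payload.tag, src: payload.src, alt: payload.alt, frame: frame)
}
```

A real implementation would also need to account for scaling between the web view and the navigator view, which this sketch ignores.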

TestApp demo

ImagePreviewViewController — fullscreen image viewer with UIScrollView-based pinch-to-zoom (1x–4x).
ImagePreviewTransition — custom spring-based animated transition from the image's in-page position to centered aspect-fit, and back on dismiss.
ImagePreviewNavigationController — wraps the preview and owns the custom transitioning delegate.
Wired via a .tap observer in VisualReaderViewController.

Design decisions

Image-only scope (<img> and <svg>) — Video and audio elements don't benefit from a zoom preview, so detection is limited to image tags to keep the implementation focused.
Single tap trigger — A double-tap detector would add a delay to all tap events. Using a single tap avoids this, consistent with the behavior in Apple Books.
No new gesture event types — The library already provides the InputObserving API, which can be used to detect double-tap, pinch, and other gestures. Rather than adding dedicated gesture events, this PR exposes TargetElement metadata on PointerEvent, giving apps the building blocks to implement custom gesture handling on top of the existing API.
Preview logic in TestApp, not Navigator — The Navigator provides element metadata; how to present it is left to the app.

Previews

iPhone	iPad
iphone.mov	ipad.mov

mickael-menu · 2026-03-27T16:44:09Z

Sources/Navigator/EPUB/EPUBSpreadView.swift

+            // The src from JS is a fully-resolved readium:// URL.
+            // Relativize it against the publication base URL to get a
+            // publication-relative href usable with Publication.get().


The comment is a bit misleading because the HTML document might contain URLs to external resources with http://, or it might already be a relative URL. In any case the code looks correct as it falls back on src when relativize() fails.

Good catch :)
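
The fallback behavior discussed in this thread can be sketched as follows. This is a simplified illustration using string prefixes, not the toolkit's `relativize()` implementation; the `publicationHref` helper and the `readium://publication-id/` base URL are hypothetical names chosen for the example.

```swift
import Foundation

/// Hypothetical helper: returns `src` relative to `baseURL` when it is
/// located under it, and returns `src` unchanged otherwise (external
/// http:// resources, or hrefs that are already relative).
func publicationHref(for src: String, baseURL: URL) -> String {
    let base = baseURL.absoluteString
    // Make sure the prefix ends with a slash so that "pub" does not match "pub2".
    let prefix = base.hasSuffix("/") ? base : base + "/"
    guard src.hasPrefix(prefix) else {
        return src
    }
    return String(src.dropFirst(prefix.count))
}

// Usage with a hypothetical base URL:
let base = URL(string: "readium://publication-id/")!
let internalHref = publicationHref(for: "readium://publication-id/OEBPS/images/cover.jpg", baseURL: base)
let externalHref = publicationHref(for: "http://example.com/figure.png", baseURL: base)
let relativeHref = publicationHref(for: "images/cover.jpg", baseURL: base)
```

Only the first call is relativized; the other two inputs pass through untouched, which matches the three cases raised in the review comment.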

mickael-menu · 2026-03-27T17:33:57Z

TestApp/Sources/Reader/Common/ImagePreview/ImagePreviewNavigationController.swift

+/// A navigation controller wrapper for image preview that owns the custom
+/// transition and forwards `ImagePreviewTransitioning` to its top view
+/// controller.
+final class ImagePreviewNavigationController: UINavigationController {


I'm a bit on the fence regarding the Test App implementation.

As a heads-up, we're phasing out the Test App in favor of a new Swift Playground app. The Test App is not really maintained anymore.

Your implementation looks super nice but we want to keep the Test App / Playground really simple and the code straightforward to test and demonstrate the Readium APIs. Having nice animations and UX is in the scope of the application and not the toolkit. From experience, if we offer more advanced components in the Test App, integrators will copy-paste it and expect support when it breaks or doesn't fit their app, which is why we want to move away from a full-fledged reading test app to a more technical and simple playground.

I would be fine merging your implementation as-is, but it will be removed in the near future.

For a concrete example of a more technical interface, we could display a sheet on top of the navigator (without fancy animations) which contains very raw data about the target element. For example displaying the image but without zooming capabilities. The idea is to keep the code simple so that the actual Readium APIs usage is easy to understand.

Totally agree with you @mickael-menu!

mickael-menu · 2026-03-30T15:01:03Z

Sources/Navigator/Input/Pointer/PointerEvent.swift

+    /// Metadata about the element under the pointer, if available.
+    ///
+    /// This is typically provided by the EPUB navigator's JavaScript bridge
+    /// when the pointer is over a media element (img, svg, video, etc.).
+    public var targetElement: TargetElement?
+
+    /// Metadata about the DOM element under a pointer event, extracted from
+    /// the JavaScript layer.
+    public struct TargetElement: Equatable {
+        /// Tag name of the element (e.g. "img", "svg").
+        public var tag: String
+
+        /// Source URL of the media element, if available.
+        public var src: String?
+
+        /// Alt text of the element, if available.
+        public var alt: String?
+
+        /// Frame of the element relative to the navigator's view.
+        public var frame: CGRect
+
+        public init(tag: String, src: String?, alt: String? = nil, frame: CGRect) {
+            self.tag = tag
+            self.src = src
+            self.alt = alt
+            self.frame = frame
+        }
+    }


There's a lot of potential for this API beyond image zooms. For example you could use it to select a sentence to trigger TTS, or a word to get the definition for.

This API must be usable with other navigators too (e.g. PDF) so it should not reference anything specific to HTML (tag or alt).

As any navigator might implement it, I think a protocol would be a better fit to allow for extensions. In particular, when we don't have a match with a format-agnostic element like "ImageElement", the HTML navigator could return a specific HTMLElement that contains raw HTML tags.

We can take inspiration from the existing ContentElement.

For this particular PR, it's fine if we only implement the ImageElement, as long as we're open for extensions.

Here's a draft as a suggestion:

public struct PointerEvent: Equatable {
    ...
    public var target: (any Element)?
    public var targetFrame: CGRect?

    /// Manually implement Equatable because of `target`.
    public static func == (lhs: PointerEvent, rhs: PointerEvent) -> Bool {
        guard
            lhs.pointer == rhs.pointer,
            lhs.phase == rhs.phase,
            lhs.location == rhs.location,
            ...
        else {
            return false
        }

        switch (lhs.target, rhs.target) {
        case (nil, nil):
            return true
        case let (l?, r?):
            return l.isEqualTo(r)
        default:
            return false
        }
    }

    public protocol Element {
        /// Frame of the element relative to the navigator's view.
        var frame: CGRect { get }

        /// A `Locator` to the target element, so that the app can use navigator.go(to:) on this element.
        /// In the EPUB navigator, this can be computed by generating the CSS selector for the element.
        var locator: Locator { get }

        /// Returns whether the receiver is equivalent to `other`.
        func isEqualTo(_ other: Element) -> Bool
    }

    public extension Element where Self: Equatable {
        func isEqualTo(_ other: Element) -> Bool {
            guard let other = other as? Self else {
                return false
            }
            return self == other
        }
    }

    /// An element referencing an embedded resource.
    /// This is useful if the app has a generic handler for any type of embedded resource (image, audio, video, etc.).
    public protocol EmbeddedResourceElement: ContentElement {
        /// Referenced resource in the publication.
        var embeddedResource: Link { get }
    }

    public struct ImageElement: EmbeddedResourceElement, Equatable {
        public var frame: CGRect
        public var locator: Locator

        /// Link matching the `img.src` in the HTML, taken from the publication.readingOrder.
        /// Useful as a `Link` instead of raw `AnyHREF` or `String` because the client app can
        /// check its `mediaType` to see if it is supported.
        public var embeddedResource: Link

        /// Short piece of text associated with the image (e.g. `alt`).
        public var caption: String?
    }
}

Let me know what you think!
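
To make the type-erased equality in the draft above concrete, here is a small self-contained sketch of the same `isEqualTo` pattern. It is a simplification for illustration only: `Locator` is omitted, `href` is a plain `String` standing in for `Link`, the types are declared at the top level instead of nested, and `HTMLElement` is a hypothetical second conformer.

```swift
import Foundation

protocol Element {
    /// Frame of the element relative to the navigator's view.
    var frame: CGRect { get }

    /// Returns whether the receiver is equivalent to `other`.
    func isEqualTo(_ other: any Element) -> Bool
}

extension Element where Self: Equatable {
    /// Default implementation: elements of different concrete types are
    /// never equal; elements of the same type defer to `==`.
    func isEqualTo(_ other: any Element) -> Bool {
        guard let other = other as? Self else {
            return false
        }
        return self == other
    }
}

struct ImageElement: Element, Equatable {
    var frame: CGRect
    var href: String // simplified stand-in for `Link`
    var caption: String?
}

struct HTMLElement: Element, Equatable {
    var frame: CGRect
    var tag: String
}

struct PointerEvent: Equatable {
    var location: CGPoint
    var target: (any Element)?

    /// `any Element` is not Equatable, so `==` is written by hand.
    static func == (lhs: PointerEvent, rhs: PointerEvent) -> Bool {
        guard lhs.location == rhs.location else {
            return false
        }
        switch (lhs.target, rhs.target) {
        case (nil, nil):
            return true
        case let (l?, r?):
            return l.isEqualTo(r)
        default:
            return false
        }
    }
}
```

With this shape, a navigator can add new element types later without touching `PointerEvent`, as long as each new type is `Equatable`.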

Thanks @mickael-menu for your suggestions. They definitely make sense to me.

I decided to explore whether we can reuse the existing ContentElement API.
After thoroughly reviewing the ContentElement API, the only limitation I found is that it has no frame attribute, which a visual element would need. If we add the target frame to PointerEvent instead, it should be feasible to use the existing ContentElement API. In that case, we can reuse the existing ImageContentElement, AudioContentElement, VideoContentElement, and even TextualContentElement if necessary.

Here’s the updated PointerEvent structure:

public struct PointerEvent {
    …
    public var target: Target?

    public struct Target {
        public var frame: CGRect
        public var element: any ContentElement
    }
}

Or just

public struct PointerEvent {
    …
    public var targetFrame: CGRect?
    public var targetElement: (any ContentElement)?
}

What are your thoughts on this?
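
For illustration, an app consuming the first variant could branch on the concrete element type roughly like this. Everything here is a simplified stand-in written for the example: the `ContentElement`, `ImageContentElement`, and `TextContentElement` declarations are not the toolkit's real types, `embeddedLink` is a plain `String`, and `TapAction` and `action(for:)` are hypothetical names.

```swift
import Foundation

// Simplified stand-ins for the toolkit's content element types.
protocol ContentElement {}

struct ImageContentElement: ContentElement {
    var embeddedLink: String
    var caption: String?
}

struct TextContentElement: ContentElement {
    var text: String
}

struct PointerEvent {
    struct Target {
        var frame: CGRect
        var element: any ContentElement
    }

    var target: Target?
}

/// Hypothetical result type describing what the app should do with a tap.
enum TapAction: Equatable {
    case previewImage(href: String, from: CGRect)
    case ignore
}

/// Only image targets trigger a preview; any other target, or no target
/// at all, is ignored so the navigator can handle the tap as usual.
func action(for event: PointerEvent) -> TapAction {
    guard
        let target = event.target,
        let image = target.element as? ImageContentElement
    else {
        return .ignore
    }
    return .previewImage(href: image.embeddedLink, from: target.frame)
}
```

Keeping the frame on the event's target, not on the element, is what lets the existing element types stay unchanged in this variant.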

grighakobian force-pushed the feature/image-preview branch 2 times, most recently from cf68d97 to cdf6b48 Compare March 18, 2026 14:18

grighakobian marked this pull request as ready for review March 18, 2026 14:21

grighakobian mentioned this pull request Mar 18, 2026

Image Zoom readium/mobile#10

Open

grighakobian closed this Mar 18, 2026

grighakobian reopened this Mar 18, 2026

grighakobian temporarily deployed to LCP March 18, 2026 14:25 — with GitHub Actions Inactive

mickael-menu reviewed Mar 30, 2026

grighakobian force-pushed the feature/image-preview branch 2 times, most recently from 5fa6529 to c5008c1 Compare April 2, 2026 08:52

grighakobian temporarily deployed to LCP April 2, 2026 08:53 — with GitHub Actions Inactive

grighakobian added 12 commits April 3, 2026 16:08

Add double-tap and pinch gesture support to navigators

184be3f

Fix XHTML media element detection and add image zoom viewer

71b1435

Use modern InputObserving API for gesture observers

a2d0404

Remove pinch gesture observer from navigators

f1e4b1f

Replace double-tap with single tap for image zoom

26a3c58

Simplify ImagePreviewViewController layout and zoom behavior

9feded4

Refactor image preview transition

0c70d9a

Fix image preview dismissal transition when image is zoomed

405bc79

Restrict image preview to img and svg elements

02061ca

Refactoring

89c9931

Rename TargetElementInfo to TargetElement

b992a07

Regenerate bundled JS scripts after rebase

7c82eb4

grighakobian force-pushed the feature/image-preview branch from c5008c1 to 7c82eb4 Compare April 3, 2026 12:09

grighakobian temporarily deployed to LCP April 3, 2026 12:09 — with GitHub Actions Inactive

grighakobian requested a review from mickael-menu April 6, 2026 12:26

grighakobian wants to merge 12 commits into readium:develop from grighakobian:feature/image-preview
