Conversation
This fixes the overlapping 3-button navigation toolbar on Android.
Well, this was not a small one-line fix.

The handling of onscreen/offscreen transitions was broken in all kinds of interesting ways:

1. On Android, the rendering surface can go away at any point from the point of view of the JavaScript thread. There's no way to control it.
2. On iOS/macOS the surfaces never go away, but we can't resize them outside the main thread.

The previous commits add an example of a virtualized list that exercises the surface lifecycle quite vigorously.

Oh, and to add to the fun: Dawn WebGPU device is not threadsafe. So it easily crashes even if all we want to do is copy an offscreen texture to the screen, while the JavaScript thread is rendering.

This commit adds a GPU-level lock to all `NativeObject`-based wrappers, so it is automatically taken on any operation. And the UI-thread based logic takes this lock manually. This serializes all the GPU operations.

And because the iOS and Android implementations have diverged quite a bit, the `SurfaceInfo` is now an abstract class `SurfaceBridge` that has platform-specific implementations.
Thanks a lot for making this PR. I was thinking about redesigning the current model to work properly, and I need to think about it some more. I think we can do a single-threaded model with an API à la

I also need to check how to transfer textures from one thread/device to another. This might be how we go from the JS thread canvas (seemingly offscreen) to the onscreen UI thread canvas. Do I have the right intuition there? I'm excited for us to fix this.
I don't think this will work with JavaScript and complicated scenes (e.g. anything with ThreeJS). You'll need to serialize everything to the UI thread. That was my initial plan, but our code uses WGPU for complex 2D renderings and serializing them is a pain.

But... Hmm... What if we made the device itself "borrowable"? To get it, you "lock" it to borrow the device from the main thread. Then you do the rendering and put it back. This can be represented as a closure, like the Web Locks API ( https://developer.mozilla.org/en-US/docs/Web/API/Web_Locks_API ), to cope with exceptions. And this integrates nicely with worklets.

If we want to preserve the current semantics, then in my patch, Android now always renders into an offscreen texture buffer. I don't think there's any other choice for robust Android apps. But the upside is that the JS thread does not need to concern itself with the synchronization. The native surface then picks up the last presented texture when the view becomes attached to a window and gets the hardware surface. This works reasonably well and was easy to implement. The downside is the overhead of one texture-to-texture copy and one additional back buffer for the whole window.

I went a little bit off the deep end with iOS to implement zero-overhead rendering. I'm not sure it's worth it...
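To make the "borrowable device" idea a bit more concrete, here is a minimal TypeScript sketch of the closure-based shape, loosely modeled on how `navigator.locks.request(name, callback)` scopes a Web Lock to a callback. Everything in it is hypothetical: `BorrowableDevice`, `borrow`, and the fake device object are made up for illustration and are not part of this PR or of the library. It also only orders borrowers within a single JavaScript runtime; a real version would presumably have to hold a native lock shared with the UI thread.

```typescript
// Hypothetical sketch: none of these names exist in the library.
type Borrower<D, R> = (device: D) => R | Promise<R>;

class BorrowableDevice<D> {
  // Resolves once the previous borrower has put the device back.
  private tail: Promise<void> = Promise.resolve();

  constructor(private readonly device: D) {}

  // Runs `fn` with exclusive access to the device. The device is returned
  // even if `fn` throws, which is the point of the closure-based shape.
  async borrow<R>(fn: Borrower<D, R>): Promise<R> {
    const previous = this.tail;
    let release!: () => void;
    // Queue ourselves synchronously so later callers wait for us.
    this.tail = new Promise<void>((resolve) => {
      release = resolve;
    });
    await previous;
    try {
      return await fn(this.device);
    } finally {
      release();
    }
  }
}

// Usage: two borrowers that would otherwise interleave are serialized.
const gpu = new BorrowableDevice({ submitted: [] as string[] });

const frameDone = gpu.borrow(async (device) => {
  device.submitted.push("render: begin");
  await Promise.resolve(); // stands in for asynchronous rendering work
  device.submitted.push("render: end");
});

const blitDone = gpu.borrow((device) => {
  device.submitted.push("copy offscreen texture to screen");
});
```

In this sketch the copy step only starts after the render closure has finished, and an exception inside a closure rejects that caller's promise without blocking the next borrower.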
Is there a simple example that we could add to the repo that makes it really easy to show the crash issue?
Yes, the first commit adds a new example: a WebGPU widget inside a virtualized list ("MultiContext"). It's here: 30116d9. Without the fix, it crashes quickly on Android during scrolling or when switching from the main screen to the screen with the list.
Sorry for the big PR. This is still a bit of work-in-progress, but it's now in a state where it doesn't crash my Android emulator and renders all the examples correctly.