Conversation
Thanks for the PR!
Okay, played around with it a little. |
I played around with it a little bit as well, and using Nsight I could at least get as far as seeing that rendering a background with a skybox causes some amount of corruption to get into the main framebuffer. I haven't yet been able to see where it's coming from (rendering a skybox should be one of the simpler things, just some basic geo and a couple of textures, no lighting), but, well, there it is.
i'm on the Hard Light Discord server; i'm "mara" there. i'm not very active on Discord, but happy to discuss.
To be honest, i've only been replaying the retail campaign with it, so code paths not exercised there will be less tested (or not tested at all). Please let me know which mods this happens with!
Thanks for trying. i've installed NVIDIA Nsight but couldn't get it to report any issues in the level i tried (and i'd been using Vulkan's validation layer as well as RenderDoc during development, so it should be clean). Can you send me the level file this happens with? And the messages that i should watch for?
My testing was done using the latest MediaVPs mod, running both the first mission and using the lab environment. RenderDoc capture available here: https://drive.google.com/file/d/1ficdGUP-e8xfmUjZWAzWZmwpa9aAtFmt/view?usp=drive_link
I was similarly running the MediaVPs' first mission (where I got the artifacts), and I was testing the "Icarus" cutscene from Blue Planet (which crashes on trying to render the opening movie, skippable with
I'm not in any position to ask, but instead of using the Vulkan lib and headers currently installed on the host system, maybe it's better to use a glad2 loader for Vulkan, in the same way as it is done for OpenGL?
Wow, nice work. Pretty straightforward design, nothing surprising. I have a local WIP DX12 implementation I've been working on and off on in my spare time for general practice, and I see a lot of decisions here in your VK implementation that are similar to mine. I kind of wonder if we need to double-buffer the immediate buffer so that we leave alone the one that's in flight. But maybe it doesn't matter if the fence in the command buffer submission and flip takes care of everything. Or is it the buffer manager that keeps track of the frame num? Surprised that my batching code made it out intact. Also surprised that my render-primitives immediate code made it out intact. Sorry if it caused any headaches.
I'll put this in here for reference in case anyone is interested. I did try to see if I could change it to use the glad2 loader instead. As I expected, since it is using vulkan.hpp it is using the Vulkan C++ bindings, while the glad2 loader has the C bindings, in the exact same way as the OpenGL version. So it's not a huge amount of work to change, but it is still considerable work to change all the bindings (like 2-3 days), and it is some work just to get it to compile again, not knowing whether it is still going to work after that. I also got the current PR version to compile for Android by just adding the missing .hpp Vulkan headers to the Android NDK; not elegant, as I'm adding stuff to the toolchain, but it will do for now. Buuuuut it does not compile for 32 bits (x86/arm32), though it does for x86_64/arm64; not sure if this is also the case for regular builds. On my phone with a Mali-G57. On my Retroid G2 handheld with a Qualcomm G2 and an Adreno 22 GPU, it fails to init Vulkan because it lacks a transfer queue. I guess it is VK_QUEUE_TRANSFER_BIT? So it's not completely 1.1; it uses an optional extension/feature.
Thanks. The renderdoc capture should be helpful for reproduction.
It does. This is handled purely in the Vulkan layer. The buffer manager double-buffers all dynamic and streaming buffers in
Hahah it wasn't too bad!
Will look into it. It seems it would be way easier to vendor vulkan.hpp instead of switching to the C bindings, so i'll go for that first.
Okay. i've bundled the Vulkan and Vulkan-CPP headers in With this, it should be possible to build it on (or for) platforms without the Vulkan library and headers installed.
Trying to get it to pass the CI now. Will squash all these changes into the main (or otherwise original) commit when done.
i'm not happy with where clang-tidy is taking some of these. It first wants to make these functions static (because it could), and now it wants to refer to them by fully qualified class name instead of through the instance:

```diff
- auto* texSlot = texManager->getTextureSlot(handle);
+ auto* texSlot = graphics::vulkan::VulkanTextureManager::getTextureSlot(handle);
- drawManager->stencilClear();
+ graphics::vulkan::VulkanDrawManager::stencilClear();
```

Which is strictly correct, but it's also less readable, and asymmetric with the rest of the API. Will see if

Edit: it did. Will look into rendering issues next.
Force-pushed from e5c9a34 to 61f890e.
Hi, Shivansps here, I'm on a different account. I think I know why it says there is no transfer queue on the Adreno driver. I think this `if` here is wrong:

```cpp
if (!values.transferQueueIndex.initialized && queue.queueFlags & vk::QueueFlagBits::eTransfer) {
```

According to the documentation: "All commands that are allowed on a queue that supports transfer operations are also allowed on a queue that supports either graphics or compute operations. Thus, if the capabilities of a queue family include VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT, then reporting the VK_QUEUE_TRANSFER_BIT capability separately for that queue family is optional." eGraphics (and eCompute) queues all support transfer but may not report it.
Good catch. Yes, the logic there is wrong. "It worked on NVidia" 😊 Will fix. Edit: Mind that the transfer queue is currently unused, as this makes the upload code simpler: there is no cross-queue synchronization requirement. In the current design there wouldn't be a benefit to using it, just overhead, as there's (AFAIK) no way to exploit parallelism here. So we could even decide to remove the check for it entirely.
i've pushed a few rendering corruption fixes. Some wrong assumptions about renderpass state, and Vulkan vs GL differences. The cubemap corruption and random framebuffer noise should be solved now. |
Just reporting back here, the change to the transfer queue selection did work. Now the Adreno GPU works and can get into the game. |
Today i saw two things:
1. Changing to C++ types fixes 32-bit compilation: `struct VulkanAllocation {`
2. While this compiles, it's not going to work, or is going to have additional issues, as VulkanPipeline.cpp has shifts that go out of range for 32-bit types.
While not an immediate priority, I'd love to question the following design goal: Long/medium term, I would like FSO to ship with shadertool or something similar, to allow it to compile to SPIR-V itself. This gets rid of a lot of issues here. First, we'd be able to keep text-based shaders that can be dual-use for OpenGL and Vulkan. Any incompatibilities can just be put in preprocessor blocks like main-f's prereplace, allowing full dual use of all shaders. Furthermore, it'd allow table-able postprocessing and shader changes. While currently a full shader replace is necessary for custom shaders, I eventually want this to be properly modular, so being able to modify parts of shaders is a goal, and that for sure requires compilation on the fly. Shipping with shadertool and then compiling all available shaderfiles to SPIR-V on load (ideally after game-settings.tbl, especially since the recent Z-Compress changes) shouldn't be that hard either.
@Shivansps @GamingCity @The-E @BMagnu |
Please, I don't want to make you waste time on Android testing; it's not even a working platform yet. I'll post if I can find out something. If you want to see, I have [Fso_Android_Wrapper](https://github.com/Shivansps/Fso_Android_Wrapper) as the Android test app, Fso-Android-Prebuilts where I have the script and instructions to build the FSO dependencies and FSO itself, and an "android-build-vulkan" branch on my fork where I added this PR to my previous Android work. I did find one problem with Android in VulkanRenderer It seems that if you leave it at that and use I changed it to this, which did work:

```cpp
auto supported = deviceValues.surfaceCapabilities.supportedTransforms;
```

I don't know if that's the right fix; it does not seem to do anything on Windows. Keep in mind I used an AI to point me to this and the potential fix, as I did not know if anything in Vulkan could cause this. It told me to check where the preTransform and surface capabilities are set for the transform, and that I should use the eIdentity flag. https://docs.vulkan.org/refpages/latest/refpages/source/VkSurfaceTransformFlagBitsKHR.html
Fair enough re: compiling shaders and large dependencies, but I think it is worth it here.
Force-pushed from 4cd808a to 25d4c43.
The most recent commit unifies the GL and Vulkan shaders (as much as possible). It also uses shader variants throughout, just like GL. Apparently i need to rebase. i think i'll first squash the Vulkan backend back into one commit, to avoid back-and-forth changes (e.g. the changes to

Edit: i could also extract some common shader compiler logic between the Vulkan and OpenGL backends, which i'll put in a commit before introducing the Vulkan backend. Still working on this.

Edit/2: sorry, i messed up the rebase. i have it working again locally, but ran into some new rendering issues to do with the z-compression commit (#7245). Will push the new branch only when i have that sorted out.

Edit/3: this should be back to the functional state before the shader unification and rebase. The depth compression for the deferred shading position buffer works in Vulkan too. There are still the same rendering issues with particles, but no other regressions AFAIK.
Move the pure-math vertex/index generation out of `gropengldeferred.cpp` into graphics/util/primitives so it can be reused by the Vulkan backend. Modernize to use `SCP_vector` instead of `vm_malloc`/`vm_free` for automatic memory management.
Replace direct `ImGui_ImplOpenGL3` calls in game code with backend-agnostic `gr_imgui_new_frame` and `gr_imgui_render_draw_data` function pointers, matching the pattern used by all other `gr_*` functions. This makes it possible for the Vulkan backend to provide its own ImGui implementation.
`bm_close` calls `gf_bm_free_data` for each bitmap slot, which needs the graphics backend (Vulkan texture manager, OpenGL context) to still be alive. Move `bm_close` before the backend cleanup switch in `gr_close`.
`gr_flash_internal` used int vertices with `SCREEN_POS` (`VK_FORMAT_R32G32_SINT`) but the default-material vertex shader expects vec4 float at location 0. OpenGL silently converts via glVertexAttribPointer; Vulkan requires exact type matching. Use float vertices with `POSITION2` format instead. There should be no difference in behavior.
The `SCREEN_POS` vertex format is no longer used after the only use in `gr_flash` was removed. Remove it entirely.
Deduplicate compressed texture block-size mapping and mip-size calculation into two inline helpers in `ddsutils.h`, replacing repeated inline formulas in `ddsutils.cpp` and `gropengltexture.cpp`.
Add a render system capability to indicate whether GPU timestamp query handles can be immediately reused after reading. When queries are not reusable, `free_query_object` returns handles to the backend via `gr_delete_query_object` instead of the tracing free list, letting the backend manage its own reset lifecycle. This greatly simplifies query management for Vulkan. Also change shutdown to discard gpu_events for backends where queries aren't reusable (no more frames will be submitted to make them available).
Move `output_uniform_debug_data` before `gr_reset_immediate_buffer` so debug text is rendered while the immediate buffer still contains valid data. The previous ordering read from a buffer that was already reset to offset 0, which is logically wrong for any backend and a hard failure for deferred-submission backends.
`gr_set_proj_matrix` already branches on rendering_to_texture to choose top-left (RTT) vs bottom-left (screen) viewport origin. `gr_end_2d_matrix` should match, but it unconditionally used the bottom-left formula. Add the same `rendering_to_texture` branch so the viewport is restored correctly when rendering to a texture.
Change `bool clipEnabled` to `uint clipEnabled` in the default-material shader UBO. GLSL bool has implementation-defined std140 layout; uint is portable and matches the SPIR-V decompiled output. Add an else-branch writing `gl_ClipDistance[0] = 1.0` when clipping is disabled. Without this, gl_ClipDistance is undefined and some drivers cull geometry unexpectedly.
Memcpy from a `const void*` to `void*` is trivial enough. However, this case was missing, resulting in a false positive compilation error.
Extract shader loading and preprocessing (include/predefine expansion) into code/graphics/shader_preprocess.cpp, so it can be shared with the Vulkan backend.
Bundle Vulkan headers (v1.4.309).
Bundle Vulkan Memory Allocator (v3.2.1).
i managed to narrow down and fix the particle/laser background problem. It was really tricky! Apparently, it's been changed to use the There were some other bugs in Vulkan compressed-texture handling as well. With these fixed, not only do the lasers etc. look as they should, loading is a lot faster too. Edit: with MVPS, there's still a render issue where it shows contours around ships when set against particles like the engine exhaust. But we're getting there.
Force-pushed from 95f5285 to 299a2e5.
Tested this on my Mac and after a slight struggle to get things going (libs and shadertool primarily) it runs pretty well. Some visual glitches, but nothing terrible. I only did a few quick tests so I can't say for sure how well everything is working. But from what I saw the average FPS jumped up by 40-50 FPS over what OpenGL gives me. Full neb missions were quite slow in comparison, but this appeared to be the case with OpenGL too and may be related to MVPS 5.0.x as this was my first run with that update as well. The massive battle missions that often run at 7-8 FPS at the start with OpenGL were at 20+ FPS with vulkan. One issue that hit me early on was error handling. If vulkan failed to init properly for some reason then it would always hit an assert or memory error during deinit. This only appears to happen during init failure, as a proper init will deinit just fine. Another issue was memory usage. It was consuming 2x-3x the memory compared to using OpenGL. I also only ever saw the memory usage increase, never decrease. With OpenGL the memory usage would fluctuate up or down based on the loaded mission. With vulkan, after loading 3 or 4 different missions, it appeared to be using roughly 16 GB of memory (~10 for app, ~6 for graphics, according to metal overlay). When using OpenGL it typically appears to use 4-5 GB of memory total. I'm not sure if that's a platform issue, a code issue, or just a renderer difference. |
Thanks for testing! FWIW: You shouldn't need shadertool. It's only used to generate
Agree that ideally, deinit shouldn't hit asserts. Any problem during init will already have been reported by then, so it's redundant. Error handling is deliberately somewhat paranoid right now, because i didn't want bugs to silently creep in.
Did you test before or after the texture compression fix? If before, please retry, as that alone makes textures, a big part of memory use, much smaller. In any case, thanks for the report. i haven't looked into optimizing memory usage at all, yet. |
I'll do some better testing later this week or early next week so I can point you to the exact areas I was hitting. Most of that appeared to be due to library issues, which we'd fix with a new set of prebuilt libs, so people generally shouldn't have those issues anyway. But I'll document what I can in case there does happen to be an actual code issue hiding in there somewhere.
After. And I double-checked that to be sure before submitting that previous comment, since I figured that might be an issue. I confirmed that both S3TC and BPTC were actually supported too, in case that was the problem. That might be part of the issue as well, since OpenGL on the Mac does not support BPTC, and the decompression routine limits the mipmap size, which could artificially lower OpenGL memory requirements on the larger compressed textures. So it's an unfair comparison to make on the Mac, and should really be looked at on Windows/Linux with better OpenGL drivers to confirm whether there is a problem or not. Still though, it's pretty awesome to jump from 30-40 avg FPS on the Mac to 90-100 avg FPS. With a debug build, no less. So I'm pretty thrilled with the progress here and the great work that you're doing!
ddsutils.cpp checked OpenGL-specific GLAD globals to decide whether to decompress DXT textures. When the Vulkan backend was active these variables were never set, so all DXT textures were decompressed to 32bpp RGBA. Replace the GLAD checks with gr_is_capable() queries for the new CAPABILITY_S3TC and existing CAPABILITY_BPTC, making ddsutils backend-agnostic. Add the S3TC capability handler to the OpenGL backend.
Extract shader type tables (filenames, descriptions) and
variant tables (type, flag, define, description) into shared
code/graphics/shader_types.{h,cpp}.
Also move FXAA quality preset defines into shader_types so both
backends can share a single implementation.
Implement a Vulkan 1.1 renderer that replaces the previous stub with a fully functional backend, mostly matching the OpenGL backend's rendering capabilities.

Core rendering infrastructure:
- `VulkanMemory`: Custom allocator with sub-allocation from device-local and host-visible memory pools
- `VulkanBuffer`: Per-frame bump allocator for streaming uniform/vertex/index data (persistently mapped, double-buffered, auto-growing)
- `VulkanTexture`: Full texture management including 2D, 2D-array, 3D, and cubemap types with automatic mipmap generation and sampler caching
- `VulkanPipeline`: Lazy pipeline creation from hashed render state, with persistent VkPipelineCache
- `VulkanShader`: GLSL shader loading. Shader code and metadata are shared with OpenGL, with differences guarded by preprocessor conditions
- `VulkanDescriptorManager`: 3-set descriptor layout (Global/Material/PerDraw) with per-frame pool allocation, auto-grow, and batched updates
- `VulkanDeletionQueue`: Deferred resource destruction synchronized to frame-in-flight fences

Design choices:
- Two frames in flight with fence-based synchronization
- Asynchronous texture upload, no `waitIdle` in the hot path
- Single command buffer per frame; render passes begun/ended as needed for the multi-pass deferred pipeline
- Per-frame descriptor pools
- All descriptor bindings pre-initialized with fallback resources (zero UBO + 1x1 white texture) so partial updates never leave undefined state
- Streaming data uses a bump allocator (one large VkBuffer per frame)
- Pipeline cache persisted to disk for fast startup on subsequent runs
- Use VMA (Vulkan Memory Allocator) for buffer management

Some notable Vulkan vs OpenGL differences are:
- Depth range is [0,1], not [-1,1]: shadow projection matrices are adjusted, and shaders that linearize depth need isinf/zero guards at depth boundaries where OpenGL gives finite values
- In Vulkan, all shader outputs must be initialized. Leaving them uninitialized can result in random corruption, while OpenGL tolerates it in some cases
- Swap chain is B8G8R8A8: screenshot/save_screen paths swizzle to RGBA
- The Vulkan render target is "upside down"; the y-flip is handled through a negative viewport height, as is common
- Texture addressing for AABITMAP/INTERFACE/CUBEMAP is forced to clamp (OpenGL's sampler state happens to do this implicitly)
- Render pass architecture requires explicit transitions between G-buffer, shadow, decal, light accumulation, fog, and post-processing passes (OpenGL just switches FBO bindings)






Implement a Vulkan 1.1 renderer that replaces the previous stub with a fully functional backend, mostly matching the OpenGL backend's rendering capabilities. The game should be playable with minimal divergence from OpenGL rendering.
This is, most likely, too big to go in all at once, but just filing it here for reference because it's reached a testable state.
Core rendering infrastructure. The code lives under code/graphics/vulkan:
- `VulkanMemory`: Custom allocator with sub-allocation from device-local and host-visible memory pools
- `VulkanBuffer`: Per-frame bump allocator for streaming uniform/vertex/index data (persistently mapped, double-buffered, auto-growing)
- `VulkanTexture`: Full texture management including 2D, 2D-array, 3D, and cubemap types with automatic mipmap generation and sampler caching
- `VulkanPipeline`: Lazy pipeline creation from hashed render state, with persistent VkPipelineCache
- `VulkanShader`: SPIR-V shader loading (main, deferred, effects, post-processing, shadows, decals, fog, MSAA resolve, etc.)
- `VulkanDescriptorManager`: 3-set descriptor layout (Global/Material/PerDraw) with per-frame pool allocation, auto-grow, and batched updates
- `VulkanDeletionQueue`: Deferred resource destruction synchronized to frame-in-flight fences

Design choices:
- No `waitIdle` or other CPU-on-GPU blocking in the hot path

Some notable Vulkan vs OpenGL differences are:
Preparation patches to common game code (these commits need to go in first):
- `SCREEN_POS` vertex format: cleanup after previous commit
- `void *`, `const void*`

What's possibly left to be done:
- Unify OpenGL and Vulkan shaders where possible: the only shader shared with OpenGL (defined in the build system's `SHADERS_GL_SHARED`) is still the default material. Although the Vulkan backend does some things differently, it would definitely be possible to share more code, but i didn't want to accidentally break OpenGL in some way.
- Integrate VMA (Vulkan Memory Allocator): some of the memory handling could be simplified by importing this dependency.
- OpenXR anything: this is currently not implemented at all.
Build steps:
To run (with maximum debugging and Vulkan layer validation):
Full disclosure: i used Claude Opus 4.6 while developing this. However, the overall direction and design is my own, and i've paid careful attention to the code.