godot

mirror of https://github.com/godotengine/godot.git synced 2024-12-15 10:12:40 +08:00

Author	SHA1	Message	Date
Matias N. Goldberg	c77cbf096b	Improvements from TheForge (see description)

The work was performed by collaboration of TheForge and Google. I am merely splitting it up into smaller PRs and cleaning it up.

This is the most "risky" PR so far because the previous ones have been miscellaneous stuff aimed at either [improving debugging](https://github.com/godotengine/godot/pull/90993) (e.g. device lost), [improving the Android experience](https://github.com/godotengine/godot/pull/96439) (add Swappy for better Frame Pacing + Pre-Transformed Swapchains for slightly better performance), or harmless [ASTC improvements](https://github.com/godotengine/godot/pull/96045) (better performance by simply toggling a feature when available).

However this PR contains larger modifications aimed at improving performance or reducing memory fragmentation. With greater modifications come greater risks of bugs or breakage.

Changes introduced by this PR:

TBDR GPUs (e.g. most of Android + iOS + M1 Apple) support rendering to Render Targets that are not backed by actual GPU memory (everything stays in cache). This works as long as the load action isn't `LOAD` and the store action is `DONT_CARE`. This saves VRAM (it also makes it painfully obvious when a mistake introduces a performance regression). It is particularly useful when doing MSAA and keeping the raw MSAA content is not necessary.

Some GPUs get faster when the sampler settings are hard-coded into the GLSL shaders (instead of being dynamically bound at runtime). This required changes to the GLSL shaders, PSO creation routines, Descriptor creation routines, and Descriptor binding routines.

- `bool immutable_samplers_enabled = true`

Setting it to false enforces the old behavior. Useful for debugging bugs and regressions. Immutable samplers require that the samplers stay... immutable, hence this boolean is useful if the promise gets broken. We might want to turn this into a `GLOBAL_DEF` setting.

Instead of creating dozens/hundreds/thousands of `VkDescriptorSet` objects every frame that need to be freed individually when they are no longer needed, they all get freed at once by resetting the whole pool. Once the whole pool is no longer in use by the GPU, it gets reset and its memory recycled. Descriptor sets that are created to be kept around for longer or forever (i.e. not created and freed within the same frame) must not use linear pools. There may be more than one pool per frame. How many pools per frame Godot ends up with depends on its capacity, and that is controlled by `rendering/rendering_device/vulkan/max_descriptors_per_pool`.

- Possible improvement for later: It should be possible for Godot to adapt to how many descriptors per pool are needed on a per-key basis (i.e. grow their capacity like `std::vector` does) after rendering a few frames; which would be better than the current solution of having a single global value for all pools (`max_descriptors_per_pool`) that the user needs to tweak.
- `bool linear_descriptor_pools_enabled = true`

Setting it to false enforces the old behavior. Useful for debugging bugs and regressions. Setting it to false is required when working around driver bugs (e.g. Adreno 730).

A ridiculous optimization. Ridiculous because the original code should've done this in the first place. Previously Godot was doing the following:

1. Create a command buffer pool. One per frame.
2. Create multiple command buffers from the pool in point 1.
3. Call `vkBeginCommandBuffer` on the cmd buffer in point 2. This resets the cmd buffer because Godot requests the `VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT` flag.
4. Add commands to the cmd buffers from point 2.
5. Submit those commands.
6. On frame N + 2, recycle the buffer pool and cmd buffers from pt 1 & 2, and repeat from step 3.

The problem here is that step 3 resets each command buffer individually. Initially Godot used to have 1 cmd buffer per pool, so the impact was very low.
But not anymore (especially with Adreno workarounds to force splitting compute dispatches into a new cmd buffer, more on this later). However Godot keeps around a very low amount of command buffers per frame.

The recommended method is to reset the whole pool, to reset all cmd buffers at once. Hence the new steps would be:

1. Create a command buffer pool. One per frame.
2. Create multiple command buffers from the pool in point 1.
3. Call `vkBeginCommandBuffer` on the cmd buffer in point 2, which is already reset/empty (see step 6).
4. Add commands to the cmd buffers from point 2.
5. Submit those commands.
6. On frame N + 2, recycle the buffer pool and cmd buffers from pt 1 & 2, call `vkResetCommandPool` and repeat from step 3.

Possible issues: @dariosamo added `transfer_worker` which creates a command buffer pool:

```cpp
transfer_worker->command_pool = driver->command_pool_create(transfer_queue_family, RDD::COMMAND_BUFFER_TYPE_PRIMARY);
```

As expected, validation was complaining that command buffers were being reused without being reset (that's good, we now know Validation Layers will warn us of wrong use). I fixed it by adding:

```cpp
void RenderingDevice::_wait_for_transfer_worker(TransferWorker *p_transfer_worker) {
	driver->fence_wait(p_transfer_worker->command_fence);
	driver->command_pool_reset(p_transfer_worker->command_pool); // ! New line !
```

Secondary cmd buffers are subject to the same issue but I didn't alter them. I discussed this with Dario and he is aware of it. Secondary cmd buffers are currently disabled due to other issues (it's disabled on master).

- `bool RenderingDeviceCommons::command_pool_reset_enabled`

Setting it to false enforces the old behavior. Useful for debugging bugs and regressions. There's no other reason for this boolean. Possibly once it becomes well tested, the boolean could be removed entirely.

Adds `command_bind_render_uniform_sets` and `add_draw_list_bind_uniform_sets` (+ compute variants).
It performs the same as `add_draw_list_bind_uniform_set` (notice singular vs plural), but on multiple consecutive uniform sets, thus reducing graph and draw call overhead.

- `bool descriptor_set_batching = true;`

Setting it to false enforces the old behavior. Useful for debugging bugs and regressions. There's no other reason for this boolean. Possibly once it becomes well tested, the boolean could be removed entirely.

Godot currently does the following:

1. Fill the entire cmd buffer with commands.
2. `submit()`
   - Wait with a semaphore for the swapchain.
   - Trigger a semaphore to indicate when we're done (so the swapchain can submit).
3. `present()`

The optimization opportunity here is that 95% of Godot's rendering is done offscreen. Then a fullscreen pass copies everything to the swapchain. In practice, Godot doesn't render directly to the swapchain.

The problem with this is that the GPU has to wait for the swapchain to be released to start anything, when we could start much earlier. Only the final blit pass must wait for the swapchain.

TheForge changed it to the following (more complicated, I'm simplifying the idea):

1. Fill the entire cmd buffer with commands.
2. In `screen_prepare_for_drawing` do `submit()`
   - There are no semaphore waits for the swapchain.
   - Trigger a semaphore to indicate when we're done.
3. Fill a new cmd buffer that only does the final blit to the swapchain.
4. `submit()`
   - Wait with a semaphore for the `submit()` from step 2.
   - Wait with a semaphore for the swapchain (so the swapchain can submit).
   - Trigger a semaphore to indicate when we're done (so the swapchain can submit).
5. `present()`

Dario discovered this problem independently while working on a different platform.

However TheForge's solution had to be rewritten from scratch: the complexity to achieve the solution was high and quite difficult to maintain with the way Godot works now (after Übershaders PR).
But on the other hand, re-implementing the solution became much simpler because Dario already had to do something similar: to fix an Adreno 730 driver bug, he had to implement splitting command buffers. This is exactly what we need! Thus it was re-written using this existing functionality for a new purpose.

To achieve this, I added a new argument, `bool p_split_cmd_buffer`, to `RenderingDeviceGraph::add_draw_list_begin`, which is only set to true by `RenderingDevice::draw_list_begin_for_screen`. The graph will split the draw list into its own command buffer.

- `bool split_swapchain_into_its_own_cmd_buffer = true;`

Setting it to false enforces the old behavior. This might be necessary for consoles which follow an alternate solution to the same problem. If not, then we should consider removing it.

PR #90993 added `shader_destroy_modules()` but it was not actually in use. This PR adds several places where `shader_destroy_modules()` is called after initialization to free up memory of SPIR-V structures that are no longer needed.	2024-12-09 11:49:28 -03:00
Gergely Kis	146ba4106f	Move Vulkan includes to a central godot_vulkan.h header Also fixes Vulkan build problem with recent Clang.	2024-09-29 17:53:18 +02:00
Rémi Verschelde	940d629070	vulkan: Update all components to Vulkan SDK 1.3.183.0 Pass `VMA_ALLOCATOR_CREATE_KHR_MAINTENANCE5_BIT` to VMA when using Vulkan 1.3 features. Co-authored-by: Pedro J. Estébanez <pedrojrulez@gmail.com>	2024-06-03 10:25:46 +02:00
Jakub Marcowski	8350c88718	vulkan: Update all components to Vulkan SDK 1.3.275.0	2024-02-06 13:46:56 +01:00
DeeJayLSP	7e48a7420c	vulkan: Update components to Vulkan SDK 1.3.268.0	2024-01-11 20:27:30 -03:00
Rémi Verschelde	728dbeab69	vulkan: Update all components to Vulkan SDK 1.3.261.1 Updates to volk, vulkan headers, `vk_enum_string_helper.h`, VMA, glslang, spirv-reflect. VMA doesn't tag SDK releases specifically, and still hasn't had a tagged release since 3.0.1, but the Vulkan SDK now seems to ship a recent master commit, so we do the same.	2023-09-01 11:23:48 +02:00
DeeJayLSP	1b642d283c	Update Vulkan and related libraries to 1.3.250.0	2023-06-06 12:40:04 -03:00
Rémi Verschelde	b113e6d4ff	Vulkan: Fix VMA build with GCC 13 Fixes #74647.	2023-03-09 10:46:35 +01:00
Rémi Verschelde	0181d005c9	vulkan: Update all components to Vulkan SDK 1.3.231.1 Updates to volk, vulkan headers, `vk_enum_string_helper.h`, glslang, spirv-reflect. No update to VMA which still has 3.0.1 as its last tagged release.	2022-11-03 12:20:46 +01:00
Cyberrebell	6a2bd6c936	updated vk_mem_alloc.h to fix startup issue with AMD 6000 series GPUs using SteamVR on Windows	2022-06-12 23:36:06 +02:00
Pedro J. Estébanez	171e31de68	vk_mem_alloc: Update to upstream + Replace use of deprecated items	2022-03-29 11:28:09 +02:00
Pedro J. Estébanez	801741e787	vk_mem_alloc: Update to upstream + Adapt approach to small objects pooling This updates VMA and instead of using the custom small pool approach from `4e6c9d3ae9`, lazily creates pools for the relevant memory type indices, which doesn't require patching VMA. Also, patches already merged upstream or not needed any longer are removed.	2022-02-24 14:30:55 +01:00
Rémi Verschelde	09a61cdf53	Merge pull request #57989 from RandomShaper/update_vma Update & patch VMA, and re-implement the small buffers optimization	2022-02-14 09:07:11 +01:00
Pedro J. Estébanez	4e6c9d3ae9	Add a separate pool for small allocations in Vulkan RD	2022-02-12 12:47:08 +01:00
Pedro J. Estébanez	648a10514b	vk_mem_alloc: Update to latest commit	2022-02-12 12:45:28 +01:00
Rémi Verschelde	26b2defe0c	vulkan: Update volk, headers and glslang to 1.3.204	2022-02-11 18:42:51 +01:00
Rémi Verschelde	8f4793b225	Revert "vulkan: Update volk, headers and glslang to 1.3.204" This reverts commit `d233908fb6`.	2022-02-11 17:50:22 +01:00
Rémi Verschelde	d233908fb6	vulkan: Update volk, headers and glslang to 1.3.204	2022-02-10 23:57:03 +01:00
Rémi Verschelde	fd641ac85c	Vulkan: Update volk and Vulkan SDK components to 1.2.190	2021-09-22 12:56:15 +02:00
Pedro J. Estébanez	7b7e17a626	Upgrade Vulkan memory allocator	2021-08-13 00:05:41 +02:00
bruvzg	d7957a2a20	Use "volk" instead of statically linked Vulkan loader.	2021-08-12 14:25:15 +03:00
jacobcoughenour	66d429576c	Vulkan: loader, headers, and glslang updated to sdk-1.2.162.0 Updated glslang and Vulkan headers/loader following the instructions found in thirdparty/README. glslang was updated to the 'known good' matching Vulkan SDK version 1.2.162.0. Vulkan headers and loader were updated to the commit tagged with sdk-1.2.162.0. 'vk_mem_alloc.h' and 'vk_mem_alloc.c' are unchanged since there hasn't been a new tagged release since 2.3.0. Here's the Vulkan release notes for this update: https://vulkan.lunarg.com/doc/sdk/1.2.162.0/windows/release_notes.html Reverted and removed the unnecessary fix-mingw-snprintf patch for glslang as well as the mention of it in thirdparty/README.md.	2020-12-21 20:28:49 -05:00
Rémi Verschelde	6a951267ae	vulkan: Backport build fix for MinGW-w64 8.0.0 Taken from https://github.com/KhronosGroup/Vulkan-Loader/pull/475. Supersedes and reverts #43119 since the upstream change removes the need for that custom define.	2020-10-29 12:47:35 +01:00
Rémi Verschelde	9000db505e	vulkan: Re-add Windows patch to fix static library use Fixes #43105.	2020-10-26 23:30:47 +01:00
Rémi Verschelde	148ad49c93	vulkan: Sync loader, headers and glslang to sdk-1.2.154.0 Actually sdk-1.2.154.1 for Vulkan-Loader. glslang is updated to bacaef3237c515e40d1a24722be48c0a0b30f75f which is the known-good version for Vulkan-ValidationLayers 1.2.154.0. COPYRIGHT.txt was synced with the current version of the glslang LICENSE.txt, and `glslang/register_types.cpp` now uses the upstream definition for its default builtin resource instead of hardcoding it.	2020-10-15 12:29:42 +02:00
Sergey Minakov	6e0d4e21ff	Thirdparty Vulkan: patch VMA to fix asserts Applies VMA master branch patch that removes incorrect asserts: issue: https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator/issues/102 patch: `39aeff7a43`	2020-07-25 21:55:05 +02:00
PouleyKetchoupp	802bbe87ad	Fix extra warnings in Android build	2020-04-10 11:06:11 +02:00
Rémi Verschelde	516b3bb88f	Fix Clang warnings on Windows Fixes #37490.	2020-04-01 16:28:20 +02:00
Rémi Verschelde	d744d3046e	vulkan: Re-add option to build Vulkan-Loader statically Upstream removed the option in KhronosGroup/Vulkan-Loader#260, which breaks our current use case. This commit reverts KhronosGroup/Vulkan-Loader#260 in our vendored loader. We may need to re-evaluate how we link the loader, but until then, reverting this PR fixes Windows support after the upgrade to a recent SDK version in #36932.	2020-03-09 15:25:54 +01:00
Rémi Verschelde	214bc9e5a1	Update Vulkan loader and headers to sdk-1.2.131.2 (Headers are actually sdk-1.2.131.1, they did not get a re-release.) Also synced VMA 2.3.0 again, fixing unwanted clang-formatting of thirdparty code.	2020-03-09 09:36:37 +01:00
bruvzg	4cc439922a	Update VulkanMemoryAllocator to 2.3.0 (Fixes build for 32-bit Windows and Linux).	2020-02-11 19:05:50 +02:00
Rémi Verschelde	db81928e08	Vulkan: Move thirdparty code out of drivers, style fixes - `vk_enum_string_helper.h` is a generated file taken from the SDK (Vulkan-ValidationLayers). - `vk_mem_alloc.h` is a library from GPUOpen: https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator	2020-02-11 14:08:44 +01:00
bruvzg	7bf72ed14e	Update Vulkan loader to 1.1.127	2020-02-11 12:05:27 +01:00
Rémi Verschelde	511f65214f	SCons: Streamline Vulkan buildsystem + fixups - Renamed option to `builtin_vulkan`, since that's the name of the library and if we were to add new components, we'd likely use that same option. - Merge `vulkan_loader/SCsub` in `vulkan/SCsub`. - Accordingly, don't use built-in Vulkan headers when not building against the built-in loader library. - Drop Vulkan registry which we don't appear to need currently. - Style and permission fixes.	2020-02-11 11:59:04 +01:00
Rémi Verschelde	ae3ce08982	VulkanLoader: Make Windows includes lowercase for MinGW MinGW-w64 ships all Windows SDK headers as lowercase, which prevents cross-compiling this code from Linux. Windows filesystems are case insensitive so it should work fine with lowercase includes. PR'ed upstream: https://github.com/KhronosGroup/Vulkan-Loader/pull/212	2020-02-11 11:58:54 +01:00
bruvzg	eb48be51db	Add static Vulkan loader. Initial Vulkan support for Windows. Initial Vulkan support for macOS.	2020-02-11 11:57:11 +01:00
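
A note on commit `c77cbf096b` at the top of this log: its message contrasts resetting each command buffer individually (implicitly, on begin) with resetting the whole pool once per recycled frame. The following is only a rough sketch of that bookkeeping, not Godot or Vulkan code. `MockCommandPool` and both counting functions are hypothetical names invented for illustration, and the sketch simplifies the lifecycle to a single pool reused every frame instead of one pool per in-flight frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mock of a command pool; not a Godot or Vulkan type.
struct MockCommandPool {
	std::vector<bool> holds_old_commands; // One flag per command buffer.
	std::size_t reset_operations = 0; // Reset operations issued so far.

	explicit MockCommandPool(std::size_t buffer_count) :
			holds_old_commands(buffer_count, false) {}

	// Old path: beginning a buffer resets it individually if it was used before.
	void begin_with_implicit_reset(std::size_t index) {
		if (holds_old_commands[index]) {
			holds_old_commands[index] = false;
			++reset_operations;
		}
	}

	// New path: beginning a buffer expects it to be empty already.
	void begin_expecting_reset(std::size_t index) const {
		assert(!holds_old_commands[index]);
	}

	void record_and_submit(std::size_t index) {
		holds_old_commands[index] = true;
	}

	// New path: one operation clears every buffer allocated from the pool.
	void reset_pool() {
		holds_old_commands.assign(holds_old_commands.size(), false);
		++reset_operations;
	}
};

// Counts reset operations when every buffer is reset on begin.
std::size_t resets_with_per_buffer_reset(std::size_t buffer_count, std::size_t frames) {
	MockCommandPool pool(buffer_count);
	for (std::size_t frame = 0; frame < frames; ++frame) {
		for (std::size_t i = 0; i < buffer_count; ++i) {
			pool.begin_with_implicit_reset(i);
			pool.record_and_submit(i);
		}
	}
	return pool.reset_operations;
}

// Counts reset operations when the pool is reset once before being reused.
std::size_t resets_with_pool_reset(std::size_t buffer_count, std::size_t frames) {
	MockCommandPool pool(buffer_count);
	for (std::size_t frame = 0; frame < frames; ++frame) {
		if (frame > 0) {
			pool.reset_pool();
		}
		for (std::size_t i = 0; i < buffer_count; ++i) {
			pool.begin_expecting_reset(i);
			pool.record_and_submit(i);
		}
	}
	return pool.reset_operations;
}
```

In this mock, the per-buffer path issues one reset per buffer per reused frame, while the pool path issues one reset per reused frame regardless of how many buffers the pool holds. The mock only counts operations; it says nothing about their actual cost on a given driver.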

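The same commit message (`c77cbf096b`) lists, as a possible later improvement, letting the per-pool descriptor capacity grow per key "like `std::vector` does" instead of relying on the single global `max_descriptors_per_pool` value. That idea is not implemented in the commit. The sketch below only illustrates what such doubling could look like; `AdaptivePoolCapacity`, `observe`, `pools_needed`, and the use of a `std::string` as the pool key are all assumptions made for this example.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper: tracks a per-key pool capacity that doubles on demand.
struct AdaptivePoolCapacity {
	uint32_t initial_capacity;
	std::map<std::string, uint32_t> capacity_per_key;

	explicit AdaptivePoolCapacity(uint32_t p_initial_capacity) :
			initial_capacity(p_initial_capacity == 0 ? 1 : p_initial_capacity) {}

	// Records that a frame needed `needed` descriptor sets for `key` and
	// returns the capacity to use for that key from now on.
	uint32_t observe(const std::string &key, uint32_t needed) {
		auto it = capacity_per_key.find(key);
		uint32_t capacity = (it == capacity_per_key.end()) ? initial_capacity : it->second;
		while (capacity < needed) {
			if (capacity > UINT32_MAX / 2) {
				capacity = needed; // Doubling would overflow; use the exact size.
				break;
			}
			capacity *= 2;
		}
		capacity_per_key[key] = capacity;
		return capacity;
	}
};

// Number of pools of a given capacity required to hold `needed` descriptor sets.
uint32_t pools_needed(uint32_t needed, uint32_t capacity) {
	if (capacity == 0) {
		return 0; // No valid capacity; the caller must pick one first.
	}
	return needed / capacity + (needed % capacity != 0 ? 1 : 0);
}
```

With a fixed capacity of 64, a key that needs 1000 sets per frame would be spread over 16 pools; after doubling up to 1024, the same key fits in one. Capacity in this sketch never shrinks, which is one of several design choices a real implementation would need to revisit.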
36 Commits