Cluster light probes using conservative spherical bounds. by pcwalton · Pull Request #13746 · bevyengine/bevy

pcwalton · 2024-06-08T03:59:35Z

This commit allows the Bevy renderer to use the clustering infrastructure for light probes (reflection probes and irradiance volumes) on platforms where at least 3 storage buffers are available. On such platforms (the vast majority), we stop performing brute-force searches of light probes for each fragment and instead only search the light probes with bounding spheres that intersect the current cluster. This should dramatically improve scalability of irradiance volumes and reflection probes.

The primary platform that doesn't support 3 storage buffers is WebGL 2, and we continue using a brute-force search of light probes on that platform, as the UBO that stores per-cluster indices is too small to fit the light probe counts. Note, however, that that platform also doesn't support bindless textures (indeed, it would be very odd for a platform to support bindless textures but not SSBOs), so we only support one of each type of light probe per drawcall there in the first place. Consequently, this isn't a performance problem, as the search will only have one light probe to consider. (In fact, clustering would probably end up being a performance loss.)

Known potential improvements include:

We currently cull based on a conservative bounding sphere test and not based on the oriented bounding box (OBB) of the light probe. This is improvable, but in the interests of simplicity, I opted to keep the bounding sphere test for now. The OBB improvement can be a follow-up.
This patch doesn't change the fact that each fragment only takes a single light probe into account. Typical light probe implementations detect the case in which multiple light probes cover the current fragment and perform some sort of weighted blend between them. As the light probe fetch function presently returns only a single light probe, implementing that feature would require more code restructuring, so I left it out for now. It can be added as a follow-up.
Light probe implementations typically have a falloff range. Although this is a wanted feature in Bevy, this particular commit also doesn't implement that feature, as it's out of scope.
This commit doesn't raise the maximum number of light probes past its current value of 8 for each type. This should be addressed later, but would possibly require more bindings on platforms with storage buffers, which would increase this patch's complexity. Even without raising the limit, this patch should constitute a significant performance improvement for scenes that get anywhere close to this limit. In the interest of keeping this patch small, I opted to leave raising the limit to a follow-up.

Changelog

Changed

Light probes (reflection probes and irradiance volumes) are now clustered on most platforms, improving performance when many light probes are present.

This commit allows the Bevy renderer to use the clustering infrastructure for light probes (reflection probes and irradiance volumes) on platforms where at least 3 storage buffers are available. On such platforms (the vast majority), we stop performing brute-force searches of light probes for each fragment and instead only search the light probes with bounding spheres that intersect the current cluster. This should dramatically improve scalability of irradiance volumes and reflection probes. The primary platform that doesn't support 3 storage buffers is WebGL 2, and we continue using a brute-force search of light probes on that platform, as the UBO that stores per-cluster indices is too small to fit the light probe counts. Note, however, that that platform also doesn't support bindless textures (indeed, it would be very odd for a platform to support bindless textures but not SSBOs), so we only support one of each type of light probe per drawcall in the first place. So this isn't a performance problem, as the search will only have one light probe to consider. (In fact, clustering would probably end up being a performance loss.) Known potential improvements include: 1. We currently cull based on a conservative bounding sphere test and not based on the oriented bounding box (OBB) of the light probe. This is improvable, but in the interests of simplicity, I opted to keep the bounding sphere test for now. The OBB improvement can be a follow-up. 2. This patch doesn't change the fact that each fragment only takes a single light probe into account. Typical light probe implementations detect the case in which multiple light probes cover the current fragment and perform some sort of weighted blend between them. As the light probe fetch function presently returns only a single light probe, implementing that feature would require more code restructuring, so I left it out for now. It can be added as a follow-up. 3. Light probe implementations typically have a falloff range. Although this is a wanted feature in Bevy, this particular commit also doesn't implement that feature, as it's out of scope. 4. This commit doesn't raise the maximum number of light probes past its current value of 8 for each type. This should be addressed later, but would possibly require more bindings on platforms with storage buffers, which would increase this patch's complexity. Even without raising the limit, this patch should constitute a significant performance improvement for scenes that get anywhere close to this limit. In the interest of keeping this patch small, I opted to leave raising the limit to a follow-up.

jgayfer

Overall I think this is making sense. The match statement give us a chance to only perform work on the appropriate types, which should reduce redundant work (and I presume make it easier to add more types in the future).

Is there a specific example or benchmark I can run to verify performance delta? I haven't done much in the way of benchmarking before yet with Bevy.

jgayfer · 2024-06-23T04:31:50Z

crates/bevy_pbr/src/cluster/mod.rs

+    /// that order. The remaining fields are filled with zeroes.
    #[size(runtime)]
-    data: Vec<UVec4>,
+    data: Vec<[UVec4; 2]>,


To check my understanding, is this basically a way to let us push more than 4 values into each element in a format that WGSL supports?

Is passing in a vector of a struct that can support more than 4 fields an option? Or is this mostly a way to not need to explicitly pad?

Using a struct could lend itself to better readability throughout this patch (e.g. no need to keep track of what x, y, z, and w map to). But I imagine it's done this way for performance reasons.

crates/bevy_pbr/src/render/clustered_forward.wgsl

Elabajaba

Code looks good, comments are great as always.

Somewhat surprisingly (to me at least) is that this raises our vector register (VGPR) usage in the pbr fragment shader by quite a bit in both the 3d_scene and irradiance_volumes examples. In irradiance_volumes we went from 77 to 94 VGPR, which dropped our wavefront occupancy from 6/16 to 5/16 (16 is the theoretical max on my 6800xt).

In 3d_scene it went from 72 to 77 VGPR, which drops occupancy from 7/16 to 6/16.

Not really sure how to contextualize these numbers, but basically forward PBR shaders are expected to have not great register pressure, but I think in more optimized engines they're in the 8-10/16 occupancy range?

I don't think the increased register use should block this PR, but it's just something to note.

crates/bevy_pbr/src/light_probe/light_probe.wgsl

This commit allows the Bevy renderer to use the clustering infrastructure for light probes (reflection probes and irradiance volumes) on platforms where at least 3 storage buffers are available. On such platforms (the vast majority), we stop performing brute-force searches of light probes for each fragment and instead only search the light probes with bounding spheres that intersect the current cluster. This should dramatically improve scalability of irradiance volumes and reflection probes. The primary platform that doesn't support 3 storage buffers is WebGL 2, and we continue using a brute-force search of light probes on that platform, as the UBO that stores per-cluster indices is too small to fit the light probe counts. Note, however, that that platform also doesn't support bindless textures (indeed, it would be very odd for a platform to support bindless textures but not SSBOs), so we only support one of each type of light probe per drawcall in the first place. So this isn't a performance problem, as the search will only have one light probe to consider. (In fact, clustering would probably end up being a performance loss.) Known potential improvements include: 1. We currently cull based on a conservative bounding sphere test and not based on the oriented bounding box (OBB) of the light probe. This is improvable, but in the interests of simplicity, I opted to keep the bounding sphere test for now. The OBB improvement can be a follow-up. 2. This patch doesn't change the fact that each fragment only takes a single light probe into account. Typical light probe implementations detect the case in which multiple light probes cover the current fragment and perform some sort of weighted blend between them. As the light probe fetch function presently returns only a single light probe, implementing that feature would require more code restructuring, so I left it out for now. It can be added as a follow-up. 3. Light probe implementations typically have a falloff range. Although this is a wanted feature in Bevy, this particular commit also doesn't implement that feature, as it's out of scope. 4. This commit doesn't raise the maximum number of light probes past its current value of 8 for each type. This should be addressed later, but would possibly require more bindings on platforms with storage buffers, which would increase this patch's complexity. Even without raising the limit, this patch should constitute a significant performance improvement for scenes that get anywhere close to this limit. In the interest of keeping this patch small, I opted to leave raising the limit to a follow-up.

…ster-light-probes

tychedelia

LGTM! Tested irradiance_volumes and reflection_probes examples.

Cluster light probes

tychedelia · 2024-10-18T22:40:38Z

Looks like the example failure is real. Also getting The preprocessing index buffer wasn't present when trying to run alien_cake_addict locally.

…ster-light-probes

This patch was doing non-shadows before shadows (and ditto for volumetrics), while the reverse was what `prepare_lights` expected.

jgayfer

Thanks!

# Objective Volumetric fog was broken by #13746. Looks like this particular shader just got missed. I don't see any other instances of `unpack_offset_and_counts` in the codebase. ``` 2024-12-06T03:18:42.297494Z ERROR bevy_render::render_resource::pipeline_cache: failed to process shader: error: no definition in scope for identifier: 'bevy_pbr::clustered_forward::unpack_offset_and_counts' ┌─ crates/bevy_pbr/src/volumetric_fog/volumetric_fog.wgsl:312:29 │ 312 │ let offset_and_counts = bevy_pbr::clustered_forward::unpack_offset_and_counts(cluster_index); │ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unknown identifier │ = no definition in scope for identifier: 'bevy_pbr::clustered_forward::unpack_offset_and_counts' ``` ## Solution Use `unpack_clusterable_object_index_ranges` to get the indices for point/spot lights. ## Testing `cargo run --example volumetric_fog` `cargo run --example fog_volumes` `cargo run --example scrolling_fog`

…#13746) This commit allows the Bevy renderer to use the clustering infrastructure for light probes (reflection probes and irradiance volumes) on platforms where at least 3 storage buffers are available. On such platforms (the vast majority), we stop performing brute-force searches of light probes for each fragment and instead only search the light probes with bounding spheres that intersect the current cluster. This should dramatically improve scalability of irradiance volumes and reflection probes. The primary platform that doesn't support 3 storage buffers is WebGL 2, and we continue using a brute-force search of light probes on that platform, as the UBO that stores per-cluster indices is too small to fit the light probe counts. Note, however, that that platform also doesn't support bindless textures (indeed, it would be very odd for a platform to support bindless textures but not SSBOs), so we only support one of each type of light probe per drawcall there in the first place. Consequently, this isn't a performance problem, as the search will only have one light probe to consider. (In fact, clustering would probably end up being a performance loss.) Known potential improvements include: 1. We currently cull based on a conservative bounding sphere test and not based on the oriented bounding box (OBB) of the light probe. This is improvable, but in the interests of simplicity, I opted to keep the bounding sphere test for now. The OBB improvement can be a follow-up. 2. This patch doesn't change the fact that each fragment only takes a single light probe into account. Typical light probe implementations detect the case in which multiple light probes cover the current fragment and perform some sort of weighted blend between them. As the light probe fetch function presently returns only a single light probe, implementing that feature would require more code restructuring, so I left it out for now. It can be added as a follow-up. 3. Light probe implementations typically have a falloff range. Although this is a wanted feature in Bevy, this particular commit also doesn't implement that feature, as it's out of scope. 4. This commit doesn't raise the maximum number of light probes past its current value of 8 for each type. This should be addressed later, but would possibly require more bindings on platforms with storage buffers, which would increase this patch's complexity. Even without raising the limit, this patch should constitute a significant performance improvement for scenes that get anywhere close to this limit. In the interest of keeping this patch small, I opted to leave raising the limit to a follow-up. ## Changelog ### Changed * Light probes (reflection probes and irradiance volumes) are now clustered on most platforms, improving performance when many light probes are present. --------- Co-authored-by: Benjamin Brienen <Benjamin.Brienen@outlook.com> Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>

# Objective Volumetric fog was broken by bevyengine#13746. Looks like this particular shader just got missed. I don't see any other instances of `unpack_offset_and_counts` in the codebase. ``` 2024-12-06T03:18:42.297494Z ERROR bevy_render::render_resource::pipeline_cache: failed to process shader: error: no definition in scope for identifier: 'bevy_pbr::clustered_forward::unpack_offset_and_counts' ┌─ crates/bevy_pbr/src/volumetric_fog/volumetric_fog.wgsl:312:29 │ 312 │ let offset_and_counts = bevy_pbr::clustered_forward::unpack_offset_and_counts(cluster_index); │ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ unknown identifier │ = no definition in scope for identifier: 'bevy_pbr::clustered_forward::unpack_offset_and_counts' ``` ## Solution Use `unpack_clusterable_object_index_ranges` to get the indices for point/spot lights. ## Testing `cargo run --example volumetric_fog` `cargo run --example fog_volumes` `cargo run --example scrolling_fog`

pcwalton requested review from IceSentry, robtfm and superdump June 8, 2024 03:59

pcwalton added A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Jun 8, 2024

pcwalton added this to the 0.15 milestone Jun 8, 2024

jgayfer reviewed Jun 23, 2024

View reviewed changes

Elabajaba approved these changes Aug 6, 2024

View reviewed changes

crates/bevy_pbr/src/light_probe/light_probe.wgsl Outdated Show resolved Hide resolved

bas-ie added S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Sep 30, 2024

BenjaminBrienen added the D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes label Oct 14, 2024

pcwalton and others added 5 commits October 14, 2024 20:53

fix conflicts

1e8de1a

Merge remote-tracking branch 'pcwalton/cluster-light-probes' into clu…

9549f03

…ster-light-probes

fix merge error

164ae67

feedback

b6254fc

tychedelia approved these changes Oct 18, 2024

View reviewed changes

Merge pull request #6 from BenjaminBrienen/cluster-light-probes

aa0bf81

Cluster light probes

alice-i-cecile added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged labels Oct 20, 2024

Nits for CI

fc5ff60

alice-i-cecile enabled auto-merge October 20, 2024 18:17

alice-i-cecile disabled auto-merge October 20, 2024 18:18

alice-i-cecile added S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged and removed S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it labels Oct 20, 2024

alice-i-cecile modified the milestones: 0.15, 0.16 Oct 20, 2024

pcwalton added 4 commits December 4, 2024 11:29

Merge remote-tracking branch 'origin/main' into cluster-light-probes

4c679cf

Merge remote-tracking branch 'origin/main' into cluster-light-probes

133ee88

Merge remote-tracking branch 'pcwalton/cluster-light-probes' into clu…

54038c5

…ster-light-probes

Fix alien_cake_addict.

fa86258

This patch was doing non-shadows before shadows (and ditto for volumetrics), while the reverse was what `prepare_lights` expected.

pcwalton added S-Needs-Review Needs reviewer attention (from anyone!) to move forward and removed S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged labels Dec 5, 2024

jgayfer approved these changes Dec 5, 2024

View reviewed changes

pcwalton added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Dec 5, 2024

alice-i-cecile added this pull request to the merge queue Dec 5, 2024

Merged via the queue into bevyengine:main with commit b7bcd31 Dec 5, 2024

rparrett mentioned this pull request Dec 6, 2024

Fix error in volumetric fog shader #16677

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Cluster light probes using conservative spherical bounds.#13746

Cluster light probes using conservative spherical bounds.#13746
alice-i-cecile merged 12 commits intobevyengine:mainfrom
pcwalton:cluster-light-probes

pcwalton commented Jun 8, 2024 •

edited

Loading

Uh oh!

jgayfer left a comment

Uh oh!

jgayfer Jun 23, 2024

Uh oh!

Uh oh!

Elabajaba left a comment

Uh oh!

Uh oh!

tychedelia left a comment

Uh oh!

tychedelia commented Oct 18, 2024

Uh oh!

jgayfer left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

Uh oh!

Conversation

pcwalton commented Jun 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changelog

Changed

Uh oh!

jgayfer left a comment

Choose a reason for hiding this comment

Uh oh!

jgayfer Jun 23, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Elabajaba left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tychedelia left a comment

Choose a reason for hiding this comment

Uh oh!

tychedelia commented Oct 18, 2024

Uh oh!

jgayfer left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

pcwalton commented Jun 8, 2024 •

edited

Loading