
GS/HW: ROV support for feedback draws.#13754

Draft
TJnotJT wants to merge 4 commits into
PCSX2:masterfrom
TJnotJT:gs-rov

@TJnotJT
Contributor

@TJnotJT TJnotJT commented Dec 28, 2025

Intro/status

The PR #7655 by Stenzek was consulted to get implementation ideas and heuristics.

This is committed on top of #14386 for easier dump run testing.

Description of changes

Adds ROV (rasterizer ordered view) support to DX12/VK/GL/DX11 for feedback draws (i.e., where writing the new color/depth value for each pixel requires reading the current color/depth value).

Thanks @TheLastRar for identifying and fixing the Windows RDNA2 feedback transition issue.

Thanks @SternXD for providing FSUI ROV options and other UI fixes.

Rationale behind Changes

Can help improve performance and accuracy in feedback draws with overlapping geometry by reducing draw call count and barriers.
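
As a rough illustration of the rationale above, the toy model below counts how many draws a batch of primitives splits into when every overlap with an earlier primitive in the same draw forces a new draw (with a barrier before it), compared with a single ROV draw. This is a simplified sketch, not PCSX2's actual draw splitting logic; `Rect`, `Overlaps`, `CountDrawsWithoutROV`, and `CountDrawsWithROV` are hypothetical names introduced only for this example.

```cpp
#include <cstddef>

// Hypothetical bounding box, half-open: [x0, x1) x [y0, y1).
struct Rect
{
    int x0, y0, x1, y1;
};

// True if the two boxes share at least one pixel.
constexpr bool Overlaps(const Rect& a, const Rect& b)
{
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

// Toy model (assumption): without ROV, a primitive that overlaps any earlier
// primitive in the current draw must start a new draw preceded by a barrier,
// so that it reads the color/depth written by the earlier primitive.
constexpr std::size_t CountDrawsWithoutROV(const Rect* prims, std::size_t n)
{
    if (n == 0)
        return 0;

    std::size_t draws = 1;
    std::size_t batch_start = 0; // first primitive of the current draw
    for (std::size_t i = 1; i < n; i++)
    {
        for (std::size_t j = batch_start; j < i; j++)
        {
            if (Overlaps(prims[i], prims[j]))
            {
                draws++;
                batch_start = i; // primitive i opens the next draw
                break;
            }
        }
    }
    return draws;
}

// Toy model (assumption): with ROV, per-pixel ordering is handled by the
// hardware, so the whole batch can go out as one draw.
constexpr std::size_t CountDrawsWithROV(std::size_t n)
{
    return n ? 1 : 0;
}
```

For four primitives where the second overlaps the first and the fourth overlaps the second, this model gives three draws without ROV and one draw with ROV.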

Suggested testing steps

  • Select the DX12/VK/GL/DX11 API and enable ROV and ROV Preset in Settings>Graphics>Rendering (or Advanced for ROV VK barriers). Alternatively, add the following to the PCSX2 INI:
    [EmuCore/GS]
    ...
    HWROV = 1
    HWROVUseBarriersVK = 0 # described below
    
  • Higher blend levels may be required for ROVs to be used.
  • Graphically, there should not be any difference in games with Maximum blend with/without ROVs (other than issues noted below).
  • We generally expect to see improvement in frame stats such as reduced barriers and draw calls (other than issues with barriers noted below).
  • Testing a range of games both with/without heavy overdraw/feedback would be helpful.

Vulkan ROV barrier setting

The Advanced setting for ROV barriers in Vulkan has the following values:

  • HWROVUseBarriersVK = 0: No extra barriers.
  • HWROVUseBarriersVK = 1: Use barriers before every ROV draw.
  • HWROVUseBarriersVK = 2: Use barriers on every UAV change. Fewer barriers than setting 1.

Some dumps are known to have flickers/missing effects when barriers are not used.
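
One way to read the three values above is as a small decision function. The sketch below is illustrative only: the INI key HWROVUseBarriersVK and its values 0/1/2 come from this description, while the enum, the function name, and the `uav_changed` input are hypothetical and not taken from the PR's code.

```cpp
// Hypothetical enum mirroring the documented HWROVUseBarriersVK values.
enum class RovBarrierModeVK : int
{
    None = 0,         // HWROVUseBarriersVK = 0: no extra barriers
    EveryRovDraw = 1, // HWROVUseBarriersVK = 1: barrier before every ROV draw
    OnUavChange = 2,  // HWROVUseBarriersVK = 2: barrier only when the UAV changes
};

// Returns true if an extra barrier would be issued before an ROV draw.
// uav_changed (assumption): the UAV bound for this draw differs from the one
// used by the previous ROV draw.
constexpr bool NeedsExtraBarrier(RovBarrierModeVK mode, bool uav_changed)
{
    switch (mode)
    {
        case RovBarrierModeVK::None:
            return false;
        case RovBarrierModeVK::EveryRovDraw:
            return true;
        case RovBarrierModeVK::OnUavChange:
            return uav_changed;
    }
    return false; // unknown values fall back to no extra barrier
}
```

Under this reading, setting 2 never issues more barriers than setting 1, since it only fires on the subset of ROV draws where the UAV changed.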

Results/shortcomings

  • Stats: the draw call/barrier count is reduced significantly in some GS dumps.
  • Accuracy: some dumps appear to have large differences in frames when compared in tools like WinMerge, but there should not be much visual difference. A possible reason is small differences in HW blending vs. SW blending.
  • Not all games benefit from ROVs, especially if they don't use many barriers or accuracy settings are low.
  • On some drivers (e.g. Intel Xe DX12), depth ROVs have small inaccuracies that can manifest as Z-fighting-like acne (some examples below).
  • Can cause inaccuracy in line blending due to overdraw of first/last pixels in a linestrip (example below). Does not occur with barriers because we allow bboxes of consecutive lines to overlap in one pixel in a single draw.

Examples of Z inaccuracy

This seems to not affect all systems.

Conspiracy - Weapons of Mass Destruction_SLES-53098_20250214222606.gs.xz

Master DX12
S044932_f00001_fr1_00a00_C_32_Conspiracy - Weapons of Mass Destruction_SLES-53098_20250214222606 gs xz_master

PR DX12
S044932_f00001_fr1_00a00_C_32_Conspiracy - Weapons of Mass Destruction_SLES-53098_20250214222606 gs xz_rov

Haunting Ground_SLES-52877_20230724220957.gs.xz (see mirror)

Master DX12
S043379_f00003_fr1_00000_C_32_Haunting Ground_SLES-52877_20230724220957 gs xz_master

PR DX12
S043379_f00003_fr1_00000_C_32_Haunting Ground_SLES-52877_20230724220957 gs xz_rov

Example of line blending inaccuracy

ZOE_Copy_Effect.gs.xz (see blueish elliptical lines)

Master VK
06416_f00003_fr1_00700_C_32

PR VK
06416_f00003_fr1_00700_C_32

TODO

  • Port to DX11/OpenGL (might be done in a separate PR).
  • (Done by GS/DX12: Use Enhanced Barriers API #13792) Give DX12 textures a state enum instead of manipulating D3D12_RESOURCE_STATE* directly (similarly to how Vulkan does layouts) and replace the current UAV state tracking.
  • (Done) Allow tweaking the ROV usage heuristic to see if it might benefit games differently.
  • (Done) Combine the DX12 descriptor tables for RT, depth, and UAVs into a single descriptor table to reduce root parameters and number of updates, since at least one table must be updated anyway each time an RT changes.
  • (Partially done by GS/HW: Support for integer depth. #13854) Future work: Emulate 32 bit Z in the pixel shader with custom interpolation.
  • Future work: merge the depth as RT system in depth feedback with the depth as color system in ROV. Both create a color clone of depth so have overlapping functionality.
  • Future work: modify the target vram counting so that it accurately tracks depth color clones used for ROV.

AI usage

AI was used as a reference for graphics API usage and to brainstorm ideas for correctness/efficiency of UAV usage. The heuristic to activate/deactivate ROV was obtained in part with AI, though the approach is likely well known. Additionally, AI was used to review parts of the code for correctness.

@JordanTheToaster
Member

Using way too many barriers even at basic and minimum breaks rendering.

image

@TJnotJT
Contributor Author

TJnotJT commented Dec 28, 2025

Using way too many barriers even at basic and minimum breaks rendering.

Thanks for the test and catching that.

There was a bug in tracking ROV feedback history, which is hopefully resolved in the last push.

I also changed the heuristic to only vote for ROVs if barriers are already configured. It was unnecessarily using barriers for blends that could be done in HW.

Haven't yet looked into the issue with the minimum blend.

@bigol83

bigol83 commented Dec 28, 2025

GTA San Andreas maximum blending just 120 barriers with dx12 and huge speed improvement

PR
immagine

Master
immagine

Unfortunately the improvement is not always this big

PR
immagine

Master

immagine

Added the second scenario dump
gtasa.zip

@SlimDread

GTA: LCS (NTSC U)

Master (UHD 770)
image

PR (UHD 770)
image

Master (RTX 2060 SUPER)
image

PR (RTX 2060 SUPER)
image

GS Dump
gta lcs.zip

Nice improvement on NVIDIA but a regression on Intel

@AmandaRoseChaqueta

AmandaRoseChaqueta commented Dec 28, 2025

I will test on linux once it gets Vulkan support. Great work bringing ROV back :).

@TJnotJT
Contributor Author

TJnotJT commented Dec 28, 2025

Unfortunately the improvement is not always this big

Nice improvement on NVIDIA but a regression on Intel

Thanks all for the tests. Looks like these scenes have spiky barrier usage so ROVs were not being activated enough. I think we may need to change the heuristic again and have it account for how many barriers can be saved on each pass. Also, instead of hardcoding constants into the heuristic it might be useful to expose these via the INI for tuning. Will try to get to this soon.
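
To illustrate why spiky barrier usage can keep a smoothed heuristic from activating ROVs often enough, here is a minimal sketch that assumes the heuristic tracks an exponential moving average of barriers per pass and compares it with a threshold. The averaging scheme, the function names, and all constants are assumptions made for illustration; they are not the PR's actual heuristic.

```cpp
// Hypothetical smoothing step: alpha in (0, 1] weights the newest sample.
constexpr float UpdateAverage(float avg, float sample, float alpha)
{
    return avg + alpha * (sample - avg);
}

// Hypothetical vote: use ROV when the smoothed barrier count reaches a threshold.
constexpr bool VoteForROV(float avg_barriers, float threshold)
{
    return avg_barriers >= threshold;
}

// Feeds a sequence of per-pass barrier counts through the average and
// returns how many passes end up voting for ROV.
constexpr int CountRovVotes(const float* barriers, int n, float alpha, float threshold)
{
    float avg = 0.0f;
    int votes = 0;
    for (int i = 0; i < n; i++)
    {
        avg = UpdateAverage(avg, barriers[i], alpha);
        if (VoteForROV(avg, threshold))
            votes++;
    }
    return votes;
}
```

With a spiky sequence such as {16, 0, 0, 0, 16, 0, 0, 0}, alpha = 0.25, and a threshold of 4, only the two spike passes vote for ROV in this model, and the average decays below the threshold right after each spike. Exposing constants like these through the INI would make that trade-off tunable.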

I will test on linux once it gets Vulkan support. Great work bringing ROV back :).

Thanks, it's in the works, hopefully not too long.

@TJnotJT
Contributor Author

TJnotJT commented Dec 29, 2025

The last push adds some INI config settings to try and tweak how aggressively ROVs are used and how frequently to switch between ROV/non-ROV usage (described in detail in the OP). If anyone is so inclined, please try tweaking them to see if certain games benefit differently from different parameters.

The code is a bit volatile at the moment, as Vulkan ROV is only partially implemented and is not yet functional. Some bugs might also have crept into the other APIs, so be warned.

@bigol83

bigol83 commented Dec 29, 2025

I know you wrote Vulkan is still not functional but just in case, with latest push Vulkan has broken graphics but performance improvement is huge, default settings

immagine immagine

Default settings with DX12 has a great performance improvement in that particular scenario

immagine immagine

@bigol83

bigol83 commented Dec 29, 2025

with previous commits this was fixed, with latest commit it has that line just like it does on master
immagine

Shadow of the Colossus_SCES-53326_20251229110248.gs.zip

@TJnotJT
Contributor Author

TJnotJT commented Dec 29, 2025

with previous commits this was fixed, with latest commit it has that line just like it does on master

Thanks for spotting that. I haven't stepped through the dump carefully, but I think that might be due to Z inaccuracy, since the heuristics are still being tweaked and Z conversions can happen at different times. In some cases it appears to fix things but it's likely by chance.

Vulkan should hopefully be functional as of the last push, which also includes some small bug fixes.

@bigol83

bigol83 commented Dec 30, 2025

GTA San Andreas looks good now and it still has the big speed improvement with Vulkan, Shadow of the Colossus though still has some issues
you can see this one quite easily on main menu

immagine

this is how it looks on master with the same settings using Vulkan
immagine

dump
Shadow of the Colossus_SCES-53326_20251230031827.gs.zip

@Mrlinkwii Mrlinkwii added this to the Release 2.8 milestone Dec 30, 2025
@TJnotJT TJnotJT force-pushed the gs-rov branch 3 times, most recently from b4565fd to 16a031a Compare January 1, 2026 05:12
@TJnotJT
Contributor Author

TJnotJT commented Jan 1, 2026

GTA San Andreas looks good now and it still has the big speed improvement with Vulkan, Shadow of the Colossus though still has some issues you can see this one quite easily on main menu

Should hopefully be fixed now. It might have been an issue with depth not being enabled correctly.

@bigol83

bigol83 commented Jan 1, 2026

Graphics issues with Vulkan have been fixed but now Dx12 crashes both with GTA San Andreas and Shadow of the Colossus

the error window is this one

Immagine 2026-01-01 120909

@TJnotJT
Contributor Author

TJnotJT commented Jan 1, 2026

Graphics issues with Vulkan have been fixed but now Dx12 crashes both with GTA San Andreas and Shadow of the Colossus

the error window is this one

That might be due to a mistake I made somewhere while rebasing. However, I wasn't able to reproduce on my end with basic/maximum blend and tweaking the ROV settings a bit. Could you please attach the INI you used to test?

On a side note, I changed the INI settings again, so any ROV related settings can be deleted from the INI so that the defaults are used. I'll try to make some presets so that it is easier to test in the future.

@bigol83

bigol83 commented Jan 1, 2026

Graphics issues with Vulkan have been fixed but now Dx12 crashes both with GTA San Andreas and Shadow of the Colossus
the error window is this one

That might be due to a mistake I made somewhere while rebasing. However, I wasn't able to reproduce on my end with basic/maximum blend and tweaking the ROV settings a bit. Could you please attach the INI you used to test?

On a side note, I changed the INI settings again, so any ROV related settings can be deleted from the INI so that the defaults are used. I'll try to make some presets so that it is easier to test in the future.

i am going to delete the ini settings for the specific game and see if i can reproduce the issue with the newest commit. If i can i will post the ini like you asked

@bigol83

bigol83 commented Jan 1, 2026

It still crashes for me
this is the specific game ini
SCES-53326_0F0C4A9C.zip

and this is the general pcsx2 ini
PCSX2.zip

@TJnotJT
Contributor Author

TJnotJT commented Mar 7, 2026

@bigol83 @mrrguest We’re planning to submit a bug report to Mesa, as we’ve reproduced the flickering issue on two Linux/Wayland systems with AMD GPUs.

If I recall correctly, both of your systems also showed the flickering with VK when the extra barriers weren’t enabled. Could you share a few details about your setup: OS, whether you're using Wayland (if on Linux), and your GPU model? (I believe @mrrguest's is RDNA3.)

Thanks in advance if you’re able to help

@mrrguest

mrrguest commented Mar 7, 2026

RDNA3 AMD 780M IGPU, Wayland, CachyOS. I didn't notice any flickering just graphics issue with Burnout Revenge which was fixed with HWROVUseBarriersVK = 2.
I believe @AmandaRoseChaqueta has RDNA2 GPU on Linux and noticed flickering with Black.

EDIT: I just tried the Black single and multi frame gs dumps @AmandaRoseChaqueta posted: #13754 (comment)

I didn't notice any flickering with the following settings:
HWROV = 1
HWROVPreset = 1,2,3
HWROVUseBarriersVK = 0

@TJnotJT
Contributor Author

TJnotJT commented Mar 8, 2026

RDNA3 AMD 780M IGPU, Wayland, CachyOS. I didn't notice any flickering just graphics issue with Burnout Revenge which was fixed with HWROVUseBarriersVK = 2. I believe @AmandaRoseChaqueta has RDNA2 GPU on Linux and noticed flickering with Black.

EDIT: I just tried the Black single and multi frame gs dumps @AmandaRoseChaqueta posted: #13754 (comment)

I didn't notice any flickering with the following settings: HWROV = 1 HWROVPreset = 1,2,3 HWROVUseBarriersVK = 0

Got it, thanks for the information. This appears to confirm that Linux/Wayland/RDNA2 is the only setup that's affected so far.

@TJnotJT
Contributor Author

TJnotJT commented Apr 17, 2026

Rebased with fairly significant conflicts

@TJnotJT
Contributor Author

TJnotJT commented Apr 26, 2026

Last push rebases with several large conflicts. The current build should compile/run but is broken and will require some fixing. An example broken dump is attached.

Need for Speed - Most Wanted [Black Edition]_SLES-53857_20260104230853.gs.xz.zip

Note: GSDevice::PSSelector went from 12 to 16 bytes on the last pushes.

@TJnotJT TJnotJT force-pushed the gs-rov branch 2 times, most recently from 34f61ed to d50d1ca Compare April 29, 2026 03:10
@TJnotJT
Contributor Author

TJnotJT commented Apr 30, 2026

Last few pushes do the following:

  • Refactor/remove parts of the PR that were complicated and that might not have contributed much to functionality/performance (e.g. asymmetric heuristics for color/depth ROV, UAV clears in DX12).
  • Add ROV options to the UI in Graphics>Rendering (main ROV settings) and Graphics>Advanced (ROV VK barriers).
  • Add OSD stats when ROV is enabled: ROV draws, ROV barriers, and ROV depth copies. These are displayed next to draws, barriers, and texture copies, and are separated by a '/'. (The new stats overlap with the current stats, i.e. they're double counted.)

Performance note: for simplicity the current PR adds the unordered access (DX12) / storage image (VK) specifier to all RTs even when ROV is disabled, so we should make sure this doesn't cause any performance regressions.

Other note: the current system for ROV depth creates a color clone of depth (since depth textures can't be used as UAVs). This overlaps a bit with the depth as RT functionality in depth feedback, so it's possible that they could be merged in the future.

@TJnotJT
Contributor Author

TJnotJT commented May 3, 2026

The following dump runs were done on both DX12 and VK:

  • ROV disabled with max blend.
  • ROV preset 3 (aggressive) with debug blend.
  • ROV preset 1 (balanced) with debug blend.

The main differences between PR and master in the latter two dump runs appear to be due to inaccuracies with line blending (described in OP), although the differences are usually not visually obvious or affect only individual pixels.

I'm unable to fully validate DX12 as my system continues to have severe Z fighting and other issues.

@TJnotJT
Contributor Author

TJnotJT commented May 6, 2026

Last push removes the ROV preset setting and hard codes it to the former 'aggressive' preset, as it appears to perform the best. Thanks to all who contributed tests to determine this.

The ROV checkbox no longer causes a GS reset when changed. However, the OSD target vram usage may now be slightly inaccurate (it will underestimate the true amount when depth ROVs are used). This may need to be addressed in a future PR.

@TJnotJT
Contributor Author

TJnotJT commented May 14, 2026

Last push adds ROV options to FSUI and fixes other ROV UI elements. Authored by @SternXD.

Member

@TellowKrinkle TellowKrinkle left a comment

Haven't looked over everything yet, but I figured I'd leave these for you to look at for now.

float RcpScaleFactor;
float pad0;
float pad1;
uint4 ColorMask;
Member

Is there a benefit to using this over FbMask?

Contributor Author

@TJnotJT TJnotJT May 18, 2026

Likely not, we should probably use FbMask here.

I may have used this initially to avoid unexpected interactions with the current pipeline.

State m_state = State::Dirty;

// ROV state tracking
std::unique_ptr<GSTexture> m_depth_color; // For depth texture points to the parallel color texture.
Member

What's the benefit of pairing these instead of creating a texture of the opposite type, doing a StretchRect, and recycling the old one?

Contributor Author

@TJnotJT TJnotJT May 18, 2026

I initially thought there would be greater coupling because we might do e.g. partial copies. However, it turns out that just doing a full copy and using depth ROV for as long as possible appears to be more performant.

It does require a bit more code to keep them separate, since we'd have to modify the texture cache or add new convert pipelines. However, having done that for integer depth, it's definitely a possibility.

On a side note, you can ignore the float m_avg_barriers_rov = 1.0f; field since we removed the different ROV heuristics and made it just keep ROV active for as long as possible once it's activated. That field and related heuristics code in GSRendererHW::DetermineROVUsage() should probably be removed also.
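
The "keep ROV active for as long as possible once it's activated" behaviour could look roughly like the sticky state update below. This is only a sketch: the function name, its inputs, and the condition that ends ROV usage are assumptions for illustration, not the logic in GSRendererHW::DetermineROVUsage().

```cpp
// Hypothetical sticky update for whether a target stays in ROV mode.
//   prev_active:     ROV was active for the previous draw on this target.
//   draw_wants_rov:  this draw would benefit from ROV (assumed: a feedback
//                    draw that would otherwise need barriers).
//   draw_allows_rov: this draw can be rendered through the ROV path at all
//                    (assumed condition for having to leave ROV mode).
constexpr bool NextRovActive(bool prev_active, bool draw_wants_rov, bool draw_allows_rov)
{
    if (!draw_allows_rov)
        return false; // forced out of ROV mode

    // Once active, stay active; otherwise activate only when the draw asks for it.
    return prev_active || draw_wants_rov;
}
```

With an update like this, no running average of barriers is needed, which is consistent with dropping the `m_avg_barriers_rov` field.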

@TJnotJT
Contributor Author

TJnotJT commented May 18, 2026

Last push adds the ports to DX11/GL. Changed to draft until revisions are completed.
