Status update from kd-11 (16-07-2021)
Added 2021-08-16 19:18:23 +0000 UTC
Hi,
It's kd-11 with a progress update on RSX emulation for the RPCS3 emulator.
It's been a while since the last update, and a lot has happened. Let us begin with a summary of the improvements made since then:
- 1. Fixed a bug where some optimizations made to improve performance of surface clears would fail in some corner cases and cause garbage visuals and flickering in some titles.
- 2. Implemented support for extended-length vertex programs. This is an aggressive optimization only seen in some titles with custom RSX code, such as the Ratchet & Clank and Resistance series. This fixed missing particle effects in Insomniac titles.
- 3. Implemented dynamic vertex offset switching during draw call batches. This fixed vertex explosions and other corruption that would appear on screen in effects-heavy situations in Insomniac titles such as the Ratchet & Clank series.
- 4. Video memory management was redesigned. Previously, it was common for games to outright crash with out-of-memory errors when using Vulkan, especially games that consume a lot of VRAM very fast. The new memory manager attempts to keep things going as much as possible - instead of crashing, performance will just slow down. This allows the user to at least get to the next save point and then change their settings accordingly.
- 5. Async texture streaming was disabled on NVIDIA GPUs. The whole thing will be redesigned very soon to avoid a lot of problems.
- 6. Handling of compressed 3D textures was rewritten. This fixed some comically broken visuals in some games such as Arcania.
- 7. Fixed an old regression that affected how Vulkan cached shaders. This fixed some games that had good visuals under OpenGL but were completely broken when using Vulkan.
- 8. Added support for AMD FidelityFX Super Resolution. This changes how we approach some problems, such as titles that cannot be upscaled from their rendering resolution (pretty much any title that does SPU postprocessing needs patches).
- 9. Tons of other minor crash fixes and stability improvements.
I'd like to take a moment to highlight the two major tasks in the list that are worth discussing:
Video memory manager rewrite
At a glance this may not seem very important, and that was the general sentiment. After all, in most games a 4GB GPU from 2016 can handle 1440p in RPCS3 without issues. However, we encountered two main problems.
First, some games have burst memory consumption. This happens during level transitions or background streaming situations. RPCS3 defers all operations when using Vulkan and therefore has to keep records of all the data until it has been fully worked on by the host GPU. This means it was possible to easily exhaust the 4GB of available memory if the right conditions were met. With some user testing, we were able to show that we could, in the worst case, fill up even an RTX 3090 GPU with its 24GB of VRAM. This of course is highly inefficient. There is also an influx of low-power gaming handhelds these days that only have 512MB of dedicated memory for the iGPU. This meant that we needed to improve the algorithms to have a 'working set' and a 'non-working set'. We only need to keep the working set in GPU-visible memory; the rest can be kept in system memory. I also added many routines to remove as much unnecessary information as possible from the non-working set to avoid putting too much pressure on the system. With these changes, it is possible to run RPCS3 with decent memory utilization.
Second, we did not take advantage of multiple heaps that provide the same type of access. For example, on AMD and NVIDIA GPUs there is a special 256MB BAR region. We can take advantage of this special memory now, while before it was invisible to us. This meant that, for example, an iGPU with 512MB of reported dedicated VRAM in the BIOS would only show 256MB in RPCS3, which is not enough for PS3 emulation without issues. Now all compatible memory types are pooled together, exposing more usable memory to the games.
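To make the two ideas above more concrete, here is a minimal C++ sketch. It is not RPCS3's actual code: the types `Heap` and `Resource`, the functions `pooled_device_local_bytes` and `spill_to_fit`, and the least-recently-used policy are all assumptions made for illustration. The first function sums every GPU-local heap instead of looking at only one; the second moves the oldest non-working-set resources to system memory until usage fits the budget.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t MiB = 1024ull * 1024ull;

// Hypothetical, simplified view of a memory heap as a graphics API might report it.
struct Heap {
    std::uint64_t size;
    bool device_local;  // true if the heap is GPU-local memory
};

// Hypothetical record of one allocation tracked by the memory manager.
struct Resource {
    std::uint64_t size;
    std::uint64_t last_used_frame;
    bool in_vram;  // false once the resource has been moved to system memory
};

// Pool every GPU-local heap together instead of counting only one of them.
std::uint64_t pooled_device_local_bytes(const std::vector<Heap>& heaps) {
    std::uint64_t total = 0;
    for (const Heap& h : heaps) {
        if (h.device_local) total += h.size;
    }
    return total;
}

// Resources touched in the current frame are treated as the working set and
// always stay in VRAM. Everything else is a spill candidate, oldest first
// (an assumed policy). Returns the number of bytes moved to system memory.
std::uint64_t spill_to_fit(std::vector<Resource>& resources,
                           std::uint64_t budget,
                           std::uint64_t current_frame) {
    std::uint64_t used = 0;
    std::vector<Resource*> candidates;
    for (Resource& r : resources) {
        if (!r.in_vram) continue;
        used += r.size;
        if (r.last_used_frame != current_frame) candidates.push_back(&r);
    }
    std::sort(candidates.begin(), candidates.end(),
              [](const Resource* a, const Resource* b) {
                  return a->last_used_frame < b->last_used_frame;
              });
    std::uint64_t spilled = 0;
    for (Resource* r : candidates) {
        if (used <= budget) break;  // already fits, stop spilling
        r->in_vram = false;
        used -= r->size;
        spilled += r->size;
    }
    return spilled;
}
```

With two 256MB GPU-local heaps, `pooled_device_local_bytes` reports 512MB rather than 256MB. Calling `spill_to_fit` with that figure as the budget degrades performance for the spilled resources instead of failing the allocation, which is the trade-off described above.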
FSR integration
This one was in discussion for a while since the initial binary-only release of FSR in June. Unfortunately, it took a while for the source to be released. FSR in RPCS3 works by closing the gap between your rendering resolution and your game window output resolution. The difference compared to 'dumb' bilinear upscaling is huge in some titles, and especially with MSAA enabled to clean up edges it does work fairly well. The initial integration was mostly done for proof of concept, to just confirm that it works as well as we hoped it would. Optimizations and tuning will follow as we get our hands on more devices to test the code on.
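As a rough illustration of the 'gap' mentioned above, the following C++ sketch computes how much an upscaler has to stretch the rendered image on each axis to fill the output window. It only shows the arithmetic; it is not FSR's algorithm or RPCS3's integration, and the names `Extent`, `Scale`, `upscale_gap` and `needs_upscale` are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper types for this example.
struct Extent { int width; int height; };
struct Scale { double x; double y; };

// Per-axis factor between the rendering resolution and the window output
// resolution. A factor of 1.0 means there is no gap on that axis.
Scale upscale_gap(Extent render, Extent output) {
    assert(render.width > 0 && render.height > 0);
    return {static_cast<double>(output.width) / render.width,
            static_cast<double>(output.height) / render.height};
}

// Upscaling only has work to do when the window is larger than the render target.
bool needs_upscale(Extent render, Extent output) {
    return output.width > render.width || output.height > render.height;
}
```

For instance, a game rendering at 1280x720 shown in a 2560x1440 window has a gap of 2x on each axis, whereas a 1920x1080 render in a 1920x1080 window has none.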
I have some more experiments that I'm still investigating. The primary focus remains bugfixes to improve the user experience, such as eliminating crashes or distracting visual glitches. Unfortunately, newer drivers from both AMD and NVIDIA have thrown a wrench into the works somewhat. It's really not anyone's fault; requirements are just very different between emulators and games, and doing things in an efficient manner means that sometimes we are forced to devise clever shortcuts. I hope to have more information by the next update.
Thank you all for your continued support.
Regards,
kd-11
Comments
hello
Spencer Robinson
2021-08-23 03:58:16 +0000 UTC
Seconded, this project has come a long way! Thanks for the write up!
Jason Guffey
2021-08-17 11:23:53 +0000 UTC
Thanks! That was an interesting read, especially the two highlights.
Covox
2021-08-17 05:58:30 +0000 UTC