Status update from Nekotekina (2020-09-15)
Added 2020-09-16 00:44:35 +0000 UTC
Hello, this is Ivan with a late update for August.
In my previous post I introduced GalCiv and Megamouse. August is no exception: this month I shared the donation with another developer, Whatcookie.
Here are some examples of his work on RPCS3:
- https://github.com/RPCS3/rpcs3/pull/8013 and https://github.com/RPCS3/rpcs3/pull/8131: Optimizes "fused multiply-add" (FMA) instructions when the added value is 0. This pattern is common in PowerPC code, since there is no vectorized multiply instruction without an add. It's also somewhat common in SPU code, despite the SPU supporting a normal multiply instruction (perhaps from code that was hastily ported from PowerPC to SPU?).
- https://github.com/RPCS3/rpcs3/pull/8198: An optimization for the "Floating compare greater than" (FCGT) instruction. "Floating compare" instructions work differently on the SPU than on x86. This PR attempts to find places where a basic x86 comparison produces the same result as on a PS3, without any extra instructions required.
- https://github.com/RPCS3/rpcs3/pull/8316, https://github.com/RPCS3/rpcs3/pull/8338 and https://github.com/RPCS3/rpcs3/pull/8397: A whole host of changes for the FM and FMA family of SPU instructions. These changes led to a 20% performance increase in the Mandelbrot homebrew. They also fixed some physics issues in games such as LittleBigPlanet 2. However, there are still several unexplained regressions after this series.
- https://github.com/RPCS3/rpcs3/pull/8537: Optimizes a very common case where data is transformed by shufb or rotqby family instructions immediately after being loaded. By taking advantage of the fact that some instructions operate in reverse byte order from the PS3, we can avoid the need to swap byte order on the data we just loaded.
- https://github.com/RPCS3/rpcs3/pull/8559: A basic optimization for when the VSEL/SELB instructions are used with a constant mask.
- https://github.com/RPCS3/rpcs3/pull/8704: A basic optimization for the VPERM instruction. AVX-512VBMI adds an instruction which has nearly the same behavior as the PowerPC vector instruction VPERM.
- https://github.com/RPCS3/rpcs3/pull/8712: A very tricky optimization for the SHUFB instruction using a lot of AVX-512 Icelake instructions, including even an instruction intended for cryptography purposes! This optimization was inspired by this article: http://0x80.pl/articles/avx512-galois-field-for-bit-shuffling.html
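To give a feel for the first item in the list above, here is a small sketch of why an FMA with a zero addend can be replaced by a plain multiply. It is only an illustration under assumptions: it uses Python doubles with ordinary IEEE 754 round-to-nearest arithmetic and an unfused `a * b + c`, whereas SPU single-precision arithmetic is not fully IEEE-compliant, and the actual PRs work on the recompiler's generated code and may handle the details differently. The helper names (`bits`, `multiply_add`, `same_result`) are made up for this example.

```python
import struct

def bits(x):
    """Return the raw 64-bit pattern of a Python float (IEEE 754 double)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def multiply_add(a, b, c):
    """Unfused stand-in for FMA: a * b + c (an assumption of this sketch)."""
    return a * b + c

def multiply_only(a, b):
    """The cheaper replacement: just a * b."""
    return a * b

def same_result(a, b, addend):
    """True if dropping the add leaves the result bit-for-bit identical."""
    return bits(multiply_add(a, b, addend)) == bits(multiply_only(a, b))

# Sample operand pairs, including products that come out as -0.0.
samples = [(1.5, 2.0), (-0.0, 3.0), (0.0, -3.0), (1e-200, 1e-200), (-1e-200, 1e-200)]

# Adding -0.0 never changes the product under round-to-nearest.
always_equal_neg_zero = all(same_result(a, b, -0.0) for a, b in samples)

# Adding +0.0 only differs when the product is -0.0 (the sign of zero flips).
mismatches_pos_zero = [(a, b) for a, b in samples if not same_result(a, b, 0.0)]

print(always_equal_neg_zero)
print(mismatches_pos_zero)
```

The only difference shows up in the sign of a zero result when the addend is +0.0, so an optimization like this has to decide whether that sign matters for the code being translated.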
He also added the ability to disable SPU MLAA and wrote a lot of patches to unlock FPS in certain games, such as: Resistance, Resistance 2, Oblivion, Demon's Souls, Drakengard 3, Folklore, Lollipop Chainsaw, Shadows of the Damned, Dark Souls, NieR, Sengoku Basara 4: Sumeragi, Destroy All Humans! Path of the Furon, Army of TWO: The 40th Day, Red Dead Redemption, Asura's Wrath, Unreal Tournament 3.
In the near future he is hoping to improve the accuracy of the PowerPC VREFP and VRSQRTEFP instructions. These instructions provide a vectorized reciprocal estimate and a vectorized reciprocal square root estimate, respectively. The word "estimate" here means that these instructions provide an approximation of the real answer, and have a certain amount of error in their result. Currently the answer that RPCS3 provides is too exact, and some games such as Sly 2 will crash due to this. It's quite tricky to get this right, since the behavior of these instructions is implementation-specific (meaning that different series of PowerPC processors will produce different results) and a lot of research is needed. Hopefully many games will be improved once it's finished.
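As a rough illustration of what "too exact" means, the sketch below compares a correctly rounded single-precision reciprocal with a deliberately less precise estimate obtained by truncating the mantissa. This is not how the real hardware computes its estimate, and it is not the planned RPCS3 implementation: the truncation approach, the `reciprocal_estimate` name and the 12-bit precision are all assumptions picked only to show an estimate that differs from the exact answer.

```python
import struct

def f32_bits(x):
    """Round x to single precision and return its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b):
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def reciprocal_exact(x):
    """Correctly rounded single-precision 1/x (the 'too exact' answer)."""
    return bits_f32(f32_bits(1.0 / x))

def reciprocal_estimate(x, mantissa_bits=12):
    """Hypothetical estimate: keep only the top mantissa_bits of the
    23-bit single-precision mantissa of 1/x (illustrative assumption)."""
    drop = 23 - mantissa_bits
    mask = ~((1 << drop) - 1) & 0xFFFFFFFF
    return bits_f32(f32_bits(1.0 / x) & mask)

x = 3.0
exact = reciprocal_exact(x)
estimate = reciprocal_estimate(x)
relative_error = abs(estimate - 1.0 / x) / (1.0 / x)

print(exact, estimate, relative_error)
```

A game that was tuned against the hardware's particular estimate can behave differently when it is handed the exact value instead, which is why the emulated result needs to match the real processor and not simply be as accurate as possible.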
Regarding my own plans, I will try to take a look at improving filesystem emulation, in particular a more elegant implementation of the PS3's case sensitivity. I will also try to implement some missing SPU events.
Thank you very much.
Comments
You guys are doing God's work, thanks for the updates
mkx471
2020-09-16 11:11:23 +0000 UTC
Thanks for the update. Maybe it's off-topic, but reading about filesystem emulation improvements makes me wonder: is RPCS3 going to support ISO file loading to emulate physical discs at any point in the future? Current support for BD directories is OK, but it causes problems in some scenarios (for example, some file paths are too long, which causes problems with some backup solutions). Switching my PS3 BD directories to ISO files would be a good improvement. Is such an improvement on the roadmap?
Jordi Bosch Creus
2020-09-16 09:41:13 +0000 UTC