vrengames



Multithreading for Fun and Performance

Alright, I know I said three days ago that I was finished with my optimization work, but I couldn't leave it at "trust me, it's faster." I wanted to pin down exactly how much the new cropping methods had sped up the drawing process, and to investigate where the new bottleneck was.

Before the cropping update a character image draw would look something like this (roughly, I forgot to keep my logs from before the changes):

Preparing flat character image: 0.4 seconds
Converting flat character to .png: 0.6 seconds
Animating and displaying character: 0.3 seconds

Even without animation enabled this would cause a full-second hitch any time a character was redrawn. With animation enabled, even more time was needed to get the character image ready. The image cropping changes turned it into this:

Preparing flat character image: 0.005 seconds
Converting flat character to .png: 0.6 seconds
Animating and displaying character: 0.1 seconds

The non-animated character image is now so quick to generate that it can be produced before the animated one, so that something is shown to the player as soon as possible. In addition to being faster, this makes the experience feel more responsive, which can be just as important.

A roughly 80-fold speedup in flat character generation and cutting the animation time to a third would have been enough to call the whole thing a success, but I noticed something else while I was checking values: the time taken when converting a character to a .png was dominated by a call to renpy.render, not by draw.screenshot. This is important because screenshot must be done in the main thread of Ren'Py, so it will always block gameplay. render doesn't have the same limitation; in theory I could break it off into a separate thread so that gameplay isn't blocked while the render runs. Once the render was complete, the main thread could take the screenshot (not actually a screenshot, but it uses the same Ren'Py code to convert a character displayable into a .png) and the character image could be displayed.

The potential performance improvements were too good to ignore, and I wanted to jump on this while all the information was fresh in my head. I pulled all of the code up to and including the image render into a separate function that could be run in a thread separate from the main one. When that thread is finished it hands the render to the main thread and asks for that render to be screenshotted. I experimented with also handing the code after this point to a separate thread, but the overhead of starting a new thread ate up all of the time savings. There was some extra work needed to make sure old character images weren't drawn if a new one had already been requested (which happens often if you're skipping through a lot of text) and to make the clothing removal animations work, but that's all done and I can report even more speed improvements! With the new multithreaded approach the process looks more like this:
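For anyone curious about the shape of that hand-off, here is a minimal sketch in plain Python. It is not the game's actual code and calls no Ren'Py API: the class name CharacterDrawer, the render_fn and finish_fn callbacks, and the poll method are all hypothetical stand-ins. The slow render runs in a worker thread, the result is passed back through a queue, and the main thread throws away any result that belongs to an outdated request before doing the main-thread-only step.

```python
import queue
import threading
import time


class CharacterDrawer:
    """Hypothetical helper: render off-thread, finish on the main thread."""

    def __init__(self, render_fn, finish_fn):
        self._render_fn = render_fn    # slow step, safe to run in a worker
        self._finish_fn = finish_fn    # main-thread-only step (the "screenshot")
        self._results = queue.Queue()  # thread-safe hand-off to the main thread
        self._latest_request = 0       # only read and written by the main thread

    def request_draw(self, character):
        """Called from the main thread; returns immediately."""
        self._latest_request += 1
        request_id = self._latest_request
        worker = threading.Thread(
            target=self._worker, args=(request_id, character), daemon=True
        )
        worker.start()
        return request_id

    def _worker(self, request_id, character):
        # Runs in the worker thread; never touches main-thread state.
        rendered = self._render_fn(character)
        self._results.put((request_id, rendered))

    def poll(self):
        """Called from the main thread, e.g. once per frame."""
        finished = []
        while True:
            try:
                request_id, rendered = self._results.get_nowait()
            except queue.Empty:
                break
            if request_id != self._latest_request:
                continue  # a newer draw was requested; drop the stale render
            finished.append(self._finish_fn(rendered))
        return finished


def demo():
    # Stand-ins for the real render and screenshot steps.
    def slow_render(character):
        time.sleep(0.05)
        return "render:" + character

    def screenshot(rendered):
        return rendered.replace("render:", "png:")

    drawer = CharacterDrawer(slow_render, screenshot)
    drawer.request_draw("alice_v1")  # superseded before it finishes
    drawer.request_draw("alice_v2")

    shown = []
    deadline = time.monotonic() + 2.0
    while not shown and time.monotonic() < deadline:
        shown.extend(drawer.poll())  # the main loop stays free between polls
        time.sleep(0.01)
    return shown
```

Calling demo() requests two draws back to back and only the second one is ever finished and shown, which mirrors the "skipping through a lot of text" case. Passing the request number into the worker, rather than having the worker read shared state, keeps all bookkeeping on the main thread.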

Preparing the flat image: 0.005 seconds
Rendering the character image: 0.65 seconds (threaded, non-blocking)
Screenshotting the character: 0.1 seconds
Animating and displaying the character: 0.1 seconds

Now when you're playing the game the character will be drawn as a static image right away and you will be able to continue doing stuff. When the rendering is done, after about 0.6 seconds, the static image will be replaced and begin animating. Despite the whole process taking slightly longer due to the threading overhead, the responsiveness of the system (how long it takes to go from action to response) has been improved massively. As an added bonus, the image cropping changes have let me re-enable image caching and prediction, which has sped things up even further.
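To put rough numbers on "slightly longer overall, much more responsive", the snippet below adds up the stage timings listed above. The one assumption, marked in the code, is that every stage except the threaded render blocks the main thread; the stage names and the helper functions are just illustrative.

```python
# Timings in seconds, copied from the rough figures in this post.
# Each entry is (stage, seconds, blocks_main_thread).
# Assumption: only the threaded render is non-blocking.
cropped = [
    ("prepare flat image", 0.005, True),
    ("convert to .png", 0.6, True),
    ("animate and display", 0.1, True),
]
threaded = [
    ("prepare flat image", 0.005, True),
    ("render character image", 0.65, False),
    ("screenshot", 0.1, True),
    ("animate and display", 0.1, True),
]


def total_time(stages):
    """Wall-clock time from request to animated character."""
    return sum(seconds for _, seconds, _ in stages)


def blocking_time(stages):
    """Time during which the main thread is busy."""
    return sum(seconds for _, seconds, blocks in stages if blocks)


for name, stages in (("cropped", cropped), ("threaded", threaded)):
    print(f"{name}: total {total_time(stages):.3f}s, "
          f"blocking {blocking_time(stages):.3f}s")
```

Under that assumption the total goes up from about 0.7 to about 0.85 seconds, while the time the game is actually held up drops from about 0.7 to about 0.2 seconds.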

Any time multithreading is involved there's the risk of some very difficult-to-diagnose bugs, but so far the system seems to be stable and playing nicely with the save/load system. I will have to do some longer game tests to make sure this stability continues. Now I can get on to writing the new content for v0.30. As a teaser, it will be focused on the university location!

Comments

Why not just render-to-texture in memory and BitBlt (using gpu) to the output texture? no longer have to worry about png format or any of that nonsense. Just blit the render directly to the screen. (Blitting is usually done on the order of ms/ns) Though I'm not sure renpy, or python in general, has the ability to do lowlevel api calls (framebuffer blitting) like that.

Mash

I'm not actually sure why he's using floats(that looks like a double) at all- integer math is almost always faster, and avoids that entire issue.

Mash

feature request: format the value displayed for trait research to integer with int(**current research value**) to avoid showing hostile numbers like 1.000000000000001

random patreon

