OpenGL: parallelize tile conversion

During testing of the RAW plugin, I discovered that the time to first render (FCP) of a BGRA16U 8k x 5k canvas was abysmally high: ~9 seconds (+ widget overhead). This was because the canvas, although tiled, was converted sequentially on a single thread.

The simplest speedup, barring SIMD, is to take advantage of the tiles being independent and convert them in parallel. This can be done with std::for_each(std::execution::par) (unsupported in Clang with libc++) or with QtConcurrent::blockingMap. We follow the latter here since it's readily available.
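To illustrate the idea outside of Krita, here is a minimal sketch of converting independent tiles in parallel. It is not the code in this merge request: `Tile`, `convertTile` and `convertTilesParallel` are hypothetical names, the "conversion" is a plain 16-bit to 8-bit narrowing rather than a real color space transform, and plain `std::thread` workers stand in for QtConcurrent so the snippet builds without Qt.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical tile: interleaved BGRA samples, 16 bits per channel in,
// 8 bits per channel out. Not Krita's actual tile type.
struct Tile {
    std::vector<std::uint16_t> src;
    std::vector<std::uint8_t> dst;
};

// Stand-in for the real color space conversion: keep the high byte of
// every 16-bit channel. Touches only its own tile, so calls on different
// tiles can run concurrently without locking.
void convertTile(Tile &tile)
{
    assert(tile.src.size() % 4 == 0); // 4 channels (BGRA) per pixel
    tile.dst.resize(tile.src.size());
    for (std::size_t i = 0; i < tile.src.size(); ++i) {
        tile.dst[i] = static_cast<std::uint8_t>(tile.src[i] >> 8);
    }
}

// Convert every tile, spreading the work over the available cores.
// Blocks until all tiles are done.
void convertTilesParallel(std::vector<Tile> &tiles)
{
    const unsigned hw = std::thread::hardware_concurrency(); // may be 0
    const std::size_t workerCount =
        std::min<std::size_t>(tiles.size(), hw == 0 ? 1 : hw);

    // Each worker claims the next unconverted tile index until none remain.
    std::atomic<std::size_t> next{0};
    auto worker = [&tiles, &next]() {
        for (std::size_t i = next.fetch_add(1); i < tiles.size();
             i = next.fetch_add(1)) {
            convertTile(tiles[i]);
        }
    };

    std::vector<std::thread> workers;
    workers.reserve(workerCount);
    for (std::size_t w = 0; w < workerCount; ++w) {
        workers.emplace_back(worker);
    }
    for (std::thread &t : workers) {
        t.join();
    }
}
```

With Qt available, the body of `convertTilesParallel` would presumably reduce to a single `QtConcurrent::blockingMap(tiles, convertTile);` call, which likewise modifies the sequence in place and returns only once every item has been processed.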

This brings the FCP down to less than one second.

Before: Captura_de_pantalla_2022-12-27_234713
After: Captura_de_pantalla_2022-12-27_234929

Test Plan

Build Krita. Check that the canvas's color space conversion still renders correctly.

Formalities Checklist

  • I confirmed this builds.
  • I confirmed Krita ran and the relevant functions work.
  • I tested the relevant unit tests and can confirm they are not broken. (If not possible, don't hesitate to ask for help!)
  • I made sure my commits build individually and have good descriptions as per KDE guidelines.
  • I made sure my code conforms to the standards set in the HACKING file.
  • I can confirm the code is licensed and attributed appropriately, and that unattributed code is mine, as per KDE Licensing Policy.
