Hardware accelerated geometry clipping
While profiling rendering code, I noticed a couple of things:
- Clipping window quads on the CPU takes (way) more time than desired
- The GPU render time is very low, while the CPU render time is somewhat high (Intel users may observe the opposite, but it doesn't matter for our case)
(The chart displays the CPU and the GPU render time over 10000 frames while watching a 60FPS video on YouTube. The spikes in the GPU render time at the beginning correspond to me moving windows with the wobbly windows effect enabled)
(Watching a 60FPS video on YouTube without interacting with any window. The GPU render time is very low, 0.03-0.04ms)
(This chart displays the CPU and the GPU render time while running 50 instances of weston-simple-egl. The GPU render time is still very low despite 50 applications and kwin using the GPU)
Two things negatively impact efficiency: memory allocations and clipping the geometry on the CPU. We used to clip the geometry using the scissor test, but with complex clip regions that meant a draw call had to be issued for every rect in the clip region. Draw calls and graphics pipeline state changes are typically not cheap, which is why we ditched the scissor test in favor of clipping geometry on the CPU.
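To make the CPU-side cost concrete, here is a minimal sketch of what clipping a quad against a region on the CPU amounts to. The `Rect` type and `clipQuadOnCpu` are hypothetical names used for illustration, not KWin's actual code, and real window quads also carry texture coordinates that would have to be adjusted, which this sketch leaves out.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical axis-aligned rectangle; not KWin's actual type.
struct Rect {
    float x1, y1, x2, y2; // left, top, right, bottom
    bool isEmpty() const { return x2 <= x1 || y2 <= y1; }
};

// Intersect one quad with every rect of the clip region on the CPU.
// Each non-empty intersection becomes a separate piece of geometry,
// so the work (and the output vector) grows with the region size.
std::vector<Rect> clipQuadOnCpu(const Rect &quad, const std::vector<Rect> &region)
{
    std::vector<Rect> pieces;
    pieces.reserve(region.size()); // the allocation mentioned above
    for (const Rect &clip : region) {
        const Rect piece{std::max(quad.x1, clip.x1), std::max(quad.y1, clip.y1),
                         std::min(quad.x2, clip.x2), std::min(quad.y2, clip.y2)};
        if (!piece.isEmpty()) {
            pieces.push_back(piece);
        }
    }
    return pieces;
}
```

With the scissor test, the same loop would instead call `glScissor()` and issue one draw call per rect in the region, which is the per-rect overhead described above.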
There is another way to have hardware accelerated geometry clipping that doesn't need the scissor test or multiple draw calls. One could specify user clip planes in the vertex shader and use instanced rendering. gpu-vertex-clip-demo demonstrates such an approach.
The demo renders a red rectangle at position (20, 50), which is clipped with a region that contains three rectangles. The red rectangle is rendered and clipped with a single draw call.
Before moving forward with this, it would be nice to bump the minimum required OpenGL versions, so that instanced rendering and other perks of modern OpenGL can be used without performing any checks:
- OpenGL 2.0 (2004) -> OpenGL 3.3 (2010): so fences, framebuffer objects, instanced rendering, etc. are guaranteed to be available
- OpenGL ES 2.0 (2007) -> OpenGL ES 3.0 (2012)
Hardware accelerated geometry clipping won't fix all the issues, but it should make things a bit better, e.g.
(The CPU render time has dropped quite significantly while running 50 instances of weston-simple-egl)