scenes/opengl: Implement hardware-accelerated geometry clipping
While profiling the render path, I noticed two things:
(a) clipping geometry on the CPU side takes more time than desired; (b) the GPU needs far less time to execute rendering commands than the CPU needs to record them.
We used to clip geometry with the scissor test, even for complex clip regions. Because that required multiple draw calls and pipeline state changes, each adding overhead, it was replaced with clipping the geometry on the CPU side.
Fortunately, hardware-accelerated geometry clipping is still possible. It can be achieved by combining user clip planes with instanced rendering. With this approach, an item tree is rendered and clipped with a single draw call.
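To illustrate the idea, here is a minimal sketch of how a clip rectangle can be turned into four user clip planes. It is not the code from this change: Rect, Plane, clipPlanesForRect(), clipDistance() and isInside() are hypothetical names, and the sketch assumes one instance per clip rectangle with clipping done in 2D. In a real pipeline the vertex shader would write the plane distances to gl_ClipDistance[i], with GL_CLIP_DISTANCEi enabled, and the GPU would discard everything where any distance is negative. The function below evaluates the same expression on the CPU so the math can be checked.

```cpp
#include <array>

// Hypothetical types for illustration only; not the classes used by the scene.
struct Rect { float left, top, right, bottom; };

// A 2D plane a*x + b*y + c. A point is on the visible side when the value is >= 0.
struct Plane { float a, b, c; };

// Build the four clip planes that bound one clip rectangle. With instanced
// rendering, each instance would be given the planes of one rectangle.
std::array<Plane, 4> clipPlanesForRect(const Rect &r)
{
    return {{
        { 1.0f,  0.0f, -r.left},   // x >= left
        {-1.0f,  0.0f,  r.right},  // x <= right
        { 0.0f,  1.0f, -r.top},    // y >= top
        { 0.0f, -1.0f,  r.bottom}, // y <= bottom
    }};
}

// CPU-side equivalent of the value a vertex shader would store in
// gl_ClipDistance[i] for the vertex at (x, y).
float clipDistance(const Plane &p, float x, float y)
{
    return p.a * x + p.b * y + p.c;
}

// A point survives clipping only if every clip distance is non-negative.
bool isInside(const Rect &r, float x, float y)
{
    for (const Plane &p : clipPlanesForRect(r)) {
        if (clipDistance(p, x, y) < 0.0f) {
            return false;
        }
    }
    return true;
}
```

For a clip rectangle spanning (0, 0) to (10, 10), isInside() accepts the point (5, 5) and rejects (11, 5). The hardware applies the same test, except that it clips each primitive against the planes rather than testing single points.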
This change is not a magic wand that fixes every issue in the rendering hot path. There are still too many heap allocations that kill performance, but this should make things a little bit better.