fuzzy repaint damage events (!4080) · Merge requests · Plasma / KWin

Matthias Dahl requested to merge matthiasdahl/kwin:fuzzy-repaint into master May 09, 2023

Nate asked me in bug #465158 to re-post my findings and suggestion that I initially posted to !3236 (closed) to hopefully kickstart a discussion. So, here we go (and the context is graphical glitches with fractional scaling and non-integer scale factors on Wayland):

For the past 4+ weeks, I spent the majority of my time trying to debug and fix this issue. I should qualify this with the information that for the past 8+ years, I have been fighting a serious autoimmune disease, so "the majority of my time" is definitely not 8 hours a day, more like the good hours of my day... which are anything from a few to zero.

I learnt a lot about kwin and Wayland, which was really nice. So I wanted to take the opportunity to share my thoughts on the matter, in the hope that together we can come up with a solution and fix this for good. Please, in no way is this meant as a "brainy smurf" kind of post. I am by no means an expert on kwin and I just want to offer my humble thoughts and help. And oh my, have I gained further appreciation for the work you guys do... debugging kwin is really not an easy thing to do.

The way fractional scaling and the logical <-> device coordinate conversions are implemented in kwin introduces quite a few calculation errors: the conversion isn't done once, at the very last minute, but at various places, partly followed by further calculations, conversions and rounding that accumulate. Pixel snapping isn't done uniformly all over the place either.

With damage regions that are just 1 logical pixel in height, everything depends on a pixel-perfect (replicable) mapping from logical to device coordinates that really matches the damaged region in device coordinates. Due to rounding effects and other variations, that is extremely hard (impossible?) to do with fractional scaling.
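To make the rounding problem concrete, here is a small standalone sketch. It is not kwin code: `Span`, `snapNearest`, `snapOutward` and `snapPositionAndSize` are hypothetical names, and the three snapping schemes are only examples of what different code paths might do. It maps a 1 logical pixel high row to device rows at a fractional scale factor.

```cpp
#include <cassert>
#include <cmath>

// Half-open range [begin, end) of device pixel rows. Hypothetical type.
struct Span {
    int begin;
    int end;
};

// Scale one logical coordinate to (unsnapped) device space.
double toDevice(double logical, double scale)
{
    assert(scale > 0.0);
    return logical * scale;
}

// Scheme A: snap both edges to the nearest device pixel.
Span snapNearest(double logicalBegin, double logicalEnd, double scale)
{
    return {static_cast<int>(std::lround(toDevice(logicalBegin, scale))),
            static_cast<int>(std::lround(toDevice(logicalEnd, scale)))};
}

// Scheme B: snap outward, so every partially covered device row is included.
Span snapOutward(double logicalBegin, double logicalEnd, double scale)
{
    return {static_cast<int>(std::floor(toDevice(logicalBegin, scale))),
            static_cast<int>(std::ceil(toDevice(logicalEnd, scale)))};
}

// Scheme C: round position and size separately, then add them up.
// The two rounding errors can accumulate in the far edge.
Span snapPositionAndSize(double logicalPos, double logicalSize, double scale)
{
    const int begin = static_cast<int>(std::lround(toDevice(logicalPos, scale)));
    const int size = static_cast<int>(std::lround(toDevice(logicalSize, scale)));
    return {begin, begin + size};
}
```

At scale 1.25, the logical row [3, 4) covers device coordinates [3.75, 5.0). Scheme A yields device rows [4, 5), while scheme B yields [3, 5). At scale 1.5, the logical row [1, 2) becomes [2, 3) with scheme A but [2, 4) with scheme C. If one place in the pipeline uses one scheme and another place uses a different one, the repainted rows no longer match the rows that were actually touched.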

I tried several pixel snapping (rounding) schemes, along with unifying those conversions all over the place. I debugged what kind of damage comes in and how it is dealt with, trying to figure out if something skews it along the way. I also tried doing the clipping in logical space (which was a real pita), but at the end of the day, nothing really fixed the graphical glitches. Some experiments improved the situation, others introduced a whole set of new problems and glitches.

I think, instead of hopelessly trying to perfect the conversion and pixel snapping, we should look into the following approaches:

do a full repaint with fractional scale factors
redraw the complete window, not just the damaged parts
stay in logical coordinates till the very last minute and only convert once for presentation, reducing any accumulative errors
do fuzzy repaint

A full repaint is the most expensive approach but it will fix all graphical glitches once and for all. This will always work.

Instead of clipping and redrawing only the damaged parts of a window, we always redraw the entire window. This is less expensive than the full repaint but it still has some potential for glitches and we have quite a few special cases to consider (e.g. how to deal with the damage to the root window w/o a full repaint).

I tried to delay the conversion from logical to device coordinates but that was really painful and didn't work out. It might still lead to a proper solution, but I did not invest enough time to rewire everything and get it working. Thus I cannot really say if it is truly something that will work.

The most promising approach, imho, is the fuzzy repaint. With fractional scale factors, we extend the damaged region by a certain amount in all directions and let that pass through normally. This is the least expensive approach. I have been running a hackish PoC of this for the past two days and it fixed all graphical glitches for me (and there were quite a few all over the place).
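Roughly, the idea can be sketched like this. This is a standalone illustration and not the PoC itself: `Rect`, `isFractionalScale`, `grown` and `fuzzyDamage` are made-up names, and whether the margin is counted in logical or device pixels is left open here.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Minimal rectangle type for the sketch (hypothetical, not a Qt or kwin type).
struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// True for non-integer scale factors such as 1.25 or 1.5.
bool isFractionalScale(double scale)
{
    return std::floor(scale) != scale;
}

// Grow a rectangle by `margin` pixels in all four directions.
Rect grown(const Rect &r, int margin)
{
    return {r.x - margin, r.y - margin, r.width + 2 * margin, r.height + 2 * margin};
}

// Extend every damage rectangle when the scale factor is fractional;
// with integer scale factors the damage passes through untouched.
std::vector<Rect> fuzzyDamage(const std::vector<Rect> &damage, double scale, int margin = 4)
{
    assert(margin >= 0);
    if (!isFractionalScale(scale)) {
        return damage;
    }
    std::vector<Rect> extended;
    extended.reserve(damage.size());
    for (const Rect &r : damage) {
        extended.push_back(grown(r, margin));
    }
    return extended;
}
```

For example, a damage rectangle of (10, 20, 30x1) at scale 1.25 with a margin of 4 becomes (6, 16, 38x9), while at scale 2.0 it stays as it is. Everything downstream (clipping, painting) would then simply see the larger region.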

The questions we would need to answer for the fuzzy repaint approach are:

Where to extend the damaged region? In my PoC, I do it too early (it also affects the qpainter based scene) but it was the easiest and fastest way to make sure everyone later on in the call chain sees the right damage region.
Should the number of pixels always be the same or somehow depend on the scale factor (or other things)?
What safeguards/optimizations need to be implemented? Right now, I unconditionally add 4 pixels in all directions. This is obviously not okay. At the very least, we need to make sure not to cross window boundaries. Also, if the extended damage covers all of the window, we can issue a full repaint of the window and thus bypass the clipping later on altogether.
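The two safeguards from the last question could look roughly like the following standalone sketch. Again, this is not kwin code: `Rect`, `FuzzyResult` and `extendWithinWindow` are hypothetical names, and it assumes damage and window geometry are given in the same coordinate space.

```cpp
#include <algorithm>
#include <cassert>

// Minimal rectangle type for the sketch (hypothetical, not a Qt or kwin type).
struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// Result of extending one damage rectangle (hypothetical type).
struct FuzzyResult {
    Rect rect;       // extended damage, clipped to the window
    bool fullWindow; // true if the result covers the whole window
};

// Extend `damage` by `margin` pixels in all directions without crossing the
// window boundaries, and report when the result covers the entire window so
// the caller can issue a full window repaint and skip the clipping.
FuzzyResult extendWithinWindow(const Rect &damage, const Rect &window, int margin)
{
    assert(margin >= 0);
    if (damage.width <= 0 || damage.height <= 0) {
        return {Rect{0, 0, 0, 0}, false}; // empty damage stays empty
    }
    const int left = std::max(damage.x - margin, window.x);
    const int top = std::max(damage.y - margin, window.y);
    const int right = std::min(damage.x + damage.width + margin, window.x + window.width);
    const int bottom = std::min(damage.y + damage.height + margin, window.y + window.height);
    if (right <= left || bottom <= top) {
        return {Rect{0, 0, 0, 0}, false}; // damage lies outside the window
    }
    const Rect clipped{left, top, right - left, bottom - top};
    const bool full = clipped.x == window.x && clipped.y == window.y
        && clipped.width == window.width && clipped.height == window.height;
    return {clipped, full};
}
```

With a 100x50 window at the origin and a margin of 4, damage of (10, 20, 30x1) becomes (6, 16, 38x9), damage of (0, 0, 100x1) is clipped to (0, 0, 100x5), and damage of (2, 3, 97x45) ends up covering the whole window, so `fullWindow` is set. How the margin should depend on the scale factor is still the open question from above, so it is just a parameter here.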

Enough wall of text. :-) I am really looking forward to all of your thoughts and comments.
