
Improve unprintable characters paste filter

The old code also filtered out characters from the Private Use Area and surrogates (used to represent code points above U+FFFF, e.g. emoji). In the worst case, when pasted, they appear as <?>.

The filter is now limited to C0 and C1 control characters (plus DEL, U+007F), except TAB, CR, and LF.
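The rule can be sketched as a small predicate. This is a minimal Python illustration, not the code from this merge request: the names `ALLOWED_CONTROLS` and `is_filtered` are hypothetical, and treating U+007F (DEL) as filtered is an assumption based on the expected result listed in the test section.

```python
# Hypothetical sketch of the filtering rule; not the actual Konsole code.
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}  # TAB, LF, CR are always kept


def is_filtered(ch: str) -> bool:
    """Return True if a single character would be offered for removal."""
    cp = ord(ch)
    if cp in ALLOWED_CONTROLS:
        return False
    is_c0 = cp <= 0x1F            # C0 controls: U+0000..U+001F
    is_del = cp == 0x7F           # assumption: DEL is treated like a C0 control
    is_c1 = 0x80 <= cp <= 0x9F    # C1 controls: U+0080..U+009F
    return is_c0 or is_del or is_c1
```

Under this rule, Private Use Area characters and code points above U+FFFF (such as 🐈) fall outside the filtered ranges and pass through unchanged.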

Some messages, button text, and character descriptions were changed. I've aligned the code point, ctrl sequence, and description in the list into columns using \t. The columns are a bit wide, but the list is much clearer than before. I think the current button captions are more understandable.
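For illustration, a tab-separated row of that list might be built as follows. This is only a sketch: `format_row` is a hypothetical helper, and the descriptions used here are standard control character names, not necessarily the strings used in the dialog.

```python
# Hypothetical helper showing the three tab-separated columns.
def format_row(code_point: int, ctrl_sequence: str, description: str) -> str:
    """Join code point, ctrl sequence, and description with tab characters."""
    return f"U+{code_point:04X}\t{ctrl_sequence}\t{description}"


# Illustrative rows; the real descriptions may differ.
rows = [
    format_row(0x01, "^A", "Start of Heading"),
    format_row(0x1B, "^[", "Escape"),
]
```

Because each field is separated by a single tab, the widget's tab stops determine the column width, which is why the columns can end up wider than strictly necessary.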

Screenshot_20191118_030154

Note that the horizontal scrollbar is probably caused by a bug in KMessageBox. The window adjusts its size to its contents, but it is always a few pixels too small.

Test:

  • Copy the example text to the clipboard using the following command. xclip is used to copy the data because select + copy won't work.
    echo -e "$(printf '\\u00%02x' $(seq 1 159)) 🐈" | xclip -selection clipboard
  • Run cat
  • Paste clipboard contents

Result:

The filter should list code points U+0001..U+001F (except U+0009 = \t, U+000A = \n, U+000D = \r), U+007F, and U+0080..U+009F. After you accept filtering out the control characters, you should get all printable ASCII characters (from space to tilde) and a cat emoji.
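The expected result can be double-checked with a short simulation. This is a sketch under assumptions, not the Konsole implementation: `is_control`, `pasted`, `reported`, and `filtered` are hypothetical names, and the trailing newline that echo appends is left out.

```python
# Sketch only: emulates the test input and the expected post-filter text.
KEPT_CONTROLS = {0x09, 0x0A, 0x0D}  # TAB, LF, CR


def is_control(cp: int) -> bool:
    """C0 (U+0000..U+001F), DEL (U+007F), and C1 (U+0080..U+009F), minus kept ones."""
    return (cp <= 0x1F or 0x7F <= cp <= 0x9F) and cp not in KEPT_CONTROLS


# Same content as the shell command: U+0001..U+009F, a space, and a cat emoji.
pasted = "".join(chr(cp) for cp in range(1, 160)) + " \U0001F408"

# Code points the filter dialog would list.
reported = [f"U+{ord(c):04X}" for c in pasted if is_control(ord(c))]

# Text that remains after the control characters are removed.
filtered = "".join(c for c in pasted if not is_control(ord(c)))
```

In this sketch, `filtered` also still contains TAB, LF, and CR, since those three are deliberately kept; they are simply not visible as characters in the cat output.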
