win: Set activeCodePage to UTF-8
This should enable impex plugins of all file formats to access files containing Unicode characters in their paths on Windows.

Historically, Windows has used the ANSI code page (ACP) as the encoding for `char` strings, which can only represent a limited range of characters. Windows also provides wide versions of its APIs that take `wchar_t` strings, where `wchar_t` is a 2-byte type holding UTF-16 (or UCS-2 in some cases).

For libraries to support Unicode filenames on Windows, they have to go out of their way to implement it with the wide API. They don't do it consistently either: some choose to implement wide variants of their API, while others choose to interpret `char *` paths as UTF-8 (which leads to confusion when the caller assumes the API takes the local 8-bit char encoding).

Setting activeCodePage to UTF-8 changes the code page of our process to UTF-8. Effectively, all -A variants of WinAPI calls now accept UTF-8 strings instead of strings in the system ACP. By extension, the non-wide C and C++ functions for accessing files will now also accept UTF-8 file paths.

With regard to the impex plugins, this changes their behaviour around file paths:

* If the external library already accepts `wchar_t *`, there should be no change in behaviour.
* If the external library accepts `char *` and treats it as UTF-8:
  * If we correctly use `QString::toUtf8()`, there should be no change in behaviour.
  * If we use `QString::toLocal8Bit()` or `QFile::encodeName()` by mistake, having activeCodePage set to UTF-8 renders it a non-issue.
* If the external library accepts `char *` and uses C or C++ library functions to open it directly:
  * If we correctly use `QString::toLocal8Bit()` or `QFile::encodeName()`, the library would not have been able to open files with names containing Unicode chars outside of the system ACP in the past, but will now be able to do so.
  * If we use `QString::toUtf8()` by mistake, having activeCodePage set to UTF-8 renders it a non-issue.

As illustrated above, the result is a net improvement.

Potential side effect: if a Python plugin expects to use the system ACP when interacting with an external process via IPC, the encodings on the two sides can become mismatched.

Note that this only works starting from Windows 10 Version 1903.

Reference: https://docs.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

CCMAIL: kimageshop@kde.org
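
For illustration only (this is not part of the change itself): one rough way to confirm at runtime that the setting took effect is to compare the process's active code page against 65001, the identifier Windows uses for UTF-8. The helper names `isUtf8CodePage` and `processUsesUtf8CodePage` below are hypothetical, and the non-Windows branch is an assumption made so the sketch builds everywhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Code page identifier Windows assigns to UTF-8 (CP_UTF8 in <windows.h>).
constexpr unsigned int kUtf8CodePage = 65001;

// Hypothetical helper: true if the given code page identifier is UTF-8.
constexpr bool isUtf8CodePage(unsigned int codePage)
{
    return codePage == kUtf8CodePage;
}

// Hypothetical helper: true if the -A WinAPI variants in this process
// currently interpret char strings as UTF-8.
bool processUsesUtf8CodePage()
{
#ifdef _WIN32
    // GetACP() reports the active ANSI code page of the current process.
    return isUtf8CodePage(GetACP());
#else
    // Assumption for this sketch: non-Windows platforms are treated as UTF-8.
    return true;
#endif
}
```

On Windows versions older than 10 Version 1903, where the setting has no effect, `processUsesUtf8CodePage()` would be expected to return false unless the system ACP itself is UTF-8, so a check like this could be used to log a warning at startup.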
parent
42d78a74