Skip to content

EncryptedBodyPartFormatter::process(): fix use of encoding on decrypting

Motivation for this patch is the messageviewer_messageparttheme_rendertest unit test failing over seemingly an encoding issue:

--- "/home/jenkins/workspace/Applications/messagelib/stable-kf5-qt5 SUSEQt5.15/mimetreeparser/autotests/data/openpgp-encrypted-no-text-attachment.mbox.html"	2021-08-22 10:00:53.385046929 +0000
+++ openpgp-encrypted-no-text-attachment.mbox.out.html	2021-08-22 10:05:05.590052361 +0000
@@ -19,7 +19,7 @@
                   <a name="att"/>
                   <div id="attachmentDiv">
                     <div class="noquote">
-                      <div dir="ltr">qwertzuiopü</div>
+                      <div dir="ltr">qwertzuiop�</div>
                     </div>
                   </div>
                 </div>
FAIL!  : RenderTest::testRenderSmart(openpgp-encrypted-no-text-attachment.mbox) Compared values are not the same
   Actual   (proc.exitCode()): 1
   Expected (0)              : 0
   Loc: [/home/jenkins/workspace/Applications/messagelib/stable-kf5-qt5 SUSEQt5.15/messageviewer/src/messagepartthemes/default/autotests/util.cpp(87)]```

It hardcoded "utf-8" as it is the best guess for encrypted data (gnupg suggests to use only UTF-8 data). We allow an override via NodeHelper::setOverrideCodec(), as the user may want to show one email with a different encoding.

Also only have looked at the unit test so far. No experience with doing PGP things in KMail, so cannot test there. Anyone known who makes use of PGP encrypted attachments that end up as Content-Type: text/PGP; and thus would run into this code?

This code is used, when someone sends an attachment (if it is an text part inside, than this text is displayed). It is a rare corner case, that is only used, because you don't have a proper MUA handling PGP. I only get this, if people use PGP on the command line and than send the output as attached file.

Open questions (from @kossebau):

So my current question is: is the value KMime has set as default encoding backed by actual specification? Or is it just an artifact having survived from the intial code addition 20 years ago in https://invent.kde.org/pim/kmime/-/blob/cd9ac57a36369652fd9b874b650fe84a483bc976/kmime_headers.h ? :)

Even more when seeing how KMime::Content::parse() default to the "US-ASCII" charset instead on multiple occasions?

And should it be uppercasing or lowercasing that charset ids should standadize on? Different places in the code go for different casing (which on changing throws up tests over casing mismatch, bah :) ).

Currently investigating what the sideeffects are when changing KMime::Content::defaultCharset() from "ISO-8859-1" to "US-ASCII". There seem some things to rely on non-US-ASCII chars to be handled, at least in tests...

@kloecker

Edited by Sandro Knauß

Merge request reports

Loading