How to get two-sequence representation of UTF-8 character using JavaMail's MimeUtility or Apache Commons and...




















I have a string that contains the German character ü. Its Unicode code point is U+00FC, but its UTF-8 encoding is the two bytes 0xC3 0xBC, so its quoted-printable sequence should be =C3=BC rather than =FC. However, using JavaMail's MimeUtility as shown below, I only get the single-sequence representation.



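String s = "Für";
ByteArrayOutputStream baos = new ByteArrayOutputStream ();
OutputStream encodedOut = MimeUtility.encode (baos, "quoted-printable");

// Encode the UTF-8 bytes of the string as quoted-printable
encodedOut.write (s.getBytes (StandardCharsets.UTF_8));
encodedOut.close (); // close the encoder so that all output reaches baos
String encoded = baos.toString (); // observed: F=FCr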


(Defining StandardCharsets.US_ASCII instead of UTF_8 resulted in F?r, which is - obviously - not what I want.)
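
To make the byte-level difference visible without any mail library, here is a minimal sketch of a quoted-printable encoder. The class QpSketch and its encode method are hypothetical helpers, not part of JavaMail or Commons Codec, and the sketch ignores the line-length and trailing-whitespace rules of the real encoding. It suggests that F=FCr is what the ISO-8859-1 bytes of the string would produce, while the UTF-8 bytes give the two-sequence form.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not a JavaMail or Commons Codec API.
class QpSketch {
    private static final String HEX = "0123456789ABCDEF";

    // Simplified quoted-printable: printable ASCII (except '=') passes through,
    // every other byte becomes =XX. Soft line breaks are not handled.
    static String encode(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int v = b & 0xFF;
            if (v >= 32 && v <= 126 && v != '=') {
                sb.append((char) v);
            } else {
                sb.append('=').append(HEX.charAt(v >> 4)).append(HEX.charAt(v & 0x0F));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Für" written with an escape so the source file encoding cannot interfere
        String s = "F\u00FCr";
        System.out.println(encode(s.getBytes(StandardCharsets.UTF_8)));      // F=C3=BCr
        System.out.println(encode(s.getBytes(StandardCharsets.ISO_8859_1))); // F=FCr
        System.out.println(encode(s.getBytes(StandardCharsets.US_ASCII)));   // F?r
    }
}
```

Writing the literal as "F\u00FCr" is a deliberate choice here: it rules out a mismatch between the encoding the source file was saved in and the encoding the compiler assumes.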



I have also already taken a look into Apache Commons' QuotedPrintableCodec, which I used like this:



String s = "Für";
QuotedPrintableCodec qpc = new QuotedPrintableCodec ();
String encoded = qpc.encode (s, StandardCharsets.UTF_8);


However, this resulted in F=EF=BF=BDr, which resembles what Java's URLEncoder produces (F%EF%BF%BDr, with % instead of = as the escape character), and which I don't understand.
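
For context, EF BF BD is the UTF-8 encoding of U+FFFD, the Unicode replacement character, which decoders substitute for bytes they cannot interpret. The sketch below uses only the JDK; the class and method names are made up for illustration. It shows one plausible way such a character can end up in a string: the single byte 0xFC (ü in ISO-8859-1) is decoded as if it were UTF-8. Whether that is what happened in this case is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class ReplacementCharSketch {
    // Formats bytes as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes the ISO-8859-1 bytes of "Für" (46 FC 72) as if they were UTF-8.
    static String misdecode() {
        byte[] latin1 = {0x46, (byte) 0xFC, 0x72};
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String damaged = misdecode();
        // 0xFC is not valid UTF-8 here, so the decoder substitutes U+FFFD
        System.out.println(damaged.equals("F\uFFFDr"));                    // true
        System.out.println(hex(damaged.getBytes(StandardCharsets.UTF_8))); // 46 EF BF BD 72
    }
}
```

Once the replacement character is in the string, the original ü is gone, so any encoder will faithfully emit =EF=BF=BD for it.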



I'm getting the string from a JavaMail MimeMessage using a ByteArrayOutputStream like so:



ByteArrayOutputStream baos = new ByteArrayOutputStream ();
message.writeTo (baos);
String s = baos.toString ();


On the initial store procedure, I receive a string containing a literal � (whose correct quoted-printable sequence seems to be =EF=BF=BD) instead of the ü. However, on any subsequent request Thunderbird makes (e.g. copying to Sent), I receive the correct ü. Is that something I can fix?
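
The no-argument ByteArrayOutputStream.toString() decodes with the platform's default charset, so the result can differ between machines. A minimal sketch of decoding with an explicit charset follows, assuming the buffered bytes really are UTF-8 text; the class and method names are hypothetical. It does not address a message whose bytes already contain EF BF BD when they arrive.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class ExplicitCharsetSketch {
    // Decodes the buffered bytes as UTF-8 instead of the platform default charset.
    static String decodeUtf8(ByteArrayOutputStream baos) {
        return new String(baos.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The UTF-8 bytes of "Für": 46 C3 BC 72
        byte[] utf8 = {0x46, (byte) 0xC3, (byte) 0xBC, 0x72};
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        baos.write(utf8, 0, utf8.length);

        System.out.println(decodeUtf8(baos).equals("F\u00FCr")); // true
    }
}
```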



What I would like to receive is the two-sequence representation as required by IMAP and the respective mail clients. How would I go about that?



































  • I get F=C3=BCr. It shouldn’t be possible for getBytes(UTF_8) to produce a standalone FC byte.

    – VGR
    Nov 24 '18 at 18:44













  • @VGR Oh yes. That's right. However, I get a strange string when receiving from a MimeMessage; I updated my question.

    – Alexander Leithner
    Nov 25 '18 at 10:41











  • baos.toString() is most likely your problem, as it uses the platform’s charset to decode the bytes. I would use something like String s = (String) message.getContent(), which will take the message’s charset into account.

    – VGR
    Nov 25 '18 at 15:58











  • Well, sadly that doesn't help, since the MimeMessage returns a string which literally contains �, although the message's Content-Type header is set to text/plain; charset=utf-8, which should work just fine. (But the subsequent requests made by Thunderbird include the correct ü, no matter how I retrieve the message's contents. However, I need it to work on the very first request...)

    – Alexander Leithner
    Nov 26 '18 at 13:55


















java javamail quoted-printable apache-commons-codec














asked Nov 24 '18 at 17:42, edited Nov 25 '18 at 10:40
Alexander Leithner































