Emojis in the subject with RFC 2047 encoding are not handled well #82

@noloader

Hi Everyone. Thanks for the tool. I've been looking for a tool to convert emails to PDFs for a while.

I noticed emojis in the subject are not handled well. Encoding of subjects is covered in RFC 2047:

encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
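
To make the grammar concrete, here is a minimal Python sketch that splits a single encoded-word into its three fields. It is only an illustration of the RFC 2047 production above, not how email-to-pdf-converter parses headers; the names `ENCODED_WORD` and `split_encoded_word` are made up for this example.

```python
import re

# Illustrative pattern for: "=?" charset "?" encoding "?" encoded-text "?="
# (hypothetical helper, not taken from email-to-pdf-converter)
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def split_encoded_word(word):
    """Return (charset, encoding, encoded_text), or None if word is not an encoded-word."""
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        return None
    charset, encoding, encoded_text = match.groups()
    # Charset and encoding are case-insensitive, so normalize them.
    return charset.lower(), encoding.lower(), encoded_text


print(split_encoded_word("=?utf-8?q?=F0=9F=92=8B_Hello?="))
```

Here `q` means the quoted-printable-like "Q" encoding, where `=XX` is a raw byte and `_` stands for a space; `b` would mean base64.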

Here's an example from GMail:

[Image: screenshot of the subject as displayed in GMail]

When I "View Original" in GMail and then "Download Original" for the message, the subject is encoded per RFC 2047 as:

=?utf-8?q?=F0=9F=92=8B_Unlock_Your_Perfect_Valentine=27s_Date_Today_=F0=9F=92=9D?=
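
For reference, this is what the header should decode to. The sketch below uses Python's standard `email.header` module purely as an independent check of the expected subject text; it says nothing about where the emojis get lost inside email-to-pdf-converter.

```python
from email.header import decode_header, make_header

# The raw Subject value from the downloaded message.
raw = "=?utf-8?q?=F0=9F=92=8B_Unlock_Your_Perfect_Valentine=27s_Date_Today_=F0=9F=92=9D?="

# decode_header() yields (bytes, charset) pairs; make_header() joins them into text.
subject = str(make_header(decode_header(raw)))
print(subject)

# List the code points above U+FFFF, i.e. outside the Basic Multilingual Plane.
supplementary = [f"U+{ord(ch):04X}" for ch in subject if ord(ch) > 0xFFFF]
print(supplementary)
```

The two emojis decode to U+1F48B and U+1F49D. Both lie outside the Basic Multilingual Plane, so they take four bytes in UTF-8 and a surrogate pair in UTF-16. Whether that is related to the bug is an open question.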

Then, after conversion using email-to-pdf-converter, I see:

[Image: screenshot of the converted PDF]

Notice the emojis have disappeared from the subject in the PDF. Emojis in the body are also missing, but I am less concerned about that.

Attached are the original message, screen captures, and the converted PDF: test-subject-with-emojis.zip.


Here is the command I used to perform the conversion. emailconverter-3.0.0-all.jar was downloaded from this GitHub repository.

$ java -jar emailconverter-3.0.0-all.jar test-subject-with-emojis.eml

which results in:

$ java -jar emailconverter-3.0.0-all.jar test-subject-with-emojis.eml 
Start converting test-subject-with-emojis.eml to test-subject-with-emojis.pdf
Mime Structure of test-subject-with-emojis.eml:
-----------Mime Message-----------
> multipart/mixed
> |  text/html
----------------------------------
Extract the inline images
Start conversion to pdf
The switch --viewport-size, is not support using unpatched qt, and will be ignored.
The switch --image-quality, is not support using unpatched qt, and will be ignored.
Loading page (1/2)
Printing pages (2/2)
Done
Conversion finished

And if needed:

$ lsb_release -a
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy

LibreOffice has a similar bug at https://bugs.documentfoundation.org/show_bug.cgi?id=129523. In the past I tried to convert emails to PDF using LibreOffice tools.


And to be clear, these types of messages are spam. My need to convert them to PDF is due to a hobby of mine. I like to cause problems for spammers and their service providers. I've dragged them into court in the past. Moving forward, I would like to ensure the lawyers and judge see the annoying emojis.

(It is more like returning the favor to spammers and service providers, considering how much time and effort I waste sifting through their crap while trying to maintain free software projects and their websites.)
