Hi,
this seems to be a bug:
a)
perl -MHTML::Entities -MEncode -e '$a="abc&Auml;&ndash;def";
print encode("MIME-Q", HTML::Entities::decode($a)), "\n";'
Result:
=?UTF-8?Q?abc=C3=84=E2=80=93def?=
b)
perl -MHTML::Entities -MEncode -e '$a="abc&Auml;def";
print encode("MIME-Q", HTML::Entities::decode($a)), "\n";'
Result:
=?UTF-8?Q?abc=C4def?=
In a) the string contains "&ndash;" to force UTF-8 (the result from
HTML::Entities::decode will not fit into ISO-8859-1).
In b) the result of HTML::Entities::decode is an ISO-8859-1 string, not UTF-8.
The result of b) is wrong. Encode::encode() doesn't seem to properly
consider the charset of the string in this case. I think the correct result is
=?UTF-8?Q?abc=C3=84def?=
FYI, when using MIME-B encoding, the results are
=?UTF-8?B?YWJjw4TigJNkZWY=?= (with decoded '–') and
=?UTF-8?B?YWJjw4RkZWY=?= (without).
I believe both are correct.
Cheers,
-Sven
PS: I'm using perl 5.8.7