
Bug in Encode::encode('MIME-Q', $iso_8859_1_string)

From: Sven Neuhaus <s...@heise.de>
Wed, 23 Nov 2005 15:54:50 +0100
Hi,

this seems to be a bug:
a)
perl -MHTML::Entities -MEncode -e '$a="abc&Auml;&ndash;def";
print encode("MIME-Q", HTML::Entities::decode($a)), "\n";'

Result:
=?UTF-8?Q?abc=C3=84=E2=80=93def?=

b)
perl -MHTML::Entities -MEncode -e '$a="abc&Auml;def";
print encode("MIME-Q", HTML::Entities::decode($a)), "\n";'

Result:
=?UTF-8?Q?abc=C4def?=

In a) the string contains "&ndash;" to force UTF-8 (the result from
HTML::Entities::decode will not fit into ISO-8859-1).

In b) the result of HTML::Entities::decode is ISO-8859-1, not UTF-8.
The result of b) is wrong: Encode::encode() doesn't seem to take the
string's charset into account in this case. I think the correct result is
=?UTF-8?Q?abc=C3=84def?=
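
The expected values can be derived by hand. What follows is a minimal sketch, assuming the two decoded strings are equivalent to the literals below. `hex_bytes` and `q_encode_utf8` are hypothetical helpers written for illustration, not part of Encode, and the Q encoder is simplified: it escapes every non-alphanumeric byte, which is stricter than RFC 2047 requires.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Stand-ins (assumed) for the strings returned by HTML::Entities::decode.
my $with_dash    = "abc\x{C4}\x{2013}def";  # a): U+2013 does not fit in one byte
my $without_dash = "abc\x{C4}def";          # b): every character fits in ISO-8859-1

# Hypothetical helper: space-separated hex dump of a byte string.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Hypothetical helper: convert to UTF-8 octets first, then Q-encode them.
sub q_encode_utf8 {
    my ($string) = @_;
    my $octets = encode('UTF-8', $string);
    $octets =~ s/([^A-Za-z0-9])/sprintf('=%02X', ord($1))/ge;
    return "=?UTF-8?Q?$octets?=";
}

for my $s ($with_dash, $without_dash) {
    printf "utf8 flag %-3s | %s | %s\n",
        (utf8::is_utf8($s) ? 'on' : 'off'),
        hex_bytes(encode('UTF-8', $s)),
        q_encode_utf8($s);
}
```

Both strings go through the same conversion to UTF-8 octets before escaping, so the internal representation of the input should make no difference to the output.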

FYI, when using MIME-B encoding, the results are
=?UTF-8?B?YWJjw4TigJNkZWY=?= (with decoded '&ndash;') and
=?UTF-8?B?YWJjw4RkZWY=?=     (without).

I believe both are correct.
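
One way to check this is to decode the Base64 payloads and look at the raw bytes. This is a minimal sketch using the core MIME::Base64 module; `payload_hex` is a hypothetical helper name, and it only handles a single UTF-8 "B" encoded-word.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: extract the Base64 payload of a UTF-8 "B"
# encoded-word and return its decoded bytes as a hex dump.
sub payload_hex {
    my ($encoded_word) = @_;
    my ($b64) = $encoded_word =~ /^=\?UTF-8\?B\?(.*)\?=$/
        or die "not a UTF-8 B encoded-word: $encoded_word";
    return join ' ', map { sprintf '%02X', ord($_) } split //, decode_base64($b64);
}

print payload_hex('=?UTF-8?B?YWJjw4TigJNkZWY=?='), "\n";  # with the en dash
print payload_hex('=?UTF-8?B?YWJjw4RkZWY=?='), "\n";      # without it
```

In both dumps the A-umlaut appears as the two bytes C3 84, which is the UTF-8 form that the MIME-Q result in b) should also contain.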

Cheers,
-Sven
PS: I'm using perl 5.8.7
