Bug in Encode::encode('MIME-Q', $iso_8859_1_string)

From: Sven Neuhaus <s...@heise.de>

Wed, 23 Nov 2005 15:54:50 +0100

Hi,

this seems to be a bug:
a)
perl -MHTML::Entities -MEncode -e '$a="abc&Auml;&ndash;def";
print encode("MIME-Q", HTML::Entities::decode($a)), "\n";'

Result:
=?UTF-8?Q?abc=C3=84=E2=80=93def?=

b)
perl -MHTML::Entities -MEncode -e '$a="abc&Auml;def";
print encode("MIME-Q", HTML::Entities::decode($a)), "\n";'

Result:
=?UTF-8?Q?abc=C4def?=

In a) the string contains "&ndash;" to force UTF-8 (the result from
HTML::Entities::decode will not fit into ISO-8859-1).

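In b) the result of HTML::Entities::decode fits into ISO-8859-1, so the
string is not stored as UTF-8. The result of b) is wrong: the header
declares UTF-8, but the payload byte =C4 is the ISO-8859-1 encoding of
"Ä". Encode::encode() doesn't seem to properly consider the charset of
the string in this case. I think the correct result is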
=?UTF-8?Q?abc=C3=84def?=
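
To cross-check the expected output without going through the MIME-Q encoder itself, here is a minimal sketch that first converts the characters to UTF-8 octets with Encode::encode and then Q-encodes them by hand. The helper name q_encode_utf8 is hypothetical, the set of characters left unescaped is a deliberately conservative assumption, and no line folding is attempted, so this is only an illustration and not a replacement for the real encoder.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: Q-encode a character string as one UTF-8 encoded-word.
# Assumption: everything except a small safe ASCII set is escaped as =XX.
sub q_encode_utf8 {
    my ($chars) = @_;
    my $bytes = encode('UTF-8', $chars);    # characters -> UTF-8 octets
    $bytes =~ s/([^A-Za-z0-9!*+\-\/])/sprintf('=%02X', ord($1))/ge;
    return "=?UTF-8?Q?$bytes?=";
}

my $latin1_only = "abc\x{C4}def";           # case b): fits into ISO-8859-1
my $wide        = "abc\x{C4}\x{2013}def";   # case a): U+2013 does not fit

print q_encode_utf8($latin1_only), "\n";    # =?UTF-8?Q?abc=C3=84def?=
print q_encode_utf8($wide), "\n";           # =?UTF-8?Q?abc=C3=84=E2=80=93def?=
```

Both strings yield =C3=84 for "Ä" here, whichever way perl stores them internally, which is the behaviour I would expect from MIME-Q as well.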

FYI, when using MIME-B encoding, the results are
=?UTF-8?B?YWJjw4TigJNkZWY=?= (with decoded '&ndash;') and
=?UTF-8?B?YWJjw4RkZWY=?=     (without).

I believe both are correct.
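
For anyone who wants to verify the two MIME-B results, the following sketch decodes the base64 payloads with MIME::Base64 and prints the octets in hex. The helper name b_payload_hex is hypothetical, and it only handles a single UTF-8 "B" encoded-word.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: return the decoded payload of a UTF-8 "B"
# encoded-word as an uppercase hex string.
sub b_payload_hex {
    my ($word) = @_;
    my ($payload) = $word =~ /^=\?UTF-8\?B\?(.*)\?=$/
        or die "not a UTF-8 B encoded-word: $word";
    return uc unpack('H*', decode_base64($payload));
}

print b_payload_hex('=?UTF-8?B?YWJjw4TigJNkZWY=?='), "\n";  # 616263C384E28093646566
print b_payload_hex('=?UTF-8?B?YWJjw4RkZWY=?='), "\n";      # 616263C384646566
```

In both cases "Ä" comes out as C3 84, i.e. proper UTF-8, unlike the =C4 in the MIME-Q result of b).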

Cheers,
-Sven
PS: I'm using perl 5.8.7
