
Re: [perl #37757] decode_utf8 broken in perl 5.8.7

From: Yitzchak Scott-Thoennes <stho...@efn.org>
Fri, 2 Dec 2005 00:17:05 -0800
On Tue, Nov 29, 2005 at 12:34:11PM +0100, Michael Schroeder wrote:
> On Mon, 28 Nov 2005 stho...@efn.org wrote:
> > On Thu, Nov 24, 2005 at 11:42:08AM -0800, debi...@j3e. de wrote:
> > > decode_utf8() doesn't return "false" if run with non-UTF-8 string. It just
> > > returns the non-UTF-8 string. To see this bug in action use convmv from
> > > http://j3e.de/linux/convmv/ and convert a filename from latin1 to utf8. It will
> > > tell you that the file is already UTF-8 encoded. convmv evaluates decode_utf8()
> > > to see if a file is already utf-8-encoded.
> >
> > I don't see any indication in the Encode doc that decode_utf8 would
> > ever return false on error.  To use it to check for valid utf8, I
> > think you'd need to specify the CHECK parameter as FB_CROAK and wrap
> > the call in an eval {}; see:
> >
> > http://perldoc.perl.org/Encode.html#Handling-Malformed-Data
> >
> > Perhaps you should use utf8::decode() instead?
>
> Well, the perluniintro manpage says:
>
>  - How Do I Detect Data That's Not Valid In a Particular Encoding?
>
>    Use the "Encode" package to try converting it.  For example,
>
>        use Encode 'decode_utf8';
>        if (decode_utf8($string_of_bytes_that_I_think_is_utf8)) {
>            # valid
>        } else {
>            # invalid
>        }

Ah, I hadn't noticed that; that doesn't agree with the doc in Encode
itself, but up through Encode 2.09 (2.08 was included with perl5.8.6),
decode_utf8 did actually just call utf8::decode when no check
parameter was passed.  Encode 2.10 (in perl5.8.7) now works as
described in the Encode doc, but doesn't work as described in
perluniintro.

Dan, perhaps it would be a good idea to put back the old behavior
(reversing the change you made for
http://rt.cpan.org/NoAuth/Bug.html?id=8872 and changing the doc
instead) when no check parameter is passed?
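
For anyone who needs a validity check that does not depend on which Encode version is installed, the FB_CROAK-plus-eval approach suggested earlier in the thread could look roughly like the sketch below. The helper name is_valid_utf8 is made up for illustration and is not part of Encode, and choosing the strict 'UTF-8' encoding name (rather than the lax 'utf8' that decode_utf8 uses) is an assumption about what a caller like convmv would want.

```perl
use strict;
use warnings;
use Encode;    # exports decode() by default

# Hypothetical helper, not part of Encode: returns 1 if $bytes is
# well-formed UTF-8, 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;

    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC asks
    # Encode not to modify $bytes while checking it.
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();

    # The block yields 1 only if decode() did not die. The decoded
    # string itself is deliberately ignored, see the note below.
    return eval { decode('UTF-8', $bytes, $check); 1 } ? 1 : 0;
}

# "caf\xc3\xa9" holds the UTF-8 bytes for "café"; "caf\xe9" is the
# latin1 encoding of the same word.
for my $sample ("caf\xc3\xa9", "caf\xe9") {
    print is_valid_utf8($sample) ? "valid\n" : "invalid\n";
}
```

The helper tests whether decode() died rather than testing the truth of its return value, because a perfectly valid input such as the empty string or "0" decodes to a value that is false in Perl.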
