
[Python-Dev] PEP 385: the eol-type issue

From: M.-A. Lemburg <m...@egenix.com>
Thu, 06 Aug 2009 12:40:09 +0200
Nick Coghlan wrote:
> Antoine Pitrou wrote:
>> M.-A. Lemburg <mal <at> egenix.com> writes:
>>> Please file a bug report for this. f.readlines() (or rather
>>> the io layer) should be using Py_UNICODE_ISLINEBREAK(ch)
>>> for detecting line break characters.
>>
>> Actually, no. It has been designed from the start to only recognize the
>> "standard" line break representations found in common formats/protocols (CR, LF
>> and CR+LF).
>> People wanting to split on arbitrary unicode line breaks should use
>> str.splitlines().
>
> The fairly long-standing RFE relating to an arbitrarily selectable
> newline separator seems relevant here:
> http://bugs.python.org/issue1152248
>
> As with the discussion there, the problem with using str.splitlines is
> that it prevents pipelining approaches that avoid reading a whole file
> into memory.
>
> While removing the validity check from readlines() completely is
> questionable (the readrecords() approach mentioned in the tracker issue
> would still be better there), loosening the validity check to be based
> on Py_UNICODE_IS_LINEBREAK seems a bit more feasible. (I'd still call it
> a feature requests rather than a bug though).

I've had a look at the io implementation: this appears to be
based on the universal newline support idea which addresses
only a fixed set of "new line" character combinations and is
not as straightforward to extend to support all Unicode
line break characters as I thought.

What I don't understand is why the io layer tries to reinvent
the wheel here instead of just using the codec's .readline()
method - which *does* use .splitlines() and has full support
for all Unicode line break characters (including the CRLF
combination).
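
To illustrate the difference, here is a minimal sketch, assuming a
current Python 3 interpreter rather than the 2009 code base under
discussion. The sample string and the variable names are made up for
the example. It reads the same bytes once through io.TextIOWrapper and
once through the UTF-8 codec's StreamReader:

```python
import codecs
import io

# Hypothetical sample: U+2028 (LINE SEPARATOR) sits inside the first
# "\n"-terminated line.
data = "a\u2028b\nc\n".encode("utf-8")

# io layer with default universal newlines: only CR, LF and CR+LF
# are treated as line ends.
io_lines = list(io.TextIOWrapper(io.BytesIO(data), encoding="utf-8"))

# Codec StreamReader: iterating calls .readline(), which splits the
# decoded text with str.splitlines() and so also breaks at U+2028.
codec_lines = list(codecs.getreader("utf-8")(io.BytesIO(data)))

print(io_lines)
print(codec_lines)
```

With this input the io layer should return two lines and the codec
reader three, because only the latter treats U+2028 as a line break.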

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 06 2009)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/
