Problem with national characters

From: Leif B. Kristensen <abu...@solumslekt.org>
Thu, 31 Mar 2005 19:23:55 +0200
I'm developing a routine that will parse user input. For simplicity, I'm
converting the entire input string to upper case. One of the words that
will have special meaning for the parser is the word "før" ("before" in
English). However, this word is not recognized, because upper() leaves
the "ø" unchanged. A test in the interactive shell reveals this:

leif@balapapa leif $ python
Python 2.3.4 (#1, Feb  7 2005, 21:31:38)
[GCC 3.3.5  (Gentoo Linux 3.3.5-r1, ssp-3.3.2-3, pie-8.7.7.1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 'før'.upper()
'F\xf8R'
>>> 'FØR'
'F\xd8R'
>>>

In Windows, the result is slightly different, but no better:

C:\Python23>python
ActivePython 2.3.2 Build 232 (ActiveState Corp.) based on
Python 2.3.2 (#49, Nov 13 2003, 10:34:54) [MSC v.1200 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 'før'.upper()
'F\x9bR'
>>> 'FØR'
'F\x9dR'
>>>

Is there a way around this problem? My character set in Linux is
ISO-8859-1. In Windows 2000 it should be the equivalent Latin-1, though
I'm not sure which character set the command shell is using.
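
One possible way around this, as a rough sketch rather than a definitive
fix: decode the input bytes to a Unicode string before calling upper(),
so that case conversion works on characters instead of raw bytes. The
helper name upper_text is made up for illustration. The encodings are
assumptions read off the byte values in the two sessions above:
ISO-8859-1 for the Linux terminal, and cp850 for the Windows command
shell (cp865 maps these two bytes the same way). The b'' prefix needs
Python 2.6 or later; on 2.3 a plain '...' literal is already a byte
string.

```python
# Sketch only: upper-case text, not bytes.
# Encodings are assumptions inferred from the byte values in the sessions.

def upper_text(raw, encoding):
    """Decode a byte string, upper-case it as text, and return the text."""
    return raw.decode(encoding).upper()

linux_bytes = b'f\xf8r'    # the word as ISO-8859-1 bytes
windows_bytes = b'f\x9br'  # the word as console bytes (cp850 assumed)

# Byte-level upper() only converts ASCII letters here, so the middle
# byte is left unchanged, as in the Linux session.
assert linux_bytes.upper() == b'F\xf8R'

latin1_result = upper_text(linux_bytes, 'iso-8859-1')
cp850_result = upper_text(windows_bytes, 'cp850')

# Both inputs now give the same text: F, U+00D8, R.
assert latin1_result == u'F\xd8R'
assert cp850_result == u'F\xd8R'
```

With the input decoded this way, the parser can compare against a
Unicode keyword such as u'F\xd8R' regardless of which terminal encoding
the bytes arrived in.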
-- 
Leif Biberg Kristensen
http://solumslekt.org/
