This recipe introduces an encoding scheme called base255. It converts a Python string into a form that can be used as a "C string": the encoded character array contains no NULL bytes.
'''Module for string conversion.
This module provides two functions that allow
strings to be encoded and decoded in base 255.'''
################################################################################
__version__ = '$Revision: 0 $'
__date__ = 'February 11, 2007'
__author__ = 'Stephen "Zero" Chappell <my.bios@gmail.com>'
__credits__ = '''\
S. Schaub, for introducing me to programming.
B. Brown, for teaching me some math courses.
E. Skogen, for listening to my ideas.'''
################################################################################
import sys as _sys
################################################################################
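def encode(string, divide=1024):
    'Encode a string to base 255.'
    def encode(s):
        # Read the chunk as a base-257 number whose digits are byte + 1
        # (1..256), so leading NULL bytes are not lost as leading zeros.
        i = 0
        for c in s:
            i *= 257
            i += ord(c) + 1
        # Write the number in base 254 with digits shifted to 2..255,
        # which keeps the bytes 0 and 1 out of the encoded chunk.
        s = ''
        while i:
            s = chr(i % 254 + 2) + s
            i //= 254
        return s
    # Encode fixed-size chunks and join them with the reserved '\1' byte.
    return '\1'.join(encode(string[index:index+divide]) for index in range(0, len(string), divide))
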
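def decode(string):
    'Decode a string from base 255.'
    def decode(s):
        # Rebuild the number from its base-254 digits (each stored as digit + 2).
        i = 0
        for c in s:
            i *= 254
            i += ord(c) - 2
        # Split the number back into base-257 digits and undo the + 1 shift.
        s = ''
        while i:
            s = chr(i % 257 - 1) + s
            i //= 257
        return s
    # Chunks are separated by the reserved '\1' byte.
    return ''.join(decode(chunk) for chunk in string.split('\1'))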
################################################################################
if __name__ == '__main__':
    # When run as a script, emit a plain-text header followed by this file's own source.
    _sys.stdout.write('Content-Type: text/plain\n\n')
    _sys.stdout.write(open(_sys.argv[0]).read())
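The recipe targets Python 2 strings. As a rough illustration of how the same scheme might look on Python 3 `bytes`, here is a minimal sketch; the names `encode_bytes`, `decode_bytes`, and the two chunk helpers are hypothetical and not part of the recipe, and the arithmetic simply mirrors the listing above.

```python
# Python 3 sketch of the recipe's scheme, operating on bytes instead of str.
# All function names here are hypothetical; they are not part of the recipe.

def _encode_chunk(chunk):
    # Read the chunk as a base-257 number with digits byte + 1 (1..256).
    n = 0
    for b in chunk:
        n = n * 257 + b + 1
    # Write it in base 254 with digits shifted to 2..255.
    out = bytearray()
    while n:
        n, digit = divmod(n, 254)
        out.append(digit + 2)
    out.reverse()
    return bytes(out)

def _decode_chunk(chunk):
    # Rebuild the number from base-254 digits stored as digit + 2.
    n = 0
    for b in chunk:
        n = n * 254 + b - 2
    # Split it into base-257 digits and undo the + 1 shift.
    out = bytearray()
    while n:
        n, digit = divmod(n, 257)
        out.append(digit - 1)
    out.reverse()
    return bytes(out)

def encode_bytes(data, divide=1024):
    # Encode fixed-size chunks and join them with the reserved 0x01 byte.
    chunks = (data[i:i + divide] for i in range(0, len(data), divide))
    return b'\x01'.join(_encode_chunk(c) for c in chunks)

def decode_bytes(data):
    return b''.join(_decode_chunk(c) for c in data.split(b'\x01'))

sample = b'\x00\x00leading and embedded\x00NULs\x00'
encoded = encode_bytes(sample)
print(b'\x00' in encoded)               # False: no NULL bytes remain
print(decode_bytes(encoded) == sample)  # True: the round trip is lossless
```

The `divide` parameter bounds the size of the big integer built for each chunk, which presumably keeps the arithmetic from slowing down on long inputs.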
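This algorithm was developed after considering that all files and strings are really just numbers written in base 256. The code takes this into account, along with the NULL bytes that may sit at the beginning of an array (which would otherwise vanish as leading zeros), and converts the data into a number in a smaller base whose digits are shifted up so that no byte is zero. Conceptually, the result is a base 255 number with 1 added to each individual byte. The listing above differs slightly from that description: it reads each chunk in base 257 (each byte plus 1), writes it in base 254 with 2 added to each digit, and reserves the byte '\1' to separate chunks.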
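To see why leading NULL bytes need special care, the following small Python 3 sketch compares a plain base-256 reading of a byte string with a reading in which every digit is raised by 1, as the recipe's inner loop does. The helper names are hypothetical and exist only for this illustration.

```python
def as_base256_number(data):
    # Interpret the bytes as one big-endian base-256 integer.
    return int.from_bytes(data, 'big')

def as_shifted_number(data):
    # Same idea, but in base 257 with every digit raised by 1,
    # so a NULL byte becomes the nonzero digit 1.
    n = 0
    for b in data:
        n = n * 257 + b + 1
    return n

plain = b'A'
padded = b'\x00\x00A'

# In plain base 256 the leading NULs are leading zeros and disappear.
print(as_base256_number(plain) == as_base256_number(padded))  # True
# With shifted digits the two inputs map to different numbers.
print(as_shifted_number(plain) == as_shifted_number(padded))  # False
```

Because no digit is ever zero in the shifted form, the number alone is enough to recover the original bytes, including their count.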