a64l « Python recipes « ActiveState Code

An implementation of a64l from the C standard library. Converts a radix-64 ASCII string to a 32-bit integer.

def a64l(s):
    """
    An implementation of a64l as in the C stdlib.
    Convert a radix-64 ASCII string to a 32-bit integer.

    '.' (dot) for 0, '/' for 1, '0' through '9' for [2,11],
    'A' through 'Z' for [12,37], and 'a' through 'z' for [38,63].

    TODO:
        do some implementations use '' instead of '.' for 0?

    >>> a64l('.')
    0
    >>> a64l('ZZZZZZ')  # 40359057765 truncated to 32 bits
    1704352101
    >>> a64l('ZZZZZZ.')  # only the first 6 chars are significant
    1704352101
    >>> a64l('A')
    12
    >>> a64l('Chris')
    951810894
    """
    MASK = 0xffffffff
    BITSPERCHAR = 6
    orda, ordZ, ordA, ord9, ord0 = ord('a'), ord('Z'), ord('A'), ord('9'), ord('0')

    r = 0
    for shift, c in enumerate(s[:6]):
        c = ord(c)
        if c > ordZ:
            c -= orda - ordZ - 1
        if c > ord9:
            c -= ordA - ord9 - 1
        r = (r | ((c - (ord0 - 2)) << (shift * BITSPERCHAR))) & MASK
    return r
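
The digit mapping described in the docstring can also be written as a lookup string. The following is only a sketch of that idea, not part of the original recipe: the names `ALPHABET` and `a64l_table` are hypothetical, and unlike the recipe it raises `ValueError` on characters outside the radix-64 alphabet.

```python
import string

# Radix-64 alphabet in digit order: '.', '/', '0'-'9', 'A'-'Z', 'a'-'z'.
ALPHABET = "./" + string.digits + string.ascii_uppercase + string.ascii_lowercase

def a64l_table(s):
    """Hypothetical table-driven variant of the recipe's a64l."""
    r = 0
    for shift, c in enumerate(s[:6]):          # only the first 6 chars count
        r |= ALPHABET.index(c) << (6 * shift)  # first char is least significant
    return r & 0xFFFFFFFF                      # keep the low 32 bits

print(a64l_table("A"))       # 12
print(a64l_table("Chris"))   # 951810894
print(a64l_table("ZZZZZZ"))  # 1704352101
```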

      

Tags: conversion, radix64

3 comments

David Lambert 15 years, 4 months ago

The algorithm is incorrect.

/* Print the lower 32 bits of a64l() for each command-line argument. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    /* a64l returns long, so mask with a long constant and print with %ld */
    for (int i = 1; i < argc; i++)
        printf("%s %ld\n", argv[i], a64l(argv[i]) & 0xffffffffL);
    return 0;
}

$ ./a.out ZZZZZZ
ZZZZZZ 1704352101

In Python 3 on a Linux system:

'''
    This python3 implementation of a64l, while less portable, is

    1) Correct.

    and

    2) Gives access to the entire libc functionality
       with little additional effort per function.
'''

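from ctypes import cdll, c_char_p, c_long

libc = cdll.LoadLibrary("libc.so.6")
a64l = libc.a64l
a64l.restype = c_long
a64l.argtypes = (c_char_p,)

def wrapped_a64l(s):
    # c_char_p takes bytes in current Python 3 releases, so encode the str first
    return a64l(s.encode('ascii'))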

print(wrapped_a64l('ZZZZZZ'),'agrees with the c program output')
print(wrapped_a64l('ZZZZZZ')-40359057765,'all differences should be 0')
print(wrapped_a64l('ZZZZZZ.')-40359057765)
print(wrapped_a64l('A')-12)
print(wrapped_a64l('Chris')-951810894)

The usage and output of this program are:

$ python3 a64l.py
1704352101 agrees with the c program output
-38654705664 all differences should be 0
-38654705664
0
0
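
Point 2 of the docstring above (access to the rest of libc with little extra effort per function) can be illustrated by wrapping l64a in the same way. This is a minimal sketch, assuming a Unix-like system whose C library exports both a64l and l64a; the helper names `c_a64l` and `c_l64a` are hypothetical, and `ctypes.util.find_library` is used here instead of the hard-coded "libc.so.6".

```python
import ctypes
import ctypes.util

# Assumption: a Unix-like system whose C library exports a64l and l64a.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

libc.a64l.restype = ctypes.c_long
libc.a64l.argtypes = (ctypes.c_char_p,)
libc.l64a.restype = ctypes.c_char_p   # ctypes copies the result into a bytes object
libc.l64a.argtypes = (ctypes.c_long,)

def c_a64l(s):
    """Decode a radix-64 str with the C library's a64l."""
    return libc.a64l(s.encode("ascii"))

def c_l64a(n):
    """Encode an integer with the C library's l64a."""
    return libc.l64a(n).decode("ascii")

print(c_a64l("Chris"))
print(c_l64a(c_a64l("Chris")))
```

Each additional libc function needs only its `restype` and `argtypes` declared before use.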

David Lambert 15 years, 4 months ago

Beats me, Chris. I hope I've convinced you that your code does not match my a64l. When I do the conversion by hand using Mathematica, I get your result for six capital Zs. Or with Python:

Python 3.0rc1+ (py3k, Nov  5 2008, 14:44:46)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> Z=37
>>> B=64
>>> Z+B*(Z+B*(Z+B*(Z+B*(Z+B*(Z)))))
40359057765

Therefore, either we do not understand this representation, or the a64l distributed with my Linux is incorrect.

Wow. I guess I'll ask Red Hat or the makers of my libc.

When I insert this line into my C program, I find that l64a writes ZZZZZ/

puts(l64a(40359057765));

David Lambert 15 years, 4 months ago

Perhaps negative numbers are the trouble? No. (Note that 0 is the empty string, but the digit 0 used as a placeholder is ".")

#        1      /             -1 zzzzz1
#        0                     0
#       -1 zzzzz1              1      /
#       -2 yzzzz1              2      0
#       -3 xzzzz1              3      1
#      -62 0zzzz1             62      y
#      -63 /zzzz1             63      z
#      -64 .zzzz1             64     ./
#      -65 zyzzz1             65     //
#      -66 yyzzz1             66     0/
#    -4094 0.zzz1           4094     yz
#    -4095 /.zzz1           4095     zz
#    -4096 ..zzz1           4096    ../
#    -4097 zzyzz1           4097    /./
#    -4098 yzyzz1           4098    0./
#  -262142 0..zz1         262142    yzz
#  -262143 /..zz1         262143    zzz
#  -262144 ...zz1         262144   .../
#  -262145 zzzyz1         262145   /../
#  -262146 yzzyz1         262146   0../
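
The rows above are consistent with an encoder that takes the low 32 bits of its argument (the two's-complement view of a negative number) and emits 6-bit digits, least significant first, until nothing is left. A minimal Python sketch under that assumption follows; `l64a_sketch` is a hypothetical name, not the libc source.

```python
ALPHABET = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def l64a_sketch(n):
    """Assumed l64a behaviour: encode the low 32 bits of n in radix 64."""
    m = n & 0xFFFFFFFF                  # two's-complement view of the low 32 bits
    out = []
    while m:
        out.append(ALPHABET[m & 0x3F])  # least significant digit first
        m >>= 6
    return "".join(out)                 # 0 becomes the empty string

for n in (0, 1, -1, 64, -64, 4095, -4096):
    print(n, repr(l64a_sketch(n)))
```

Under the same assumption, `l64a_sketch(40359057765)` yields `ZZZZZ/`, matching what the C program printed.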

OK, before I call Red Hat, let's check the binary representations of these numbers. Aha!

LSB        broken at 32 bits--->  MSB
10100110100110100110100110100110 0000   1704352101    ZZZZZZ Correct
10100110100110100110100110100110 1001  40359057765    ZZZZZZ Too big!
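
The same check can be done with plain Python integer arithmetic. This is just the arithmetic from the two rows above, with `full` as an illustrative variable name:

```python
# 'Z' is digit 37; six of them, least significant digit first.
full = sum(37 << (6 * i) for i in range(6))

print(full, full.bit_length())  # 40359057765 36
print(full & 0xFFFFFFFF)        # 1704352101, the low 32 bits
print(full >> 32)               # 9, i.e. binary 1001, the four bits that overflow
```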

s       l64a(a64l(s))
zzzzz   zzzzz
zzzzz/  zzzzz/
zzzzz0  zzzzz0
zzzzz1  zzzzz1
zzzzz2  zzzzz

There are 6 bits per character, but 6 is not a factor of 32, so the sixth character contributes only 2 bits and can only be ., /, 0, or 1. So your code works for some range of positive numbers. It is clearly not a replacement for the libc versions. My vote stands at -1. The Python 3.0 library documentation for ctypes (3.0 final was released today) shows how to use libc in the Microsoft environment.
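
To make the sixth-character limit concrete, here is a small sketch that shifts each digit into bit position 30 and keeps only what fits in 32 bits. It assumes an implementation that silently truncates the extra bits, as the round-trip table above suggests; `sixth_char_value` is a hypothetical helper.

```python
ALPHABET = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def sixth_char_value(c):
    """Digit value that survives when c is the sixth character."""
    return ((ALPHABET.index(c) << 30) & 0xFFFFFFFF) >> 30

for c in "./012Zz":
    print(c, ALPHABET.index(c), sixth_char_value(c))
```

Only the low 2 bits of the sixth digit survive, so '2' (digit 4) contributes nothing and 'Z' (digit 37) behaves like '/' (digit 1), which is why l64a printed ZZZZZ/ earlier.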

a64l (Python recipe) by Christoph Devenoges
ActiveState Code (http://code.activestate.com/recipes/576578/)