ActiveState Code

Recipe 576578: a64l


An implementation of a64l from the C standard library. Converts a radix-64 ASCII string to a 32-bit integer.

Python
def a64l(s):
    """
    An implementation of a64l from the C standard library.
    Convert a radix-64 ASCII string to a 32-bit integer.

    '.' (dot) for 0, '/' for 1, '0' through '9' for [2,11],
    'A' through 'Z' for [12,37], and 'a' through 'z' for [38,63].

    TODO:
        do some implementations use '' instead of '.' for 0?

    >>> a64l('.')
    0
    >>> a64l('ZZZZZZ')  # full radix-64 value 40359057765, truncated to 32 bits
    1704352101
    >>> # only the first 6 chars are significant
    >>> a64l('ZZZZZZ.')
    1704352101
    >>> a64l('A')
    12
    >>> a64l('Chris')
    951810894
    """
    MASK = 0xffffffff
    BITSPERCHAR = 6
    orda, ordZ, ordA, ord9, ord0 = ord('a'), ord('Z'), ord('A'), ord('9'), ord('0')

    r = 0
    for shift, c in enumerate(s[:6]):
        c = ord(c)
        if c > ordZ:
            c -= orda - ordZ - 1
        if c > ord9:
            c -= ordA - ord9 - 1
        r = (r | ((c - (ord0 - 2)) << (shift * BITSPERCHAR))) & MASK
    return r
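
The digit mapping described in the docstring can also be written as a table lookup. The following is a minimal sketch, not part of the original recipe: `ALPHABET`, `DIGIT`, and `a64l_lookup` are hypothetical names introduced here, and unlike the recipe this variant raises `KeyError` on characters outside the radix-64 alphabet.

```python
import string

# Radix-64 alphabet in digit order: '.'=0, '/'=1, '0'-'9'=2..11,
# 'A'-'Z'=12..37, 'a'-'z'=38..63.
ALPHABET = "./" + string.digits + string.ascii_uppercase + string.ascii_lowercase
DIGIT = {ch: i for i, ch in enumerate(ALPHABET)}


def a64l_lookup(s):
    """Hypothetical table-driven variant: least significant digit first,
    at most 6 characters, result truncated to the low 32 bits."""
    r = 0
    for shift, ch in enumerate(s[:6]):
        r |= DIGIT[ch] << (6 * shift)
    return r & 0xFFFFFFFF


print(a64l_lookup("A"))       # 12
print(a64l_lookup("Chris"))   # 951810894
print(a64l_lookup("ZZZZZZ"))  # 1704352101
```

The lookup table makes the alphabet explicit, at the cost of building a 64-entry dictionary once at import time.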

Comments

  1. At 1:45 p.m. on 3 Dec 2008, David Lambert said:

    The algorithm is incorrect.

    /* convert command line args to lower 32 bits of a64l */
    #include <stdlib.h>
    #include <stdio.h>
    int main(int ac, char *av[]) {
      while (--ac) {
        ++av;
        /* mask to the low 32 bits; cast so the value matches %lu */
        printf("%s %lu\n", *av, (unsigned long)a64l(*av) & 0xffffffffUL);
      }
      return 0;
    }
    
    $ ./a.out ZZZZZZ
    ZZZZZZ 1704352101
    

    In Python 3 on a Linux system:

    '''
        This python3 implementation of a64l, while less portable, is
    
        1) Correct.
    
        and
    
        2) Gives access to the entire libc functionality
           with little additional effort per function.
    '''
    
    from ctypes import *
    
    libc = cdll.LoadLibrary("libc.so.6")
    a64l = libc.a64l
    a64l.restype = c_long
    a64l.argtypes = (c_char_p,)
    
    def wrapped_a64l(s):
        # c_char_p requires bytes in Python 3, so encode the str argument
        return libc.a64l(s.encode('ascii'))
    
    print(wrapped_a64l('ZZZZZZ'),'agrees with the c program output')
    print(wrapped_a64l('ZZZZZZ')-40359057765,'all differences should be 0')
    print(wrapped_a64l('ZZZZZZ.')-40359057765)
    print(wrapped_a64l('A')-12)
    print(wrapped_a64l('Chris')-951810894)
    

    The invocation and output of this program are:

    $ python3 a64l.py
    1704352101 agrees with the c program output
    -38654705664 all differences should be 0
    -38654705664
    0
    0
    
  2. At 4:46 p.m. on 3 Dec 2008, David Lambert said:

    Beats me, Chris. I hope I've convinced you that your code does not match my a64l. When I do the conversion by hand using Mathematica, I get your result for six capital Zs. Or with Python:

    Python 3.0rc1+ (py3k, Nov  5 2008, 14:44:46)
    [GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> Z=37
    >>> B=64
    >>> Z+B*(Z+B*(Z+B*(Z+B*(Z+B*(Z)))))
    40359057765
    

    Therefore, either we do not understand this representation, or the a64l distributed with my Linux is incorrect.

    Wow. I guess I'll ask Red Hat or the makers of my libc.

    When I insert this line into my C program, I find that l64a writes ZZZZZ/:

    puts(l64a(40359057765));
    
  3. At 7:04 p.m. on 3 Dec 2008, David Lambert said:

    Perhaps negative numbers are the trouble? No. (Note that 0 is the empty string, but the digit 0 used as a placeholder is ".".) The table below lists each value next to its l64a string:

    #        1      /             -1 zzzzz1
    #        0                     0
    #       -1 zzzzz1              1      /
    #       -2 yzzzz1              2      0
    #       -3 xzzzz1              3      1
    #      -62 0zzzz1             62      y
    #      -63 /zzzz1             63      z
    #      -64 .zzzz1             64     ./
    #      -65 zyzzz1             65     //
    #      -66 yyzzz1             66     0/
    #    -4094 0.zzz1           4094     yz
    #    -4095 /.zzz1           4095     zz
    #    -4096 ..zzz1           4096    ../
    #    -4097 zzyzz1           4097    /./
    #    -4098 yzyzz1           4098    0./
    #  -262142 0..zz1         262142    yzz
    #  -262143 /..zz1         262143    zzz
    #  -262144 ...zz1         262144   .../
    #  -262145 zzzyz1         262145   /../
    #  -262146 yzzyz1         262146   0../
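
The rows above are consistent with l64a treating its argument as a 32-bit pattern and emitting the least significant radix-64 digit first. A minimal pure-Python sketch under that assumption follows; `l64a_sketch` is a hypothetical name introduced here, not a binding to the libc function.

```python
ALPHABET = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def l64a_sketch(value):
    """Hypothetical inverse of a64l; assumes only the low 32 bits matter."""
    value &= 0xFFFFFFFF  # negative inputs become their two's-complement pattern
    out = []
    while value:  # zero falls through and yields the empty string
        out.append(ALPHABET[value & 0x3F])  # least significant digit first
        value >>= 6
    return "".join(out)


for n in (1, 0, -1, -2, 64, -64, 4096):
    print(n, repr(l64a_sketch(n)))
```

Masking first is what turns -1 into six digits: five full digits of 63 ("z") plus a final digit holding the two remaining bits.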
    

    OK, before I call Red Hat, let's check the binary representations of these numbers. Ah!

    LSB        broken at 32 bits--->  MSB
    10100110100110100110100110100110 0000   1704352101    ZZZZZZ Correct
    10100110100110100110100110100110 1001  40359057765    ZZZZZZ Too big!
    
    s       l64a(a64l(s))
    zzzzz   zzzzz
    zzzzz/  zzzzz/
    zzzzz0  zzzzz0
    zzzzz1  zzzzz1
    zzzzz2  zzzzz
    

    Each character carries 6 bits, but 6 is not a factor of 32: five characters fill 30 bits, leaving only 2 bits for the sixth, which can therefore only be ., /, 0, or 1. So your code works for some range of positive numbers, but it is clearly not a replacement for the libc versions. My vote stands at -1. The ctypes documentation in the Python 3.0 library (final released today) shows how to use libc in the Microsoft environment.
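
The arithmetic behind this can be checked directly. The sketch below assumes only the digit values given in the recipe's docstring; the names `full` and `kept` are introduced here for illustration.

```python
MASK = 0xFFFFFFFF
Z = 37  # digit value of 'Z'

# Horner evaluation of six 'Z' digits in radix 64, with no truncation.
full = 0
for _ in range(6):
    full = full * 64 + Z
print(full)         # 40359057765
print(full & MASK)  # 1704352101

# The sixth digit is shifted left by 30 bits, so only its low 2 bits
# survive a 32-bit mask; anything above 3 loses its high bits.
sixth = {".": 0, "/": 1, "0": 2, "1": 3, "2": 4, "Z": 37}
kept = {ch: ((d << 30) & MASK) >> 30 for ch, d in sixth.items()}
print(kept)
```

In this sketch "2" contributes nothing as a sixth character and "Z" contributes the same bits as "/", which matches the round-trip results in the table above.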
