This is a quick-hack module I wrote in a couple of hours that allows a nicer syntax for struct-like binary packing and unpacking. The goal was to make it concise and as C-like as possible. The script requires Python 3 for its improved metaclass support.

import struct

def preargs(cls):
    def _pre_init(*args1, **kwargs1):
        def _my_init(*args2, **kwargs2):
            args = args1 + args2
            kwargs1.update(kwargs2)
            return cls(*args, **kwargs1)
        return _my_init
    return _pre_init

class BinaryMetaType(type):
    def __getitem__(self, val):
        return array(self, val)

class BinaryType(metaclass=BinaryMetaType):
    def __init__(self, **kwargs):
        self._kwargs = kwargs

    def to_binary(self, val):
        pass

    def from_binary(self, binary):
        pass

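class SimpleBinaryType(BinaryType):
    def __init__(self, fmt, **kwargs):
        # Accept and forward the keyword options that subclasses pass along
        super().__init__(**kwargs)
        self._struct = struct.Struct(fmt)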

    def to_binary(self, val):
        return self._struct.pack(val)

    def from_binary(self, binary):
        return (self._struct.size,
                self._struct.unpack(binary[:self._struct.size])[0])

@preargs
class array(BinaryType):
    def __init__(self, arrtype, arrlen, **kwargs):
        super().__init__(**kwargs)
        self._arrtype, self._arrlen = arrtype(**kwargs), arrlen

    def to_binary(self, val):
        res = []
        for i,v in enumerate(val):
            res.append(self._arrtype.to_binary(v))
            if i+1 == self._arrlen: break
        return b''.join(res)

    def from_binary(self, binary):
        res = []
        ssum = 0
        for i in range(self._arrlen):
            s,v = self._arrtype.from_binary(binary[ssum:])
            ssum += s
            res.append(v)
        return ssum, res

class dword(SimpleBinaryType):
    def __init__(self, **kwargs):
        super().__init__('I', **kwargs)

class char(SimpleBinaryType):
    def __init__(self, **kwargs):
        super().__init__('c', **kwargs)

class BinaryBuilder(dict):
    def __init__(self, **kwargs):
        self.members = []
        self._kwargs = kwargs

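    def __setitem__(self, key, value):
        # Skip names the interpreter adds to every class body
        # (__module__, __qualname__, ...); they are not fields.
        if key.startswith('__') and key.endswith('__'): return
        if key not in self:
            self.members.append((key, value(**self._kwargs)))
        super().__setitem__(key, value)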

class Binary(type):
    @classmethod
    def __prepare__(*bases, **kwargs):
        # In the future kwargs can contain things such as endianness
        # and alignment
        return BinaryBuilder(**kwargs)

    def __new__(cls, name, bases, classdict):
        # There are nicer ways of doing this, but as a hack it works
        def fixupdict(d):
            @classmethod
            def to_binary(clas, datadict):
                res = []
                for k,v in clas.members:
                    res.append(v.to_binary(datadict[k]))
                return b''.join(res)

            @classmethod
            def from_binary(cls, bytesin):
                res = {}
                ssum = 0
                for k,v in cls.members:
                    i, d = v.from_binary(bytesin[ssum:])
                    ssum += i
                    res[k] = d
                return ssum, res

            nd = {'to_binary': to_binary,
              'from_binary': from_binary,
              'members': d.members}
            return nd

        return super().__new__(cls, name, bases, fixupdict(classdict))


#### How one would use the above module

class BMP(metaclass=Binary):
    # The point was to try and get this C-like syntax
    bfType = char[2]
    bfSize = dword
    bfReserved = dword
    bfOffBits = dword

print(BMP.from_binary(b'BM6\x00$\x00\x00\x00\x00\x006\x00\x00\x00'))
print(BMP.to_binary(
    {'bfType': [b'B', b'M'],  # struct's 'c' format needs length-1 bytes objects
     'bfSize': 2359350,
     'bfReserved': 0,
     'bfOffBits': 54}))

This was a quick attempt to get a nicer Python syntax for packing and unpacking binary data structures. It requires Python 3 because the __prepare__ metaclass method lets us record the declared items in order. I also use a metaclass __getitem__ to make the array syntax nicer (char[2] instead of array(char, 2)). I was going to use abstract base classes (ABC) for the types but decided it was overkill :) . This is really a quick version, so there is much that can be improved, but as a proof of concept I think it works very well. I would be interested in seeing this metaclass syntax used by something like the peach or construct Python modules.
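The two mechanisms mentioned above can be shown in isolation. The following is a minimal sketch, not the recipe's own code: RecordingDict, OrderedMeta, IndexableMeta, and Header are hypothetical names chosen for illustration, and the field "types" are plain placeholders. Note that on Python 3.6 and later a class body's namespace already preserves declaration order, so __prepare__ is mainly useful when you want a custom hook on assignment, as here.

```python
class RecordingDict(dict):
    """Class namespace that remembers the order in which names are first bound."""

    def __init__(self):
        super().__init__()
        self.order = []

    def __setitem__(self, key, value):
        # Ignore dunder names (__module__, __qualname__, ...) that the
        # interpreter stores in every class body.
        is_dunder = key.startswith('__') and key.endswith('__')
        if not is_dunder and key not in self:
            self.order.append(key)
        super().__setitem__(key, value)


class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Called before the class body runs; its return value is the
        # namespace that the body's assignments go through.
        return RecordingDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        cls.field_order = tuple(namespace.order)
        return cls


class IndexableMeta(type):
    def __getitem__(cls, length):
        # Makes `char[2]` legal on the class itself (hypothetical result:
        # a (name, length) tuple standing in for an array type).
        return (cls.__name__, length)


class char(metaclass=IndexableMeta):
    pass


class Header(metaclass=OrderedMeta):
    bfType = char[2]
    bfSize = 'dword'      # placeholder value, not a real type
    bfOffBits = 'dword'   # placeholder value, not a real type


print(Header.field_order)  # ('bfType', 'bfSize', 'bfOffBits')
print(Header.bfType)       # ('char', 2)
```

The recipe combines both ideas: its namespace object instantiates each declared type as it is assigned, and its metaclass __getitem__ returns an array type instead of a tuple.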

I only have two basic types, but it is trivial to add the rest. Adding substructs (a class inside a class) should be doable but might require fixing up the metaclass a bit. Once you have that, unions are trivial. The hardest feature still missing is element dependencies such as:

    msNumOfNumbers = dword
    msItems = dword[msNumOfNumbers]

but I think that will require some major restructuring. I might fix up the code to allow it if I need it or have time for it. Feel free to take this code and improve it or do with it what you want. Comments would be appreciated.
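To make the element-dependency idea concrete, here is a rough sketch of what such a field pair has to do at the byte level, written directly against the struct module rather than the recipe's classes. The function names pack_counted and unpack_counted are hypothetical, and the little-endian '<I' formats are an assumption made so the sizes are fixed across platforms. The unpack function returns the same (bytes consumed, value) pair that the recipe's from_binary methods use.

```python
import struct


def pack_counted(items, count_fmt='<I', item_fmt='<I'):
    """Pack a count followed by that many items."""
    parts = [struct.pack(count_fmt, len(items))]
    parts.extend(struct.pack(item_fmt, item) for item in items)
    return b''.join(parts)


def unpack_counted(data, count_fmt='<I', item_fmt='<I'):
    """Read a count, then that many items; return (bytes_consumed, items)."""
    count_struct = struct.Struct(count_fmt)
    item_struct = struct.Struct(item_fmt)

    # The array length is not known up front: it comes from the data itself.
    (count,) = count_struct.unpack_from(data, 0)
    offset = count_struct.size

    items = []
    for _ in range(count):
        (item,) = item_struct.unpack_from(data, offset)
        items.append(item)
        offset += item_struct.size
    return offset, items


blob = pack_counted([10, 20, 30])
print(unpack_counted(blob))  # (16, [10, 20, 30])
```

Supporting this inside the declarative syntax would presumably mean letting a field's from_binary see the values already decoded for earlier fields, which is the restructuring alluded to above.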

Created by Daniel Brodie on Wed, 25 Feb 2009 (MIT)