
The numpy.fromfile() function supports binary formats or decimal text. How do you read millions of hexadecimal numbers quickly?

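import numpy

# Fast version: decode all hex digits at once. On Python 3, bytes.fromhex() replaces str.decode('hex').
# byteswap() converts the big-endian digit order to native order; this assumes a little-endian host.
data = numpy.frombuffer(bytes.fromhex(open(filename).read().replace('\n', '')), dtype=numpy.uint32).byteswap()

# Slow version, for reference:
numpy.fromiter((int(x, 16) for x in open(filename)), dtype=numpy.uint32)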

Reading the numbers one by one and converting them with int(s, 16) is quite slow. This trick speeds it up by about a factor of 4 and avoids constructing millions of individual Python int objects.
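To see that the two approaches agree, here is a minimal sketch on a small in-memory sample rather than a file. The sample string is made up for illustration, and the fast path uses an explicit big-endian dtype (">u4") in place of byteswap() so that the result does not depend on the host byte order; that substitution is an assumption of this sketch, not part of the recipe.

```python
import numpy

# Hypothetical sample: three 8-digit hex numbers, one per line.
text = "0000000a\nffffffff\n12345678\n"

# Fast path: strip newlines, decode all digits in one call,
# then interpret every 4 bytes as one big-endian unsigned 32-bit integer.
raw = bytes.fromhex(text.replace("\n", ""))
fast = numpy.frombuffer(raw, dtype=">u4").astype(numpy.uint32)

# Slow path: one int() call (and one Python int object) per line.
slow = numpy.fromiter((int(line, 16) for line in text.splitlines()),
                      dtype=numpy.uint32)

# Both paths should yield the same values.
assert numpy.array_equal(fast, slow)
```

The speedup itself depends on the machine and the input, so it is best measured on your own data.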

Note that this method does not verify the format. It assumes that the input consists of numbers with a fixed width of exactly 8 chars and contains nothing but hexadecimal digits and newlines.
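If the input cannot be trusted, one option is to check the format before decoding. The following is only a sketch: parse_hex_uint32 is a hypothetical helper name, it assumes every line (including the last) ends with a newline, and the extra regular-expression pass costs time that has not been measured here.

```python
import re
import numpy

# One or more lines, each exactly 8 hex digits followed by a newline.
_HEX_LINES = re.compile(r"(?:[0-9a-fA-F]{8}\n)+")

def parse_hex_uint32(text):
    """Parse 8-digit hex lines into a uint32 array (hypothetical helper)."""
    # Reject anything that is not strictly fixed-width hex lines.
    if _HEX_LINES.fullmatch(text) is None:
        raise ValueError("expected only newline-terminated 8-digit hex lines")
    # Decode all digits at once and read them as big-endian uint32 values.
    raw = bytes.fromhex(text.replace("\n", ""))
    return numpy.frombuffer(raw, dtype=">u4").astype(numpy.uint32)

values = parse_hex_uint32("0000000a\nffffffff\n")
```

Malformed input such as a short line then raises ValueError instead of silently shifting every following number.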