Accepts one or more files and/or globs and interleaves the lines from each, writing the result to stdout.
#!/usr/bin/env python
"""interleave.py <glob1> [, <glob2> ... ]

Accepts one or more files or globs, interleaving lines and writing to stdout.
"""
import os
import sys
import glob


def iter_interleave(*iterables):
    """
    A generator that interleaves the output from one or more iterators
    until they are *all* exhausted.
    """
    iterables = map(iter, iterables)  # a list under Python 2
    while iterables:
        result = []
        for it in iterables:
            try:
                result.append(it.next())
            except StopIteration:
                iterables.remove(it)
        for item in result:
            yield item


if __name__ == '__main__':
    files = []
    if len(sys.argv) < 2:
        # No arguments: print just the usage line and exit with an error
        print __doc__.split("\n")[0]
        sys.exit(1)
    if sys.argv[1].lower() in ('-h', '--help'):
        print __doc__,
        sys.exit(0)
    for arg in sys.argv[1:]:
        for entry in glob.glob(arg):
            if os.path.isfile(entry):
                files.append(open(entry, 'U'))  # Use universal newline support
    for line in iter_interleave(*files):
        print line,  # trailing comma: the line already ends with a newline
I had a need to interleave lines from multiple files into a single file and didn't know of a simple UNIX command to do this - I'm sure a solution will be forthcoming now that I've said it ;-)
Here is a handy cross-platform Python version that I created in about 10 minutes.
The key part is the generator that combines lines read from file objects. It is important to note that its behaviour differs from itertools.izip(), which ends as soon as one of the iterables runs out of data. That would cause a problem here if the input files have varying lengths.
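To make the difference concrete, here is a small sketch in Python 3, where izip() became the built-in zip(). The helper name interleave_all and the sample data are made up for illustration, and it uses itertools.zip_longest with a sentinel rather than the generator from the script, so treat it as one possible way to get the same behaviour, not as the recipe itself.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking "this input has run out"


def interleave_all(*iterables):
    """Yield items round-robin until every input is exhausted (hypothetical helper)."""
    for group in zip_longest(*iterables, fillvalue=_MISSING):
        for item in group:
            if item is not _MISSING:
                yield item


# Two "files" of different lengths, represented as lists of lines.
short = ["a1\n", "a2\n"]
long_ = ["b1\n", "b2\n", "b3\n", "b4\n"]

# zip() stops at the shortest input, so b3 and b4 are silently dropped.
truncated = [line for pair in zip(short, long_) for line in pair]

# The sentinel-based version keeps going until both inputs are empty.
complete = list(interleave_all(short, long_))
```

Here truncated ends after b2, while complete also contains b3 and b4.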
On a GNU-based system, I can think of:
It does not take ten minutes to write, but it's clearly no better than your solution in terms of readability or portability.
You shouldn't remove items from a list when you're iterating over it.
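A quick sketch of why that matters, written for Python 3. interleave_buggy is my own port of the recipe's loop that keeps the in-place remove(), and interleave_fixed is one possible repair that builds a fresh list each round; both names and the sample inputs are assumptions made for this illustration.

```python
def interleave_buggy(*iterables):
    """Port of the recipe's loop, keeping remove() on the list being iterated."""
    iterators = [iter(it) for it in iterables]
    while iterators:
        result = []
        for it in iterators:
            try:
                result.append(next(it))
            except StopIteration:
                # Shrinks the list mid-loop, so the next iterator is skipped
                # for this round.
                iterators.remove(it)
        for item in result:
            yield item


def interleave_fixed(*iterables):
    """Same idea, but exhausted iterators are dropped by building a new list."""
    iterators = [iter(it) for it in iterables]
    while iterators:
        still_active = []
        for it in iterators:
            try:
                item = next(it)
            except StopIteration:
                continue  # exhausted: do not carry it into the next round
            still_active.append(it)
            yield item
        iterators = still_active


a, b, c = [1], [10, 20, 30], [100, 200, 300]
buggy = list(interleave_buggy(a, b, c))
fixed = list(interleave_fixed(a, b, c))
```

With these inputs nothing is lost in the buggy version, but the order is off: once a runs out, b is skipped for one round, so 200 comes out before 20.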
Also, Python's docs for the itertools library show a roundrobin recipe that does this: http://docs.python.org/2/library/itertools.html (the recipe has only been in the docs since Python 2.5.4, Dec 2008, so it was still pretty new when this was created).
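For reference, here is roughly what that recipe looks like when adapted to Python 3. This is my adaptation of the version in the Python 2 docs (it.next becomes it.__next__), not a verbatim copy, and the current docs may word it differently.

```python
from itertools import cycle, islice


def roundrobin(*iterables):
    """roundrobin('ABC', 'D', 'EF') --> A D E B F C"""
    pending = len(iterables)
    # Cycle over the bound __next__ methods of each iterator.
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for get_next in nexts:
                yield get_next()
        except StopIteration:
            # The iterator that just ran out is the most recent one in the
            # cycle; keep only the remaining ones.
            pending -= 1
            nexts = cycle(islice(nexts, pending))


letters = list(roundrobin("ABC", "D", "EF"))
```

To use it in the script above, you would pass the open file objects instead of strings, e.g. roundrobin(*files).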