Dump and load variables for exchanging data between Matlab and numarray. The variables are held in a dictionary, and the value under each key is dumped into a file named after that key. If a value is complex, its real and imaginary parts are dumped into separate files.
Loading files created by Matlab is also possible. In this case, the keys of the returned dictionary are derived from the file names and the values are read from the files.
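The naming scheme described above can be sketched on its own, without numarray. This is a minimal illustration in Python 3 using plain lists; the helper names `dump_file_names` and `key_from_file_name` are hypothetical and do not appear in the recipe.

```python
def dump_file_names(data):
    """Map each key to the .dat file name(s) implied by the naming scheme."""
    names = {}
    for key, values in data.items():
        # A value with any complex entry is split into real and imag files.
        if any(isinstance(v, complex) for v in values):
            names[key] = [f"{key}-data-real.dat", f"{key}-data-imag.dat"]
        else:
            names[key] = [f"{key}-data.dat"]
    return names


def key_from_file_name(file_name):
    """Recover the dictionary key: everything before '-data'."""
    return file_name[:file_name.index("-data")]


names = dump_file_names({"A": [1, 2, 3, 4], "B": [1 + 1j, 2 + 2j]})
```

Here `names["B"]` holds two file names, and `key_from_file_name` maps both of them back to the single key `B`.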
#!/usr/bin/python
# Created by Karthikesh Raju < karthik@james.hut.fi >
#
# Created: Sat 26 Jun 2004 12:45:35 PM EEST
# Last Modified: Thu 22 Jul 2004 08:28:17 PM EEST
"""
This script dumps a dictionary of tuples and arrays (numarray) to files.
The files are named after the keys of the dictionary (+ "-data.dat"). If a
value is complex, its data is dumped into two files named
key-data-real.dat and key-data-imag.dat.
Given a directory containing a set of data files ('*.dat'), the script loads
all the files into a dictionary "result" with keys corresponding to the
principal file name. Files with "real" and "imag" in their names are loaded
into a single complex array.
"""
# To Do:
# 1. How to handle dumping, if file already exists?
# 2. What about the ability to dump individual variables by name and load
# them into the primary workspace?
# 3. Load data with a file name or the directory.
__revision__ = "$Revision: 1.6$"
import numarray, tarfile
import glob, string, os, time
import ConfigParser
def dumpData(data, fileName):
    """
    Dumps the data into files. Complex variables get dumped into files
    varName-data-real.dat, varName-data-imag.dat.
    Usage:
        data = {}
        data['A'] = array([1,2,3,4])
        data['B'] = array([1,2,3,4]) + 1j* array([1,2,3,4])
        data['C'] = array([[1,2,3,4],[5,6,7,8]])
        dumpData(data, 'myData')   # 'myData' is only an example prefix
    This creates files A-data.dat, B-data-real.dat, B-data-imag.dat, C-data.dat,
    then packs them into an archive named <fileName><date>.tar.gz and removes
    the individual files.
    """
    try:
        print "Dumping Data:"
        # Keys become file names, so for each key, create a file.
        for key in data.keys():
            print ". ",
            # If the value corresponding to the key is complex,
            # create separate files for the real and imag parts.
            if data[key].typecode() in ('D', 'F'):
                open1 = open(str(key) + '-data-real.dat', 'w')
                # Write real part to its corresponding file
                # by writeData function.
                writeData(open1, data[key].real)
                open1.close()
                open1 = open(str(key) + '-data-imag.dat', 'w')
                # Write imag part to its corresponding file
                # by writeData function.
                writeData(open1, data[key].imag)
                open1.close()
            else:
                # The data is real, so there is just one file
                # created for the variable, and the data is
                # written into the corresponding file.
                open1 = open(str(key) + '-data.dat', 'w')
                writeData(open1, data[key])
                open1.close()
        print "\n"
        # Compress all the created files in an archive
        # named with today's date. Note that this picks up every
        # '*.dat' file in the current directory.
        files = glob.glob('*.dat')
        today = time.ctime()
        today = today.replace(' ', '-')
        today = today.replace(':', '-')
        compressName = fileName + today + '.tar.gz'
        compress(files, compressName)
    except AttributeError:
        # If the passed variable is not dictionary, print
        # "only dictionaries are supported"
        print "Only dictionaries are supported. \n"
def loadData(dirName, archiveName):
    """
    Given a directory and the name of an archive in it, loadData extracts the
    archive and loads all files with extension ".dat" into a dictionary.
    The keys of the dictionary are derived from the file names.
    Anything before -data-*.dat, forms the key for the dictionary.
    Usage:
        ~/test/A-data.dat
        ~/test/B-data-real.dat
        ~/test/B-data-imag.dat
        result = loadData('~/test', archiveName)
    Result:
        result['A'] = array([1.,2.,3.,4.])
        result['B'] = array([1.+1.j,2.+2.j,3.+3.j,4.+4.j])
    Note:
        All data is converted into floats. So, the resultant matrix is a
        matrix of floats, even if the original data is int.
    """
    # marker sets the starting part of the variable.
    # File names will be dirName/A-data.dat, so marker is the
    # point where A-data.dat starts
    marker = len(dirName)+1
    if (dirName[-1] == "/" or archiveName[0] == "/"):
        archiveName = dirName + archiveName
    else:
        archiveName = dirName + "/" + archiveName
    uncompress(archiveName)
    # Choose only the extracted dat files
    dirName = dirName + '/*.dat'
    files = glob.glob(dirName)
    # Result holds the final dictionary
    result = {}
    # For each file in *.dat, separate the variable which is
    # anything before -data-*. In the above case extract A from
    # A-data.dat. This is the key for the dictionary.
    #
    # Pass the file to the matrix creator which returns an array
    # of the data
    print "Loading Data: "
    for name in files:
        print ". ",
        # Search from marker onwards, so that a '-' in the directory
        # name is not mistaken for the end of the variable name.
        var = name.find('-', marker)
        varName = name[marker:var]
        # find() returns -1 when the text is absent, so compare
        # against -1 instead of testing the index for truth.
        isReal = name.find('real', var) != -1
        isImag = name.find('imag', var) != -1
        # file2matrix returns the contents of file "name" in an array
        value = file2matrix(name)
        os.remove(name)
        # If the key already exists, then this value should
        # be either real or imag
        if result.has_key(varName):
            if isReal: # real value, the stored one is the imag part
                result[varName] = value + 1j*result[varName]
            elif isImag: # imag value, the stored one is the real part
                result[varName] = result[varName] + 1j*value
        else:
            # If the key does not exist, then create a corresponding
            # entry with the array "value" returned.
            result[varName] = value
    # "result", returned, contains the data in the files.
    return result
def file2matrix(fileName):
    """
    Given a file "fileName", this function converts the contents of the
    file to a numarray matrix of size (m,) or (m,n).
    """
    # Open the file
    open1 = open(fileName, 'r')
    line = open1.readline()
    # A single line from the file is read. Splitting "line" results
    # in a list, the length of which is the number of columns: cols
    cols = len(line.split())
    # x is the placeholder for the data.
    # The placeholder matrix is made of concatenating
    # every line that is read from the file.
    # The type code is float.
    x = numarray.zeros(cols, type='f')
    while (line):
        temp = []
        list_elements = line.split()
        for e in list_elements:
            temp.append(float(e))
        x = numarray.concatenate((x, numarray.array(temp)))
        line = open1.readline()
    open1.close()
    # concatenate((x,x)) is along the dimension 1,
    # each new addition is an increase in columns,
    # the total length of x gives m*n. We determine
    # the rows as
    rows = len(x)//cols
    # We have added an artificial row, so if rows == 2
    # the file has a tuple, hence the data consists of
    # everything after our artificial set of columns
    if rows == 2:
        return x[cols:]
    else:
        # If there are m+1 rows, we resize the data to (m+1)xn
        # and return everything other than the artificial first
        # row of zeros
        x = numarray.resize(x, (rows, cols))
        return x[1:, :]
def writeData(fileHandle, data):
    """
    Given a handle to a file, and a matrix/tuple "data",
    the data is written to the file pointed to by "fileHandle".
    If "data" is a matrix, each row is a separate line. The
    file is of type string.
    """
    # If "data" is a matrix, each row forms a line.
    try:
        i, j = data.shape
        for ii in range(0, i):
            for jj in range(0, j):
                fileHandle.write("%s " %data[ii, jj])
            fileHandle.write('\n')
        fileHandle.write('\n')
    except ValueError:
        # If "data" is a tuple, there is just a row in the file.
        i = data.shape[0]
        for ii in range(0, i):
            fileHandle.write("%s " %data[ii])
        fileHandle.write('\n')
def compress(files, compressName):
    """
    Compress all the given files to a tar.gz archive.
    Additionally remove all the given files (other than
    the archive) after compression.
    """
    tar = tarfile.open(compressName, "w:gz")
    for fileName in files:
        tar.add(fileName)
        os.remove(fileName)
    tar.close()
def uncompress(archiveFiles):
    """
    Given an archive file name, extract its contents
    into the current directory.
    """
    tar = tarfile.open(archiveFiles, "r:gz")
    for tarinfo in tar:
        tar.extract(tarinfo)
    tar.close()
class CaseConfigParser(ConfigParser.ConfigParser):
    """ A Case Sensitive Config Parser """
    def optionxform(self, optionstr):
        """ Overloading the returned optionstr to make it case sensitive """
        return optionstr
def loadConfig(fileName, config={}):
    """
    Reads the config file "fileName" and returns two dictionaries, one per
    section, built from keys of the form <section>.<option> and their values.
    """
    config = config.copy()
    cp = CaseConfigParser()
    cp.read(fileName)
    for sec in cp.sections():
        for opt in cp.options(sec):
            config[sec + "." + opt] = string.strip(cp.get(sec, opt))
    # The returned dictionary has the key of the form <section>.<option>
    # convertConfigDict converts every one of such form to separate
    # dictionaries for each of the <section>.
    config = convertConfigDict(config)
    # For each value, convert all the possible numerical
    # values to float. The path names and other strings
    # should remain the same
    for key in config.keys():
        for inKey in config[key].keys():
            try:
                config[key][inKey] = float(config[key][inKey])
                if (config[key][inKey] == int(config[key][inKey])):
                    config[key][inKey] = int(config[key][inKey])
            except ValueError:
                pass
    keys = config.keys()
    # The first dictionary should correspond to the source
    # and the second dictionary contains the parameters of
    # the jammer. Return the two dictionaries.
    # NOTE: the order of keys in a plain dictionary is arbitrary, so
    # which section comes first is not guaranteed by this code.
    return config[keys[0]], config[keys[1]]
def convertConfigDict(origDict, sep="."):
    """
    For each key in the dictionary of the form <section>.<option>,
    a separate dictionary is formed for every <section>,
    while the <option>s become the keys of the inner dictionary.
    Returns a dictionary of dictionaries.
    """
    result = {}
    for keys in origDict:
        tempResult = result
        parts = keys.split(sep)
        for part in parts[:-1]:
            tempResult = tempResult.setdefault(part, {})
        tempResult[parts[-1]] = origDict[keys]
    return result
if __name__ == "__main__":
    import numarray.random_array as ra
    testData = 0
    if testData:
        test_data = {}
        test_data['A'] = ra.random_integers(0, 10, 5)
        test_data['B'] = ra.random_integers(10, 50, (5, 5))
        test_data['C'] = ra.random_integers(10, 50, (5, 5)) + 0j
        test_data['D'] = ra.random_integers(10, 50, (5, 5)) + 1j* \
                         ra.random_integers(10, 100, (5, 5))
        test_data['E'] = ra.random_integers(10, 50, 5) + 1j*\
                         ra.random_integers(10, 100, 5)
        # NOTE: dumpData(data, fileName) and loadData(dirName, archiveName)
        # each take two arguments; the second one is missing in the
        # two calls below, so this branch fails if it is enabled.
        dumpData(test_data)
        result = loadData('/home/karthik/dataloader')
    else:
        source, jammer = loadConfig("test.ini")
        print "--"*30
        print "\n"
        for keys in source.keys():
            print "%s: \t %s \n"%(keys, source[keys])
        print "--"*30
        print "\n"
        for keys in jammer.keys():
            print "%s: \t %s \n"%(keys, jammer[keys])
        print "--"*30
We had to move data files between Matlab and numarray at different points while developing our algorithms. This recipe stores the variables in files in ASCII format. Moreover, since the data is loaded back in the form of a dictionary, it can be exchanged between Matlab and Python with little effort.
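The ASCII round trip can be illustrated without numarray. The sketch below is Python 3 with the standard library only, working on nested lists and an in-memory buffer instead of arrays and files on disk. The helper names `write_rows`, `read_rows` and `combine` are hypothetical, not part of the recipe.

```python
import io


def write_rows(handle, rows):
    """Write each row as one line of space-separated values."""
    for row in rows:
        handle.write(" ".join(str(v) for v in row) + "\n")


def read_rows(handle):
    """Read space-separated rows back; every value becomes a float."""
    return [[float(tok) for tok in line.split()]
            for line in handle if line.strip()]


def combine(real_rows, imag_rows):
    """Merge separately stored real and imag parts into complex values."""
    return [[complex(r, i) for r, i in zip(real_row, imag_row)]
            for real_row, imag_row in zip(real_rows, imag_rows)]


buf = io.StringIO()
write_rows(buf, [[1, 2], [3, 4]])
buf.seek(0)
loaded = read_rows(buf)
```

As in the recipe, integers come back as floats, because the text file carries no type information.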
This file additionally loads config files. While loading a config file, it preserves the case of each entry and its type (float, int or string).
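The same idea can be sketched with the `configparser` module of Python 3 (the recipe itself uses the Python 2 `ConfigParser` module). The section and option names in the sample text are made up for illustration, and `convert` and `load_config_text` are hypothetical helper names.

```python
import configparser


def convert(value):
    """Turn numeric strings into int or float; leave other strings alone."""
    try:
        number = float(value)
    except ValueError:
        return value
    if number.is_integer():
        return int(number)
    return number


def load_config_text(text):
    """Parse config text into {section: {option: typed value}}."""
    parser = configparser.ConfigParser()
    # By default option names are lowercased; str keeps them as written.
    parser.optionxform = str
    parser.read_string(text)
    result = {}
    for section in parser.sections():
        result[section] = {}
        for option in parser.options(section):
            result[section][option] = convert(parser.get(section, option).strip())
    return result


sample = "[Source]\nNumUsers = 4\nSNR = 2.5\nPath = /tmp/Data\n"
cfg = load_config_text(sample)
```

Unlike the recipe's `loadConfig`, this sketch returns one dictionary keyed by section name, which avoids depending on the order of the sections.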
This update adds the config functionality, and the program has been checked with pylint for space/tab issues and other errors and coding problems.
All suggestions are welcome.