Class for extracting and manipulating data from Microsoft Access - identifying tables, getting recordsets, iterating through the results, counting rows, getting field names, getting index information, deleting indexes, and adding and deleting relationships.
"""
Enables you to interrogate an Access database, run queries, and get
results.
ADODB = Microsoft ActiveX Data Objects reference
ADOX = Microsoft ADO Ext
Great reference for ADODB is:
http://www.codeguru.com/cpp/data/mfc_database/ado/article.php/c4343/
Originally just an API wrapped around Douglas Savitsky's code at
http://www.ecp.cc/pyado.html
Recordset iterator taken from excel.py in Nicolas Lehuen's code at
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/440661
Handling of field types taken from Craig Anderson's code at
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/104801
An alternative approach might be
http://phplens.com/lens/adodb/adodb-py-docs.htm
v1.0.5 - added ability to add a primary-foreign table relationship
v1.0.4 - added ability to delete a relationship by name
v1.0.3 - add ability to delete a named index, and to
close (release) a table.
v1.0.2 - added Close method to connection (recordset
automatically closes self already)
v1.0.1 - added DOUBLE and reordered data const mappings
"""
#To get constant values, open Access, make sure ADODB and ADOX are references,
# open library, and look at globals
AD_OPEN_KEYSET = 1
AD_LOCK_OPTIMISTIC = 3
AD_KEY_FOREIGN = 2
AD_RI_CASCADE = 1
INTEGER = 'integer'
SMALLINT = 'smallint'
UNSIGNEDTINYINT = 'unsignedtinyint'
CURRENCY = 'currency'
DATE = 'date'
BOOLEAN = 'boolean'
TIMESTAMP = 'timestamp'
VARCHAR = 'varchar'
LONGVARCHAR = 'longvarchar'
SINGLE = 'single'
DOUBLE = 'double'
INDEX_UNIQUE = 'unique'
INDEX_NOT_UNIQUE = 'notunique'
INDEX_PRIMARY = 'indexprimary'
INDEX_NOT_PRIMARY = "indexnotprimary"
import win32com.client
#Must run makepy once -
#see http://www.thescripts.com/forum/thread482449.html e.g. the following
#way - run PYTHON\Lib\site-packages\pythonwin\pythonwin.exe (replace
#PYTHON with folder python is in). Tools>COM Makepy utility - select
#library named Microsoft ActiveX Data Objects 2.8 Library (2.8) and
#select OK. Microsoft ActiveX Data Objects Recordset 2.8 Library (2.8)
class AccessDb:
    "Interface to MS Access database"

    def __init__(self, data_source, user, pwd="''", mdw="''"):
        """Returns a connection to the jet database
        NB use .Close() to close (NB title case unlike closing a file)"""
        self.connAccess = win32com.client.Dispatch(r'ADODB.Connection')
        """DSN syntax - http://support.microsoft.com/kb/193332 and
        http://www.codeproject.com/database/connectionstrings.asp?
        df=100&forumid=3917&exp=0&select=1598401"""
        DSN = """PROVIDER=Microsoft.Jet.OLEDB.4.0;DATA SOURCE=%s;
            USER ID=%s;PASSWORD=%s;Jet OLEDB:System Database=%s;""" % \
            (data_source, user, pwd, mdw)
        #print DSN
        try:
            self.connAccess.Open(DSN)
        except Exception:
            raise Exception, "Unable to open MS Access database " + \
                "using DSN: %s" % DSN

    def getConn(self):
        "Get connection"
        return self.connAccess

    def closeConn(self):
        "Close connection"
        self.connAccess.Close()
    def getRecordset(self, SQL_statement, dict=True):
        "Get recordset"
        return Recordset(self.connAccess, SQL_statement, dict=dict)

    def getTableNames(self):
        "Get list of tables. NB not system tables"
        cat = win32com.client.Dispatch(r'ADOX.Catalog')
        cat.ActiveConnection = self.connAccess
        alltables = cat.Tables
        tab_names = []
        for tab in alltables:
            if tab.Type == 'TABLE':
                tab_names.append(tab.Name)
        return tab_names

    def getTables(self):
        "Get dictionary of table objects - table name is the key"
        tab_names = self.getTableNames()
        tabs = {}
        for tab_name in tab_names:
            tabs[tab_name] = Table(self.connAccess, tab_name)
        return tabs

    def runQuery(self, SQL_statement):
        "Run SQL_statement"
        cmd = win32com.client.Dispatch(r'ADODB.Command')
        cmd.ActiveConnection = self.connAccess
        cmd.CommandText = SQL_statement
        cmd.Execute()
    def deleteIndex(self, tab_name, idx_name):
        """
        Delete index by name.
        NB cannot delete an index if a table is locked.
        Or if it is part of a relationship (I would expect).
        Release (close) it first.
        """
        cat = win32com.client.Dispatch(r'ADOX.Catalog')
        cat.ActiveConnection = self.connAccess
        index_coll = cat.Tables(tab_name).Indexes
        try:
            index_coll.Delete(idx_name)
        except Exception, e:
            raise Exception, "Unable to delete index - if table is " + \
                "locked, make sure you release (close) it first. " + \
                "Orig error: " + str(e)
        cat = None
    def addRelationship(self, tab_foreign_name, tab_foreign_key,
                        tab_primary_name, tab_primary_key,
                        rel_name, cascade_del=False,
                        cascade_update=False):
        """
        Add primary table-foreign table relationship
        """
        tabs = [tab_foreign_name, tab_primary_name]
        for tab in tabs:
            if tab not in self.getTableNames():
                raise Exception, "Table \"%s\" is not in this database" \
                    % tab
        cat = win32com.client.Dispatch(r'ADOX.Catalog')
        cat.ActiveConnection = self.connAccess
        tbl_foreign = cat.Tables(tab_foreign_name)
        new_key = win32com.client.Dispatch(r'ADOX.Key')
        try:
            new_key.Name = rel_name
            new_key.Type = AD_KEY_FOREIGN
            new_key.RelatedTable = tab_primary_name
            new_key.Columns.Append(tab_foreign_key)
            new_key.Columns(tab_foreign_key).RelatedColumn = tab_primary_key
            if cascade_del:
                new_key.DeleteRule = AD_RI_CASCADE
            if cascade_update:
                new_key.UpdateRule = AD_RI_CASCADE
            tbl_foreign.Keys.Append(new_key)
        except Exception, e:
            raise Exception, "Unable to add relationship '%s'. " % \
                rel_name + "Orig error: %s" % str(e)
        finally:
            tbl_foreign = None
            cat = None
    def deleteRelationship(self, tab_foreign_name, rel_name):
        """
        Delete relationship by relationship name.
        Need name of "foreign" table.
        http://msdn2.microsoft.com/en-us/library/aa164927(office.10).aspx
        """
        if tab_foreign_name not in self.getTableNames():
            raise Exception, "Table \"%s\" is not in this database" % \
                tab_foreign_name
        cat = win32com.client.Dispatch(r'ADOX.Catalog')
        cat.ActiveConnection = self.connAccess
        tbl_foreign = cat.Tables(tab_foreign_name)
        tbl_keys = [x.Name for x in tbl_foreign.Keys]
        if rel_name not in tbl_keys:
            raise Exception, "\"%s\" is not in " % rel_name + \
                "relationships for table \"%s\"" % tab_foreign_name
        tbl_foreign.Keys.Delete(rel_name)
        tbl_foreign = None
        cat = None
class Table():
    "MS Access table object with rs, name, and indexes properties"

    def __init__(self, connAccess, tab_name):
        # connAccess must be the ADODB connection (e.g. AccessDb.getConn()),
        # not the AccessDb wrapper itself
        self.connAccess = connAccess
        self.rs = win32com.client.Dispatch(r'ADODB.Recordset')
        try:
            self.rs.Open("[%s]" % tab_name, self.connAccess, AD_OPEN_KEYSET,
                         AD_LOCK_OPTIMISTIC)
        except Exception, e:
            raise Exception, "Problem opening " + \
                "table \"%s\" - " % tab_name + \
                "orig error: %s" % str(e)
        self.name = tab_name
        self.indexes = self.__getIndexes()

    def getFields(self):
        "Get list of field objects"
        field_names = [field.Name for field in self.rs.Fields]
        fields = []
        for field_name in field_names:
            fields.append(Field(self.rs, field_name))
        return fields

    def __getIndexes(self):
        "Get list of table indexes"
        cat = win32com.client.Dispatch(r'ADOX.Catalog')
        cat.ActiveConnection = self.connAccess
        index_coll = cat.Tables(self.name).Indexes
        indexes = []
        for index in index_coll:
            indexes.append(Index(index))
        # release the catalog before returning (this line was unreachable
        # when it followed the return statement)
        cat = None
        return indexes

    def close(self):
        "Close table (releasing it)"
        self.rs.Close()
class Index():
    """MS Access index object with following properties: name,
    index type (UNIQUE or not), primary or not, and index fields -
    a tuple of index fields in index"""

    def __init__ (self, index):
        self.name = index.Name
        if index.Unique:
            self.type = INDEX_UNIQUE
        else:
            self.type = INDEX_NOT_UNIQUE
        self.fields = []
        for item in index.Columns:
            self.fields.append(item.Name)
        if index.PrimaryKey:
            self.primary = INDEX_PRIMARY
        else:
            self.primary = INDEX_NOT_PRIMARY
class Field():
    "MS Access field object with name, type, and size properties"

    def __init__ (self, rs, field_name):
        self.name = field_name
        adofield = rs.Fields.Item(field_name)
        adotype = adofield.Type
        #http://www.devguru.com/Technologies/ado/quickref/field_type.html
        if adotype == win32com.client.constants.adInteger:
            self.type = INTEGER
        elif adotype == win32com.client.constants.adSmallInt:
            self.type = SMALLINT
        elif adotype == win32com.client.constants.adUnsignedTinyInt:
            self.type = UNSIGNEDTINYINT
        elif adotype == win32com.client.constants.adSingle:
            self.type = SINGLE
        elif adotype == win32com.client.constants.adDouble:
            self.type = DOUBLE
        elif adotype == win32com.client.constants.adCurrency:
            self.type = CURRENCY
        elif adotype == win32com.client.constants.adBoolean:
            self.type = BOOLEAN
        elif adotype == win32com.client.constants.adDate:
            self.type = DATE
        elif adotype == win32com.client.constants.adDBTimeStamp:
            self.type = TIMESTAMP
        elif adotype == win32com.client.constants.adVarWChar:
            self.type = VARCHAR
        elif adotype == win32com.client.constants.adLongVarWChar:
            self.type = LONGVARCHAR
        else:
            # string exceptions are not allowed in Python 2.6+; raise a
            # real exception instead
            raise Exception("Unrecognised ADO field type %d" % adotype)
        self.size = adofield.DefinedSize
def encoding(value):
    if isinstance(value,unicode):
        value = value.strip()
        if len(value)==0:
            return None
        else:
            return value.encode("mbcs") #mbcs is a Windows, locale-specific encoding
    elif isinstance(value,str):
        value = value.strip()
        if len(value)==0:
            return None
        else:
            return value
    else:
        return value
class Recordset():
    "MS Access recordset created from a query"

    def __init__ (self, connAccess, SQL_statement, dict):
        self.rs = win32com.client.Dispatch(r'ADODB.Recordset')
        self.rs.CursorLocation = 3 #uses client - makes it possible to use RecordCount property
        self.rs.Open(SQL_statement, connAccess, AD_OPEN_KEYSET,
                     AD_LOCK_OPTIMISTIC)
        self.dict = dict

    def getFieldNames(self):
        "Get list of field names"
        field_names = [field.Name for field in self.rs.Fields]
        return field_names

    def hasRows(self):
        "Does the recordset contain any rows?"
        try:
            self.rs.MoveFirst()
        except:
            return False
        return True

    def getCount(self):
        """
        Get record count - NB rs.CursorLocation had to be set to
        3 (client) to enable this
        """
        try:
            return self.rs.RecordCount
        except:
            return 0
    def __iter__(self):
        " Returns a paged iterator by default. See paged()."
        return self.paged()

    def paged(self,pagesize=128):
        """ Returns an iterator on the data contained in the recordset. Each
        row is returned as a dictionary with field names as keys (or as a
        list of values if the recordset was created with dict=False).
        pagesize is the size of the buffer of rows; it is an implementation
        detail but could have an impact on the speed of the iterator. Use
        pagesize=-1 to buffer the whole recordset in memory.
        """
        try:
            field_names = self.getFieldNames()
            #field_names = [self.encoding(field.Name) for field in recordset.Fields]
            ok = True
            while ok:
                # Thanks to Rogier Steehouder for the transposing tip
                rows = zip(*self.rs.GetRows(pagesize))
                if self.rs.EOF:
                    # close the recordset as soon as possible
                    self.rs.Close()
                    self.rs = None
                    ok = False
                for row in rows:
                    if self.dict:
                        yield dict(zip(field_names, map(encoding,row)))
                    else:
                        yield(map(encoding, row))
        except:
            if self.rs is not None:
                self.rs.Close()
                del self.rs
            raise
If you need to get data out of a Microsoft Access database using Python, or run queries on the data, this class makes it easy. With it you can take the data and either work with it directly (e.g. producing HTML, CSV, or XML output) or transfer it into MySQL, SQLite etc. as fully indexed tables ready to go. Written by Grant Paton-Simpson, PSAL (www.p-s.co.nz)
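As a rough illustration of the "transfer it into SQLite" idea, here is a minimal sketch. The helper name copy_rows_to_sqlite and the sample data are hypothetical and not part of the recipe; the rows are a plain list of dictionaries standing in for what iterating a Recordset created with dict=True would yield, so the sketch runs without Access or win32com. It creates untyped columns and no indexes, so treat it as a starting point only.

```python
import sqlite3


def copy_rows_to_sqlite(rows, field_names, sqlite_conn, table_name):
    """Create table_name in SQLite and insert each row dictionary into it.

    Hypothetical helper: rows is any iterable of dictionaries keyed by
    field name; field_names fixes the column order.
    """
    # Identifiers are double-quoted, but names are assumed to be trusted.
    cols = ", ".join('"%s"' % name for name in field_names)
    placeholders = ", ".join("?" for _ in field_names)
    sqlite_conn.execute('CREATE TABLE "%s" (%s)' % (table_name, cols))
    insert_sql = 'INSERT INTO "%s" (%s) VALUES (%s)' % (
        table_name, cols, placeholders)
    count = 0
    for row in rows:
        # Values are passed as parameters, never formatted into the SQL.
        sqlite_conn.execute(insert_sql, [row[name] for name in field_names])
        count += 1
    sqlite_conn.commit()
    return count


# Stand-in data; with the class above this would be a Recordset instead.
sample_rows = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
]
dest = sqlite3.connect(":memory:")
copied = copy_rows_to_sqlite(sample_rows, ["id", "name"], dest, "demo")
```

With a real database, the call would presumably look like rs = db.getRecordset("SELECT * FROM [mytable]") followed by copy_rows_to_sqlite(rs, rs.getFieldNames(), dest, "mytable"). The field names need to be fetched before iterating, because the recordset closes itself once it reaches the end. Under Python 2, non-ASCII text returned as mbcs byte strings may need decoding before SQLite will accept it.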
Thank you for posting the above. Appreciate your help.
I'm having issues using this and I'm not sure what I'm doing wrong. I know it has been a while since this was posted. I have an Access database with data in 2 tables that I need to clear out and then update with new data.
db = 'C:\Documents and Settings\gregg\My Documents\trace_11_10\cert level\cert_data2.mdb'
conn = AccessDb(db,user='Admin',mdw='w')
systable = 'Sys_Req_to_SW_Req_new'  # table name; got this by first running conn.getTables()
swtable = 'SW_req_to_code_split_out'
tables = Table(conn,systable)  # needing to get the index, I think, so that I can use deleteIndex
conn.deleteIndex(systable,tables.index)
Getting the error on the tables line:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\python26\lib\site-packages\spyder-1.0.0-py2.6.egg\spyderlib\widgets\externalshell\startup.py", line 35, in __traceit
    __t.run('execfile("{0}")'.format(filename))
  File "C:\Python26\lib\trace.py", line 498, in run
    exec cmd in dict, dict
  File "<string>", line 1, in <module>
  File "_gate_temp_0_Script1.py", line 388, in <module>
    tables = Table(conn,systable)
  File "_gate_temp_0_Script1.py", line 172, in __init__
    "orig error: %s" % str(e)
Exception: Problem opening table "Sys_Req_to_SW_Req_new" - orig error: Objects of type 'instance' can not be converted to a COM VARIANT