I am faced with the task of iterating through a substantial set of MS Access database (*.mdb) files and, without knowing anything about their schema, writing out an XML file containing all of the table contents. The idea is that I can then run a search engine spider over the resulting XML files so that I can locate databases on the file system. This is part of a data discovery tool used for risk management.
I am seeking the wisdom of fellow monks on this task because I doubt that I am the only one to have needed to do this. I am also looking for an example of how to discover the schema of an MS Access database.
The plan so far:
- Scan the network drive for *.mdb files using File::Find.
- Create or update an ODBC DSN for each database file, using Win32::ODBC to manage the connection.
- Open a connection to the database using DBI and the DBD::ODBC driver.
- Discover the schema - This is where I need some information!
- Iterate over the tables, rows, columns. Write out to an XML file using XML::Writer.
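The first step of the plan could be sketched roughly as follows. This is only a minimal sketch using the core File::Find module; the helper name find_mdb_files is made up for illustration, and the roots to scan are assumed to be passed on the command line.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: collect every *.mdb file (matched case-insensitively)
# found under the given root directories.
sub find_mdb_files {
    my @roots = @_;
    my @found;
    find(
        {
            no_chdir => 1,    # $_ then holds the full path, same as $File::Find::name
            wanted   => sub {
                push @found, $File::Find::name if -f $_ && /\.mdb\z/i;
            },
        },
        @roots
    );
    my @sorted = sort @found;
    return @sorted;
}

# Print the matches when root directories are given on the command line.
if (@ARGV) {
    print "$_\n" for find_mdb_files(@ARGV);
}
```

Run as, for example, `perl find_mdb.pl Z:\` (the script name and drive letter are placeholders). On a large network drive it may be worth writing the list to a file so that the later steps can be restarted without rescanning.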
Any help on the 'Discover the Schema' bit would be most appreciated.
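One candidate for the schema step is DBI's own catalog methods, table_info and column_info, which DBD::ODBC passes through to the ODBC driver. What follows is an untested sketch under several assumptions: that the Access ODBC driver answers these catalog calls, that a DSN-less connection string naming the "Microsoft Access Driver (*.mdb)" driver is acceptable in place of a per-file DSN, and that the helper name discover_schema (made up here) is a reasonable place to put the logic.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper: return a hash of
#   table name => [ [ column name, type name ], ... ]
# for every user table visible through the given database handle.
sub discover_schema {
    my ($dbh) = @_;
    my %schema;

    # table_info(catalog, schema, table, type); 'TABLE' is meant to skip
    # system tables and views. Fetch all names first so that only one
    # statement handle is active at a time.
    my $tables = $dbh->table_info(undef, undef, '%', 'TABLE');
    my @names  = map { $_->{TABLE_NAME} } @{ $tables->fetchall_arrayref({}) };

    for my $name (@names) {
        # column_info(catalog, schema, table, column); '%' matches any column.
        my $cols = $dbh->column_info(undef, undef, $name, '%');
        $schema{$name} = [
            map { [ $_->{COLUMN_NAME}, $_->{TYPE_NAME} ] }
                @{ $cols->fetchall_arrayref({}) }
        ];
    }
    return %schema;
}

# Assumption: a DSN-less connection; the driver name must match the one
# actually installed on the machine.
if (@ARGV) {
    my $mdb = shift @ARGV;
    my $dbh = DBI->connect(
        "dbi:ODBC:Driver={Microsoft Access Driver (*.mdb)};DBQ=$mdb",
        '', '', { RaiseError => 1, PrintError => 0 },
    );
    my %schema = discover_schema($dbh);
    for my $table (sort keys %schema) {
        print "$table\n";
        print "  $_->[0] ($_->[1])\n" for @{ $schema{$table} };
    }
    $dbh->disconnect;
}
```

Note that the table name is passed to column_info as a search pattern, so a name containing `_` or `%` could match more than one table; whether that matters for real Access files is something I cannot confirm.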