http://www.perlmonks.org?node_id=625238


in reply to Perl DBI adding some, not all, records to MySQL database

Nevertheless, if the ASCII data MAY contain duplicates, you have to do something about it. Ny general approach in designing a script for this type of requirement is rather different:

1) keeps sorted backup of the flat input files used per run

2) use the unix comm utility (twice per file) to create a forward and backward difference extracts based on the latest ASCII file and the previous one

3) compare with the previous run and execute batched deletion SQL from the backward differences

4) and then for the insert case, take advantage of MySql's bulk loader, using Perl to translate the flat format only for the "insert" case.

This means that updates are translated into a delete plus an insert per case, but overall in this context that performs better anyway. Unchanged records in the flat file are filtered out by comm. This package of measures overall performs many times faster than the more "obvious" per line design and was proved to be the most reliable.

Update: this is assuming there are no manual changes made in the meantime, in which case you have to back-generate an updated version of the "previous" (now simulated) ASCII file from your own data rather than use the one from the previous run.

__________________________________________________________________________________

^M Free your mind!

  • Comment on Re: Perl DBI adding some, not all, records to MySQL database