Beefy Boxes and Bandwidth Generously Provided by pair Networks Bob
Don't ask to ask, just ask
 
PerlMonks  

Re: Perl DBI adding some, not all, records to MySQL database

by Moron (Curate)
on Jul 06, 2007 at 10:51 UTC ( #625238=note: print w/ replies, xml ) Need Help??


in reply to Perl DBI adding some, not all, records to MySQL database

Nevertheless, if the ASCII data MAY contain duplicates, you have to do something about it. Ny general approach in designing a script for this type of requirement is rather different:

1) keeps sorted backup of the flat input files used per run

2) use the unix comm utility (twice per file) to create a forward and backward difference extracts based on the latest ASCII file and the previous one

3) compare with the previous run and execute batched deletion SQL from the backward differences

4) and then for the insert case, take advantage of MySql's bulk loader, using Perl to translate the flat format only for the "insert" case.

This means that updates are translated into a delete plus an insert per case, but overall in this context that performs better anyway. Unchanged records in the flat file are filtered out by comm. This package of measures overall performs many times faster than the more "obvious" per line design and was proved to be the most reliable.

Update: this is assuming there are no manual changes made in the meantime, in which case you have to back-generate an updated version of the "previous" (now simulated) ASCII file from your own data rather than use the one from the previous run.

__________________________________________________________________________________

^M Free your mind!


Comment on Re: Perl DBI adding some, not all, records to MySQL database

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://625238]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (9)
As of 2014-04-19 19:05 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (483 votes), past polls