PerlMonks  

Comparing DBI records

by parser (Acolyte)
on Feb 19, 2013 at 04:19 UTC [id://1019437]

parser has asked for the wisdom of the Perl Monks concerning the following question:

Cheers Monks. I am a tad reluctant to post this question as it is only partially a Perl question. However, I think the primary focus is upon the Perl portion of the equation.

I maintain a large MySQL database and receive monthly updates in DBF file format. Currently, I use the DBD::XBase and DBI modules. For XBase I read the update record (60+ fields/columns) using get_record_as_hash() and for MySQL I search for an existing entry and use fetchrow_hashref() to read the row. I then compare the columns using the column names as keys.
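For reference, the hash-based comparison described above can be sketched roughly as follows. This is a minimal reconstruction, not the poster's actual code: the column names, the `//` default for undef values, and the helper name `changed_columns` are all illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Given the DBF record (as from get_record_as_hash) and the MySQL row
# (as from fetchrow_hashref), return the names of columns that differ.
# Undef values are treated as empty strings to avoid warnings.
sub changed_columns {
    my ($dbf_row, $sql_row) = @_;
    my @changed;
    for my $col (sort keys %$dbf_row) {
        my $old = $sql_row->{$col} // '';
        my $new = $dbf_row->{$col} // '';
        push @changed, $col if $old ne $new;
    }
    return @changed;
}

my %dbf = (ID => 1, NAME => 'Alice', CITY => 'Boston');
my %sql = (ID => 1, NAME => 'Alice', CITY => 'Salem');
print join(',', changed_columns(\%dbf, \%sql)), "\n";   # CITY
```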

It works but is ugly, slow, and not very elegant. I am looking for something with a bit more panache with regard to comparing records. I've read gmax's DBI Recipes with interest but none of them appear at first blush to be good fits.

Thank you in advance!

Replies are listed 'Best First'.
Re: Comparing DBI records
by Tux (Canon) on Feb 19, 2013 at 07:01 UTC

    Fetching hashes is (much) slower than fetching arrays, but even faster than that is using bind_columns () so that the variables that will contain the data are not re-allocated on every fetch. FWIW, the DBI manual comes with an example of that.
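    A minimal sketch of the bind_columns () approach, following the pattern in the DBI documentation. The DSN, credentials, table, and column names are placeholders, not from the original post:

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:mysql:mydb', 'user', 'pass',
                       { RaiseError => 1 });
my $sth = $dbh->prepare('SELECT id, name, city FROM master');
$sth->execute;

# Bind once, before the fetch loop; the same scalars are refilled
# in place on every fetch, so nothing is re-allocated per row.
my ($id, $name, $city);
$sth->bind_columns(\$id, \$name, \$city);

while ($sth->fetch) {
    print "$id: $name ($city)\n";
}
```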

    Then in comparing, you will need to ask yourself how likely it is that records (in the sense of a set of fields or columns) will not match and where a mismatch is most likely to occur. If e.g. the mismatch is always somewhere in the first three fields of your 60 columns, have a look at List::Util's first and stop matching after the first mismatch, sparing the comparison of all the remaining fields. If the mismatch is expected in the last fields, you'd have to bake your own optimization, e.g. by creating an aliased list with all the fields reversed (so you can still use first); that way you would not have to call reverse on every record or resort to explicit loops.
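    The early-exit idea above might look like this. The column list and the two records are hypothetical stand-ins for the real 60-field rows:

```perl
use strict;
use warnings;
use List::Util qw(first);

my @cols = qw(ID NAME CITY STATE ZIP);
my %dbf = (ID => 1, NAME => 'Alice', CITY => 'Salem',  STATE => 'MA', ZIP => '01970');
my %sql = (ID => 1, NAME => 'Alice', CITY => 'Boston', STATE => 'MA', ZIP => '01970');

# first() returns the first column whose values differ, or undef if
# the records match; it stops comparing as soon as one is found.
my $mismatch = first { ($dbf{$_} // '') ne ($sql{$_} // '') } @cols;
print defined $mismatch ? "differs at $mismatch\n" : "records match\n";
```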


    Enjoy, Have FUN! H.Merijn
      Great advice Tux. Thank you. Unfortunately, any field in the entire record could change with the exception of the primary index. Worse, many of the fields could change.

      I AM intrigued by the List::Util module and will have a go at using it as a separate exercise.

Re: Comparing DBI records
by NetWallah (Canon) on Feb 19, 2013 at 04:32 UTC
    The Database's native search will always be faster than anything you build on top of it.

    Couldn't you just use a SELECT statement with the column names and values you are looking for ?

    Leave query optimization/performance issues with the DBA.
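    One way the SELECT suggestion above could be put into practice: match on the primary key, but only return the row if some other column differs, so the database itself decides whether an update is needed. Table and column names here are invented for illustration:

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:mysql:mydb', 'user', 'pass',
                       { RaiseError => 1 });

# The incoming record from the monthly DBF update (hypothetical).
my %incoming = (id => 42, name => 'Alice', city => 'Salem');

# Returns the existing row only when at least one non-key column
# differs from the incoming values -- i.e. the record has changed.
my $row = $dbh->selectrow_hashref(
    'SELECT * FROM master WHERE id = ? AND (name <> ? OR city <> ?)',
    undef, @incoming{qw(id name city)},
);
print "record needs updating\n" if $row;
```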

                 Most people believe that if it ain't broke, don't fix it.
            Engineers believe that if it ain't broke, it doesn't have enough features yet.

      I had a similar issue and solved it by using Array::Compare.

      I grab the data from both databases as arrayrefs, making sure the sequence of the fields is the same in both queries, and then I throw the arrayrefs to the comparator of Array::Compare.
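      A sketch of that approach, with the two arrayrefs written out literally where the real code would use fetchrow_arrayref on each handle:

```perl
use strict;
use warnings;
use Array::Compare;

my $dbf_row = [ 1, 'Alice', 'Salem'  ];   # e.g. from $dbf_sth->fetchrow_arrayref
my $sql_row = [ 1, 'Alice', 'Boston' ];   # e.g. from $sql_sth->fetchrow_arrayref

my $cmp = Array::Compare->new;
if ($cmp->compare($dbf_row, $sql_row)) {
    print "records match\n";
} else {
    # full_compare returns the indexes of the differing elements,
    # which maps back to column positions in the SELECT.
    my @diff = $cmp->full_compare($dbf_row, $sql_row);
    print "records differ at columns: @diff\n";
}
```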

      CountZero

      "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      My blog: Imperial Deltronics
      That would be my first choice. I am a Perl developer by choice and a SQL user by necessity. I spent 5 hours last night with my head buried in a SQL book (re)learning all about left and outer joins, etc.

      I have not yet figured out how to leverage the power of SQL to perform this trick.

Re: Comparing DBI records
by nikosv (Deacon) on Feb 19, 2013 at 09:11 UTC

    Why don't you bulk insert the DBF data into a MySQL table and do the comparison with SQL?

    It seems that you are doing equi joins; if so, you can use a HASH table structure, which is ideal for exact-key retrieval.

    You insert the DBF data into a table hashed on the keys and also create a HASHed secondary index on the keys of the MySQL table, which will be used for the join.
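    The load-then-join idea above could be driven from Perl along these lines; the staging and master table names, columns, and keys are all assumptions for the sake of the sketch:

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:mysql:mydb', 'user', 'pass',
                       { RaiseError => 1 });

# Assumes the DBF file has already been bulk-loaded into `staging`
# (e.g. with dbf2mysql). An equi join on the primary key then lets
# MySQL find the rows whose non-key columns changed.
my $changed = $dbh->selectall_arrayref(q{
    SELECT s.id
      FROM staging s
      JOIN master m ON m.id = s.id
     WHERE s.name <> m.name
        OR s.city <> m.city
}, { Slice => {} });

printf "%d record(s) need updating\n", scalar @$changed;
```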

      Great question. The short answer is that I am not SQL savvy enough to do so. I have used dbf2mysql to perform bulk imports to MySQL and that worked fairly well.

      The overarching goal is to update the MySQL database with the DBF files and to log the changes made to the master table. I will read up on equi joins and see if I cannot leverage the MySQL engine to do the heavy lifting.

      Cheers
Re: Comparing DBI records
by parser (Acolyte) on Feb 19, 2013 at 17:01 UTC
    Many thanks to everyone who replied! There is an abundance of distilled corporate wisdom in this monastery.

    For those who suggested placing the burden back upon the database - that is something I intend to do. I just need to figure out how to unleash the powers of SQL. Especially when the records being compared are being served by two different frameworks and databases.

    Thank you CountZero for the pointer to Array::Compare. That looks like a great near term solution. It may not help much with performance (with the possible exception of moving from hash to array) but should help with elegance until I earn my SQL merit badge.

    Cheers!
