in reply to Re: Re: Database Table to Hash of Hash of Array
in thread Database Table to Hash of Hash of Array
First a few small points:
1. Is it really necessary to call $sth->finish after a fetchall_arrayref?
All the records have already been fetched by that point, so there is nothing to "finish". (I learned that at my expense. ;-)
2. A "comparison of every piece of data in the database" sounds like a job for a database engine. Exporting everything into memory would deprive you of the powerful querying system of your server.
IMO, if you want to compare data, you are better off within the database engine itself (which you can wonderfully drive around with Perl/DBI of course).
Comparing hashes of arrays is not an easy task, while comparing arrays of hashes ( = database tables) is everyday's task for any DB engine.
3. If what you want to achieve is comparing data across databases on different hosts, it would be enough to read one table at a time (hence, an array of hashes), provided that you have enough memory.
If memory is an issue, here is the skeleton of a script that I use to compare my production DB with my development DB (residing on different hosts):
This simple code just reports which tables are present in database 1 but missing in database 2, and which ones have a different number of records.

#!/usr/bin/perl -w
use strict;
use DBI;

# just configuration -- change according to your needs,
# or use a config file
my $DBD             = "mysql";             # change to your DBD driver
my $local_database  = "monks";
my $local_host      = "IP_development_DB";
my $remote_database = "monks";
my $remote_host     = "IP_production_DB";
my $local_password  = "locsecret";
my $remote_password = "remsecret";
my $local_username  = "locuser";
my $remote_username = "remuser";
my $local_DSN  = "DBI:$DBD:$local_database;host=$local_host";
my $remote_DSN = "DBI:$DBD:$remote_database;host=$remote_host";
# end configuration

my $local_dbh  = DBI->connect($local_DSN,  $local_username,
                              $local_password,  {RaiseError => 1});
my $remote_dbh = DBI->connect($remote_DSN, $remote_username,
                              $remote_password, {RaiseError => 1});

my @local_tables  = $local_dbh->tables();
my @remote_tables = $remote_dbh->tables();

foreach my $table (@local_tables) {
    if (grep {$table eq $_} @remote_tables) {
        my $query = qq{SELECT COUNT(*) FROM $table};
        my $local_sth  = $local_dbh->prepare($query);
        my $remote_sth = $remote_dbh->prepare($query);
        $local_sth->execute();
        $remote_sth->execute();
        my ($local_result)  = $local_sth->fetchrow_array();
        my ($remote_result) = $remote_sth->fetchrow_array();
        if ($local_result != $remote_result) {
            print "$table: \tL ($local_result)\t",
                  "R ($remote_result)\n";
        }
    }
    else {
        print "missing $table\n";
    }
}

$local_dbh->disconnect();
$remote_dbh->disconnect();
You can change the portion between my $query = qq{...} and if ($local_result ...) to suit your purposes.
Notice that when the comparison by COUNTing gives back the same number, you still need to compare further on specific fields. You either know these in advance (if, for instance, all your tables have a "telling" field such as a CRC), or you have to get the field list from the engine and carry out a more detailed comparison.
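As a sketch of that fallback, here is a minimal, hypothetical helper that compares two column lists (which you might fetch with DBI's column_info, or with a driver-specific query such as SHOW COLUMNS on MySQL). The name column_diff and the sample column names are my own invention:

```perl
use strict;
use warnings;

# Compare two lists of column names and report what each side lacks.
# The lists themselves would come from the engine, e.g. via
# $dbh->column_info or a driver-specific query.
sub column_diff {
    my ($local_cols, $remote_cols) = @_;
    my %local  = map { $_ => 1 } @$local_cols;
    my %remote = map { $_ => 1 } @$remote_cols;
    return {
        missing_remote => [ grep { !$remote{$_} } @$local_cols  ],
        missing_local  => [ grep { !$local{$_}  } @$remote_cols ],
    };
}
```

Only the columns present on both sides are then worth comparing value by value.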
A workable idea is to identify sensible grouping fields and then compare queries with a GROUP BY clause, in order to avoid unnecessary record-by-record comparison.
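Assuming you have already fetched each side's GROUP BY result into a plain hash of group value => COUNT (for example from SELECT some_field, COUNT(*) ... GROUP BY some_field), spotting the groups worth drilling into takes only a few lines of Perl. This is just a sketch; the function name and data are illustrative:

```perl
use strict;
use warnings;

# Given two hashrefs of group_value => row_count (one per database),
# return the group values whose counts differ -- only those groups
# then need a record-level comparison.
sub groups_differ {
    my ($local, $remote) = @_;
    my %seen = map { $_ => 1 } keys %$local, keys %$remote;
    return grep {
        ($local->{$_} // 0) != ($remote->{$_} // 0)
    } sort keys %seen;
}
```

A group missing entirely on one side counts as differing, since its absent count is treated as zero.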
The bottom line is that you should not pull raw data out of the database: let the engine organize it first, and then process the results with Perl.
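To illustrate point 2 above (letting the engine do the heavy lifting): once a table has been flagged as suspect, you can ask the server directly for the rows one side lacks, instead of exporting both copies into Perl structures. A minimal sketch, assuming both tables are reachable from one connection and that you know the key columns (the function name and arguments are hypothetical):

```perl
use strict;
use warnings;

# Build a query returning rows of $table_a that have no match in
# $table_b on the given key columns (a one-way difference).
sub diff_query {
    my ($table_a, $table_b, @keys) = @_;
    my $on = join ' AND ', map { "a.$_ = b.$_" } @keys;
    return "SELECT a.* FROM $table_a a "
         . "LEFT JOIN $table_b b ON $on "
         . "WHERE b.$keys[0] IS NULL";
}
```

You would run the result through the same prepare/execute steps as in the skeleton above, and swap the two table arguments to get the difference in the other direction.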
In Section
Cool Uses for Perl