Compare 2 CSV files and create a new CSV file out of comparison

slayedbylucifer has asked for the wisdom of the Perl Monks concerning the following question:

I have 2 .csv files as below:

# cat vm.csv 
vm_name,vm_cluster
vm1,fd1
vm2,fd2
vm3,fd3
vm4,fd4

and

# cat vfiler_fd.csv 
vm_name,vFiler_IP,vFiler_cluster
vm3,1.1.1.3,fd2
vm4,1.1.1.4,fd1
vm1,1.1.1.1,fd4
vm2,1.1.1.2,fd3

Goal:

Compare the first field (vm_name) of the two files and, where they match, write a row to a third CSV file with the column headings shown below. Each row of this third CSV should contain all the information about a given VM from both CSV files. Basically, I am trying to merge the two files on the first field.

vm_name,vm_cluster,vFiler_IP,vFiler_cluster
vm1,fd1,1.1.1.1,fd4
vm2,fd2,1.1.1.2,fd3
.
.
....etc.

Here is my code, which is not correct, but it should give you an idea of what I am trying to do:

use strict;
use Text::CSV;
use Class::CSV;

my $report_csv = Class::CSV->new(
    filename => "final.csv",
    fields   => [qw/VM_Name VM_Cluster vFiler_IP vFiler_Cluster MisMatch/],
);

my $vm_mapping     = Text::CSV->new({ binary => 1 });
my $vfiler_mapping = Text::CSV->new({ binary => 1 });

open my $vm,     "<:encoding(utf8)", "vm.csv";
open my $vfiler, "<:encoding(utf8)", "vfiler_fd.csv";

while (my $vm_row = $vm_mapping->getline($vm))
{
    while (my $vfiler_row = $vfiler_mapping->getline($vfiler))
    {
        if ($vm_row->[0] eq $vfiler_row->[0])
        {
            $report_csv->add_line({
                VM_Name        => $vm_row->[0],
                VM_Cluster     => $vm_row->[1],
                vFiler_IP      => $vfiler_row->[1],
                vFiler_Cluster => $vfiler_row->[2],
                MisMatch       => "huhuhhuhuhu",
            });
        }
        next;
    }
    next;
}

The logic I used:

Retrieve the first element of each row of vm.csv and compare it with the first element of every row of vfiler_fd.csv. If there is a match, write a new line containing the respective elements to a new file called final.csv.

Issues:

This is not working.

I think nested while loops are a horrible practice; in any case, this is not working either.

I could not find a way to write column names with Text::CSV, hence I am using Class::CSV, and it still does not work.

I am pretty sure there is a better approach to this. Could you help? Thanks.

Re: Compare 2 CSV files and create a new CSV file out of comparison by moritz (Cardinal) on Jul 02, 2012 at 09:03 UTC
The proper way to do it is to read the first file and use a hash to store its contents. Use the vm_name column as the key and the whole row as the value. Then read the second file line by line, look up each vm_name in the hash you created earlier, and print the joined columns to the third file.
Perl 6 - the future is here, just unevenly distributed
Re^2: Compare 2 CSV files and create a new CSV file out of comparison by slayedbylucifer (Scribe) on Jul 02, 2012 at 09:23 UTC
Hi. Could you give me a sample code structure? Sorry if I am asking for too much, but I didn't quite understand the "Use the vm_name column as key and the whole row as a value" part. Thanks again for your time.
Re^3: Compare 2 CSV files and create a new CSV file out of comparison by moritz (Cardinal) on Jul 02, 2012 at 10:31 UTC
See perlintro or perldata for a description of hashes. "Key" and "value" are terms used together with hashes. If you search for "join files" or similar keywords, you'll find quite a bit of example code, for example "join mutiple files on first column" and "join - join two files according to a common key".
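A minimal sketch of the hash-based join described in this sub-thread may help make "key" and "value" concrete. It is an illustration under stated assumptions, not code from any of the replies: merge_csv is a hypothetical helper name, the sample data is copied from the question, and in-memory filehandles stand in for vm.csv, vfiler_fd.csv and final.csv so the sketch runs on its own.

```perl
use strict;
use warnings;
use Text::CSV;

# Join two CSV streams on their first column and write the merged rows.
# merge_csv() is a hypothetical helper, not part of Text::CSV.
sub merge_csv {
    my ($vm_fh, $vfiler_fh, $out_fh) = @_;
    my $reader = Text::CSV->new({ binary => 1, auto_diag => 1 });
    my $writer = Text::CSV->new({ binary => 1, auto_diag => 1, eol => "\n" });

    # Pass 1: index the first file. Key = vm_name, value = the whole row.
    my $vm_header = $reader->getline($vm_fh);
    my %vm_row_for;
    while (my $row = $reader->getline($vm_fh)) {
        $vm_row_for{ $row->[0] } = $row;
    }

    # The output header is the first header plus the second file's
    # columns, minus its repeated vm_name column.
    my $vfiler_header = $reader->getline($vfiler_fh);
    my (undef, @extra_names) = @$vfiler_header;
    $writer->print($out_fh, [ @$vm_header, @extra_names ]);

    # Pass 2: stream the second file and look each vm_name up in the hash.
    while (my $row = $reader->getline($vfiler_fh)) {
        my ($name, @extra) = @$row;
        my $vm_row = $vm_row_for{$name} or next;    # skip names with no match
        $writer->print($out_fh, [ @$vm_row, @extra ]);
    }
    return;
}

# Sample data from the question, held in strings instead of files.
my $vm_data = <<'END';
vm_name,vm_cluster
vm1,fd1
vm2,fd2
vm3,fd3
vm4,fd4
END

my $vfiler_data = <<'END';
vm_name,vFiler_IP,vFiler_cluster
vm3,1.1.1.3,fd2
vm4,1.1.1.4,fd1
vm1,1.1.1.1,fd4
vm2,1.1.1.2,fd3
END

my $merged = '';
open my $vm_fh,     '<', \$vm_data     or die "Cannot read vm data: $!";
open my $vfiler_fh, '<', \$vfiler_data or die "Cannot read vfiler data: $!";
open my $out_fh,    '>', \$merged      or die "Cannot open output buffer: $!";

merge_csv($vm_fh, $vfiler_fh, $out_fh);
close $out_fh;

print $merged;
```

With real files, the three open calls would name vm.csv, vfiler_fd.csv and final.csv instead of the string references. Each file is read exactly once, which avoids the nested loops from the question, and the merged rows come out in the order of the second file.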
Re: Compare 2 CSV files and create a new CSV file out of comparison by zentara (Archbishop) on Jul 02, 2012 at 09:07 UTC
See "Compare 2 csv files using a key set of colums".
I'm not really a human, but I play one on earth.
Re: Compare 2 CSV files and create a new CSV file out of comparison by CountZero (Bishop) on Jul 02, 2012 at 19:51 UTC
Your two files are actually two tables in a relational database system. So use the power of DBI and SQL.

use Modern::Perl;
use Data::Dump qw/dump/;
use DBI;

my $dbh = DBI->connect(
    "dbi:CSV:", "", "",
    {   f_schema => 'test',
        f_dir    => './test',
        f_ext    => '.csv',
    }
) or die "Could not open database: $DBI::errstr";
$dbh->{RaiseError} = 1;

my $query = 'SELECT file1.vm_name, vm_cluster, vFiler_IP, vFiler_cluster FROM file1 join file2 WHERE file1.vm_name = file2.vm_name';
my $result = $dbh->selectall_arrayref($query);
print dump($result);

Output:

[
  ["vm1", "fd1", "1.1.1.1", "fd4"],
  ["vm2", "fd2", "1.1.1.2", "fd3"],
  ["vm3", "fd3", "1.1.1.3", "fd2"],
  ["vm4", "fd4", "1.1.1.4", "fd1"],
]

CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
My blog: Imperial Deltronics
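The DBI example prints the joined rows with dump, while the original question asked for a third CSV file with a header line. One possible way to close that gap, sketched here as an assumption rather than as part of CountZero's reply, is to pass the rows to Text::CSV's print method. The $result literal below is copied from the output shown in the reply so the sketch runs without DBD::CSV; the header names and the final.csv file name come from the question.

```perl
use strict;
use warnings;
use Text::CSV;

# Rows in the shape returned by selectall_arrayref: an array of array refs.
# Copied from the dumped output so that no database handle is needed here.
my $result = [
    [ "vm1", "fd1", "1.1.1.1", "fd4" ],
    [ "vm2", "fd2", "1.1.1.2", "fd3" ],
    [ "vm3", "fd3", "1.1.1.3", "fd2" ],
    [ "vm4", "fd4", "1.1.1.4", "fd1" ],
];

# eol => "\n" makes print() terminate each row with a newline.
my $csv = Text::CSV->new({ binary => 1, auto_diag => 1, eol => "\n" });

open my $fh, '>:encoding(utf8)', 'final.csv'
    or die "Cannot write final.csv: $!";

# The column names are just another row: print them first.
$csv->print($fh, [qw/vm_name vm_cluster vFiler_IP vFiler_cluster/]);
$csv->print($fh, $_) for @$result;

close $fh or die "Cannot close final.csv: $!";
```

Because the header is written as an ordinary row, Text::CSV alone is enough for the output file and Class::CSV is not needed.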
Re^2: Compare 2 CSV files and create a new CSV file out of comparison by slayedbylucifer (Scribe) on Jul 03, 2012 at 10:55 UTC
CountZero, this is exactly what I was looking for. When I first wrote my code with nested "while" loops, I knew it was not the right way to do it and that there must be an easier way. Thank you for the DBI solution.
Re^2: Compare 2 CSV files and create a new CSV file out of comparison by Anonymous Monk on Jan 19, 2014 at 21:48 UTC
The example above does not fully work as posted. This is my improved code (I hope it will help somebody in the future):

use Data::Dump qw/dump/;
use DBI;

my $dbh = DBI->connect(
    "dbi:CSV:", "", "",
    {   #f_schema => 'test',
        f_dir => '.',
        f_ext => '.csv',
    }
) or die "Could not open database: $DBI::errstr";
$dbh->{RaiseError} = 1;

# Map the table names used in the query to the actual files.
$dbh->{csv_tables}->{file1} = { file => 'vm.csv', };
$dbh->{csv_tables}->{file2} = { file => 'vfiler_fd.csv', };

my $query = "SELECT file1.vm_name, vm_cluster, vFiler_IP, vFiler_cluster FROM file1 join file2 WHERE file1.vm_name = file2.vm_name";
my $result = $dbh->selectall_arrayref($query);
print dump($result);
Re^4: Compare 2 CSV files and create a new CSV file out of comparison by Tux (Canon) on Jan 20, 2014 at 07:13 UTC
