PerlMonks  

Merging multiple files

by bluray (Sexton)
on Jan 12, 2012 at 23:03 UTC

bluray has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have four files to merge. There is no overlap between the IDs (4th column) in each of these files. I can utilize 'cat', but the header will be repeated. Any help would be appreciated.

data
SP   GC  DC   ID    Descr   GI    Locus  LF
cat  Ts  All  5320  CGA1    1892  L2     3.1
dog  Ts  Sp   6420  beta 1  1849  L3     4.2
------------------------------------
The format of the data is the same in the four files.

I am trying to write code for this, and I was not sure whether I need a hash for it. The code I have written so far is pasted below; it is not complete.

#!/usr/bin/perl
use strict;
use warnings;

my @infiles = ("file1.csv", "file2.csv", "file3.csv", "file4.csv");

open(my $out_fh, ">", "file1_4.csv") or die "Can't open file1_4.csv: $!";

my $header_printed = 0;
for my $infile (@infiles) {
    open(my $in_fh, "<", $infile) or die "Can't open $infile: $!";
    while (defined(my $line = <$in_fh>)) {
        chomp $line;
        $line =~ s/\t/,/g;    # turn the tab-separated columns into comma-separated ones
        if ($. == 1) {
            # The first line of every file is the header; print it only once.
            next if $header_printed++;
        }
        print $out_fh "$line\n";
    }
    close $in_fh;
}
close $out_fh or die "Can't close file1_4.csv: $!";

Replies are listed 'Best First'.
Re: Merging multiple files
by Eliya (Vicar) on Jan 12, 2012 at 23:28 UTC

    This concatenates the files without the extra headers:

    $ perl -ne 'print unless $skip; $skip=eof' file*.csv >out.csv

    (see eof)
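
For anyone who would rather have this as a script, the one-liner can be spelled out roughly as follows. This is only a sketch of the same idea, not Eliya's code: the sub name merge_files is made up for illustration, and it assumes every file starts with exactly one header line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (the name is not from the thread): copy every line of
# the given files to $out_fh, dropping the first line of each file but the first.
sub merge_files {
    my ($out_fh, @files) = @_;
    my $skip = 0;    # true when the next line read is a repeated header
    for my $file (@files) {
        open(my $in_fh, "<", $file) or die "Can't open $file: $!";
        while (my $line = <$in_fh>) {
            print {$out_fh} $line unless $skip;
            # eof is true once the last line of this file has been read,
            # so the next line read (the next file's header) is skipped.
            $skip = eof($in_fh);
        }
        close $in_fh;
    }
    return;
}

# Usage (assumed script name): perl merge.pl file*.csv > out.csv
merge_files(\*STDOUT, @ARGV) if @ARGV;
```

The header is recognised purely by position here, so nothing in the script depends on what the header line contains.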

Re: Merging multiple files
by umasuresh (Hermit) on Jan 12, 2012 at 23:19 UTC
    Maybe it is easier to just do this:
    # create header first
    grep -w SP file1.csv > merged.csv
    # grep out the header from each file
    for i in {1..4}; do grep -v SP file${i}.csv >> merged.csv; done
    A perlish hint after opening each file:
    my @f1 = <F1>;
    my @f2 = <F2>;
    ...
    my @all;
    push @all, @f1[1..$#f1], @f2[1..$#f2], @f3[1..$#f3], @f4[1..$#f4];
    # header (the line still ends with its own newline, so none is added)
    print $f1[0];
    # lines
    print @all;
    UPDATE: corrected the array slices from @f1(1..$#f1) to @f1[1..$#f1]. Tiredness!
Re: Merging multiple files
by roboticus (Chancellor) on Jan 12, 2012 at 23:20 UTC

    bluray:

    Perhaps something like:

    perl -pe 'next if $.>1 and /^SP   GC DC  ID  Descr   GI Locus LF/' file1.csv file2.csv file3.csv file4.csv

    Or perhaps not...

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      perl -pe 'next if ...

      The print done by the -p option cannot actually be skipped with next, as it's in a continue block, and therefore executes anyway.

      But you could of course use -n instead and say 'print unless $.>1 and ...'
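
      Written out as a script, the -n variant could look something like the sketch below. The sub name merge_by_header_match and the pattern qr/^SP\t/ are assumptions for illustration; the pattern presumes the files are tab-separated and that no data row starts with "SP" followed by a tab. The counter plays the role of $., which under -n keeps counting across files rather than restarting for each one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (the name is not from the thread): return the lines of
# all files, keeping a line that matches the header pattern only when it is
# the very first line read.
sub merge_by_header_match {
    my ($header_re, @files) = @_;
    my @kept;
    my $line_no = 0;    # like $. under -n: keeps counting across files
    for my $file (@files) {
        open(my $in_fh, "<", $file) or die "Can't open $file: $!";
        while (my $line = <$in_fh>) {
            $line_no++;
            next if $line_no > 1 and $line =~ $header_re;
            push @kept, $line;
        }
        close $in_fh;
    }
    return @kept;
}

# Usage (assumed script name): perl merge.pl file1.csv file2.csv > out.csv
print merge_by_header_match(qr/^SP\t/, @ARGV) if @ARGV;
```

      Unlike skipping by position, this version identifies the header by its content, so it also copes with a file that happens to have no header line.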
