http://www.perlmonks.org?node_id=198667

dcb0127 has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that opens HTML files, strips them, and places the text into .txt files. I want to combine all the .txt files that the first script produced into one file, and I'm having problems with the coding. I'm new at this, so please excuse the poor scripting. Thanks.
#!/usr/bin/perl -w
use strict;
use DirHandle;

usage() if @ARGV < 1;

my @files = ();
while ( my $dir = shift @ARGV ) {
    my $dh = new DirHandle $dir;
    if ( defined($dh) ) {
        while ( defined($_ = $dh->read) ) {
            if ( $_ =~ /(\.csv)$/ ) {
                $dir =~ s/\/$//;
                push @files, "$dir/$_";
            }
        }
    }
}

foreach my $file ( @files ) {
    open(INF,"$file") or dienice("Can't open data files");
    @data = <INF>;
    close(INF);
    open(OUTF,">>combine.csv");
    print "@data\n";
    close(OUTF);
}

Replies are listed 'Best First'.
Re: Combing Text Files into one file
by rbc (Curate) on Sep 17, 2002 at 22:10 UTC
    If you are on a *nix system do this ...
    $ cat file1 file2 file3 > file123

      And on Win32, copy src1+src2+src3 destination will do it.

      "One word of warning: if you meet a bunch of Perl programmers on the bus or something, don't look them in the eye. They've been known to try to convert the young into Perl monks." - Frank Willison
Re: Combing Text Files into one file
by Aristotle (Chancellor) on Sep 17, 2002 at 22:14 UTC

    The only error I can see is that

    print "@data\n"; should have been print OUTF "@data\n";

    The coding is actually good, btw - not very idiomatic, so a bit awkward, but you obviously understand what's going on behind the scenes. Nice effort.

    The only changes I would make: I wouldn't reopen combine.csv on every pass through the loop, and I wouldn't slurp the input file into an array.

    open(OUTF,">>combine.csv") or dienice("Can't open CSV file: $!");
    foreach my $file ( @files ) {
        open(INF,"$file") or dienice("Can't open data file: $!");
        print OUTF while <INF>;
        close(INF);    # close each input file inside the loop
    }
    close(OUTF);       # close the output once, after the loop

    Update: nevermind, the above proposals are better.. duh. :-)

    Makeshifts last the longest.
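
Putting the two suggestions from this reply together (open the output once, read each input line by line), a minimal sketch might look like the following. The lexical filehandles and three-argument open are not from the original post, and combine_files is a hypothetical helper name chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Append the contents of each input file to a single output file.
# combine_files() is a hypothetical helper name, not from the thread.
sub combine_files {
    my ($out_path, @files) = @_;

    # Open the output once, before the loop, in append mode.
    open(my $out, '>>', $out_path)
        or die "Can't open $out_path: $!";

    for my $file (@files) {
        open(my $in, '<', $file)
            or die "Can't open $file: $!";

        # Copy line by line instead of slurping the file into an array.
        while (my $line = <$in>) {
            print {$out} $line;
        }
        close($in);
    }

    close($out) or die "Can't close $out_path: $!";
    return;
}

# Example: perl combine.pl combine.csv a.csv b.csv
combine_files(@ARGV) if @ARGV >= 2;
```

One detail worth knowing if you keep the array version: "@data\n" joins the array elements with a space (the default value of $"), so every line after the first gains a leading space and an extra newline is added at the end. Printing the list unquoted, as in print OUTF @data, avoids that.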

Re: Combing Text Files into one file
by runrig (Abbot) on Sep 17, 2002 at 22:16 UTC
    As someone mentioned in the CB (update: and in posts above), you forgot 'OUTF' on your print statement. But here's another possibility:
    use File::Copy;
    ...
    open(OUTF, ">combine.csv") or die "Error opening combine.csv: $!";
    for my $file (@files) {
        copy($file, \*OUTF) or die "Error copying $file: $!";
    }
    close OUTF;
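
For anyone who wants to try the File::Copy approach in isolation, here is a self-contained sketch of the same idea. It swaps the bareword OUTF for a lexical filehandle, and concat_with_copy is a hypothetical wrapper name; both are assumptions for illustration rather than part of the reply above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(copy);

# Concatenate files by copying each one onto an already-open filehandle.
# concat_with_copy() is a hypothetical name used only for this sketch.
sub concat_with_copy {
    my ($out_path, @files) = @_;

    # '>' truncates the output, so rerunning does not duplicate data.
    open(my $out, '>', $out_path)
        or die "Error opening $out_path: $!";

    for my $file (@files) {
        # copy() accepts a filehandle as its destination and leaves it open.
        copy($file, $out) or die "Error copying $file: $!";
    }

    close($out) or die "Error closing $out_path: $!";
    return;
}

# Example: perl concat.pl combine.csv a.csv b.csv
concat_with_copy(@ARGV) if @ARGV >= 2;
```

Note the difference from the original script: it opens combine.csv with ">>" (append), while this version uses ">" and starts from an empty file on each run.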
Re: Combing Text Files into one file
by bronto (Priest) on Sep 18, 2002 at 08:30 UTC

    Nothing to add to the other monks' comments, just a few minor pieces of advice.

    1. You could move the line $dir =~ s/\/$//; just a few lines up and avoid repeating the substitution if more than one .csv file is found in the same $dir. E.g.:

    while ( my $dir = shift @ARGV ) {
        my $dh = new DirHandle $dir;
        if ( defined($dh) ) {
            $dir =~ s/\/$//; # Here, for example...
            while ( defined($_ = $dh->read) ) {
                if ( $_ =~ /(\.csv)$/ ) {
                    push @files, "$dir/$_";
                }
            }
        }
    }

    2. Also, you could slightly simplify your s/// by using another delimiter, for example: s|/$||. It's not a big saving with one slash, but it's worth changing the delimiter when your pattern contains many slashes.

    3. When you manipulate $_, you may want to localize it to avoid clashes with functions or modules that use it (like File::Find, for example). That isn't a problem in your simple program, of course.

    4. (but I guess you already know :-) if you want, you can leave out the $_ =~ from your pattern match.
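
The four points above can be sketched together in one small routine. This is only an illustration: find_csv is a hypothetical helper name, and copying $dir into $base before stripping the slash is a choice made here so the caller's arguments are left unmodified.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DirHandle;

# Return the paths of all .csv files directly inside the given directories.
# find_csv() is a hypothetical helper name used only for this sketch.
sub find_csv {
    my @dirs = @_;
    my @files;

    for my $dir (@dirs) {
        # Skip directories that cannot be opened.
        my $dh = DirHandle->new($dir) or next;

        # Points 1 and 2: strip the trailing slash once per directory,
        # using | as the delimiter so the slash needs no escaping.
        (my $base = $dir) =~ s|/$||;

        # Point 3: localize $_ so the caller's value is not clobbered.
        local $_;
        while ( defined($_ = $dh->read) ) {
            # Point 4: a bare pattern match tests $_ implicitly.
            push @files, "$base/$_" if /\.csv$/;
        }
    }
    return @files;
}

# Example: perl findcsv.pl dir1/ dir2
print "$_\n" for find_csv(@ARGV);
```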

    Just my 0.02 EUR

    Ciao!
    --bronto

    # Another Perl edition of a song:
    # The End, by The Beatles
    END {
      $you->take($love) eq $you->made($love) ;
    }