
Using perl to extract data from multiple .txt files

by lalalala1 (Initiate)
on Nov 14, 2012 at 15:24 UTC (perlquestion #1003832)
lalalala1 has asked for the wisdom of the Perl Monks concerning the following question:


I am new to Perl, but eager to learn.

I want to extract data from a database of about 500 .txt files stored in one directory. All files have the same structure. I want the same information from every .txt file, stored in a new .txt file so that each line contains the data from one file. My goal is a single .txt file that I can use for data analysis.

For now I have code that extracts the data from one .txt file. What approach would you suggest for running the same code on all of the .txt files, so that the necessary data ends up in one .txt file?

Thank you for your responses!


Replies are listed 'Best First'.
Re: Using perl to extract data from multiple .txt files
by choroba (Bishop) on Nov 14, 2012 at 15:27 UTC
    open my $OUT, '>', 'output.txt' or die "output: $!";
    for my $file (glob 'dir/*.txt') {
        # Your old code here. Remove opening the output file.
    }
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Careful with this one to make sure that you (the OP, not choroba) do not process the text file opened above.


        True. I added a path to the code (I had planned to, but forgot).
        لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Using perl to extract data from multiple .txt files
by rjt (Deacon) on Nov 14, 2012 at 15:55 UTC

    Here's a short working example you can go from:

    for (<*.txt>) {
        open my $fh, '<', $_ or die "Can't open $_: $!";
        my $record = join(', ', map { chomp; $_ } (<$fh>)); # <== HERE
        print "$_: $record\n";
        close $fh;
    }

    Just replace the line marked <== HERE with your code. This writes to STDOUT, so you can redirect the output wherever you like, or add an open and a select to automate it if you'll always be writing to the same location.

    Take care that the <*.txt> glob does not match your output file.
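
The open-and-select idea mentioned above could look roughly like the following. This is only a sketch: the helper name `with_output_to` and the script name `collect.pl` are hypothetical, and the output file name is passed on the command line on the assumption that it does not end in .txt, so the glob cannot pick it up.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code with the default output handle
# pointed at $path, then restore the previous handle.
sub with_output_to {
    my ($path, $code) = @_;
    open my $out, '>', $path or die "Can't write $path: $!";
    my $old = select $out;    # bare print now writes to $out
    $code->();
    select $old;              # restore the previous default handle
    close $out or die "Can't close $path: $!";
    return;
}

# Usage (hypothetical script name): perl collect.pl combined.out
# The output name is an assumption; it should not end in .txt,
# so that the glob below cannot match it.
if (my $target = shift @ARGV) {
    with_output_to($target, sub {
        for my $file (glob '*.txt') {
            open my $fh, '<', $file or die "Can't open $file: $!";
            chomp(my @lines = <$fh>);
            close $fh;
            print "$file: ", join(', ', @lines), "\n";
        }
    });
}
```

Because select only changes where a bare print goes, the loop body stays the same as in the STDOUT version.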

Re: Using perl to extract data from multiple .txt files
by blue_cowdawg (Monsignor) on Nov 14, 2012 at 16:26 UTC

    use strict;

    my $dir = "/path/to/directory/with/files";
    opendir(INDIR, $dir) or die $!;
    while (my $fname = readdir(INDIR)) {
        next unless $fname =~ m@\.txt$@;
        # readdir returns bare names, so prepend the directory
        open FIN, '<', "$dir/$fname" or die "$fname: $!";
        my @stuff = <FIN>;
        close FIN;
        # process lines....
    }
    closedir(INDIR);
    There's another way of doing it.

    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
Re: Using perl to extract data from multiple .txt files
by Kenosis (Priest) on Nov 14, 2012 at 21:22 UTC

    Hi, lalalala1, and welcome to PerlMonks!

    Excellent suggestions above. The following incorporates almost all of them, and avoids a collision with the output file by grepping the glob (sounds like a horror movie title...).

    It opens each text file for the data you need, and that data is pushed onto an array that's written out to the output file:

    use strict;
    use warnings;

    my $outFile = 'extractedData.txt';
    my @sampleLines;

    for my $file ( grep !/^\Q$outFile\E$/, <*.txt> ) {
        open my $fh, '<', $file or die $!;

        # Get data from file...
        # (your extraction code must declare and set $data here)

        push @sampleLines, $data;
        close $fh;
    }

    open my $outFH, '>', $outFile or die $!;
    print $outFH @sampleLines;
    close $outFH;

    I know that this is more than you requested, but I hope it's helpful.
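
To get one output line per input file, as the original question asks, the pieces above might be combined along these lines. Treat it as a sketch under assumptions: `extract_record` is a hypothetical stand-in for the real extraction code (here it just joins the non-blank lines of a file with tabs), `collect_records` and `collect.pl` are illustrative names, and the directory path is assumed to contain no spaces, since glob splits its pattern on whitespace.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the extraction step: keep the non-blank
# lines of one file and join them into a single tab-separated record.
sub extract_record {
    my ($file) = @_;
    open my $fh, '<', $file or die "Can't open $file: $!";
    chomp(my @lines = <$fh>);
    close $fh;
    return join "\t", grep { /\S/ } @lines;
}

# Write one record per input file to $out_file. The output file is
# skipped if it matches the glob; this assumes $out_file is spelled
# the same way the glob would return it (same directory prefix).
sub collect_records {
    my ($dir, $out_file) = @_;
    my @files = grep { $_ ne $out_file } glob "$dir/*.txt";
    open my $out, '>', $out_file or die "Can't write $out_file: $!";
    print {$out} extract_record($_), "\n" for @files;
    close $out or die "Can't close $out_file: $!";
    return scalar @files;
}

# Usage (hypothetical script name):
#   perl collect.pl /path/to/dir /path/to/dir/extractedData.txt
collect_records(@ARGV) if @ARGV == 2;
```

Appending the newline at print time keeps each file's data on its own line even after the extraction step has chomped its input.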

      Thank you guys, I'll give your suggestions a try and let you know how it worked out.
