PerlMonks  

Using perl to extract data from multiple .txt files

by lalalala1 (Initiate)
on Nov 14, 2012 at 15:24 UTC
lalalala1 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am new to perl, but eager to learn.

I want to extract data from a database of about 500 .txt files stored in one directory. All files contain data with the same structure. From every .txt file, I want to extract the same information and store it in a new .txt file, so that every line contains the data from one file. My goal is to create a .txt file that I can use for data analysis.

For now I have code that extracts the data from one .txt file. What approach would you suggest for running the same code on all of the .txt files, so that the necessary data ends up in one .txt file?

Thank you for your responses!

Re: Using perl to extract data from multiple .txt files
by choroba (Abbot) on Nov 14, 2012 at 15:27 UTC
    open my $OUT, '>', 'output.txt' or die "output: $!";
    for my $file (glob 'dir/*.txt') {
        # Your old code here. Remove opening the output file.
    }
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Careful with this one to make sure that you (the OP, not choroba) do not process the text file opened above.

      --MidLifeXis

        True. I added a path to the code (I had planned to, but forgot).
        لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Using perl to extract data from multiple .txt files
by rjt (Deacon) on Nov 14, 2012 at 15:55 UTC

    Here's a short working example you can go from:

    for (<*.txt>) {
        open my $fh, '<', $_ or die "Can't open $_: $!";
        my $record = join(', ', map { chomp; $_ } (<$fh>));    # <== HERE
        print "$_: $record\n";
        close $fh;
    }

    Just replace the line marked <== HERE with your code. This writes to STDOUT, so you can redirect your output wherever you like, or just add an open and select to automate it if you'll always be writing to the same location.

    Take care that the <*.txt> glob does not match your output file.
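
The "open and select" idea mentioned above could look roughly like the sketch below. The helper name with_output_to and the file name summary.out are made up for illustration and do not appear in the thread; the output file deliberately does not end in .txt, so the <*.txt> glob cannot pick it up.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): run $code while plain
# print statements go to the file at $path instead of STDOUT.
sub with_output_to {
    my ($path, $code) = @_;
    open my $out, '>', $path or die "Can't write $path: $!";
    my $previous = select $out;    # make $out the default handle for print
    $code->();
    select $previous;              # restore the previous default handle
    close $out or die "Can't close $path: $!";
    return;
}

# Usage: list every .txt file in the current directory into summary.out.
with_output_to('summary.out', sub {
    for my $file (<*.txt>) {
        print "$file\n";           # replace with the real per-file work
    }
});
```

Note that if the callback dies, this sketch does not restore the previous handle; for a short one-off script that is usually acceptable.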

Re: Using perl to extract data from multiple .txt files
by blue_cowdawg (Monsignor) on Nov 14, 2012 at 16:26 UTC

    use strict;
    my $dir = "/path/to/directory/with/files";
    opendir(INDIR, $dir) or die $!;
    while (my $fname = readdir(INDIR)) {
        next unless $fname =~ m@\.txt$@;
        # readdir returns bare file names, so prepend the directory
        open FIN, '<', "$dir/$fname" or die "$fname: $!";
        my @stuff = <FIN>;
        close FIN;
        # process lines....
    }
    closedir(INDIR);
    There's another way of doing it.


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
Re: Using perl to extract data from multiple .txt files
by Kenosis (Priest) on Nov 14, 2012 at 21:22 UTC

    Hi, lalalala1, and welcome to PerlMonks!

    Excellent suggestions above. The following incorporates almost all of them, and avoids a collision with the output file by grepping the glob (sounds like a horror movie title...).

    It opens each text file for the data you need, and that data is pushed onto an array that's written out to the output file:

    use strict;
    use warnings;

    my $outFile = 'extractedData.txt';
    my @sampleLines;

    for my $file ( grep !/^\Q$outFile\E$/, <*.txt> ) {
        open my $fh, '<', $file or die $!;
        my $data;    # Get data from file and assign it to $data...
        push @sampleLines, $data;
        close $fh;
    }

    open my $outFH, '>', $outFile or die $!;
    print $outFH @sampleLines;
    close $outFH;

    I know that this is more than you requested, but I hope it's helpful.

      Thank you guys, I'll give your suggestions a try and let you know how it worked out.
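
Putting the suggestions in this thread together, a minimal sketch of the "one line per input file" goal might look like the following. The original extraction code is never shown in the thread, so extract_record here is a hypothetical stand-in that simply returns the first line of each file; the sub names, the data directory, and the extracted.tsv output name are likewise assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the OP's extraction code, which the thread
# does not show: it returns the first line of the file, without newline.
sub extract_record {
    my ($fh) = @_;
    my $first = <$fh>;
    return '' unless defined $first;
    chomp $first;
    return $first;
}

# Write one tab-separated line per .txt file in $dir to $out_file.
# Assumes $dir contains no whitespace or glob metacharacters.
sub collect_records {
    my ($dir, $out_file) = @_;
    open my $out, '>', $out_file or die "Can't write $out_file: $!";
    for my $file (sort glob "$dir/*.txt") {
        next if $file eq $out_file;    # never read our own output file
        open my $fh, '<', $file or die "Can't open $file: $!";
        print {$out} "$file\t", extract_record($fh), "\n";
        close $fh;
    }
    close $out or die "Can't close $out_file: $!";
    return;
}

# Usage, assuming the input files live in a directory named "data".
collect_records('data', 'extracted.tsv') if -d 'data';
```

Writing each record straight to the output handle avoids holding all 500 results in memory, although at this scale collecting them in an array first, as in the reply above, works just as well.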
