PerlMonks  

File Parsing

by shortyfw06 (Beadle)
on May 31, 2012 at 15:36 UTC (#973550=perlquestion)
shortyfw06 has asked for the wisdom of the Perl Monks concerning the following question:

Can someone please guide me through this file parsing example? I am trying to find the last occurrence of a particular line in a file and then skip three lines. Next, I want to build a hash starting at the fourth line and ending when a blank line is found. Here is an example of the file and my code so far. Thanks!

FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.238 0.282 -1.459
0.000 0.00 0.00 0.971 1.369 0.004
0.000 5.00 -45.00 0.475 0.142 -1.585
0.000 5.00 0.00 1.003 1.531 -0.274
0.000 10.00 -45.00 0.721 0.037 -1.623

FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.247 0.293 -1.514
0.000 0.00 0.00 1.008 1.422 0.004
0.000 5.00 -45.00 0.493 0.147 -1.645
0.000 5.00 0.00 1.042 1.589 -0.284

#!usr/bin/perl

use Tk;
use Cwd;
use strict;
use warnings;

#
###########################################################################################
# GUI Building
###########################################################################################
#
# Create Main Window
my $mw=new MainWindow;
my $filename;
my $string1;
my $string2;
my $line;
my $n;
my $test;
my $skip1;
my $skip2;
my $skip3;
my %temp;

my $ms_button = $mw->Button(-text=>"MS", -command=> \&BJSFM_MS)->pack();

MainLoop;

sub BJSFM_MS {
    $filename="BJSFM_out.prn";
    open(OUTPUT_FILE, "< $filename") or die "Can't find $filename!";
    $string1="FAILURE CRITERIA PER PLY";
    $string2="AUTOMATIC SEARCH FOR FAILURE";
    while ($line = <OUTPUT_FILE>) {
        if ($line =~ $string1) {
            $skip1 = <OUTPUT_FILE>;
            $skip2 = <OUTPUT_FILE>;
            $skip3 = <OUTPUT_FILE>;
            $n=1;
            do {
                $temp{$n} = <OUTPUT_FILE>;
                n++;
            } until ???????? Im not sure here?
        }
    }
    close(OUTPUT_FILE);
    print (%temp);
}

Re: File Parsing
by choroba (Abbot) on May 31, 2012 at 16:14 UTC
    This should work:

    You declare variables in a wider scope than needed. If you do not need the skipped lines, there is no reason to keep them in variables. Also, I do not see any benefit in using a hash if its keys are integers in the sequence 1, 2, 3, ...; I used an array instead. In your original do { ... } until, you would have needed something like until eof or $temp{$n-1} =~ /^$/.
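The points in this reply (narrow scoping, no variables for the skipped lines, an array instead of an integer-keyed hash) can be sketched roughly as follows. This is not choroba's posted code, only an illustration under stated assumptions: the helper name `last_block` is hypothetical, the sample text stands in for BJSFM_out.prn, and the layout (one blank line after the header, a blank line or end of file closing each block) is a guess at the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the data rows of the LAST block that follows $header.
# Assumption: exactly three lines sit between the header and the data,
# and a blank line (or end of file) closes the block.
sub last_block {
    my ($fh, $header) = @_;
    my @rows;
    while (my $line = <$fh>) {
        next unless $line =~ /\Q$header\E/;
        @rows = ();                           # a later match replaces an earlier one
        for (1 .. 3) {                        # read and discard three lines
            last unless defined(scalar <$fh>);
        }
        while (my $row = <$fh>) {
            last if $row =~ /^\s*$/;          # blank line ends the block
            chomp $row;
            push @rows, $row;
        }
    }
    return @rows;
}

# Hypothetical sample standing in for BJSFM_out.prn
my $sample = <<'END';
FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.238 0.282 -1.459
0.000 0.00 0.00 0.971 1.369 0.004

FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.247 0.293 -1.514
0.000 0.00 0.00 1.008 1.422 0.004
END

open my $fh, '<', \$sample or die "Can't open in-memory sample: $!";
my @rows = last_block($fh, 'FAILURE CRITERIA PER PLY');
close $fh;

print "$_\n" for @rows;
```

Because the array is emptied on every header match, only the rows of the final block survive the loop, which is one way to get "the last occurrence" without reading the file twice.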

      The example in the original post is one step toward my overall goal. Moving forward, I think I do want a hash; however, the hash value for key 1 should be @array. The @array is then cleared, and the next match creates a new @array, which becomes the hash value for key 2. This process is repeated until all matches have been found in the file for each element in @headers. I am really struggling with this one... I hope my explanation is clear. Here is my attempt at this.

      #!usr/bin/perl

      use Tk;
      use Cwd;
      use strict;
      use warnings;

      #
      ###########################################################################################
      # GUI Building
      ###########################################################################################
      #
      # Create Main Window
      my $mw=new MainWindow;
      my $filename;
      my $line;
      my $n;
      my @headers = ("LAMINATE PROPERTIES",
                     "LAMINATE STRESSES",
                     "LAMINATE STRAINS",
                     "CIRCUMFERENTIAL AND RADIAL STRESSES & STRAINS",
                     "DISPLACEMENTS",
                     "STRAINS PER PLY",
                     "STRESSES PER PLY",
                     "FAILURE CRITERIA PER PLY");
      my $inner;
      my @array;
      my %hash;

      my $ms_button = $mw->Button(-text=>"MS", -command=> \&BJSFM_MS)->pack();

      MainLoop;

      sub BJSFM_MS {
          $filename="BJSFM_out.prn";
          open(OUTPUT_FILE, "< $filename") or die "Can't find $filename!";
          $n=1;
          while ($line = <OUTPUT_FILE>) {
              if ($line =~ /$headers[$n-1]/) {
                  undef @array; #This clears the previous two instances of $string1
                  <OUTPUT_FILE> for 1..3;
                  while ($inner = <OUTPUT_FILE>) {
                      last if $inner =~ /^$/;
                      push @array,$inner;
                  }
                  $hash{$n}=@array;
                  $n++;
              }
          }
          close OUTPUT_FILE;
          print @array;
          print %hash;
      }
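One likely stumbling block in the attempt above is `$hash{$n}=@array;`: assigning an array to a scalar slot stores the element count, not the rows. A hash value has to be a reference to the array. The sketch below shows that idea only; it keys the hash by header text rather than by a running number, keeps the last block seen for each header, and uses a hypothetical helper name (`blocks_by_header`) and made-up sample data, so treat it as an illustration rather than a drop-in fix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# For every header found, store the rows of its block as an array reference.
# Assumptions: three lines to skip after a header; a blank line or EOF ends a block.
# If a header occurs more than once, the later block overwrites the earlier one.
sub blocks_by_header {
    my ($fh, @headers) = @_;
    my %blocks;
    while (my $line = <$fh>) {
        my ($header) = grep { index($line, $_) >= 0 } @headers;
        next unless defined $header;
        for (1 .. 3) {
            last unless defined(scalar <$fh>);
        }
        my @rows;                         # fresh array per match, nothing to undef
        while (my $row = <$fh>) {
            last if $row =~ /^\s*$/;
            chomp $row;
            push @rows, $row;
        }
        $blocks{$header} = \@rows;        # a reference, not the element count
    }
    return %blocks;
}

# Hypothetical sample; the STRESSES PER PLY row is invented for illustration.
my $sample = <<'END';
STRESSES PER PLY

DIST ANGLE PLY STRESSES
1 2 SHEAR
1.0 2.0 3.0

FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.238 0.282 -1.459

FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.247 0.293 -1.514
0.000 0.00 0.00 1.008 1.422 0.004
END

open my $fh, '<', \$sample or die "Can't open in-memory sample: $!";
my %blocks = blocks_by_header($fh, 'STRESSES PER PLY', 'FAILURE CRITERIA PER PLY');
close $fh;

for my $header (sort keys %blocks) {
    print "$header\n";
    print "  $_\n" for @{ $blocks{$header} };
}
```

Declaring `@rows` with `my` inside the loop gives each match its own array, so the stored references never point at the same data.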
Re: File Parsing
by thundergnat (Deacon) on May 31, 2012 at 16:48 UTC

    choroba's solution and commentary are excellent; I don't disagree with any of it. I offer this as an alternative solution. (I was working on it before I saw that one was posted.) Here I slurp the file in by paragraphs, then pick out the lines I want to save.

    #!usr/bin/perl
    use strict;
    use warnings;
    use Tk;
    use Cwd;
    use Data::Dumper;

    #
    ###########################################################################################
    # GUI Building
    ###########################################################################################
    #
    # Create Main Window
    my $mw = MainWindow->new;
    my $n = 0;
    my %temp;

    my $ms_button = $mw->Button( -text=>"MS", -command=> \&BJSFM_MS )->pack();

    MainLoop;

    sub BJSFM_MS {
        my $filename = 'BJSFM_out.prn';
        #open my $infile, '<', $filename or die "Can't open $filename! $!";
        my $infile = *DATA;
        local $/="FAILURE CRITERIA PER PLY\n";
        while (my $paragraph = <$infile>) {
            chomp $paragraph;
            $paragraph =~ s/^\s+\n|\s+$//g;
            my @lines = split "\n", $paragraph;
            next if @lines < 4;
            map {$temp{$n++} = $_ } @lines[3..$#lines];
        }
        close $infile;
        print Dumper \%temp;
    }

    __DATA__
    FAILURE CRITERIA PER PLY

    DIST ANGLE PLY FAILURE NUMBERS
    1 2 SHEAR
    0.000 0.00 -45.00 0.238 0.282 -1.459
    0.000 0.00 0.00 0.971 1.369 0.004
    0.000 5.00 -45.00 0.475 0.142 -1.585
    0.000 5.00 0.00 1.003 1.531 -0.274
    0.000 10.00 -45.00 0.721 0.037 -1.623

    FAILURE CRITERIA PER PLY

    DIST ANGLE PLY FAILURE NUMBERS
    1 2 SHEAR
    0.000 0.00 -45.00 0.247 0.293 -1.514
    0.000 0.00 0.00 1.008 1.422 0.004
    0.000 5.00 -45.00 0.493 0.147 -1.645
    0.000 5.00 0.00 1.042 1.589 -0.284
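The key move in this reply is setting the input record separator `$/` to the header text, so that each read returns one whole block instead of one line. A stripped-down sketch of that technique, without the Tk parts and keeping only the final record, might look like this. The helper name `last_record_rows` and the sample text are hypothetical, and the sketch assumes the header occurs at least once and that the last block runs to the end of the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read records delimited by the header line instead of by newlines,
# and return the data rows of the final record.
# Assumption: three lines follow the header before the data starts.
sub last_record_rows {
    my ($fh, $header) = @_;
    local $/ = "$header\n";          # each read now stops after the next header line
    my $last;
    while (my $record = <$fh>) {
        chomp $record;               # chomp strips the current value of $/
        $last = $record;
    }
    return () unless defined $last;
    my @lines = split /\n/, $last;
    splice @lines, 0, 3;             # drop the three lines after the header
    return grep { /\S/ } @lines;     # ignore any blank lines
}

# Hypothetical sample standing in for BJSFM_out.prn
my $sample = <<'END';
FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.238 0.282 -1.459

FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.247 0.293 -1.514
0.000 0.00 0.00 1.008 1.422 0.004
END

open my $fh, '<', \$sample or die "Can't open in-memory sample: $!";
my @rows = last_record_rows($fh, 'FAILURE CRITERIA PER PLY');
close $fh;

print "$_\n" for @rows;
```

Using `local` keeps the changed separator confined to the subroutine, so line-by-line reads elsewhere in the program are unaffected.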

      Thank you for the replies. To complicate this a bit more: if the file looks like this, and I need to parse out the last two occurrences of the search string, skip three lines in each, and then print only the lines of data that have 0.150 in the first field and -45.00 in the third field, what would I add? Thanks again!

      9 FAILURE CRITERIA PER PLY

      FAILURE CRITERIA PER PLY

      DIST ANGLE PLY FAILURE NUMBERS
      1 2 SHEAR
      0.000 0.00 -45.00 0.238 0.282 -1.459
      0.000 0.00 0.00 0.971 1.369 0.004
      0.150 5.00 -45.00 0.475 0.142 -1.585
      0.150 5.00 0.00 1.003 1.531 -0.274

      FAILURE CRITERIA PER PLY

      DIST ANGLE PLY FAILURE NUMBERS
      1 2 SHEAR
      0.000 0.00 -45.00 0.247 0.293 -1.514
      0.000 0.00 0.00 1.008 1.422 0.004
      0.150 5.00 -45.00 0.493 0.147 -1.645
      0.150 5.00 0.00 1.042 1.589 -0.284

        if the file looks like this and I need to parse out the last two occurrences of the search string, skip three lines in each and then only print the lines of data that have 0.150 in the first field and -45.00 in the third field, what would I add?

        <smartass>"Code to do that" seems like an obvious answer.</smartass>

        Honestly though, there is code in my example above that demonstrates splitting into fields, and checking for equality isn't a very obscure operation. Try it yourself, and if you can't figure it out, show us your code that doesn't work. You'll get less sarcastic answers if you demonstrate that you've made a little effort.
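For readers following along, the two operations mentioned here (splitting a row into fields and comparing fields for equality) can be sketched as below. This is an illustration, not code from the thread: the helper name `last_blocks` is hypothetical, the sample layout is assumed, and the header is matched only when it stands alone on a line so that the "9 FAILURE CRITERIA PER PLY" line is not mistaken for a block start.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect every block that follows a line consisting only of $header,
# then return the last $keep blocks. Each row is stored as an array
# reference of whitespace-separated fields.
# Assumptions: three lines to skip after a header; blank line or EOF ends a block.
sub last_blocks {
    my ($fh, $header, $keep) = @_;
    my @blocks;
    while (my $line = <$fh>) {
        next unless $line =~ /^\s*\Q$header\E\s*$/;
        for (1 .. 3) {
            last unless defined(scalar <$fh>);
        }
        my @rows;
        while (my $row = <$fh>) {
            last if $row =~ /^\s*$/;
            push @rows, [ split ' ', $row ];
        }
        push @blocks, \@rows;
    }
    splice @blocks, 0, @blocks - $keep if @blocks > $keep;
    return @blocks;
}

# Hypothetical sample modelled on the data posted above
my $sample = <<'END';
9 FAILURE CRITERIA PER PLY

FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.238 0.282 -1.459
0.150 5.00 -45.00 0.475 0.142 -1.585
0.150 5.00 0.00 1.003 1.531 -0.274

FAILURE CRITERIA PER PLY

DIST ANGLE PLY FAILURE NUMBERS
1 2 SHEAR
0.000 0.00 -45.00 0.247 0.293 -1.514
0.150 5.00 -45.00 0.493 0.147 -1.645
0.150 5.00 0.00 1.042 1.589 -0.284
END

open my $fh, '<', \$sample or die "Can't open in-memory sample: $!";
my @hits;
for my $block (last_blocks($fh, 'FAILURE CRITERIA PER PLY', 2)) {
    for my $fields (@$block) {
        next unless @$fields >= 3;
        # numeric comparison, so "0.150" and "0.15" would both match
        push @hits, join(' ', @$fields)
            if $fields->[0] == 0.150 && $fields->[2] == -45.00;
    }
}
close $fh;

print "$_\n" for @hits;
```

Comparing with `==` treats the fields as numbers; if the exact text matters (for example, distinguishing "0.150" from "0.15"), string comparison with `eq` would be the alternative.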

Node Type: perlquestion [id://973550]
Approved by Eliya