PerlMonks  

Selecting successive lines

by zee3b (Novice)
on Sep 17, 2013 at 02:54 UTC ( #1054374=perlquestion )
zee3b has asked for the wisdom of the Perl Monks concerning the following question:

Hey guys, I'm working on a text parser. I read the file into an array and loop over it with foreach, but I only need specific successive lines when calculating the average. Here's an example. My data file has this structure:

    Jack
    Student ID - 12445
    Math Score - 45

    Jill
    Student ID - 234254
    Math Score - 90

    Jack
    Student ID -12445
    Math Score2 - 33

    Jill
    Student ID - 234254
    Math Score2 - 10

So as soon as my regex matches the name Jill or Jack, I want to pick up the score from the third line of that record. Is there a command I can add inside the if block? E.g.

    if ($line =~ /Jill/) { *pick the score from the second line after this one* }

This way I can average the scores for each student across the different tests.

Desired output:

    Jack Average 39
    Jill Average 50

Replies are listed 'Best First'.
Re: Selecting successive lines
by frozenwithjoy (Priest) on Sep 17, 2013 at 03:52 UTC
    If I were going to do this and wanted to have it expandable, I'd probably plan to make a hash like this (you can simplify it if you don't care about keeping track of student IDs and never have students with the same names):
        my %records = (
            12445 => {
                name   => 'Jack',
                scores => [ 45, 33 ],
            },
            234254 => {
                name   => 'Jill',
                scores => [ 90, 10 ],
            },
        );

    If the data are consistently in the format shown, the following will make the hash (it can accept scores w/ decimals). It then calculates and reports the mean score.

        #!/usr/bin/env perl
        use strict;
        use warnings;
        use feature 'say';
        use List::Util 'sum';

        my %records;

        while ( my $name = <DATA> ) {
            my ($id)    = <DATA> =~ /(\d+)$/;
            my ($score) = <DATA> =~ /(\d+\.?\d*)$/;
            <DATA>;
            chomp( $name, $id, $score );
            $records{$id}{name} = $name;
            push @{ $records{$id}{scores} }, $score;
        }

        for ( sort { $a <=> $b } keys %records ) {
            my $name   = $records{$_}{name};
            my @scores = @{ $records{$_}{scores} };
            my $mean   = sum(@scores) / @scores;
            say "$name ($_) has an average score of $mean";
        }

        __DATA__
        Jack
        Student ID - 12445
        Math Score - 45

        Jill
        Student ID - 234254
        Math Score - 90

        Jack
        Student ID -12445
        Math Score2 - 33

        Jill
        Student ID - 234254
        Math Score2 - 10

    OUTPUT:

        Jack (12445) has an average score of 39
        Jill (234254) has an average score of 50

Re: Selecting successive lines
by kcott (Chancellor) on Sep 17, 2013 at 04:05 UTC

    G'day zee3b,

    You can read that data in paragraph mode (see perlvar for details): basically, this allows you to read each group of three lines as a single record. When you do this, you can grab the name, from line 1, and the score, from the end of line 3, in a single operation.

        #!/usr/bin/env perl -l

        use strict;
        use warnings;

        my %score;

        {
            local $/ = "";

            while (<DATA>) {
                /\A(\w+).*?(\d+)\D*\z/ms;
                ++$score{$1}{count};
                $score{$1}{total} += $2;
            }
        }

        for (sort keys %score) {
            print $_, ' Average ', $score{$_}{total} / $score{$_}{count};
        }

        __DATA__
        Jack
        Student ID - 12445
        Math Score - 45

        Jill
        Student ID - 234254
        Math Score - 90

        Jack
        Student ID -12445
        Math Score2 - 33

        Jill
        Student ID - 234254
        Math Score2 - 10

    Output:

        $ pm_file_parse_avg.pl
        Jack Average 39
        Jill Average 50

    For your real application, you'll probably also want either int for whole number averages, or sprintf to format floating point results.

    -- Ken
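
    The int/sprintf suggestion above can be sketched as follows. The scores here are hypothetical, picked only so that the mean is not a whole number; the format string '%.2f' is likewise just one possible choice.

    ```perl
    use strict;
    use warnings;

    # Hypothetical scores whose mean is not a whole number.
    my @scores = ( 45, 34 );

    my $total = 0;
    $total += $_ for @scores;
    my $mean = $total / @scores;              # 39.5

    my $truncated = int($mean);               # int() drops the fraction: 39
    my $formatted = sprintf '%.2f', $mean;    # two decimal places: "39.50"

    print "Truncated: $truncated\n";
    print "Formatted: $formatted\n";
    ```

    Note that int truncates rather than rounds, so sprintf is probably the better fit when the fractional part matters.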

      You can read that data in paragraph mode (see perlvar for details)

      OMG I have been doing this the hard way (with manual parse phase state and similar techniques) for over a decade.

      :: facepalm ::

      Off to read now. And let the blush drain from my face.

      Thanks for the reference!

Re: Selecting successive lines
by davido (Archbishop) on Sep 17, 2013 at 04:35 UTC

    I would read each complete record rather than one line at a time, and to keep it simple, would push scores into an anonymous array per student. Then just iterate over each student and do the math.

        use List::Util qw( sum );

        my %student;

        local $/ = '';

        while( <DATA> ) {
            push @{$student{$1}}, $2 and next
                if m/\A([^\n]+).+?(\d+)\n?\Z/s;
            warn "Invalid record #$.: <<$_>>\n";
        }

        while( my( $name, $scores ) = each %student ) {
            print "$name: Average ", sum( @{$scores} ) / @$scores, "\n";
        }

        __DATA__
        Jack
        Student ID - 12445
        Math Score - 45

        Jill
        Student ID - 234254
        Math Score - 90

        Jack
        Student ID -12445
        Math Score2 - 33

        Jill
        Student ID - 234254
        Math Score2 - 10

    Dave

Re: Selecting successive lines
by ansh batra (Friar) on Sep 17, 2013 at 09:01 UTC

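    I have a simple way: keep a flag. When you get the name, set the flag to one, and if the flag is one, then get the score.

        my ( $name, $score );
        my $flag = 0;
        foreach my $data (@lines) {    # @lines: the lines read from the file
            if ( $data =~ /Jack/ ) {
                $name = 'Jack';        # remember which student this is
                $flag = 1;
            }
            if ( $flag == 1 && $data =~ /Math Score\d* - (\d+)/ ) {
                $score = $1;           # the number after the label
                $flag  = 0;
            }
        }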

    Since you need to check for multiple names, you may also use

        if ( grep( /^$data$/, @array ) )
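
    A minimal sketch of the multiple-names check above, assuming @array is meant to hold the student names and $data is the current line with its newline removed. The names @names and is_known_name are hypothetical. It compares with eq instead of interpolating the line into a regex, so a line containing regex metacharacters cannot change what gets matched.

    ```perl
    use strict;
    use warnings;

    # Hypothetical list of the student names to look for.
    my @names = ( 'Jack', 'Jill' );

    # Returns the number of known names equal to the (chomped) line,
    # which is true when the line is a name and false (0) otherwise.
    sub is_known_name {
        my ($data) = @_;
        return scalar( grep { $_ eq $data } @names );
    }

    for my $data ( 'Jack', 'Student ID - 12445', 'Jill' ) {
        print "$data: ", ( is_known_name($data) ? 'name' : 'not a name' ), "\n";
    }
    ```

    For a long list of names, a hash lookup would likely be a better choice than scanning the array for every line.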

Re: Selecting successive lines
by sundialsvc4 (Abbot) on Sep 17, 2013 at 12:50 UTC

    The customary approach to doing this sort of thing is the approach taken by the awk utility:  

    /regular_expression/
      { code to execute if regex is matched }
    ... rinse and repeat ...

    So, in this file, there would be .. it looks like .. about five different “kinds” of lines, including blank-line, and you have things-to-do with two of them.   For a student_name line, you capture the name and proceed.   For a Score(n) line, you extract the score and do something with it, using the most-recently captured student name.   Perhaps for a blank-line you forget the name.   And so on.

    One advantage of this approach is that it is relatively “future-proof.”   You are making fewer assumptions about the data, such as “the third line.”   It is also now much easier for your program to recognize when there is a bug in the program that produced the file, which is another important consideration in production settings.
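
    The awk-style dispatch described above might look like this in Perl. This is only a sketch under stated assumptions: the patterns for each kind of line are guesses based on the sample data in the question, and the names @lines, %total, %count and $line_no are hypothetical. The lines are held in an array, as in the original question, rather than read from a file.

    ```perl
    use strict;
    use warnings;

    # Sample lines in the layout from the question (newlines already removed).
    my @lines = (
        'Jack', 'Student ID - 12445',  'Math Score - 45',  '',
        'Jill', 'Student ID - 234254', 'Math Score - 90',  '',
        'Jack', 'Student ID -12445',   'Math Score2 - 33', '',
        'Jill', 'Student ID - 234254', 'Math Score2 - 10',
    );

    my ( %total, %count );
    my $name;           # most recently seen student name; undef between records
    my $line_no = 0;

    for my $line (@lines) {
        $line_no++;

        if ( $line =~ /^\s*$/ ) {
            undef $name;    # blank line: forget the current name
        }
        elsif ( $line =~ /^Student ID\s*-\s*\d+$/ ) {
            # ID line: recognised, but nothing to do in this sketch
        }
        elsif ( $line =~ /^Math Score\d*\s*-\s*(\d+(?:\.\d+)?)$/ ) {
            if ( defined $name ) {
                $total{$name} += $1;
                $count{$name}++;
            }
            else {
                warn "Line $line_no: score with no student name\n";
            }
        }
        elsif ( $line =~ /^(\w+)$/ ) {
            $name = $1;     # name line: remember it for the score line
        }
        else {
            warn "Line $line_no: unrecognised line '$line'\n";
        }
    }

    for my $student ( sort keys %total ) {
        print "$student Average ", $total{$student} / $count{$student}, "\n";
    }
    ```

    With the sample lines this prints "Jack Average 39" and "Jill Average 50". The final else branch is where a malformed input file would show up, which is the error-detection benefit mentioned above.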
