PerlMonks  

creating and printing a sliding window

by Angharad (Pilgrim)
on Mar 04, 2009 at 14:40 UTC
Angharad has asked for the wisdom of the Perl Monks concerning the following question:

I have a file that looks like this.
    1 0 0.00 0 0 0
    2 0 0.00 0 0 0
    3 0 0.08 0 0 0
    4 0 0.05 0 0 0
    5 0 0.08 0 0 0
    6 0 0.05 0 0.12 0
    7 0 0.05 0 0.12 0
    8 0 0.04 0 0.15 0
    9 0.07 0.07 0 0.15 0.18
    10 0.29 0.04 0.32 0.32 0.19
    11 0.46 0.05 0.42 0.30 0.21
    12 0.45 0.07 0.35 0.29 0.41
    13 0.57 0.07 0.42 0.00 0.47
    14 0.46 0.04 0.62 0.00 0.58
    15 0.39 0.05 0.41 0.00 0.37
and so on, where the first column is a position number and the other columns are my data of interest. I want to create a 'sliding window' that takes into account all the data points within 5 consecutive positions and then prints the highest score in each of my 5 data columns within those 5 positions. For example, for the data:
    10 0.29 0.04 0.32 0.32 0.19
    11 0.46 0.05 0.42 0.30 0.21
    12 0.45 0.07 0.35 0.29 0.41
    13 0.57 0.07 0.42 0.00 0.47
    14 0.46 0.04 0.62 0.00 0.58
I would print out
10-14 0.57 0.07 0.62 0.32 0.58
This is what I've attempted so far, which simply doesn't work: I can't get the count to increment. Do I need to use an array instead of using a while loop to go through each line of the file one at a time?
    #!/usr/bin/perl -w
    use strict;
    use warnings;
    use English;
    use FileHandle;
    use Exception;

    my $input = shift;
    my $count = 0;
    my $largest_cons1 = 0;
    my $largest_cons2 = 0;
    my $largest_cons3 = 0;
    my $largest_cons4 = 0;
    my $largest_cons5 = 0;

    open(FILE, "$input") || die "ERROR: Unable to open input file: $!\n";

    while(<FILE>)
    {
        my @data = split(/\s+/, $_);
        $count++;
        my $pos = $data[1];
        my $cons1 = $data[2];
        my $cons2 = $data[3];
        my $cons3 = $data[4];
        my $cons4 = $data[5];
        my $cons5 = $data[6];
        if($count < 5)
        {
            if($cons1 > $largest_cons1)
            {
                $largest_cons1 = $cons1;
            }
            if($cons2 > $largest_cons2)
            {
                $largest_cons2 = $cons2;
            }
            if($cons3 > $largest_cons3)
            {
                $largest_cons3 = $cons3;
            }
            if($cons4 > $largest_cons4)
            {
                $largest_cons4 = $cons4;
            }
            if($cons5 > $largest_cons5)
            {
                $largest_cons5 = $cons5;
            }
        }
        print "$pos $largest_cons1 $largest_cons2 $largest_cons3 $largest_cons4 $largest_cons5\n";
    }
Any help/suggestions much appreciated. Thanks in advance!

Re: creating and printing a sliding window
by johngg (Abbot) on Mar 04, 2009 at 15:23 UTC

    Have a look at the List::Util core module, particularly the max() routine. Also be aware that array subscripts are zero-based, so your

        ...
        my $pos = $data[1];
        my $cons1 = $data[2];
        my $cons2 = $data[3];
        my $cons3 = $data[4];
        my $cons4 = $data[5];
        my $cons5 = $data[6];
        ...

    will be pointing one element too far to the right. You can also do that in one fell swoop.

        while( <FILE> )
        {
            my( $pos, $cons1, $cons2, $cons3, $cons4, $cons5 ) = split;
            ...

    The default action for split is to split $_ on whitespace.
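
    As a minimal sketch of those two points together (the sample rows and the variable names @lines, @first_scores and $largest are made up for illustration, not taken from the original script), a bare split puts the position in field 0, and max() from List::Util picks the largest value out of a list:

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical sample rows in the question's format: position, then five scores.
my @lines = (
    "10 0.29 0.04 0.32 0.32 0.19\n",
    "11 0.46 0.05 0.42 0.30 0.21\n",
);

my @first_scores;
for (@lines) {
    # Bare split works on $_ and splits on whitespace, so the position is field 0.
    my ( $pos, @cons ) = split;
    push @first_scores, $cons[0];
}

# max() returns the numerically largest value in its argument list.
my $largest = max(@first_scores);
print "largest first score: $largest\n";    # prints "largest first score: 0.46"
```

    Collecting the scores into @cons rather than $cons1 .. $cons5 also makes it easy to loop over the columns later.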

    I hope this is helpful.

    Cheers,

    JohnGG

Re: creating and printing a sliding window
by shmem (Canon) on Mar 04, 2009 at 15:30 UTC

    Close.

        if($count < 5)
        {
            ...
        }
        else {
            print "$pos ... \n";
            $count = 0;
            @data = ();
        }

    but you would need to reset $largest_cons<n> too, depending on your requirements.

    Some points:

    • you use English, FileHandle and Exception, but then you don't make use of them. Why?
    • Why those $largest_cons<number> variables? Wherever you are inclined to number your variables, you really want an array
    • at open(FILE, "$input") use three-argument open
    • at open(FILE, "$input") - useless use of quotes
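
    The last three points could be sketched roughly like this. The helper name column_maxima is hypothetical, and the sketch only tracks whole-file maxima per column rather than a window; it is meant to show the array and the three-argument open, not a full solution:

```perl
use strict;
use warnings;

# Hypothetical helper: return the largest value seen in each score column.
sub column_maxima {
    my ($fh) = @_;
    my @largest;    # one slot per column, replaces $largest_cons1 .. $largest_cons5
    while ( my $line = <$fh> ) {
        my ( $pos, @cons ) = split ' ', $line;
        for my $i ( 0 .. $#cons ) {
            $largest[$i] = $cons[$i]
                if !defined $largest[$i] || $cons[$i] > $largest[$i];
        }
    }
    return @largest;
}

# Three-argument open with a lexical filehandle and no quotes around $input.
# Guarded so the sketch does nothing when no file name is given.
if ( defined( my $input = shift @ARGV ) ) {
    open my $fh, '<', $input
        or die "ERROR: Unable to open input file '$input': $!\n";
    print join( ' ', column_maxima($fh) ), "\n";
    close $fh;
}
```

    Because the maxima live in an array, the same loop works for any number of score columns.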

    You could also use $. (see perlvar) in a flip-flop (".." - see perlop):

        #!/usr/bin/perl -w
        use strict;
        use warnings;

        my $input = shift;
        #open FILE, '<', "$input" or die "ERROR: Unable to open input file: $!\n";
        my @largest_cons = (0) x 6;   # inhibit "uninitialized" warnings
        while (<DATA>) {
            my @data = split;
            # if ( 1 .. 5 )   # see update below
            # {
                $largest_cons[0] = $data[0] if $. == 1;
                for (1..$#data) {
                    $largest_cons[$_] = $data[$_] if $data[$_] > $largest_cons[$_];
                }
                if ($. == 5) {
                    $largest_cons[0] .= '-' . $data[0];
                    print "@largest_cons\n";
                    $. = 0;
                    @largest_cons = (0) x 6;
                }
            # }
        }

    Note that starting with 1, you end at 1-5 .. 11-15 rather than 10-14. For that you need a row 0.

    Update: on a second look at the code I've posted, the flip-flop-business doesn't make sense here... ;)

Re: creating and printing a sliding window
by repellent (Priest) on Mar 04, 2009 at 18:48 UTC
    I would stick with the while loop, as it scales better to large files, and make use of List::Util.
        use warnings;
        use strict;
        use List::Util qw(max);

        my @window;
        while (<DATA>)
        {
            chomp();

            # create sliding window
            push(@window, [ (split) ]);
            shift(@window) if $. > 5;

            # print range
            print $window[0][0], "-", $window[-1][0];

            # print maximums
            for my $i (1 .. 5)
            {
                print " ", max(map { $_->[$i] } @window);
            }
            print "\n";
        }

        __END__
        1 0 0.00 0 0 0
        2 0 0.00 0 0 0
        3 0 0.08 0 0 0
        4 0 0.05 0 0 0
        5 0 0.08 0 0 0
        6 0 0.05 0 0.12 0
        7 0 0.05 0 0.12 0
        8 0 0.04 0 0.15 0
        9 0.07 0.07 0 0.15 0.18
        10 0.29 0.04 0.32 0.32 0.19
        11 0.46 0.05 0.42 0.30 0.21
        12 0.45 0.07 0.35 0.29 0.41
        13 0.57 0.07 0.42 0.00 0.47
        14 0.46 0.04 0.62 0.00 0.58
        15 0.39 0.05 0.41 0.00 0.37

    Output:
        1-1 0 0.00 0 0 0
        1-2 0 0.00 0 0 0
        1-3 0 0.08 0 0 0
        1-4 0 0.08 0 0 0
        1-5 0 0.08 0 0 0
        2-6 0 0.08 0 0.12 0
        3-7 0 0.08 0 0.12 0
        4-8 0 0.08 0 0.15 0
        5-9 0.07 0.08 0 0.15 0.18
        6-10 0.29 0.07 0.32 0.32 0.19
        7-11 0.46 0.07 0.42 0.32 0.21
        8-12 0.46 0.07 0.42 0.32 0.41
        9-13 0.57 0.07 0.42 0.32 0.47
        10-14 0.57 0.07 0.62 0.32 0.58
        11-15 0.57 0.07 0.62 0.30 0.58
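
    The first four output lines (1-1 to 1-4) come from windows that are not yet full. If only complete windows are wanted, as in the question's 10-14 example, one possible variation is sketched below. The helper name full_window_maxima and its argument order are assumptions made for this illustration:

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: one output line per full window of $size rows.
sub full_window_maxima {
    my ( $size, @lines ) = @_;
    my ( @window, @out );
    for my $line (@lines) {
        push @window, [ split ' ', $line ];
        shift @window if @window > $size;
        next if @window < $size;    # skip the partial windows at the start

        my $last_col = $#{ $window[0] };
        my @maxima = map {
            my $col = $_;
            max( map { $_->[$col] } @window );
        } 1 .. $last_col;
        push @out, sprintf '%s-%s %s',
            $window[0][0], $window[-1][0], join( ' ', @maxima );
    }
    return @out;
}

# The five rows from the question's example.
print "$_\n" for full_window_maxima(
    5,
    "10 0.29 0.04 0.32 0.32 0.19",
    "11 0.46 0.05 0.42 0.30 0.21",
    "12 0.45 0.07 0.35 0.29 0.41",
    "13 0.57 0.07 0.42 0.00 0.47",
    "14 0.46 0.04 0.62 0.00 0.58",
);
# prints "10-14 0.57 0.07 0.62 0.32 0.58"
```

    Testing the window length instead of $. keeps the helper independent of any particular filehandle.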
Re: creating and printing a sliding window
by Utilitarian (Vicar) on Mar 04, 2009 at 15:19 UTC
    Slurping the file into an array would be more efficient.

    hints below

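        ...
        open FILE, '<', $ARGV[0] or die "Unable to open $ARGV[0]: $!\n";
        my @data_records = <FILE>;
        # one window per starting row, stopping when fewer than 5 rows remain
        for (my $x = 0; $x + 5 <= @data_records; $x++) {
            my (@max, $first, $last);    # reset the maxima for every window
            for (my $index = $x; $index < $x + 5; $index++) {
                my @record = split ' ', $data_records[$index];
                $first = $record[0] if $index == $x;
                $last  = $record[0];
                # separate loop variable so the row index is not clobbered
                for (my $col = 1; $col < @record; $col++) {
                    $max[$col] = $record[$col]
                        if !defined $max[$col] or $max[$col] < $record[$col];
                }
            }
            print "$first-$last @max[1 .. $#max]\n";
        }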
    EDIT: Re-read the post; now trying to answer the question actually posed.
