Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Re: Counter - number of tags per interval

by roboticus (Chancellor)
on May 04, 2013 at 23:31 UTC ( #1032092=note: print w/ replies, xml ) Need Help??


in reply to Counter - number of tags per interval

baxy77bax:

It took a while to figure out what your algorithm was, but I finally doped it out. It seemed overly complex for what you're doing, though. So I coded it up in a form that's a little easier to understand--at least for me! I don't know if it's any faster than yours, or not, so give it a try and let me know how it looks.

Basically, I just keep a list of active intervals, and increment all of them when I see a '1'. When I hit a column that begins a new interval, I move it from the "waiting" list and into the "current" list. When I hit an end column, I remove all expired lists from the active list. That's pretty much it.

Please let me know if it's any better for you...

$ cat 1032018_d.pl #!/usr/bin/perl use warnings; use strict; use Getopt::Long; my ($i1, $i2, %hash1); GetOptions ('i=s' => \$i1, 't=s' => \$i2); open (IN, "<", $i1)|| die "$!"; open (IN1, "<", $i2)|| die "$!"; # Build intervals for each ID while (<IN1>){ chomp; my ($id, $beg, $end) = split /\s+/,$_; push @{$hash1{$id}}, [ $beg, $end, 0 ]; } close IN1; my $p =0; my $id = ""; my @add_intervals; my @del_intervals; my @cur_intervals; #count tags per interval while(<IN>){ s/\s+$//; #print "$.: $_\n"; if (/#(.*)/){ $p=0; $id = $1; # Intervals for this ID @add_intervals = sort {$a->[0] <=> $b->[0]} @{$hash1{$id}}; # List of interval ends my %uniq = map { $_->[1], 0 } @add_intervals; @del_intervals = sort { $a <=> $b } keys %uniq; # Start with no active intervals @cur_intervals = ( ); next; } $p++; # sometimes the first column is set to 0 /^(\d+)\s+(\d+)/ or next; # add new intervals that start on this column while (@add_intervals and ($p == $add_intervals[0][0])) { #print "Adding interval $id:$add_intervals[0][0] .. $add_inter +vals[0][1]\n"; push @cur_intervals, shift @add_intervals; } # Increment all active intervals when we find a hit if (@cur_intervals and $2 eq '1') { ++$_->[2] for @cur_intervals; } # remove ranges that end on this column if (@del_intervals and $p==$del_intervals[0]) { #print "Deleting intervals ending on column $p\n"; shift @del_intervals; @cur_intervals = grep { $_->[1] > $p } @cur_intervals; } } close IN; for my $id (sort keys %hash1) { for my $interval (@{$hash1{$id}}) { print "$id ($interval->[0] .. $interval->[1]) : $interval->[2] +\n"; } } $ perl 1032018_d.pl -i=1032018.file_2 -t=1032018.file_1 a (12 .. 15) : 2 a (12 .. 17) : 4 a (13 .. 14) : 1 a (14 .. 19) : 5 b (10 .. 15) : 5 b (12 .. 15) : 4

Note: I think you might mean to use the first capture in the last regex as your column ($p), but I didn't change it.

...roboticus

When your only tool is a hammer, all problems look like your thumb.


Comment on Re: Counter - number of tags per interval
Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1032092]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (6)
As of 2015-07-05 22:12 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (68 votes), past polls