### Re: Counter - number of tags per interval

by roboticus (Chancellor)
 on May 04, 2013 at 23:31 UTC ( #1032092=note: print w/replies, xml ) Need Help??

in reply to Counter - number of tags per interval

It took a while to figure out what your algorithm was, but I finally doped it out. It seemed overly complex for what you're doing, though. So I coded it up in a form that's a little easier to understand--at least for me! I don't know if it's any faster than yours, or not, so give it a try and let me know how it looks.

Basically, I just keep a list of active intervals, and increment all of them when I see a '1'. When I hit a column that begins a new interval, I move it from the "waiting" list and into the "current" list. When I hit an end column, I remove all expired lists from the active list. That's pretty much it.

Please let me know if it's any better for you...

```\$ cat 1032018_d.pl
#!/usr/bin/perl
use warnings;
use strict;
use Getopt::Long;

my (\$i1, \$i2, %hash1);

GetOptions ('i=s' => \\$i1, 't=s' => \\$i2);

open (IN, "<", \$i1)|| die "\$!";
open (IN1, "<", \$i2)|| die "\$!";

# Build intervals for each ID
while (<IN1>){
chomp;
my (\$id, \$beg, \$end) = split /\s+/,\$_;
push @{\$hash1{\$id}}, [ \$beg, \$end, 0 ];
}

close IN1;

my \$p =0;
my \$id = "";

my @del_intervals;
my @cur_intervals;

#count tags per interval
while(<IN>){
s/\s+\$//;
#print "\$.: \$_\n";
if (/#(.*)/){
\$p=0;
\$id = \$1;

# Intervals for this ID
@add_intervals = sort {\$a->[0] <=> \$b->[0]} @{\$hash1{\$id}};

# List of interval ends
my %uniq = map { \$_->[1], 0 } @add_intervals;
@del_intervals = sort { \$a <=> \$b } keys %uniq;

@cur_intervals = ( );
next;
}
\$p++; # sometimes the first column is set to 0

/^(\d+)\s+(\d+)/ or next;

# add new intervals that start on this column
+vals[0][1]\n";
}

# Increment all active intervals when we find a hit
if (@cur_intervals and \$2 eq '1') {
++\$_->[2] for @cur_intervals;
}

# remove ranges that end on this column
if (@del_intervals and \$p==\$del_intervals[0]) {
#print "Deleting intervals ending on column \$p\n";
shift @del_intervals;
@cur_intervals = grep { \$_->[1] > \$p } @cur_intervals;
}
}
close IN;

for my \$id (sort keys %hash1) {
for my \$interval (@{\$hash1{\$id}}) {
print "\$id (\$interval->[0] .. \$interval->[1]) : \$interval->[2]
+\n";
}
}

\$ perl 1032018_d.pl -i=1032018.file_2 -t=1032018.file_1
a (12 .. 15) : 2
a (12 .. 17) : 4
a (13 .. 14) : 1
a (14 .. 19) : 5
b (10 .. 15) : 5
b (12 .. 15) : 4

Note: I think you might mean to use the first capture in the last regex as your column (\$p), but I didn't change it.

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Create A New User
Node Status?
node history
Node Type: note [id://1032092]
help
Chatterbox?
 [Your Mother]: "Tick Twister." [1nickt]: My sister lives in the Sydney suburbs (carved out of the bush) and they have the Paralysis tick -- leavs you quadriplegic! [perldigious]: Hmm, thanks Lotus1... perldigious scribbles note to buy lots of opossums as "pets" in the future to hang around outside his property. :-P [1nickt]: You must twist and pull! If you burn, use tea tree oil etc, the tick barfs (inside your blood vessel) and you get all the loevely bacteria in his gut transferred to yours. [1nickt]: tweesers work just fine [hippo]: Hope everyone else is enjoying their lunch too. [1nickt]: However, note that it takes between 2 and 4 hours for a tick to be able to start transmitting bacteria. So even if it has burrowed in and started eating, if you get it soon you should be OK. [perldigious]: 1nickt: No more red meat! It would be more humane to just kill you, wouldn't it! :-) [1nickt]: Just don't make it barf. [1nickt]: perldigious that's exactly what my stepson says.

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (14)
As of 2017-05-24 13:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
My favorite model of computation is ...

Results (184 votes). Check out past polls.