Can I speed this up? (repetitively scanning ranges in a large array)

by daverave (Scribe)
on Nov 01, 2010 at 14:28 UTC

daverave has asked for the wisdom of the Perl Monks concerning the following question:

I have a circular coordinate system. A legal coordinate is any integer from 1 to max_length. For simplicity, we shall assume max_length = 100.

A range is any pair of legal coordinates. It can be 'simple', like (5,20), which means 5,6,7,...,19,20, or 'wrapped', like (95,10), which means 95,96,...,99,100,1,2,3,...,9,10. Note that the coordinate system starts from 1 (not 0) and that ranges are always closed, i.e. inclusive (the start and end coordinates of a range are both considered in range).

I have a bunch of ranges. For each coordinate i in the coordinate system (i.e. for each i in 1 .. max_length), I want to find the size of the smallest window centered around i such that no range fully contains this window.

To make things clearer, let's suppose max_length = 10.

1234567890 coordinate
----------
+++        range (1,3)
 ++++++    range (2,7)
++       + range (0,2)
----------
5557753113 size of smallest window...

For coordinate 1, if we look at a window of size 1 that is centered at 1, it is just (1,1). It is covered by a range (actually, two ranges), so we continue with a larger window size (let's only look at odd-sized windows). Let's look at a window of size 3 that is centered at 1. This is (0,2) and it's still covered. Now, we get to a window of size 5, which is (9,3). This is not covered by any range, so the result for coordinate 1 is 5. We do the same for each coordinate. Note that for coordinate 8 we stop at the first size (1) since (8,8) is not covered by any range. The other values are displayed in the last line of the diagram above.
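
For reference, this definition can be turned into a brute-force check almost word for word. The sketch below is my own (the helpers covers and smallest_uncovered_window are made-up names, and it is far too slow for the real data), but it reproduces the last line of the diagram:

use strict;
use warnings;

# Does the range ($s, $e) fully contain the window ($ws, $we)?  Everything is
# 1-based, circular, and inclusive at both ends.
sub covers {
    my ( $s, $e, $ws, $we, $max ) = @_;
    my $steps = sub { ( $_[1] - $_[0] ) % $max };    # clockwise steps from a to b
    return $steps->( $s, $ws ) + $steps->( $we, $e ) <= $steps->( $s, $e );
}

# Smallest odd-sized window centered at $i that no single range fully contains.
sub smallest_uncovered_window {
    my ( $i, $ranges, $max ) = @_;
    for my $half ( 0 .. int( ( $max - 1 ) / 2 ) ) {
        my $ws = ( $i - 1 - $half ) % $max + 1;
        my $we = ( $i - 1 + $half ) % $max + 1;
        return 2 * $half + 1
            unless grep { covers( @$_, $ws, $we, $max ) } @$ranges;
    }
    return undef;    # some range covers (nearly) the whole circle
}

my @ranges = ( [ 1, 3 ], [ 2, 7 ], [ 10, 2 ] );    # the diagram's (0,2) written as (10,2)
print join( '', map { smallest_uncovered_window( $_, \@ranges, 10 ) } 1 .. 10 ), "\n";
# prints 5557753113, matching the last line of the diagram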

Now, I wish to calculate this last line quickly. The actual size of the coordinate system is around 4M, and the number of ranges is about 25k, each of length about 5k.

What I currently do is to initialize a results array of size max_length (i.e. 4M) to zeros. Then, for each range (start, end) I iterate over all coordinates i in (start,end). I take min(i-start,end-i) (i.e. the distance to the nearest end of the range) and update results(i) with it if this val is larger.

The idea is that for each coordinate I only care about the 'largest' range that contains this coordinate -- largest in the sense that its nearest end is the most distant.

This works, but it means I actually do some 25k*5k=125M 'operations' (for each range, I go over all the coordinates in it).

I usually wouldn't care, but it takes almost half an hour (with some loading of data and writing, but I have no doubt the multiple scanning takes most of the time). Since I need to do this some 1,000 times (for simulations), this becomes a problem.

Any ideas for making this process more efficient?

I'll try to upload some code later today, but I have to work on it a bit so it can become a working example for you to play with. In the meantime, if you have any conceptual ideas, I'd love to hear them.

Thank you

Dave

UPDATE

Sorry for the delay, but I tried to simplify things as much as I could to get the simplest working example. Here it is:

use 5.012;
use List::Util qw (min max);

my $max_length = 10;
my @ranges = ( [ 1, 3 ], [ 2, 7 ], [ 10, 2 ] );

# foreach $i in 1 .. $max_length: $res_a->[$i] == half size of
# smallest odd-sized window which is centered at $i and not covered
# by any range
my $res_a = ranges_a_to_min_uncovered_a(\@ranges);

foreach my $i (1 .. $max_length) {
    print $res_a->[$i]*2+1, " ";    # print full size
                                    # just to be consistent with the original post
                                    # prints '5 5 5 7 7 5 3 1 1 3'
}

# gets an array of ranges
# returns a ref to an array which satisfies:
# foreach $i in 1 .. $max_length $ret->[$i] ==
# half size of smallest odd-sized window which is centered at $i and not covered
# by any range
sub ranges_a_to_min_uncovered_a {
    my $ranges_a = shift // die;

    # default value is zero - a coordinate is not covered by any range
    # array length is $max_length+1 since the first position (0) is never used
    # (coordinates start from 1)
    my @min_uncovered_window_half_sizes = (0) x ( $max_length + 1 );

    foreach my $range ( @{$ranges_a} ) {
        my $start = $range->[0];
        my $end   = $range->[1];

        # iterate all coordinates in range
        if ( $start <= $end ) {    # simple range
            for my $i ( $start .. $end ) {
                ranges_a_to_min_uncovered_a_helper( $i, $range,
                    \@min_uncovered_window_half_sizes );
            }
        }
        else {                     # wrapped range
            for my $i ( $start .. $max_length, 1 .. $end ) {
                ranges_a_to_min_uncovered_a_helper( $i, $range,
                    \@min_uncovered_window_half_sizes );
            }
        }
    }    # foreach my $range

    return \@min_uncovered_window_half_sizes;
}

# gets a range, a coordinate within it and $min_uncovered_a
# updates $min_uncovered_a[$coordinate] if needed
sub ranges_a_to_min_uncovered_a_helper {
    my $coordinate      = shift // die;
    my $range           = shift // die;
    my $min_uncovered_a = shift // die;

    my $dist = min( dist( $range->[0], $coordinate ),
                    dist( $coordinate, $range->[1] ) );

    if ( $dist > $min_uncovered_a->[$coordinate] ) {
        $min_uncovered_a->[$coordinate] = $dist;
    }
}

# returns the length covered between $start and $end (both inclusive)
sub dist {
    my $start = shift // die;
    my $end   = shift // die;

    if ( $end >= $start ) {    # simple range
        return $end - $start + 1;
    }

    # wrapped range
    return $max_length - $start + 1 + $end;
}

I also updated the title of this post so it'll be a bit less mysterious...

Re: Can I speed this up?
by roboticus (Chancellor) on Nov 01, 2010 at 15:58 UTC

    daverave:

    I'd love to code it up, but I'm at work right now. Here's what I'm thinking:

    1. Normalize your lists:
      1. For each one that wraps, turn it into two lists: one extended left from the right coordinate, and one extended right from the left coordinate.
        For example: In a universe of 1..20, the range of (18,5) would normalize to (-2, 5) and (18,25).
      2. Sort your lists by the beginning member.
    2. For each column in your range:
      1. Add all lists containing the current column into your working set
      2. For each list in the working set:
        1. Compute T = 3+2*min(col-list[start],list[end]-col)
        2. The desired output is max(T)
      3. Remove from the working list any list having a lower result whose endpoint is less than the list with the maximum result.

    ...roboticus

      daverave:

      I had a bit of time on my lunch break, so I whipped a little something up. It's pretty close, but there are still a few tweaks I have in mind for it...

      #!/usr/bin/perl
      use strict;
      use warnings;

      my @universe = (1, 10);
      my @ranges = ( [ 1, 3 ], [2, 7], [10, 2] );

      # Normalize the lists
      my @nrm_rng = map {
          ($$_[0] < $$_[1])
              ? $_
              : ( [ $$_[0]-$universe[1], $$_[1] ],
                  [ $$_[0], $$_[1] + $universe[1]-$universe[0] ] )
      } @ranges;

      # Dump lists
      printf("% 2u ",$_) for $universe[0] .. $universe[1];
      print "\n";
      print "-- " for $universe[0] .. $universe[1];
      print "\n";
      for my $R (@nrm_rng) {
          printf("%2s ", ($_>= $$R[0] and $_<=$$R[1]) ? "XX" : "")
              for $universe[0] .. $universe[1];
          print ": ($$R[0], $$R[1])\n";
      }

      my $arMax;
      my @out;
      my @working;
      for my $col ($universe[0] .. $universe[1]) {
          $arMax=undef;

          # Add and remove working lists based on column position
          push @working, grep { $col>=$$_[0] } @nrm_rng;
          @nrm_rng = grep { $col < $$_[0] } @nrm_rng;
          @working = grep { $col<=$$_[1] } @working;

          # Compute result for each list
          for (@working) {
              $$_[2] = 3+2*min(abs($col-$$_[0]),abs($col-$$_[1]));
              if (!defined($arMax) or ($$arMax[2]<$$_[2])
                  or ($$arMax[2]==$$_[2] and $$arMax[1]<$$_[1])) {
                  $arMax=$_;
              }
          }
          $out[$col] = defined($arMax) ? sprintf("% 2u",$$arMax[2]) : ' 1';

          # print "$col ",
          #     "wrk: ", join(", ", map { "(".join(",",@$_).")" } @$_, @working), "\n";

          # Remove lists that will no longer give us values
          @working = grep { $_ eq $arMax or $$arMax[1] < $$_[1] } @working;
      }
      print "FINAL: ", join(' ', @out), "\n";

      sub min {
          my ($l, $r) = (@_);
          return ($l<$r) ? $l : $r;
      }

      ...roboticus

      Update:I reformatted one of the lines of code to get rid of an ugly line break.

      Update 2: Code edits: (1) set ranges as per daverave's request, (2) added debug trace (commented-out print), (3) bugfix ($arMax=undef at start of loop), (4) bugfix (list normalization). I didn't replace all the code, so I may have made an error; let me know if you see something wrong. Currently, though, running it shows me:

      roboticus@Work: /Work/Perl/PerlMonks
      $ perl 868827_ranges_and_such.pl
       1  2  3  4  5  6  7  8  9 10
      -- -- -- -- -- -- -- -- -- --
      XX XX XX                      : (1, 3)
         XX XX XX XX XX XX          : (2, 7)
      XX XX                         : (0, 2)
                                 XX : (10, 11)
       5  5  5  7  7  5  3  1  1  3 : FINAL
        Please try running your code with my simple 3 ranges example (and max_length = 10). I think you get different results, but I only had a quick look at your code and perhaps it's related to the fact your coordinates start from 0, not 1 as I mentioned?

        I tried shifting all my ranges by -1 and running your code again, but I still get different results:

        my @universe = (0, 9);
        my @ranges = ( [ 0, 2 ], [ 1, 6 ], [ 9, 1 ] );

        FINAL: 3 5 5 7 7 5 3 3 3 3
        Specifically, note that there are coordinates (8 and 9, or 7 and 8 in the shifted version) that are not covered by any range, so you should have 1s in your results.

        Update: I forgot to write I really appreciate you spending time in your lunch break (or any other break...) helping me. Thank you!

        Update following your corrected code: I ran your code against my original version on some relatively small example (max_length = 1M, some 6k ranges of a few k's each). My original version was done in about 3 minutes.

        The good news is the results are the same. The bad news is your version ran for more than 18 minutes. Perhaps all the grepping is expensive? (also remember this array is relatively small, only 6k ranges. I usually have about 25k ranges, which means the grepped arrays will be 4 times larger).

Re: Can I speed this up?
by SuicideJunkie (Vicar) on Nov 01, 2010 at 15:58 UTC

    If I understand you correctly, your ranges are all essentially independent of each other.

    One thing I would definitely suggest is buckets for the ranges. Have an array of 2k buckets, with each bucket covering 2000 values. For each range, push a reference to it into the buckets it touches. That way, you only have to inspect a couple of ranges (avg of 30-40 I expect) for each value, and it should speed up the calculation, at the cost of a little pre-processing work.
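
    A minimal sketch of that bucket index, to make the idea concrete; the bucket size, sample ranges and variable names below are made up for illustration, not taken from the thread:

    use strict;
    use warnings;

    my $max_length  = 4_000_000;
    my $bucket_size = 2_000;                 # assumes it divides $max_length evenly
    my $n_buckets   = $max_length / $bucket_size;

    my @ranges = ( [ 10, 5_500 ], [ 3_999_000, 2_500 ] );    # made-up sample data

    my @buckets;    # $buckets[$b] holds refs to every range touching bucket $b
    for my $range (@ranges) {
        my ( $start, $end ) = @$range;
        $end += $max_length if $end < $start;                # unwrap wrapped ranges
        for my $b ( int( ( $start - 1 ) / $bucket_size ) .. int( ( $end - 1 ) / $bucket_size ) ) {
            push @{ $buckets[ $b % $n_buckets ] }, $range;
        }
    }

    # For any coordinate $i, only the ranges in its bucket need to be inspected.
    my $i          = 42;
    my @candidates = @{ $buckets[ int( ( $i - 1 ) / $bucket_size ) ] // [] };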

    Addendum:

    On further thought, this would be quite different and possibly slower than your current scheme, which is pretty good.

    For a refinement, what if you simplify the calculation of distance to the edge point? Try breaking up the for loop over the range into two loops. The first goes from zero distance from the start point up to max at halfway through the range. The second goes from max at halfway, down to zero at the end point.

    Then you can simply say $score = max($score, $loopIndex);
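
    A rough sketch of that two-loop refinement for one non-wrapped range (the sub and variable names are mine; a wrapped range would additionally need its indices taken modulo max_length):

    use strict;
    use warnings;

    sub update_scores {
        my ( $scores, $start, $end ) = @_;          # $scores is an array ref
        my $half = int( ( $end - $start ) / 2 );

        # climb: distance 0 at $start up to the maximum at the midpoint
        for my $d ( 0 .. $half ) {
            my $i = $start + $d;
            $scores->[$i] = $d if $d > ( $scores->[$i] // 0 );
        }

        # descend: distance 0 at $end back up toward the midpoint
        for my $d ( 0 .. $end - $start - $half - 1 ) {
            my $i = $end - $d;
            $scores->[$i] = $d if $d > ( $scores->[$i] // 0 );
        }
    }

    my @score;
    update_scores( \@score, 2, 7 );                 # e.g. the range (2, 7)
    print join( ' ', map { $_ // 0 } @score[ 2 .. 7 ] ), "\n";    # 0 1 2 2 1 0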

Re: Can I speed this up?
by BrowserUk (Patriarch) on Nov 01, 2010 at 16:22 UTC
    What I currently do is to initialize a results array of size max_length (i.e. 4M) to zeros.

    Then, for each range (start, end) I iterate over all coordinates i in (start,end).

    I take min(i-start,end-i) (i.e. the distance to the nearest end of the range) and update results(i) with it if this val is larger.

    I can't reproduce your results for your 3-ranges max-10 example by following those steps? I get 0 1 1 2 2 1 0 0 0 0.

    So I think you really will have to post code.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I updated the original post.
Re: Can I speed this up?
by moritz (Cardinal) on Nov 01, 2010 at 16:09 UTC
    This works, but it means I actually do some 25k*5k=125M 'operations' (for each range, I go over all the coordinates in it).

    I don't see why you have to go over each coordinate in a range. If you show some code, maybe a clever monk can find a way to optimize this step.

    Perl 6 - links to (nearly) everything that is Perl 6.

      This leads to a question about the nature of the problem:
      Do we REALLY need to evaluate all of those 4M values?

      If the end result will be sampled in only a few places, then there are definitely more efficient ways to go about it on-demand. If the interesting points are those with certain specific properties (local maxima?) then there will likely be a way to simplify there too.

        Nice catch. I do care about local maxima. How does this help?
      Please see updated post.
Re: Can I speed this up? (repetitively scanning ranges in a large array)
by ikegami (Patriarch) on Nov 01, 2010 at 21:41 UTC

    A circle doesn't have a start. The only difference between 1..100 and 0..99 is how you store 100 internally.

    You even used zero instead of 10 in your example:

    1234567890 coordinate
      <-------
    ----------
    +++        range (1,3)
     ++++++    range (2,7)
    ++       + range (0,2)
      <-------
    ----------
    5557753113 size of smallest window...
      Obviously. The only reason I'm using coordinates that start from 1, not zero, is to be consistent with the input and output formats I'm using.

      These are biological data, and the unfortunate convention in all biological databases I'm familiar with is to start counting from 1. The first position in any genome is 1. If I used zeros, I would have to remember to convert between the two systems on every input/output, and from experience, I quickly forget to do that...

        I would have to remember to convert between the two systems on every input/output, and from experience, I quickly forget to do that...

        You say it's obvious, then you keep talking as if a circle has a start.

        No conversion is necessary. Just start at index one of the array for the item labeled 1, then keep going for 100 elements, which is going to end you at element zero of the array.

        for (1..100) { print $result[$_ % 100]; }
Re: Can I speed this up? (repetitively scanning ranges in a large array)
by zentara (Archbishop) on Nov 02, 2010 at 12:09 UTC
    if you have any conceptual ideas

    I can't quite get my head around the problem you have, but based on the node title.... Can I speed this up? (repetitively scanning ranges in a large array) , the first thought that came into my mind was PDL's piddles. You can do operations on slices of large arrays very fast. See PDL::Indexing
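
    For what it's worth, a tiny illustrative sketch of the slice idea (the range and the element-wise max trick below are mine, not from the thread): update a whole slice of a 4M-element piddle at once instead of looping coordinate by coordinate.

    use PDL;

    my $max_length = 4_000_000;
    my $results    = zeroes( long, $max_length );

    my ( $s, $e ) = ( 999, 5998 );                  # one made-up (0-based) range
    my $half = int( ( $e - $s ) / 2 );

    my $left = $results->slice( "$s:" . ( $s + $half ) );
    my $ramp = 1 + sequence( long, $half + 1 );     # distances 1, 2, 3, ...
    # element-wise max of the existing values and the new distances
    $left .= $ramp * ( $ramp > $left ) + $left * ( $ramp <= $left );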


    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh
Re: Can I speed this up? (repetitively scanning ranges in a large array)
by ikegami (Patriarch) on Nov 03, 2010 at 00:06 UTC

    Here's a simple solution:

    use strict;
    use warnings;

    sub assign_if_higher {
        $_[0] = $_[1] if $_[0] < $_[1];
    }

    my $size = 10;
    my @ranges = (
        [ 1, 3 ],    # 1-based
        [ 2, 7 ],
        [ 10, 2 ],
    );

    my @genome = (1) x $size;

    for (@ranges) {
        my $s = $_->[0] - 1;    # 0-based
        my $e = $_->[1] - 1;
        my $m = ($e - $s) % $size / 2;
        my $x = 3;
        for (0..$m) {
            assign_if_higher($genome[($s + $_) % $size], $x);
            assign_if_higher($genome[($e - $_) % $size], $x);
            $x += 2;
        }
    }

    print(join(' ', @genome), "\n");

    The time is still proportional to the sum of the number of elements covered by each range, but it's done very simply.

    Any real optimisation to this approach will come from jumping over values that are already known.

    Removing completely overlapping ranges first would take care of some cases, leaving the need to optimise ranges that only partially overlap. It's definitely possible, but I'm out of time.

      Thanks. That's actually very similar to the original solution, but with the object method calls inlined. Adopting this approach actually decreases running time by half. I'm always amazed at how large the overhead of calling functions is, and it saddens me since it makes the code less maintainable and understandable. I guess you can't have it all...
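
      For the curious, that call overhead is easy to measure with the core Benchmark module; this micro-benchmark is only illustrative and uses a made-up range:

      use strict;
      use warnings;
      use Benchmark qw(cmpthese);

      my $max_length = 100;

      sub dist {
          my ( $start, $end ) = @_;
          return $end >= $start ? $end - $start + 1 : $max_length - $start + 1 + $end;
      }

      cmpthese( -2, {
          sub_call => sub { my $d = dist( 95, 10 ) },
          inlined  => sub { my $d = ( 10 - 95 ) % $max_length + 1 },
      } );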

        I didn't "inline" for performance reasons; I didn't create a sub because there was no reason to.

        my $dist = ($end - $start) % $max_length + 1;

        is perfectly fine without transforming it into

        my $dist = dist($start, $end);

        sub dist {
            my $start = shift // die;
            my $end = shift // die;

            if ( $end >= $start ) {    # simple range
                return $end - $start + 1;
            }

            # wrapped range
            return $max_length - $start + 1 + $end;
        }

        This is what I was explaining earlier.

Re: Can I speed this up? (Algorithm error)
by BrowserUk (Patriarch) on Nov 03, 2010 at 19:45 UTC

    I'm trying to approach your problem in quite a different way, which means that my calculations aren't just a refactoring of your code, but worked out from scratch. I've been producing results broadly similar to yours for a while, and I get identical results for the 3-range/max-10 dataset.

    For the larger test-set, I always get some discrepancies, and have been attributing them to errors in my algorithm. But I could never work out where I was going wrong.

    I'm now convinced that your algorithm above is wrong! (Or I'm misunderstanding the question completely?)

    Given the following set of points with a MAX of 35:

    [ 25, 1 ][ 31, 1 ][ 29, 9 ][ 35, 9 ][ 28, 10 ][ 28, 10 ][ 31, 11 ]
    [ 34, 18 ][ 35, 20 ][ 21, 35 ][ 22, 35 ][ 13, 31 ][ 19, 31 ][ 21, 31 ]
    [ 30, 31 ][ 22, 30 ][ 19, 26 ][ 24, 26 ][ 7, 25 ][ 7, 24 ][ 16, 24 ]
    [ 17, 22 ][ 19, 22 ][ 1, 20 ][ 6, 20 ][ 10, 16 ][ 10, 16 ][ 14, 16 ]
    [ 2, 15 ][ 1, 13 ][ 4, 12 ][ 11, 12 ][ 6, 10 ][ 1, 9 ][ 5, 6 ]

    The OP code produces this set of half-distances:

    9 9 8 8 7 8 9 10 10 11 10 9 8 8 9 10 9 8 7 8 9 10 9 8 7 6 7 8 7 6 6 5 6 7 8

    In order to test this hypothesis, I knocked up a script to produce a diagram similar to your ascii art in the OP, for any given data&results set, so that I could manually verify the results. This is what it produces for the above:

    c:\test>868765-plot -MAX=35 -S=20 -R=35 -DIST=op.results
    1234567890123456789012345678901234567890
    *|||||||||||||||||||||||***********|[ 25, 1 ]
    *|||||||||||||||||||||||||||||*****|[ 31, 1 ]
    *********|||||||||||||||||||*******|[ 29, 9 ]
    *********|||||||||||||||||||||||||*|[ 35, 9 ]
    **********|||||||||||||||||********|[ 28, 10 ]
    **********|||||||||||||||||********|[ 28, 10 ]
    ***********|||||||||||||||||||*****|[ 31, 11 ]
    ******************|||||||||||||||**|[ 34, 18 ]
    ********************||||||||||||||*|[ 35, 20 ]
    ||||||||||||||||||||***************|[ 21, 35 ]
    |||||||||||||||||||||**************|[ 22, 35 ]
    ||||||||||||*******************|||||[ 13, 31 ]
    ||||||||||||||||||*************|||||[ 19, 31 ]
    ||||||||||||||||||||***********|||||[ 21, 31 ]
    |||||||||||||||||||||||||||||**|||||[ 30, 31 ]
    |||||||||||||||||||||*********||||||[ 22, 30 ]
    ||||||||||||||||||********||||||||||[ 19, 26 ]
    |||||||||||||||||||||||***||||||||||[ 24, 26 ]
    ||||||*******************|||||||||||[ 7, 25 ]
    ||||||******************||||||||||||[ 7, 24 ]
    |||||||||||||||*********||||||||||||[ 16, 24 ]
    ||||||||||||||||******||||||||||||||[ 17, 22 ]
    ||||||||||||||||||****||||||||||||||[ 19, 22 ]
    ********************||||||||||||||||[ 1, 20 ]
    |||||***************||||||||||||||||[ 6, 20 ]
    |||||||||*******||||||||||||||||||||[ 10, 16 ]
    |||||||||*******||||||||||||||||||||[ 10, 16 ]
    |||||||||||||***||||||||||||||||||||[ 14, 16 ]
    |**************|||||||||||||||||||||[ 2, 15 ]
    *************|||||||||||||||||||||||[ 1, 13 ]
    |||*********||||||||||||||||||||||||[ 4, 12 ]
    ||||||||||**||||||||||||||||||||||||[ 11, 12 ]
    |||||*****||||||||||||||||||||||||||[ 6, 10 ]
    *********|||||||||||||||||||||||||||[ 1, 9 ]
    ||||**||||||||||||||||||||||||||||||[ 5, 6 ]
    +-------->||||||||||||||||<--------| 9
    -+-------->||||||||||||||||<-------| 9
    --+------->||||||||||||||||||<-----| 8
    ---+------->||||||||||||||||||<----| 8
    ----+------>||||||||||||||||||||<--| 7
    -----+------->||||||||||||||||||<--| 8
    ------+-------->||||||||||||||||<--| 9
    -------+--------->||||||||||||||<--| 10
    --------+--------->||||||||||||||<-| 10
    ---------+---------->||||||||||||<-| 11
    <---------+--------->||||||||||||||| 10
    ||<--------+-------->||||||||||||||| 9
    ||||<-------+------->||||||||||||||| 8
    |||||<-------+------->|||||||||||||| 8
    |||||<--------+-------->|||||||||||| 9
    |||||<---------+--------->|||||||||| 10
    |||||||<--------+-------->|||||||||| 9
    |||||||||<-------+------->|||||||||| 8
    |||||||||||<------+------>|||||||||| 7
    |||||||||||<-------+------->|||||||| 8
    |||||||||||<--------+-------->|||||| 9
    |||||||||||<---------+--------->|||| 10
    |||||||||||||<--------+-------->|||| 9
    |||||||||||||||<-------+------->|||| 8
    |||||||||||||||||<------+------>|||| 7
    |||||||||||||||||||<-----+----->|||| 6
    |||||||||||||||||||<------+------>|| 7
    >||||||||||||||||||<-------+-------| 8
    >||||||||||||||||||||<------+------| 7
    >||||||||||||||||||||||<-----+-----| 6
    ->||||||||||||||||||||||<-----+----| 6
    ->||||||||||||||||||||||||<----+---| 5
    --->||||||||||||||||||||||<-----+--| 6
    ----->||||||||||||||||||||<------+-| 7
    ------->||||||||||||||||||<-------+| 8

    If you consider position 1.

    1. Moving right (wrapping), you have to go to at least position 21 in order to escape the clutches of the 9th range (35, 20), or the 24th range (1, 20);

      Which would be a half-distance of 20.

    2. Moving left (wrapping around), you need to move to position 24 to escape the 1st range (25, 1);

      Which is a half-distance of 12.

    Which means the final half-distance for position 1 should be 12; not 9 as produced by the OP code.

    Similarly, considering position 35

    1. Moving right, you have to go to at least position 21 in order to escape the clutches of the 9th range (35, 20);

      Which would be a half-distance of 21.

    2. Moving left, you need to move to position 24 to escape the 1st range (25, 1);

      Which is a half-distance of 11.

    Which means the final half-distance for position 35 should be 11; not 8 as produced by the OP code.

    You can work the others out for yourself.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I think you misunderstood the question, probably due to my poor explanation. You have to look at both sides at the same time within a single range. You can't pick one range that has a far right side and another that has a far left side. In other words: for each position i we are looking for the (half) size of the minimal (odd-sized) window which is centered around i and is not contained by any (single) range. You can't 'split' the containment between two ranges.

      Going back to the example of position 1:

      half size 0: there exists a range that covers 1,1
      half size 1: there exists a range that covers 35,2
      half size 2: there exists a range that covers 34,3
      ...
      half size 8: there exists a range that covers 28,9 (e.g. 28,10)
      half size 9: there is NO range that covers 27,10. DONE!

      I hope it's clearer now.

      (p.s. in your ASCII art it seems as if MAX=36, not 35, doesn't it?)

        Thanks for the clarification. Back to my drawing board :)

        (p.s. in your ASCII art it seems as if MAX=36, not 35, doesn't it?)

        No. The last vertical line of '|'s is just a border. Notice that there are no stars in it.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Can I speed this up? (repetitively scanning ranges in a large array)
by choroba (Cardinal) on Nov 02, 2010 at 15:43 UTC
    I used a short script to remove subranges of longer ranges (just 51 ranges survive from your corrected example); it takes less than a second on my machine to prune the ranges.
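
    One way such pruning can be done (a sketch of my own, not choroba's script): give every wrapped range two linearized copies, sort the copies by start, and keep only the ranges that have a copy reaching past everything seen so far.

    use strict;
    use warnings;

    my $max_length = 10;
    my @ranges = ( [ 1, 3 ], [ 2, 7 ], [ 10, 2 ] );

    my @copies;                               # [ start, end, index of original range ]
    for my $i ( 0 .. $#ranges ) {
        my ( $s, $e ) = @{ $ranges[$i] };
        if ( $s <= $e ) {
            push @copies, [ $s, $e, $i ];
        }
        else {                                # wrapped range: two copies
            push @copies, [ $s, $e + $max_length, $i ], [ $s - $max_length, $e, $i ];
        }
    }

    my %keep;
    my $furthest_end = -2 * $max_length;
    for my $c ( sort { $a->[0] <=> $b->[0] or $b->[1] <=> $a->[1] } @copies ) {
        if ( $c->[1] > $furthest_end ) {
            $keep{ $c->[2] } = 1;             # this range is not shadowed
            $furthest_end = $c->[1];
        }
    }

    my @pruned = @ranges[ sort { $a <=> $b } keys %keep ];
    printf "%d of %d ranges survive\n", scalar(@pruned), scalar(@ranges);
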
      That's a good idea that would surely help. However, I would have to check how much effect it will have.

      The example I gave is very small - less than 1000 ranges compared to the usual 25k. Also, I'm not sure if it represents the normal proportion of "contained" ranges (I would guess it's much lower than the 95% in this specific case).

      So, I will surely incorporate it and try it out, but I'm not confident it will have a great effect in the average case.

      UPDATE: It seems that after inlining the original object methods, filtering out a range takes about the same time as processing it (perhaps even longer). I have therefore stopped using this for the meantime, although it seemed like a very nice idea. Perhaps there's a more efficient way of filtering the ranges?

Re: Can I speed this up? (repetitively scanning ranges in a large array)
by salva (Canon) on Nov 03, 2010 at 17:06 UTC
    This may be very similar to the idea described by SuicideJunkie here... but in real Perl code!
    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.010;

    my $len = 87688;

    open my $rfh, '<', 'ranges.txt' or die $!;

    my (@start, @end);
    while (<$rfh>) {
        chomp;
        my ($start, $end) = split /\s+/;
        $start--; $end--;
        if ($start > $end) {
            my $start2 = $start - $len;
            push @start, ($start2 < -$end ? -$end : $start2);
            push @end, $end;
            $end += $len;
        }
        push @start, $start;
        push @end, $end;
    }

    # sort the ranges
    my @ix = sort { $start[$a] <=> $start[$b] or $end[$b] <=> $end[$a] } 0..$#start;

    # filter out shadowed ranges
    my @good;
    my $last_end = -1;
    for my $ix (@ix) {
        if ($end[$ix] > $last_end) {
            push @good, $ix;
            $last_end = $end[$ix];
        }
    }

    # create slopes
    my @top;
    my $last_ix = shift @good;
    for my $ix (@good) {
        my $x0 = int (($start[$last_ix] + $end[$last_ix] + 1) / 2);
        my $y0 = $end[$last_ix] - $x0;
        my $x1 = int (($start[$ix] + $end[$ix]) / 2);
        my $y01 = $x0 - $start[$ix];
        $last_ix = $ix;
        for my $x ($x0..$x1) {
            $top[$x] = ($y0 > $y01 ? $y0 : $y01);
            $y0--;
            $y01++;
        }
    }
    $#top = $len - 1;

    say $_ + 1 for @top;
    Note that it does not handle the case where there are points not covered by any range.
