comment on

I would spend a lot of time pre-processing to cut down on the amount of looping you have to do.

In describing time complexity I'll use M for the number of points you have and N for the number of rectangles.

First, wrap every polygon in a rectangle as described in this tip. This is O(n).

Then, create four lists of rectangles sorted by xMin, xMax, yMin, and yMax. This is O(n log n) if you use quicksort.

Now, you cluster the rectangles into super-rectangles. How you do this is up to you, and should probably be informed by your data. If its evenly distributed you can do this in essentially O(n) time by simply using a grid system.

If it is not evenly distributed, you do something like the following recursively: for any super-rectangle, roll randomly from the set of xMin, xMax, yMin, and yMax. Go to the median of the list you just selected (this should be constant time if your lists are backed by arrays). Draw a line there separating half of the points on one side of the line and half on the other. i.e. if you picked yMax you draw a horizontal line there. Then you use an O(n) loop to assign rectangles to either or both of the new slightly-less-super rectangles. If a rectangle contains more than some arbitrary threshhold of your total number of rectangles, you recursively subdivide it. Otherwise, you stop that level of the recursion and add that superrectangle to your superrectangle list. (Note, you'll want a sanity check on your recursion depth here, as it is possible but unlikely in practice that this will loop forever otherwise)

So now you've got a list of super-rectangles, each of which has a list of sub-rectangles, each of which is associated with a polygon. Following so far? Now sort the list of super-rectangles by number of rectangles (smallest to largest), and sort each list of sub-rectangles by area (largest to smallest). The goal of both of these sorts is to maximize the probability that you will quickly find one qualifying polygon and be able to kill the search, skipping huge swathes of your M*N search space.

Then, and only then, you get to (forgive non-Perl pseudocode),

for(foo in points) 
  for(bar in superRectangles) 
  if (bar contains foo) 
    for (foobar in subRectangles of bar) 
      if (foobar contains foo) 
         if (polygon of foobar contains foo) 
            do whatever
            loop to next point
[download]

In reply to Re^2: Speeding up point-in-polygon by Anonymous Monk
in thread Speeding up point-in-polygon by punkish

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


No such thing as a small change
	PerlMonks