Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

Re^2: Speeding up point-in-polygon -- take two

by punkish (Priest)
on Aug 28, 2006 at 19:38 UTC ( #570034=note: print w/ replies, xml ) Need Help??


in reply to Re: Speeding up point-in-polygon -- take two
in thread Speeding up point-in-polygon -- take two

If this is a task you're going to do repeatedly, and your polygon list is relatively static, then you should consider redefining the problem. I suspect that the loop and indexing operations my be consuming a considerable amount of your time. So maybe you can change the problem to unroll the loops.
Yes, the polys are relatively static, although the points are not so. Nevertheless, I do indeed unroll both polys and points, and move them into the SQLite db. That way I avoid going back to the files repeatedly.

You are correct about "loop and indexing operations my be consuming a considerable amount of your time". Below is the output from my latest run of 5000 polys using Devel::Profiler. More than 220k points were evaluated and updated, and my point-in-poly test was performed 220k times.

Total Elapsed Time = 251.1784 Seconds            
  User+System Time = 189.3034 Seconds            
Exclusive Times            
%Time  ExclSec  CumulS  #Calls  sec/call    Csec/c    Name
72.3   136.9    136.91  220726  0.0006      0.0006  main::_pointIsInPolygon
9.09    17.21    17.216   5001  0.0034      0.0034  DBI::st::fetchall_arrayref
6.25    11.82    11.825  13887  0.0009      0.0009  DBI::st::execute
4.84     9.162  178.99       1  9.1622    178.99    main::updatePoints
1.82     3.442    3.442   3885  0.0009      0.0009  DBI::db::commit
0.16     0.298    0.298   5001  0.0001      0.0001  DBI::st::fetchrow_array
0.07     0.141    0.141   3887  0           0       DBI::st::finish
0.01     0.01     0.01       3  0.0033      0.0033  IO::Handle::new
0        0        0.01       5  0           0.002   Geo::ShapeFile::get_bytes
0        0        0          6  0           0       DBI::_new_handle
0        0        0.01       5  0           0.002   Geo::ShapeFile::get_handle
0        0        0          1  0           0       DBI::connect
0        0        0          4  0           0       DBI::db::prepare
0        0        0          4  0           0       DBI::_new_sth
0        0        0          3  0           0       Geo::ShapeFile::dbf_handle  
What I really have to figure out is to reduce those tests. I have an idea which I have to now figure out how to implement. Basically, it is like so --

Once I get a bunch of potential points within the rect of a poly, I should find the 4 outermost points in each dimension. Once I determine those four definitively, all points within those 4 would be also within that poly, and I wouldn't have to do the checks for them.

Now the hard part -- realize the above hypothesis.

--

when small people start casting long shadows, it is time to go to bed


Comment on Re^2: Speeding up point-in-polygon -- take two
Re^3: Speeding up point-in-polygon -- take two
by bobf (Monsignor) on Aug 28, 2006 at 21:18 UTC

    Once I get a bunch of potential points within the rect of a poly, I should find the 4 outermost points in each dimension. Once I determine those four definitively, all points within those 4 would be also within that poly, and I wouldn't have to do the checks for them.

    If I understand you correctly, then given the polygon below and the four points marked as dots ("."), wouldn't that method determine that the point marked with an "x" is inside the polygon?

    _________ | . ./ | / | /x | \ | \ | . .\ ---------

    If all of your polygons were guananteed to have internal angles > 90 degrees that might work, but you only said that they were "irregular". I had a similar idea yesterday (except mine was finding the maximum internal bounding rectangle for the polygon rather than using the query points themselves), and discarded it for this reason. :-)

      yes, you are correct. Since I posted the above, I too figured all manner of cases when my assumption wouldn't work. Shucks, and thanks for confirming.
      --

      when small people start casting long shadows, it is time to go to bed
Re^3: Speeding up point-in-polygon -- take two
by roboticus (Canon) on Aug 29, 2006 at 15:19 UTC
    punkish:

    What I really have to figure out is to reduce those tests.
    That's why I made my original suggestion. There are two ways that breaking the polygons into simple shapes will help you:

    • With known shapes, your _pointIsInPolygon function is simpler, so you can use quick exits and/or the compiler can optimize it, and
    • you can increase your density of polygon vs. bounding boxes.
    In your original _pointIsInPolygon, you use variable indices into your vertex arrays, as well as adding a loop with it's inherent termination condition. By breaking your polygons into triangles or quadrilaterals, you can use fixed indices and omit the loop. That may significantly increase the subroutines speed. Also, your algorithm requires that you examine every edge in the polygon. By enforcing an order to the edges, you may be able to return early from the routine by taking advantage of that knowledge.

    Regarding the 'density of polygon vs. bounding boxes': A rectangle aligned with your X and Y axes has a 100% chance of containing a point returned by your SELECT statement. An arbitrary triangle has a 50% chance of containing a point in the bounding rectangle. So breaking your polygons into triangles and (aligned) rectangles will have a worst case coverage of 50%. And a higher coverage density directly correlates to a lower false-positive rate. (I.e., with a higher coverage, you'll need fewer tests because your SELECT will hit fewer bounding boxes.) With arbitrary polygons, you can have nearly *any* hit rate:

    *--------------------* *--------------------* *--------------------* | | \ / |*------------------*| | | \ / || || | ** \ / || || *-------------------* *-------------* ** ** ~99% ~90 ~32%

    While I still think my original suggestion would be interesting, you state:

    Yes, the polys are relatively static, although the points are not so. Nevertheless, I do indeed unroll both polys and points, and move them into the SQLite db. That way I avoid going back to the files repeatedly.
    By this, I'm assuming you mean that the points are in the same rough relative position in the polygons, meaning that (1) the points are connected in the same order, and (2) the edges will never cross. For example, the shape on the left could morph into the shape in the center, but not the one on the right:

    a----g g g / | / \ / \ / | / \ / \ b | == / \ != / \ \ e---f / \ / \ \ \ a-b f a c e c---d / / \/| / c---d / /\| / \ / / b / e d-----f
    In that case, breaking the polygons into X-aligned trapezoids (my original suggestion) may be a bit too restrictive. Perhaps using arbitrary quadrilaterals (or triangles) would give you (a) enough coverage density (i.e.just breaking the shape into quadrilaterals instead of trapezoids may simplify things enough. So you'd increase your coverage, and by breaking the polygons into triangles, you could still simplify your _pointIsInPolygon subroutine to eliminate the loops and variable indices.

    --roboticus

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://570034]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (7)
As of 2014-09-19 06:06 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (131 votes), past polls