Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Re: searching polygons not merged

by erix (Parson)
on Oct 27, 2018 at 20:27 UTC ( #1224777=note: print w/replies, xml ) Need Help??


in reply to searching polygons not merged

How can I generate an example polygon collection? (i.e., what does "10000 EA" mean?)

I wouldn't be surprised if searching for overlapping polygons in a database of indexed polygons would be fast. If you had an example of "not that good performance and stupid way", I could see if a database search would indeed be better and "less stupid".

Replies are listed 'Best First'.
Re^2: searching polygons not merged
by LanX (Archbishop) on Oct 27, 2018 at 21:23 UTC
    How do you represent polygons in a database, such that searching for overlaps becomes "fast"?

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

      To illustrate polygon (a standard postgresql data type), polygon-comparison (here: overlap), and polygon-indexing (with which my practical experience is pretty much zero - caveat emptor!), I lifted some sql from the standard regression tests in the postgres source tree ( src/test/regress/sql/polygon.sql ), and messed about with it a bit, and added comments:

      Note: the postgres polygon overlap operator is &&

      #!/bin/bash echo " drop table if exists quad_poly_tbl ; create table quad_poly_tbl (id int, p polygon); insert into quad_poly_tbl select (x - 1) * 100 + y, polygon(circle(point(x * 10, y * 10), 1 + +(x + y) % 10)) from generate_series(1, 100) x, generate_series(1, 100) y ; insert into quad_poly_tbl select i, polygon '((200, 300),(210, 310),(230, 290))' from generate_series(10001, 11000) AS i ; analyze quad_poly_tbl; -- search for overlap with this polygon: select * from quad_poly_tbl where p && '((22,640),(23.0717967697245,64 +4),(26,646.928203230275),(30,648),(34,646.928203230275),(36.928203230 +2755,644),(38,640))'::polygon ; --> Time: 1.382 ms -- Seq Scan on quad_poly_tbl (3 MB) -- now add index: create index quad_poly_tbl_idx ON quad_poly_tbl USING spgist(p); -- search again for overlap with this polygon but now WITH the polygon +-index present: select * from quad_poly_tbl where p && '((22,640),(23.0717967697245,64 +4),(26,646.928203230275),(30,648),(34,646.928203230275),(36.928203230 +2755,644),(38,640))'::polygon ; --> Time: 0.271 ms -- Bitmap Index Scan on quad_poly_tbl_idx (1 MB) " | psql -qa

      So the difference between seqscan and polygon-index in this test (searching for 6 matching rows in a table of 11000 rows) is:

      --> Time: 1.382 ms -- Seq Scan on quad_poly_tbl (3 MB) --> Time: 0.271 ms -- Bitmap Index Scan on quad_poly_tbl_idx (1 MB)

      For the OP's question, of course, loading data into the database, etc., should be taken into account.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1224777]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (9)
As of 2019-11-12 16:49 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Strict and warnings: which comes first?



    Results (66 votes). Check out past polls.

    Notices?