go ahead... be a heretic PerlMonks

### Re: Re: Zipcode Proximity script

by fizbin (Chaplain)
 on Apr 01, 2003 at 04:33 UTC ( #247145=note: print w/replies, xml ) Need Help??

in reply to Re: Zipcode Proximity script

Here's a way to do all that prefiltering people keep mentioning: First, some helper procedures:
```use constant EARTH_RADIUS => 3956;
use constant PI           => 4 * atan2 1, 1;

sub make_quick_reject {
my(\$lat,\$long,\$dist) = @_;
my(\$maxlatdiff) = atan2(sqrt(1 - (\$distcos * \$distcos)), \$distcos);
if (abs(\$lat) + \$maxlatdiff > PI/2) {
# special case - near poles
return sub { abs(\$lat - \$_[0]) > \$maxlatdiff; };
}
my(\$longmult);
\$longmult = 1 / cos(abs(\$lat) + \$maxlatdiff);
my(\$maxlongdiff) = \$maxlatdiff * \$longmult;
return sub { abs(\$lat - \$_[0]) > \$maxlatdiff or abs(\$long - \$_[1]) >
+ \$maxlongdiff; }
}

sub make_quick_accept {
my(\$lat,\$long,\$dist) = @_;
my(\$maxlatdiff) = atan2(sqrt(1 - (\$distcos * \$distcos)), \$distcos) /
+ sqrt(2);
my(\$longmult);
if (abs(\$lat) < \$maxlatdiff ) {
# special case - near equator
\$longmult = 1;
} else {
\$longmult = 1 / cos(abs(\$lat) - \$maxlatdiff);
}
my(\$maxlongdiff) = \$maxlatdiff * \$longmult;
return sub { abs(\$lat - \$_[0]) < \$maxlatdiff and
abs(\$long - \$_[1]) < \$maxlongdiff; }
}
Then, I assume that you've loaded your data into something like this:
```# zipcode, lat, long
@data = (['00210', 43.00589, 71.0132], ['00211', 43.00589, 71.0132], .
+..);
And then done something like this to it:
```map { \$_->[1] *= PI/180; \$_->[2] *= PI / 180;} @data;
Then you can do something like this:
```\$distance = 25;

for \$i (0..\$#data) {
my \$quick_reject_routine =
make_quick_reject(\$data[\$i]->[1], \$data[\$i]->[2], \$distance);
my \$quick_accept_routine =
make_quick_accept(\$data[\$i]->[1], \$data[\$i]->[2], \$distance);
for \$record (grep(! \$quick_reject_routine->(\$_->[1],\$_->[2]),
@data[\$i+1..\$#data])) {
if (\$quick_accept_routine->(\$record->[1],\$record->[2])
or
long_acceptance_routine(\$record, \$data[\$i], \$distance) {
# write it to some file or other
}
}
}
Where for "long_calculation_routine", you substitute in essentially what you're doing now. (or better yet, something analogous to the simpler C code suggested in the post I made before logging in) Extending this to the case of checking several distances at once shouldn't be hard - just construct an array of quick reject and quick accept procedures, and then check through them in order before trying the long acceptance routine. Something like this:
```@distances = qw(5 10 25 50 100);
@distfiles = map { "0-\$_.txt" } @distances;

for \$i (0..\$#data) {
my @quick_reject_routines =
map {
make_quick_reject(\$data[\$i]->[1], \$data[\$i]->[2], \$_);
}
@distances;
my @quick_accept_routines =
map {
make_quick_accept(\$data[\$i]->[1], \$data[\$i]->[2], \$_);
}
@distances;
OTHERCITY:
for \$j (\$i..\$#data)) {
DISTANCE:
for \$d (0..\$#distances) {
if (\$quick_reject_routines[\$d]->(\$data[\$j][1],\$data[\$j][2])) {
next DISTANCE;
}
if (\$quick_accept_routines[\$d]->(\$data[\$j][1],\$data[\$j][2])
or
long_acceptance_routine(\$data[\$j], \$data[\$i], \$distance) {
map { write_csv_line(\$data[\$i], \$data[\$j], \$_);
write_csv_line(\$data[\$j], \$data[\$i], \$_); }
@distfiles[\$d..\$#distances];
next OTHERCITY;
}
}
}
}
Of course, this still contains extraneous calls to long_acceptance_routine, but my hope is that the routines long_acceptance_routine depends on will somehow be caching recent results (see the Memoize modules) and so there won't be the performance penalty.

Replies are listed 'Best First'.
Re: Zipcode Proximity script
by fizbin (Chaplain) on Apr 01, 2003 at 15:32 UTC

Whoops. This is what I get for coding after my brain has already shut down for nightly maintenance.

Those quick_accept and quick_reject routines will give false rejections with longitudes near the 180 degree longitude. (not an issue with US locations, I know, but...) Also, there's a stupid bit of time-wasting going on near the top there.

Let's see if I can get it right this time:

```sub make_quick_reject {
my(\$lat,\$long,\$dist) = @_;
if (abs(\$lat) + \$maxlatdiff > PI/2) {
# special case - near poles
return sub { abs(\$lat - \$_[0]) > \$maxlatdiff; };
}
my(\$longmult);
\$longmult = 1 / cos(abs(\$lat) + \$maxlatdiff);
my(\$maxlongdiff) = \$maxlatdiff * \$longmult;
return sub { abs(\$lat - \$_[0]) > \$maxlatdiff or
( abs(\$long - \$_[1]) > \$maxlongdiff and
abs(\$long - \$_[1]) < 2*PI - \$maxlongdiff ); };
}

sub make_quick_accept {
my(\$lat,\$long,\$dist) = @_;
\$maxlatdiff /= sqrt(2);
my(\$longmult);
if (abs(\$lat) < \$maxlatdiff ) {
# special case - near equator
\$longmult = 1;
} else {
\$longmult = 1 / cos(abs(\$lat) - \$maxlatdiff);
}
my(\$maxlongdiff) = \$maxlatdiff * \$longmult;
return sub { abs(\$lat - \$_[0]) < \$maxlatdiff and
( abs(\$long - \$_[1]) < \$maxlongdiff or
abs(\$long - \$_[1]) > 2*PI - \$maxlongdiff ); };
}

And just again as a reminder, the sub's produced by these two routines accept longitudes and latitudes in RADIANS, not degrees. (Though reworking them to use degrees instead of radians should not be difficult at all)

And if anyone uses this code without understanding how it works, they deserve what they get. No warranty of anything useful implied at all; the code may not even pass perl -c.

Re^3: Zipcode Proximity script
by Anonymous Monk on Apr 18, 2010 at 06:33 UTC
instead of my \$maxlatdiff = atan2(sqrt(1 - (\$distcos * \$distcos)), \$distcos); you could do
```use Math::Trig;
my \$maxlatdiff = acos(\$distcos);

Create A New User
Node Status?
node history
Node Type: note [id://247145]
help
Chatterbox?
 [Discipulus]: yes erix when you specify a Tk geometry 10x10 it goes from 0,0 to 10,10 aka 11x11

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (9)
As of 2018-04-24 09:02 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
My travels bear the most uncanny semblance to ...

Results (85 votes). Check out past polls.

Notices?