in reply to Re: Zipcode Proximity script in thread Zipcode Proximity script
Here's a way to do all that prefiltering people keep mentioning:
First, some helper procedures:
use constant EARTH_RADIUS => 3956;
use constant PI => 4 * atan2 1, 1;
sub make_quick_reject {
# Args: lat (radians), long (radians), dist
my($lat,$long,$dist) = @_;
my($distcos) = cos($dist / EARTH_RADIUS);
my($maxlatdiff) = atan2(sqrt(1  ($distcos * $distcos)), $distcos);
if (abs($lat) + $maxlatdiff > PI/2) {
# special case  near poles
return sub { abs($lat  $_[0]) > $maxlatdiff; };
}
my($longmult);
$longmult = 1 / cos(abs($lat) + $maxlatdiff);
my($maxlongdiff) = $maxlatdiff * $longmult;
return sub { abs($lat  $_[0]) > $maxlatdiff or abs($long  $_[1]) >
+ $maxlongdiff; }
}
sub make_quick_accept {
# Args: lat (radians), long (radians), dist
my($lat,$long,$dist) = @_;
my($distcos) = cos($dist / EARTH_RADIUS);
my($maxlatdiff) = atan2(sqrt(1  ($distcos * $distcos)), $distcos) /
+ sqrt(2);
my($longmult);
if (abs($lat) < $maxlatdiff ) {
# special case  near equator
$longmult = 1;
} else {
$longmult = 1 / cos(abs($lat)  $maxlatdiff);
}
my($maxlongdiff) = $maxlatdiff * $longmult;
return sub { abs($lat  $_[0]) < $maxlatdiff and
abs($long  $_[1]) < $maxlongdiff; }
}
Then, I assume that you've loaded your data into something like this:
# zipcode, lat, long
@data = (['00210', 43.00589, 71.0132], ['00211', 43.00589, 71.0132], .
+..);
And then done something like this to it:
map { $_>[1] *= PI/180; $_>[2] *= PI / 180;} @data;
Then you can do something like this:
$distance = 25;
for $i (0..$#data) {
my $quick_reject_routine =
make_quick_reject($data[$i]>[1], $data[$i]>[2], $distance);
my $quick_accept_routine =
make_quick_accept($data[$i]>[1], $data[$i]>[2], $distance);
for $record (grep(! $quick_reject_routine>($_>[1],$_>[2]),
@data[$i+1..$#data])) {
if ($quick_accept_routine>($record>[1],$record>[2])
or
long_acceptance_routine($record, $data[$i], $distance) {
# write it to some file or other
}
}
}
Where for "long_calculation_routine", you substitute in essentially what you're doing now.
(or better yet, something analogous to the simpler C code suggested in the post I made
before logging in)
Extending this to the case of checking several distances at once shouldn't be hard  just
construct an array of quick reject and quick accept procedures, and then check through
them in order before trying the long acceptance routine.
Something like this:
@distances = qw(5 10 25 50 100);
@distfiles = map { "0$_.txt" } @distances;
for $i (0..$#data) {
my @quick_reject_routines =
map {
make_quick_reject($data[$i]>[1], $data[$i]>[2], $_);
}
@distances;
my @quick_accept_routines =
map {
make_quick_accept($data[$i]>[1], $data[$i]>[2], $_);
}
@distances;
OTHERCITY:
for $j ($i..$#data)) {
DISTANCE:
for $d (0..$#distances) {
if ($quick_reject_routines[$d]>($data[$j][1],$data[$j][2])) {
next DISTANCE;
}
if ($quick_accept_routines[$d]>($data[$j][1],$data[$j][2])
or
long_acceptance_routine($data[$j], $data[$i], $distance) {
map { write_csv_line($data[$i], $data[$j], $_);
write_csv_line($data[$j], $data[$i], $_); }
@distfiles[$d..$#distances];
next OTHERCITY;
}
}
}
}
Of course, this still contains extraneous calls to long_acceptance_routine, but my hope is
that the routines long_acceptance_routine depends on will somehow be caching recent results
(see the Memoize modules) and so there won't be the performance penalty.
Re: Zipcode Proximity script by fizbin (Chaplain) on Apr 01, 2003 at 15:32 UTC 
Whoops. This is what I get for coding after my brain has already shut down for nightly maintenance.
Those quick_accept and quick_reject routines will give false rejections with longitudes near the 180 degree longitude. (not an issue with US locations, I know, but...) Also, there's a stupid bit of timewasting going on near the top there.
Let's see if I can get it right this time:
sub make_quick_reject {
# Args: lat (radians), long (radians), dist
my($lat,$long,$dist) = @_;
my($maxlatdiff) = $dist / EARTH_RADIUS;
if (abs($lat) + $maxlatdiff > PI/2) {
# special case  near poles
return sub { abs($lat  $_[0]) > $maxlatdiff; };
}
my($longmult);
$longmult = 1 / cos(abs($lat) + $maxlatdiff);
my($maxlongdiff) = $maxlatdiff * $longmult;
return sub { abs($lat  $_[0]) > $maxlatdiff or
( abs($long  $_[1]) > $maxlongdiff and
abs($long  $_[1]) < 2*PI  $maxlongdiff ); };
}
sub make_quick_accept {
# Args: lat (radians), long (radians), dist
my($lat,$long,$dist) = @_;
my($maxlatdiff) = $dist / EARTH_RADIUS;
$maxlatdiff /= sqrt(2);
my($longmult);
if (abs($lat) < $maxlatdiff ) {
# special case  near equator
$longmult = 1;
} else {
$longmult = 1 / cos(abs($lat)  $maxlatdiff);
}
my($maxlongdiff) = $maxlatdiff * $longmult;
return sub { abs($lat  $_[0]) < $maxlatdiff and
( abs($long  $_[1]) < $maxlongdiff or
abs($long  $_[1]) > 2*PI  $maxlongdiff ); };
}
And just again as a reminder, the sub's produced by these two routines accept longitudes and latitudes in RADIANS, not degrees. (Though reworking them to use degrees instead of radians should not be difficult at all)
And if anyone uses this code without understanding how it works, they deserve what they get. No warranty of anything useful implied at all; the code may not even pass perl c.  [reply] [d/l] 
Re^3: Zipcode Proximity script by Anonymous Monk on Apr 18, 2010 at 06:33 UTC 
instead of my $maxlatdiff = atan2(sqrt(1  ($distcos * $distcos)), $distcos); you could do use Math::Trig;
my $maxlatdiff = acos($distcos);
 [reply] [d/l] [select] 
