Reading file into an array and working with it.

Speedfreak has asked for the wisdom of the Perl Monks concerning the following question:

Hej All,

One day I am going to get this nailed but until then, can someone please guide me in the ways...

Here's the problem:

I have a CSV file which contains a list of places plus their position on the planet in Latitude and Longitude, stored as decimal.

Now, the principle is to select one of these locations and return a list of places within a defined square area of a certain size.

First problem is that I have to open this CSV file and search it for the place selected to get its Lat/Lon pair. I then have to re-search it for all the places whose position is close, based on the search area.

My theory was to create a 2 dimensional array (sort of rows/cols if you will), by reading lines in, splitting them by the delimiter and putting the values in the second dimension. I would then index a line by the first.

However, reading my Perl docs and the books I have, I can't seem to find how to do this.

However I do it, I need to complete the following steps.

1) Open the CSV file and pull it into an array.
2) Go through each element of the array to find a matching place name and then get its Lat/Lon from a column on that row.
3) Re-search the array for places whose position lies within a predefined area around the chosen place.

Can anyone advise me on this? I am completely lost. I don't want to have to parse the file from disk twice; putting it into an array once and searching that twice seems to me to take less disk overhead than using a simple SQL query.

- Jed

Re: Reading file into an array and working with it. by httptech (Chaplain) on Jun 25, 2000 at 21:10 UTC
What I would do is use Text::CSV to do the actual parsing, then push each array returned as an array reference onto a master array of locations:

```perl
use Text::CSV;

my @locations;
my $csv = Text::CSV->new();
open (FILE, "locations.csv") or die "Couldn't open location file: $!";
while (<FILE>) {
    $csv->parse($_);
    push(@locations, [$csv->fields]);   # one array reference per row
}
close FILE;
```
RE: Re: Reading file into an array and working with it. by davorg (Chancellor) on Jun 26, 2000 at 12:25 UTC
Might be worth pointing out that on CPAN there is now a `Text::CSV_XS` which, as its name implies, does the parsing in C code. It is therefore much faster than the pure Perl implementation in `Text::CSV`. -- <http://www.dave.org.uk> European Perl Conference - Sept 22/24 2000 <http://www.yapc.org/Europe/>
RE: RE: Re: Reading file into an array and working with it. by httptech (Chaplain) on Jun 26, 2000 at 16:29 UTC
Hmm, I wonder if DBD::CSV could use it as a backend... and whether it would speed DBD::CSV up, because it is pretty slow.
RE: RE: RE: Re: Reading file into an array and working with it. by davorg (Chancellor) on Jun 26, 2000 at 16:44 UTC
RE: RE: RE: RE: Re: Reading file into an array and working with it. by httptech (Chaplain) on Jun 26, 2000 at 16:53 UTC
Some notes below your chosen depth have not been shown here
Re: Reading file into an array and working with it. by chromatic (Archbishop) on Jun 25, 2000 at 21:17 UTC
Anytime you find yourself iterating through an array, looking for a specific value, you should stop and ask "Would a hash be better here?" What I would do:

1) Open the file.
2) For each line, split it into $name and $location.
3) Put the data into two hashes -- one in the format $name => $location, the other $location => $name.
4) Close the file.
5) Look up the location by name from the first hash.
6) Write a couple of loops to cover the coordinates of the predefined area around the place (add to and subtract from the Lat/Lon values).
7) Look up names (if they exist) by location in the second hash.

You might also look at the DBM file modules included with Perl, like DB_File, GDBM_File, NDBM_File, and ODBM_File. Also be aware that parsing many CSV files with a regex (even in a split statement) is tricky, so Text::CSV may come in handy.
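A minimal sketch of the two-hash approach described above, assuming whole-degree coordinates so that a location can be used directly as a hash key. The sample data, the "lat,lon" key format and the sub name `places_near` are made up for illustration, and the second hash holds a list of names per location because several places may share one.

```perl
use strict;
use warnings;

# Hypothetical sample data: name, latitude, longitude (whole degrees).
my @lines = (
    "Applegate,39,-121",
    "Arbuckle,39,-122",
    "Lakeport,39,-123",
    "Hico,32,-99",
);

my (%location_of, %names_at);
for my $line (@lines) {
    my ($name, $lat, $lon) = split /,/, $line;
    $location_of{$name} = "$lat,$lon";
    # Several places can share a location, so keep a list per key.
    push @{ $names_at{"$lat,$lon"} }, $name;
}

# Return the names of all other places within $range degrees of $place.
sub places_near {
    my ($place, $range) = @_;
    return unless exists $location_of{$place};
    my ($lat, $lon) = split /,/, $location_of{$place};
    my @found;
    for my $la ($lat - $range .. $lat + $range) {
        for my $lo ($lon - $range .. $lon + $range) {
            my $names = $names_at{"$la,$lo"} or next;
            push @found, grep { $_ ne $place } @$names;
        }
    }
    return sort @found;
}

print join(", ", places_near("Arbuckle", 1)), "\n";
```

With decimal coordinates the keys would first have to be rounded to some grid size, otherwise the loops cannot enumerate every possible location.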
Re: Reading file into an array and working with it. by davorg (Chancellor) on Jun 25, 2000 at 21:16 UTC
Sounds like you need to look at `perldoc perllol` and `perldoc perldsc`, both of which cover this in some detail. In summary, you'd do something like this (assuming your delimiter is a tab):

```perl
my @data;
open(DATA, $file) || die "Can't open $file: $!\n";
while (<DATA>) {
    chomp;                      # drop the newline so the last field is clean
    push @data, [split /\t/];   # one array reference per row
}
```

You would then access the various elements like this:

```perl
my $town = 'London';
my ($lat, $long);
foreach (@data) {
    if ($_->[0] eq $town) {
        ($lat, $long) = ($_->[1], $_->[2]);
        last;
    }
}
```

However, if this is how you are going to be using the data, then you might be better off building a hash of arrays, where the key to the hash is the town name and the value is a two element list containing lat and long. You would construct that something like this:

```perl
my %data;
open(DATA, $file) || die "Can't open $file: $!\n";
while (<DATA>) {
    chomp;
    my ($town, @vals) = split(/\t/);
    $data{$town} = \@vals;      # town name => [lat, long]
}
```

You could then get the lat and long for a particular town like this:

```perl
my ($lat, $long) = @{$data{$town}};
```

Hope this helps -- <http://www.dave.org.uk> European Perl Conference - Sept 22/24 2000 <http://www.yapc.org/Europe/>
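As a rough illustration of the remaining step (finding the places around the chosen town), here is a sketch that builds the same kind of hash of arrays and then filters it with `grep`. The sample towns, their coordinates and the sub name `towns_in_box` are assumptions made for the example, and the data is read from an in-memory string rather than a real file.

```perl
use strict;
use warnings;

# Hypothetical tab-separated sample: town, latitude, longitude (decimal degrees).
my $text = join("\n",
    "London\t51.51\t-0.13",
    "Watford\t51.66\t-0.40",
    "Brighton\t50.82\t-0.14",
    "Paris\t48.86\t2.35",
) . "\n";

my %data;
open(my $fh, '<', \$text) || die "Can't open in-memory data: $!\n";
while (<$fh>) {
    chomp;
    my ($town, @vals) = split /\t/;
    $data{$town} = \@vals;      # town name => [lat, long]
}
close $fh;

# Return the other towns whose lat and long both lie within
# +/- $half degrees of the given town.
sub towns_in_box {
    my ($town, $half) = @_;
    my $centre = $data{$town} or return;
    my ($lat, $long) = @$centre;
    my @hits = grep {
        $_ ne $town
            && abs($data{$_}[0] - $lat)  <= $half
            && abs($data{$_}[1] - $long) <= $half
    } keys %data;
    return sort @hits;
}

print join(", ", towns_in_box('London', 0.5)), "\n";
```

The file is read only once; the lookup of the chosen town is a single hash access, and only the area search walks the whole data set.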
Re: Reading file into an array and working with it. by Ovid (Cardinal) on Jun 25, 2000 at 22:35 UTC
Well, my answer is probably overkill, but I found the problem so much fun that I wrote the code for it. This may not be the most efficient way of getting your answer, but it's what I came up with. Any optimization advice would be great! Here's the sample data file I created:

```
Mt. Wrangell,AK,62N,144W
Hico,TX,32N,99W
Neotsu,OR,45N,124W
Applegate,CA,39N,121W
Arbuckle,CA,39N,122W
Lakeport,CA,39N,123W
Hot Springs,CA,40N,121W
```

Here's the program:

```perl
#!/usr/bin/perl -w
use strict;

my $data = "test.txt";              # here is the raw latitude/longitude data
my ($xlat, $xlon) = qw(39N 122W);   # here's what we'll search for, +/- $variance
my (%lat_lon, @lat, @lon, @final_lat, @final_lon, %dups);
my $variance = 1;                   # change this to the degree variance desired

open (DATA, "<$data") || die "Can't open $data for reading: $!\n";
while (<DATA>) {
    chomp;
    my ($city, $state, $lat, $lon) = split /,/;
    $lat_lon{$lat}{$lon}->[0] = $city;
    $lat_lon{$lat}{$lon}->[1] = $state;
}
close (DATA) || die "Can't close $data: $!\n";

# find all lats which are +/- $variance of target lat
foreach my $lat_key (keys %lat_lon) {
    my ($lat, $lat_NS, $xlat_NS);
    $lat = $1, $lat_NS = $2 if $lat_key =~ /^(\d{1,3})([NS])$/o;
    $xlat_NS = $2 if $xlat =~ /^(\d{1,3})([NS])$/o;
    if ($xlat_NS eq $lat_NS) {
        # $1 now holds the degrees of the target ($xlat), from the last match
        push (@lat, $lat_key) if ($1 <= $lat + $variance) && ($1 >= $lat - $variance);
    }
}

# find all lons which are +/- $variance of target lon
foreach my $good_lat (@lat) {
    foreach my $lon_key (keys %{$lat_lon{$good_lat}}) {
        my ($lon, $lon_WE, $xlon_WE);
        $lon = $1, $lon_WE = $2 if $lon_key =~ /^(\d{1,3})([WE])$/o;
        $xlon_WE = $2 if $xlon =~ /^(\d{1,3})([WE])$/o;
        if ($xlon_WE eq $lon_WE) {
            # $1 now holds the degrees of the target ($xlon), from the last match
            push (@lon, $lon_key) if ($1 <= $lon + $variance) && ($1 >= $lon - $variance);
        }
    }
}

# remove duplicate latitudes and longitudes
foreach (@lat) {
    push (@final_lat, $_) unless $dups{$_}++;
}
foreach (@lon) {
    push (@final_lon, $_) unless $dups{$_}++;
}
```

The two arrays, @final_lat and @final_lon, will contain all of the unique latitudes and longitudes which are within $variance of your target latitude and longitude.

Incidentally, the information that you were looking for regarding multi-dimensional arrays is found in Programming Perl, by O'Reilly Books, Second Edition, starting on page 257. You'll want to read through that to see what I was doing with a hash of hashes of arrays.

Cheers!

Update: As mentioned in previous answers, you'll want to check out Text::CSV. My example above will fail if you have a data field with an embedded comma: `"Some city, County", State, 45N, 120W` The quotes will also cause problems if you're trying to eliminate them.
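One limitation worth noting: the program above only compares coordinates that share a hemisphere letter, so `1S` is never considered close to `1N`. A possible way around that, sketched here with two hypothetical helpers (`signed_degrees` and `within` are not part of the program above), is to convert each coordinate to a signed number first.

```perl
use strict;
use warnings;

# Convert a coordinate such as "39N" or "122W" to a signed number of degrees.
# South and West become negative; returns undef for unrecognised input.
sub signed_degrees {
    my ($coord) = @_;
    return undef unless defined $coord && $coord =~ /^(\d{1,3})([NSEW])$/;
    return ($2 eq 'S' || $2 eq 'W') ? -$1 : $1 + 0;
}

# True if $coord lies within $variance degrees of $target.
sub within {
    my ($coord, $target, $variance) = @_;
    my ($c, $t) = (signed_degrees($coord), signed_degrees($target));
    return 0 unless defined $c && defined $t;
    return abs($c - $t) <= $variance ? 1 : 0;
}

print within('1S', '1N', 2) ? "match\n" : "no match\n";
```

This sketch does not check that a latitude is only compared with a latitude, and it does not handle longitudes that wrap around at 180 degrees.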
Re: Reading file into an array and working with it. by merlyn (Sage) on Jun 26, 2000 at 02:47 UTC
Check out DBD::RAM - SQL on a CSV file! -- Randal L. Schwartz, Perl hacker
Re: Reading file into an array and working with it. by lhoward (Vicar) on Jun 26, 2000 at 02:27 UTC
You should really consider using a database for this kind of thing if you're going to be doing it more than occasionally. I implemented a "location search" on a website that I developed using this methodology and it works great:

1) Find the latitude/longitude of the location the searcher is interested in.
2) Determine the latitude/longitude of the "bounding box" that bounds an area N miles from the latitude/longitude of the search location. Be sure to take the curvature of the earth into account when doing this computation.
3) Select all locations of interest from the DB with a simple SQL query: `select * from LOCATION where LATITUDE between LATMIN and LATMAX and LONGITUDE between LONMIN and LONMAX`
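The bounding-box step could be sketched roughly as follows. This is only an approximation under stated assumptions: the Earth is treated as a sphere with a mean radius of about 3958.8 miles, the sub name `bounding_box` is made up for the example, and the box is not corrected near the poles or where it crosses the 180 degree meridian.

```perl
use strict;
use warnings;
use Math::Trig qw(pi deg2rad);

# Mean Earth radius in miles (approximate; the Earth is treated as a sphere).
use constant EARTH_RADIUS_MILES => 3958.8;

# Return (LATMIN, LATMAX, LONMIN, LONMAX) for a box extending $miles
# north, south, east and west of the point ($lat, $lon), in decimal degrees.
sub bounding_box {
    my ($lat, $lon, $miles) = @_;
    my $miles_per_degree = 2 * pi * EARTH_RADIUS_MILES / 360;
    my $dlat = $miles / $miles_per_degree;
    # A degree of longitude shrinks by cos(latitude) away from the equator.
    my $dlon = $miles / ($miles_per_degree * cos(deg2rad($lat)));
    return ($lat - $dlat, $lat + $dlat, $lon - $dlon, $lon + $dlon);
}

my ($latmin, $latmax, $lonmin, $lonmax) = bounding_box(51.51, -0.13, 25);
printf "lat %.3f to %.3f, lon %.3f to %.3f\n", $latmin, $latmax, $lonmin, $lonmax;
```

The four values returned would then be bound to LATMIN, LATMAX, LONMIN and LONMAX in the query above.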
