
OpenStreetMap API : Worker and Plugins architecture

by bliako (Priest)
on May 15, 2019 at 08:47 UTC ( #11100008=perlquestion )

bliako has asked for the wisdom of the Perl Monks concerning the following question:

I am currently interested in fetching OpenStreetMap (OSM) data using the Overpass API. This is data regarding roads and cities: traffic lights, roundabouts, amenities, residential addresses. The wealth of data is amazing, and so is its correctness, judging from what I can verify in my locality. (I can't really thank enough those who came up with the idea, implemented it and collected the data.)

I soon came to the, possibly wrong, conclusion that learning the API is hard. What I noticed is that a lot of people just ask how to implement a specific task, e.g. how to fetch all traffic lights data within a bounding box. And so I soon gave up on implementing a monolithic package which abstracts the queries, does the fetching, and cleans or exports the data. Instead I wrote it so that it does not know how to run any specific query (e.g. fetch traffic lights) but will run one if the user supplies the query content (via a plugin).
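To make "query content" concrete, here is a minimal sketch of the kind of text a plugin would hand to the engine for the traffic-lights case. The helper name build_traffic_signals_query is hypothetical and not part of any module mentioned here. It assumes the usual OSM tagging (highway=traffic_signals on nodes) and the Overpass QL bounding-box order (south,west,north,east).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Geo::OSM::Overpass.
# Builds an Overpass QL query that selects traffic-signal nodes
# inside a bounding box.
sub build_traffic_signals_query {
    my ($south, $west, $north, $east) = @_;

    # Overpass QL expects the bounding box as (south,west,north,east).
    my $bbox = join ',', map { sprintf '%.6f', $_ } $south, $west, $north, $east;

    return join "\n",
        '[out:xml];',
        qq{node["highway"="traffic_signals"]($bbox);},
        'out;',
        '';
}

print build_traffic_signals_query(29.9995, 22.9995, 30.0005, 23.0005);
```

The point is only that the query is plain text: everything specific to "traffic lights" lives in one small string, which is what makes it a natural unit for a plugin.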

I thought of the following architecture (all modules are object-oriented Perl):

Implement Geo::OSM::Overpass, which deals with all the low-level communication with the OSM server and fetches data -- provided one supplies it with a query text (queries can be in XML or in Overpass's own query language, Overpass QL).

Implement various plugins which take in a Geo::OSM::Overpass object, through which one can run a query or do an export on its last fetched data. Each plugin will know its specific query content and just that, e.g. how to fetch traffic lights. It may have parameters for cleaning the data afterwards, e.g. selecting only traffic lights on highways (or one can just implement another plugin to do that). The plugins' base class will be Geo::OSM::Overpass::Plugin. Any plugin must use this as its parent class and implement the run() method.

For example, I have implemented the Geo::OSM::Overpass::Plugin::FetchTrafficLights to fetch traffic lights data. And also Geo::OSM::Overpass::Plugin::ParseXML to parse the last-query-result (XML) and convert it to a hashtable.
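The engine/plugin split described above can be sketched in a few lines. This is an illustration only, not the actual implementation: the packages My::Engine, My::Plugin and My::Plugin::FetchTrafficLights are hypothetical stand-ins, the engine does no network access, and the query omits the bounding box for brevity.

```perl
use strict;
use warnings;

# All package names below are hypothetical stand-ins for illustration.

package My::Engine;    # stand-in for the engine; no network access
sub new { my ($class) = @_; return bless { last_query => undef, last_result => undef }, $class }
sub query {
    my ($self, $query_text) = @_;
    # A real engine would send $query_text to an Overpass server here.
    $self->{last_query}  = $query_text;
    $self->{last_result} = '<osm></osm>';    # canned placeholder result
    return 1;
}
sub last_query  { return $_[0]{last_query} }
sub last_result { return $_[0]{last_result} }

package My::Plugin;    # base class: holds the engine and demands run()
sub new {
    my ($class, $params) = @_;
    die "parameter 'engine' is required" unless $params && $params->{engine};
    return bless { engine => $params->{engine} }, $class;
}
sub engine { return $_[0]{engine} }
sub run    { die ref($_[0]) . " must implement run()" }

package My::Plugin::FetchTrafficLights;    # knows one query and nothing else
our @ISA = ('My::Plugin');
sub run {
    my ($self) = @_;
    # Bounding box omitted for brevity.
    return $self->engine->query('node["highway"="traffic_signals"];out;');
}

package main;

my $engine = My::Engine->new();
my $plugin = My::Plugin::FetchTrafficLights->new({ engine => $engine });
$plugin->run() or die "query failed";
print $engine->last_query, "\n";
```

With this shape, adding a new kind of query means adding one small subclass; neither the engine nor the base class has to change.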

With this model I can get away with my minimal knowledge of the Overpass API and let more knowledgeable people create their own plugins for fetching this or that and submit them to CPAN without my mediation. At the same time, I have satisfied my basic and urgent need (fetching traffic lights information) while keeping the code expandable and maintainable for the future, rather than making a one-off script.

Contrast this with the model where all queries are implemented in a single module by the current maintainer in response to user feature requests (the maintainer then has to scour the forums for information on how to convert each request to an Overpass query, which will probably be sub-optimal anyway if you are not an Overpass whizkid *).

Here is an example:

    use Geo::BoundingBox;
    use Geo::OSM::Overpass;
    use Geo::OSM::Overpass::Plugin;
    use Geo::OSM::Overpass::Plugin::FetchTrafficLights;
    use Geo::OSM::Overpass::Plugin::ParseXML;

    my $engine = Geo::OSM::Overpass->new();
    $engine->verbosity(2);
    $engine->output_filename('xyz');

    my $bbox = Geo::BoundingBox->new();
    $bbox->centred_at(30.0, 23.0, 100); # centre at (lat=30,lon=23), 100m x 100m square box
    $engine->bounding_box($bbox);

    my $tfp = Geo::OSM::Overpass::Plugin::FetchTrafficLights->new({'engine'=>$engine});
    # fetch the data
    $tfp->run() or die;
    # save XML result to disk
    $engine->save() or die;
    # convert XML result to a perl hashtable
    my $xmlp = Geo::OSM::Overpass::Plugin::ParseXML->new({'engine'=>$engine});
    my $hashmap_of_traffic_lights = $xmlp->run();
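For readers wondering what a "centred at, with side in metres" box amounts to, the following is a rough sketch of one way to compute it. It is not how Geo::BoundingBox is implemented (that is not shown here); the function bbox_centred_at is hypothetical, and it assumes a simple flat-earth approximation of about 111,320 metres per degree of latitude, which is only reasonable for small boxes away from the poles.

```perl
use strict;
use warnings;
use Math::Trig qw(deg2rad);

# Rough average length of one degree of latitude, in metres (an assumption).
my $METRES_PER_DEG_LAT = 111_320;

# Hypothetical helper: approximate a square box with sides of $side_m metres
# centred on ($lat, $lon). Returns (south, west, north, east) in degrees.
sub bbox_centred_at {
    my ($lat, $lon, $side_m) = @_;
    my $half_lat = ($side_m / 2) / $METRES_PER_DEG_LAT;
    # A degree of longitude shrinks with the cosine of the latitude.
    my $half_lon = ($side_m / 2) / ($METRES_PER_DEG_LAT * cos(deg2rad($lat)));
    return ($lat - $half_lat, $lon - $half_lon, $lat + $half_lat, $lon + $half_lon);
}

my ($south, $west, $north, $east) = bbox_centred_at(30.0, 23.0, 100);
printf "%.6f,%.6f,%.6f,%.6f\n", $south, $west, $north, $east;
```

The four numbers come out in the (south,west,north,east) order that an Overpass bounding-box filter expects.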

I am looking for comments and suggestions regarding this approach.

bw, bliako

*) After my initial success with fetching traffic lights, I pushed my luck and tried to fetch roundabouts. And there all the weaknesses of the underlying OSM model showed: there are roundabouts, mini_roundabouts and circulars. These are nodes. However, the actual roundabouts are ways: a collection of nodes circling the roundabout structure. So one has to know that, or keep asking and put the puzzle pieces together (the information came out bit by bit; it was like an interrogation). Eventually I found out how to fetch them and also how to ask for the centre coordinate. The kind bureaucrats on the forum are still debating what a roundabout is ...
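As a hedged illustration of the roundabout case, the sketch below builds an Overpass QL query that asks for roundabout ways together with their centre coordinates. It is not the query the footnote refers to, which is not shown. The helper name build_roundabouts_query is hypothetical, and the tags (junction=roundabout on ways, highway=mini_roundabout on nodes) are the common OSM conventions assumed here. The "out center;" statement asks Overpass to add a centre point to each way in the output.

```perl
use strict;
use warnings;

# Hypothetical helper; the tags are assumed from common OSM conventions.
# Builds an Overpass QL query for roundabouts inside a bounding box.
sub build_roundabouts_query {
    my ($south, $west, $north, $east) = @_;
    my $bbox = join ',', $south, $west, $north, $east;

    return join "\n",
        '[out:xml];',
        '(',                                              # union of both kinds
        qq{  way["junction"="roundabout"]($bbox);},       # full roundabouts are ways
        qq{  node["highway"="mini_roundabout"]($bbox);},  # mini roundabouts are nodes
        ');',
        'out center;',                                    # add a centre point per way
        '';
}

print build_roundabouts_query(34.9, 33.0, 35.0, 33.1);
```

In the plugin model this would simply be another plugin whose run() submits this text, which is the kind of contribution someone with more Overpass experience could refine.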

Replies are listed 'Best First'.
Re: OpenStreetMap API : Worker and Plugins architecture
by swl (Chaplain) on May 15, 2019 at 21:55 UTC

    You might get some mileage looking at the GDAL implementation of its OSM driver.

    Alternatively, the ogr2ogr utility can be used to convert the file to some other format that you could parse more easily, but you can also read the data directly in perl using Geo::GDAL::FFI::Dataset (providing your GDAL is compiled with OSM support - I have not checked if it is by default).

      I had something different in mind: just a quick download of some terrain features, for some area, from the OSM website. But you, as I understand it, are suggesting that I download the full OSM data for an area, set up a database, import the file into the DB via GDAL and query the DB using the set of tools provided by GDAL. Queries can also be made on the huge downloaded file itself, but that's going to be slow, depending on the area size.

      Firstly, thanks for hinting - for the second time - about GDAL, which this time made me read a gentle introduction to it.

      Secondly, don't you think that there is scope for what I am proposing if someone does not want to go the database way? For example, when one wants a quick way to search for, say, the traffic lights in an area from within a script.

        I mentioned GDAL in case it is of use, as it covers a huge number of formats (and thus has a correspondingly large code base). You might be able to glean something from its source code for the OSM driver.

        A quick download of one feature type might be possible, but to be honest I've never tried either a full or partial download of these data (hence my answer did not have any code examples). If you do download all features for a region then you should not need to set up a database, though, as you can access the contents of the file directly using Geo::GDAL::FFI::Dataset and Geo::GDAL::FFI::Layer methods, including running SQL queries over the data (see ExecuteSQL in the Geo::GDAL::FFI::Dataset docs). The Synopsis for Geo::GDAL::FFI has an example of how to load the data, copied below.

        I would definitely be interested in seeing something implemented in Perl to access these data, and can help with testing if you need it.

        # from the Geo::GDAL::FFI synopsis,
        # it should detect the file type automatically (untested)
        use feature 'say';    # needed for say()
        use Geo::GDAL::FFI qw/Open/;
        my $layer = Open('test.shp')->GetLayer;
        $layer->ResetReading;
        # iterate over all features, printing the 'name' field and the geometry as WKT
        while (my $feature = $layer->GetNextFeature) {
            my $value = $feature->GetField('name');
            my $geom = $feature->GetGeomField;
            say $value, ' ', $geom->AsText;
        }
