PerlMonks  

Monks,

(Cross-posted from module-authors, because perl.org seems to be having difficulties today).

I have written a module that deals with France's INSEE codes, which allows one to look up postcodes and stuff like that. I've been toying with Geography::FR::Postcode as a name. (Any other ideas?)

The thing is, it relies on a text file that is 750KiB zipped, updated periodically. So I'm looking at a reader package that knows how to pick apart a certain format (or formats) of the data file and answer questions (for instance, what towns have the postcode 66100). Reading the unzipped file on each run and producing hashes takes about a second, which is good enough for a first version.
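A minimal sketch of that reader idea, assuming purely for illustration a tab-separated file with an INSEE code, a postcode and a town name on each line. The real INSEE layout is not reproduced here, the sample rows are invented, and the names load_postcodes and towns_for are hypothetical:

```perl
use strict;
use warnings;

# Assumed layout (hypothetical): INSEE code <TAB> postcode <TAB> town name.
# The real INSEE file will differ; adjust the split accordingly.
sub load_postcodes {
    my ($fh) = @_;
    my %towns_by_postcode;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;    # skip blank lines
        my ($insee, $postcode, $town) = split /\t/, $line;
        push @{ $towns_by_postcode{$postcode} }, $town;
    }
    return \%towns_by_postcode;
}

# Return the list of towns sharing a postcode (empty list if none).
sub towns_for {
    my ($index, $postcode) = @_;
    return @{ $index->{$postcode} || [] };
}

# Invented sample rows standing in for the unzipped data file.
my $sample = "00001\t66100\tTOWN-A\n00002\t66100\tTOWN-B\n00003\t75001\tTOWN-C\n";
open my $fh, '<', \$sample or die "cannot open sample: $!";
my $index = load_postcodes($fh);
close $fh;

print join(', ', towns_for($index, '66100')), "\n";
```

Building the hash once per run matches the "about a second" cost mentioned above; caching the parsed structure would be a later optimisation.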

One problem is that the INSEE web site doesn't make it easy to predict what the new filename will be, so I can't fetch the data from INSEE during the installation process. And I would like to avoid wrapping the data file itself up as a CPAN module. So I create another package containing a solitary package variable that holds the URI of the data file on INSEE's web site, and I just update that when new versions come out.
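Such a package could be as small as the following sketch. The variable name $DATA_URI, the version number and the URI itself are placeholders invented for illustration; the real INSEE address is not known here:

```perl
package Geography::FR::Postcode::Data;
use strict;
use warnings;

our $VERSION = '0.01';    # hypothetical; bumped for each new INSEE release

# Placeholder only: replace with the actual location of the INSEE file.
our $DATA_URI = 'http://www.example.com/insee/placeholder.zip';

package main;

# The installer (or the reader module) can then consult the variable.
print "would fetch: $Geography::FR::Postcode::Data::DATA_URI\n";
```

A new data release then only requires changing the URI and the version number.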

Something like this:

Geography::FR::Postcode

depends on
Geography::FR::Postcode::Data

Installing Geography::FR::Postcode forces the dependency on Geography::FR::Postcode::Data to be resolved first. So Data is downloaded and, as part of its installation process, the data file is fetched and installed somewhere on the local system.

I suppose it will default to the site_perl directory if run in batch mode, but interactive installations allow the directory to be specified. OS distribution maintainers may wish to override the default (how? an environment variable à la PERL_G_F_P_PATH=/usr/local/share/doc/insee?)
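One way the override could work, sketched under assumptions: the helper name data_dir and the subdirectory layout under site_perl are hypothetical, and PERL_G_F_P_PATH is the environment variable floated above. The site_perl location is taken from the core Config module:

```perl
use strict;
use warnings;
use Config;
use File::Spec;

# Hypothetical helper: the environment variable wins if set and non-empty,
# otherwise fall back to the supplied default directory.
sub data_dir {
    my ($default) = @_;
    my $override = $ENV{PERL_G_F_P_PATH};
    return $override if defined $override && length $override;
    return $default;
}

# Assumed default: a subdirectory of site_perl (layout is illustrative).
my $default = File::Spec->catdir($Config{sitelib}, 'Geography', 'FR', 'Postcode');

print "data directory: ", data_dir($default), "\n";
```

If both modules call the same helper, the reader finds the file wherever the Data installation put it, provided the environment is the same at install time and at run time.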

After Geography::FR::Postcode::Data is installed, the installation of Geography::FR::Postcode goes forward (waving hands: knowing where Data put the damned file).

Next year, a new version of the INSEE file comes out. I test, and see that the current reader code can deal with it. I release a new version of Geography::FR::Postcode::Data. The client sees that there is an update for this, and installs it. New data file, everyone happy. (This assumes the installation causes the new file to overwrite the old one; otherwise Postcode will continue to run with the old file.)

The following year, a new version comes out, and surprise! they've added a new column in the file. So I release a new version of Geography::FR::Postcode as well, that knows how to read both formats, and a new version of Geography::FR::Postcode::Data.
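A reader that copes with both formats might simply dispatch on the number of columns. This is only a sketch: the column names, the tab separator and the idea that the new column is appended at the end are all assumptions, not the real INSEE layout:

```perl
use strict;
use warnings;

# Hypothetical layouts keyed by column count: the old format has three
# tab-separated columns, the new one appends a fourth.
my %FIELDS_FOR = (
    3 => [qw(insee postcode town)],
    4 => [qw(insee postcode town extra)],
);

# Turn one line into a hash ref, whichever known format it is in.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my @values = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    my $names = $FIELDS_FOR{ scalar @values }
        or die "unrecognised format: " . scalar(@values) . " columns\n";
    my %record;
    @record{@$names} = @values;
    return \%record;
}

my $old = parse_line("00001\t66100\tTOWN-A\n");
my $new = parse_line("00001\t66100\tTOWN-A\tsomething\n");
print "old format town: $old->{town}\n";
print "new format extra: $new->{extra}\n";
```

Dying on an unknown column count means an unexpected format change is noticed rather than silently misread.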

Does that sound sane? Does anyone have some pointers on how to deal with the placement of datafiles on the local system with one module, and having the other module know where to find them?

Or am I making this unnecessarily complicated? (I could just bundle the data file with the distribution, but the size of the data file and the likelihood that the format will not change invite the above approach.)

• another intruder with the mooring in the heart of the Perl


In reply to Modules dealing with data files by grinder
