A preview of DPAN

by brian_d_foy (Abbot)
on Nov 11, 2008 at 12:03 UTC (#722831, perlmeditation)

...This is a preview, just for PerlMonks, of the continuing work I'm doing with my BackPAN indexer...

MyCPAN can now create CPAN-like directories out of a directory of distributions. Run a script, then point CPAN.pm at your directory to use it as your CPAN source. This work was sponsored by a customer at the day job (talk to me if you can convince your boss that this might be worth sponsoring too).

Previously, you could do this task with a minicpan and CPAN::Mini::Inject. You kept two repositories. You updated minicpan, which undid all of your private stuff, then you re-injected everything. CPAN::Mini::Inject then updated the modules/02packages.details.txt.gz and CHECKSUMS files. That's fine if you're injecting a few things.

My task is to create a CPAN-like structure out of stuff that is mostly not on CPAN, including the case where nothing in the private CPAN comes from the real CPAN. We've been calling this "DPAN", for DarkPAN. You don't have to worry about what's in a distro or which author it should belong to, and you don't have parallel directories. Just dump a bunch of distros in a directory. Those might be private modules, CPAN modules, forked modules, vendor modules, and so on. DPAN doesn't care. Just dump them in a directory.

MyCPAN::Indexer pulls out all of the information and turns the source directory into something that the CPAN tools can understand.

You start with MyCPAN::Indexer. It's still in development, so some things are a little rough. Install it or get it from GitHub, and install the dependencies.

Inside MyCPAN::Indexer is an examples/ directory with a bunch of junk in it. You want the dpan script.

% perl examples/dpan my_modules_dir/

With the defaults, this looks for all distributions under my_modules_dir, collects information about each and puts it in the indexer_reports/ directory. It then goes through all of the reports and collects the information it needs for the CPAN index files. Finally, in my_modules_dir/ it creates the modules/ directory with the index files the CPAN tools need and puts a CHECKSUMS file in each directory that has distributions in it. You can now point CPAN.pm to this directory and install directly from it.
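The first step, finding the distributions under a directory, can be pictured with a small sketch. This is not the dpan script or MyCPAN::Indexer code: the helper names find_distributions and guess_name_and_version are made up for illustration, and the sketch only looks at file names, where the real indexer collects information from inside each distribution.

```perl
use strict;
use warnings;
use File::Find ();

# Archive suffixes this sketch recognizes (an assumption, not dpan's list).
my $ARCHIVE_SUFFIX = qr/\.(?:tar\.gz|tgz|tar\.bz2|zip)\z/;

# Hypothetical helper: list distribution archives anywhere under $dir.
sub find_distributions {
    my ($dir) = @_;
    my @found;
    File::Find::find(
        {
            no_chdir => 1,    # $_ holds the full path
            wanted   => sub {
                return unless -f $_;
                return unless $_ =~ $ARCHIVE_SUFFIX;
                push @found, $File::Find::name;
            },
        },
        $dir,
    );
    my @sorted = sort @found;
    return @sorted;
}

# Hypothetical helper: guess (name, version) from a file name such as
# "Foo-Bar-1.23.tgz". Returns an empty list if the name does not fit.
sub guess_name_and_version {
    my ($path) = @_;
    my ($file) = $path =~ m{([^/]+)\z} or return;
    $file =~ s/$ARCHIVE_SUFFIX//;
    my ( $name, $version ) = $file =~ /\A(.+)-v?(\d[\d._]*)\z/ or return;
    return ( $name, $version );
}

my ( $demo_name, $demo_version ) =
    guess_name_and_version('my_modules_dir/Foo-Bar-1.23.tgz');
print "$demo_name $demo_version\n";
```

Calling find_distributions('my_modules_dir') would return the archive paths in sorted order. Guessing from the file name is fragile, which is one reason an indexer that opens the distribution is the better tool.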

There are a couple of things to watch out for:

  • It indexes everything it finds, so if you have multiple versions of a distribution, they all end up in 02packages.details.txt.gz. Fixing that is on the To Do list, but not too important for my purposes right now.
  • With CPAN.pm, you can have any directory structure you like. So far, we've had to keep the authors/id/X/XX/XXXX directory structure for CPANPLUS.
  • If you try to install a module and CPAN.pm does not find it or one of its dependencies from any source in urllist, it falls back to some internal URLs. I don't know what CPANPLUS does.
  • On Strawberry Perl, Archive::Extract complains about not being able to extract the dist even though the extraction worked just fine.
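Until the indexer handles the first caveat itself, the duplicate entries could be filtered after the fact. The following is only a sketch of that idea, not part of MyCPAN::Indexer: latest_per_package is a hypothetical helper, the records are assumed to be already split into [package, version, path], and versions are compared with the core version module, which dies on strings it cannot parse.

```perl
use strict;
use warnings;
use version;

# Hypothetical helper: given records as [package, version, path],
# keep only the record with the highest version for each package.
sub latest_per_package {
    my (@records) = @_;
    my %best;
    for my $rec (@records) {
        my ( $package, $version ) = @$rec;
        my $seen = $best{$package};
        if ( !$seen || _as_version($version) > _as_version( $seen->[1] ) ) {
            $best{$package} = $rec;
        }
    }
    return map { $best{$_} } sort keys %best;
}

# The index uses the literal string "undef" for packages without a
# version; treat that as 0 so it loses to any real version.
sub _as_version {
    my ($string) = @_;
    return version->parse( $string eq 'undef' ? 0 : $string );
}

my @demo_kept = latest_per_package(
    [ 'Foo::Bar', '1.23', 'B/BD/BDFOY/Foo-Bar-1.23.tgz' ],
    [ 'Foo::Bar', '2.01', 'B/BD/BDFOY/Foo-Bar-2.01.tgz' ],
);
print join( ' ', @$_ ), "\n" for @demo_kept;
```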

The latest version of my cpan script might help you. You can dump and load configs without fooling with the shell. The -J switch (capital J) dumps the current config to STDOUT. It's the same format as CPAN::Config:

% cpan -J > MyCPANConfig.pm

Edit that file how you like. I change the urllist.
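As a rough illustration, the edited file might look like the fragment below. Only the urllist key is shown; the dumped file holds many more keys, which stay as they are. The file:// path is a placeholder for whatever directory you ran dpan on.

```perl
# MyCPANConfig.pm, trimmed to the one key that changes.
# The path is a placeholder, not a real location.
$CPAN::Config = {
    # ... the other keys from the dump stay as they are ...
    'urllist' => [ q[file:///path/to/my_modules_dir/] ],
};
1;
```

Since urllist is a list, more than one source can go in it, and CPAN.pm works through the entries when it looks for a distribution.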

I have several versions for testing different things. If I want to install Foo::Bar with my DPAN config pointing to my DarkPAN, I load the right configuration with -j (lowercase j):

% cpan -j DPANConfig.pm Foo::Bar

Now, I've said that DPAN is for DarkPAN, but it's also for another thing I want to do: DistributedPAN. If you look in 02packages.details.txt, you'll see lines like:

Foo::Bar 1.23 B/BD/BDFOY/Foo-Bar-1.23.tgz

When I created CPAN::PackageDetails to play with this, we discovered that CPAN.pm will happily deal with absolute paths there. The distribution files could be anywhere:

Foo::Bar 1.23 /usr/local/dpan/Foo-Bar-1.23.tgz
Bar::Baz 2.45 /home/brian/dists/Bar-Baz-2.45

Once I started thinking about that, I wanted to make it so the files don't even have to be local:

Foo::Bar 1.23 /usr/local/dpan/Foo-Bar-1.23.tgz
Bar::Baz 2.45 http://www.example.com/dists/Bar-Baz-2.45
Quux 2.45 http://www.cpan.org/authors/id/B/BD/BDFOY/Quux-2.45

Once that third column handling is refactored into a general URL or file fetcher, things get more interesting. I haven't looked at what that might take in CPAN.pm though.
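To make the three kinds of third column concrete, here is a sketch that classifies a location and turns it into something a general fetcher could work from. It says nothing about how CPAN.pm does this internally. The helper names parse_record, location_type, and resolve_location are invented for the example, and the mirror URL is just a parameter the caller supplies.

```perl
use strict;
use warnings;

# Hypothetical helper: split one index record into its three columns.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my ( $package, $version, $location ) = split ' ', $line, 3;
    return { package => $package, version => $version, location => $location };
}

# Hypothetical helper: decide what kind of location the third column holds.
sub location_type {
    my ($location) = @_;
    return 'url'      if $location =~ m{\A[a-z][a-z0-9+.-]*://}i;
    return 'absolute' if $location =~ m{\A/};
    return 'cpan_path';    # relative to authors/id/, as in a normal index
}

# Hypothetical helper: turn any of the three forms into a single URL.
sub resolve_location {
    my ( $location, $mirror ) = @_;
    my $type = location_type($location);
    return $location          if $type eq 'url';
    return "file://$location" if $type eq 'absolute';
    $mirror =~ s{/*\z}{/};    # make sure the mirror ends in one slash
    return $mirror . "authors/id/$location";
}

my $demo = parse_record("Foo::Bar 1.23 B/BD/BDFOY/Foo-Bar-1.23.tgz\n");
print resolve_location( $demo->{location}, 'http://www.cpan.org' ), "\n";
```

Reducing everything to a URL keeps the fetching code in one place, since a local file is then just another scheme.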

And, since I was writing CPAN::PackageDetails, I wanted to support another possible format. This one has a column for the author and might list the same namespace several times:

Foo::Bar BDFOY 1.23 B/BD/BDFOY/Foo-Bar-1.23.tgz
Foo::Bar BDFOY 2.01 /home/brian/dists/Foo-Bar-2.01.tgz
Foo::Bar SNUFFY 1.24 http://www.example.com/dist/Foo-Bar.tgz

Remember Synopsis 11? Perl 6 supports not only version restrictions on loading a module, but loading the same module from different authors:

use Dog:ver(Any):auth(Any);
use Dog:ver(Any):auth<cpan:BDFOY>;
use Dog:ver<1.2.1>:auth(Any);
use Dog:ver(1.2.1..1.2.3);

With a change to 02packages.details.txt, the CPAN tools can support this too.
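What an author-aware lookup over the four-column format could look like is sketched below in Perl 5. This is speculation to go with the idea, not anything CPAN.pm or CPAN::PackageDetails provides: parse_author_record and find_distribution are hypothetical names, the column order follows the example records above, and version comparison is left to the core version module.

```perl
use strict;
use warnings;
use version;

# Hypothetical parser for the four-column variant:
#   package  author  version  location
sub parse_author_record {
    my ($line) = @_;
    chomp $line;
    my %rec;
    @rec{qw(package author version location)} = split ' ', $line, 4;
    return \%rec;
}

# Hypothetical lookup: newest record for a package, optionally limited
# to one author and to an inclusive version range (min, max).
sub find_distribution {
    my ( $records, %want ) = @_;
    my @matches = grep {
        my $v = version->parse( $_->{version} );
        $_->{package} eq $want{package}
            && ( !defined $want{author} || $_->{author} eq $want{author} )
            && ( !defined $want{min} || $v >= version->parse( $want{min} ) )
            && ( !defined $want{max} || $v <= version->parse( $want{max} ) )
    } @$records;
    my ($newest) =
        sort { version->parse( $b->{version} ) <=> version->parse( $a->{version} ) }
        @matches;
    return $newest;    # undef when nothing matches
}

my @demo_records = map { parse_author_record($_) } (
    'Foo::Bar BDFOY 1.23 B/BD/BDFOY/Foo-Bar-1.23.tgz',
    'Foo::Bar BDFOY 2.01 /home/brian/dists/Foo-Bar-2.01.tgz',
    'Foo::Bar SNUFFY 1.24 http://www.example.com/dist/Foo-Bar.tgz',
);
my $demo_pick =
    find_distribution( \@demo_records, package => 'Foo::Bar', author => 'SNUFFY' );
print "$demo_pick->{author} $demo_pick->{version} $demo_pick->{location}\n";
```

The author and range arguments play the role of the :auth and :ver adverbs in the Perl 6 examples; leaving them out corresponds to Any.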

Not to worry though. That's just something fun to think about right now. Once the rest of DPAN seems stable, I can start adding cool features like that.

--
brian d foy <brian@stonehenge.com>
Subscribe to The Perl Review

Re: A preview of DPAN
by CountZero (Bishop) on Nov 11, 2008 at 19:21 UTC
    This one has a column for the author and might list the same namespace several times
    Wouldn't it be better if the author column were added to the end of each record, rather than as the second column? Programs that do not use this info would then only look at the first three columns and ignore the last one. That would probably cause less breakage.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
