PerlMonks  

Re: Building a local 'ppm' repository (Windows)

by Scott7477 (Chaplain)
on Apr 04, 2007 at 22:30 UTC ( [id://608381]=note )


in reply to Building a local 'ppm' repository (Windows)

Your post reminded me of a problem I have been trying to solve: extracting URLs that point to a specific file type (say, a gz archive) from a web page. It turns out that CPAN has a page containing an alphabetical list of all modules, each with a hyperlink to its tar.gz file.

The following code (with the appropriate command-line arguments, i.e. gz in place of pdf) prints every hyperlink to a tar.gz file; redirect its output to create a text file of those links:
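
    use strict;
    use warnings;
    use LWP::Simple;
    use HTML::SimpleLinkExtor;

    # usage: getfileoftype http://www.example.com pdf > urllist.txt

    my $url      = shift;
    my $filetype = shift;
    die "usage: getfileoftype URL FILETYPE\n"
        unless defined $url && defined $filetype;
    my $filetypelen = length($filetype);
    my $offset      = -$filetypelen;

    # Save the page locally, and stop if the download failed.
    my $fileget = getstore($url, "tempfile.html");
    die "Could not fetch $url (status $fileget)\n" unless is_success($fileget);

    # Collect the href of every <a> tag on the page.
    my $extor = HTML::SimpleLinkExtor->new();
    $extor->parse_file("tempfile.html");
    my @a_hrefs = $extor->a;

    for my $element (@a_hrefs) {
        next if length($element) < $filetypelen;   # too short to end in $filetype
        my $suffix = substr($element, $offset, $filetypelen);
        # Compare with eq rather than m/$filetype/, so a "." in the file type
        # is not treated as a regex wildcard.
        if ($suffix eq $filetype) {
            print $element, "\n";
        }
    }
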
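
The suffix test can also be pulled out into a small reusable function. What follows is only a minimal sketch, not part of the script above: `links_with_suffix` is a hypothetical helper name, the example URLs are invented, and two behaviours are assumptions of the sketch rather than of the original code (the match is case-insensitive, and a "." is required in front of the file type).

```perl
use strict;
use warnings;

# Hypothetical helper: return the URLs that end in ".$filetype".
# quotemeta() keeps characters such as "." in the file type from being
# treated as regex metacharacters.
sub links_with_suffix {
    my ($filetype, @urls) = @_;
    my $pattern = quotemeta($filetype);
    return grep { /\.$pattern\z/i } @urls;
}

# Invented example URLs, for illustration only.
my @found = links_with_suffix(
    'gz',
    'http://www.example.com/Foo-1.0.tar.gz',
    'http://www.example.com/index.html',
    'http://www.example.com/gz/readme.txt',
);
print "$_\n" for @found;
```

Anchoring the pattern with \z means a URL that merely contains "gz" somewhere in its path is not picked up.
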
Once you have that list, the following code downloads every module in it, or whatever subset of the modules you choose to keep in the text file:
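
    use strict;
    use warnings;
    use LWP::Simple;
    use File::Basename;

    open(my $list, '<', "urllist.txt") || die "File open failure: $!";
    while (my $downloadurl = <$list>) {
        # Strip the trailing newline so it does not end up in the URL
        # or in the saved filename.
        chomp $downloadurl;
        next unless length $downloadurl;   # skip blank lines

        my ($name, $path, $suffix) = fileparse($downloadurl);
        my $savefilename = $name.$suffix;
        print "$downloadurl\n";
        print "$savefilename\n";

        # getstore returns the HTTP status code (200 on success).
        my $status = getstore($downloadurl, $savefilename);
        print "$status\n";
    }
    close($list);
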
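
To make the filename step easier to follow in isolation, here is a minimal sketch of what fileparse does with one line of urllist.txt. `filename_for_url` is a hypothetical helper name and the URL is invented. When fileparse is called without suffix patterns, it returns the whole last path component as the name and an empty suffix, so `$name.$suffix` is simply the archive's filename.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: derive a local filename from one line of urllist.txt.
sub filename_for_url {
    my ($line) = @_;
    chomp(my $url = $line);    # drop the newline read from the file
    # No suffix patterns are passed, so $suffix comes back empty and
    # $name holds everything after the last "/".
    my ($name, $path, $suffix) = fileparse($url);
    return $name . $suffix;
}

# Invented example URL, for illustration only.
print filename_for_url("http://www.example.com/modules/Foo-1.0.tar.gz\n"), "\n";
```
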
Both pieces of code work nicely on my WinXP box. Yes, I know that "tempfile.html" gets clobbered, but I was just glad to get this code working, and WinXP doesn't seem to care. In any case, you can now generate a local repository of modules. Hope this helps. Suggestions for improvement in my code are welcome!
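
On the clobbered "tempfile.html": one possible way around it is to let the core File::Temp module pick a unique name for each run. This is a minimal sketch under that assumption, not a tested change to the scripts above; `scratch_html_file` is a hypothetical helper name, and the download and parse calls are shown only as comments.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: create a uniquely named scratch file and return its
# name. UNLINK => 1 asks File::Temp to delete the file when the program exits.
sub scratch_html_file {
    my ($fh, $filename) = tempfile('pageXXXXXX', SUFFIX => '.html',
                                   TMPDIR => 1, UNLINK => 1);
    close($fh);    # only the name is needed; getstore() writes to it by name
    return $filename;
}

my $tempfile = scratch_html_file();
print "Scratch file: $tempfile\n";

# In the first script this name would replace the literal "tempfile.html":
#   getstore($url, $tempfile);
#   $extor->parse_file($tempfile);
```
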

Replies are listed 'Best First'.
Re^2: Building a local 'ppm' repository (Windows)
by LittleGreyCat (Scribe) on Nov 05, 2008 at 13:02 UTC
    "Your post reminded me of a problem I have been trying to solve: extracting URLs that point to a specific file type (say, a gz archive) from a web page. It turns out that CPAN has a page containing an alphabetical list of all modules, each with a hyperlink to its tar.gz file."

    Back on the subject after a long break!

    Unfortunately, the repository for use by 'ppm' requires a layer above the '.tar.gz' files - a directory of '.ppd' files which describe the packages.

    Therefore I have to create this structure locally to have a local repository.

    I haven't yet found a simpler way than downloading the '.zip' files, each of which contains a '.ppd' file plus a '.tar.gz' file with a subsidiary path name.

    Thanks for the code, though - it could form a basis for downloading all the '.zip' files from the ActiveState website.

    This would in turn allow the creation of an 'all_in_one' utility to download all the '.zip' files from the website, unpack them into the repository, and create the 'package.lst' file.
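
As a rough illustration of the last step of such a utility, creating 'package.lst', here is a minimal sketch. It rests on an assumption that should be checked against your version of ppm: that 'package.lst' is simply the SOFTPKG elements of all the '.ppd' files concatenated inside a single REPOSITORYSUMMARY element. `build_package_lst` and its directory argument are hypothetical names, and the downloading and unpacking steps are not shown.

```perl
use strict;
use warnings;

# ASSUMPTION: package.lst is the .ppd files' <SOFTPKG> elements concatenated
# inside one <REPOSITORYSUMMARY> element. Check this against your ppm version.
# build_package_lst() is a hypothetical name; it returns the number of
# .ppd files it found in $repo_dir.
sub build_package_lst {
    my ($repo_dir) = @_;
    opendir(my $dh, $repo_dir) or die "Cannot read $repo_dir: $!";
    my @ppds = sort grep { /\.ppd\z/i } readdir($dh);
    closedir($dh);

    open(my $out, '>', "$repo_dir/package.lst")
        or die "Cannot write package.lst: $!";
    print {$out} "<REPOSITORYSUMMARY>\n";
    for my $ppd (@ppds) {
        open(my $in, '<', "$repo_dir/$ppd") or die "Cannot read $ppd: $!";
        local $/;                              # slurp the whole file
        my $xml = <$in>;
        close($in);
        $xml =~ s/\A\s*<\?xml[^>]*\?>\s*//;    # drop a per-file XML declaration
        $xml =~ s/\s*\z/\n/;                   # end each entry with one newline
        print {$out} $xml;
    }
    print {$out} "</REPOSITORYSUMMARY>\n";
    close($out) or die "Cannot close package.lst: $!";
    return scalar @ppds;
}

# Example call (the path is a placeholder):
#   build_package_lst('C:/ppm_repo');
```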

    The local repository could then be copied to CD and distributed as an off-line 'ppm' repository.

    Cheers

    Dave R

    Nothing succeeds like a budgie with no teeth.
