PerlMonks  

Presenting a local listing of remote files over HTTP

by hacker (Priest)
on Jun 30, 2002 at 16:32 UTC (#178387, snippet)

Description: This will show a user on http://www.local.site/ a list of files that are buried in http://www.remote.dk/~user/files, including file size and file type.

I needed to do this because one of the developers on a project I'm working with deposits his beta builds on a server in Denmark, but only has http available at his disposal. Since the main project website is in the US, maintained by me, I needed a way to get those beta builds accessible from the website when he created them.

Previously, I would hit his directory in .dk every few days or weeks, copy the files over to the US server, write up a quick HTML blurb describing the files, and make that live on the site.

With this code, I never have to mirror the files myself. When his new files appear in .dk, the listing on the US site is updated automagically whenever a user loads this page.

TODO

  • Need to add error checking when/if the remote server is down
  • Add file date/timestamps to the listing (LWP::Simple has this field, IIRC)
  • Add smarts to this to just grab these files and mirror them locally from this code (wget/pavuk/etc. is not sufficient for this task)
use strict;
use diagnostics;
use warnings;

use LWP::UserAgent;
use LWP::Simple;
use HTML::LinkExtor;
use URI::URL;

my $url         = "http://www.remote.dk/~user/beta/";
my $ua          = LWP::UserAgent->new;
my @links       = ();

sub callback {
        my($tag, %attr) = @_;
        return unless $tag eq 'a';      # only anchors carry download links
        push(@links, $attr{href}) if defined $attr{href};
        return;
}
 
my $p           = HTML::LinkExtor->new(\&callback);
my $res         = $ua->request(HTTP::Request->new(GET => $url),
                       sub {$p->parse($_[0])});
my $base        = $res->base;
@links          = map { url($_, $base)->abs } @links;
my @betas       = grep(/tar/, @links);

foreach my $beta (@betas) {
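        my @remote_files = head($beta);   # (type, length, mtime, expires, server)
        my $length = $remote_files[1];
        next unless defined $length;      # HEAD failed or no Content-Length sent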

        my $bprecise    = sprintf "%.0f", $length;
        my $bsize       = insert_commas($bprecise);

        my $kprecise    = sprintf "%.0f", ($length/1024);
        my $ksize       = insert_commas($kprecise);

        my $archtype;   # tarball? or bzip2? or zip?
        (my $betahref = "$beta") =~ s{.*/}{};   # file name without the URL path
        if ($beta =~ /\.tar\.gz\z/) {
                $archtype = "(tarball)";
        } elsif ($beta =~ /\.bz2\z/) {
                $archtype = "(bzip)";
        } else {
                $archtype = "(zip)";
        }
        # Plain HTML output; the a() and br shortcuts would need CGI.pm loaded
        print qq{<a href="$beta">$betahref</a>},
              " $archtype<br />${ksize}kb, $bsize bytes<br /><br />\n";
} 

#################################################
#
# Insert commas in numeric lengths, so the number
# 1234567 would be 1,234,567
#
#################################################
sub insert_commas {
   local($_) = @_;
   1 while s/(\d+)(\d\d\d)/$1,$2/;  
   $_;
}
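
The first two TODO items (coping with a dead remote server, and showing timestamps) could be approached along these lines. This is only a sketch: describe_remote_file() is a hypothetical helper that is not part of the snippet above, and it assumes it is handed the list that LWP::Simple's head() returns, which is (content type, length, modified time, expires, server) on success and an empty list on failure.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: turn the list returned by LWP::Simple::head()
# into one line of listing text. Returns undef (in scalar context)
# when head() gave back nothing, i.e. the request failed.
sub describe_remote_file {
    my ($name, @head) = @_;

    # An empty list from head() means the remote server could not be reached
    # or refused the request.
    return unless @head;

    my ($type, $length, $mtime) = @head;
    $type = "unknown type" unless defined $type;

    my $size = defined $length ? "$length bytes" : "size unknown";
    my $when = defined $mtime
        ? strftime("%Y-%m-%d %H:%M UTC", gmtime($mtime))
        : "date unknown";

    return "$name ($type): $size, last modified $when";
}
```

Inside the foreach loop this might be called as describe_remote_file($betahref, head($beta)), printing a "remote server unavailable" note whenever the result is undefined.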

Re: Presenting a local listing of remote files over HTTP
by grinder (Bishop) on Jun 30, 2002 at 17:19 UTC
    There may be something I'm missing here, but why don't you just proxy the remote site with your web server?

    That way, the results are just there... no need to run a program to keep it up to date.


    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'
(jeffa) Re: Presenting a local listing of remote files over HTTP
by jeffa (Chancellor) on Jun 30, 2002 at 23:15 UTC
    I have always preferred the Cookbook's Recipe 2.17. for 'Putting Commas in Numbers':
    sub commify { my $text = reverse $_[0]; $text =~ s/(\d{3})(?=\d)(?!\d*\.)/$1,/g; return scalar reverse $text; }
    This is a Faster Way To Do It. :)

    Also, you will most likely want to add \z to your regexes that check for archived files:

    if ($beta =~ /tar.gz\z/) { ...
    to avoid pesky files named something like foo.tar.gz.foo

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
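
To illustrate the \z suggestion above, here is a small sketch of the archive-type check with anchored patterns. archive_type() is a hypothetical name, not something from the original snippet, and the dots are escaped as well so that "." only matches a literal dot. Unlike the original code, it returns nothing for unrecognised names instead of falling through to "zip".

```perl
use strict;
use warnings;

# Hypothetical helper: classify a file name by its extension.
# \z anchors each pattern to the very end of the string.
sub archive_type {
    my ($name) = @_;
    return "tarball" if $name =~ /\.tar\.gz\z/;
    return "bzip"    if $name =~ /\.bz2\z/;
    return "zip"     if $name =~ /\.zip\z/;
    return;          # not a recognised archive
}

for my $name (qw(foo.tar.gz foo.tar.bz2 foo.zip foo.tar.gz.foo)) {
    my $type = archive_type($name);
    print "$name: ", (defined $type ? $type : "not an archive"), "\n";
}
```

With the unanchored /tar.gz/ the last name, foo.tar.gz.foo, would be reported as a tarball; with the anchored pattern it is rejected.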
    
      sub commify { my $text = reverse $_[0]; $text =~ s/(\d{3})(?=\d)(?!\d*\.)/$1,/g; return scalar reverse $text; }
      Except that your last line above actually reverses the size, so a file of 180,224 bytes becomes 422,081 bytes when returned from commify(). Easy fix: s/scalar reverse/scalar/

      Great tip, and benches a few millis faster in execution time on longer file listings.

        Eh? Doesn't the last line reverse the reversal from the first one, maybe?

        Makeshifts last the longest.
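
The question of whether the last line undoes the first reversal is easy to check by running both versions side by side. The sketch below copies commify() as quoted above and adds commify_no_final_reverse(), a hypothetical variant with the final reverse taken out as the earlier reply proposes.

```perl
use strict;
use warnings;

# commify() as quoted from the Cookbook recipe earlier in the thread.
sub commify {
    my $text = reverse $_[0];
    $text =~ s/(\d{3})(?=\d)(?!\d*\.)/$1,/g;
    return scalar reverse $text;
}

# Hypothetical variant: the same code with the final reverse removed.
sub commify_no_final_reverse {
    my $text = reverse $_[0];
    $text =~ s/(\d{3})(?=\d)(?!\d*\.)/$1,/g;
    return $text;
}

print commify(180224), "\n";                    # prints 180,224
print commify_no_final_reverse(180224), "\n";   # prints 422,081
```

So the recipe as posted gives 180,224, and it is the version without the final reverse that produces 422,081.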
