PerlMonks  

Re: LWP capabilities

by jorg (Friar)
on May 27, 2001 at 13:54 UTC [id://83600]



in reply to LWP capabilities

Most Linux distributions come with a tool called wget, which lets you download an entire site starting from a given URL. The restrictions that Beatnik mentioned apply here as well, though.

Jorg

"Do or do not, there is no try" -- Yoda

Replies are listed 'Best First'.
Re: Re: LWP capabilities
by rmckillen (Novice) on May 27, 2001 at 21:08 UTC
    PLEASE IGNORE MY ABOVE POST! I DID NOT FORMAT PROPERLY!

    I'd never heard of wget before; it's a neat little tool. I couldn't get it to do exactly what I wanted, though. I'm hoping this is due to my passing the wrong parameters, but it probably has to do with limitations placed on wget by the remote web server. Let me set up the scenario:

    http://www.url.com/baseball/
    The "baseball" folder contains files:
    - index.html
    - picture.gif
    - page2.html
    - (folder also contains other files)

    Contained in the index.html file are references to picture.gif and page2.html. The index.html does not reference the other files... I don't know the names of these files, but I know they are there. When I run:

    wget -r -l1 --no-parent http://www.url.com/baseball/

    it retrieves index.html, picture.gif, and page2.html, but not the other files that I know are present in the directory.

    How do I get Wget to retrieve the other files not referenced in index.html? Is it possible?

      If there are no links to the documents, there is no way of checking whether they exist (besides actually guessing, which can take forever...). On the wget note, I quote:

      Basically it comes down to this: if the webserver has directory listing enabled and no index file, you can see the files in the directory. Whether those are accessible depends on several factors... (me on LWP)

      and:

      The restrictions that Beatnik mentioned apply here as well though. (jorg on wget)

      Greetz
      Beatnik
      ... Quidquid perl dictum sit, altum viditur.
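Beatnik's point above can be sketched in code. When a server has directory listing enabled and no index file, the "index" it generates is just an HTML page of links, so enumerating the directory's files reduces to extracting hrefs — which is exactly what wget's recursive mode follows. The sketch below is purely illustrative (Python rather than Perl, and the listing HTML, filenames, and the filtering heuristics are hypothetical), assuming an Apache-style autoindex page:

```python
from html.parser import HTMLParser

class DirListingParser(HTMLParser):
    """Collect href targets from a server-generated directory listing."""
    def __init__(self):
        super().__init__()
        self.files = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                # Skip parent-directory, absolute, and sort-order links
                # typical of autoindex pages (a hypothetical heuristic).
                if name == "href" and value and not value.startswith(("?", "/", "../")):
                    self.files.append(value)

# Hypothetical autoindex page such as Apache's mod_autoindex might emit
listing = """
<html><body><h1>Index of /baseball</h1>
<a href="../">Parent Directory</a>
<a href="index.html">index.html</a>
<a href="picture.gif">picture.gif</a>
<a href="page2.html">page2.html</a>
<a href="stats1999.csv">stats1999.csv</a>
</body></html>
"""

parser = DirListingParser()
parser.feed(listing)
print(parser.files)
```

With the listing page available, even the unlinked stats1999.csv shows up; without it (i.e. when index.html is served instead), no crawler can discover the file except by guessing its name.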
Re: Re: LWP capabilities
by sierrathedog04 (Hermit) on May 27, 2001 at 23:58 UTC
