http://www.perlmonks.org?node_id=596502


in reply to Download web page including css files, images, etc.

I don't think wget will work in all situations.

1) It doesn't seem to handle the BASE element correctly (which, I believe, has been part of the HTML specification for a very long time).
2) "-k" won't translate links inside CSS files to local links. Consider: #someid { background: url(folder/picture.jpg) center center; }
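
To illustrate point 1, here is a minimal sketch of BASE-aware link resolution. It is written in Python only to keep it short and self-contained; the class name LinkCollector and the example.com URLs are made up for the example, and it looks at just the href and src attributes, so it is nowhere near a complete crawler.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href/src values, resolved against <base href> when present."""

    def __init__(self, page_url):
        super().__init__()
        self.base_url = page_url   # fall back to the page's own URL
        self.base_seen = False     # only the first <base href> counts
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base":
            if attrs.get("href") and not self.base_seen:
                self.base_url = urljoin(self.base_url, attrs["href"])
                self.base_seen = True
            return
        for name in ("href", "src"):
            value = attrs.get(name)
            if value:
                self.links.append(urljoin(self.base_url, value))


# Hypothetical page: relative links must resolve against /static/, not /articles/.
page = '''<html><head><base href="http://example.com/static/">
<link rel="stylesheet" href="css/site.css"></head>
<body><img src="img/logo.png"></body></html>'''

collector = LinkCollector("http://example.com/articles/page.html")
collector.feed(page)
print(collector.links)
```

Without the BASE handling, both links would be resolved against /articles/ and the downloads would miss.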

Johannes

Re^2: Download web page including css files, images, etc.
by skx (Parson) on Jan 25, 2007 at 14:59 UTC

    True, but I think it is the most "standard" tool for the job - short of doing the parsing and rewriting myself.

    Steve
    --
      True, I just thought I'd point this out to the original poster: wget won't do the job every time. If he needs something that always works, he'd have to use wget and then fix things up manually whenever a BASE element is involved or CSS is used to pull in images (maybe there are other problems I haven't thought of?), or else write it from scratch ...

      The trick would be going through the HTML and CSS specs and finding every different way objects can be referenced, included, linked to, etc. I'm sure there are plenty!
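
      As one example of such a case, this rough Python sketch finds url(...) references in a stylesheet, resolves them against the stylesheet's own URL, and rewrites them to local names. The helper names rewrite_css_urls and flat_name and the example.com URL are invented for illustration. It deliberately ignores other cases, such as @import with a bare string or data: URIs.

```python
import re
from urllib.parse import urljoin

# Matches url(foo), url('foo') and url("foo"); group 2 is the reference itself.
URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def rewrite_css_urls(css_text, css_url, local_name):
    """Return (rewritten CSS, list of absolute URLs that need downloading)."""
    found = []

    def replace(match):
        # CSS references are relative to the stylesheet, not to the HTML page.
        absolute = urljoin(css_url, match.group(2))
        found.append(absolute)
        return "url(" + local_name(absolute) + ")"

    return URL_RE.sub(replace, css_text), found


def flat_name(absolute_url):
    """Hypothetical naming scheme: keep only the last path segment."""
    return absolute_url.rsplit("/", 1)[-1]


css = "#someid { background: url(folder/picture.jpg) center center; }"
new_css, urls = rewrite_css_urls(css, "http://example.com/css/site.css", flat_name)
print(new_css)
print(urls)
```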

      Johannes

        With so many edge cases, I've pretty much abandoned the idea of a wget-only solution.

        I've got a minimal tool working now using HTML::Parser, but I haven't yet dug into the CSS files. I will have to work on that later.

        Steve
        --
        Oops, you are the original poster :-P