PerlMonks  

Re: [OT] HTTP downloads and caching

by syphilis (Chancellor)
on Mar 25, 2017 at 10:28 UTC


in reply to [OT] HTTP downloads and caching

This upstream malware attack has recurred, starting nearly 24 hours ago - same site, different files.
This time I'm seeing that the "query string" solution is still working, but it doesn't clear the cache.
My problem is that:
ppm install http://www.sisyphusion.tk/ppm/Cairo.ppd
installs an older version of the Cairo ppm package than is currently on the server.
It accesses an outdated Cairo.ppd (one that no longer exists on the server), and installs outdated binaries (which no longer exist there either).
Out of curiosity I tried:
ppm install http://www.sisyphusion.tk/ppm/Cairo.ppd?no=cache
but the ppm utility croaks on that. Besides, it still would have installed the cached binaries.

I haven't tried wget --no-cache yet as that cleared the cache last time.
Instead, I've decided to leave the cache uncleared in case it helps my ISP (who I've contacted again) remove the malware.
My ISP did not respond last time ... let's see what happens this time.

This update is largely for my own records - but I'm also submitting it in case someone has something to add.

Cheers,
Rob

Replies are listed 'Best First'.
Re^2: [OT] HTTP downloads and caching
by syphilis (Chancellor) on Jun 09, 2017 at 11:26 UTC
    I haven't tried wget --no-cache yet as that cleared the cache last time.

    Thankfully wget --no-cache still does the trick.
    And I've come across an improvement in the form of wget --spider --no-cache which clears the cache without having to actually download the file.
    (This is particularly useful if the cached file is large ... as can be the case with some of the tar.gz files that PPM needs to download.)
    I still haven't been able to work out just where the offending cache is located - though I have established that it's upstream of my router.
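For reference, the wget manual describes --no-cache as sending "Cache-Control: no-cache" and "Pragma: no-cache" request directives, and --spider as checking that a page is there without downloading it. A rough LWP counterpart might look like the sketch below. It is untested against the cache in question; the helper name refresh_upstream_cache and the 30 second timeout are made up for illustration, and whether a HEAD request alone makes the proxy drop its stale copy is an assumption.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Request headers asking caches along the path to revalidate with the
# origin server (HTTP/1.1 directive plus the HTTP/1.0 fallback).
my @no_cache = ('Cache-Control' => 'no-cache', 'Pragma' => 'no-cache');

# Hypothetical helper: send a HEAD request, so only headers come back
# and a large tar.gz is never transferred.
sub refresh_upstream_cache {
    my ($ua, $url) = @_;
    return $ua->head($url, @no_cache);
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $ua  = LWP::UserAgent->new(timeout => 30);
    my $res = refresh_upstream_cache($ua, $ARGV[0]);
    print "$ARGV[0]: ", $res->status_line, "\n";
}
```

Saved under any file name, it would be run with the .ppd URL (and then each binary's URL) as its argument before calling ppm install.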

    It still remains to work out how I can patch Strawberry's ppm utility such that running this wget command in advance is not needed.

    Cheers,
    Rob

      My off-the-cuff guess is that your ISP is running a so-called transparent proxy. (Cox?) Also, it sounds like this server is under your control. If so, you should be able to configure your web server to send HTTP cache-control headers that prevent the caching. I've linked to the w3 doc for HTTP 1.1; look for section 14.9, Cache-Control, which includes a comment on Pragma: no-cache for HTTP 1.0.

      There are ways to probe the caches and figure out what is going on but I believe that cache-control should do the trick. The next step is less easy.

      https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
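One way to start probing is to look at the cache-related headers that come back with a response. The sketch below is only that, a sketch: the helper name cache_headers and the list of header names are choices made for illustration (X-Cache is a non-standard header that some proxies add), and a transparent proxy is under no obligation to announce itself with Via or Age.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Response headers that commonly hint at caching behaviour on the path.
my @cache_headers = qw(Cache-Control Pragma Expires Age Via X-Cache ETag Last-Modified);

# Hypothetical helper: return the cache-related headers present in a response.
sub cache_headers {
    my ($res) = @_;
    my %found;
    for my $name (@cache_headers) {
        my $value = $res->header($name);    # scalar context: joined value
        $found{$name} = $value if defined $value;
    }
    return %found;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $ua  = LWP::UserAgent->new(timeout => 30);
    my $res = $ua->head($ARGV[0]);
    my %h   = cache_headers($res);
    print $res->status_line, "\n";
    print "$_: $h{$_}\n" for sort keys %h;
}
```

A non-zero Age or a Via line would suggest the response came from an intermediary rather than the origin server, and a missing Cache-Control would suggest the host's "no-cache" change is not reaching the client.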
        Also, it sounds like this server is under your control

        I don't think it is - though I could be wrong.
        If it is under my control then I'd have to do a bit of research and learning before I could exercise that control.

        Back in March I opened a ticket with the company that hosts my website and asked them to mark "all files in (and below) the sisyphusion.tk/ppm directory" as "no-cache".
        They replied "Your request has been accomplished".
        Of course they don't provide any info regarding exactly what they've done (they never do), but whatever it was didn't make any difference.
        Reading various comments/docs about this "no-cache" option I see a lot of "should-do-this" statements and not many "will-do-this" ones.

        I think the best solution would be to hack PPM to send a "no-cache" argument. Make that the default behaviour, with an option to allow caching via a command line argument.
        PPM itself uses perl's LWP modules:
        C:\>perl -MPPM -le "for(keys(%INC)){print if $_ =~ /LWP/}"
        LWP/UserAgent.pm
        LWP/MemberMixin.pm
        LWP/Simple.pm
        LWP.pm
        LWP/Protocol.pm
        So, assuming LWP already allows a "no-cache" request header to be sent (the equivalent of wget's --no-cache option), such a hack might not be too difficult.
        In the meantime, running wget --spider --no-cache http://.... is tolerable, if not exactly satisfactory.
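As a starting point for such a hack, LWP::UserAgent has a default_header method that attaches a header to every request the agent sends. The sketch below sets both no-cache headers and uses prepare_request to show what would go out, without touching the network. Where PPM creates its user agent is not shown in this thread, so how this would be wired into PPM is an assumption, and nothing here is PPM's actual code.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# A user agent that asks intermediate caches to revalidate on every request.
my $ua = LWP::UserAgent->new;
$ua->default_header('Cache-Control' => 'no-cache');    # HTTP/1.1 directive
$ua->default_header('Pragma'        => 'no-cache');    # HTTP/1.0 fallback

# prepare_request applies the default headers to a request object,
# so the outgoing headers can be inspected without sending anything.
my $req = $ua->prepare_request(
    HTTP::Request->new(GET => 'http://www.sisyphusion.tk/ppm/Cairo.ppd')
);
print $req->as_string;
```

For the parts of PPM that go through LWP::Simple, that module can export its underlying $ua object on request (use LWP::Simple qw($ua get)), so the same two default_header calls might be applicable there as well. An option to allow caching would then just skip those calls.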

        Thank you for your reply and link !

        Cheers,
        Rob

        blah, reply to reply is bad but you can probably get away with just setting the cache options on your download directory.
