PerlMonks  

Re: scraping flash content

by Anonymous Monk
on Dec 18, 2013 at 05:46 UTC (#1067592)


in reply to scraping flash content

I'll skip the argument about gun control in a thread where the poster only wants to know how to load bullets, and answer with civil monastery behavior instead of berating. To decompress a gzip-encoded HTTP body, I suggest you try:

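use Compress::Raw::Zlib;

my ($zlib, $err) = Compress::Raw::Zlib::Inflate->new(
    -WindowBits   => WANT_GZIP,
    -ConsumeInput => 1,
);
die "cannot create inflate object: $err" unless $zlib;
my $status = Z_OK;    # result of the most recent inflate call

# LATER
sub onData {
    my ($comp_in) = @_;    # one chunk of the compressed body
    my $output;
    # the previous chunk ended a gzip stream, so reset before reusing the object
    if ($status == Z_STREAM_END) {
        $status = $zlib->inflateReset();
        die "failed" if $status != Z_OK;
    }
    $status = $zlib->inflate($comp_in, $output);
    die "failed" if $status != Z_OK && $status != Z_STREAM_END;
    print $output;
}
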
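
If you want to check the streaming approach without a live connection, here is a minimal self-contained sketch. It assumes a locally built gzip stream stands in for the HTTP body, and the 16-byte chunk size and the sample text are arbitrary choices of mine, not anything from your bot.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib;

# Build a small gzip stream to stand in for a compressed HTTP body.
my $plain = "hello from the monastery\n" x 50;
my ($deflate, $dstatus) = Compress::Raw::Zlib::Deflate->new(
    -WindowBits   => WANT_GZIP,
    -AppendOutput => 1,          # keep appending to the same buffer
);
die "deflate init failed: $dstatus" unless $deflate;

my $gz = '';
$deflate->deflate($plain, $gz) == Z_OK or die "deflate failed";
$deflate->flush($gz)           == Z_OK or die "flush failed";

# Inflate it again, feeding small chunks as a socket callback would.
my ($inflate, $istatus) = Compress::Raw::Zlib::Inflate->new(
    -WindowBits   => WANT_GZIP,
    -ConsumeInput => 1,
);
die "inflate init failed: $istatus" unless $inflate;

my $result = '';
my $offset = 0;
while ($offset < length $gz) {
    my $chunk = substr($gz, $offset, 16);   # arbitrary chunk size
    $offset += 16;

    my $out = '';
    my $status = $inflate->inflate($chunk, $out);
    die "inflate failed: $status"
        if $status != Z_OK && $status != Z_STREAM_END;

    $result .= $out;
    last if $status == Z_STREAM_END;        # end of the gzip stream
}

print "compressed: ", length($gz), " bytes, restored: ", length($result), " bytes\n";
print $result eq $plain ? "round trip ok\n" : "round trip MISMATCH\n";
```

In a real bot the loop body would live in whatever callback hands you each chunk of the response.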
You may also be interested in the Accept-Encoding request header: sending Accept-Encoding: identity asks an HTTP-compliant server not to gzip the response at all. I must advise, though, that for a bot like the one you are writing this is a poor use of your connection, because transferring uncompressed data lowers throughput and raises latency. I also suggest you research the HEAD method, which returns only the headers, to further reduce your bot's burden on the server and make more economical use of your link and your processor.
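
A rough sketch of both ideas with HTTP::Tiny, which ships with recent perls. The helper names request_options and fetch_if_small and the size cap are hypothetical, made up for illustration, and the URL is whatever you pass on the command line. Treat it as a starting point under those assumptions, not a finished client.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: request options asking either for gzip or for an
# unencoded ("identity") body.
sub request_options {
    my ($want_gzip) = @_;
    return {
        headers => { 'Accept-Encoding' => $want_gzip ? 'gzip' : 'identity' },
    };
}

# Hypothetical helper: send HEAD first and only GET the body when the
# resource exists and is not larger than $max_bytes.
sub fetch_if_small {
    my ($url, $want_gzip, $max_bytes) = @_;
    my $http = HTTP::Tiny->new( timeout => 10 );

    my $head = $http->head($url);
    return unless $head->{success};

    # HTTP::Tiny lower-cases response header names.
    my $len = $head->{headers}{'content-length'};
    return if defined $len && $len > $max_bytes;

    my $res = $http->get($url, request_options($want_gzip));
    return unless $res->{success};

    # HTTP::Tiny does not decompress for you: a gzip body still has to
    # go through an inflate step before use.
    my $encoding = $res->{headers}{'content-encoding'} // 'identity';
    return ($res->{content}, $encoding);
}

# Usage: perl sketch.pl URL   (does nothing without an argument)
if (@ARGV) {
    my ($body, $encoding) = fetch_if_small($ARGV[0], 1, 1_000_000);
    if (defined $body) {
        print "got ", length($body), " bytes, content-encoding: $encoding\n";
    }
    else {
        print "skipped or failed\n";
    }
}
```

Not every server reports Content-Length on HEAD, so the size check only helps when the header is present.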

