Hi Monks-

I'm seeing a strange issue with IO::Uncompress::Gunzip on Windows that I can't explain. The same code runs fine under Unix, and I'd like to understand why.

The code below takes about 20 seconds to unzip a 65M gzipped file on my Windows XP box.

If I comment out the first gunzip line and uncomment the second gunzip line, it takes hours on the Windows box (6 hours the last time I bothered to try). Any thoughts?

Thanks

-Craig

#!/opt/exp/bin/perl5.8
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

my $ifile = shift || die "Missing input file name\n";
open(IFILE, "<$ifile") || die "Can't open $ifile";
binmode(IFILE);
my $inputstr = join('', <IFILE>);
print STDERR scalar(localtime), " - STARTING UNZIP!!!\n";

# This is fast on all platforms
gunzip \$inputstr => "clearfile" or die "gzip failed: $GunzipError\n";

my $cleartxt;
# This takes forever on windows, no problem on Unix
#gunzip \$inputstr => \$cleartxt or die "gzip failed: $GunzipError\n";

print STDERR scalar(localtime), " - UNZIPPED COMPLETE!!!\n";

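For anyone who needs the clear text in memory, here is a minimal sketch of a third variant, assuming the slowdown comes from one large output scalar being grown repeatedly. It reads fixed-size blocks through the object interface of IO::Uncompress::Gunzip, collects them in an array, and joins them once at the end. The helper name gunzip_in_chunks, the 64 KB block size, and the sample data are illustrative only and not from the original code; whether this actually avoids the problem on Windows is untested here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper: decompress an in-memory gzip string block by block.
# Each block lands in a small buffer; the big string is built once by join.
sub gunzip_in_chunks {
    my ($gz_ref, $block_size) = @_;
    $block_size ||= 64 * 1024;    # assumed block size, tune as needed

    my $z = IO::Uncompress::Gunzip->new($gz_ref)
        or die "gunzip failed: $GunzipError\n";

    my @chunks;
    my ($buf, $status);
    # read() returns the number of bytes read, 0 at EOF, negative on error
    while (($status = $z->read($buf, $block_size)) > 0) {
        push @chunks, $buf;
    }
    die "gunzip failed: $GunzipError\n" if $status < 0;
    $z->close();

    return join '', @chunks;
}

# Self-contained demonstration with generated sample data
my $original = join '', map { "line $_ of sample data\n" } 1 .. 50_000;
my $gz;
gzip \$original => \$gz or die "gzip failed: $GzipError\n";

my $cleartxt = gunzip_in_chunks(\$gz);
print STDERR "round trip ", ($cleartxt eq $original ? "ok" : "MISMATCH"), "\n";
```

In the script above, this would replace the commented-out gunzip line with something like my $cleartxt = gunzip_in_chunks(\$inputstr);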
UPDATE1:

Here is some profile data I got from running both cases on a smaller gzipped input file (4.8M)...

Profile Data for FAST Method

Total Elapsed Time = 3.667486 Seconds
  User+System Time = 1.200486 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 37.2   0.447  0.447   7281   0.0001 0.0001  Compress::Raw::Zlib::inflateStream::inflate
 19.8   0.238  1.296      1   0.2377 1.2963  IO::Uncompress::Base::_rd2
 14.9   0.179  0.200   7293   0.0000 0.0000  IO::Uncompress::Base::smartRead
 8.66   0.104  0.979   7281   0.0000 0.0001  IO::Uncompress::Base::_raw_read
 6.25   0.075  1.059   7283   0.0000 0.0001  IO::Uncompress::Base::read
 5.66   0.068  0.068   7281   0.0000 0.0000  IO::Uncompress::Base::pushBack
 4.66   0.056  0.056  14575   0.0000 0.0000  IO::Uncompress::Base::saveStatus
 3.25   0.039  0.039  14562   0.0000 0.0000  U64::add
 3.08   0.037  0.496   7281   0.0000 0.0001  IO::Uncompress::Adapter::Inflate::uncompr
 1.92   0.023  0.224   7281   0.0000 0.0000  IO::Uncompress::Base::readBlock
 1.33   0.016  0.016      1   0.0160 0.0160  IO::bootstrap
 1.33   0.016  0.076      4   0.0040 0.0191  main::BEGIN
 1.33   0.016  0.060      9   0.0018 0.0067  IO::Uncompress::Gunzip::BEGIN
 1.25   0.015  0.015     10   0.0015 0.0015  strict::unimport
 1.25   0.015  0.030     37   0.0004 0.0008  IO::Compress::Base::Common::BEGIN

Profile Data for SLOW Method

Total Elapsed Time = 154.4478 Seconds
  User+System Time = 154.2188 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 98.6   152.1 152.14   7286   0.0209 0.0209  Compress::Raw::Zlib::inflateStream::inflate
 0.34   0.519  0.573   7298   0.0001 0.0001  IO::Uncompress::Base::smartRead
 0.28   0.438 154.04   7286   0.0001 0.0211  IO::Uncompress::Base::_raw_read
 0.18   0.283 152.49   7286   0.0000 0.0209  IO::Uncompress::Adapter::Inflate::uncompr
 0.18   0.275 154.31   7288   0.0000 0.0212  IO::Uncompress::Base::read
 0.10   0.158  0.158  14585   0.0000 0.0000  IO::Uncompress::Base::saveStatus
 0.10   0.151  0.151   7286   0.0000 0.0000  IO::Uncompress::Base::pushBack
 0.09   0.134  0.134   7288   0.0000 0.0000  IO::Uncompress::Base::smartEof
 0.08   0.128  0.128  14572   0.0000 0.0000  U64::add
 0.04   0.060  0.060  21864   0.0000 0.0000  Compress::Raw::Zlib::__ANON__
 0.02   0.025 154.34      1   0.0248 154.34  IO::Uncompress::Base::_rd2
 0.01   0.016  0.016      4   0.0040 0.0040  Compress::Raw::Zlib::constant
 0.01   0.016  0.016      8   0.0020 0.0020  Exporter::export
 0.01   0.016  0.016      7   0.0023 0.0023  IO::File::BEGIN
 0.01   0.016  0.016     26   0.0006 0.0006  Compress::Raw::Zlib::BEGIN

UPDATE2:

I believe Corion and BrowserUk are correct about this issue.

Watching page faults with Process Explorer, here is what I see:

~26,000,000 page faults - Slow Method
~6,000 page faults - Fast Method

UPDATE3:

I believe this talks about the root cause:
https://groups.google.com/forum/?fromgroups#!topic/perl.perl5.porters/44PTHwefYUk
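To compare platforms outside of the gunzip code, a small timing sketch like the one below could be run on both the Windows and Unix boxes. It assumes, without confirmation from the profile data above, that the cost lies in growing one large scalar by many small appends. The function name time_append_growth and the sizes used are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical probe: grow a scalar to at least $total_bytes by appending
# $block_size bytes at a time, and report how long that took.
sub time_append_growth {
    my ($total_bytes, $block_size) = @_;
    my $block = 'x' x $block_size;
    my $grown = '';
    my $start = time();
    $grown .= $block while length($grown) < $total_bytes;
    return (time() - $start, length $grown);
}

# 8 MB in 16 KB steps; raise the total to approach the size of the real data
my ($secs, $len) = time_append_growth(8 * 1024 * 1024, 16 * 1024);
printf STDERR "grew scalar to %d bytes in %.3f s\n", $len, $secs;
```

A large gap between the two platforms for the same sizes would be consistent with the page fault counts above, though it would not by itself prove the cause.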


In reply to IO::Uncompress::Gunzip to scalar takes hours (on windows) by cmv
