Hi Monks-

I'm seeing a strange issue when using IO::Uncompress::Gunzip on Windows that I can't explain. The same code runs fine under Unix. I'd like to understand why.

The code below takes about 20 seconds to gunzip a 65M gzipped file on my Windows XP box.

If I comment out the first gunzip line and uncomment the second one, it takes hours on the Windows box (6 hours the last time I bothered to try). Any thoughts?

Thanks

-Craig

#!/opt/exp/bin/perl5.8
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

my $ifile = shift || die "Missing input file name\n";
open(IFILE, "<$ifile") || die "Can't open $ifile";
binmode(IFILE);
my $inputstr = join('', <IFILE>);
print STDERR scalar(localtime), " - STARTING UNZIP!!!\n";

# This is fast on all platforms
gunzip \$inputstr => "clearfile" or die "gzip failed: $GunzipError\n";

my $cleartxt;
# This takes forever on windows, no problem on Unix
#gunzip \$inputstr => \$cleartxt or die "gzip failed: $GunzipError\n";

print STDERR scalar(localtime), " - UNZIPPED COMPLETE!!!\n";

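To compare the two variants side by side without editing comments between runs, a small timing harness along these lines might help. It is only a sketch: the sample data is generated in memory rather than read from the 65M file, and the helper name time_gunzip and the temporary file are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
use File::Temp             qw(tempfile);
use Time::HiRes            qw(time);

# Build a gzipped buffer in memory so the script needs no input file.
# (Generated sample data; the original post used a real 65M .gz file.)
my $plain = join '', map { "line $_ of sample text\n" } 1 .. 50_000;
my $inputstr;
gzip \$plain => \$inputstr or die "gzip failed: $GzipError\n";

# Hypothetical helper: run one gunzip variant and return
# (elapsed seconds, number of bytes produced).
sub time_gunzip {
    my ($target) = @_;    # a file name or a scalar reference
    my $start = time;
    gunzip \$inputstr => $target or die "gunzip failed: $GunzipError\n";
    my $elapsed = time - $start;
    my $length  = ref($target) ? length($$target) : -s $target;
    return ($elapsed, $length);
}

# Output target 1: a file (the variant reported as fast everywhere).
my ($fh, $clearfile) = tempfile(UNLINK => 1);
close $fh;                # gunzip reopens the file by name
my ($file_secs, $file_len) = time_gunzip($clearfile);

# Output target 2: a scalar (the variant reported as slow on Windows).
my $cleartxt;
my ($scalar_secs, $scalar_len) = time_gunzip(\$cleartxt);

printf "to file:   %.3f s, %d bytes\n", $file_secs,   $file_len;
printf "to scalar: %.3f s, %d bytes\n", $scalar_secs, $scalar_len;
```

With input this small the difference may not show up at all; raising the line count in the generator is the obvious knob to turn.
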
UPDATE1:

Here is some profile data I got from running both cases on a smaller gzipped input file (4.8M)...

Profile Data for FAST Method

Total Elapsed Time = 3.667486 Seconds
  User+System Time = 1.200486 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 37.2   0.447  0.447   7281   0.0001 0.0001  Compress::Raw::Zlib::inflateStream::inflate
 19.8   0.238  1.296      1   0.2377 1.2963  IO::Uncompress::Base::_rd2
 14.9   0.179  0.200   7293   0.0000 0.0000  IO::Uncompress::Base::smartRead
 8.66   0.104  0.979   7281   0.0000 0.0001  IO::Uncompress::Base::_raw_read
 6.25   0.075  1.059   7283   0.0000 0.0001  IO::Uncompress::Base::read
 5.66   0.068  0.068   7281   0.0000 0.0000  IO::Uncompress::Base::pushBack
 4.66   0.056  0.056  14575   0.0000 0.0000  IO::Uncompress::Base::saveStatus
 3.25   0.039  0.039  14562   0.0000 0.0000  U64::add
 3.08   0.037  0.496   7281   0.0000 0.0001  IO::Uncompress::Adapter::Inflate::uncompr
 1.92   0.023  0.224   7281   0.0000 0.0000  IO::Uncompress::Base::readBlock
 1.33   0.016  0.016      1   0.0160 0.0160  IO::bootstrap
 1.33   0.016  0.076      4   0.0040 0.0191  main::BEGIN
 1.33   0.016  0.060      9   0.0018 0.0067  IO::Uncompress::Gunzip::BEGIN
 1.25   0.015  0.015     10   0.0015 0.0015  strict::unimport
 1.25   0.015  0.030     37   0.0004 0.0008  IO::Compress::Base::Common::BEGIN

Profile Data for SLOW Method

Total Elapsed Time = 154.4478 Seconds
  User+System Time = 154.2188 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 98.6   152.1 152.14   7286   0.0209 0.0209  Compress::Raw::Zlib::inflateStream::inflate
 0.34   0.519  0.573   7298   0.0001 0.0001  IO::Uncompress::Base::smartRead
 0.28   0.438 154.04   7286   0.0001 0.0211  IO::Uncompress::Base::_raw_read
 0.18   0.283 152.49   7286   0.0000 0.0209  IO::Uncompress::Adapter::Inflate::uncompr
 0.18   0.275 154.31   7288   0.0000 0.0212  IO::Uncompress::Base::read
 0.10   0.158  0.158  14585   0.0000 0.0000  IO::Uncompress::Base::saveStatus
 0.10   0.151  0.151   7286   0.0000 0.0000  IO::Uncompress::Base::pushBack
 0.09   0.134  0.134   7288   0.0000 0.0000  IO::Uncompress::Base::smartEof
 0.08   0.128  0.128  14572   0.0000 0.0000  U64::add
 0.04   0.060  0.060  21864   0.0000 0.0000  Compress::Raw::Zlib::__ANON__
 0.02   0.025 154.34      1   0.0248 154.34  IO::Uncompress::Base::_rd2
 0.01   0.016  0.016      4   0.0040 0.0040  Compress::Raw::Zlib::constant
 0.01   0.016  0.016      8   0.0020 0.0020  Exporter::export
 0.01   0.016  0.016      7   0.0023 0.0023  IO::File::BEGIN
 0.01   0.016  0.016     26   0.0006 0.0006  Compress::Raw::Zlib::BEGIN
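
Both runs make nearly the same number of calls to the zlib inflate routine, so the extra time is spent inside each call rather than in extra calls. A quick way to put a number on that is to divide the exclusive seconds by the call count for each run. The figures in this sketch are copied from the two profiles; the hash layout and the helper name ms_per_call are just for illustration.

```perl
use strict;
use warnings;

# Exclusive seconds and call counts for
# Compress::Raw::Zlib::inflateStream::inflate, copied from the two profiles.
my %inflate = (
    fast => { excl_sec => 0.447, calls => 7281 },
    slow => { excl_sec => 152.1, calls => 7286 },
);

# Hypothetical helper: average exclusive time per call, in milliseconds.
sub ms_per_call {
    my ($row) = @_;
    return 1000 * $row->{excl_sec} / $row->{calls};
}

my $fast_ms = ms_per_call($inflate{fast});
my $slow_ms = ms_per_call($inflate{slow});
my $ratio   = $slow_ms / $fast_ms;

printf "fast: %.3f ms per inflate call\n", $fast_ms;
printf "slow: %.3f ms per inflate call\n", $slow_ms;
printf "slowdown per call: about %.0fx\n", $ratio;
```

That works out to roughly 0.06 ms per call in the fast case against roughly 21 ms per call in the slow case.
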

UPDATE2:

I believe Corion and BrowserUk are correct about this issue.

Watching page faults with Process Explorer, here is what I see:

~26,000,000 Pagefaults - Slow Method
~6,000 Pagefaults - Fast Method

UPDATE3:

I believe this thread discusses the root cause:
https://groups.google.com/forum/?fromgroups#!topic/perl.perl5.porters/44PTHwefYUk
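
Since gunzip to a file stays fast on Windows, one possible workaround is to gunzip to a temporary file and then read that file back into a scalar with a single sized read. This is a sketch only: the helper name gunzip_via_tempfile is made up for illustration, the self-check uses generated data, and it assumes (without having been tried on Windows XP) that reading the file back does not hit the same slowdown.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
use File::Temp             qw(tempfile);

# Hypothetical helper: gunzip an in-memory buffer by way of a temporary
# file, then read the whole file back into a scalar in one call.
sub gunzip_via_tempfile {
    my ($gz_ref) = @_;

    my ($fh, $tmpname) = tempfile(UNLINK => 1);
    close $fh;                          # gunzip reopens the file by name

    gunzip $gz_ref => $tmpname
        or die "gunzip failed: $GunzipError\n";

    my $size  = -s $tmpname;
    my $clear = '';
    return $clear unless $size;         # empty output, nothing to read

    open my $in, '<', $tmpname or die "Can't open $tmpname: $!";
    binmode $in;
    my $got = read($in, $clear, $size); # one read of the known size
    close $in;
    die "Short read from $tmpname\n" unless defined $got && $got == $size;

    return $clear;
}

# Small self-check with generated data (not the 65M file from the post).
my $plain = "hello, monks\n" x 10_000;
my $gz;
gzip \$plain => \$gz or die "gzip failed: $GzipError\n";

my $cleartxt = gunzip_via_tempfile(\$gz);
print length($cleartxt), " bytes recovered\n";
```

The cost is an extra round trip through the disk, which seems a fair trade if the alternative takes hours.
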


In reply to IO::Uncompress::Gunzip to scalar takes hours (on windows) by cmv
