PerlMonks  

Re^2: Why is Windows 100 times slower than Linux when growing a large scalar?

by syphilis (Archbishop)
on Dec 01, 2009 at 05:12 UTC ( [id://810307]=note )


in reply to Re: Why is Windows 100 times slower than Linux when growing a large scalar?
in thread Why is Windows 100 times slower than Linux when growing a large scalar?

Perl has more than one allocation scheme

I've just built perl (with mingw) using perl's malloc and that takes care of the problem.
However, according to comments in the makefile.mk, if you use perl's malloc you have to build without USE_IMP_SYS (which I also did). This means that the perl that has been built with perl's malloc has no threads or fork emulation - which would be unsatisfactory for many people. It also means that the ppm packages available from the various repos are unusable with this build of perl.

It was perl-5.11.2 that I built to check this out, having first established that perl-5.11.2 exhibits the crap behaviour when built with "normal" options (and it does).

Cheers,
Rob
Re^3: Why is Windows 100 times slower than Linux when growing a large scalar?
by BrowserUk (Patriarch) on Dec 01, 2009 at 05:38 UTC

    The problem seems to be pretty definitely sourced within the MS CRT.

    If I compile and run this using gcc under Ubuntu running inside a VirtualBox emulator (which ought to carry some overheads):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main( int argc, char** argv ) {
        long long i, n = 1000000;
        char *p = (char*)malloc( 1000 );
        time_t start, finish;
        double elapsed;

        time( &start );
        for( i = 2; i < n; ++i ) {
            // printf( "\r %lld\t", i * 1000 );
            if( ! ( p = (char*)realloc( p, 1000 * i ) ) ) {
                printf( "\nFailed to realloc to %lld\n", i * 1000 );
                exit( 1 );
            }
        }
        time( &finish );
        elapsed = difftime( finish, start );
        printf( "\nfinal size: %lld; took %.3f seconds\n", n * 1000, elapsed );
        exit( 0 );
    }

    It takes less than a second to realloc a buffer to 1GB in 1000 byte increments:

    mehere@mehere-desktop:~$ gcc mem.c -o memtest
    mehere@mehere-desktop:~$ ./memtest
    final size: 1000000000; took 0.000 seconds

    However, if I compile and run this using MS VC++:

    #include <stdio.h>
    #include <stdlib.h>   /* malloc, realloc, exit */
    #include <time.h>

    int main( int argc, char** argv ) {
        __int64 i, n = 1000000;
        char *p = (char*)malloc( 1000 );
        time_t start, finish;
        double elapsed;

        time( &start );
        for( i = 2; i < n; ++i ) {
            // printf( "\r %I64d\t", i * 1000 );
            if( ! ( p = (char*)realloc( p, i * 1000 ) ) ) {
                printf( "\nFailed to realloc to %I64d\n", i * 1000 );
                exit( 1 );
            }
        }
        time( &finish );
        elapsed = difftime( finish, start );
        printf( "\nfinal size: %I64d; took %.3f seconds\n", n * 1000, elapsed );
        exit( 0 );
    }

    it takes over an hour to run. I haven't had the patience to let it complete yet!
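Both programs time the loop with time(), which on typical implementations counts whole seconds, so a sub-second run shows up as 0.000. The following is a minimal sketch of the same loop timed at finer resolution. It is not part of the original test: it assumes a C11 compiler that provides timespec_get(), and the names now_seconds and time_stepwise_realloc are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: current wall-clock time in seconds, as a double.
   timespec_get() is C11; older compilers may not provide it. */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Grow one buffer to steps * step bytes, step bytes at a time.
   Returns the elapsed seconds, or -1.0 if an allocation fails. */
static double time_stepwise_realloc(size_t steps, size_t step)
{
    assert(steps > 0 && step > 0);

    char *p = malloc(step);
    if (!p)
        return -1.0;

    double start = now_seconds();
    for (size_t i = 2; i <= steps; ++i) {
        /* Keep the old pointer until realloc() succeeds, so it can
           still be freed on failure. */
        char *q = realloc(p, i * step);
        if (!q) {
            free(p);
            return -1.0;
        }
        p = q;
    }
    double elapsed = now_seconds() - start;

    free(p);
    return elapsed;
}
```

Calling time_stepwise_realloc( 1000000, 1000 ) from main() and printing the result with %.3f would correspond to the 1GB test above; smaller values of steps make it practical to compare platforms without waiting an hour.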

    I also compiled it with MinGW (which also appears to use the MSCRT?), and it has taken 1hr 20 mins so far and appears to be only 1/4 of the way there.

    The problem seems to lie with the CRT realloc(), which grows the heap in itty-bitty chunks each time.

    In addition, it may be that:

    • it is walking the heap attempting to coalesce free space prior to allocating extra virtual memory;
    • the virtual memory allocator zeros new commits prior to copying the existing data over into the reallocated space.
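One way to probe these guesses is to count how often realloc() hands back a different address while the buffer grows. This is only a rough diagnostic sketch under my own assumptions: the function name count_realloc_moves is hypothetical, and a changed address does not by itself prove a byte-by-byte copy, since an allocator may be able to remap pages instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical diagnostic: grow a buffer in fixed steps and count how many
   times realloc() returns a different address, i.e. had to move the block.
   Returns the move count, or -1 on allocation failure. */
static long count_realloc_moves(size_t steps, size_t step)
{
    assert(steps > 0 && step > 0);

    char *p = malloc(step);
    if (!p)
        return -1;

    long moves = 0;
    for (size_t i = 2; i <= steps; ++i) {
        /* Save the address as an integer first: the old pointer value
           should not be inspected once realloc() has succeeded. */
        uintptr_t before = (uintptr_t)p;

        char *q = realloc(p, i * step);
        if (!q) {
            free(p);
            return -1;
        }
        if ((uintptr_t)q != before)
            ++moves;
        p = q;
    }

    free(p);
    return moves;
}
```

If the count comes out close to the number of steps on one platform and much lower on another, that would point at relocation on every call rather than at the cost of any single call.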

    Bottom line: The MSCRT heap management routines are crap!
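Whatever the cause inside the CRT, a program that controls its own buffer can sidestep it by asking realloc() for more memory far less often. Below is a minimal sketch of the usual capacity-doubling approach. It is not taken from perl's sources or from the test programs above; GrowBuf, growbuf_append, growbuf_free and the 1024-byte starting capacity are hypothetical choices for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; initialise with GrowBuf b = {0}. */
typedef struct {
    char  *data;
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes allocated */
} GrowBuf;

/* Append n bytes, doubling the capacity whenever it runs out.
   Returns 0 on success, -1 on failure (buffer left unchanged). */
static int growbuf_append(GrowBuf *b, const char *src, size_t n)
{
    assert(b != NULL && (src != NULL || n == 0));

    if (n > (size_t)-1 - b->len)
        return -1;                      /* len + n would overflow */

    size_t need = b->len + n;
    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 1024;   /* assumed starting size */
        while (cap < need) {
            if (cap > (size_t)-1 / 2) { /* doubling would overflow */
                cap = need;
                break;
            }
            cap *= 2;
        }
        /* realloc(NULL, cap) behaves like malloc(cap) on first use. */
        char *q = realloc(b->data, cap);
        if (!q)
            return -1;
        b->data = q;
        b->cap  = cap;
    }

    if (n)
        memcpy(b->data + b->len, src, n);
    b->len = need;
    return 0;
}

/* Release the buffer and reset it to the empty state. */
static void growbuf_free(GrowBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

With doubling, the number of realloc() calls grows with the logarithm of the final size rather than linearly with it, at the price of up to half the allocation sitting unused.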


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Is the Perl inside the virtual box also threaded?

      Obviously, that makes no sense. I sleep now.

        Yes, it is a threaded perl under Linux, and threads work fine there.

        A perhaps more interesting question: if perl is built on Windows without USE_IMP_SYS so that USE_PERL_MALLOC can be enabled, does that stop you from using threads, or just fork?

        Whilst I would miss the piped open, which I believe requires the fork emulation, I could work around it using threads. And I would infinitely prefer threads + faster memory to the fork emulation.

        Does anyone know the answer or should I just suck it and see?


