Re^2: Why is Windows 100 times slower than Linux when growing a large scalar?

Re^3: Why is Windows 100 times slower than Linux when growing a large scalar? by BrowserUk (Patriarch) on Dec 01, 2009 at 05:38 UTC
The problem seems to be pretty definitely sourced within the MS CRT. If I compile and run this using gcc under Ubuntu running inside a VirtualBox emulator (which ought to carry some overheads):

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main( int argc, char** argv ) {
    long long i, n = 1000000;
    char *p = (char*)malloc( 1000 );
    time_t start, finish;
    double elapsed;

    time( &start );
    /* Grow the buffer in 1000-byte steps. */
    for( i = 2; i < n; ++i ) {
//        printf( "\r %lld\t", i * 1000 );
        if( ! ( p = (char*)realloc( p, 1000 * i ) ) ) {
            printf( "\nFailed to realloc to %lld\n", i * 1000 );
            exit( 1 );
        }
    }
    time( &finish );

    elapsed = difftime( finish, start );
    printf( "\nfinal size: %lld; took %.3f seconds\n", n * 1000, elapsed );
    exit( 0 );
}
```

It takes less than a second to realloc a buffer to 1GB in 1000-byte increments:

```
mehere@mehere-desktop:~$ gcc mem.c -o memtest
mehere@mehere-desktop:~$ ./memtest

final size: 1000000000; took 0.000 seconds
```

However, if I compile and run this using MS VC++:

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main( int argc, char** argv ) {
    __int64 i, n = 1000000;
    char *p = (char*)malloc( 1000 );
    time_t start, finish;
    double elapsed;

    time( &start );
    /* Grow the buffer in 1000-byte steps. */
    for( i = 2; i < n; ++i ) {
//        printf( "\r %I64d\t", i * 1000 );
        if( ! ( p = (char*)realloc( p, i * 1000 ) ) ) {
            printf( "\nFailed to realloc to %I64d\n", i * 1000 );
            exit( 1 );
        }
    }
    time( &finish );

    elapsed = difftime( finish, start );
    printf( "\nfinal size: %I64d; took %.3f seconds\n", n * 1000, elapsed );
    exit( 0 );
}
```

it takes over an hour to run. I haven't had the patience to let it complete yet!

I also compiled it with MinGW (which appears to also use the MSCRT?), and it has taken 1hr 20 mins so far and appears to be only 1/4 of the way there.

The problem seems to lie with the CRT realloc(), which grows the heap in itty-bitty chunks each time. In addition, it may be walking the heap attempting to coalesce free space prior to allocating extra virtual memory. The virtual memory allocator zeros new commits prior to copying over the existing data into the reallocated space.

Bottom line: The MSCRT heap management routines are crap!

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP PCW It is as I've been saying! (Audio until 20090817)
Re^4: Why is Windows 100 times slower than Linux when growing a large scalar? by ikegami (Patriarch) on Dec 01, 2009 at 07:25 UTC
~~Is the Perl inside the virtual box also threaded?~~ Obviously, that makes no sense. I sleep now.
Re^5: Why is Windows 100 times slower than Linux when growing a large scalar? by BrowserUk (Patriarch) on Dec 01, 2009 at 08:32 UTC
Yes, it is a threaded perl under Linux, and threads work fine there.

A perhaps more interesting question: if perl is built on Windows without USE_IMP_SYS so that USE_PERL_MALLOC can be enabled, does that stop you from using threads? Or just fork?

Whilst I would miss the piped open, which I believe requires the fork emulation, I could work around it using threads. And I would infinitely prefer threads + faster memory to the fork emulation.

Does anyone know the answer or should I just suck it and see?
Re^6: Why is Windows 100 times slower than Linux when growing a large scalar? by syphilis (Archbishop) on Dec 01, 2009 at 12:52 UTC
Re^7: Why is Windows 100 times slower than Linux when growing a large scalar? by BrowserUk (Patriarch) on Dec 01, 2009 at 13:07 UTC

