
Re^2: Huge perl binary file

by perl-diddler (Hermit)
on Jul 13, 2012 at 03:06 UTC


in reply to Re: Huge perl binary file
in thread Huge perl binary file

The startup time on today's systems is measurable in milliseconds. It's not necessarily true that a statically linked program will load faster -- especially as processors are getting faster at a much higher rate than disk transfer speeds.

If the binary you are loading has many of its libraries already in memory, and the difference in what needs to be loaded is large enough, the dynamically linked version may load significantly faster than the statically linked version.

A prime example -- perl. If you already have perl running in memory, the main perl lib is already in memory, so loading that 16KB segment and dynamically linking goes much faster than reloading 1.5MB of static code. Even a 100MB/s disk will take 15ms to read that static code. The dynamic linking of things already in memory could easily take <1ms... Even if the libraries aren't in active memory, if they are frequently used, there is often a large disk cache on Linux systems, so the file is likely already in memory... again, moving around in memory is something more on the order of microseconds than milliseconds.
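For anyone who wants to redo the arithmetic, here is a minimal sketch. It assumes decimal megabytes and a disk that sustains its rated transfer speed, and it ignores seek latency and caching entirely; `read_time_ms` is just a name made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Milliseconds needed to read $bytes at a sustained $bytes_per_sec.
# Assumption: no seek latency, no cache hits, no read-ahead effects.
sub read_time_ms {
    my ($bytes, $bytes_per_sec) = @_;
    return 1000 * $bytes / $bytes_per_sec;
}

my $MB = 1_000_000;    # decimal megabytes, matching the rough figures above

printf "1.5MB static image: %.2f ms\n", read_time_ms(1.5 * $MB, 100 * $MB);
printf "16KB segment:       %.2f ms\n", read_time_ms(16_000,    100 * $MB);
```

The first line works out to the 15ms quoted above; the 16KB case comes in well under a millisecond on the same assumptions.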


Re^3: Huge perl binary file
by MisterBark (Novice) on Jul 13, 2012 at 03:32 UTC
    Ok great advice :)
    Now, how could my perl have been compiled with static libs, since I've never forced static at compilation?
     
    The doc says it will be dynamic unless we force it static, or the system doesn't support it...
    How could a Linux system not support dynamic libraries?
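One way to see what the build actually chose is to ask the Config module. This is only a sketch: `libperl_linkage` is a helper name invented here, and it assumes `useshrplib` holds the string 'true' for a shared libperl, which is what Configure normally writes.

```perl
use strict;
use warnings;
use Config;

# Describe how libperl was linked, given a Config-like hash reference.
# Assumption: useshrplib is 'true' for a shared libperl, anything else
# is treated as static.
sub libperl_linkage {
    my ($cfg) = @_;
    my $shared = ($cfg->{useshrplib} // '') eq 'true';
    my $lib    = $cfg->{libperl} // 'unknown';
    return sprintf "%s (%s)", ($shared ? 'shared' : 'static'), $lib;
}

print "libperl:         ", libperl_linkage(\%Config), "\n";
print "dynamic loading: ", ($Config{usedl} // 'undef'), "\n";
```

The same information is available from the shell with `perl -V:useshrplib`. Note that a static libperl does not mean the system lacks dynamic library support; it only reflects how that particular perl was configured.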
Re^3: Huge perl binary file
by mbethke (Hermit) on Jul 13, 2012 at 03:40 UTC
    The startup time on today's systems is measurable in milliseconds. It's not necessarily true that a statically linked program will load faster -- especially as processors are getting faster at a much higher rate than disk transfer speeds.
    Not necessarily, no. But usually.
    A prime example -- perl. If you already have perl running in memory, the main perl lib is already in memory, so loading that 16KB segment and dynamically linking goes much faster than reloading 1.5MB of static code. Even a 100MB/s disk will take 15ms to read that static code. The dynamic linking of things already in memory could easily take <1ms... Even if the libraries aren't in active memory, if they are frequently used, there is often a large disk cache on Linux systems, so the file is likely already in memory... again, moving around in memory is something more on the order of microseconds than milliseconds.
    No. If you have a process of the static binary already running, it will not be loaded again from disk but the same physical memory will simply be mapped to a new virtual address space for the new process. That's the time to set up a few MMU tables and you're ready. If a dynamic binary is already running, the new copy is unlikely to hit the disk for loading anything either but the linking still has to be done. Certainly much faster than waiting for the disk but still more work than for the static version. It could potentially be faster if the program shares large parts with other programs that are already running but itself hasn't been run before.
      So the best would be to always have a sleeping perl process with the most frequently used modules loaded with use? :)
      #!/usr/bin/perl
      use ....;
      use ....;
      use ....;
      while (1) { sleep(60); }
        That's basically what these "office quickstarter" thingies do, but it's not a good idea generally. If you have enough RAM, the stuff you want to load quickly will likely be in the buffer cache anyway. If you don't, forcing it to stay resident will just slow down other things.
        That depends on what the module is...(and what you mean by "best")....

        In your example above, you 'use' all the modules, which reads them in initially, but then you sleep in a while loop, never touching those modules.

        Perl code isn't like a binary module, since it can be executed, modified, and executed again.

        It is possible, but I very much doubt that all of the read-only text parts of a module are stored in one area where they could be mapped to a read-only, Copy-on-Write memory segment.

        I'd say your best and maybe easiest bet for keeping a copy of your modules in memory would be to create a dir in /dev/shm/. (I once needed a space to store some tmp info that I could examine. Later the shm usage was removed and I used straight pipes, but at the time I had a master process that forked off 'n' copies of itself to do queries on a large file list in rpm.)

        I wanted to let them all dump their results in tmp files and exit when done -- the parent wouldn't have to try to multiplex the streams which would have created contention in my code with the parent and children contending for the lock.

        So instead, I created a tmpdir in /dev/shm to hold the tmp files -- so no contention... and the great thing was I could examine all of the intermediate results!

        So -- if you REALLY need to keep something in memory -- put a tmp dir in there and create a perl /lib tree with your needed modules.

        On my machine, /usr/lib/perl5 -- ALL OF IT (vendor, site, and a few archived previous releases) -- only takes up ~446M. That's less than .5G; on a modern machine, not a major dent... it depends on how important it is to keep things in memory!
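A rough sketch of that idea, copying only the modules you name rather than the whole tree. Everything here is illustrative: `mirror_loaded_modules` is a made-up helper, the target directory is whatever you pass on the command line, and it only copies the .pm file of each module (XS modules would also need their shared objects from the auto/ tree, which this does not handle).

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Copy     qw(copy);
use File::Path     qw(make_path);

# Copy the source files of already-loaded modules into $dest_root,
# preserving their relative paths (e.g. File/Copy.pm).
# Returns the number of files copied; modules not in %INC are skipped.
sub mirror_loaded_modules {
    my ($dest_root, @modules) = @_;
    my $copied = 0;
    for my $mod (@modules) {
        (my $rel = "$mod.pm") =~ s{::}{/}g;
        my $src = $INC{$rel} or next;
        my $dst = "$dest_root/$rel";
        make_path(dirname($dst));
        copy($src, $dst) or die "copy $src -> $dst failed: $!";
        $copied++;
    }
    return $copied;
}

# Only act when a target directory is given, e.g. one under /dev/shm.
if (defined(my $ram_lib = shift @ARGV)) {
    my $n = mirror_loaded_modules($ram_lib, qw(File::Copy File::Path));
    print "copied $n module(s) to $ram_lib\n";
}
```

Other scripts would then put that directory first in @INC with `use lib` or `-I`. Whether this beats the ordinary buffer cache is something you would have to measure on your own machine.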

      No. If you have a process of the static binary already running, it will not be loaded again from disk but the same physical memory will simply be mapped to a new virtual address space for the new process. That's the time to set up a few MMU tables and you're ready.
      Um...no...only the R/O sections. Programs have initialized R/W spaces that once written are gone. There'd be no reason to have those sections marked as COW, unless you already had someone sharing the page (like a forked copy). But any unrelated process likely wouldn't use a COW copy, as they'd need the pristine data as it was supposed to be when the program loaded.
        Um...no...only the R/O sections. Programs have initialized R/W spaces that once written are gone. There'd be no reason to have those sections marked as COW, unless you already had someone sharing the page (like a forked copy). But any unrelated process likely wouldn't use a COW copy, as they'd need the pristine data as it was supposed to be when the program loaded.
        Sure, any overview of virtual memory processes of this length is bound to be oversimplified in some places. But we can safely ignore this detail, as it doesn't differ between dynamically and statically linked programs.
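For anyone who would rather look than argue: on Linux, /proc/<pid>/maps lists every mapping of a process along with its permission flags, so you can see which parts of a binary are read-only and which are private and writable. The sketch below classifies those lines; `classify_mapping` and its labels are invented for this example, and it assumes the usual six-column format (address, perms, offset, dev, inode, pathname).

```perl
use strict;
use warnings;

# Classify one line of /proc/<pid>/maps (Linux format):
#   address perms offset dev inode pathname
# Returns a hash ref, or undef if the line does not look like a mapping.
sub classify_mapping {
    my ($line) = @_;
    my ($range, $perms, $offset, $dev, $inode, $path) = split ' ', $line, 6;
    return unless defined $perms && length $perms == 4;

    my $writable    = substr($perms, 1, 1) eq 'w';
    my $private     = substr($perms, 3, 1) eq 'p';
    my $file_backed = defined $inode && $inode ne '0';

    my $kind = !$file_backed ? 'anonymous'
             : !$writable    ? 'file-backed, read-only (pages shareable)'
             : $private      ? 'file-backed, private writable (copy-on-write)'
             :                 'file-backed, shared writable';

    return { path => $path // '', perms => $perms, kind => $kind };
}

# Print the mappings of the running perl itself (Linux only).
if (open my $fh, '<', '/proc/self/maps') {
    while (my $line = <$fh>) {
        chomp $line;
        my $m = classify_mapping($line) or next;
        printf "%-5s %-46s %s\n", $m->{perms}, $m->{kind}, $m->{path};
    }
    close $fh;
}
```

Running it against a static and a dynamic perl side by side shows the same kinds of mappings in both, just spread over a different number of files.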
