
Re^2: Perl script compressor

by Anonymous Monk
on Dec 08, 2019 at 18:35 UTC ( [id://11109845] )


in reply to Re: Perl script compressor
in thread Perl script compressor

Yes, I was thinking about writing a CGI script to run on a server. If I remove spaces, the OS might be able to load the whole script with one disk read, but if it's bigger, it might take two or three. So the smaller the code, the faster it loads and the more likely it is to remain in the cache, and if it runs multiple times, the OS might not even have to load it from disk at all. That was the whole purpose of me doing this. --harangzsolt33 (I'm currently not logged in)

Replies are listed 'Best First'.
Re^3: Perl script compressor
by davido (Cardinal) on Dec 08, 2019 at 21:40 UTC

    I suggest using strace while running your script. If your script uses strict, warnings, CGI.pm, or any other modules, they also get opened and read into memory. If your Perl is configured to use sitecustomize.pl, that will be opened and read in. And if it's been read in once, with an OS like Linux there's a chance the files are hot and ready in a cache anyway. But strace will demonstrate to you that the top level script you load is not the largest component that gets read in from a file.
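
    For example, a quick way to see how many files get opened just to compile a script that pulls in CGI.pm (a rough sketch; the exact count and file names will vary by installation):

        # count the .pm files Perl opens before it even starts running the one-liner
        $ strace -f -e trace=open,openat perl -MCGI -e 1 2>&1 | grep -c '\.pm'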

    The bulk of startup time has little to do with just reading the program file in from (hopefully) an SSD. I created two Hello World scripts; one with 13368 lines, consuming 1.1 megabytes on disk, and one with seven lines, consuming 96 bytes on disk. They both start by printing "Hello world\n", and end by printing "Goodbye world\n", but in the first script there are 13361 80-column lines of comments between the two print statements. Perl must read the entire file before getting to the final Goodbye world. Here are the timings:

        $ time ./mytest.pl
        Hello world
        Goodbye world

        real    0m0.022s
        user    0m0.014s
        sys     0m0.008s

        $ time ./mytest2.pl
        Hello world
        Goodbye world

        real    0m0.008s
        user    0m0.008s
        sys     0m0.001s
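
    (The test scripts themselves aren't shown in the post; assuming mytest.pl is just a shebang, the two print statements, and the comment padding described above, something like this one-liner would generate a comparable file, with mytest2.pl being only the shebang and prints:)

        # a sketch of how a comparably bloated mytest.pl could be generated
        $ perl -e '
            print qq{#!/usr/bin/perl\nprint "Hello world\\n";\n};
            print "#", "x" x 79, "\n" for 1 .. 13361;
            print qq{print "Goodbye world\\n";\n};
          ' > mytest.pl
        $ chmod +x mytest.pl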

    A tremendous improvement: from twenty-two milliseconds down to eight. We go from running about 45 times per second to about 125, *if* the bloated script is 1 megabyte in size, and if all of that bloat (including the parts you bring in from CPAN and the core Perl libs) can be reduced to 96 bytes. What if the source script is 64 kB instead? Let's try that:

        $ time ./mytest.pl
        Hello world
        Goodbye world

        real    0m0.009s
        user    0m0.004s
        sys     0m0.005s

    So now we're talking about a 1 millisecond difference. Instead of 125 runs per second, we get 111 per second, for a much more typically sized script.

    If startup time is a problem, you won't solve it by minifying your Perl script. It's better solved by converting to a daemon process that stays resident or, if that's really impossible, by scaling out horizontally.
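
    For instance, the resident-process route might look like this under PSGI/Plack (a minimal sketch of one common option; the post doesn't prescribe a particular framework):

        # app.psgi -- compiled once by the server, then reused for every request
        use strict;
        use warnings;

        my $app = sub {
            my $env = shift;    # per-request environment hashref
            return [ 200, [ 'Content-Type' => 'text/plain' ], [ "Hello world\n" ] ];
        };

        $app;

    Start it with plackup app.psgi (or any PSGI-capable server); the script and the modules it uses are loaded a single time rather than on every hit.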


    Dave

Re^3: Perl script compressor
by marto (Cardinal) on Dec 08, 2019 at 19:32 UTC

    You want to take a step back. What measurements have you done to compare the compilation time of a script with comments and whitespace against the same code without them? If you're worried about performance, profile your code (Devel::NYTProf). You should also read How can I make my CGI script more efficient?, or just move to an approach that isn't CGI at all; CGI.pm was removed from the Perl core for good reason, and starting anything new with it is actively discouraged.
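
    A profiling run might look like this (a sketch; yourscript.pl is a placeholder, and the output locations are the NYTProf defaults):

        $ perl -d:NYTProf yourscript.pl   # writes profile data to ./nytprof.out
        $ nytprofhtml                     # generates an HTML report under ./nytprof/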

Re^3: Perl script compressor
by stevieb (Canon) on Dec 08, 2019 at 18:46 UTC

    If you're that concerned about disk reads, I'd just copy the file into shared memory space (e.g. /dev/shm) on system or web server startup, then read the script from there instead.
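
    Roughly like this (the paths are just placeholders):

        # at system or web server startup, stage the script in shared memory
        $ cp /var/www/cgi-bin/myscript.pl /dev/shm/myscript.pl
        # ...and point the web server at /dev/shm/myscript.pl instead of the on-disk copy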

    Or, use a system that only has to read the file once upon web server instantiation.

    This seems like premature optimization.

    Update: I thought some more about this. If you're unit testing your code (which you certainly should be!), you'd have to run the tests again against the automatically rewritten code, in case something is lost in translation.

    I'm all for doing things for education and learning purposes, but I don't think the risk is worth it if the sole objective is to make something a fraction of a nanosecond (obviously estimated) more efficient.
