PerlMonks
Performance penalties of in-Perl docn vs compiled CGIs.

by phirun (Novice)
on Feb 02, 2021 at 05:11 UTC ( #11127794=perlquestion )

phirun has asked for the wisdom of the Perl Monks concerning the following question:

I'm wondering what performance penalty I'm paying for including documentation in my Perl CGI scripts, using either =pod or sequential #'s. A new project has grown to the point where good docn will be essential for ongoing maintainability. An obvious solution would be to compile the Perl into e.g. C executables, but I've no experience with this, or any similar technique. Would those with experience in this area kindly care to comment for both my own enlightenment and that of other Supplicants? With thanks for any replies.

Replies are listed 'Best First'.
Re: Performance penalties of in-Perl docn vs compiled CGIs.
by GrandFather (Saint) on Feb 02, 2021 at 05:37 UTC
    "... what performance penalty I'm paying for including documentation in my Perl CGI scripts ..."

    Effectively none. Perl compiles your code to an internal form at run time at which point any comments and pod become irrelevant. There is no "Perl compiler". For websites with modest traffic you can pretty much ignore any startup overhead that is incurred by perl getting going. If you have a high traffic site it may be worth investigating mod_perl and its ilk.

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

      Effectively none. Perl compiles your code to an internal form at run time at which point any comments and pod become irrelevant

      It doesn't really add to the compile time either. Skipping over POD is computationally trivial.
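      You can see this for yourself with the core B::Deparse module, which prints back the code perl actually compiled; comments and POD are simply gone. A quick sketch (demo.pl is just a throwaway example file):

```shell
# a throwaway script containing a comment and a POD block
cat > demo.pl <<'EOF'
# this comment never reaches the compiled form
=pod

Neither does this documentation.

=cut
my %ha = ( 'ha' => 1 );
print $ha{'ha'}, "\n";
EOF

# B::Deparse prints what perl compiled: code only, no comments, no POD
perl -MO=Deparse demo.pl
```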

      Seeking work! You can reach me at ikegami@adaelis.com

      Thanks for your reply. I'm aware of the differences between compiled and interpreted languages, and that the latter CAN be compiled into an executable. Searching the Net for "Perl compiler" returns a swag of results, and I've long been aware of one or more "Perl compiler projects". Relevant info is available here: https://www.marcbilodeau.com/compiling-perl/ I'm also aware that Perl uses a two-pass interpreter, and that the first pass tokenizes the commands whilst dropping extraneous matter. I'm also aware that the milliseconds required to do this on gigahertz CPUs are all but instantaneous. My question is therefore more about ... ? aesthetics and elegance than mere time-of-execution. However, such seemingly intellectual approaches can have practical consequences further down the development track.

        I was befuddled by your talk of "C executables" into thinking you were new to the concept of compilers, interpreters and executable code; maybe you are. "Executables" are generally considered independently of the language that was used to create them, so it is unusual to talk of "C executables".

        It is not universally the case that interpreted languages can be compiled. In the general case it is not true that Perl can be compiled in the usual sense to generate an executable. There are packaging tools that pack a Perl script up, with everything it depends on plus a Perl interpreter, into an executable that is unpacked at run time, but that is not compiling. There are compilers that can compile scripts written in a subset of Perl, but those aren't Perl compilers either.

        You are right to think that raw time of execution is almost never interesting these days and that will become more true as time goes on. The more important metric is ease of maintenance and that points directly at documentation, automated testing and perhaps code coverage metrics.

        Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

        Perl5 compiles the source code into OPCODE objects, then interprets (loops over) the OPCODE objects.

        Seeking work! You can reach me at ikegami@adaelis.com

Re: Performance penalties of in-Perl docn vs compiled CGIs.
by hippo (Bishop) on Feb 02, 2021 at 09:45 UTC
    I'm wondering what performance penalty I'm paying for

    This will depend to some extent on your environment and therefore the only way to know for sure is to benchmark it.

    An obvious solution would be to compile the Perl into e.g. C executables

    I might humbly suggest that an even more obvious solution would be to enable back-end persistence via mod_perl or FCGI or some other proxy.

    The other easy option is to split the POD from the code and store it elsewhere.


    🦛

      > The other easy option is to split the POD from the code and store it elsewhere.

      Yes, that's something I've often pondered, but you end up with two versions and the ensuing hassle. Software is just a side-dabble for me, so my knowledge of real-world issues is limited, especially when a team is involved. The popularity of other interpreted languages for CGI - PHP for example - and the apparent popularity of attempted Perl compilers some years back has always made me curious as to any real-world benefits of such measures. In the modern world of huge RAM and lightning-fast CPUs there are probably none, but a knowledgeable comment or two from someone with experience in that area would slake decades of merely wondering.
Re: Performance penalties of in-Perl docn vs compiled CGIs.
by bliako (Monsignor) on Feb 02, 2021 at 11:31 UTC

    This is a good question because it touches a controversial subject which more often than not is futile. If Sisyphus were a programmer he would be dealing with these sorts of things ...

    At least in dealing with it, one can catch up with the developments in a tour-de-force of a module: B::C by Reini Urban. It provides tools to compile Perl into C and then into an executable: what you refer to, in your terminology, as "C executables", and whose subtleties were nicely expounded on by GrandFather. There is also the possibility to output some sort of "bytecode" that Perl uses internally.

    You will be able to successfully produce an executable of a simple Perl script thusly:

    # simple.pl
    my %ha = (
        'ha' => 1,
        'he' => 2,
    );
    print $ha{'ha'}."\n";
    perl -MO=C,-osimple.c simple.pl

    The C equivalent is now written into simple.c:

    gcc simple.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

    This will compile the C file (in *nix, please don't bother me with windows - sorry for the harsh way of putting it).

    perl -MExtUtils::Embed -e ccopts -e ldopts is very useful and practical. It asks perl to dump all the CFLAGS and LDFLAGS necessary to compile, with gcc, a C program with embedded perl code - for your current system, and specific to the exact perl executable you are calling. (So much better than static environment variables or post-it notes.)

    Now that was very simple! But when I tried a randomly selected perl script from my archive, one which calls other modules, it yielded a 25MB C program. I interrupted its compilation a few minutes in. But it was compiling, at least. But at what cost?

    Caveat: the motto "only Perl can parse Perl" very frequently pops up in these sorts of questions, and rightly so. Without getting into formal proofs (I don't possess the proper qualifications), consider what happens if your code evals a string constant or an external piece of code - unknown at the time of creating the C or bytecode, i.e. dynamic parts. Unless there is a perl embedded in your executable in order to parse that eval and produce a C stub which your C program can dynamically load and call (and that's a theory, as in practice you would have to pass all the context and program state to that stub), I can't see how you can get over this obstacle. I hear you saying that you don't use evals in your code. Fine, but what about the modules you are calling? BTW BrowserUK put it succinctly in Re^2: Perl Cannot Be Parsed: A Formal Proof.

    If you intend to go this way, could you please report on your findings?

    bw, bliako

      > subject which more often than not is futile.

      The benefits are too small or non-existent. This whole .plc mechanism was invented back in the days when RAM was sparse and processors slow, hence the compilation phase was a considerable factor.

      Turning a Perl process into a persistent daemon solves most, if not all, use cases nowadays. That's because IO is now the most limiting factor, and neither the compile phase nor the IO needs to be redone.

      I have trouble imagining a use case where pre-compilation would pay off, because it would still be hampered by IO.

      The only aspect where pre-compilation tends to be used is obfuscation to protect the code. But that's also mostly wishful thinking.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

      RAM-Disk? Seriously?

        because IO is now most limiting factor

        Right. Given also that for 200 lines of Perl (+ 3rd-party modules) I got back a 25MB C program; I don't dare imagine what executable size that will yield, or how long it will take.

      Hey, thanks for a most thoughtful and informative reply, bliako. Yes, I figured that I was asking the sort of "dumb question" inevitable for amateurs and dilettantes in any subject. I've always regarded Perl as a major intellectual accomplishment, and its internal complexity and inner convolutions make traditional compilation far more difficult than the simplistic assumptions of my question, if I understand you correctly.

      So be it! Once written in Perl, it stays in Perl! I can live with that. And the analogue of Sisyphus describes VERY nicely my own view of coding as a profession, which is why I went with hardware. So no, NO WAY am I intending to "go this way". Simply lack the courage.

        By all means, I did not mean to discourage you. The method to compile is so simple you may as well give it a try.

Re: Performance penalties of in-Perl docn vs compiled CGIs.
by Discipulus (Abbot) on Feb 02, 2021 at 17:12 UTC
    Hello phirun and welcome to the monastery,

    You already had nice answers and a useful thread to read; just to add my 2 cents.

    bliako is right with:

    > Caveat: the motto "only Perl can parse Perl" very frequently pops up in these sort of questions.

    But a Perl document can be parsed by PPI, and POD and comments are easily removed:

    # the entire document:
    >perl -MPPI -e "print PPI::Document->new('example.pl')"
    use strict;
    use warnings; # safety net loaded
    my %ha = ( # don't use one-letter variables
        'ha' => 1, # this stands for..
        'he' => 2, # and this other for
    );

    =pod

    =h4 documentation

    =cut

    print $ha{'ha'}."\n"; # other unuseful comment
    __END__

    # the stripped document
    >perl -MPPI -e "$doc = PPI::Document->new('example.pl'); $doc->prune('PPI::Token::Pod'); $doc->prune('PPI::Token::Comment'); print $doc->serialize"
    use strict;
    use warnings;
    my %ha = (
        'ha' => 1,
        'he' => 2,
    );
    print $ha{'ha'}."\n";
    __END__

    So you can take a big enough Perl document (1MB?) with a lot of comments and POD, set up a non-persistent webserver (with my limited experience I mean: a server which loads the content at each request) and use ab to spot differences between serving the stripped and the complete document. Have fun ;)

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

      Thanks for the welcome, Discipulus, and to all who replied. A most interesting Brotherhood I'm glad to have found since I've never discussed or debated Perl before.

      The installed docn included with the package is so valuable and easy for self-instruction that seeking outside help always struck me as rather lazy. Still, times DO arise when some back-and-forth clears up odd points and the inevitable misconceptions. Now that I've decided to move outside my comfort zone that'll be more likely in future.

      I've copied off your document for experimenting. Much appreciated.

Re: Performance penalties of in-Perl docn vs compiled CGIs.
by LanX (Sage) on Feb 02, 2021 at 11:00 UTC
    > I'm wondering what performance penalty I'm paying for including documentation in my Perl CGI scripts, using either =pod or sequential #'s.

    Well, if you included many gigabytes of data in the POD, you might experience an increased startup delay in your CGI because of a slow file system.

    BUT you can always split your POD into a .pod file.

    And of course there are ways to keep code persistent in memory and avoid reloading from the file system.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Re: Performance penalties of in-Perl docn vs compiled CGIs.
by Anonymous Monk on Feb 02, 2021 at 17:15 UTC
    "Perl compiler" is a bit misleading: the interpreter is still there and still interpreting.
        The main thing that B::C does is output the start-up state of the perl interpreter (post-compilation but before execution) as a bunch of C structs which can be compiled into an executable. This improves the start-up time of the code, but has little effect on the run-time performance - which is still the perl interpreter calling out to a C-level pp_foo() C function for each op in the perl OP tree (the OP tree now being a static structure hard-coded into the executable). There is no JIT compilation to machine code in the Java sense (unless something has changed recently in B::C).

        So really there aren't any perl compilers - in the sense of something that at some point (build time or run time) converts perl code into machine code for fast execution.

        Dave.

        Yes. In the original Apple ][, Steve Wozniak used an interpreter which he called "SWEET16," to very great effect. Precious memory-bytes were saved, and "Integer BASIC" ran as well as it could have.

Approved by GrandFather
Front-paged by Corion