http://www.perlmonks.org?node_id=11111111

xiaoyafeng has asked for the wisdom of the Perl Monks concerning the following question:

Perl uses the SV to represent every type (IV, NV, PV, IO, etc.) and can switch a value's type at any time. Most of the time this is a very handy mechanism: there are no type errors and no need for type casts. But when the performance of a program matters, the overhead becomes hard to tolerate.

Someone will ask: why not XS? Yes, XS is powerful, but it's not a silver bullet. In particular, perl still spends a good deal of time boxing and unboxing values (perl uses at least 4 words to store an int). For example, I'd like to read some strings from a CSV file, look them up in a DB, and finally write the strings retrieved from the DB into a new file. To speed this up, I use XS modules as much as possible: Text::CSV_XS for reading and writing the files, and DBI for reading from the DB. Frustratingly, the Perl program is still slow.

Why? Because perl converts the strings in the file to SVs, then converts the SVs to strings for the database, then converts the strings read from the DB to SVs, and finally converts the SVs to strings to write the file. While learning about perl internals recently, one idea has kept coming to mind: why not add a native type to perl? No refcount, no magic, no changing type, and lexical only: the native value lives only in its declaring lexical scope. Like:

    {
        my native str $bb = "hello world";   # C string
        my $cc = "hello world";              # Perl string
    }

In this way, when communicating with external libraries or programs as above, perl acts more like a transparent bridge, with no extra performance loss.

It's just a thought I had while reading perlguts. Is it a good idea or not? Monks, please enlighten me. TIA.

Replies are listed 'Best First'.
Re: Why not perl have raw/native type
by Corion (Patriarch) on Jan 07, 2020 at 09:36 UTC

    Basically since Perl is highly dynamic, you can't prevent somebody from doing:

    my $ref;
    {
        my native str $bb = 'hello world';
        $ref = \$bb;
    }
    print $$ref;

    Perl expects any variable to be at least of type SV. The closest you can get to a "native" variable is to use the PV slot of the variable and point it to your raw memory. You will still get reference counting and everything, but Perl expects all variables to support that.
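For comparison, here is a minimal sketch of the same escape with an ordinary scalar in today's Perl (the proposed `native str` syntax does not exist, so a plain `my` variable stands in for it; `$escaped` is a name made up for this illustration). Reference counting is what keeps the value alive after the block ends:

```perl
use strict;
use warnings;

my $ref;
{
    my $bb = 'hello world';   # ordinary refcounted SV, not the proposed native type
    $ref = \$bb;              # the reference escapes the enclosing block
}

# $bb is out of scope here, but its SV survives because $ref still points at it.
my $escaped = $$ref;
print "$escaped\n";
```

A native value with no reference count would have nothing to keep it alive at the `print`, which is the problem the snippet above points at.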

    Also note that by carefully looking at the code path, you can often prevent (needless) copying of data, for example by using DBI bind parameters instead of using ->fetchrow.

      > you can't prevent somebody from doing:

      Actually, all the information is available at compile time ...

      Why shouldn't Perl be able to reject it, unless the reference $ref is also typed?

      But maybe I'm missing the OP's intention?

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery | FootballPerl is like chess, only without the dice

        > Actually, all the information is available at compile time ...

        Divining that information is the topic of Escape Analysis, and it's far from trivial to find what values will remain local to a subroutine in the general case.

        If you want to put more restrictions on the "native" type, maybe you can make this easier, but that amounts to basically having a second set of data types that are not interoperable with the rest of Perl.

        How about:
        # foo() is a sub from a 3rd party library you have no control over
        sub foo { my $r = \$_[0] }
        my native str $s = "...";
        foo($s);

        Dave.
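A small sketch of why the `foo($s)` case is hard to rule out, using an ordinary scalar in place of the hypothetical native one. The elements of `@_` are aliases to the caller's variables, so any sub can quietly take a reference to its argument. Here `$kept` is an illustrative name, and `foo` is a stand-in for the third-party sub:

```perl
use strict;
use warnings;

# Stand-in for a third-party sub: it stashes a reference to its first argument.
my $kept;
sub foo { $kept = \$_[0]; return }

my $s = "original";
foo($s);

# $_[0] was an alias to $s, so $kept now points at the caller's own variable.
$$kept = "changed via alias";
print "$s\n";
```

The caller's code contains no `\` at all, yet a reference to `$s` now lives outside its scope, so a compile-time check on the caller alone would not catch it.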

Re: Why not perl have raw/native type
by ikegami (Patriarch) on Jan 07, 2020 at 17:56 UTC

    The main problem is that the opcodes into which Perl is compiled expect scalars. You would literally need to rewrite the entire interpreter, which is massive! (Remember, each "function" in perlfunc is an operator, and they basically correspond one-to-one to an opcode.)

    Similarly, subs expect scalars. Would these primitives get upgraded to scalars when passed to a sub? If so, that's going to seriously limit the amount of benefit you can get. If not, you need to massively change how sub calls are made too.

    Also, there's a huge number of unresolved problems with your proposal. The most prominent is that you mention getting rid of reference counting for these variables, but you don't mention what garbage collection scheme you would use instead or how it would help.

    These changes are so massive, it makes more sense to talk about spinning off a new language.

Re: Why not perl have raw/native type
by Tux (Canon) on Jan 07, 2020 at 14:31 UTC

    In the landscape you sketch, I bet profiling will show the overhead being at the DB (DBI + DBD + database interface) side of the program, and not in Text::CSV_XS. Yes, Text::CSV_XS does have the SV overhead, but when you use bind_columns, most of that overhead is negligible (comparable to other overhead).
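A minimal sketch of the bind_columns approach, assuming Text::CSV_XS is installed. The sample data and the variable names (`$id`, `$name`, `@names`) are made up for illustration. With bound columns, each `getline` call stores the fields into the same two scalars instead of building a fresh list of SVs per row:

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Made-up sample input, read through an in-memory filehandle.
my $data = "1,alice\n2,bob\n";
open my $fh, '<', \$data or die "open: $!";

my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });

# Bind one scalar per column; getline() then reuses these containers.
my ($id, $name);
$csv->bind_columns(\$id, \$name);

my @names;
while ($csv->getline($fh)) {
    push @names, $name;   # $name holds the second field of the current row
}
close $fh;

print join(',', @names), "\n";
```

Whether this makes a measurable difference in the original poster's program is something only a profile can show.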

    You didn't state what database you use, and which interface and/or connection method. Sometimes specific options and attributes can really speed up your process.

    We really need more in order to be able to help you (if possible at all).


    Enjoy, Have FUN! H.Merijn
Re: Why not perl have raw/native type
by shmem (Chancellor) on Jan 07, 2020 at 20:01 UTC

    Assigning a string to the PV slot of an SV is among the cheapest of all the tasks the interpreter carries out: walking the optree, setting up PAD regions for lexicals, housekeeping, maintaining symbol tables and whatnot.

    Even if all your variables were set up as globals, which means the SVs holding the PVs stay in place through the entire life of a program and (re-)assigning only takes a few steps of deallocating/allocating memory and (de)referencing, there will be no significant improvement for any non-trivial program.

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'

      Assigning a string is actually quite complex when dealing with scalars.

      1. Call GET magic on the scalar being assigned if any.
      2. Check what the scalar being assigned contains (which happens to be a string).
      3. Upgrade the scalar to which we are assigning to one that can contain a string if needed.
      4. Clear whatever value the scalar to which we are assigning contains. (This can require reducing reference counts (which can require calling destructors, etc), reducing COW counts, resetting OOB vars, etc.)
      5. Finally, we can start the copy. That can happen one of three ways. Hopefully, it's just a question of copying a pointer and incrementing the COW reference count.

      (I definitely skipped some steps. I got tired of typing.)

      Walking the optree, on the other hand, is simply following a linked list. (Effectively, op = op->next.) This is way simpler than assigning a string.
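Steps 1 and 4 of the list above can be made visible from pure Perl. This is only a sketch: the package names `CountingScalar` and `Noisy` are invented for the illustration, and it needs perl 5.14 or later for the `package NAME { ... }` syntax. A tied scalar stands in for "GET magic", and an object with a destructor stands in for "whatever value the target contains":

```perl
use strict;
use warnings;

# Step 1: GET magic. Reading a tied scalar runs its FETCH method.
package CountingScalar {
    our $fetches = 0;
    sub TIESCALAR { my ($class, $value) = @_; return bless { value => $value }, $class }
    sub FETCH     { my ($self) = @_; $fetches++; return $self->{value} }
    sub STORE     { my ($self, $value) = @_; $self->{value} = $value }
}

# Step 4: clearing the old value. Dropping the last reference runs DESTROY.
package Noisy {
    our $destroyed = 0;
    sub new     { return bless {}, shift }
    sub DESTROY { $destroyed++ }
}

package main;

tie my $source, 'CountingScalar', 'hello';
my $target = Noisy->new;    # $target holds the only reference to the object

$target = $source;          # looks like a plain string assignment

print "FETCH calls: $CountingScalar::fetches, ",
      "destructors run: $Noisy::destroyed, target: $target\n";
```

The single assignment both calls back into Perl code to fetch the source value and runs a destructor for the value being overwritten, which is the kind of hidden work the list describes.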

        I was comparing the proposition to what actually happens in the cheapest of scenarios. So 3. is skipped, and 4. amounts to deallocating the PV. Yes, there are definitely more steps involved, and maybe some are also necessary for a RAW type.

        But that's not the point. A RAW type instead of a SV(PV) would also undergo a lot of the steps the SV(PV) undergoes, even if the RAW had an immutable type. I'm just saying that (re)assigning a string to a variable already containing just a string (i.e. only the PV slot allocated, no magic attached) is comparatively cheap next to all the "whatnot" the perl interpreter does. Is this incorrect? Then the OP makes a valid point.

Re: Why not perl have raw/native type
by karlgoethebier (Abbot) on Jan 07, 2020 at 16:08 UTC

    You may take a look at Devel::NYTProf. And probably you are the longingly awaited prince to come to the rescue for this poor orphan 😎 Regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

Re: Why not perl have raw/native type
by LanX (Saint) on Jan 08, 2020 at 00:55 UTC
    You mean something similar to TypeScript's optional static typing?

    Cheers Rolf
    (addicted to the Perl Programming Language :)

Re: Why not perl have raw/native type
by Anonymous Monk on Jan 07, 2020 at 14:24 UTC
    If "frustratingly the program in Perl is still slow," it is unlikely that the true cause of it is dynamic data types. Profile your code to find out where and what the slowdown actually is. Do not assume. As Kernighan and Plauger said, "don't 'diddle' code to make it faster – find a better algorithm."
      > ... it is unlikely that the true cause of it is dynamic data types.

      Right. And you know this based on what? Nothing. You made it up, hoping to impress others with your "expertise". Give it a rest, mike. Please.

        Despite resembling an infamous user who said he would not come back, the anonymonk does have a point. While it is sometimes possible to find bottlenecks by a thorough understanding of the system, you are far more likely to get useful information from actual measurements — and those useful results may just be surprising even to an experienced programmer.

        In short, do not assume that perl's box/unbox routines (which should be very lightweight if you are reusing the container SVs) are the source of your performance problems — use profiling (at both Perl and C levels, so you can see the time spent in XS code) and then consider how to improve the running time of your program.

        Profiling is important. If you optimize one block of code to run in no time at all, but the program was only spending 1% of its time in that code, you have gained only 1%. If instead you improve an algorithm to cut the running time of another block in half, and the program spent 70% of its time in that block, you have gained about 35%.
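The arithmetic above is easy to check with a few lines of Perl. This is just a sketch of the calculation; `time_saved` and the variable names are invented for the illustration, and the "run in no time at all" case is approximated by a very large speedup factor:

```perl
use strict;
use warnings;

# Fraction of total run time saved when a block that accounts for
# $share of the run time is made $factor times faster.
sub time_saved {
    my ($share, $factor) = @_;
    return $share * (1 - 1 / $factor);
}

my $tiny_block = time_saved(0.01, 1e9);   # 1% block made effectively instantaneous
my $hot_block  = time_saved(0.70, 2);     # 70% block made twice as fast

printf "1%% block, near-infinite speedup: %.1f%% saved\n", 100 * $tiny_block;
printf "70%% block, 2x speedup: %.1f%% saved\n",           100 * $hot_block;
```

The first line reports 1.0% and the second 35.0%, matching the figures in the post: the share of run time a block accounts for caps what any optimization of that block can gain.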
