PerlMonks  

Improving core performance with SIMD

by phizel (Acolyte)
on Aug 13, 2025 at 12:22 UTC ( [id://11166023]=perlmeditation )

After running across the StringZilla library, I got to thinking that Perl could benefit greatly from similar improvements. After a glance through perlfunc and Module::CoreList, the following is a list of what I think could be targeted for optimization. Thoughts? core modules: non-trivial:

Replies are listed 'Best First'.
Re: Improving core performance with SIMD
by Corion (Patriarch) on Aug 13, 2025 at 12:46 UTC

    Yes and no. Personally, I would look at Daniel Lemire's work on SIMD (https://github.com/lemire), but it's not trivial to employ the code.

    Intel Hyperscan does not do capturing, and by default does not tell you where a match started. It is mostly aimed at network traffic scanners, where that is not necessary. This limitation would make it suitable only for very limited situations.

    I think the best approach would be to find a library that does good detection of SIMD capabilities of a machine, and then port (or copy) selected parts into the appropriate Perl code guarded by the appropriate SIMD define. Finding a library that has both, (one or more) SIMD-variants and a non-SIMD plain C variant is important here, since otherwise, you will get feature disparity between SIMD and non-SIMD code.
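
    A minimal sketch of the "SIMD variant plus plain C variant behind a define" idea, assuming the `__SSE2__` macro that GCC and Clang predefine when SSE2 is enabled. The names `count_byte`, `count_byte_sse2` and `count_byte_scalar` are hypothetical and not taken from the Perl core or any library mentioned here. The selection happens at compile time only; detecting capabilities at run time would be a separate layer on top.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Plain C variant: always compiled, and the reference for expected behaviour. */
static size_t count_byte_scalar(const unsigned char *buf, size_t len,
                                unsigned char needle)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        n += (buf[i] == needle);
    return n;
}

#if defined(__SSE2__)
/* SIMD variant: compares 16 bytes per iteration, then hands the tail
 * (fewer than 16 bytes) to the scalar variant. */
static size_t count_byte_sse2(const unsigned char *buf, size_t len,
                              unsigned char needle)
{
    const __m128i pattern = _mm_set1_epi8((char)needle);
    size_t n = 0, i = 0;

    for (; i + 16 <= len; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(buf + i));
        unsigned mask =
            (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, pattern));
        /* One bit is set per matching byte; count the set bits. */
        while (mask) {
            mask &= mask - 1;
            n++;
        }
    }
    return n + count_byte_scalar(buf + i, len - i, needle);
}
#endif

/* Single entry point, so callers see the same behaviour either way. */
size_t count_byte(const unsigned char *buf, size_t len, unsigned char needle)
{
#if defined(__SSE2__)
    return count_byte_sse2(buf, len, needle);
#else
    return count_byte_scalar(buf, len, needle);
#endif
}
```

    Keeping the scalar function compiled in both configurations means the SIMD path can be checked against it in tests, which is one way to catch the feature disparity mentioned above.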

Re: Improving core performance with SIMD
by ysth (Canon) on Aug 13, 2025 at 16:49 UTC
    There may be some things of help here, but given that perl strings may be utf8, not bytes, it isn't going to be easy.
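
    To illustrate why the utf8 case complicates things, here is a small sketch, assuming well-formed UTF-8 input: the character count is no longer the byte count, so a byte-oriented routine has to skip continuation bytes (those of the form 10xxxxxx). The function name `utf8_char_count` is hypothetical and is not how the Perl core does it.

```c
#include <stddef.h>

/* Count code points in a well-formed UTF-8 buffer.
 * Every byte that is NOT a continuation byte (10xxxxxx) starts a new
 * code point, so counting those bytes gives the character count. */
size_t utf8_char_count(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        n += ((buf[i] & 0xC0) != 0x80);
    return n;
}
```

    For pure ASCII the result equals the byte length; for the six bytes of "naïve" encoded as UTF-8 it is five. Any offset or length a SIMD routine reports in bytes would need this kind of translation before it is meaningful at the character level.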

      Were I to look for prior art I'd look at what ripgrep and the Rust regex engine do. I want to say its author had benchmarks showing that it was (one of)? the fast(est|er) UTF handling engines.

      The cake is a lie.

      Yup, and a lot of processing power is required to handle Unicode correctly... no matter the programming language. That was always the compromise with Unicode, and it's a good one overall. Human languages and writing systems are messy, based on weird and complicated rules that evolved over centuries. At least we no longer have to deal with codepages/"extended ASCII".

Re: Improving core performance with SIMD
by Anonymous Monk on Jan 20, 2026 at 22:43 UTC
    Should somebody release a StringZilla interface on CPAN?

Node Type: perlmeditation [id://11166023]
Approved by marto
Front-paged by Arunbear