
Re: what would you like to see in perl5.12?

by bart (Canon)
on Aug 20, 2007 at 11:45 UTC (#633776)


in reply to what would you like to see in perl5.12?

First of all, the things that are promised for 5.10. :) These include:

  • defined or: // and //=
  • speed improvements in regexes as promised (and implemented) by demerphq
  • recursive regexes! (ditto)
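For readers who haven't tried these yet, here is a small sketch of defined-or and a recursive regex as they behave in 5.10. The names `%opt`, `$balanced` and `is_balanced` are made up for the illustration.

```perl
use strict;
use warnings;
use 5.010;    # defined-or and recursive patterns need 5.10 or later

# Defined-or: fall back only when the left side is undef,
# so a defined-but-false value such as 0 is kept.
my %opt = (retries => 0);
my $retries = $opt{retries} // 3;     # keeps 0, where || would pick 3
my $timeout = $opt{timeout} // 30;    # undef, so the default is used
$opt{verbose} //= 1;                  # assigns only if currently undef

# Recursive regex: (?&paren) re-enters the named group "paren",
# which lets the pattern match arbitrarily nested parentheses.
my $balanced = qr/ (?<paren> \( (?: [^()]++ | (?&paren) )* \) ) /x;

# Hypothetical helper: true if the whole string is one balanced group.
sub is_balanced {
    my ($s) = @_;
    return $s =~ /\A $balanced \z/x ? 1 : 0;
}
```

Using a named group rather than `(?R)` keeps the recursion confined to the parenthesis pattern when it is interpolated into a larger regex.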

Aside from that, I'd like to see support for matching regexes across buffer boundaries when the input is only partially loaded. That would make it easier to process files in blocks of a few KB each, instead of having to load the entire file into a string.

As an example: say you're looking for a word "SELECT" and the buffer contains:

my $sth = $dbh->prepare('SEL
It's possible that it would have matched "SELECT" if the buffer hadn't been cut off.

I'd like regexes to be able to catch that. Automatically.
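To show what has to be done by hand today, here is a minimal sketch of the usual workaround for a fixed word: read in blocks and carry over a tail one character shorter than the word, so a match split across two blocks is still found. The name `find_in_blocks` and its arguments are my own, and this only works for a literal string, not for a general regex, which is exactly the limitation.

```perl
use strict;
use warnings;

# Hypothetical helper: scan a filehandle in fixed-size blocks for a literal
# word and return the absolute offset of every occurrence.
sub find_in_blocks {
    my ($fh, $word, $block_size) = @_;
    my $keep = length($word) - 1;    # longest prefix that could be cut off
    my ($buf, $base, @hits) = ('', 0);

    while (read($fh, my $chunk, $block_size)) {
        $buf .= $chunk;

        # Report every occurrence that is complete in the current buffer.
        my $from = 0;
        while ((my $pos = index($buf, $word, $from)) >= 0) {
            push @hits, $base + $pos;
            $from = $pos + 1;
        }

        # Keep only the tail that might be the start of a split match.
        my $drop = length($buf) - $keep;
        if ($drop > 0) {
            substr($buf, 0, $drop, '');
            $base += $drop;
        }
    }
    return @hits;
}

# Usage: an in-memory filehandle whose first 28-byte block ends in 'SEL.
my $text = q{my $sth = $dbh->prepare('SELECT 1')};
open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
my @hits = find_in_blocks($fh, 'SELECT', 28);
print "found at offset $_\n" for @hits;
```

Because the kept tail is shorter than the word, no complete match can sit entirely inside it, so nothing is reported twice.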

I don't really care how it's done, but I personally favor a system that takes some action (die, set a variable, call a callback sub) when the lookahead "touches" the back end of the buffer. (I call that the "electric fence" approach: touch it and you're dead.)


Re^2: what would you like to see in perl5.12?
by sgt (Chaplain) on Aug 22, 2007 at 20:56 UTC

    Yes, I agree completely. This opens up the realm of stream regexps and would greatly facilitate the construction of regexp-based tokenizers (scalar m//gc) that need to process their input in chunks. Currently you need to resort to contorted hacks to do stream tokenizing, which is a pity, as this limits the implementation of generic parser generators in pure Perl.
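A rough sketch of the kind of hack meant here, under my own assumptions: a toy grammar of words, whitespace and single punctuation characters, and a hypothetical `tokenize_stream` that takes a callback returning the next chunk (or undef at end of input). Any token that touches the end of the buffer is thrown away and rescanned after a refill, which is the restart-from-the-beginning cost a stream-aware engine could avoid.

```perl
use strict;
use warnings;
use 5.010;    # for the defined-or operator

# Hypothetical chunked tokenizer built on scalar m//gc.
sub tokenize_stream {
    my ($next_chunk) = @_;
    my ($buf, $eof, @tokens) = ('', 0);

    while (1) {
        my $start = pos($buf) // 0;
        last if $eof && $start >= length $buf;

        my $tok;
        if    ($buf =~ /\G\s+/gc)       { }                          # skip blanks
        elsif ($buf =~ /\G(\w+)/gc)     { $tok = [ WORD  => $1 ] }
        elsif ($buf =~ /\G([^\w\s])/gc) { $tok = [ PUNCT => $1 ] }

        # The match reached the end of the buffer and more input may follow:
        # the token could be incomplete, so drop it, refill and rescan.
        if (!$eof && (pos($buf) // 0) >= length $buf) {
            substr($buf, 0, $start, '');    # discard fully consumed input
            my $chunk = $next_chunk->();
            if (defined $chunk) { $buf .= $chunk } else { $eof = 1 }
            pos($buf) = 0;
            next;
        }
        push @tokens, $tok if $tok;
    }
    return @tokens;
}

# Usage: the input arrives split in the middle of SELECT.
my @chunks = ("prepare('SEL", "ECT 1')");
my @tokens = tokenize_stream(sub { shift @chunks });
print join(' ', map { "$_->[0]:$_->[1]" } @tokens), "\n";
```

A token longer than one chunk simply makes the buffer grow until the token is complete, so memory use is bounded by the longest token rather than by the file size.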

    What is needed is a way to keep the state of the regexp engine when a match attempt reaches the end of the buffer (the end-of-buffer-match case), so that when you add another chunk, the engine does not start again from the beginning. Considering all the goodies added by demerphq, maybe there is hope ;) of seeing something soon.

    Also I'd like to be able to switch to a smaller but faster regexp implementation just for a block. Or maybe be able to turn off parts of the main engine -- locally -- that I know I am not going to use in a given block (supposing that doing so gives extra speed of course).

    cheers --stephan
