Re^2: Best practice or cargo cult?

by Juerd (Abbot)
on Jun 20, 2006 at 15:00 UTC (#556417)

in reply to Re: Best practice or cargo cult?
in thread Best practice or cargo cult?

As I understand it, these options will all effectively be turned on by default in the Perl 6 regex engine. So either Larry has decided that they are, in fact, best practice or Damian has sneaked them into the specs whilst Larry wasn't watching.

Firstly, Perl 6 is not Perl 5.

Secondly, Perl 6 gives you \N, a convenient way to write <-[\n]> (that's [^\n]). It's worse than ., but acceptable. Writing [^\n] all the time is a hard exercise for one's fingers, and makes for messy code. That's why I strongly believe you should only use /s when you really want . to include the newline character.
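A minimal Perl 5 sketch of the difference between ., /s and [^\n]. The sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Without /s, . stops at the newline.
my ($no_s) = $text =~ /(.+)/;

# With /s, . also matches the newline, so the capture runs to the end.
my ($with_s) = $text =~ /(.+)/s;

# [^\n] never matches a newline, whether or not /s is in effect.
my ($negated) = $text =~ /([^\n]+)/s;

print "no /s:  [$no_s]\n";
print "with /s: [$with_s]\n";
print "[^\\n]:  [$negated]\n";
```

If /s is switched on everywhere as a habit, every . that was meant to stay on one line has to be rewritten as [^\n], which is the typing burden described above.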

/m won't be turned on by default in Perl 6. Instead, we get different metacharacters for begin/end of line versus begin/end of string. So again, it gives you the best of both worlds.
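For comparison, here is a rough sketch of how Perl 5 splits the same job between the /m flag and the \A and \z anchors. The sample string is invented for the example.

```perl
use strict;
use warnings;

my $text = "alpha\nbeta\n";

# Without /m, ^ matches only at the start of the string and $ only at the
# end (or just before a final newline), so no single line matches here.
my @no_m = $text =~ /^(\w+)$/g;

# With /m, ^ and $ match at the start and end of every line.
my @with_m = $text =~ /^(\w+)$/mg;

# \A always means start of string, even under /m.
my @string_start = $text =~ /\A(\w+)/mg;

printf "no /m: %d, /m: %d, \\A under /m: %d\n",
    scalar @no_m, scalar @with_m, scalar @string_start;
```

In Perl 5 the meaning of ^ and $ depends on a flag, whereas separate metacharacters for line and string boundaries make the intent visible in the pattern itself.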

As for /x... I have no strong opinion about that. I don't think /\A\d+\z/ is unreadable, but I don't mind /\A \d+ \z/x at all.

Juerd # { site => '', do_not_use => 'spamtrap', perl6_server => 'feather' }

Re^3: Best practice or cargo cult?
by demerphq (Chancellor) on Jun 21, 2006 at 12:51 UTC

    \N could easily be added to blead. I'll check into it.


      \N{...} is already a recognized pattern in perl5 regexp language. Pick something else and add it to your current perl using the instructions at Extending Regular Expression Syntax. It's a presentation I gave to last year. Or... co-opt \N for your own use. It isn't as if \N is so common that you'd miss it if you stole it away from perl. In my demo I redefined \w and \b to something more appropriate for my own set of common tasks.

      ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

        Actually, I've been mulling adding support for some special regnodes for a while. For instance, more than once I've been asked for a proper fail regop. And I made that comment offhand; if I had remembered that \N{} is already used I would have suggested something else. Any recommendations?


      In fact, here's the implementation. This works in perl5 going back to uh... early? Have your cake today. Not that I tested it. It's simple enough I just penned this and didn't bother running it.

      use Regexp::SlashN;
      "A B C" =~ /(\N+)/;
      $1 eq "A B C" or die;


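      package Regexp::SlashN;
      use overload;

      # Install convert() as the handler for the constant parts of
      # regexes in the importing scope.
      sub import { overload::constant qr => \&convert }

      # A simple table of definitions
      my %syntax = (
          '\\' => '\\\\',   # an escaped backslash must stay escaped
          N    => '[^\n]',
      );

      sub convert {
          my ( $re ) = @_;
          $re =~ s/\\([\\N])/$syntax{$1}/g;
          return $re;
      }

      1;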

      ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

        Well, there are two reasons why I wouldn't do it this way. The first is that doing this, afaict, adds a high cost to compiling regexes in the scope where the overload takes effect. The second is that special metasequences like the ones we are discussing can be handled much more efficiently by the regex engine itself. So for instance a NEOL regop would be a lot more efficient, both in terms of storage and execution, than the ANYOF regop that [^\n] is converted to.

        The ANYOF is implemented by a bitmap lookup with flags, meaning it requires more than 32 bytes to represent, and each character inspected requires some bit shifting to do the correct bitmap test. Whereas a NEOL regop would be much faster, as it would essentially be a straight character inequality test. Also, a NEOL regop would be just 4 bytes, iirc.
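A toy model of the two tests, written in Perl for illustration only. This is not how the regex engine's C code is written, and the names anyof_match and neol_match are hypothetical. It only shows why a 256-bit bitmap costs 32 bytes plus a bit lookup per character, while the NEOL idea is a single comparison.

```perl
use strict;
use warnings;

# Toy ANYOF: one bit per byte value, 256 bits = 32 bytes of bitmap.
my $bitmap = "\0" x 32;
vec( $bitmap, $_, 1 ) = 1 for grep { $_ != ord("\n") } 0 .. 255;

# Bitmap test: locate the byte holding the bit, then shift and mask
# (vec does that work here).
sub anyof_match {
    my ($ch) = @_;
    return vec( $bitmap, ord($ch), 1 );
}

# Toy NEOL: a plain inequality test, with no table needed at all.
sub neol_match {
    my ($ch) = @_;
    return $ch ne "\n" ? 1 : 0;
}

printf "bitmap is %d bytes\n", length $bitmap;
printf "%-4s anyof=%d neol=%d\n", $_ eq "\n" ? '\n' : $_, anyof_match($_), neol_match($_)
    for "a", " ", "\n";
```

Both functions give the same answer for every byte; the difference is only in how much storage and work each test needs.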

