PerlMonks  

Re^4: Best practice or cargo cult?

by diotalevi (Canon)
on Jun 22, 2006 at 05:25 UTC ( [id://556837]=note )


in reply to Re^3: Best practice or cargo cult?
in thread Best practice or cargo cult?

In fact, here's an implementation. It should work in perl5 going back to, uh... early? Have your cake today. Not that I tested it: it's simple enough that I just penned it and didn't bother running it.

use Regexp::SlashN;
"A B C" =~ /(\N+)/;
$1 eq "A B C" or die;

Regexp/SlashN.pm

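package Regexp::SlashN;
use strict;
use warnings;
use overload;

# Install convert() as the handler for regex constants in the caller's scope.
sub import { overload::constant qr => \&convert }

# A simple table of definitions
my %syntax = (
    '\\' => '\\\\',    # an escaped backslash must stay an escaped backslash
    N    => '[^\n]',
);

sub convert {
    my ( $re ) = @_;
    $re =~ s/\\([\\N])/$syntax{$1}/g;
    return $re;
}

1;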
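
To see what the substitution does without installing the module, here is a minimal standalone sketch of the same rewrite step applied to plain pattern strings. It assumes the escaped-backslash entry maps back to an escaped backslash, so that a literal backslash followed by N is left alone; it does not try to handle \N{...} named characters.

```perl
use strict;
use warnings;

# Same table idea as in the module: each escape maps to its replacement text.
my %syntax = (
    '\\' => '\\\\',    # escaped backslash stays an escaped backslash
    N    => '[^\n]',   # \N becomes "anything but newline"
);

sub convert {
    my ($re) = @_;
    $re =~ s/\\([\\N])/$syntax{$1}/g;
    return $re;
}

print convert('(\N+)'), "\n";     # prints ([^\n]+)
print convert('\\\\N'), "\n";     # prints \\N  (literal backslash, then N)

# The rewritten text behaves as expected when used as a pattern.
my $re = convert('(\N+)');
print "$1\n" if "A B C\nD" =~ /$re/;    # prints A B C
```

Consuming the escaped backslash in the same pass is what keeps the N after it from being mistaken for the start of a \N escape.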

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Replies are listed 'Best First'.
Re^5: Best practice or cargo cult?
by demerphq (Chancellor) on Jun 22, 2006 at 07:44 UTC

    Well, there are two reasons why I wouldn't do it this way. The first is that, afaict, doing this adds a high cost to compiling every regex in the scope where the overload takes effect. The second is that special metasequences like the one we are discussing can be handled much more efficiently by the regex engine itself. For instance, an NEOL regop would be a lot more efficient, in terms of both storage and execution, than the ANYOF regop that [^\n] is converted to.

    The ANYOF is implemented by a bitmap lookup with flags, meaning it requires more than 32 bytes to represent, and each character inspected requires some bit shifting to do the correct bitmap test. An NEOL regop, by contrast, would be much faster, as it would essentially be a straight character inequality test. Also, an NEOL regop would be just 4 bytes, iirc.
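
The difference between the two tests can be modelled in a few lines of Perl. This is only a toy illustration, not the engine's C code: the names anyof_match and neol_match are made up for the sketch, the bitmap covers byte values 0..255 only, and the flags mentioned above are left out.

```perl
use strict;
use warnings;

# Toy ANYOF for [^\n]: one bit per byte value, 256 bits = 32 bytes.
my $bitmap = '';
vec( $bitmap, $_, 1 ) = ( $_ == ord("\n") ? 0 : 1 ) for 0 .. 255;

# ANYOF-style test: index into the bitmap and extract a single bit.
sub anyof_match {
    my ($ch) = @_;
    return vec( $bitmap, ord($ch), 1 );
}

# NEOL-style test: one inequality comparison, no table needed.
sub neol_match {
    my ($ch) = @_;
    return $ch ne "\n" ? 1 : 0;
}

printf "bitmap is %d bytes\n", length $bitmap;    # prints "bitmap is 32 bytes"

# Both tests agree on every sample; only the work per character differs.
for my $ch ( "A", " ", "\n" ) {
    die "tests disagree" unless anyof_match($ch) == neol_match($ch);
}
```

Timing these two Perl subs would say little about the real regops; the sketch only shows why one needs a 32-byte table and the other does not.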

    ---
    $world=~s/war/peace/g

      This is nothing a little conditional can't cure. From a syntax standpoint, \N is the right symbol to use, since \n means "newline" and we already have the practice of saying \w|\W and \s|\S. I would think you'd either want to shuffle off the Unicode name or just not do the work.

      sub import {
          if ( $] >= 5.012 ) {
              # Thanks to demerphq, \N is native (from perl 5.12 on) and the
              # overloading isn't needed.
          }
          else {
              overload::constant qr => \&convert;
          }
      }

      ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊
