Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Re^4: Parse ISO 8601 date/times (never \d)

by tobyink (Abbot)
on Nov 07, 2012 at 17:46 UTC ( #1002698=note: print w/ replies, xml ) Need Help??


in reply to Re^3: Parse ISO 8601 date/times (never \d)
in thread Parse ISO 8601 date/times

The following pragma will "fix" \d. However, re::engine::Plugin does not currently support s/// or split //, just matching. (And it doesn't support named captures either.) Still, it may be helpful for some.

use 5.010; use strict; use utf8::all; BEGIN { package re::engine::SaneDigits; no thanks; use constant TAINT => ${^TAINT}; use re::engine::Plugin (); use Carp; sub import { re::engine::Plugin->import( comp => \&comp, exec => \&exec, ); } *unimport = \&re::engine::Plugin::unimport; sub comp { my ($rx) = @_; my $real = $rx->pattern; $real =~ s{\\d}{[0-9]}g; $real =~ s{\\D}{[^0-9]}g; my %mods = my %mod = $rx->mod; my $mods = join q(), keys %mods; $real =~ s{/}{\/}g; $real = eval qq{ qr/$real/$mods }; $rx->stash({ real => $real }); $rx->num_captures( FETCH => sub { my ($rx, $paren) = @_; croak sprintf( "%s variable not supported with %s", { 0 => q($&), -1 => q($'), -2 => q($`) }->{$paren} +, __PACKAGE__, ) if $paren < 1; my $rv = $rx->stash->{last}[$paren]; return $rv unless TAINT; $rv =~ /(.*)/; return $1; }, ); } sub exec { my ($rx, $str) = @_; my @results = ($str =~ $rx->stash->{real}); unshift @results, scalar pos; $rx->stash->{last} = \@results; return not defined $results[0]; } }; my $str = "foo23 bar5 bar42"; say $str =~ m/bar(\d+)/i ? "GOT $1" : "NO MATCH"; use re::engine::SaneDigits; say $str =~ m/bar(\d+)/i ? "GOT $1" : "NO MATCH";

Update: Meh... come to think of it, a re::engine is overkill. Constant overloading does the trick much easier...

use 5.010; use strict; use utf8::all; BEGIN { package re::SaneDigits; no thanks; use overload (); my %_const_handlers = (qr => \&_qr); my %_remove_handlers = map { $_ => undef } %_const_handlers; sub import { overload::constant %_const_handlers } sub unimport { overload::remove_constant %_remove_handlers } sub _qr { for (@_) { s/\\d/[0-9]/g; s/\\D/[^0-9]/g; return $_; } } }; my $str = "foo23 bar5 bar42"; say $str =~ m/bar(\d+)/i ? "GOT $1" : "NO MATCH"; use re::SaneDigits; say $str =~ m/bar(\d+)/i ? "GOT $1" : "NO MATCH";

Another CPAN candidate I think.

Update II: Looks like PerlMonks might be breaking my UTF8 again. The "5" character which appears in $str should not be a normal ASCII 5, but a fullwidth 5 (U+U+FF15), which is a character used to include an Arabic numeral 5 within CJK text.

perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'


Comment on Re^4: Parse ISO 8601 date/times (never \d)
Select or Download Code
Replies are listed 'Best First'.
Re^5: Parse ISO 8601 date/times (still never \d)
by tye (Cardinal) on Nov 07, 2012 at 18:08 UTC

    Wow, that's a lot of machinery to avoid running :s/\\d/[0-9]/gc in your editor. I think I'll always choose to avoid the recurring cost.

    And I'm mostly not talking about CPU cost (but I suspect that is non-trivial), but the cost of things like mentally having to track new rules about how changing m// to s/// or split() requires extra attention, having to track that \d means different things in different places, having to search for pragmas each time I see \d inside m// if I care which meaning it has, the risk of having to debug the chain of code required to support this, etc.

    The risk of just wasting time because of a bug in the added pile of code required to support this is my biggest concern (after having repeatedly been burned by such things), especially when I consider the risk of this idea of pretending \d isn't \d getting in the way of some other tricky module's reasonable-sounding assumptions.

    Just because something is possible doesn't mean it is a good idea. :)

    - tye        

      The second version, which uses overloading, should work with s/// and split //.

      perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1002698]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (10)
As of 2015-07-08 06:33 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (94 votes), past polls