PerlMonks  

Re: How to use "less than" and "greater than" inside a regex for a $variable number

by LanX (Saint)
on Oct 01, 2012 at 20:52 UTC ( [id://996746] )


in reply to How to use "less than" and "greater than" inside a regex for a $variable number

I don't really think I understand your task, but

> ...found that I can use an "if-then" type of expression within a regex, but have found no examples for how to do this. How would you do this?

see perlretut#Conditional-expressions

HTH

Cheers Rolf
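
For readers who have not met the construct that the perlretut link describes, here is a minimal sketch of a regex conditional, `(?(condition)yes-pattern|no-pattern)`. The helper name `balanced_number` and the sample strings are made up for illustration; the idea is only that a closing parenthesis is required when, and only when, group 1 captured an opening one.

```perl
use strict;
use warnings;

# Hypothetical helper: accepts "42" or "(42)", rejects "(42" and "42)".
sub balanced_number {
    my ($str) = @_;
    # (\()?    optionally capture an opening parenthesis as group 1
    # (?(1)\)) if group 1 took part in the match, require a closing one
    return $str =~ /^(\()?\d+(?(1)\))$/ ? 1 : 0;
}

for my $candidate ('42', '(42)', '(42', '42)') {
    printf "%-5s %s\n", $candidate,
        balanced_number($candidate) ? 'ok' : 'rejected';
}
```

Note that the condition here only tests whether a group matched; it does not compare values, which is the limitation the follow-up question is about.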


Replies are listed 'Best First'.
Re^2: How to use "less than" and "greater than" inside a regex for a $variable number
by Polyglot (Chaplain) on Oct 01, 2012 at 21:28 UTC
    I appreciate the link, but that resource only tells how to work with a known quantity. I need to be able to match a variable number, conditional upon its relative value when compared to another number. In essence, I need to match based on the comparison result.

    For example, how would one do something like this?

    $string = "I have 5 apples, 6 oranges, and 8 limes.";
    #Match the oranges only if they are more than the apples
    #and fewer than the limes.
    $string =~ m/(\d+)\sapples.*(\d+)\soranges.*(\d+)\slimes(?{if (($1<$2) && ($2<$3))})/g;

    If I had something like that I could then plug in $var for $2.

    Blessings,

    ~Polyglot~

      The (*F) operator was introduced with 5.10. Prior to that, (?!) can be used.

      >perl -wMstrict -le
      "for my $n (4 .. 9) {
         my $str = qq{I have 5 apples, $n oranges, and 8 limes.};
         print qq{'$str'};
         next unless $str =~
            m{ (\d+) \s+ apples  \D+
               (\d+) \s+ oranges \D+
               (\d+) \s+ limes
               (?(?{ $1 < $2 && $2 < $3 }) | (*F) )
             }xms;
         print qq{'$2'};
         }
      "
      'I have 5 apples, 4 oranges, and 8 limes.'
      'I have 5 apples, 5 oranges, and 8 limes.'
      'I have 5 apples, 6 oranges, and 8 limes.'
      '6'
      'I have 5 apples, 7 oranges, and 8 limes.'
      '7'
      'I have 5 apples, 8 oranges, and 8 limes.'
      'I have 5 apples, 9 oranges, and 8 limes.'

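
The one-liner above can also be written as a small script, which may be easier to adapt. This is only a sketch: the sub name `oranges_between` is made up, it uses the pre-5.10 `(?!)` spelling mentioned above in place of `(*F)`, and it adds a leading `\b` that is not in the original. The `\b` is an assumption about what is wanted: without it, a failed comparison on "15 apples" lets the engine retry from the "5" and compare 5 instead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the orange count when it lies strictly
# between the apple and lime counts, otherwise undef.
sub oranges_between {
    my ($str) = @_;
    return unless $str =~ m{
        \b (\d+) \s+ apples  \D+
           (\d+) \s+ oranges \D+
           (\d+) \s+ limes
        (?(?{ $1 < $2 && $2 < $3 }) | (?!) )
    }xms;
    return $2;
}

for my $n (4 .. 9) {
    my $got = oranges_between("I have 5 apples, $n oranges, and 8 limes.");
    print "$n oranges: ", (defined $got ? "matched '$got'" : 'no match'), "\n";
}
```

The conditional has an empty yes-branch, so a true comparison lets the match succeed as it stands, while a false comparison runs into `(?!)`, which can never match and so forces the engine to backtrack.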
        I've implemented this approach, as it seems fairly close to the sort of solution I was looking for. Unfortunately, it is still rather slow. I started the process 2.5 days ago now (it's been running over 60 hours) and it is about half-way through the material. So it appears with this method it will take 5 days of 100% CPU on one of four cores of my Dell PowerEdge server. That's a little disappointing. My ugly approach, which may be slightly less thorough, finished after about three days. So it was 40% quicker.

        Given the complexity of the regex, I suppose I cannot blame perl or the program itself; it's just the way it is. But without the attempt to narrow the search to finding numbers between their respective forerunners/postrunners, the whole search can complete in less than five minutes.

        Anyway, at least I have learned something and I much appreciate your patience in demonstrating this method for me. I may still be able to use this as a final check over a long weekend or something, or perhaps I can limit the amount of material to be checked at a time (~130 books total). Thank you!

        Blessings,

        ~Polyglot~
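
On the speed concern above, one option worth trying is to keep the regex plain and do the numeric comparison in ordinary Perl after each match, so that a failed comparison does not send the regex engine backtracking. The sketch below reuses the apples/oranges/limes example; the sub name `middles_in_range` is hypothetical, and whether this is actually faster on the real books has not been measured here.

```perl
use strict;
use warnings;

# Hypothetical helper: scans the whole text once with /g and keeps the
# middle number of each triple only when it lies strictly between the
# other two. The comparison happens outside the regex.
sub middles_in_range {
    my ($text) = @_;
    my @hits;
    while ($text =~ m{
              \b (\d+) \s+ apples  \D+
                 (\d+) \s+ oranges \D+
                 (\d+) \s+ limes
           }xmsg)
    {
        push @hits, $2 if $1 < $2 && $2 < $3;
    }
    return @hits;
}

my $text = "I have 5 apples, 6 oranges, and 8 limes. "
         . "I have 5 apples, 9 oranges, and 8 limes. "
         . "I have 1 apples, 2 oranges, and 3 limes.";
print join(', ', middles_in_range($text)), "\n";
```

One difference in behaviour to keep in mind: when a triple fails the comparison, this version moves on to the text after it, whereas the in-regex conditional makes the engine retry other ways of matching from nearby positions.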

      Actually, I think the problem can be addressed without the need for exotic regex operators or constructs (although this uses the \K operator introduced with 5.10). Unfortunately, this approach involves the replacement of a substring with the identical substring, an operation that I do not think the regex compiler can optimize away and that therefore may lead to a bit of inefficiency.

      >perl -wMstrict -le
      "my $book =
           qq{pg. 1 foo pg. 2 bar baz pg. 4 fee fie pg. 5 foe \n}
         . qq{fum pg. 6 hoo ha pg. 9 deedle pg. 10 \n}
         . qq{blah blah pg. 14 noddle \n}
         ;
       print qq{[[$book]] \n};
       ;;
       my $pn = qr{ pg[.] \s+ }xms;
       $book =~
          s{ $pn (\d+) \K (.*?) (?= $pn (\d+)) }
           { my $m = missing($1, $3); $m ? qq{$2$m } : $2; }xmsge;
       print qq{(($book)) \n};
       ;;
       sub missing {
         my ($i, $j) = @_;
         ;;
         return if $j - $i < 2;
         ;;
         my ($ii, $jj) = ($i + 1, $j - 1);
         return $j - $i > 2
            ? qq{(pages $ii - $jj missing)}
            : qq{(page $ii missing)}
            ;
         }
      "
      [[pg. 1 foo pg. 2 bar baz pg. 4 fee fie pg. 5 foe
      fum pg. 6 hoo ha pg. 9 deedle pg. 10
      blah blah pg. 14 noddle
      ]]

      ((pg. 1 foo pg. 2 bar baz (page 3 missing) pg. 4 fee fie pg. 5 foe
      fum pg. 6 hoo ha (pages 7 - 8 missing) pg. 9 deedle pg. 10
      blah blah (pages 11 - 13 missing) pg. 14 noddle
      ))
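
If replacing each substring with itself turns out to matter, a possible variation is to scan for the page markers with a plain `m//g` loop, record where a note is needed, and then splice the notes in with `substr`. This is a sketch under assumptions, not a measured improvement: the names `missing_note` and `annotate_gaps` are made up, and the `pg. N` marker format is taken from the example above.

```perl
use strict;
use warnings;

# Hypothetical helper with the same wording as missing() above.
sub missing_note {
    my ($i, $j) = @_;
    return '' if $j - $i < 2;
    my ($ii, $jj) = ($i + 1, $j - 1);
    return $j - $i > 2 ? "(pages $ii - $jj missing)" : "(page $ii missing)";
}

# Hypothetical helper: returns a copy of $book with a note inserted in
# front of every page marker that does not follow its predecessor.
sub annotate_gaps {
    my ($book) = @_;
    my @inserts;    # [ offset of marker, note text ] pairs
    my $prev;
    while ($book =~ m{ pg[.] \s+ (\d+) }xmsg) {
        my $page = $1;
        if (defined $prev) {
            my $note = missing_note($prev, $page);
            # $-[0] is the offset at which the current match starts
            push @inserts, [ $-[0], "$note " ] if length $note;
        }
        $prev = $page;
    }
    # Work from the end so that earlier offsets stay valid.
    for my $ins (reverse @inserts) {
        substr($book, $ins->[0], 0, $ins->[1]);
    }
    return $book;
}

my $book = "pg. 1 foo pg. 2 bar baz pg. 4 fee fie pg. 5 foe \n"
         . "fum pg. 6 hoo ha pg. 9 deedle pg. 10 \n"
         . "blah blah pg. 14 noddle \n";
print annotate_gaps($book);
```

Only the text that actually changes is touched here, at the cost of a second pass over the recorded offsets.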
