use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new(
qr/^(\d+\..*?hello.*)$/m
)->explain;
__END__
The regular expression:
(?m-isx:^(\d+\..*?hello.*)$)
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?m-isx:                 group, but do not capture (with ^ and $
                         matching start and end of line) (case-
                         sensitive) (with . not matching \n)
                         (matching whitespace and # normally):
----------------------------------------------------------------------
  ^                        the beginning of a "line"
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
----------------------------------------------------------------------
    \.                       '.'
----------------------------------------------------------------------
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
    hello                    'hello'
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  $                        before an optional \n, and the end of a
                           "line"
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
Not being much of a “golfer,” I tend to solve such problems in two steps: first I find the string structure I am looking for, then I look for “hello...” within that string.
One issue you should consider: right now you have no clearly defined beginning or ending delimiter. Where does the string begin, and where does it end? In that case the less-than and greater-than characters are the only reliable anchor points you have, and split() and pos() become your friends (along with the /i and /g modifiers of a regex). You might be able to make the argument (and therefore write a program) that what you really have here is a string that is “split by” either of these two characters. You iterate through the string, looking for these characters and noting their positions. You decide whether a string of interest could be “beginning” or “ending,” and you extract the pieces for a closer look with substr().
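A rough sketch of that kind of scan, assuming the word of interest is “hello” and that only the single characters immediately before and after it matter. The helper name find_unwrapped_hello and the sample string are made up for illustration; it walks the string with a /g match, uses pos() to find where each match ended, and peeks at the neighbours with substr():

```perl
use strict;
use warnings;

# Hypothetical helper: return [offset, text] for every "hello" in $text
# that is not wrapped as exactly "<hello>".
sub find_unwrapped_hello {
    my ($text) = @_;
    my @hits;
    while ($text =~ /hello/gi) {
        my $end   = pos($text);               # offset just past this match
        my $start = $end - length('hello');   # offset where it began
        # Peek at the characters on either side (empty at the string edges).
        my $before = $start > 0           ? substr($text, $start - 1, 1) : '';
        my $after  = $end < length($text) ? substr($text, $end, 1)       : '';
        next if $before eq '<' && $after eq '>';   # skip "<hello>"
        push @hits, [ $start, substr($text, $start, $end - $start) ];
    }
    return @hits;
}

# Made-up sample input, not taken from the original question.
for my $hit (find_unwrapped_hello('<hello> hello> <hello hello')) {
    printf "offset %d: %s\n", @$hit;
}
```

Keeping the offsets around means you can later pull out as much surrounding context as you need with substr(), rather than trying to capture it all in one pattern.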
Really, the true challenge of this kind of algorithm is “ruggedly and completely defining it.” It will probably be a two-part solution. (“First, find the strings; then, see if they’re interesting.”) After you have used perldoc, and maybe a few experimental programs, to confirm in your own mind how these various Perl tools work, spend some serious thought-time defining your algorithm. It might not be entirely trivial. I would go so far as to recommend constructing a series of test cases with test strings, and building a Test::More test suite to actually and completely test it. You could easily construct a subtly flawed algorithm, bang on it a few times, say, “yep, it seems to work,” and find that you are totally wrong when your code goes into production. It happens. (A lot.) And it’s not pretty or fun. The “extra” time needed to “prove it!!” will be worthwhile.
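To make that concrete, here is a minimal sketch of what such a suite might look like. The function has_unwrapped_hello is a hypothetical two-step check written only for this example (find candidate tokens, then filter out the fully wrapped ones), and the test strings are guesses at the cases you care about, so substitute your own:

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical two-step check: first find the candidate strings,
# then see whether any of them is interesting.
sub has_unwrapped_hello {
    my ($text) = @_;
    my @candidates = $text =~ /(<?hello>?)/g;             # step 1: find them
    return scalar grep { $_ ne '<hello>' } @candidates;   # step 2: count keepers
}

# Each case: input string, expected result (1 or 0), description.
my @cases = (
    [ 'hello>',            1, 'trailing bracket only' ],
    [ '<hello',            1, 'leading bracket only'  ],
    [ 'hello',             1, 'bare word'             ],
    [ '<hello>',           0, 'fully wrapped'         ],
    [ '<hello> and hello', 1, 'wrapped plus bare'     ],
    [ 'nothing here',      0, 'word absent'           ],
);

for my $case (@cases) {
    my ($input, $expected, $name) = @$case;
    my $got = has_unwrapped_hello($input) ? 1 : 0;
    is( $got, $expected, $name );
}

done_testing();
```

The table of cases is the valuable part: every time production turns up a string you had not thought of, add a row and rerun.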
!/<hello>/ and /(hello)/ and print $1
I need a plain regex which can be used as a condition.
What I have come up with is: \bhello\b(?! ^\\w:-]*?>)
Please help.
/<hello>(*COMMIT)(?!)|hello/;
That did not work. I want to select all of the following combinations:
<hello
hello
hello>
but not <hello>
Thanks for your help.
pm_text.txt
1.hello>
2.<hello
3.hello
<hello>
pm_regex.pl
use strict;
use warnings;

my $filename = shift or die "Usage: $0 FILENAME\n";
open my $fh, '<', $filename or die "Could not open '$filename': $!\n";

while (my $line = <$fh>) {
    chomp $line;
    # A numbered line ("1.", "2.", ...) containing "hello" counts as a match.
    if ($line =~ /^\d+\..*?(hello).*$/) {
        print "In $line $1 matches\n";
    } else {
        print "$line doesn't match\n";
    }
}
close $fh;
Running perl pm_regex.pl pm_text.txt produced the output:
In 1.hello> hello matches
In 2.<hello hello matches
In 3.hello hello matches
<hello> doesn't match
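Note that the pattern above depends on the leading number to tell the lines apart. If the real data has no such prefix, one possible single-regex condition is a pair of lookarounds: accept “hello” when it is either not preceded by “<” or not followed by “>”. This is only a sketch under that assumption, and the pattern is my own suggestion rather than something taken from the earlier replies:

```perl
use strict;
use warnings;

# Assumed pattern: "hello" not preceded by "<", or "hello" not followed by ">".
# Only "<hello>" fails both alternatives.
my $re = qr/(?<!<)hello|hello(?!>)/;

for my $s ('hello>', '<hello', 'hello', '<hello>') {
    printf "%-8s %s\n", $s, ($s =~ $re ? 'matches' : "doesn't match");
}
```

Because it is a plain pattern with no code around it, it can be dropped into an if condition as is. Add \b anchors if words such as “helloworld” should be excluded.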