Re: Regex matching words with numbers, but not numbers.

That depends on what you mean by 'pure numbers'. 87? 99.00? 0.5? Let's assume all of those are numbers...

Is there a way to create a regular expression character class that has some mandatory and optional members?

Yes, but you should probably use the function "looks_like_number" from Scalar::Util instead.
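For the "Yes" part, a pure-regex version might use lookaheads: one states the mandatory member (a digit somewhere in the word), another rejects words that are nothing but a number. This is only a sketch. The helper name words_with_digits is made up for illustration, and "number" is assumed here to mean a plain decimal with an optional sign, so it will disagree with looks_like_number on inputs such as 1e5.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every whitespace-delimited word that
# contains a digit but is not a plain decimal such as 87, 99.00 or .5.
sub words_with_digits {
    my ($text) = @_;
    my @found;
    while (
        $text =~ m{
            (?<!\S)                           # start of a word
            (?= \S* \d )                      # mandatory: a digit somewhere
            (?! [+-]? \d* \.? \d* (?!\S) )    # reject a plain decimal number
            ( \S+ )                           # capture the word
        }gx
      )
    {
        push @found, $1;
    }
    return @found;
}

print join( ' | ', words_with_digits('foo 1foo; foo_2 87 99.00 .5 1_1 000') ), "\n";
```

The second lookahead is the fragile bit: every number format you want to exclude has to be spelled out by hand, which is the main reason to prefer looks_like_number.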

What would be your way to match (not necessarily replace) these "words"?

use 5.020;
use warnings;

# for umlauts and stuff... not really necessary
# but a good idea regardless
use utf8;
use open qw{ :encoding(utf-8) :std };

use Scalar::Util 'looks_like_number';

my $string1 = 'foo 1foo;   foo_2   foo-bar() 87   - _ !@#$% ';
my $string2 = 'F? 1_1 99.00 .5 \\x87 14 fourteen !@#99$% 000';

my $test_string = $string1 . $string2;

while ( $test_string =~ m/ (\S+) /gx ) { # or whatever is a "word"
    my ( $word, $start, $end ) = ( $1, $-[0], $+[0] );
    next if $word !~ m/ \d+ /x or looks_like_number($word);
    say qq{"$word" has numbers, but doesn't look like number. Start: $start, end: $end};
}

Output:

"1foo;" has numbers, but doesn't look like number. Start: 4, end: 9
"foo_2" has numbers, but doesn't look like number. Start: 12, end: 17
"1_1" has numbers, but doesn't look like number. Start: 48, end: 51
"\x87" has numbers, but doesn't look like number. Start: 61, end: 65
"!@#99$%" has numbers, but doesn't look like number. Start: 78, end: 85

Further down the road the actual task is to find the position of the next "word".

The positions are stored in the magic arrays @- and @+ (see perlvar):

    @LAST_MATCH_START
    @-
    $-[0] is the offset of the start of the last successful match
...
    @LAST_MATCH_END
    @+
    This array holds the offsets of the ends of the last
    successful submatches in the currently active dynamic scope.
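To make the offsets concrete, here is a small sketch. The sub name match_offsets and the sample pattern are made up for illustration. Index 0 of each array refers to the whole match and index N to capture group N. The end offset points one past the last matched character, so end minus start gives the length to pass to substr.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [start, end] for the whole match and for
# the second capture group, or an empty list if nothing matches.
sub match_offsets {
    my ($text) = @_;
    return unless $text =~ m/ ([[:alpha:]]+) _ (\d+) /x;

    # Copy the offsets right away; the next successful match overwrites them.
    return ( [ $-[0], $+[0] ], [ $-[2], $+[2] ] );
}

my $text = 'abc foo_2 xyz';
my ( $whole, $digits ) = match_offsets($text);

# Recover the matched text from the offsets alone.
my ( $start, $end ) = @$whole;
say_line( substr( $text, $start, $end - $start ), $start, $end );

sub say_line {
    my ( $str, $from, $to ) = @_;
    print qq{"$str" starts at $from and ends at $to\n};
}
```

For finding the position of the next "word", the same idea works inside a while loop with the /g modifier, as in the code above.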

	PerlMonks