
Re: regex search for words with one digit

by Athanasius (Bishop)
on Sep 21, 2020 at 16:03 UTC

in reply to regex search for words with one digit

The character class \w matches an alphanumeric character, so it matches a digit as well as a letter (or underscore). You need a character class which excludes digits. But \D matches anything that is not a digit, so it also matches whitespace. A negated character class [^\d\s] will match a character that is neither a digit nor whitespace:

my @names = $text =~ /\b[^\d\s]*\d[^\d\s]*\b/g;

Or, more simply, specify the letters you want to match explicitly (note the /i modifier to make the regex case-insensitive):

my @names = $text =~ /\b[A-Z]*\d[A-Z]*\b/gi;

See the section “Character Classes and other Special Escapes” in perlre#Regular-Expressions.
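As a rough illustration of the difference between these classes, the following sketch tests a few sample characters against \w, \D and [^\d\s]. The sample characters and the variable names are my own choices for demonstration, not anything from the original question:

```perl
use strict;
use warnings;

# Hypothetical sample characters, chosen only for illustration:
# a letter, an underscore, a digit, a space and a symbol.
my @samples = ('a', '_', '7', ' ', '%');

my @rows;
for my $ch (@samples) {
    # Record 1 or 0 for each character class the character matches.
    push @rows, sprintf "'%s' w=%d D=%d neg=%d",
        $ch,
        ($ch =~ /\w/      ? 1 : 0),
        ($ch =~ /\D/      ? 1 : 0),
        ($ch =~ /[^\d\s]/ ? 1 : 0);
}

print "$_\n" for @rows;
```

Only the digit matches \w without matching \D, and only the space matches \D without matching [^\d\s].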

Hope that helps,

Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re^2: regex search for words with one digit
by AnomalousMonk (Bishop) on Sep 21, 2020 at 18:31 UTC

    Note that these match the name '1'.

    Update: Note also that [^\d\s] matches stuff like % & = - / .
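Both observations can be checked with a short sketch. The sample text is the one used in the follow-up reply in this thread; the array names @negated and @explicit are only illustrative:

```perl
use strict;
use warnings;

# Sample text taken from the follow-up reply in this thread.
my $text = "John P5ete 1 Andrew Richard58 Nic4k Le7on5 Ab5%&=-/zz.";

# The two regexes suggested in the parent node.
my @negated  = $text =~ /\b[^\d\s]*\d[^\d\s]*\b/g;
my @explicit = $text =~ /\b[A-Z]*\d[A-Z]*\b/gi;

print "negated:  @negated\n";    # P5ete 1 Nic4k Ab5%&=-/zz
print "explicit: @explicit\n";   # P5ete 1 Nic4k Ab5
```

Both lists contain the bare '1', and the negated class also picks up the run of symbols after 'Ab5'.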

    Give a man a fish:  <%-{-{-{-<

      Hello AnomalousMonk,

      You make excellent points. Having read over this thread, I think I would now approach this task in a more long-winded — but hopefully safer — way by addressing the requirements separately:

      use strict;
      use warnings;

      my $text  = "John P5ete 1 Andrew Richard58 Nic4k Le7on5 Ab5%&=-/zz.";
      my @words = split /\s+/, $text;
      my @names;

      for my $word (@words) {
          my @chars   = $word =~ /[A-Z]/gi;
          my @digits  = $word =~ /\d/g;
          my @symbols = $word =~ /\W/g;

          push @names, $word if @chars && @digits == 1 && !@symbols;
      }

      print "@names\n";


      19:27 >perl
      P5ete Nic4k
      19:27 >

      This may or may not be exactly what the OP intended, but breaking down the code into separate parts like this at least makes it easier to tweak as and when the requirements are clarified.

      To the OP:

      • \W matches any non-word character; but, as the original string was split on whitespace, there are no whitespace characters in any $word and so within the for loop \W matches the sort of non-alphanumeric symbols identified by AnomalousMonk.
      • if @chars is Perlish shorthand for if scalar(@chars) != 0; similarly, if ... !@symbols is a shorter way of saying if ... scalar(@symbols) == 0.
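To make the two points above concrete, here is a minimal sketch that applies the same three matches to a single word and then evaluates the arrays in scalar and boolean context. The helper variables $count_chars, $count_digits, $count_symbols and $keep are introduced here purely for illustration:

```perl
use strict;
use warnings;

my $word = 'P5ete';    # one of the sample words from the script above

my @chars   = $word =~ /[A-Z]/gi;   # ('P', 'e', 't', 'e')
my @digits  = $word =~ /\d/g;       # ('5')
my @symbols = $word =~ /\W/g;       # empty list: no non-word characters

# Assigning an array to a scalar yields its element count.
my $count_chars   = @chars;
my $count_digits  = @digits;
my $count_symbols = @symbols;

# In boolean context an array is true only if it is non-empty, so this
# is the same condition as in the loop, written out as a 1/0 flag.
my $keep = (@chars && @digits == 1 && !@symbols) ? 1 : 0;

print "chars=$count_chars digits=$count_digits symbols=$count_symbols keep=$keep\n";
```

For 'P5ete' this should report four letters, one digit, no symbols, and a keep flag of 1.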


      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,
