PerlMonks  

User name character classes

by Zenzizenzizenzic (Scribe)
on Feb 22, 2019 at 19:04 UTC ( #1230409=perlquestion )

Zenzizenzizenzic has asked for the wisdom of the Perl Monks concerning the following question:

Is there a way to make a user-defined character class? For instance, if I want to find words like "did" that are consonant, vowel, consonant, I could do this:
print "CVC word found : $1\n" if (/\b([b-df-hj-np-tv-z][aeiouy][b-df-hj-np-tv-z])\b/i);
But I'd prefer (esp. in more complicated examples) to do something like:
print "CVC word found : $1\n" if (/\b([:consonant][:vowel][:consonant])\b/i);
Thank you

Replies are listed 'Best First'.
Re: User name character classes (updated)
by haukex (Bishop) on Feb 22, 2019 at 19:15 UTC

    You could use User Defined Character Properties (example).

    Update:

    sub IsConsonant () {
        "42 44\n46 48\n4A 4E\n50 54\n56 58\n5A\n62 64\n"
       ."66 68\n6A 6E\n70 74\n76 78\n7A\n"
    }
    sub IsVowel () {
        "41\n45\n49\n4F\n55\n59\n61\n65\n69\n6F\n75\n79\n"
    }
    print "CVC word found : $1\n"
        if /\b(\p{IsConsonant}\p{IsVowel}\p{IsConsonant})\b/;

    Update 2: Code you can use to generate sets like the above:

    use warnings;
    use strict;
    use Data::Dump;
    use Set::IntSpan;

    my @chars = grep {!/^[aeiouy]$/i} 'a'..'z','A'..'Z';
    dd join '', map {
            no warnings 'redundant';
            sprintf $$_[0]==$$_[1] ? "%X\n" : "%X %X\n", @$_
        } Set::IntSpan->new( map {ord} @chars )->spans;

    __END__
    "42 44\n46 48\n4A 4E\n50 54\n56 58\n5A\n62 64\n66 68\n6A 6E\n70 74\n76 78\n7A\n"
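
If installing Set::IntSpan is not an option, the same range table can probably be built with core Perl alone. The following is only a sketch: `chars_to_property` is a hypothetical helper name, not something from the thread or from any module, and it assumes the same "y counts as a vowel" rule used above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn a list of characters into
# the "HEX HEX\n" range format that a user-defined property sub returns.
sub chars_to_property {
    my @cp = sort { $a <=> $b } map { ord } @_;
    my @spans;
    for my $c (@cp) {
        if (@spans && $c == $spans[-1][1] + 1) {
            $spans[-1][1] = $c;          # extend the current run
        }
        elsif (!@spans || $c > $spans[-1][1]) {
            push @spans, [$c, $c];       # start a new run (duplicates are skipped)
        }
    }
    return join '', map {
        $_->[0] == $_->[1] ? sprintf("%X\n", $_->[0])
                           : sprintf("%X %X\n", @$_)
    } @spans;
}

# Same selection as above: ASCII letters that are not a, e, i, o, u or y.
my @consonants = grep { !/^[aeiouy]$/i } 'a'..'z', 'A'..'Z';
my $table = chars_to_property(@consonants);
print $table;
```

The printed table should match the string returned by IsConsonant above, one range per line.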
      Will check this out as well, thanks much!
Re: User name character classes
by AnomalousMonk (Bishop) on Feb 22, 2019 at 19:58 UTC

    If you want to avoid scary experimental features, the "classic" way to do this is by composition:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my $rx_vowel = qr{ [aeiouAEIOU] }xms;
     my $rx_cons  = qr{ (?! $rx_vowel) [[:alpha:]] }xms;
     ;;
     my $rx_cvc = qr{ \b $rx_cons $rx_vowel $rx_cons \b }xms;
     ;;
     my $s = 'when did the cop get his cap and tape and run out too?';
     my @captures = $s =~ m{ $rx_cvc }xmsg;
     dd \@captures;
    "
    ["did", "cop", "get", "his", "cap", "run"]
    See perlre, perlretut, and perlrequick.

    Update 1: Changed example string slightly to highlight difference between "cap" and "tape" (was "cape").

    Update 2: Another ancient way to do this purely with character classes may be slightly faster because it avoids a lookaround:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
    "my $vowel = 'aeiouAEIOU';
     my $rx_vowel = qr{ [\Q$vowel\E] }xms;
     my $rx_cons  = qr{ [^[:^alpha:]\Q$vowel\E] }xms;
     ;;
     my $rx_cvc = qr{ \b $rx_cons $rx_vowel $rx_cons \b }xms;
     ;;
     my $s = 'when did the cop get his cap and tape and run out too?';
     my @captures = $s =~ m{ $rx_cvc }xmsg;
     dd \@captures;
    "
    ["did", "cop", "get", "his", "cap", "run"]
    Note, however, the tricky double-negative  [^[:^alpha:]\Q$vowel\E] in $rx_cons: "not not-an-alpha or a vowel". See "The POSIX character class syntax" in perlre.
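
To see what the double negative accepts and rejects, a small check like the one below may help. It is a sketch only: `is_consonant` is a hypothetical wrapper introduced here for illustration, and it keeps the thread's vowel list, so "Y" counts as a consonant.

```perl
use strict;
use warnings;

my $vowel   = 'aeiouAEIOU';
# "Anything that is neither a non-alpha nor a vowel", i.e. an alpha that is not a vowel.
my $rx_cons = qr{ [^[:^alpha:]\Q$vowel\E] }xms;

# Hypothetical helper: true if the whole string is exactly one consonant.
sub is_consonant {
    my ($ch) = @_;
    return $ch =~ m{ \A $rx_cons \z }xms ? 1 : 0;
}

for my $ch ('d', 'a', 'Y', '7', '_', ' ') {
    printf "%-4s %s\n", "'$ch'",
        is_consonant($ch) ? 'consonant' : 'not a consonant';
}
```

Digits, underscores and whitespace are rejected by the [:^alpha:] part of the class, and vowels by the interpolated list.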


    Give a man a fish:  <%-{-{-{-<

      scary experimental features

      Hm, although I see a mention that (?[ ]) is experimental, it doesn't seem to me like User Defined Character Properties are experimental...?

        ... it doesn't seem to me like User Defined Character Properties are experimental ...

        I think you're right. I just panicked there for a second.


        Give a man a fish:  <%-{-{-{-<

      Will look into these, thanks much!
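
For comparison, here is a sketch of the (?[ ]) extended bracketed character class mentioned in the subthread above, which supports set subtraction directly. Assumptions: Perl 5.18 or later; the construct was experimental when this thread was written (and, as far as I know, stopped being so in 5.36), so the warning category is disabled explicitly. `find_cvc` is a hypothetical helper name, not from the thread.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # assumption: needed on perls where (?[ ]) is still experimental

my $rx_vowel = qr{[aeiouAEIOU]};
# Set subtraction: ASCII letters minus the vowels.
my $rx_cons  = qr{(?[ [A-Za-z] - [aeiouAEIOU] ])};
my $rx_cvc   = qr{\b $rx_cons $rx_vowel $rx_cons \b}x;

# Hypothetical helper: return every consonant-vowel-consonant word in a string.
sub find_cvc {
    my ($s) = @_;
    return $s =~ /$rx_cvc/g;
}

my @captures = find_cvc('when did the cop get his cap and tape and run out too?');
print "@captures\n";   # prints: did cop get his cap run
```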
