Re: Re: japhy's regex article for the TPJ

by japhy (Canon)
in reply to Re: japhy's regex article for the TPJ
in thread japhy's regex article for the TPJ

Actually, I'm glad you brought this up. Perl 5.8.4 has improved support (thanks to me) for creating your own Unicode character classes, and even for building cascading ones, where one class is defined in terms of others. The documentation is in perlunicode, and here's an example (you need at least Perl 5.8.4 for this to work):
package MyUnicode;

# A-Z and a-z, one range per line
sub InLetters {
    return <<'END';
0041 005a
0061 007a
END
}

# AEIOU and aeiou, one code point per line
sub InVowels {
    return <<'END';
0041
0045
0049
004f
0055
0061
0065
0069
006f
0075
END
}

# cascading class: all letters, minus the vowels
sub InConsonants {
    return <<'END';
+MyUnicode::InLetters
-MyUnicode::InVowels
END
}

package main;

my $string = "Chicken Stromboli";
while ($string =~ /(\p{MyUnicode::InConsonants}+)/g) {
    print "consonant cluster: '$1'\n";
}

__END__
consonant cluster: 'Ch'
consonant cluster: 'ck'
consonant cluster: 'n'
consonant cluster: 'Str'
consonant cluster: 'mb'
consonant cluster: 'l'
I could write about that, and explain the new '&' class operator, which lets you take the intersection of two or more Unicode classes.
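
To give a rough idea of what the intersection looks like, here is a minimal sketch. The InUpper and InUpperVowels classes and the test string are made up for illustration, and this is only one way to lay it out; a '&' line narrows whatever the preceding lines have built up, so it follows a '+' line.

```perl
use strict;
use warnings;

package MyUnicode;

# Illustrative class: the ASCII vowels, one code point per line.
sub InVowels {
    return <<'END';
0041
0045
0049
004f
0055
0061
0065
0069
006f
0075
END
}

# Illustrative class: the ASCII uppercase range A-Z.
sub InUpper {
    return <<'END';
0041 005a
END
}

# Start from the vowels ('+'), then keep only the code points
# that are also in InUpper ('&'), i.e. the intersection.
sub InUpperVowels {
    return <<'END';
+MyUnicode::InVowels
&MyUnicode::InUpper
END
}

package main;

my $string = "AEon IOta";

# Collect each run of characters that belong to both classes.
my @runs = $string =~ /(\p{MyUnicode::InUpperVowels}+)/g;
print "uppercase vowel run: '$_'\n" for @runs;
```

Only characters that are in both InVowels and InUpper can match, so the lowercase vowels in the string are skipped.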

I like this idea. Maybe I can do this and one other topic -- I don't want the article to be too widely scoped.

