PerlMonks  

Re: [[:punct:]] vs. {IsPunct} in 5.8

by particle (Vicar)
on Nov 02, 2003 at 14:26 UTC (#303910)


in reply to :punct: vs. {IsPunct} in 5.8

for some background, perlre (5.008) states:

The following equivalences to Unicode \p{} constructs and equivalent backslash character classes (if available), will hold:

    [:...:]     \p{...}        backslash

    alpha       IsAlpha
    alnum       IsAlnum
    ascii       IsASCII
    blank       IsSpace
    cntrl       IsCntrl
    digit       IsDigit        \d
    graph       IsGraph
    lower       IsLower
    print       IsPrint
    punct       IsPunct
    space       IsSpace
                IsSpacePerl    \s
    upper       IsUpper
    word        IsWord
    xdigit      IsXDigit

For example "[:lower:]" and "\p{IsLower}" are equivalent.

if your results match mine,

    #!/usr/bin/perl
    use strict;
    use warnings;
    $|++;

    my %classes= qw/
        alpha  IsAlpha
        alnum  IsAlnum
        ascii  IsASCII
        blank  IsBlank
        cntrl  IsCntrl
        digit  IsDigit
        graph  IsGraph
        lower  IsLower
        print  IsPrint
        punct  IsPunct
        space  IsSpace
        upper  IsUpper
        word   IsWord
        xdigit IsXDigit
    /;

    for( keys %classes )
    {
        my( $r_posix, $r_unicode )=
            ( qr/[[:$_:]]/, qr/\p{$classes{$_}}/ );
        print "testing $r_posix and $r_unicode$/";
        for my $x (0x00..0x7e)
        {
            local $_= chr $x;
            printf "0x%x (%3d.) differ$/" => $x, $x
                if /$r_posix/ xor /$r_unicode/;
        }
    }

    __END__
    testing (?-xism:[[:digit:]]) and (?-xism:\p{IsDigit})
    testing (?-xism:[[:upper:]]) and (?-xism:\p{IsUpper})
    testing (?-xism:[[:xdigit:]]) and (?-xism:\p{IsXDigit})
    testing (?-xism:[[:cntrl:]]) and (?-xism:\p{IsCntrl})
    testing (?-xism:[[:alnum:]]) and (?-xism:\p{IsAlnum})
    testing (?-xism:[[:space:]]) and (?-xism:\p{IsSpace})
    testing (?-xism:[[:print:]]) and (?-xism:\p{IsPrint})
    testing (?-xism:[[:ascii:]]) and (?-xism:\p{IsASCII})
    testing (?-xism:[[:word:]]) and (?-xism:\p{IsWord})
    testing (?-xism:[[:alpha:]]) and (?-xism:\p{IsAlpha})
    testing (?-xism:[[:punct:]]) and (?-xism:\p{IsPunct})
    0x24 ( 36.) differ
    0x2b ( 43.) differ
    0x3c ( 60.) differ
    0x3d ( 61.) differ
    0x3e ( 62.) differ
    0x5e ( 94.) differ
    0x60 ( 96.) differ
    0x7c (124.) differ
    0x7e (126.) differ
    testing (?-xism:[[:lower:]]) and (?-xism:\p{IsLower})
    testing (?-xism:[[:blank:]]) and (?-xism:\p{IsBlank})
    testing (?-xism:[[:graph:]]) and (?-xism:\p{IsGraph})

then i'd list this as a bug, and contact p5p. in the range tested (0x00..0x7e), only [[:punct:]] and \p{IsPunct} differ, and they differ on nine characters: $ + < = > ^ ` | ~. that is not the behavior perlre documents.

~Particle *accelerates*


Re: Re: [[:punct:]] vs. {IsPunct} in 5.8
by graff (Chancellor) on Nov 02, 2003 at 17:03 UTC
    Thanks for such a nicely crafted verification. (I wanted to check the other POSIX vs. Unicode classes as well, so you saved me some trouble -- and showed a neat approach!)

    I have posted the observation to both the perl5-porters and perl-unicode mailing lists.

Re: Re: [[:punct:]] vs. {IsPunct} in 5.8
by dakkar (Hermit) on Nov 02, 2003 at 20:45 UTC

    It's a bug alright. A documentation bug...

    I checked the Unicode properties, and these are the results:

    Codepoint  Char  Class
    0024       $     Currency Symbol
    002B       +     Math Symbol
    003C       <     Math Symbol
    003D       =     Math Symbol
    003E       >     Math Symbol
    005E       ^     Modifier Symbol
    0060       `     Modifier Symbol
    007C       |     Math Symbol
    007E       ~     Math Symbol

    So those are not "punctuation" according to the Unicode standard... Time for a PunctPerl class, to keep SpacePerl company?
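A rough sketch of how the lookup above could be reproduced from Perl itself, assuming the core Unicode::UCD module is available. The helper name ascii_mismatches and the pattern [\p{P}\p{S}] (Punctuation plus Symbol) are illustrative assumptions, not an existing Perl class; they only approximate what a "PunctPerl" might look like over the ASCII range tested in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# Hypothetical helper: return the code points in 0x00..0x7e on which
# two patterns disagree (same range as the test script in this thread).
sub ascii_mismatches {
    my ($re_a, $re_b) = @_;
    my @diff;
    for my $cp (0x00 .. 0x7e) {
        my $ch = chr($cp);
        push @diff, $cp if ($ch =~ $re_a) xor ($ch =~ $re_b);
    }
    return @diff;
}

# Characters on which [[:punct:]] and \p{IsPunct} disagree.
my @odd = ascii_mismatches(qr/[[:punct:]]/, qr/\p{IsPunct}/);

# Print each one's Unicode general category code (Sc, Sm, Sk, ...).
for my $cp (@odd) {
    my $info = charinfo($cp);
    printf "U+%04X %s  %s\n", $cp, chr($cp), $info->{category};
}

# Assumption: approximate a "PunctPerl" as Punctuation (P) plus Symbol (S),
# then check whether it agrees with [[:punct:]] over the same range.
my $punct_perl = qr/[\p{P}\p{S}]/;
my @left = ascii_mismatches(qr/[[:punct:]]/, $punct_perl);
print @left ? "still differs at: @left\n"
            : "no differences in 0x00..0x7e\n";
```

The comparison only covers 0x00..0x7e; outside ASCII, P plus S takes in many more characters, so this says nothing about how such a class should behave there.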

    -- 
            dakkar - Mobilis in mobile
    

    Most of my code is tested...

    Perl is strongly typed, it just has very few types (Dan)
