
Re: About \d \w and \s

by ambrus (Abbot)
on Oct 19, 2009 at 08:59 UTC (#801952)

in reply to About \d \w and \s

To clarify: does the Unicode variant treat byte strings as if they were ISO-8859-1 encoded? (There's also the question of how the use locale variant treats character strings. Currently it assumes the string was accidentally ISO-8859-1 decoded, except where it has characters with code points above 255. It's probably always an error to depend on this, though, so it doesn't matter.)

Strangely, it seems I don't have any obfus that use syntax like m/foobar/and (the closest I have is y//or in Ode for getprotobyname), so for a change this will be a new feature of perl core that does not break any of my obfus.

Replies are listed 'Best First'.
Re^2: About \d \w and \s
by demerphq (Chancellor) on Oct 19, 2009 at 21:54 UTC

    If I remember correctly what ISO-8859-1 is, then I think so, yes. In simple terms, the rules will be those of Unicode even though the codepoints are represented as bytes. In other words, the match would behave the same as if you had done a utf8::upgrade() on the string before the match.
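
The utf8::upgrade() comparison can be sketched roughly as follows. This is only an illustration, assuming an ASCII platform, no use locale, and no unicode_strings feature in effect (so no "use v5.12" or later); the helper last_char_is_word and the sample string are made up for the example. It shows a byte string holding 0xE9 (e-acute in ISO-8859-1) failing \w under the old byte rules, and the same characters matching once the string is upgraded.

```perl
use strict;
use warnings;

# "\xE9" is LATIN SMALL LETTER E WITH ACUTE in both ISO-8859-1 and Unicode.
# Built this way, the string is stored as plain bytes (not UTF-8 flagged).
my $bytes = "caf\xE9";

# Hypothetical helper: 1 if the last character of the string matches \w, else 0.
sub last_char_is_word {
    my ($s) = @_;
    return $s =~ /\w\z/ ? 1 : 0;
}

# Byte string, default rules: only ASCII word characters count.
my $before = last_char_is_word($bytes);

# Same characters, but stored internally as UTF-8, so Unicode rules apply.
my $upgraded = $bytes;
utf8::upgrade($upgraded);
my $after = last_char_is_word($upgraded);

# The upgrade changes the representation, not the characters.
my $same = $upgraded eq $bytes ? 1 : 0;

print "before upgrade: $before\n";
print "after upgrade:  $after\n";
print "still equal:    $same\n";
```

Under the Unicode variant described above, the first match would be expected to give the same answer as the second without any explicit upgrade.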

    How the regex engine works under use locale will not be changed, except that it won't be "all or nothing": you will be able to turn it on for sections of a pattern. I don't pretend to understand the use locale mode and I don't plan to do much with it. (I'd like it if use locale "went away", actually.)
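
As a rough sketch of per-section rules: as far as I know, the form this eventually took in released perls (5.14 and later) is the inline (?u:...), (?a:...) and (?l:...) groups. That mapping is an assumption on my part, since the post above names no syntax. The example below sticks to the Unicode and ASCII groups so that it does not depend on the system locale; the variable names are invented for illustration.

```perl
use strict;
use warnings;

# Assumes perl 5.14 or later, where (?u:...) and (?a:...) are available.
my $e_acute = "\xE9";    # a byte string, never upgraded

# Unicode rules for this group: 0xE9 is a letter, so \w matches.
my $unicode_rules = $e_acute =~ /\A(?u:\w)\z/ ? 1 : 0;

# ASCII-restricted rules for this group: \w is [A-Za-z0-9_] only.
my $ascii_rules = $e_acute =~ /\A(?a:\w)\z/ ? 1 : 0;

# Both kinds of rules inside one pattern, each applying to its own section.
my $mixed_re  = qr/\A(?u:\w)(?a:\w)\z/;
my $mixed_ok  = "\xE9x" =~ $mixed_re ? 1 : 0;   # e-acute first, then ASCII
my $mixed_bad = "x\xE9" =~ $mixed_re ? 1 : 0;   # e-acute lands in the ASCII part

print "unicode group: $unicode_rules\n";
print "ascii group:   $ascii_rules\n";
print "mixed, E9 x:   $mixed_ok\n";
print "mixed, x E9:   $mixed_bad\n";
```

A (?l:...) group would switch that section to locale rules in the same way, with results depending on the locale in effect.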

