PerlMonks  

Re: UTF8 related proof of concept exploit released at T-DOSE

by demerphq (Chancellor)
on Oct 15, 2007 at 23:50 UTC (#645068)


in reply to UTF8 related proof of concept exploit released at T-DOSE

Because this is probably a side effect of something

I'm not sure what you mean. My guess is that it comes from the internals: when the regex engine tries to read a codepoint from the string and finds that it's not valid, it dies.

The solution is very simple: do not use :utf8, but use :encoding(UTF8) (or for strict Unicode compliant UTF-8, use :encoding(UTF-8) (same, but with a hyphen)), as should have been done in the first place.

That's really crappy. It's Huffman-coded all wrong. IMO this should be raised on perl5-porters with some thought to changing it for the better.
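For reference, here is roughly what the two spellings look like, plus what "strict" means in practice. This is a minimal sketch, assuming a Perl 5 with the core Encode module; the file name, the sample byte strings and the `strict_decode` helper are made up for illustration. Strictness is shown with `Encode::decode` rather than a real file so the snippet runs without any input.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The two layer spellings under discussion (file name is hypothetical):
#   open my $fh, '<:utf8', 'input.txt';             # flags data as UTF-8, no strict validation
#   open my $fh, '<:encoding(UTF-8)', 'input.txt';  # decodes and validates via Encode

# 0xC3 starts a two-byte sequence, but 0x28 is not a continuation byte,
# so this is not well-formed UTF-8.
my $malformed = "\xC3\x28";

# "caf" followed by U+00E9, encoded as UTF-8 bytes (5 bytes, 4 characters).
my $valid = "caf\xC3\xA9";

# Strict decoding: FB_CROAK makes decode() die on malformed input.
# LEAVE_SRC keeps decode() from modifying the source string.
sub strict_decode {
    my ($bytes) = @_;
    my $text = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return $text;    # undef if the input was rejected
}

my $ok  = strict_decode($valid);
my $bad = strict_decode($malformed);

print "valid input decoded to ", length($ok), " characters\n";
print "malformed input ", (defined $bad ? "accepted" : "rejected"), "\n";
```

The point of the sketch is only that a validating decoder refuses the malformed bytes up front, instead of letting them reach the regex engine.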

---
$world=~s/war/peace/g


Re^2: UTF8 related proof of concept exploit released at T-DOSE
by Juerd (Abbot) on Oct 16, 2007 at 01:05 UTC

    Because this is probably a side effect of something
    I'm not sure what you mean.

    I mean that I find it surprising that enabling warnings suddenly makes the program die. It should warn, not die. Or, alternatively, it should die even without "use warnings".
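The distinction between a plain warning and a fatal one can be sketched as follows. This is an illustration only: it uses the `uninitialized` warning category as a stand-in because it triggers the same way on every Perl version, whereas the malformed-UTF-8 case in this thread depends on the Perl version. The `outcome` helper is a made-up name.

```perl
use strict;

# Runs a code block and reports whether it warned, died, or did neither.
sub outcome {
    my ($code) = @_;
    my $warned = 0;
    local $SIG{__WARN__} = sub { $warned = 1 };
    my $lived = eval { $code->(); 1 };
    return !$lived ? 'died' : $warned ? 'warned' : 'silent';
}

my $plain = outcome(sub {
    use warnings;                             # non-fatal: expected to warn only
    my $undef;
    my $sum = $undef + 1;
});

my $fatal = outcome(sub {
    use warnings FATAL => 'uninitialized';    # explicitly promoted to fatal
    my $undef;
    my $sum = $undef + 1;
});

print "plain use warnings: $plain\n";
print "FATAL => category:  $fatal\n";
```

With plain `use warnings` the block keeps running after the warning; only the explicit FATAL form turns it into an exception.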

    "use warnings" without FATAL argument should not introduce fatal errors to the language. I suspect that the fatal exception is a side effect, not the intended behaviour.

    The solution is very simple: do not use :utf8, but use :encoding(UTF8) (or for strict Unicode compliant UTF-8, use :encoding(UTF-8) (same, but with a hyphen)), as should have been done in the first place.
    That's really crappy. It's Huffman-coded all wrong. IMO this should be raised on perl5-porters with some thought to changing it for the better.

    I agree that the Huffman coding here is entirely wrong. Everything surrounding identifiers for the UTF8 flag, including its own names "svUTF8" and "the UTF8 flag", is very unfortunate. The very short name for the :utf8 PerlIO layer is downright dangerous if :encoding(utf8) is the correct style.

    However, I insist that :utf8 must not be made an abbreviation for :encoding(UTF-8), because that would encourage people to use :utf8, which in 5.8.0 through 5.8.8 is a security risk, and these versions will stay around for a long time.

    One solution that comes to mind is:

    1. Rename :utf8 to :_svUTF8. It is a direct interface to internals and should look like that.
    2. Keep support for :utf8 for backwards compatibility, but issue a mandatory warning.

    Optionally:
      3. Allow ":enc" as an abbreviation for ":encoding"
      4. Allow "=foo" as an abbreviation for "(foo)" so you can have ":enc=utf8" which is doable

    1 and 2 are, IMO, a good solution for a real problem. I'm not so sure 3 and 4 would be good: they'd make programs and modules depend on a new version of Perl only for syntactic sugar.
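Until something like point 2 exists in Perl itself, the same idea can be approximated in user code. The following is a rough sketch, not anything Perl provides: `check_layers` is a hypothetical helper that inspects a layer specification string and returns a message when the bare :utf8 layer is used.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): returns a warning message if the
# layer specification uses the bare :utf8 layer, or nothing otherwise.
sub check_layers {
    my ($layers) = @_;
    # Match ":utf8" as a whole layer name. ":encoding(utf8)" does not match,
    # because there "utf8" follows "(" rather than ":".
    if ($layers =~ /:utf8\b/) {
        return sprintf("layer spec %s uses :utf8; consider :encoding(UTF-8)",
            $layers);
    }
    return;
}

for my $spec ('<:utf8', '<:raw:utf8', '<:encoding(UTF-8)', '>:encoding(utf8)') {
    my $msg = check_layers($spec);
    print defined $msg ? "WARN: $msg\n" : "ok:   $spec\n";
}
```

A check like this could run in a test suite or a code-review script, which avoids depending on a newer Perl just to get the warning.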

    Juerd # { site => 'juerd.nl', do_not_use => 'spamtrap', perl6_server => 'feather' }
