PerlMonks  

Re^6: Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?

by dave_the_m (Prior)
on Dec 22, 2016 at 10:44 UTC ( #1178357=note )


in reply to Re^5: Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?
in thread Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?

So what new problem was addressed by the 5.17 changes?
I can't remember the full details off the top of my head, but amongst other issues, there was a bug in the 5.8.1 implementation that, with a suitably crafted set of keys, could trick the hash code into doubling the number of buckets for every key added, making it trivial to exhaust a web server's memory. It was also shown that the ordering of keys extracted from a hash (such as a web server returning unsorted headers) could be used to determine the server's hash seed.
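The failure mode described here belongs to the well-known family of algorithmic-complexity attacks on hash tables. Perl's actual internals aren't reproduced below; this is only a toy separate-chaining table in Python, with every name (`weak_hash`, `ToyHash`, the crafted keys) invented for illustration, showing why a predictable, unkeyed hash function is exploitable:

```python
# Toy separate-chaining hash table. With a predictable, unkeyed hash
# function an attacker can precompute keys that all land in the same
# bucket, degrading O(1) inserts/lookups to linear scans.

def weak_hash(key):
    # Predictable multiply-and-add hash with no secret seed.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h

class ToyHash:
    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, key, value):
        self.buckets[weak_hash(key) % len(self.buckets)].append((key, value))

    def longest_chain(self):
        return max(len(b) for b in self.buckets)

# Offline, craft 100 keys whose hashes are all congruent mod 8,
# so every one of them falls into bucket 0.
colliding = [k for k in (f"key{i}" for i in range(4000))
             if weak_hash(k) % 8 == 0][:100]

t = ToyHash()
for k in colliding:
    t.insert(k, None)

print(t.longest_chain())  # → 100: every crafted key shares one bucket
```

Mixing a secret per-process seed into the hash function means an attacker can no longer precompute such a colliding key set offline, which is the point of the randomization work.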
And has anyone ever seen a plausible demonstration of that "new problem"?
On the security list I've seen simple code (that puts a particular sequence of keys into a hash) that can crash the perl process.
Has there ever been a reported sighting of anyone exploiting that new problem in the field?
That shouldn't be the criterion for fixing security issues.
If the change is so critical, why wasn't it back-ported to 5.10 and other earlier versions that are still being shipped with 95% of *nix distributions?
We backported the relevant changes to all maintained perl versions. It's up to vendors whether they patch old unsupported perl versions if they still ship them.

Dave.


Re^7: Our perl/xs/c app is 30% slower with 64bit 5.24.0, than with 32bit 5.8.9. Why?
by BrowserUk (Pope) on Dec 22, 2016 at 11:06 UTC
    It was also shown that the ordering of keys extracted from a hash (like a web server returning unsorted headers) could be used to determine the server's hash seed.

    That's a demonstration I would like to see. As in, someone actually deducing it from the returned headers of a system they otherwise have no visibility to; rather than just a theoretical speculation that it might be possible.

    Basically, I don't believe that this theoretical possibility could ever be actually exploited. (But I did also say: (IMO) above.)


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
    In the absence of evidence, opinion is indistinguishable from prejudice.
      That's a demonstration I would like to see. As in, someone actually deducing it from the returned headers of a system they otherwise have no visibility to; rather than just a theoretical speculation that it might be possible.
      On the security list, someone posted (1) a short perl program which created a hash with 28 shortish random word keys (i.e. those matching /[a-z]{2,12}/), and then printed those keys to stdout in unsorted order; (2) a C program, which given as input that list of keys, in 785 CPU seconds was able to completely determine the random hash seed of that perl process.

      Given that it is common for web apps to output headers or parameters or other things which are, or are derived from, unsorted hash keys, then put those two together and you get remote seed determination. I don't think anyone went as far as actually demonstrating it against a web server.
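The seed-recovery idea is easy to illustrate at toy scale. The sketch below is Python, not the C program from the security list, and everything in it (`seeded_hash`, `key_order`, the 28 `hdrN` keys, the deliberately tiny 14-bit seed space) is made up for the demonstration; the real search covered a full-width seed and took the 785 CPU seconds quoted above:

```python
# Toy illustration of hash-seed recovery from key ordering. A keyed
# hash determines iteration order; an attacker who sees that order
# over known keys can brute-force seeds until one reproduces it.

def seeded_hash(key, seed):
    # FNV-1a-style mixing, keyed with a secret seed (toy-sized here).
    h = seed & 0xFFFFFFFF
    for ch in key:
        h = ((h ^ ord(ch)) * 0x01000193) & 0xFFFFFFFF
    return h

def key_order(keys, seed, nbuckets=64):
    # The order a bucketed hash table would hand the keys back in.
    return sorted(keys, key=lambda k: (seeded_hash(k, seed) % nbuckets, k))

SECRET = 12345                       # the "server's" hidden seed
keys = [f"hdr{i}" for i in range(28)]
observed = key_order(keys, SECRET)   # what the server leaks

# Attacker side: try every candidate seed until the order matches.
recovered = next(s for s in range(2**14)
                 if key_order(keys, s) == observed)
print(recovered)  # a seed that reproduces the leaked ordering
```

The principle is the same as in the real attack: the observed iteration order over known keys discards candidate seeds until only those that reproduce it remain.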

      Dave.

        > a short perl program which created a hash with 28 shortish random word keys

        What Perl version? Before 5.18 with hash randomization or more recent?

        ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
        On the security list, someone posted (1) a short perl program which created a hash with 28 shortish random word keys (i.e. those matching /[a-z]{2,12}/), and then printed those keys to stdout in unsorted order; (2) a C program, which given as input that list of keys, in 785 CPU seconds was able to completely determine the random hash seed of that perl process.

        Okay. Is there any chance of laying my hands on the sources for the C program?

        I'd be a whole lot more impressed if the keys were a set of real (or at least realistic) headers, say something like this:

        1. But even if that could still be done in a similar timeframe -- which I think is highly doubtful -- in order to exploit that knowledge, they would then need to cause the server to generate a set of headers that provoked the pathological behaviour.

          How can an external party cause a server to generate a set of headers that are carefully crafted to induce the pathological behaviour that is the apparent root of the perceived problem?

        2. And, how many web servers would still be running that same perl process, with that same random seed 15 minutes later?
        3. And how many sites are there that run a single server process with a single persistent Perl process?
        4. And how many of those emit sufficient, short, and unsorted headers for the determination to be made?
        5. And how many of those accept sufficient input from a remote user, to that same perl process, such that the bad guys, having determined the seed value, can construct a pathological set of keys of sufficient size to cause harm, and then persuade the process to accept those values and build a hash with them?

        I'm just not seeing the threat landscape where such a combination of requirements will exist. And even if they did, they would be so few and far between, and on such small websites -- single servers with a single permanent perl process are basically confined to schools, charities and mom-and-pop stores -- that no hacker is ever going to waste their time trying to find them, much less exploit them.

        In any case, my comment about "unnecessary" was little more than a footnote in my suggestion above that the OP could try reverting his 5.24 perl to using the 5.8.9 hashing mechanism to see if that was the source of his performance issue. If it isn't, one more thing to ignore. If it turned out it was, he could decide if his application was even remotely vulnerable to the "security concern" and choose to revert or not as he saw fit.


