PerlMonks  

Cloudflare blames PCRE for outage

by dmitri (Priest)
on Jul 16, 2019 at 00:31 UTC ( [id://11102903]=perlnews )

This is the regular expression that caused all Cloudflare servers to use 100% CPU and thereby cause a 27-minute outage:
(?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))
The last part of the regex is odd:
.*(?:.*=.*)
The last grouping does not do anything useful: it is not followed by a quantifier, nor does it capture. It can be simplified to
.*=.*
or to some variation thereof. But this is not how the regex discussion ends in the blog post:
But laziness isn’t the total solution to this backtracking behaviour. Changing the catastrophic example .*.*=.*; to .*?.*?=.*?; doesn’t change its run time at all. x=x still takes 555 steps and x= followed by 20 x’s still takes 5,353 steps.

The only real solution, short of fully re-writing the pattern to be more specific, is to move away from a regular expression engine with this backtracking mechanism. Which we are doing within the next few weeks.
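
For anyone who wants to poke at both claims, here is a minimal Python sketch. It is not PCRE and not Cloudflare's code. The first part uses Python's own `re` module to check, on a handful of sample strings, that the tail as written and the simplified tail agree. The second part is a toy backtracking matcher (the names `same_verdict`, `tail_tokens`, `subject` and `count_steps` are made up for this illustration) that counts recursive calls, so its numbers will not match the step counts quoted from the blog post. It only handles `.*`, `.*?` and literal characters, and it assumes the subject contains no newlines. What it does show is that on a subject that cannot match, greedy and lazy quantifiers visit exactly the same search space, and that the work grows faster than the input.

```python
import re

# The tail as written in the rule, and the simplified form suggested above.
ORIGINAL_TAIL = re.compile(r".*(?:.*=.*)")
SIMPLE_TAIL = re.compile(r".*=.*")


def same_verdict(text):
    """True if both tails agree on whether they match the whole string."""
    return bool(ORIGINAL_TAIL.fullmatch(text)) == bool(SIMPLE_TAIL.fullmatch(text))


def tail_tokens(lazy):
    """Token list for `.*.*=.*;` (lazy=False) or `.*?.*?=.*?;` (lazy=True)."""
    star = ("star", lazy)
    return [star, star, ("lit", "="), star, ("lit", ";")]


def subject(n):
    """`x=` followed by n x's and no semicolon, so the pattern must fail."""
    return "x=" + "x" * n


def count_steps(tokens, text):
    """Run a naive backtracking matcher anchored at position 0.

    Returns (matched, steps), where steps is the number of recursive calls.
    """
    steps = 0

    def match(ti, pos):
        nonlocal steps
        steps += 1
        if ti == len(tokens):
            return True
        kind, arg = tokens[ti]
        if kind == "lit":
            return pos < len(text) and text[pos] == arg and match(ti + 1, pos + 1)
        # Wildcard: try every possible end position.
        # Greedy tries the longest run first, lazy tries the shortest first.
        ends = range(pos, len(text) + 1)
        if not arg:
            ends = reversed(ends)
        return any(match(ti + 1, end) for end in ends)

    return match(0, 0), steps


if __name__ == "__main__":
    for n in (5, 10, 20, 40, 80):
        greedy = count_steps(tail_tokens(False), subject(n))[1]
        lazy = count_steps(tail_tokens(True), subject(n))[1]
        print(n, greedy, lazy)
```

The two columns come out identical because a failed match has to exhaust every split of the input between the two leading wildcards, whichever order those splits are tried in. Laziness only changes which successful match is found first.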

I am guessing this is politically driven: Some people at Cloudflare want to use Rust and this snafu is a convenient excuse.

Another angle to consider is that of personnel. The postmortem does not dwell on the fact that this regular expression made it through review. That means not only was the person who wrote the regular expression unaware of its backtracking potential, but so was the reviewer.

Replies are listed 'Best First'.
Re: Cloudflare blames PCRE for outage
by Tanktalus (Canon) on Jul 16, 2019 at 01:07 UTC

    And when you have an SQL query that runs too slow, that's an indication that you should switch to NoSQL. And when you are bathing an infant and the water gets too dirty, that's an indication that you should throw out the baby with the bath water.

    As I've switched jobs, and am now living in a C# and JavaScript world, I've found developers to be extremely paranoid about regular expressions. They aren't the best tool for all jobs, but they are the best tool for some jobs. And it is beneficial not only to know what those jobs are, but to know the language well enough to get those jobs done.

    So, I'll grant that they may be best off removing the regexes and replacing them with something else. That may or may not be because regexes aren't the right tool for this job. But, more importantly, their developers don't really understand the language, or at least not well enough for a requirement this complex. (I'll freely admit: I don't know regexes well enough to write an XML parser in them, so I wouldn't, even if that was the best language for that job - I doubt it is, but whether it is or isn't is moot if I'm the one coding it.)

    And I'll say that I'm finding a lot of devs' understanding of SQL is in much the same boat. I usually liken the two languages, regex and SQL, to each other, as they are both weird non-C-style languages (more so than Python) where everything is just insane if you don't really know what you're doing. And if you do know what you're doing, they're not really that bad.

      > And when you have an SQL query that runs too slow, that's an indication that you should switch to NoSQL.

      And when PCRE goes wild you have to blame Perl.

      A friend of mine ridiculed Perl after this incident and he found it hard to believe that PCRE is a library which is used in many languages BUT Perl.

      (Actually he didn't listen much; some people fly to India to enjoy Indian Summer.)

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery
      FootballPerl is like chess, only without the dice

Re: Cloudflare blames PCRE for outage
by trippledubs (Deacon) on Jul 16, 2019 at 02:23 UTC
Re: Cloudflare blames PCRE for outage
by Anonymous Monk on Jul 16, 2019 at 07:06 UTC

    > I am guessing this is politically driven: Some people at Cloudflare want to use Rust and this snafu is a convenient excuse.

    Why are you guessing when it's obvious?
