The Perl Regex Tester
by davido (Archbishop) on Jul 03, 2012 at 18:15 UTC
Over the years I've come across a number of websites that provide regex testing. But it always seems like I'm looking at a "Perl Compatible Regular Expression" through PHP, or some other language's goggles. And while some of them offer a slick interface, they usually only tell whether or not there was a match, and possibly what got captured. They all felt quirky.
So I set out to create my own quirky implementation, but in a way that I consider more useful and applicable to Perl users: The Perl Regex Tester (GitHub repo).
While the interface may be a bit Spartan -- no Ajax, no Flash, no fuss -- it works pretty well (at least in my skewed assessment). And it provides the following features:
The site is currently hosted in a dev account at Dotcloud, and consequently the URL is a little goofy. Someday it may move to a friendlier URL, but I'm taking a wait-and-see approach, as I'd like to be sure that I'm catching all the most significant pitfalls in executing user-supplied regexes first.
Some "gory details":
The site is hosted in a Perl service at Dotcloud, through a Sandbox (development, free) account. These accounts aren't designed to scale, and have no availability guarantees. However, they are generally quite reliable. I could upgrade to a "Live" account, which has performance and reliability guarantees, but it's just a pet project. There are four processes, and the entire kit and caboodle consumes up to about 90MB of VM RAM. It sits on top of Plack, which is interfacing with Nginx. The code consists of a 175-line Moo-based model class and about 50 lines of Mojolicious::Lite code, plus some Mojolicious templates and Twitter Bootstrap CSS, with a few additions.
Much of the model class is devoted to paranoia. Special variables that might be interpolated to discover things like environment variables are escaped in regexes before they're compiled. Compilation of regexes takes place in a Safe compartment, returning a Safe-compartmentalized Regex object. Matches are carried out in a Try::Tiny block so that fatal errors can be trapped. An alarm timeout is set so that crazy-inefficient regexes won't chew up too much server time. Capturing the debug info was made easier by using Capture::Tiny. And of course I'm operating under no re 'eval' (honorable mention, via an update to this node). Modifiers are restricted to those that make sense in the context of this tool (which means I currently drop the /c modifier, if you ask for it).
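To make the timeout and error-trapping idea concrete, here is a minimal sketch using only core Perl. It is not the site's actual code: the name guarded_match is hypothetical, a plain eval block stands in for Try::Tiny, and the Safe compartment and variable escaping are left out. Whether an alarm can interrupt a match that is already running depends on the Perl version and how signals are delivered, which this sketch does not address.

```perl
use strict;
use warnings;

# Hypothetical helper (not the site's actual code): compile a user-supplied
# pattern and match it against $target, with a time limit and error trapping.
sub guarded_match {
    my ( $pattern, $target, $timeout ) = @_;
    my $matched;

    my $ok = eval {
        # Turn the alarm signal into an exception that the eval can catch.
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;

        # re 'eval' is off by default, so a (?{ ... }) code block inside an
        # interpolated pattern is a fatal error rather than executed code.
        my $re = qr/$pattern/;
        $matched = $target =~ $re ? 1 : 0;

        alarm 0;
        1;
    };
    my $error = $ok ? undef : ( $@ || 'unknown error' );
    alarm 0;    # make sure no alarm is left pending after a failure

    return { matched => $matched, error => $error };
}

# Usage: an ordinary pattern, then one that tries to embed code.
my $good = guarded_match( 'o+',       'foobar', 2 );
my $bad  = guarded_match( '(?{ 1 })', 'foobar', 2 );
print "matched\n" if $good->{matched};
print "rejected: $bad->{error}" if defined $bad->{error};
```

Returning the error as data rather than letting it propagate means a web front end could show the compile failure to the user instead of crashing the request.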
The "Captures" section will display the capture variables that are defined for the current successful match. The "Debug" section will display regardless of whether the match was successful or not, making it a useful tool for figuring out what went wrong.
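As an illustration of what "capture variables that are defined" can mean in practice, the following sketch gathers numbered and named captures after a successful match using Perl's built-in @-, @+ and %+ variables. The helper name collect_captures and the returned structure are assumptions for this example, not the tester's real model code.

```perl
use strict;
use warnings;

# Hypothetical helper: return the defined numbered and named captures
# from matching $target against the compiled regex $re.
sub collect_captures {
    my ( $re, $target ) = @_;
    return unless $target =~ $re;

    my %numbered;
    # $#+ is the number of capture groups in the last successful match.
    for my $n ( 1 .. $#+ ) {
        next unless defined $-[$n];    # this group did not participate
        $numbered{$n} = substr( $target, $-[$n], $+[$n] - $-[$n] );
    }

    # %+ only holds named captures that have a defined value.
    my %named;
    $named{$_} = $+{$_} for keys %+;

    return { numbered => \%numbered, named => \%named };
}

# Usage: the optional third group does not take part in this match,
# so it is left out of the result.
my $caps = collect_captures( qr/(?<year>\d{4})-(\d{2})(x)?/, 'on 2012-07-03' );
for my $n ( sort { $a <=> $b } keys %{ $caps->{numbered} } ) {
    print "\$$n = $caps->{numbered}{$n}\n";
}
```

Reading the offsets from @- and @+ avoids symbolic references to $1, $2 and so on, and makes it easy to skip groups that did not take part in the match.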
I would have liked to also display GraphViz2::Parse::Regex, but Dotcloud doesn't have the "graphviz" C libraries installed for it, and I figured I probably shouldn't press my luck with a free account. I also wanted YAPE::Regex::Explain, but for some reason in the context of a Mojolicious web application with Unicode_Strings enabled, it produces no output. And it's fairly outdated anyway.
This was originally written as a quick demonstration of how simple it can be to get something together with Mojolicious and push it to Dotcloud. I'll place it in a GitHub repo in a few days and will follow up here when that's done. The model layer is front-end-agnostic, so I could easily turn it into a command-line tool. If I get around to doing that, I'll add YAPE::Regex::Explain support back in.
Please feel free to play around with it and use it. If you find a problem or want to request an additional feature, send me a message and I'll see what I can do.