Hi Tybalt89
Thank you. I am going to have to go away and consider this, as I am not sure that I understand exactly how this is working. You've concatenated all the different regexes into one string, including the type string with each. The key points are "use re eval" and "map", neither of which I am familiar with. The latter appears to create a hash, which makes perfect sense, but I am going to have to understand what $all is all about before the penny drops.
my @out = map { ... } @in;
# - becomes -
my @out;
for $_ (@in) {
my @result = ...;
push @out, @result;
}
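To make the equivalence above concrete, here is a small runnable sketch. The input list and the "times ten" block body are made up purely for illustration; the point is only that the map and the explicit loop build the same list.

```perl
use strict;
use warnings;

my @in = (1, 2, 3);    # made-up sample input

# map form: the block may return any number of values per element
my @out_map = map { ($_, $_ * 10) } @in;

# the same thing spelled out as a loop
my @out_loop;
for my $item (@in) {
    my @result = ($item, $item * 10);
    push @out_loop, @result;
}

print "@out_map\n";     # prints "1 10 2 20 3 30"
print "@out_loop\n";    # prints the same line
```

Note that the output is a flat list, not a hash; it only becomes a hash if you assign it to one.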
The regex /^(\S+)\s++(.+)/ splits the input string at the first run of whitespace (for input like this it is equivalent to my ($left,$right) = split /\s+/, $str, 2;, see split). Using the ternary ?: operator, if the regex matches, the block returns the string "(?:$1(?{'$2'}))", and if it doesn't match, die is called. So in this case the map operation is not returning a hash (or a list of key-value pairs), but just one output string for each input string, each input string being one line of the regex list.
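Here is a minimal sketch of that step on a single line. The spec line in $str is a hypothetical example, not one taken from the original data; the regex and the output template are the ones discussed above.

```perl
use strict;
use warnings;

# Hypothetical spec line: a pattern, some whitespace, then a description.
my $str = '^\d+$   an integer';

# The capture-group version used inside the map block ...
my ($re, $desc) = $str =~ /^(\S+)\s++(.+)/
    or die "bad line: $str";

# ... and the split version, which gives the same two pieces here.
my ($left, $right) = split /\s+/, $str, 2;

# The string the map block returns for this line.
my $fragment = "(?:$re(?{'$desc'}))";

print "$re | $desc\n";
print "$left | $right\n";
print "$fragment\n";
```

The fragment is still just a string at this point; nothing inside the (?{...}) runs until the string is compiled and used as a regex.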
So with the join '|', ..., as you said, the code is constructing a single regex. The general process of doing so is something I discussed in my tutorial Building Regex Alternations Dynamically, but this one is a bit more specialized. For the names, tybalt89 is using a neat trick with (?{...}), which allows you to insert arbitrary code into a regular expression; the return value of the most recently executed code block is then stored in the special variable $^R. The use re 'eval'; is necessary because these (?{}) blocks are being interpolated from strings into the regex, which Perl refuses to do by default as a security feature.
Consider this regex (I'm using the /x modifier for readability): m{ ^[a-zA-Z]\w+$ (?{'one'}) | ^[0-9]\w+$ (?{'two'}) }x. When matching against the string "3abc", it will match the second alternative, that is, ^[0-9]\w+$, and then it will execute (?{'two'}), and since the last value in that piece of code is 'two', that is what it returns and what $^R gets set to. After the regex has executed and matched successfully, you can simply look at the value of $^R to see which of the two patterns contained in the regex was matched.
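A runnable version of that example might look like the following. The test strings and the %which hash are my own additions for illustration. Since the (?{...}) blocks are written literally in the pattern rather than interpolated from a string, this particular snippet does not need use re 'eval'.

```perl
use strict;
use warnings;

my $regex = qr{
      ^[a-zA-Z]\w+$  (?{'one'})
    | ^[0-9]\w+$     (?{'two'})
}x;

my %which;    # remembers which alternative matched each test string
for my $str ('abc', '3abc', '!?') {
    if ($str =~ $regex) {
        my $name = $^R;    # copy $^R right after the successful match
        $which{$str} = $name;
        print "$str matched pattern $name\n";
    }
    else {
        print "$str did not match\n";
    }
}
```

Only read $^R after a successful match; after a failed match it does not tell you anything useful about the current string.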
Minor edits for clarity.
... key points ... "map" [which] appears to create a hash ...
No "hash" (in the sense of an associative array) is created at any point. The critical effect of the map expression is to extract the format regex ($1) and descriptive text ($2) substrings from each data type specifier record and use them to build a sub-regex for each data type. The (?{'$2'}) sub-sub-regex generates code that evaluates the descriptive text substring and returns it via the $^R regex special variable (see perlvar). All these sub-regexes are then concatenated together into one big alternation.
You can just
print $regex, "\n";
and pick your way through the result to see the alternation of all the sub-regexes in all their glory.
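Putting the pieces together, here is a minimal end-to-end sketch in the spirit of tybalt89's approach. The two spec lines, the test strings and the %type_of hash are hypothetical; $all is assumed to play the same role as in the original code, namely holding the combined alternation as a string.

```perl
use strict;
use warnings;

# Hypothetical type specifications: a pattern, whitespace, a description.
my @specs = (
    '^\d+$          integer',
    '^[a-z]+$       lowercase word',
);

# One "(?:PATTERN(?{'DESCRIPTION'}))" fragment per line, joined with '|'.
my $all = join '|', map {
    /^(\S+)\s++(.+)/ ? "(?:$1(?{'$2'}))" : die "bad spec: $_"
} @specs;

print "$all\n";    # inspect the generated alternation

my %type_of;
{
    # Needed because the (?{...}) blocks come from an interpolated string.
    use re 'eval';
    for my $str ('42', 'hello', 'Hello!') {
        $type_of{$str} = $str =~ /$all/ ? $^R : 'unknown';
        print "$str => $type_of{$str}\n";
    }
}
```

Because the descriptions end up inside single quotes in generated code, a description containing a single quote would break this sketch, and spec lines from an untrusted source should not be fed into it at all.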
Give a man a fish: <%-{-{-{-<