in reply to Wanted: Perl Regex Pretty Printer
See Re: ppiwx / wxPPI / wxppixregexp xPPIx_Regexp_linecol_onize / PPIx::Regexp::Element::column_number
I've now coded up a replacement for YAPE::Regex::Explain and I'm close to posting it; see the example at Re: regex help! (regexplain) -- it prettifies the easy way, without altering the original regex (the output is eval-able).
rxrx also provides a describe-the-regex feature, but it needs to run the regex to do so. Example:
'1234' /(?<P>(?&V))(?<Q>.)(?(DEFINE)(?<V>...))/
←[34;40;4m ←[0m←[37;40m (?<P> ←[0m ←[36;40m The start of a named capturing block (also $1)←[0m
←[34;40;4m ←[0m←[37;40m (?&V) ←[0m ←[36;40m Match a call to the subpattern named <V>←[0m
←[34;40;4m ←[0m←[37;40m ) ←[0m ←[36;40m The end of the named capturing block←[0m
←[34;40;4m ←[0m←[37;40m (?<Q> ←[0m ←[36;40m The start of a named capturing block (also $2)←[0m
←[34;40;4m ←[0m←[37;40m . ←[0m ←[36;40m Match any character (except newline)←[0m
←[34;40;4m ←[0m←[37;40m ) ←[0m ←[36;40m The end of the named capturing block←[0m
←[34;40;4m ←[0m←[37;40m (?(DEFINE) ←[0m ←[36;40m The start of a definition block (skipped during matching)←[0m
←[34;40;4m ←[0m←[37;40m (?<V> ←[0m ←[36;40m The start of a named capturing block (also $3)←[0m
←[34;40;4m ←[0m←[37;40m . ←[0m ←[36;40m Match any character (except newline)←[0m
←[34;40;4m ←[0m←[37;40m . ←[0m ←[36;40m Match any character (except newline)←[0m
←[34;40;4m ←[0m←[37;40m . ←[0m ←[36;40m Match any character (except newline)←[0m
←[34;40;4m ←[0m←[37;40m ) ←[0m ←[36;40m The end of the named capturing block←[0m
←[34;40;4m ←[0m←[37;40m ) ←[0m ←[36;40m The end of definition block←[0m
←[34;40;4m ←[0m
Ignoring the ANSI escapes, which fail to render on win32, it's prettified: each () indents, literals are literals ... it's a very short hand-editing step from that to qr{}x.
Compare to the original, hand-annotated with //x:
$_=1234;
m{
    (?<P>(?&V))     # match <V> and save to $+{P}
    (?<Q> .)        # match <Q> and save to $+{Q}

    # this can be saved in $v_definition = qr//
    (?(DEFINE)
        (?<V> ...)  # <V> aka (?&V) is three chars
    )
}xm;
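A minimal sketch to check what the hand-annotated pattern captures from '1234'. The variable names ($re, %got) are mine, not from the post, and it assumes perl 5.10 or later for named captures and (?(DEFINE)...).

```perl
use strict;
use warnings;
use 5.010;

# Same pattern as the hand-annotated one; (?&V) calls the subpattern
# that is only defined, never matched directly, inside the DEFINE block.
my $re = qr{
    (?<P>(?&V))     # three characters, via the <V> subpattern
    (?<Q> .)        # one more character
    (?(DEFINE)
        (?<V> ...)  # only reachable through (?&V)
    )
}x;

my %got;
if ('1234' =~ $re) {
    # copy out of %+ before another match clobbers it
    %got = (P => $+{P}, Q => $+{Q});
    print "P=$got{P} Q=$got{Q}\n";   # prints "P=123 Q=4"
}
```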
A little search/replace and you have
m{
    (?<P>           # The start of a named capturing block (also $1)
        (?&V)       # Match a call to the subpattern named <V>
    )               # The end of the named capturing block
    (?<Q>           # The start of a named capturing block (also $2)
        .           # Match any character (except newline)
    )               # The end of the named capturing block
    (?(DEFINE)      # The start of a definition block (skipped during matching)
        (?<V>       # The start of a named capturing block (also $3)
            .       # Match any character (except newline)
            .       # Match any character (except newline)
            .       # Match any character (except newline)
        )           # The end of the named capturing block
    )               # The end of definition block
}x
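The search/replace step could be sketched roughly like this. rxrx_line_to_x is a hypothetical helper of mine, not part of rxrx, and the layout it assumes (regex fragment without spaces, then at least two spaces, then the description) is a guess based on the output pasted above, so adjust the pattern to what your terminal actually shows.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one line of rxrx description output into a
# line of //x regex source with the description as a trailing comment.
sub rxrx_line_to_x {
    my ($line) = @_;
    $line =~ s/\e\[[0-9;]*m//g;    # drop the ANSI colour escapes
    # Assumed layout: indent, fragment, 2+ spaces, description
    my ($indent, $frag, $desc) = $line =~ /^(\s*)(\S+)\s{2,}(\S.*?)\s*$/
        or return;                 # blank or unrecognised line: skip it
    return sprintf "%s%-12s # %s", $indent, $frag, $desc;
}

# Two sample lines shaped like the output above (\e is the escape char).
my @sample = (
    "\e[34;40;4m \e[0m\e[37;40m (?<P> \e[0m \e[36;40m The start of a named capturing block (also \$1)\e[0m",
    "\e[34;40;4m \e[0m",
);

my @out = map { rxrx_line_to_x($_) } @sample;
print "m{\n", join("\n", @out), "\n}x\n";
```

The result still wants a quick look by hand, since a literal # or whitespace in the original regex would need escaping under /x.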
Oh, duh, here is rxplain output (some perldoc links are broken, todo)
#
my $regstr = join '',
;
# The regular expression (PPI::Token::Regexp::Match):
;
#
# /(?<P>(?&V))(?<Q>.)(?(DEFINE)(?<V>...))/
#
# matches as follows:
#r: PPIx::Regexp / PPI::Token::Regexp::Match
#r= "/(?<P>(?&V))(?<Q>.)(?(DEFINE)(?<V>...))/"
#
"",
# address=/1/C0 ; xRe::Token::Structure ; Represent structural elements.
#"",
# ------------------------------------------------------------------
#
# address=/1/C1 ; xRe::Structure::Regexp ; Represent the top-level regular expression
# perl_version_introduced=5.009005
# ------
"/",
# address=/1/C1/S0 ; xRe::Token::Delimiter ; Represent the delimiters of the regular expression
#"/",
# ------------------------------------------------------------------
#
# address=/1/C1/C0 ; xRe::Structure::NamedCapture ; a named capture
# L<<< perlre/(?<NAME>pattern) >>>
# L<perlvar/%+>
# perl_version_introduced=5.009005
# number=1 alias "$1" or "\1"
# name=P alias "\g{P}" alias "\k<P>" alias "(?&P)" alias "(?P>P)" alias "$+{P}"
# ------------
"(",
# address=/1/C1/C0/S0 ; xRe::Token::Structure ; Represent structural elements.
#"(",
# ------------------------------------------------------------------
"?<P>",
# address=/1/C1/C0/T0 ; xRe::Token::GroupType::NamedCapture ;
#"?<P>",
# ------------------------------------------------------------------
"(?&V)",
#
# address=/1/C1/C0/C0 ; xRe::Token::Recursion ; a recursion
# L<perlre/(?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)>
# perl_version_introduced=5.009005
# name=V alias "(?&V)" alias "(?P>V)"
# MATCH RECURSION at address=/1/C1/C2/C1
#"(?&V)",
# ------------------------------------------------------------------
")",
#
# address=/1/C1/C0/F0 ; xRe::Token::Structure ; Represent structural elements.
# end of grouping for number=1 alias "$1" or "\1"
# end of grouping for name=P alias "\g{P}" alias "\k<P>" alias "(?&P)" alias "(?P>P)" alias "$+{P}"
#")",
# ------------------------------------------------------------------
#
# address=/1/C1/C1 ; xRe::Structure::NamedCapture ; a named capture
# L<<< perlre/(?<NAME>pattern) >>>
# L<perlvar/%+>
# perl_version_introduced=5.009005
# number=2 alias "$2" or "\2"
# name=Q alias "\g{Q}" alias "\k<Q>" alias "(?&Q)" alias "(?P>Q)" alias "$+{Q}"
# ------------
"(",
# address=/1/C1/C1/S0 ; xRe::Token::Structure ; Represent structural elements.
#"(",
# ------------------------------------------------------------------
"?<Q>",
# address=/1/C1/C1/T0 ; xRe::Token::GroupType::NamedCapture ;
#"?<Q>",
# ------------------------------------------------------------------
".",
#
# address=/1/C1/C1/C0 ; xRe::Token::CharClass::Simple ; This class represents a simple character class
# any character except \n
# L<perlrecharclass/.>
# L<perlrebackslash/.>
#".",
# ----------------
")",
#
# address=/1/C1/C1/F0 ; xRe::Token::Structure ; Represent structural elements.
# end of grouping for number=2 alias "$2" or "\2"
# end of grouping for name=Q alias "\g{Q}" alias "\k<Q>" alias "(?&Q)" alias "(?P>Q)" alias "$+{Q}"
#")",
# ------------------------------------------------------------------
#
# address=/1/C1/C2 ; xRe::Structure::Switch ; a switch
# L<perlre/(?(condition)yes-pattern|no-pattern)>
# L<perlre/(?(condition)yes-pattern)>
# perl_version_introduced=5.009005
# ------------
"(",
# address=/1/C1/C2/S0 ; xRe::Token::Structure ; Represent structural elements.
#"(",
# ------------------------------------------------------------------
"?",
#
# address=/1/C1/C2/T0 ; xRe::Token::GroupType::Switch ; Represent the introducing characters for a switch
# for valid conditions see L<perlre/(?(condition)yes-pattern)>
#"?",
# ------------------------------------------------------------------
"(DEFINE)",
#
# address=/1/C1/C2/C0 ; xRe::Token::Condition ; Represent the condition of a switch
# Checks if a specific capture group (or pattern) has matched something.
# perl_version_introduced=5.009005
# L<perlre/(DEFINE)>
# define subpatterns which will be executed only by the recursion mechanism
# It is recommended that you put DEFINE block at the end of the pattern,
# and that you name any subpatterns defined within it.
# the yes-pattern is never directly executed, and no no-pattern is allowed
# Similar in spirit to (?{0}) but more efficient.
#"(DEFINE)",
# ------------------------------------------------------------------
#
# address=/1/C1/C2/C1 ; xRe::Structure::NamedCapture ; a named capture
# L<<< perlre/(?<NAME>pattern) >>>
# L<perlvar/%+>
# perl_version_introduced=5.009005
# number=3 alias "$3" or "\3"
# name=V alias "\g{V}" alias "\k<V>" alias "(?&V)" alias "(?P>V)" alias "$+{V}"
# ------------------
"(",
# address=/1/C1/C2/C1/S0 ; xRe::Token::Structure ; Represent structural elements.
#"(",
# ------------------------------------------------------------------
"?<V>",
# address=/1/C1/C2/C1/T0 ; xRe::Token::GroupType::NamedCapture ;
#"?<V>",
# ------------------------------------------------------------------
".",
#
# address=/1/C1/C2/C1/C0 ; xRe::Token::CharClass::Simple ; This class represents a simple character class
# any character except \n
# L<perlrecharclass/.>
# L<perlrebackslash/.>
#".",
# ----------------------
".",
#
# address=/1/C1/C2/C1/C1 ; xRe::Token::CharClass::Simple ; This class represents a simple character class
# any character except \n
# L<perlrecharclass/.>
# L<perlrebackslash/.>
#".",
# ----------------------
".",
#
# address=/1/C1/C2/C1/C2 ; xRe::Token::CharClass::Simple ; This class represents a simple character class
# any character except \n
# L<perlrecharclass/.>
# L<perlrebackslash/.>
#".",
# ----------------------
")",
#
# address=/1/C1/C2/C1/F0 ; xRe::Token::Structure ; Represent structural elements.
# end of grouping for number=3 alias "$3" or "\3"
# end of grouping for name=V alias "\g{V}" alias "\k<V>" alias "(?&V)" alias "(?P>V)" alias "$+{V}"
#")",
# ------------------------------------------------------------------
")",
# address=/1/C1/C2/F0 ; xRe::Token::Structure ; Represent structural elements.
#")",
# ------------------------------------------------------------------
"/",
# address=/1/C1/F0 ; xRe::Token::Delimiter ; Represent the delimiters of the regular expression
#"/",
# ------------------------------------------------------------------
"",
# address=/1/C2 ; xRe::Token::Modifier ; Represent 1)embedded pattern-match modifiers or 2)(trailing) modifiers for operators match, substitution, regexp constructor
#"",
# ------------------------------------------------------------------
;;;;;;;;;;
Anyone considering changing pre to code because of [] shouldn't, because there ain't no issue.