PerlMonks  

[Emacs] patching cperl-mode to highlight syntax without POD newlines

by LanX (Saint)
on Apr 22, 2015 at 09:44 UTC (id://1124242, perlquestion)

LanX has asked for the wisdom of the Perl Monks concerning the following question:

Hi

I'm dealing with a lot of legacy code with slightly invalid POD markup.

(It's used as pure multiline documentation without being ever processed by a POD parser, but is still properly ignored by the Perl parser.)

In particular, the blank lines around the POD command lines are missing; most "newer" POD parsers are quite tolerant of this.

    =h1 -------------------------------------------------------
    DEFAULT_getaction
    => bla bla
    => blubber di blub
    =cut ------------------------------------------------------
    sub DEFAULT_getaction{
        my( $cmd, $sth, @row , $jahrpr ) ;
        # ...

The problem is that I need to read this code, and cperl-mode follows the POD specification strictly: it stops syntax highlighting the code after the =cut line because it thinks the code still belongs to the POD.

Does anyone know a quick hack to make cperl-mode equally tolerant and parse according to Perl and not according to POD-specifications? (like Komodo does).

(No, I can't convert the old codebase to valid POD yet.)
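To make the difference concrete, here is a minimal Python sketch of the rule the Perl parser appears to apply when it skips POD, as opposed to the stricter paragraph-based reading of the POD specification. It is a simplification under stated assumptions: any line that starts with `=` plus a letter opens a POD block, the block ends with the line that starts with `=cut`, and blank lines play no role. The names `classify_lines` and `SAMPLE` are made up for this illustration, and details such as heredocs, `__END__`, and whether Perl is currently expecting a statement are ignored.

```python
import re

# Assumption: a simplified view of how perl itself skips POD.
POD_START = re.compile(r"^=[A-Za-z]")       # "=word" at column 0 opens a POD block
POD_END = re.compile(r"^=cut(?![A-Za-z])")  # "=cut" at column 0 closes it

# The legacy snippet from the question, without blank lines around the commands.
SAMPLE = "\n".join([
    "=h1 ----------",
    "DEFAULT_getaction",
    "=> bla bla",
    "=> blubber di blub",
    "=cut ----------",
    "sub DEFAULT_getaction{",
    "    my( $cmd, $sth, @row , $jahrpr ) ;",
    "    # ...",
])


def classify_lines(source):
    """Return (label, line) pairs, where label is 'pod' or 'code'."""
    labeled = []
    in_pod = False
    for line in source.splitlines():
        if in_pod:
            labeled.append(("pod", line))
            if POD_END.match(line):
                in_pod = False  # the whole =cut line is still POD
        elif POD_START.match(line):
            labeled.append(("pod", line))
            in_pod = True
        else:
            labeled.append(("code", line))
    return labeled


if __name__ == "__main__":
    for label, line in classify_lines(SAMPLE):
        print(f"{label:4} | {line}")
```

On the sample, the first five lines are labeled as POD and everything from the `sub` line onward as code, which is the highlighting being asked for.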

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!

PS: see also

  • Re: POD troubles
  • Re: POD troubles
  • (cperl-mode) fontification problem with PODs

Replies are listed 'Best First'.

    Re: [Emacs] patching cperl-mode to highlight syntax without POD newlines
    by duelafn (Parson) on Apr 22, 2015 at 11:42 UTC

      Quick ugly hack known to apply to https://github.com/jrockway/cperl-mode.git, but probably applies to whatever version of cperl-mode you have as well (I don't expect most people like messing with that chunk of voodoo code). Turns off parsing of multi-line command blocks. In your example, formats the first 5 lines in pod face, and the rest as code.

      Update per Laurent_R's request: (not sure what level of explanation you were wanting, so forgive me if I go on too long) POD treats all text following an =WHATEVER as part of that "Command Paragraph", so

      =head1 Heading Text

      Blah Blah

      and

      =head1
      Heading Text

      Blah Blah

      Both make an H1 containing "Heading Text" (it may or may not include a newline in the second case; I'm not sure, and it isn't important for this discussion). The Emacs syntax highlighter (cperl-mode) was highlighting LanX's subroutine as though it were an argument to the =cut. Apparently, perl itself knows that =cut doesn't take an argument and treats the line after it as code, so the subroutine is compiled as usual.

      The code below is a diff against the Emacs syntax highlighter (cperl-mode.el) that comments out its parsing and highlighting of command arguments. To use it, you would need to download a copy of the cperl-mode source code, apply the diff using patch, then load the modified cperl-mode.el in your Emacs configuration file.

      Before the patch, Emacs would color =head1 red and Heading Text blue in the examples above (in my color scheme, probably different in yours). After the patch, the whole =head1 Heading Text is just red: the "Heading Text" argument is not parsed, and the whole line is simply treated as POD. This makes the =cut span only a single line, and Emacs highlights the subroutine as Perl code, as desired.

      The patch removes a feature of the cperl-mode parser so that LanX can get his work done; it isn't an improvement of cperl-mode that would be desirable generally.

      diff --git a/cperl-mode.el b/cperl-mode.el
      index 00e6c3b..d0b23ee 100644
      --- a/cperl-mode.el
      +++ b/cperl-mode.el
      @@ -3873,38 +3873,38 @@ the sections using `cperl-pod-head-face', `cperl-pod-face',
                 (put-text-property b e 'in-pod t)
                 (put-text-property b e 'syntax-type 'in-pod)
                 (goto-char b)
      -          (while (re-search-forward "\n\n[ \t]" e t)
      -            ;; We start 'pod 1 char earlier to include the preceding line
      -            (beginning-of-line)
      -            (put-text-property (cperl-1- b) (point) 'syntax-type 'pod)
      -            (cperl-put-do-not-fontify b (point) t)
      -            ;; mark the non-literal parts as PODs
      -            (if cperl-pod-here-fontify
      -                (cperl-postpone-fontification b (point) 'face face t))
      -            (re-search-forward "\n\n[^ \t\f\n]" e 'toend)
      -            (beginning-of-line)
      -            (setq b (point)))
      -          (put-text-property (cperl-1- (point)) e 'syntax-type 'pod)
      -          (cperl-put-do-not-fontify (point) e t)
      -          (if cperl-pod-here-fontify
      -              (progn
      -                ;; mark the non-literal parts as PODs
      -                (cperl-postpone-fontification (point) e 'face face t)
      -                (goto-char bb)
      -                (if (looking-at
      -                     "=[a-zA-Z0-9_]+\\>[ \t]*\\(\\(\n?[^\n]\\)+\\)$")
      -                    ;; mark the headers
      -                    (cperl-postpone-fontification
      -                     (match-beginning 1) (match-end 1)
      -                     'face head-face))
      -                (while (re-search-forward
      -                        ;; One paragraph
      -                        "^\n=[a-zA-Z0-9_]+\\>[ \t]*\\(\\(\n?[^\n]\\)+\\)$"
      -                        e 'toend)
      -                  ;; mark the headers
      -                  (cperl-postpone-fontification
      -                   (match-beginning 1) (match-end 1)
      -                   'face head-face))))
      +;         (while (re-search-forward "\n\n[ \t]" e t)
      +;           ;; We start 'pod 1 char earlier to include the preceding line
      +;           (beginning-of-line)
      +;           (put-text-property (cperl-1- b) (point) 'syntax-type 'pod)
      +;           (cperl-put-do-not-fontify b (point) t)
      +;           ;; mark the non-literal parts as PODs
      +;           (if cperl-pod-here-fontify
      +;               (cperl-postpone-fontification b (point) 'face face t))
      +;           (re-search-forward "\n\n[^ \t\f\n]" e 'toend)
      +;           (beginning-of-line)
      +;           (setq b (point)))
      +;         (put-text-property (cperl-1- (point)) e 'syntax-type 'pod)
      +;         (cperl-put-do-not-fontify (point) e t)
      +;         (if cperl-pod-here-fontify
      +;             (progn
      +;               ;; mark the non-literal parts as PODs
      +;               (cperl-postpone-fontification (point) e 'face face t)
      +;               (goto-char bb)
      +;               (if (looking-at
      +;                    "=[a-zA-Z0-9_]+\\>[ \t]*\\(\\(\n?[^\n]\\)+\\)$")
      +;                   ;; mark the headers
      +;                   (cperl-postpone-fontification
      +;                    (match-beginning 1) (match-end 1)
      +;                    'face head-face))
      +;               (while (re-search-forward
      +;                       ;; One paragraph
      +;                       "^\n=[a-zA-Z0-9_]+\\>[ \t]*\\(\\(\n?[^\n]\\)+\\)$"
      +;                       e 'toend)
      +;                 ;; mark the headers
      +;                 (cperl-postpone-fontification
      +;                  (match-beginning 1) (match-end 1)
      +;                  'face head-face))))
                 (cperl-commentify bb e nil)
                 (goto-char e)
                 (or (eq e (point-max))
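
The "Command Paragraph" reading described before the diff can be illustrated with a small Python sketch. It assumes the strict view in which blank lines are the only paragraph separators and a paragraph whose first line starts with `=` plus a letter is a command paragraph. This is not how cperl-mode is implemented; `split_paragraphs`, `LEGACY`, and `CLEAN` are hypothetical names used only to show why a `=cut` without a following blank line swallows the subroutine.

```python
import re

BLANK_LINES = re.compile(r"\n(?:[ \t]*\n)+")  # one or more blank lines
COMMAND = re.compile(r"=[a-zA-Z]")            # paragraph starts with "=word"

# Legacy style: no blank line after =cut, so the sub joins the same paragraph.
LEGACY = "=cut -----\nsub DEFAULT_getaction{\n    my( $cmd, $sth, @row , $jahrpr ) ;\n}"
# Spec-conforming style: a blank line ends the =cut paragraph.
CLEAN = "=cut\n\nsub DEFAULT_getaction{\n    my( $cmd, $sth, @row , $jahrpr ) ;\n}"


def split_paragraphs(text):
    """Return (kind, paragraph) pairs, separating paragraphs at blank lines only."""
    result = []
    for para in BLANK_LINES.split(text.strip("\n")):
        if not para.strip():
            continue
        if COMMAND.match(para):
            kind = "command"
        elif para[0] in " \t":
            kind = "verbatim"
        else:
            kind = "ordinary"
        result.append((kind, para))
    return result


if __name__ == "__main__":
    for name, text in (("LEGACY", LEGACY), ("CLEAN", CLEAN)):
        print(name, [kind for kind, _ in split_paragraphs(text)])
```

Under this reading, `LEGACY` is a single command paragraph that contains the whole subroutine, while `CLEAN` splits into a command paragraph followed by a separate paragraph holding the code.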

      Good Day,
          Dean

        Ooops, duelafn, I really don't understand your post. Will you please explain what it is supposed to do?

        Thanks in advance.

        Je suis Charlie.

        Thanks, that looks awesome, and the explanation is interesting.

        I really didn't expect someone could get along with Ilya's legacy lisp code. :)

        I have had no chance to test it yet; sorry for the late reply.

        I'm going to try adding exceptions to the regex so that it parses the way modern POD parsers do.

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)
        Je suis Charlie!

        Many thanks, but unfortunately your patch introduces new problems: indentation now gets confused, and it fails on longer code.

        =h1 -------------------------------------------------------
        DEFAULT_getaction
        => bla bla
        => blubber di blub
        =cut ------------------------------------------------------
        sub DEFAULT_getaction{
            my( $cmd, $sth, @row , $jahrpr ) ;
            # ...
        }

        Thanks for your efforts!!!

        Could you please tell me which lines you changed? I have trouble identifying the differences between the + and - sections.

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)
        Je suis Charlie!

          Well, darn. It does indeed seem to be behaving inconsistently; apparently I didn't test well enough. As for changes, I literally did the simplest thing possible and commented out that chunk of code, so there are no actual code changes. Apparently I commented out some critical state updates. I'll keep poking at it and will let you know if I find something.

          Good Day,
              Dean

    Re: [Emacs] patching cperl-mode to highlight syntax without POD newlines
    by RonW (Parson) on Apr 22, 2015 at 23:23 UTC

      If you are just reading the code, why not filter the source files to convert the "pod blocks" into ordinary comment blocks? For example:

      #=h1 -------------------------------------------------------
      # DEFAULT_getaction
      # => bla bla
      # => blubber di blub
      #=cut ------------------------------------------------------
      sub DEFAULT_getaction{
          my( $cmd, $sth, @row , $jahrpr ) ;
          # ...

      Even if you are editing the code, this transformation is reversible, if needed.
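
A filter along these lines might look like the following Python sketch. It is one possible approach, not RonW's actual script. It assumes that POD blocks are delimited the way perl appears to treat them (a line starting with `=` plus a letter opens a block, the `=cut` line closes it). Instead of the bare `#` shown above it uses a hypothetical marker `#~`, chosen here so that the transformation can be undone without touching ordinary comments. The function names are invented for the example.

```python
import re

POD_START = re.compile(r"^=[A-Za-z]")       # "=word" at column 0 opens a POD block
POD_END = re.compile(r"^=cut(?![A-Za-z])")  # "=cut" at column 0 closes it
MARK = "#~"  # hypothetical marker; assumes no code line already starts with it

SAMPLE = "\n".join([
    "=h1 ----------",
    "DEFAULT_getaction",
    "=> bla bla",
    "=> blubber di blub",
    "=cut ----------",
    "sub DEFAULT_getaction{",
    "    my( $cmd, $sth, @row , $jahrpr ) ;",
    "    # ...",
])


def pod_to_comments(source):
    """Prefix every line of every POD block with MARK, turning it into a comment."""
    out = []
    in_pod = False
    for line in source.split("\n"):
        if in_pod:
            out.append(MARK + line)
            if POD_END.match(line):
                in_pod = False
        elif POD_START.match(line):
            out.append(MARK + line)
            in_pod = True
        else:
            out.append(line)
    return "\n".join(out)


def comments_to_pod(source):
    """Undo pod_to_comments by stripping one leading MARK from marked lines."""
    return "\n".join(
        line[len(MARK):] if line.startswith(MARK) else line
        for line in source.split("\n")
    )


if __name__ == "__main__":
    print(pod_to_comments(SAMPLE))
```

The round trip only restores the original file if no line outside a POD block already begins with the marker, so the marker should be checked against the codebase before use.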

        I'm doing something similar ATM.

        But it's a general problem resulting from the fact that Perl and POD parsers have differing specifications.

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)
        Je suis Charlie!

          Agreed. Also, I really don't see much benefit in requiring POD command paragraphs be surrounded by blank lines - other than human readability of "raw" POD.
