PerlMonks  

Yesterday I finished reading Perl Best Practices and felt electrified. So many things I could have been doing, and I hadn't bothered for so long!

The book is so full of useful advice that I felt ashamed of every good idea I could have found on my own and, out of sheer laziness, had not. So I decided that from now on I would put into practice all the advice I liked (most of it, actually) that was not already part of my day-to-day programming.

Then I updated some of my templates for subs, class creation, and so on, and started the next project on my agenda fully armed with new knowledge. (Update: please note that I said "next project". True to the if-it-works-don't-change-it principle, I leave my existing code as it is until it needs maintenance.)

One piece of advice that struck me as simple and very easy to adopt was about regular expressions. The book says "always use the /x modifier" (and /m and /s as well, for reasons that I leave to the reader to find in the book), so that your regexes are easier to read. No big deal, I thought. I occasionally use the /x modifier already, so why not make a habit of it? So I set off on the new course of action and changed one of my templates for parsing a simple data file. Before, I used to write things like this:

    #!/usr/bin/perl
    use strict;
    use warnings;

    PARSE:
    while (<DATA>) {
        chomp;
        next PARSE if /^\s*$/ ; # skip blank lines
        next PARSE if /^\s*#/ ; # skip comments
        # do something useful with the data
        print "<$_>\n";
    }
    __DATA__
    some data here
    # a comment
    # another comment, followed by a blank line

    more data
    # another comment
    final data

Since I use this kind of thing very often, I have a template for it in my editor. I updated it so it became:

    #!/usr/bin/perl
    use strict;
    use warnings;

    PARSE:
    while ( my $line = <DATA> ) {
        chomp $line;
        next PARSE if $line =~ m{ ^ \s* $ }xsm;
        next PARSE if $line =~ m{ ^ \s* # }xsm;
        # do something useful with the data
        print "<$line>\n";
    }
    __DATA__
    some data here
    # a comment
    # another comment, followed by a blank line

    more data
    # another comment
    final data

It looks better, doesn't it?

Unfortunately, this code is not the same as the previous one. When I ran my program for the first time, I got an empty result set. No lines were parsed at all.

I spent several minutes scratching my head, until I realized that the idiom I had used countless times in the past was failing me now.

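The problem is that /^\s*#/ is not the same as /^\s*#/x. The /x modifier allows not only whitespace in the pattern but also comments, so the "#" character is no longer a literal: it starts a comment that runs to the end of the pattern. What is left is effectively /^\s*/, which matches every line, so every line was skipped as if it were a comment!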
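To see the difference in isolation, here is a minimal sketch. The two sub names and the sample strings are made up for this illustration and are not part of the original template.

```perl
use strict;
use warnings;

# Without /x, '#' is an ordinary literal character.
sub looks_like_comment_plain {
    my ($line) = @_;
    return $line =~ /^\s*#/ ? 1 : 0;
}

# With /x, '#' opens a comment that swallows the rest of the pattern,
# so the regex that actually runs is just '^ \s*', which matches anything.
sub looks_like_comment_extended {
    my ($line) = @_;
    return $line =~ m{ ^ \s* # }xsm ? 1 : 0;
}

# Hypothetical sample lines: data, an indented comment, an empty string.
for my $line ( 'some data here', '   # a comment', '' ) {
    printf "%-18s plain=%d extended=%d\n", "<$line>",
        looks_like_comment_plain($line),
        looks_like_comment_extended($line);
}
```

The "extended" column should come out as 1 for every sample, while "plain" is 1 only for the comment line.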

Of course, I should have adopted yet another piece of best-practice advice: put any character that needs escaping into a character class.   (*)

Now, m{ ^ \s* [#] }xms works as advertised, but I wonder whether it was worth the trouble of moving away from the less readable but well-established m/^\s*#/.   (**)
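For completeness, a sketch of the template with the character-class fix applied. It is wrapped in a sub so the result is easy to check; keep_data_lines is a hypothetical name, and the input list is a stand-in for the __DATA__ section (the indentation on one comment line is my own addition, to exercise the \s* part).

```perl
use strict;
use warnings;

# Return only the lines that are neither blank nor comments.
# 'keep_data_lines' is a hypothetical helper name used for this sketch.
sub keep_data_lines {
    my @lines = @_;    # work on a copy so the caller's data is untouched
    my @kept;

    LINE:
    for my $line (@lines) {
        chomp $line;
        next LINE if $line =~ m{ ^ \s* $   }xms;    # skip blank lines
        next LINE if $line =~ m{ ^ \s* [#] }xms;    # skip comments
        push @kept, $line;
    }
    return @kept;
}

# Stand-in for the __DATA__ section of the template.
my @input = (
    "some data here\n",
    "# a comment\n",
    "# another comment, followed by a blank line\n",
    "\n",
    "more data\n",
    "   # another comment\n",
    "final data\n",
);

print "<$_>\n" for keep_data_lines(@input);
```

Only the three data lines should be printed, which is what the pre-/x version of the template did.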

Lesson learned: cargo cult coding is always a risk, even for experienced programmers. Think before refactoring!

(*) Provided that I realized first that there was a character to escape!

(**) Yes, of course it was. I just have to remember to connect fingers and brain before starting a coding session!

 _  _ _  _  
(_|| | |(_|><
 _|   

In reply to A refactoring trap by gmax
