I stopped by a colleague's office recently and asked how things were going. She explained that she had been tasked with writing a screen scraper and that she was having trouble with a regex. I offered to help and asked to see the code, to which she replied:

"OK, but don't go all PerlMonks on me."
I must admit I was a bit surprised and consequently I didn't hear much of what followed - it was something about white space and angle brackets.

...don't go all PerlMonks on me... The words echoed through my head. What did that mean? Was it a compliment? I do try to be helpful, and she knows that much of my knowledge can be attributed to PerlMonks. Perhaps it was more of a playful jab. I have been guilty of inflicting informal, unsolicited code reviews on her in similar circumstances, and sometimes I lose sight of the immediate goal (just make it work) and instead suggest alternative designs that would be more flexible and more robust (see XY Problem).

As I regained focus I realized that she was looking at me expectantly. I studied the line in question, and then I looked at the surrounding code.

"Are you parsing an HTML table?"
"Yes."
"Is there an API that you can use instead of screen-scraping?"
"No - this is it."
"Hmmm." I paused, not wanting to push my luck. "It can be pretty tough to parse HTML with regular expressions. Are you familiar with any of the HTML parsers that are available on CPAN?" I almost cringed as I said it.
"No. Are they easy to use? I have to get this done before I leave today."
I had an opening. In the next few minutes I learned that she had inherited the code from on high and was told to make it work (which may explain why she wasn't as sensitive to the suggestion as I had feared). After I explained some of the advantages of using a parser, I told her that if she could give me 20 minutes I'd whip up an example for her. She agreed.

I went back to my office and ripped the guts out of the script that she was working on. I replaced nearly 100 lines of code with the following:

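    use HTML::TableExtract;

    # Parse the page; with no constraints, every table in $content is extracted
    my $te = HTML::TableExtract->new();
    $te->parse( $content );

    # Print each row as one tab-separated line, turning undefined cells into ''
    foreach my $ts ( $te->tables() ) {
        foreach my $row ( $ts->rows() ) {
            print $outfh join( "\t", map { defined $_ ? $_ : '' } @$row ), "\n";
        }
    }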

I walked back to her office and asked her to open up my version of the script. She gasped. "It's so short!" I smiled. "Yup - and it works, too."

As I left her office I thought of those words: don't go all PerlMonks on me. I'm still not sure what that means, but I think I did... and I'm proud of it.

Thanks, PerlMonks, for helping me to help make someone else's job a little easier.


In reply to Don't go all PerlMonks on me by bobf
