I stopped by a colleague's office recently and asked how things were going. She explained that she had been tasked with writing a screen-scraper and that she was having trouble with a regex. I offered to help and asked to see the code, to which she replied:

"OK, but don't go all PerlMonks on me."
I must admit I was a bit surprised and consequently I didn't hear much of what followed - it was something about white space and angle brackets.

...don't go all PerlMonks on me... The words echoed through my head. What did that mean? Was it a compliment? I do try to be helpful and she knows that much of my knowledge can be attributed to PerlMonks. Perhaps it was more of a playful jab. I am guilty of inflicting informal and unsolicited code reviews on her in similar circumstances in the past, and sometimes I lose sight of the immediate goal (just make it work) and instead suggest alternative designs that would provide additional flexibility and be more robust (see XY Problem).

As I regained focus I realized that she was looking at me expectantly. I studied the line in question, and then I looked at the surrounding code.

"Are you parsing an HTML table?"
"Is there an API that you can use instead of screen-scraping?"
"No - this is it."
"Hmmm." I paused, not wanting to push my luck. "It can be pretty tough to parse HTML with regular expressions. Are you familiar with any of the HTML parsers that are available on CPAN?" I almost cringed as I said it.
"No. Are they easy to use? I have to get this done before I leave today."
I had an opening. In the next few minutes I learned that she had inherited the code from on high and had been told to make it work (which may explain why she wasn't as sensitive to the suggestion as I had feared). After I explained some of the advantages of using a parser, I told her that if she could give me 20 minutes I'd whip up an example for her. She agreed.

I went back to my office and ripped the guts out of the script that she was working on. I replaced nearly 100 lines of code with the following:

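    use HTML::TableExtract;

    # $content holds the fetched HTML and $outfh is an open output
    # filehandle; both are set up elsewhere in the script.
    my $te = HTML::TableExtract->new();
    $te->parse( $content );

    # Print every row of every table as one tab-separated line,
    # substituting an empty string for undefined cells.
    foreach my $ts ( $te->tables() ) {
        foreach my $row ( $ts->rows() ) {
            print $outfh join( "\t", map { defined $_ ? $_ : '' } @$row ), "\n";
        }
    }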

I walked back to her office and asked her to open up my version of the script. She gasped. "It's so short!" I smiled. "Yup - and it works, too."

As I left her office I thought of those words: don't go all PerlMonks on me. I'm still not sure what that means, but I think I did... and I'm proud of it.

Thanks, PerlMonks, for helping me to help make someone else's job a little easier.
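
For anyone who wants to try the approach without a site to scrape, here is a minimal, self-contained sketch of the same idea. The sample HTML and the helper name table_rows_as_tsv are made up for illustration and are not part of the script described above; the only HTML::TableExtract calls used are the ones already shown (new, parse, tables, rows).

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample page standing in for the scraped content.
my $content = <<'HTML';
<html><body>
<table>
  <tr><th>Name</th><th>Qty</th></tr>
  <tr><td>apples</td><td>3</td></tr>
  <tr><td>pears</td><td></td></tr>
</table>
</body></html>
HTML

# Hypothetical helper: return one tab-separated string per table row.
sub table_rows_as_tsv {
    my ($html) = @_;

    # With no constraints given, the extractor collects every table it finds.
    my $te = HTML::TableExtract->new();
    $te->parse($html);

    my @lines;
    for my $ts ( $te->tables() ) {
        for my $row ( $ts->rows() ) {
            # Cells without content may come back undefined, so map them to ''.
            push @lines, join "\t", map { defined $_ ? $_ : '' } @$row;
        }
    }
    return @lines;
}

print "$_\n" for table_rows_as_tsv($content);
```

Returning the lines instead of printing them directly keeps the extraction separate from the output, so the same helper can feed a file, a test, or a further processing step.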

In reply to Don't go all PerlMonks on me by bobf