PerlMonks  

HTML::Parser would actually be an awful fit for this problem. If you don't believe it, try to duplicate the functionality the code already has.

The problem is that the incoming document is not HTML. It is a document in some markup language, some of whose tags look like HTML, but it isn't really HTML. I don't want to spend time worrying about "broken html" that I am going to just escape. I don't want to worry about valid html that I want to deny. I want to report custom errors. (Hey, why not instead of just denying pre-monks image tags, also give an error with a link to the FAQ?) And I want to include markup tags you won't find in HTML.
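
To make the "custom error with a link to the FAQ" idea concrete, here is a rough sketch of what such a handler might look like. The FAQ URL, the show_err stub, and the small esc_html helper (a stand-in for encode_entities so the sketch runs without CPAN modules) are all assumptions for illustration, not code from the site.

```perl
use strict;
use warnings;

# Hypothetical FAQ location; not taken from the original post.
my $faq_url = "http://www.example.com/faq.html#images";

# Minimal stand-in for HTML::Entities::encode_entities (assumption:
# only the four characters below need escaping in this sketch).
sub esc_html {
    my $text = shift;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$ent{$1}/g;
    return $text;
}

# Hypothetical error formatter; the real show_err is not shown in the post.
sub show_err {
    my $msg = shift;
    return qq(<span class="error">$msg</span>);
}

# Handler for "<img": swallow the rest of the tag and say why it was denied.
my $deny_img = sub {
    my $t_ref = shift;
    if ($$t_ref =~ /\G[^>]*>/gc) {
        return show_err(
            qq(Image tags are not allowed, see <a href=")
            . esc_html($faq_url)
            . qq(">the FAQ</a>)
        );
    }
    return show_err("Incomplete image tag?");
};

# Pretend a driver has already consumed the "<img" prefix.
my $text = '<img src="x.png"> hello';
pos($text) = length('<img');
my $out = $deny_img->(\$text);
```

After the call, pos($text) sits just past the closing ">", so whatever drives the handlers can carry on with the rest of the text.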

I did a literal escape above using [code]. I submit that HTML::Parser would not help with that. OK, so that should be <code> for this site, but this site would want to implement a couple of escapes I didn't. For instance, the following handler would be defined for this site for [ (assuming that $site_base was http://www.perlmonks.org/index.pl and hoping that I don't make any typos):

    use URI::Escape qw(uri_escape);
    use HTML::Entities qw(encode_entities);

    # Handler for "[": link to a node by name, with an optional |title.
    sub {
        my $t_ref = shift;
        if ($$t_ref =~ /\G([^\|\]]+)(?:\|([^\|\]]+))?\]/g) {
            my $node_ref  = "$site_base?node=" . uri_escape($1);
            my $node_name = encode_entities($2 || $1);
            return qq(<a href="$node_ref">$node_name</a>);
        }
        else {
            return show_err("Incomplete node link?");
        }
    }
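
For anyone wondering how a handler like that gets called, here is a minimal sketch of a driver built around a dispatch table. It is only an illustration under stated assumptions: the render function, the %handler table, show_err, and the esc_html and esc_uri helpers (dependency-free stand-ins for encode_entities and uri_escape) are hypothetical, and the matches use /gc so that a failed match leaves pos() where it was.

```perl
use strict;
use warnings;

my $site_base = "http://www.perlmonks.org/index.pl";

# Stand-in for encode_entities (assumption: four characters suffice here).
sub esc_html {
    my $text = shift;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$ent{$1}/g;
    return $text;
}

# Stand-in for uri_escape: percent-encode everything but unreserved characters.
sub esc_uri {
    my $text = shift;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Hypothetical error formatter.
sub show_err {
    my $msg = shift;
    return qq(<span class="error">$msg</span>);
}

# Dispatch table: trigger text => handler taking a reference to the input.
my %handler = (
    '[' => sub {
        my $t_ref = shift;
        if ($$t_ref =~ /\G([^\|\]]+)(?:\|([^\|\]]+))?\]/gc) {
            my ($node, $title) = ($1, $2);    # copy before any other match
            my $node_ref  = "$site_base?node=" . esc_uri($node);
            my $node_name = esc_html(defined $title ? $title : $node);
            return qq(<a href="$node_ref">$node_name</a>);
        }
        return show_err("Incomplete node link?");
    },
);

# Walk the text once: escape plain runs, hand each trigger to its handler.
sub render {
    my $text = shift;
    my $out  = '';
    pos($text) = 0;
    while (pos($text) < length $text) {
        if ($text =~ /\G([^\[]+)/gc) {
            $out .= esc_html($1);
        }
        elsif ($text =~ /\G(\[)/gc) {
            $out .= $handler{$1}->(\$text);
        }
    }
    return $out;
}

my $good = render('See [Some Node|this] & that');
my $bad  = render('oops [unclosed');
```

Adding a new piece of markup is then just a matter of adding an entry to the table and widening the trigger patterns in the loop.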
And, of course, given $node_id there is probably a function get_node_name available. And we have that lastnode_id the site keeps track of. So we also need a handler for [:// to link by ID, and that would be generated by something like this:
    sub ret_link_by_id {
        my $tracking = shift;    # eg "&lastnode_id=23453"
        sub {
            my $t_ref = shift;
            if ($$t_ref =~ /\G([1-9]\d*)(?:\|([^\|\]]+))?\]/g) {
                my $node_id   = $1;
                my $name      = $2 || get_node_name($node_id);
                my $node_name = encode_entities($name);
                my $url       = "$site_base?node_id=$node_id$tracking";
                return qq(<a href="$url">$node_name</a>);
            }
            else {
                return show_err("Incomplete node_id link?");
            }
        }
    }
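
A quick sketch of why generating the handler is handy: each call to the generator closes over its own tracking string, so two pages can get two different handlers from the same code. Everything around the generator here is assumed for illustration: the %node_name table behind get_node_name, the show_err stub, and esc_html as a dependency-free stand-in for encode_entities.

```perl
use strict;
use warnings;

my $site_base = "http://www.perlmonks.org/index.pl";

# Hypothetical lookup table standing in for the site's real node store.
my %node_name = (42 => "Example Node");

sub get_node_name {
    my $id = shift;
    return exists $node_name{$id} ? $node_name{$id} : "node $id";
}

# Stand-in for encode_entities (assumption: four characters suffice here).
sub esc_html {
    my $text = shift;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$ent{$1}/g;
    return $text;
}

# Hypothetical error formatter.
sub show_err {
    my $msg = shift;
    return qq(<span class="error">$msg</span>);
}

# Returns a handler that remembers the tracking string it was built with.
sub ret_link_by_id {
    my $tracking = shift;
    return sub {
        my $t_ref = shift;
        if ($$t_ref =~ /\G([1-9]\d*)(?:\|([^\|\]]+))?\]/gc) {
            my ($node_id, $title) = ($1, $2);
            my $name = defined $title ? $title : get_node_name($node_id);
            my $url  = "$site_base?node_id=$node_id$tracking";
            return qq(<a href="$url">) . esc_html($name) . qq(</a>);
        }
        return show_err("Incomplete node_id link?");
    };
}

# Two handlers built from the same generator with different tracking data.
my $from_a = ret_link_by_id("&lastnode_id=23453");
my $from_b = ret_link_by_id("&lastnode_id=777");

my $text   = "42] and more";
my $link_a = $from_a->(\$text);
pos($text) = 0;                       # rewind so the second handler sees the same input
my $link_b = $from_b->(\$text);
```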
If this still looks to your eyes like a slightly hacked up html spec, let me show you a feature that I dearly wish that this site had. Stop and think about what the following handler for \ does:
    sub {
        my $t_ref = shift;
        if ($$t_ref =~ /\G([&\[\]<>\\])/g) {
            return encode_entities($1);
        }
    }
Do you see it? Consider what would happen to the following string:
    You can link by URL like this:
    <pre>
    \<a href="http://www.perlmonks.org/"\><a href=http://www.perlmonks.org/>Perl Monks</a>\</a\>
    </pre>
Got it yet?

No more looking up those pesky escape codes! :-)
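
If you want to check the answer by running it, here is a small sketch that pushes the link part of that string through a backslash handler. The render_escapes loop is a hypothetical driver, esc_html is a dependency-free stand-in for encode_entities, and text that is not behind a backslash is passed through untouched, which is a simplification of what a real driver would do.

```perl
use strict;
use warnings;

# Stand-in for encode_entities (assumption: four characters suffice here).
sub esc_html {
    my $text = shift;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$ent{$1}/g;
    return $text;
}

# Handler for "\": the next character, if special, is emitted as a literal.
my $backslash = sub {
    my $t_ref = shift;
    if ($$t_ref =~ /\G([&\[\]<>\\])/gc) {
        return esc_html($1);
    }
    return;    # not an escape; the driver decides what to do
};

# Hypothetical driver: copy plain runs through, dispatch on each backslash.
sub render_escapes {
    my $text = shift;
    my $out  = '';
    pos($text) = 0;
    while (pos($text) < length $text) {
        if ($text =~ /\G([^\\]+)/gc) {
            $out .= $1;
        }
        elsif ($text =~ /\G\\/gc) {
            my $piece = $backslash->(\$text);
            $out .= defined $piece ? $piece : '\\';
        }
    }
    return $out;
}

my $in = '\<a href="http://www.perlmonks.org/"\>'
       . '<a href=http://www.perlmonks.org/>Perl Monks</a>'
       . '\</a\>';
my $out = render_escapes($in);
print "$out\n";
```

The escaped brackets come out as entities, so the reader sees the tag as text, while the unescaped tag in the middle is still a live link.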

My apologies for using you as a foil, but you just let me illustrate Tom's point perfectly. All of the stuff I am saying is obvious to anyone who has played with functional techniques, but since you haven't, you are simply unable to see the amazing potential inherent in this method of code organization. And I happen to know that you are not a bad programmer, but this was a blind spot for you.

Time to put down the pot, we aren't boiling now. This is a frying pan and I feel like an omelette. :-)


In reply to RE (tilly) 2 (not html): Why I like functional programming by tilly
in thread Why I like functional programming by tilly
