Links between Mason components?

tye has asked for the wisdom of the Perl Monks concerning the following question:

Have I just not found it? It seems a pretty basic, fundamental, and important feature. And I have not yet found any built-in support in HTML::Mason for constructing a link (as in <a href=...) from one Mason component to another where that link includes arguments.

That is, other than a quite awkward construction like:

    <a href="/path/to/component?who=<% $who |u %>;why=<% $why |u %>">
    Title goes here</a>

Having to remember to include "|u" for each parameter makes that nearly unacceptable in my book (after seeing way too many bugs from lack of URL escaping that pass unnoticed for a long time and then turn into a crisis, even a security problem). It also gets quite tedious (and error-prone and hard to read) when producing a table full of similar links.

But to see how awkward that really can be, imagine what I find to be a common case: Having a hash of arguments that you want to include in the link. Am I really supposed to roll my own URL constructor for such an obvious case?

    <& .link, 'Edit Settings', '/widget/settings', %Context &>
...
<%def .link>
%   my( $title, $page, %args ) = @_;
    <a href="<% $page %>?
%       for my $key ( sort keys %args ) {
            <% $key |u %>=<% $args{$key} |u %>;
%       }
    "><% $title |h %></a>
</%def>

Although that first line is a reasonable interface, the implementation, of course, doesn't actually work, producing:


        <a href="/widget/settings?
            acct=some_acct%23id;
            widget=widget%2Bid;
    ">Edit Settings</a>

Is there a better (and actually correct) way to write such in Mason?

Too bad defining a "removing newlines and adjacent whitespace" Mason filter (call it "|w") doesn't allow me to address this problem as simply as:

    <& .link, 'Edit Settings', '/widget/settings', %Context |w &>

(You can use "|w" inside of <% ... %> but not inside of <& ... &>.)
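
A minimal sketch of what the body of such a "|w" filter would have to do, written as plain Perl so it can be tried outside Mason. The name strip_newlines is made up for illustration, and the assumption that a Mason escape receives a reference to the text and edits it in place should be checked against the Mason documentation before relying on it.

```perl
use strict;
use warnings;

# Hypothetical helper: the kind of body a "|w" escape might have.
# It edits the string in place through a scalar reference.
sub strip_newlines {
    my( $ref ) = @_;
    $$ref =~ s/\s*\n\s*//g;   # drop each newline plus the whitespace around it
    return;
}

# The broken output from the first .link attempt, one element per line.
my $html = join "\n",
    '<a href="/widget/settings?',
    '            acct=some_acct%23id;',
    '            widget=widget%2Bid;',
    '    ">Edit Settings</a>',
    '';

strip_newlines( \$html );
print "$html\n";
# prints: <a href="/widget/settings?acct=some_acct%23id;widget=widget%2Bid;">Edit Settings</a>
```

Note that the trailing ';' before the closing quote survives; the filter only repairs the whitespace, not the separator logic.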

So, (at least for now) I resign myself to looking outside of Mason for a solution.

My first stop was CGI because I already know that CGI.pm knows how to construct a URL with parameters. I know it even allows me to choose to use ';' to separate parameters instead of the old, ugly '&'. Unfortunately, I end up disappointed to find that CGI.pm only knows how to construct URLs to the current page.

So, CGI also stopped doing URL escaping itself, now delegating that feature to URI::Escape. Maybe it knows how to construct URLs? No.

Clearly, URI knows how to construct URLs. Of course it does. Sadly, it doesn't appear to know anything about CGI parameters in a URL.

(Sidebar) Not that this is terribly shocking. Even JavaScript got this embarrassingly wrong "forever". JavaScript originally came with a function for URL-encoding strings called escape(). It didn't actually do it right. You should probably never use it.
JavaScript 1.5 added encodeURI(). It appears to be designed to be used in a manner that encourages encoding bugs. You should probably never use it. 1.5 also added the awkwardly-named encodeURIComponent()... which actually does URL encoding correctly. Of course, you still have to roll your own iterating and concatenating and inserting the delimiters.
A language designed for working in web pages doesn't actually come with a feature that will build a URL with parameters. (But it does include the ability to convert numbers to base 11 and atan2(), of course.)
And JavaScript isn't the only example. There have been a lot of web projects that I've dived into to find URLs being constructed with the equivalent of "$page?acct=$acct;phone=$phone". Hope that phone number isn't, for example, "+44 20 773 1234".
Even PerlMonks was such a "web project". Remember when replying to a node whose title contained a double quote didn't work right (and numerous other similar bugs)?
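
To make the phone-number case concrete, here is a small sketch that compares plain interpolation with a hand-rolled encoder. It uses only core Perl; enc() and query_url() are hypothetical names standing in for something like URI::Escape::uri_escape() plus the join logic, and the '/call' page is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real URL escaper: percent-encode every byte
# outside a conservative set of unreserved characters.
sub enc {
    my( $s ) = @_;
    utf8::encode( $s );                                    # characters -> UTF-8 bytes
    $s =~ s/([^A-Za-z0-9\-_.~])/sprintf '%%%02X', ord $1/ge;
    return $s;
}

# Build "page?key=value;key=value" with every key and value encoded.
sub query_url {
    my( $page, %args ) = @_;
    return $page if ! %args;
    return "$page?" . join ';',
        map { enc($_) . '=' . enc($args{$_}) } sort keys %args;
}

my %args  = ( acct => 'some_acct#id', phone => '+44 20 773 1234' );
my $naive = "/call?acct=$args{acct};phone=$args{phone}";
my $safe  = query_url( '/call', %args );

print "naive: $naive\n";   # naive: /call?acct=some_acct#id;phone=+44 20 773 1234
print "safe:  $safe\n";    # safe:  /call?acct=some_acct%23id;phone=%2B44%2020%20773%201234
```

In the naive form the '#' starts a fragment, so the server never sees the phone parameter at all, and even without the '#' the '+' would be decoded as a space.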

So, I roll these pieces together and get:

package LinkToMason;
use strict;

use CGI();
use URI::Escape();

sub escape_url {
    my( $class, $string ) = @_;
    # Escape using default except don't escape '{' nor '}':
    return URI::Escape::uri_escape( $string, "^A-Za-z0-9\\-_.!~*'(){}" );
}

sub as_url {
    my( $class, $page, $rel, %args ) = @_;
    if(  $page !~ m(^/)  ) {
        require Carp;
        Carp::croak( "..." )
            if  ! $rel;
        $rel =~ s{/[^/]*$}{};
        $page = "$rel/$page";
    }
    return $page
        if  ! %args;
    return "$page?" . join ';', map {
        join '=', map $class->escape_url($_), $_, $args{$_}
    } sort keys %args;
}

sub html_link {
    my( $class, $title, $page, $args, $attrs, $rel ) = @_;
    $args   ||= { };
    $attrs  ||= { };

    if(  ref $title  ) {
        $title = $$title;                   # \ '<bold>Real</bold> HTML'
    } else {
        $title = CGI->escapeHTML( $title ); # Non-HTML string
    }

    my $url = $class->as_url( $page, $rel, %$args );

    return CGI->a( { href => $url, %$attrs }, $title );
}

1;
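
The relative-page handling in as_url() is easy to misread, so here is a sketch of just that step in isolation. resolve_page() is a hypothetical extract, not part of the module above, and the croak message is replaced by a placeholder die.

```perl
use strict;
use warnings;

# Hypothetical extract of the path handling in as_url(): a page that does
# not start with '/' is resolved against the directory part of $rel.
sub resolve_page {
    my( $page, $rel ) = @_;
    return $page if $page =~ m{^/};      # already absolute, leave it alone
    die "relative page needs a base\n" if ! $rel;
    $rel =~ s{/[^/]*$}{};                # drop the last path segment
    return "$rel/$page";
}

print resolve_page( 'settings',    '/widget/'     ), "\n";   # /widget/settings
print resolve_page( 'settings',    '/widget/list' ), "\n";   # /widget/settings
print resolve_page( '/other/page', '/widget/'     ), "\n";   # /other/page
```

So $rel behaves like the URL of the current page: whatever follows its last '/' is discarded before the relative page name is appended.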

And try to use that in my Mason:

<%once>
    use LinkToMason;
</%once>

<%args>
    $acct_id
</%args>

<%shared>
    my %Context = (
        acct    => $acct_id,
        widget  => $widget_id,
    );
</%shared>
...
    <& .link, "Edit $widget_name Settings", 'settings', %Context &>
...
<%def .link><%perl>
    my( $title, $page, %args ) = @_;
    my $link = LinkToMason->html_link(
        $title, $page, \%args, { }, '/widget/',
    );
</%perl><%
    $link |n
%></%def>

Note the gyrations to prevent .link from including newlines.

Okay, that is quite a bit uglier than I had hoped for. But it actually works.

But it quickly proved inflexible when I tried to use it in a page that uses JavaScript to generate a list of links client-side. Along the way I changed .link to take, instead of a hash of parameters, just a comma-separated list of key names, which it uses to look up the parameter names and values that are ever used from this page:

    <& .link, "Edit $widget_name Settings", 'settings', 'acct,widg' &>
...
    <script type="text/javascript">
...
        + '<& .link, "Edit {{feature_name}}", 'edit', 'acct,widg,feat' &>'

Where the new .link replaces the ',feat' with feature_id => '{{feature_id}}' and the JavaScript replaces '{{feature_name}}' and '{{feature_id}}' with values that vary between rows (one row generated per feature).

There is a risk that the second call to .link above would include something (a ', a \, or a newline) that wouldn't be legal inside of a JavaScript string. That sounds like a job for a Mason filter. I could define "|l" to strip newlines (a common desire when using Mason, it seems) and "|sq" to escape those problem characters.

Oh, except, as we already mentioned, you can't use something like "|sq" with <& ... &>.
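
Whatever form it ends up taking in Mason, the escaping itself is small. The sketch below shows one way to handle the three problem characters named above in plain Perl; js_sq() is a hypothetical name, and it deliberately covers only ', \ and newlines, so anything else that is unsafe in a script block is out of scope here.

```perl
use strict;
use warnings;

# Hypothetical "sq" helper: make a string safe to paste between single
# quotes in JavaScript source.
sub js_sq {
    my( $s ) = @_;
    $s =~ s/([\\'])/\\$1/g;    # backslash-escape backslashes and single quotes
    $s =~ s/\r?\n/\\n/g;       # turn real newlines into the two characters \n
    return $s;
}

# A link such as .link might produce, with a quote-delimited attribute and
# a stray newline in the middle.
my $link = qq{<a href='/widget/edit?feat={{feature_id}}'>Edit\n{{feature_name}}</a>};
my $js   = "+ '" . js_sq( $link ) . "'";

print "$js\n";
# prints: + '<a href=\'/widget/edit?feat={{feature_id}}\'>Edit\n{{feature_name}}</a>'
```

The '{{...}}' placeholders pass through untouched, so the client-side substitution still finds them.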

I could define .sq that escapes the string passed to it. Ooh, I just found this syntax:

        + '<&| .sq &><& .link, ... &></&>'

That actually addresses (if in a manner still uglier than I had hoped) some of the questions I had when I started writing this.

What other features am I missing? How can I do this better?

- tye

Re: Links between Mason components? by Anonymous Monk on May 17, 2012 at 20:36 UTC
"So, CGI also stopped doing URL escaping itself, now delegating that feature to URI::Escape."

No, CGI has forever delegated the job to CGI::Util::escape().

"Clearly, URI knows how to construct URLs. Of course it does. Sadly, it doesn't appear to know anything about CGI parameters in a URL."

Actually it does, see "query_form", but it is a PITA (no state, one-shot replacements). That is why there is URI::QueryParam, which, while more convenient, is still kind of a PITA.

You can use CGI for this:

$ perl -l
use CGI;
my $queryParams = { a => [qw/ a >< a/], qw/ b b><b c c><c / };
print CGI->new( $queryParams )->query_string;
__END__
c=c%3E%3Cc;a=a;a=%3E%3C;a=a;b=b%3E%3Cb

Badger::URL looked interesting, but naturally, like all interesting things, it doesn't handle something: it doesn't handle multivalued params like `fa=a;fa=b;fa=c;`
Re^2: Links between Mason components? (CGI) by tye (Sage) on May 18, 2012 at 03:00 UTC
Thanks. That was helpful (and, even more so, interesting).

The pseudo-singleton features of CGI made me unwilling to create a new CGI object in an environment that is likely already using a CGI instance. I guess I could dig into CGI.pm to see if my fears are unfounded with the current implementation and, also important, whether the documentation makes it clear that my fears should not become 'founded' in a future release of CGI.pm.

- tye
Re^3: Links between Mason components? (CGI) by Anonymous Monk on May 18, 2012 at 03:49 UTC
"I guess I could dig into CGI.pm to see if my fears are unfounded with the current implementation and, also important, whether the documentation makes it clear that my fears should not become 'founded' in a future release of CGI.pm."

Your fears are unfounded, both according to the source and the docs ("To create an empty query, initialize it from an empty string or hash"). I remember this from source diving (which CGI.pm encourages) in the year 2000.
Re: Links between Mason components? by FloydATC (Deacon) on May 17, 2012 at 20:48 UTC
Only you know how to link to stuff in your web application, so no size will ever fit all; a generalized solution will probably never meet all requirements.

Here's what I usually do: I write my own modules (classes really) that represent the different entities that the web application deals with, such as users, articles, hosts or whatever. Each of those modules (classes) includes a link() method which "knows" what a URL for that particular entity should look like. This method takes a hash argument, through which I pass on all the arguments that the current Mason component was called with.

Example:

<TABLE class="items">
% foreach my $item (@items) {
  <TR><TD><% $item->link( %ARGS ) %></TD></TR>
% }
</TABLE>

The corresponding Item.pm module might contain something like this:

sub link {
    my $self = shift;
    my %args = @_;
    $args{'foo'} ||= "default";
    $args{'bar'} ||= "values";
    return '<A href="/path/to/item.html?id='.$self->id.'&foo='.$args{'foo'}.'&bar='.$args{'bar'}.'">'.$self->name.'</A>';
}

There might be other elegant solutions but this approach has worked brilliantly for me over the years.

-- Time flies when you don't know what you're doing
Re^2: Links between Mason components? (requirements) by tye (Sage) on May 18, 2012 at 03:54 UTC
Thanks for the examples.

"this approach has worked brilliantly for me over the years"

I guess you should count yourself lucky that you have never, in years, run into a value that contains a '+' character or a `$self->name()` that contains a contraction or a quote (or tons of other cases that your code brilliantly fails in the face of). Though, as I noted, I've found that these types of bugs usually take a long time to actually bite you. "Years" isn't out of the question. Heck, my latest discovery of this class of bug is in code where the problem hasn't been noticed for years.

Perhaps there is code elsewhere that guarantees that only URL-safe characters are ever allowed in `$self->name()`. My experience is that such an assumption is so often correct (and usually due mostly to chance) that "nobody" notices the lack of proper encoding / escaping. Which leads to "everybody" forgetting about encoding and escaping and then to problems (sometimes just annoying, sometimes serious) when the case where that assumption doesn't hold finally crops up.

Writing `"$page?foo=$foo"` while thinking "I can do that because I know that `$foo` never contains anything but letters" just leads to people (sometimes even the original author) copying that form of code in a situation where it isn't actually safe. If you have to think that phrase, then you should be recording it in a comment. Or, better yet, just properly encode `$foo` even though you "know" you don't have to.

"Only you know how to link to stuff in your web application, so no size will ever fit all; a generalized solution will probably never meet all requirements."

The only requirements I was expecting to find a generalized solution for were the requirements of the standards that define how CGI parameters are put into a URL. But you seem blissfully unaware of those. Not that you aren't "in good company" on that front, in my experience.

In my environment, there are no classes that represent objects to be displayed that have any business knowing how to link to pages. The objects that the browser-friendly pages display are the same objects that the REST API deals with and that the XMLRPC API deals with (and that the cron jobs deal with, etc.). So teaching those objects how to produce links to browser-friendly pages wouldn't solve the problem of providing URLs for the REST API (the XMLRPC API doesn't use a concept of 'links'). But that problem really gets bad when the new, non-Mason front end components start getting added in.

So, I'd like the Mason-specific quirks of linking to be handled in the Mason. In particular, my examples showed how I was trying to use details about a specific Mason component to better abstract the types of links used by just that component (which then used the module I wrote to correctly construct a URL and then wrap that correctly into an HTML link).

We try to put as little code in the Mason as is practical. But code that is specific to one Mason component seems a bit silly to put someplace else. But if I don't learn some significant improvements in how to do that, then moving the code out of Mason will likely be the solution.

In reading further after I posted, I did find `$m->print(...)`, which answered another fundamental question I had: How do I write Perl code that 'returns' content from a subcomponent (or method)? Which means I could rewrite the URL-composing Mason code like this:

<%def .link><%perl>
    my( $title, $page, %args ) = @_;
    my $i = $m->interp();
    $m->print( "<a href='$page" );
    my $sep = '?';                      # '?' before the first pair, ';' after
    for my $key ( sort keys %args ) {
        $m->print(
            $sep . join '=', map $i->apply_escapes($_,'u'), $key, $args{$key}
        );
        $sep = ';';
    }
    $m->print( "'>" . $i->apply_escapes($title,'h') . "</a>" );
</%perl></%def>

which is much closer to correct, if still quite ugly. (One mistake is that it doesn't HTML-escape the URL. That shouldn't usually be a problem here since I'm URL-encoding most of the values and using ';' separators not '&' separators. But I can still think of ways to make it fail.)

So, as usual, taking the time to try to carefully compose the question did lead to finding some answers. But I'm still hoping for some revelations of tricks I'm missing about doing this type of abstraction nicely in Mason.

Thanks, again, for your reply.

Update: Better example of proper URL building in Mason:

<%def .link><%perl>
    my( $title, $page, %args ) = @_;
    my $i = $m->interp();
    $page .= '?' . join ';', map {
        join '=', map $i->apply_escapes($_,'u'), $_, $args{$_}
    } sort keys %args
        if  %args;
    $page  = $i->apply_escapes( $page, 'h' );
    $title = $i->apply_escapes( $title, 'h' );
    $m->print( "<a href='$page'>$title</a>" );
</%perl></%def>

- tye
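
The order of operations in the updated .link (URL-encode each key and value, join them, then HTML-escape the finished URL and the title) can be tried outside Mason with the sketch below. url_enc() and html_esc() are hypothetical stand-ins for the "u" and "h" escapes, not Mason's own implementations; in particular, this html_esc() is assumed to escape the single quote, which matters because the attribute is single-quoted.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a "u"-style escape.
sub url_enc {
    my( $s ) = @_;
    $s =~ s/([^A-Za-z0-9_.-])/sprintf '%%%02X', ord $1/ge;
    return $s;
}

# Hypothetical stand-in for an "h"-style escape, including the single quote.
sub html_esc {
    my( $s ) = @_;
    my %ent = (
        '&' => '&amp;',  '<' => '&lt;', '>' => '&gt;',
        '"' => '&quot;', "'" => '&#39;',
    );
    $s =~ s/([&<>"'])/$ent{$1}/g;
    return $s;
}

sub link_html {
    my( $title, $page, %args ) = @_;
    # Step 1: URL-encode each key and value, then join into a query string.
    $page .= '?' . join ';',
        map { url_enc($_) . '=' . url_enc($args{$_}) } sort keys %args
        if %args;
    # Step 2: HTML-escape the whole URL and the title for the markup.
    return sprintf "<a href='%s'>%s</a>", html_esc($page), html_esc($title);
}

my $html = link_html( "Tom & Jerry's", "/it's/here", q => 'a&b=c' );
print "$html\n";
# prints: <a href='/it&#39;s/here?q=a%26b%3Dc'>Tom &amp; Jerry&#39;s</a>
```

The path "/it's/here" is exactly the kind of input that slips past URL encoding of the parameters alone: without the second step, its quote would end the href attribute early.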
Re^3: Links between Mason components? (requirements) by FloydATC (Deacon) on May 18, 2012 at 05:42 UTC
The example is of course very simplified to illustrate a concept. If there are user strings involved then proper care is required when building the URLs.

If my approach doesn't suit your needs then it just goes to prove my point; one size does not fit all :-)

-- Time flies when you don't know what you're doing