http://www.perlmonks.org?node_id=530614

A long-building batch of improvements has just been rolled onto the PerlMonks production web servers. Most of the end-user effects were rather small, but there were also some bigger behind-the-scenes improvements and the total change was fairly large.

My favorite user-visible change is that PerlMonks' "Search" box lets you jump directly to any URL you could construct with any of our linking short-cuts. Say you need to read Data::Diver docs and already happen to have a PerlMonks page loaded into your browser (don't you always?). Type "mod://Data::Diver" into the "Search" box, submit, and your browser jumps right to the requested page in search.cpan.org.

Square brackets can also be included, so you can paste link specs like "[id://22609|someone]" into the search box to see or test where they lead without having to trim them down.

Part of the motivation for this feature is to provide a way for external clients to faithfully translate PM link specs. For example, if you write a PM chat client, you can parse out the link patterns and then fetch (with 'follow redirects' disabled) "http://perlmonks.org/?node=[doc://@ARGV]" and find out exactly what URL PM would use for such a link and what title would be displayed. The destination URL will be in the "Location" header (of course) and the PM title will be in the "X-Title" header. Or you could simply have such link specs link to the above (redirecting) URL and not have to pre-fetch any URLs (and display some other title / label for the link, perhaps the full link spec, as typed).
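
A minimal sketch of such a client-side lookup, for illustration only. It assumes the core HTTP::Tiny module; the sub names pm_link_url and pm_resolve_link are made up for this example, and the hand-rolled URL-encoding is just one reasonable way to get the brackets through safely.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Build the redirecting URL for a link spec such as "doc://perlfunc".
# The square brackets are added here, then the whole thing is URL-encoded.
sub pm_link_url {
    my ($spec) = @_;
    my $node = "[$spec]";
    utf8::encode($node);    # encode as bytes before percent-escaping
    $node =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return "http://perlmonks.org/?node=$node";
}

# Ask PM where the link leads, without following the redirect.
# Returns ( $destination_url, $title ), or an empty list if no
# "Location" header came back.
sub pm_resolve_link {
    my ($spec) = @_;
    my $http = HTTP::Tiny->new( max_redirect => 0 );
    my $res  = $http->get( pm_link_url($spec) );
    my $location = $res->{headers}{location};    # header names are lower-cased
    return unless defined $location;
    return ( $location, $res->{headers}{'x-title'} );
}

# Only hit the network when a link spec is given on the command line.
if (@ARGV) {
    my ( $url, $title ) = pm_resolve_link( $ARGV[0] );
    if ( defined $url ) {
        printf "URL:   %s\nTitle: %s\n", $url, defined $title ? $title : '(none)';
    }
    else {
        print "No redirect returned for [$ARGV[0]]\n";
    }
}
```

Run it as, say, "perl pmlink.pl doc://perlfunc" (the script name is arbitrary). A client that doesn't want to pre-fetch can just use the value of pm_link_url() as the link target.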

Note that chat clients should also take advantage of the fairly recently added ability to get chatter as already processed HTML (including enforcement of proper nesting of tags, etc.). I'll let demerphq expand on this point, if he'd be so kind. (This means many clients won't need the above feature for translating links, but it can still be useful, especially for some clients).

Next, another change that is a bit complicated to explain. It used to be that [12345] would first look for nodes titled "12345" and only go to node ID 12345 if no matching title was found. Now [12345], ?node=12345, and typing 12345 into the "Search" box all always just try to go to node ID 12345 (like [id://12345] and ?node_id=12345 have done and continue to do). This is more efficient, prevents some problem situations, and makes it easy and reliable to use the "Search" box to jump to nodes by ID number.
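
In other words, the new rule is roughly the following. This is only an illustration of the behaviour described above, not the actual site code; the sub name lookup_kind is hypothetical, and it assumes the value has no surrounding whitespace.

```perl
use strict;
use warnings;

# Hypothetical classifier: decide how a bare link target, a ?node= value,
# or a "Search" box entry gets looked up under the new rule.
sub lookup_kind {
    my ($text) = @_;
    return 'id' if $text =~ /\A[0-9]+\z/;    # all digits: node ID, no title lookup first
    return 'title';                          # anything else: title lookup / search
}

printf "%-12s => %s\n", $_, lookup_kind($_) for '12345', 'Data::Diver', '12345a';
```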

Other changes:

  1. Links with stray whitespace, like [tye ], now correctly display as "tye" (no longer as "tye " with the trailing space).
  2. You can once again put simple HTML tags in your link titles, so "[TheDamien|<i>The</i>D<b>am</b><tt>i</tt>an]" becomes "TheDamian"
  3. Several bugs in link short-cuts were fixed. For example, AT&T no longer acts like AT.
  4. Writing new link short-cuts is much simpler and less error-prone, so we'll probably soon have imdb://, e2://, and a few more.
  5. A few minor improvements to the emitted HTML
  6. Restored our very efficient front page for casual visitors (in case we get /.'d again)
  7. The 'salt' for your encrypted password in your cookie is no longer the first two characters of your username
  8. Some improvements for dealing sanely with web spiders (so Google could index us directly soon, we hope)
  9. A new method, genLink(), for constructing links will make a common pmdev task easier and also reduce server load a bit
  10. Changes to pave the way for a better new-user sign-up process that avoids creating useless user accounts when people don't type in their e-mail address correctly the first time.

- tye        

Re: Batch of improvements applied
by Limbic~Region (Chancellor) on Feb 16, 2006 at 13:27 UTC
    tye,
    This is great. I know that some monks think the only improvements or changes that should be done are their great ideas. You won't be able to please all the monks all the time but even if they don't express it - a lot of monks do appreciate the hard work all the volunteers put into the site.

    Any idea when the new node cache will be ready?

    Cheers - L~R

Re: Batch of improvements applied
by jdporter (Paladin) on Feb 16, 2006 at 16:05 UTC

    tye, you just pegged the karma meter, I'm sure!

    It used to be that [12345] would first look for nodes titled "12345" and only go to node ID 12345 if no matching title was found.

    So, what happens to nodes that are titled "12345"? Doesn't this change make it impossible to link to such a node by title? (And, uh... is that bad? ;-)

    We're building the house of the future together.

      In order: They already sucked. Kinda, for now, or at least harder. No (it isn't bad).

      Mostly it's a "don't care". A long time ago I made it illegal to register a user with only digits in its name (and all of the previous all-digit user accounts have fallen into disuse). Since then we've only had a few cases of nodes with all-digit titles. One was from a meddler trying to abuse the 'feature' that I just 'fixed' (i.e. removed), which got retitled after I complained. The others were from people trying to be helpful in covering up a bug of little consequence and these all now linger, making it harder for me to jump to node ID 1, etc.

      You can still link to such nodes by title using href://?node=title;type=nodetype (provided, of course, that you also specify the node's type). Long-ago plans include adding [title://...] links which would allow linking to nodes by title which can't currently be linked to by title (such as nodes with square brackets in their titles) and that would include this new set of "can't be easily linked to by title" nodes. This would also add a ?title= feature that would be like ?node= but without the DWIM (perhaps also not just running off and searching when you get the title wrong).

      And, of course, you can link to such nodes by ID (and do it even easier than before), which is what you should do anyway since such nodes are likely to get retitled because all-digit titles suck. (:

      - tye        

        I should update this logic so that if [9] contains a node_id that does not exist or that you don't have access to (and you haven't told 'search' to list nodes that you don't have access to), then instead of "none such" or "tough beans" you'll get title search behavior.

        - tye        

Re: Batch of improvements applied
by sweetblood (Prior) on Feb 16, 2006 at 13:51 UTC
    Outstanding! Thanks also to all the monks behind the scenes that participated in this update as well as others, past and yet to come.

    Sweetblood

Re: Batch of improvements applied (pmdev)
by tye (Sage) on Feb 17, 2006 at 18:52 UTC

    Note that I've made some changes so pmdev members can see the prior version, Everything/HTML.pm-1 (and, for example, download it to compare it to the current version, Everything/HTML.pm) and I'll follow this convention for future patches to the PM modules.

    For pmdev members hoping to patch in new link short-cuts, here is some documentation on how the new, simpler, more robust link handlers are written:

    The following variables are set for you to use in the link handler:

    [$fullspec]
    [ $prefix :// $suffix | $title ]
    $linkspec  = "$prefix://$suffix"
    $escsuffix = $q->escape($suffix)   # URL-encoded

    That is, $fullspec is exactly what the user typed between the brackets, including whitespace. $prefix, $suffix, and $title have leading and trailing whitespace stripped and $prefix is forced to lowercase.
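
    To make those definitions concrete, here is a rough sketch of how a link spec could be split into these variables. It is not the code in Everything/HTML.pm; parse_link_spec and url_escape are names invented for this example, and url_escape only stands in for $q->escape().

```perl
use strict;
use warnings;

# Minimal URL-encoder standing in for $q->escape() (an assumption).
sub url_escape {
    my ($text) = @_;
    utf8::encode($text);    # work on bytes
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

# Hypothetical parser: takes what the user typed between the brackets and
# returns the variables as a hash, or an empty list if there is no
# "prefix://" part.
sub parse_link_spec {
    my ($fullspec) = @_;
    my ( $target, $title ) = split /\|/, $fullspec, 2;
    return unless defined $target;
    $title = '' unless defined $title;
    my ( $prefix, $suffix ) = $target =~ m{\A\s*(\w+)\s*://(.*)\z}s
        or return;
    # Strip leading and trailing whitespace from each piece.
    for ( $prefix, $suffix, $title ) { s/\A\s+//; s/\s+\z//; }
    $prefix = lc $prefix;    # prefix is forced to lowercase
    return (
        fullspec  => $fullspec,               # exactly as typed
        prefix    => $prefix,
        suffix    => $suffix,
        title     => $title,
        linkspec  => "$prefix://$suffix",
        escsuffix => url_escape($suffix),     # URL-encoded
    );
}

my %v = parse_link_spec(' Mod :// Data::Diver | the docs ');
print "$_ = '$v{$_}'\n" for qw(prefix suffix title linkspec escsuffix);
```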

    If your handler is just a bareword, then you are defining an alias (the ftp and https handlers are both just "http"). Otherwise, your handler should return one of the following cases:

    ( $url )              # The most common case, uses $suffix as default title
    ( $url, $deftitle )   # Alternate default title
    ( '', $html )         # No separate URL; avoid this if possible
    ( )                   # $linkspec was invalid, $q->escapeHTML("[$fullspec]") used

    So a typical link handler is often as simple as:

    "http://lyrics.org/?song=$escsuffix"

    If you want [lyrics://ironic] to render as <a ...>lyrics://ironic</a> instead of w/o the "lyrics://" part showing, then you'd use:

    ( "http://lyrics.org/?song=$escsuffix", $q->escapeHTML($linkspec) )

    Note that the second value returned is just the default title and it is only used if $title is blank (which should mean that the user didn't include "|title" in their link -- but your link handler can also "mess with" $title if you have a good reason, which is unlikely).

    I have a few $q->escapeHTML() calls missing (from the setting of the default title and from the 'http' short-cut), so I'll fix that soonish. But the default title you return should be HTML, so be sure to escapeHTML() it if needed (usually the case).

    Also note that the presence or absence of each returned value is determined with a simple Perl boolean test so returning ( $url, "0" ) is the same as ( $url ), ( 0, $html ) means ( '', $html ), and ( undef, '0') means ( ), for example.
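
    A sketch of how those return values could be interpreted, using plain boolean tests as described. Again, this is an illustration and not the real code: render_link and escape_html are hypothetical names (escape_html stands in for $q->escapeHTML()), and it assumes a user-supplied $title is already acceptable HTML.

```perl
use strict;
use warnings;

# Stand-in for $q->escapeHTML() (an assumption).
sub escape_html {
    my ($text) = @_;
    my %ent = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' );
    $text =~ s/([&<>"])/$ent{$1}/g;
    return $text;
}

# Hypothetical renderer: ( $url, $second ) is whatever the handler returned.
sub render_link {
    my ( $fullspec, $suffix, $title, $url, $second ) = @_;

    # No true values: invalid link spec, show what the user typed.
    return escape_html("[$fullspec]") unless $url || $second;

    # ( '', $html ): no separate URL, the handler supplied finished HTML.
    return $second unless $url;

    # ( $url ) or ( $url, $deftitle ): the default title is used only
    # when the user gave no title of their own.
    my $label = ( defined $title && length $title ) ? $title
              : $second                              ? $second
              :                                        escape_html($suffix);
    return sprintf '<a href="%s">%s</a>', escape_html($url), $label;
}

print render_link( 'lyrics://ironic', 'ironic', '',
    'http://lyrics.org/?song=ironic' ), "\n";
print render_link( 'id://notanumber', 'notanumber', '' ), "\n";
```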

    Returning ( '', $html ) is supported for strange cases like [localtime://] (which renders as "Apr 16, 2024 at 14:50 UTC", ATM) but should be avoided, if possible. Some link handlers (like pad://) currently use it just because it was a pain to fix them to use the new convention, but I hope they will get converted eventually. Note that such link short-cuts don't work in the "Search" box.

    Returning no true values means that the link spec was invalid and what the user typed should simply be output unchanged, for example, [id://notanumber].

    In the next day or two I'll fix the settings display page and patching pages to stop stripping newlines and switch back to using "handlelinks settings" (instead of the current "new handlelinks settings" that made the migration easier) and then patches can be applied again. Sorry for the delay.

    - tye        

Re: Batch of improvements applied
by demerphq (Chancellor) on Feb 17, 2006 at 21:47 UTC

    Note that chat clients should also take advantage of the fairly recently added ability to get chatter as already processed HTML (including enforcement of proper nesting of tags, etc.). I'll let demerphq expand on this point, if he'd be so kind. (This means many clients won't need the above feature for translating links, but it can still be useful, especially for some clients).

    Instead of expanding on this point I put together a proof-of-concept implementation of a properly parsed CB mirror: CB60.

    The point is that the 'modern' xmlstyle which is available from both CB XML tickers is much easier to write clients for than the older versions, and the end product will respect the site markup conventions, even if we change them.

    Folks should have a look at What XML generators are currently available on PerlMonks? for more details on the tickers.

    ---
    $world=~s/war/peace/g