PerlMonks  

Making perlmonks search engine friendly

by blakem (Monsignor)
on Aug 17, 2001 at 13:43 UTC (#105640=monkdiscuss)

As many of you know, search engines do a poor job of indexing dynamic content. URLs with question marks in them are sometimes trimmed and usually not "followed" (i.e. added to the indexing list when encountered on another page). Therefore, sites that are entirely dynamic (such as PM) tend to get skipped by search engines.

This has all been discussed before (see "PerlMonks and Google" and "CGI queries without '?'"), but nothing has been done about it.

Therefore, I wrote up a quick "mirror" which maps dynamic pages on PM to static pages on one of my domains. They have a big "Go to the Real Perlmonks" link for actual people to use, and have all nodelets turned off. Since these pages can be readily indexed by search engines, my goal is to help build our community by bringing more Perl programmers to the site.
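The post does not show how the mirror names its static pages, so the following is only a minimal sketch of the mapping idea. The `/pm` prefix, the `.html` suffix, and the function name `static_path` are all hypothetical, and the `node_id` query parameter is assumed to be how the dynamic site addresses a node.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Hypothetical layout for the mirror; the post does not describe the real one.
MIRROR_PREFIX = "/pm"


def static_path(dynamic_url: str) -> Optional[str]:
    """Map a query-string URL to a path with no question mark in it.

    Returns None when the URL does not carry exactly one numeric node_id,
    so the caller can decide to link to the real site instead.
    """
    query = parse_qs(urlsplit(dynamic_url).query)
    node_ids = query.get("node_id", [])
    if len(node_ids) != 1 or not re.fullmatch(r"[0-9]+", node_ids[0]):
        return None
    return f"{MIRROR_PREFIX}/{node_ids[0]}.html"
```

A crawler that refuses to follow `index.pl?node_id=105640` has no such objection to `/pm/105640.html`, which is the whole point of the mirror.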

I'm wondering if anyone has any strong feelings on this. If people don't think it's appropriate, it's easy enough to pull the plug on it. I've already made a few tweaks to it (for instance, the Offering Plate isn't mirrored; it sends you to the actual one -- same with links to The Monastery).

When you try it out, keep in mind that it is being run by a monk for the monks, so if you have any questions/suggestions/trepidations just let me know.

P.S. I had a boss who liked to call these setups "goat pages" after that unfortunate goat used to lure the dinosaurs in Jurassic Park.

-Blake

Re: Making perlmonks search engine friendly
by stefan k (Curate) on Aug 17, 2001 at 15:16 UTC
    So,
    first of all respect to the work you have done and the time you have spent!
    I must admit that I haven't followed the earlier discussions about the search engine topic, but I've got my own views on it. The search function within the Monastery is probably more likely to produce a useful result when you're looking for perlish answers than a general search engine is. So I wouldn't use The Google for finding my answers in the first place.

    Another goal might be to get more Perl coders here. I'd think this is better achieved by getting some links to the Monastery Gates in the right places, where a common Perl coder would look for help.

    Thus I'd spend no time on the effort of making perlmonks.org search engine friendly. But if you had some fun doing all that and got the feeling you provided something useful/important, you're probably right - besides being a valuable part of the worldwide Perl community :). And so, once again in closing, respect to your efforts!

    Regards... Stefan

(ichimunki) Re: Making perlmonks search engine friendly
by ichimunki (Priest) on Aug 17, 2001 at 19:10 UTC
    Totally great technique. Except that if we can't get vroom to do it here, all the real perlmonks.org links will still clutter any search engine result listing. :)

    How hard would it be to simply redirect user agents that don't look like spiders/bots to the real site? I notice in my logs that the well-behaved spiders ask for robots.txt first, so based on that you could allow only those agents that ask for robots.txt first to access these clean, better-for-indexing pages.
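The robots.txt-first heuristic suggested above could look roughly like this. It is a sketch only: the class name `RobotsFirstGate`, the `respond` helper, and the idea of keying clients by an opaque string (for example IP address plus User-Agent) are assumptions, and the heuristic is easy to fool in either direction.

```python
from typing import Dict, Tuple


class RobotsFirstGate:
    """Treat a client as a spider only if its first request was /robots.txt."""

    def __init__(self) -> None:
        # client key -> first path that client ever requested
        self._first_path: Dict[str, str] = {}

    def is_spider(self, client: str, path: str) -> bool:
        # setdefault records the path only on the client's first request.
        first = self._first_path.setdefault(client, path)
        return first == "/robots.txt"


def respond(gate: RobotsFirstGate, client: str, path: str,
            real_url: str) -> Tuple[str, str]:
    """Serve the mirrored page to spiders, redirect everyone else."""
    if gate.is_spider(client, path):
        return ("serve", path)
    return ("redirect", real_url)
```

A real deployment would also need to expire old entries and persist the table across requests, neither of which is shown here.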

    Just out of curiosity: does your CGI script simply use an LWP request to perlmonks.org to create the content on the fly? I assume it must, since to do otherwise would be a constant update job. I also assume that, for efficiency, it caches any pages it has processed at least once. Something like this would also be a great start on that CD-ROM version of the site. ;)
      Yep, it keeps a cache for a few days; if the cache misses, fresh content is fetched from perlmonks.org.
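The cache-then-fetch behaviour described above can be sketched as follows. This is not the mirror's actual code: the three-day lifetime is a guess at "a few days", and `PageCache`, its `fetch` callback, and the injectable `clock` are names made up for the illustration.

```python
import time
from typing import Callable, Dict, Tuple

# "A few days" in the post; the exact figure here is an assumption.
CACHE_TTL_SECONDS = 3 * 24 * 60 * 60


class PageCache:
    """Serve a stored page while it is fresh; otherwise fetch it again."""

    def __init__(self, fetch: Callable[[str], str],
                 ttl: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch   # called with the URL on a cache miss
        self._ttl = ttl
        self._clock = clock   # injectable so the expiry logic can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}  # url -> (fetched_at, body)

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]           # hit: still within the lifetime
        body = self._fetch(url)       # miss or stale: go to the real site
        self._entries[url] = (now, body)
        return body
```

Keeping the fetcher as a parameter means the same class works whether pages come from an HTTP client or from a stub in a test.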

      I don't really want to play the cat-n-mouse game of User-Agent matching. However, I did remap a few more buttons to fall through to the real site. 'Offer your reply', '\d+ replies', 'comment on' and 'perl monks user search' all send you to perlmonks.
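One way to picture the remapping just described is a small rule table keyed on link text. The four patterns come from the list above; everything else (the `REAL_SITE` URL form, the `/pm` mirror prefix, and the function names) is assumed for the sake of the sketch.

```python
import re

# Assumed URL shapes; the post does not give either one.
REAL_SITE = "http://www.perlmonks.org/index.pl"
MIRROR_PREFIX = "/pm"

# Link texts that should fall through to the real site, per the post.
FALL_THROUGH = [
    re.compile(r"offer your reply", re.IGNORECASE),
    re.compile(r"\d+ replies", re.IGNORECASE),
    re.compile(r"comment on", re.IGNORECASE),
    re.compile(r"perl monks user search", re.IGNORECASE),
]


def falls_through(link_text: str) -> bool:
    """True if the link text starts with one of the fall-through patterns."""
    text = link_text.strip()
    return any(pattern.match(text) for pattern in FALL_THROUGH)


def rewrite_link(link_text: str, node_id: str) -> str:
    """Pick the href for a link: the real site for actions, the mirror otherwise."""
    if falls_through(link_text):
        return f"{REAL_SITE}?node_id={node_id}"
    return f"{MIRROR_PREFIX}/{node_id}.html"
```

Matching on link text rather than on the visitor's User-Agent keeps the mirror identical for every client, which sidesteps the cat-and-mouse problem entirely.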

      Even w/o the big "Go to the real perlmonks" sign at the top, it wouldn't take too many clicks before a real user would wind up on the main site.

      -Blake

Re: Making perlmonks search engine friendly
by Superlman (Pilgrim) on Aug 17, 2001 at 19:46 UTC
    Fantastic!

    It's things like this that make me wish I could ++ the same thing multiple times

Re: Making perlmonks search engine friendly
by dave_aiello (Pilgrim) on Aug 21, 2001 at 19:44 UTC
    I would like to make a couple of general comments about Web application design and search engine friendliness that seem relevant to PerlMonks.

    Last year, I provided a bit of assistance to Automatic Media, the folks that built Plastic.com. IMHO, Plastic is one of the best commercially-oriented implementations of Slashcode ever done, and a real credit to the people who built it, including Joey Anuff and Jon Phelps.

    In the course of our discussion, we all agreed that one of the biggest problems with Slashdot was the site's "lack of memory". In early versions of Slashcode, the lack of memory stemmed from the fact that the only way search engines could find content that had scrolled off the home page was through the "Older Stuff" Slashbox, which generally only displayed the links to the story pages for the previous 7 days. If Rob Malda and friends had, at the time of Slash 0.9, chosen to display a paging device (such as a link labeled "Last 20 stories ->" at the bottom of the home page), they would have created a searchable site, since all story pages were statically rendered.

    With the advent of Slash 1.0 and later versions, many advances had been made in the efficiency of Apache, mod_perl, and MySQL, and in the efficient deployment of these resources across multiple machines. So, the developers of Slashcode and of code bases that were influenced by it (like the Everything code base upon which PerlMonks was built) focused on dynamically rendering as many content pages as possible. This was thought to improve the user experience, which I guess it did marginally for those people who already knew about the websites in question. But it actually made the sites less usable for people who would otherwise have found them by doing searches on Google and other search engines.

    Ultimately, this has hurt sites like PerlMonks more than Slashdot, because the nodes in PerlMonks are of greater lasting value to Perl users than the discussion of current events that is the core of Slashdot content.

    If you agree with me that greater search engine accessibility would improve PerlMonks, then one obvious way forward would be to enhance the underlying code base to have it perform more static rendering of nodes. This would not have to be an all-or-nothing approach: server-side includes could still be used to merge static and non-static page components when logged-in users visit each node. There are definitely other techniques that could be employed, such as URL rewriting, which would hide URI parameters in permanent-looking URLs. But I think increasing the amount of content that is statically rendered would be the most efficient approach, if you look at the site holistically.

    Dave Aiello
    Chatham Township Data Corporation

Re: Making perlmonks search engine friendly
by andye (Curate) on Sep 27, 2001 at 14:44 UTC
    Nice one - blakem double++.

    How about a link to a list of all nodes - if the monastery can be persuaded to generate such a thing?

    andy.
