PerlMonks  

Re^2: Are monks hibernating? (spiders)

by tye (Cardinal)
on Feb 22, 2007 at 23:04 UTC (#601658)


in reply to Re: Are monks hibernating?
in thread Are monks hibernating?

Thanks to you (and others) for looking up some numbers.

I'm pretty convinced that the recent decline is due to my finally closing the back door through which search engine spiders were (badly) indexing the site. Note, however, that bad indexing has at least some advantages over no indexing. The back door was closed for good reasons, and at the time it was thought that a long-running project to produce search-engine-friendly renditions of pages would be finished "RSN". Also, closing the back door doesn't appear to have magically ended our recurring problem with one of the web servers going "out to lunch".

So, in the short term, we should probably re-open the front door (but keep the back door closed), and open only one front door (just www.perlmonks.org, not perlmonks.org nor www.perlmonks.net, etc.) so that we have "okay" indexing. It will be unfortunate that snippets of CB content will be indexed and cached, and that some of the features for doing more powerful searches via Google (et al.) won't be there. (My plan was to add keywords to pages so you could tell Google that you only want to search a specific section, or only for a certain author, even if that author's name is something heavily used like "grep".) But at least we'd be on the map, and "strangers" might find some of our useful content.

I'll put that near the top of my to-do list.

- tye        


Re^3: Are monks hibernating? (done)
by tye (Cardinal) on Feb 27, 2007 at 05:49 UTC

    Okay, say "hello" to our new (dynamic) /robots.txt; what you see there will depend on what hostname you use to visit PerlMonks. Compare http://perlmonks.net/robots.txt vs. http://www.perlmonks.org/robots.txt.
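
For readers who want to picture the hostname-dependent /robots.txt described above, here is a minimal sketch in Python. It is not the PerlMonks implementation, and the rules the real site serves are not given in this thread. The function name, the two rule sets, and the all-or-nothing policy are assumptions made for illustration; only the idea of a single "front door" at www.perlmonks.org comes from the discussion.

```python
# Hypothetical sketch: choose a robots.txt body from the requested hostname.
# The real PerlMonks handler and its actual rules are not shown in the thread.

# The single "front door" named in the thread.
CANONICAL_HOST = "www.perlmonks.org"

# Assumed rule sets: let spiders in through the canonical host only.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
DENY_ALL = "User-agent: *\nDisallow: /\n"


def robots_txt_for_host(host_header: str) -> str:
    """Return a robots.txt body based on the value of the Host header."""
    # Normalize: trim whitespace, lowercase, drop any port and trailing dot.
    host = host_header.strip().lower()
    host = host.split(":", 1)[0].rstrip(".")
    if host == CANONICAL_HOST:
        return ALLOW_ALL
    # Every other hostname (perlmonks.org, www.perlmonks.net, ...) is closed.
    return DENY_ALL
```

A web handler for /robots.txt would call robots_txt_for_host with the request's Host header and return the result as text/plain. A real deployment would likely also list specific paths to exclude on the canonical host rather than allowing everything.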

    Now we see how long it takes Google et al. to notice, and then wait for traffic to rise until the site becomes just annoyingly slow enough that we reach equilibrium (which, I think, explains the quite flat site traffic level before the search engine spiders were shut out).

    - tye        
