PerlMonks  

Re^2: Make your 404 pages smarter with metaphone matching

by Taulmarill (Deacon)
on Sep 03, 2007 at 15:27 UTC (#636741)


in reply to Re: Make your 404 pages smarter with metaphone matching
in thread Make your 404 pages smarter with metaphone matching

As far as I can see, the script filters all files by extension: everything ending in .html gets indexed, and nothing else does.


Re^3: Make your 404 pages smarter with metaphone matching
by merlyn (Sage) on Sep 04, 2007 at 02:38 UTC
    Everything with .html gets indexed, everything else isn't.
    And ... what?

    That doesn't address my concern at all. If I have a private URL that ends in ".html", it'll still likely get indexed. Then someone guesses a URL similar to that, and boom, they're in.

    A good solution would also have an additional regex or blacklist of things that should never be offered as a suggestion.
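A rough sketch of the regex/blacklist idea, written in Python only to keep it short; the same check would translate directly to the Perl script. The pattern list, the function names, and the example paths are all hypothetical and would need to be adapted to the actual site layout.

```python
import re

# Hypothetical blacklist: patterns for URL paths that must never be
# offered as a suggestion. These entries are assumptions for illustration.
NEVER_SUGGEST = [
    re.compile(r"^/private/"),   # anything under a private tree
    re.compile(r"^/admin/"),     # administrative pages
    re.compile(r"(^|/)\."),      # dot-files and dot-directories
]

def is_suggestable(url_path):
    """Return True only for .html paths that match no blacklist pattern."""
    if not url_path.endswith(".html"):
        return False
    return not any(pattern.search(url_path) for pattern in NEVER_SUGGEST)

def filter_candidates(paths):
    """Keep only the paths that are safe to index and suggest."""
    return [path for path in paths if is_suggestable(path)]
```

Applying the filter while building the index, rather than only when showing suggestions, keeps private names out of the metaphone index entirely.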

      If I have a private URL that ends in ".html", it'll still likely get indexed.

      It's not just likely; it will get indexed for sure. I don't think this is meant as a finished solution, but rather to show a general way of doing such things.
      I am afraid, however, that there will be more cutting and pasting than actual reading.
