Re^4: Help update the Phalanx 100

by stvn (Monsignor)
on Dec 23, 2004 at 13:49 UTC

in reply to Re^3: Help update the Phalanx 100
in thread Help update the Phalanx 100

# Exclude downloads from agents matching this regex, because they seem to be
# related to mirroring or crawling rather than genuine downloads:
my $rx_agent_ignore = qr/
    \. google \.      |
    \. yahoo  \.      |
    \b LWP::Simple \b |
    \b MS\ Search \b  |
    \b Webmin \b      |
    \b Wget \b        |
    \b teoma \b
/x;

Markus, I may be wrong, but I think LWP::Simple is sometimes used to download modules, so excluding it would not be a good idea, even though there is a good chance such a request could also come from a spider.


Re^5: Help update the Phalanx 100
on Dec 23, 2004 at 22:21 UTC
    Thanks, stvn! I've updated the code and results accordingly.


