PerlMonks  

Dynamic Well Poisoner

by wombat (Curate)
on Oct 25, 2000 at 21:50 UTC (#38443, sourcecode)

Category: Web Stuff
Author/Contact Info wombat
Description: "Poisoning the well" is a term sometimes used to mean feeding a bot false information, either to make it crash or to devalue the rest of the data it is mining. We all hate spambots. They're the ones that scour web pages looking for little <a href=mailto:> tags. They take down your email address, and some spammer then sells it on a "100,000,000 VALID WORKING ADDRESSES" CD-ROM to other unpleasant people. You can all help stop these evil robots by putting up a Poisoned Well on your homepage.

Basically it's a list of randomly generated email addresses that look valid (because they're made of dictionary words) and have valid DNS entries, but then bounce when the spammers try to send mail to them. This program has a twofold purpose. Older Poison Wells just generate [a-z]{random}@[a-z]{random}.[com org net]. This one takes its domains from a list that you specify. Thus, you can put domains you don't like in the list and cause THEM to bear the burden of sending back lots of bounced messages. Any spambot worth its silicon would check whether an address has a valid domain, and these addresses pass that check.
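For contrast, here is a rough sketch of what an older-style Poison Well does: random letters on both sides of the @ and one of three top-level domains. The sub names and the 3 to 10 letter lengths are my own assumptions for illustration, not taken from any particular well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @tlds = qw(com org net);

# Return a string of $len random lowercase letters.
sub random_letters {
    my ($len) = @_;
    return join '', map { chr(ord('a') + int(rand(26))) } 1 .. $len;
}

# Build one address shaped like [a-z]{n}@[a-z]{m}.(com|org|net).
# The 3..10 letter range is an assumption made for this sketch.
sub old_style_address {
    my $user = random_letters(3 + int(rand(8)));
    my $host = random_letters(3 + int(rand(8)));
    return $user . '@' . $host . '.' . $tlds[ rand @tlds ];
}

print old_style_address(), "\n" for 1 .. 5;
```

Addresses like these rarely point at a registered domain, which is exactly what a domain check filters out.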

For my list of evil domains, I've used the six top generators of banner ads, especially the ones that are suspected of selling personal data. >:-D

Some of the amusing email addys that this generated were:
  • Colanderwax@someregistereddomain.com
  • JesusRedmond@someregistereddomain.com
  • crudbedbug@someregistereddomain.com
  • tyingbabies@someregistereddomain.com
  • leekchecker@someregistereddomain.com
  • hottrousers@someregistereddomain.com
#!/usr/bin/perl
use strict;
use warnings;

print "Content-type: text/html\n\n";
print "<html><head><title>A list of all my best friends email addresses!</title></head>\n";
print "<body>\n";
# Include whatever other page fluff you want here.

open(my $dict, '<', '/usr/dict/words')
    or die "Canna open zee dictionary file: $!\n";
open(my $dom, '<', '/home/httpd/cgi-lib/dominia')
    or die "Canna open zee domain file: $!\n";
# dominia is just a list of (evil) domains, one per line.

chomp(my @domlist = <$dom>);
close $dom;
die "No domains in zee domain file!\n" unless @domlist;

# Size of the dictionary in bytes, read from the file itself
# (this was hardcoded to 409070 for my dictionary).
my $dictsize = -s $dict;

# Seek to a random byte offset, discard the (probably partial) line
# found there, and return the next whole line if it is short enough.
sub random_word {
    my ($maxlen) = @_;
    while (1) {
        seek $dict, int(rand($dictsize)), 0;
        my $discard = <$dict>;
        my $in      = <$dict>;
        next unless defined $in;    # landed in the last line; try again
        chomp $in;
        return $in if length($in) >= 1 && length($in) <= $maxlen;
    }
}

print "<table>\n<tr>";
for my $i (0 .. 255) {
    # First part is at most 5 characters, second part at most 7.
    my $addr = random_word(5) . random_word(7) . '@' . $domlist[ rand @domlist ];
    print "<td><a href=\"mailto:$addr\">$addr</a></td>";
    print "</tr>\n<tr>" if $i % 3 == 2;    # three addresses per row
}
print "</tr>\n</table>\n";
print "</body></html>\n";

close $dict;
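
The random-seek trick the script uses to pull words out of the dictionary can be tried in isolation. This is a minimal sketch, assuming a tiny in-memory word list in place of /usr/dict/words; the names pick_word and $words are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the dictionary file: one word per line, held in memory.
my $words = join '', map { "$_\n" } qw(ant bee colander crud hot leek trousers wax);
open(my $fh, '<', \$words) or die "Cannot open in-memory word list: $!\n";

# Jump to a random byte, throw away the rest of the line we landed in,
# then take the next full line. Retry if we ran off the end of the file
# or the word is longer than $maxlen.
sub pick_word {
    my ($handle, $size, $maxlen) = @_;
    while (1) {
        seek($handle, int(rand($size)), 0);
        my $discard = <$handle>;
        my $word    = <$handle>;
        next unless defined $word;
        chomp $word;
        return $word if length($word) <= $maxlen;
    }
}

print pick_word($fh, length($words), 5), pick_word($fh, length($words), 7), "\n";
```

Two side effects are worth knowing about: the first word in the file is never picked, because the line reached by the seek is always discarded, and a word that follows a long line gets picked more often than one that follows a short line. For a poisoned well, neither matters much.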

RE: Dynamic Well Poisoner
by jepri (Parson) on Oct 26, 2000 at 12:10 UTC
    A good way to install this is to use SSI to invoke your script. This makes the page appear to be normal HTML. Use Apache's XBitHack directive and set the execute bit so that your booby-trapped HTML page gets parsed. This prevents spambots from identifying your program as a CGI script and avoiding it.

    This way the spambot will see something like "myfriends.html" rather than "script.pl" or "script.shtml".

    The complete solution will need these steps:
    Make your my_friends_emails.html something like this:

    <html> <!--#exec cgi="/cgi-bin/script.pl" --> </html>
    In your Apache config file, set Includes to on
    * Options +Includes
    Set xbithack to on
    * XBitHack on

    Now on the command line
    Set your page to execute status (the xbithack)
    * chmod ug+x my_friends_emails.html

    Reload Apache and wait. For more amusement, examine the server logs to see how many spambots you trapped.

    Update: I forgot to add that you have to load mod_include. For Debian users, it's in your /etc/apache/httpd.conf file: uncomment the appropriate LoadModule line. Set up the handlers in srm.conf. The Options +Includes line can be found in access.conf.

    ____________________
    Jeremy

Re: Dynamic Well Poisoner
by wombat (Curate) on Nov 27, 2000 at 08:39 UTC
    For the record, SSI is insecure. Letting Apache execute .shtml content in a .html file is not very bright at all. If you go to my homepage, you can see I've managed to get it to appear as a .html file for the sake of the bots, but without using XBitHack or SSI. :-) My way is secure as far as I can tell. You may now begin to speculate on my methods.

    ~W
      You can also use ForceType application/x-httpd-cgi (or SetHandler cgi-script) with Apache.
      In this case we have a simple (executable) CGI script, called my_friends.html or anything we want ;-)

      Greetz, Tom.
