Dynamic Well Poisoner

by wombat (Curate)
on Oct 25, 2000 at 21:50 UTC (#38443)
Category: Web Stuff
Author/Contact Info wombat
Description: "Poisoning the well" is a term sometimes used to mean feeding a bot false information, either to make it crash or to devalue the rest of the data it is mining. We all hate spambots. They're the ones that scour webpages looking for little <a href=mailto:> tags. They take down your email address, and some spammer then sells it on a "100,000,000 VALID WORKING ADDRESSES" CD-ROM to other unpleasant people. You can help stop these evil robots by putting up a Poisoned Well on your homepage.

Basically it's a list of randomly generated email addresses that look valid (because they're made of dictionary words) and have valid DNS entries, but bounce when the spammers try to send mail to them. This program has a twofold purpose. Older Poison Wells just generate [a-z]{random}@[a-z]{random}.[com org net]. This one takes its domains from a list that you specify. Thus, you can put domains you don't like in the list and make THEM carry the burden of sending back lots of bounced messages. Any spambot worth its silicon would check whether an address has a valid domain, and using real domains gets past that check.
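
For contrast, the older style of poison well mentioned above can be sketched in a few lines of Perl. This is only an illustration of the [a-z]{random}@[a-z]{random}.[com org net] pattern, not code from any particular well. The sub names random_letters and old_style_address and the 3 to 10 letter length range are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: a string of $len random lowercase letters.
sub random_letters {
    my ($len) = @_;
    return join '', map { chr(ord('a') + int rand 26) } 1 .. $len;
}

# Hypothetical helper: one old-style address, with random letters on
# both sides of the @ and a TLD picked from com/org/net.
# The 3..10 letter range is an assumption for this sketch.
sub old_style_address {
    my @tlds = qw(com org net);
    my $user = random_letters(3 + int rand 8);
    my $host = random_letters(3 + int rand 8);
    return "$user\@$host." . $tlds[int rand @tlds];
}

# Print five sample addresses when run as a script.
unless (caller) {
    print old_style_address(), "\n" for 1 .. 5;
}
```

Hosts built this way will mostly not resolve, which is the weakness the domain list is meant to fix.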

For my list of evil domains, I've put in the six top generators of banner ads, especially the ones that are suspected of selling personal data. >:-D

Some of the amusing email addys that this generated were
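#!/usr/bin/perl
use strict;
use warnings;
use POSIX;  #for floor()

print "Content-type: text/html\n\n";
print "<html><head><title>A list of all my best friends email addresses</title></head><body>\n";
#Include whatever other page fluff you want here.

open (DICT, "/usr/dict/words") or die "Canna open zee dictionary file!\n";
open (DOM, "/home/httpd/cgi-lib/dominia") or die "Canna open zee domain file!\n";
#dominia is just a list of (evil)domains, one per line.

my (@domlist, @first, @last);
while (<DOM>)
{
  chomp;
  push @domlist, $_ if length;
}
my $domcount = @domlist;

#This is the size of my dictionary in bytes. -s reads it from the file,
#so there is no number to adjust.
my $dictsize = -s DICT;

my ($randloc, $discard, $in);
for (0..255)
{
   #Seek to a random byte, throw away the (probably partial) line we
   #landed in, and keep the next whole line if it is short enough.
   do {
     $randloc = floor(rand($dictsize));
     seek DICT, $randloc,0;
     $discard = <DICT>;
     $in = <DICT>; $in = "" unless defined $in; chomp $in;
   } while ((length $in)>5 or (length $in)==0);
   push @first, $in;
   do {
     $randloc = floor(rand($dictsize));
     seek DICT, $randloc,0;
     $discard = <DICT>;
     $in = <DICT>; $in = "" unless defined $in; chomp $in;
   } while ((length $in)>7 or (length $in)==0);
   push @last, $in;
}

print "<table><tr>";
for (0..255)
{
  my $domseek = floor(rand($domcount));
  my $addr = "$first[$_]$last[$_]\@$domlist[$domseek]";
  print "<td>";
  print "<a href=mailto:$addr>$addr</a>";
  print "</td>";
  if ($_ % 3 == 2) {print "</tr>\n<tr>";}
}
print "</tr></table></body></html>\n";
close DICT;
close DOM;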
Replies are listed 'Best First'.
RE: Dynamic Well Poisoner
by jepri (Parson) on Oct 26, 2000 at 12:10 UTC
    A good way to install this is to use SSI to invoke your script. This makes the page appear to be normal HTML. Use the 'XBitHack' directive for Apache and set the execute bit so that your booby-trapped HTML page gets parsed. This keeps spambots from identifying your program as a CGI script and avoiding it.

    This way the spambot will see something like "myfriends.html" rather than "" or "script.shtml".

    The complete solution will need these steps:
    Make your my_friends_emails.html something like this:

    <html> <!--#exec cgi="/cgi-bin/" --> </html>
    In your Apache config file, set Includes to on
    * Options +Includes
    Set xbithack to on
    * XBitHack on

    Now on the command line
    Set your page to execute status (the xbithack)
    * chmod ug+x my_friends_emails.html

    Reload Apache and wait. For more amusement, examine the server logs to see how many spambots you trapped.

    Update: I forgot to add that you have to load mod_include. For Debian users, it's in your /etc/apache/httpd.conf file. Uncomment the appropriate LoadModule line. Set up the Handlers in srm.conf. Also, the Options +Includes line can be found in access.conf.
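
Pulling the directives from the steps above into one place, a minimal sketch might look like the following. The directory path is a placeholder, not something from the post, and where the lines belong (httpd.conf, access.conf, or a .htaccess file) depends on your installation.

```apache
# Hypothetical directory; substitute wherever my_friends_emails.html lives.
<Directory "/var/www/html">
    # Full Includes is needed here: IncludesNOEXEC would parse the page
    # but refuse to run the #exec element.
    Options +Includes
    # Parse any text/html file that has its execute bit set.
    XBitHack on
</Directory>
```

With this in place, only the pages you have run chmod ug+x on are parsed for SSI. Other .html files in the directory are served untouched.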


Re: Dynamic Well Poisoner
by wombat (Curate) on Nov 27, 2000 at 08:39 UTC
    For the record, SSI is insecure. Making Apache run with the ability to execute .shtml content in a .html file is not very bright at all. If you go to my homepage, you can see I've managed to get it to appear as a .html file for the sake of the bots, but not by using the XBitHack or SSI. :-) My way is secure as far as I can tell. You may now begin to speculate on my methods.

      You can also use ForceType cgi-script with Apache.
      In this case we have a simple (executable) CGI script, called my_friends.html or anything we want ;-)
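
As a rough sketch of that idea, the fragment below tells Apache to run one specific file as a CGI script even though its name ends in .html. It uses SetHandler cgi-script rather than the ForceType spelling above, because cgi-script is a handler name. The file name comes from the reply, and everything else is an assumption about a typical setup.

```apache
# Assumes the script is saved as my_friends.html, has its execute bit
# set, and starts with a #! line pointing at perl.
<Files "my_friends.html">
    Options +ExecCGI
    SetHandler cgi-script
</Files>
```

Scoping the handler to a single file leaves every other .html page in the directory served as plain HTML.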

      Greetz, Tom.