PerlMonks  

Spam tracking

by blogan (Monk)
on Oct 05, 2002 at 16:32 UTC ( [id://203052]=CUFP )

I've seen people generate a unique e-mail address for each web page they serve, so that if one address starts getting spam they can simply block it. Instead of using a complex system to keep track of every hit, I wrote a small script that converts the visitor's IP address into a unique address with some randomness.
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw/:standard/;
use Socket;

my $domain = "example.com";

# Each 4-bit value (0-15) maps to two interchangeable characters.
my @nums = (['m', 'l'],  # 0
            ['q', 't'],  # 1
            ['x', 'd'],  # 2
            ['z', 'k'],  # 3
            ['s', 'b'],  # 4
            ['c', 'h'],  # 5
            ['r', 'n'],  # 6
            ['v', 'p'],  # 7
            ['g', 'j'],  # 8
            ['w', 'f'],  # 9
            ['a', 'e'],  # 10
            ['i', 'o'],  # 11
            ['u', '1'],  # 12
            ['2', '4'],  # 13
            ['5', '6'],  # 14
            ['9', '7']); # 15

# remote_host() may return a hostname; resolve it to a dotted quad if so.
my $host = remote_host();
if ($host !~ /^(\d{1,3}\.){3}\d{1,3}$/) {
    my $packed = gethostbyname($host);
    $host = $packed ? inet_ntoa($packed) : "";
}

# Encode each octet as two characters (high nibble, then low nibble),
# picking one of the two candidates at random each time.
my $user = "";
foreach my $octet (split /\./, $host) {
    my $high = $octet >> 4;
    my $low  = $octet & 0xF;
    $user .= $nums[$high][int rand 2];
    $user .= $nums[$low][int rand 2];
}
$user = "webmaster" if (!$user);   # fall back if the host could not be resolved

print header;
print start_html,
      "E-mail me <A HREF=\"mailto:$user\@$domain\">$user\@$domain</A>",
      end_html;
And you use the following to decode:
#!/usr/bin/perl
#use CGI qw/:standard/;
use strict;
use warnings;
use Socket;

# Must match the table in the encoder exactly.
my @nums = (['m', 'l'],  # 0
            ['q', 't'],  # 1
            ['x', 'd'],  # 2
            ['z', 'k'],  # 3
            ['s', 'b'],  # 4
            ['c', 'h'],  # 5
            ['r', 'n'],  # 6
            ['v', 'p'],  # 7
            ['g', 'j'],  # 8
            ['w', 'f'],  # 9
            ['a', 'e'],  # 10
            ['i', 'o'],  # 11
            ['u', '1'],  # 12
            ['2', '4'],  # 13
            ['5', '6'],  # 14
            ['9', '7']); # 15

# Build the reverse lookup: character -> 4-bit value.
my %backwards;
for my $i (0 .. 15) {
    $backwards{$nums[$i]->[0]} = $i;
    $backwards{$nums[$i]->[1]} = $i;
}

my $addr = shift || die "Usage: $0 address\n";
$addr =~ s/\@.*//;   # keep only the part before the @
if (length($addr) != 8) {
    print "Bad length\n";
    exit;
}

# Each pair of characters is the high and low nibble of one octet.
my @chars = split //, $addr;
my @octets;
for (my $i = 0; $i < 8; $i += 2) {
    push @octets, $backwards{$chars[$i]} << 4 | $backwards{$chars[$i+1]};
}
my $ip = join ".", @octets;
print "IP is $ip\n";

my $host = scalar gethostbyaddr(scalar inet_aton($ip), AF_INET) || "unknown";
print "Host is $host\n";
The only thing to make sure of is that @nums is identical in both programs. Since each IP can be encoded 256 (2^8) different ways, there is little chance of blacklisting everyone who shares an IP: if a spammer uses some AOL account to look at the page and someone innocent later gets the same IP address and looks at the page, they'll likely get different addresses. No need to keep a separate database.
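
As a rough check of the 2^8 figure and of the round trip, here is a minimal sketch that does the same encoding and decoding without CGI or DNS. The helper names encode_ip, decode_local and all_encodings are made up for this illustration and are not part of the scripts above; the table is the same @nums, and 192.0.2.17 is just a documentation address used as sample input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same table as in the encoder and decoder: index = 4-bit value.
my @nums = (['m','l'], ['q','t'], ['x','d'], ['z','k'],
            ['s','b'], ['c','h'], ['r','n'], ['v','p'],
            ['g','j'], ['w','f'], ['a','e'], ['i','o'],
            ['u','1'], ['2','4'], ['5','6'], ['9','7']);

# Reverse lookup: character -> 4-bit value.
my %backwards;
for my $i (0 .. $#nums) {
    $backwards{$_} = $i for @{ $nums[$i] };
}

# encode_ip (hypothetical helper): dotted quad -> 8-character local part.
# $pick returns 0 or 1 for each character; it defaults to a random choice.
sub encode_ip {
    my ($ip, $pick) = @_;
    $pick ||= sub { int rand 2 };
    my $user = '';
    for my $octet (split /\./, $ip) {
        $user .= $nums[$octet >> 4][ $pick->() ];
        $user .= $nums[$octet & 0xF][ $pick->() ];
    }
    return $user;
}

# decode_local (hypothetical helper): 8-character local part -> dotted quad.
# Returns undef for a wrong length or a character that is not in the table.
sub decode_local {
    my ($local) = @_;
    return unless length($local) == 8;
    my @chars = split //, $local;
    return if grep { !exists $backwards{$_} } @chars;
    my @octets;
    for (my $i = 0; $i < 8; $i += 2) {
        push @octets, ($backwards{$chars[$i]} << 4) | $backwards{$chars[$i+1]};
    }
    return join '.', @octets;
}

# all_encodings (hypothetical helper): every address a single IP can map to.
# The 8 bits of $mask stand in for the 8 random choices.
sub all_encodings {
    my ($ip) = @_;
    my %seen;
    for my $mask (0 .. 255) {
        my $bit = 0;
        $seen{ encode_ip($ip, sub { ($mask >> $bit++) & 1 }) } = 1;
    }
    my @list = sort keys %seen;
    return @list;
}

my $ip = '192.0.2.17';
my @variants = all_encodings($ip);
printf "%s has %d possible addresses\n", $ip, scalar @variants;

my $sample = encode_ip($ip);
printf "%s decodes to %s\n", $sample, decode_local($sample);
```

Because the two characters in every pair are distinct and no character appears twice in the table, each of the 256 choice patterns gives a different address, and all of them decode back to the same IP.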

Replies are listed 'Best First'.
Re: Spam tracking
by newrisedesigns (Curate) on Oct 06, 2002 at 15:40 UTC

    blogan++, I think I might use this for my site.

    However, why do you use CGI.pm in the decoder?

    Good post.

    John J Reiser
    newrisedesigns.com

      I had copied the decoder over from the encoder and accidentally left CGI in. Oops.
Re: Spam tracking
by cybear (Monk) on Oct 10, 2002 at 12:28 UTC
    Hate to sound "out of step" but I'm not really sure what you mean.
    Could you explain in a little more detail?

    Thanks

    - cybear

      If you just use webmaster@domain.com on your webpage, it will get harvested sooner or later. Even if it is harvested only once, it will be sold over and over again, and the e-mail going to webmaster@domain.com becomes basically junk. If you generate a unique address for each visitor and the address akhjasio@domain.com is harvested once and sold over and over, you can simply block that one address, and visitors who wrote down your address weeks ago will still have a non-blocked address. If by some chance the harvester and another visitor came to your web site from the same IP address, there's only a 1 in 256 chance that they'll get the same generated address. If you're still unsure about something, let me know what's not clear and I'll explain some more.
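
      The blocking step might look something like the sketch below. The %blocked list and the is_blocked helper are hypothetical, and how the check gets wired into a mail server or filter is left open; it only shows that blocking one generated address leaves another address for the same IP untouched.

      ```perl
      use strict;
      use warnings;

      # Hypothetical blocklist of generated local parts that now receive spam.
      my %blocked = map { $_ => 1 } qw(akhjasio);

      # is_blocked (hypothetical helper): 1 if the recipient's local part
      # is on the blocklist, 0 otherwise.
      sub is_blocked {
          my ($recipient) = @_;
          (my $local = lc $recipient) =~ s/\@.*//;
          return exists $blocked{$local} ? 1 : 0;
      }

      # Both addresses are encodings of the same IP under the table above,
      # but only the harvested one is rejected.
      for my $rcpt ('akhjasio@domain.com', 'ekcjesoi@domain.com') {
          printf "%-22s %s\n", $rcpt, is_blocked($rcpt) ? 'reject' : 'accept';
      }
      ```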
        Cool. Nice code... and not a bad idea.

        - cybear
