PerlMonks  

Ultimate anti-leech, anti-proxy, anti-bot, CAPTCHA works, link does not (code included)

by taint (Chaplain)
on Apr 19, 2013 at 18:28 UTC ( #1029563=perlquestion )
taint has asked for the wisdom of the Perl Monks concerning the following question:

Greetings fellow perl lovers,
I host a fair number of larger files.
They are meant for those who actually want to use them --
not to "enhance" someone else's web page, or end up being
downloaded 50 times/day by every BOT known to the internet.
So, I've been on what seems an endless quest to find the
Ultimate anti-leech, anti-hotlink, anti-bot, anti-proxy solution (Google
proxies files, and in some cases even manipulates the results --
PDFs, for example).
Well, I've built/tried download scripts that hide the source location,
those that prevent leeching (easy to fool), and what seems
a myriad of other possible solutions.
But sadly, none completely fit the bill.
So last night I performed a search here at perlmonks
using the string CAPTCHA. I finally landed on what looked like
it might be a great starting point to build the ultimate solution.
The monks thread I'm referring to is here.
The code I chose was from an external link suggested by a fellow monk in that thread.
I made a few modifications, and "ran it up the flag pole" to see how it
would fly. The CAPTCHA && session routine(s) worked flawlessly.
My modification(s) also seemed to work -- save one; the link
to the file was produced, but when clicked, it returns the file's raw contents
(it's a bzip2(1) archive). Among other modules, the script uses the CGI(3)
(CGI.pm) perl module. My guess is I need to send a header/content-type
rather than an <a href> link, which isn't what I was hoping for.
Here's the source I'm using:
#!/usr/bin/perl -w
use strict;
$|++;

use CGI qw(:all);
use Cache::FileCache;

my $cache = Cache::FileCache->new
  ({namespace => 'antirobot',
    username => 'nobody',
    default_expires_in => '10 minutes',
    auto_purge_interval => '1 hour',
   });

if (length (my $info = path_info())) { # I am the image
  my ($session) = $info =~ m{\A/([0-9a-f]+)\.png\z}i
    or do {
      warn("bad URL $info");
      print header(-status => '404 Not Found');
      exit 0;
    };
  defined(my $verify = $cache->get($session))
    or do {
      warn("Cannot find $session");
      print header(-status => '404 Not Found');
      exit 0;
    };
  ## make up an image from the verify string
  require GD;
  my $font = GD::gdGiantFont();
  my $image = GD::Image->new(2 + $font->width * length $verify, 2 + $font->height);
  my $background = $image->colorAllocate(0,0,0);
  ## $image->transparent($background);
  my $ink = $image->colorAllocate(255,255,255);
  $image->string($font, 1, 1, $verify, $ink);
  print header('image/png'), $image->png;
  exit 0;
}

print header,
  start_html(-encoding=>'utf-8',-title=>'File download'),
  h1("File download");

if (defined(my $verify = param('verify'))) {
  Delete('verify');
  if (defined (my $session = param('session'))) {
    Delete('session');
    if (defined (my $validate = $cache->get($session))) {
      $cache->remove($session); # one chance is all you get
      if ($validate eq $verify) { # success!
        ## would save param('flavor') here
        print h2("You're human!"),
          p("Please use <a href=\"./_/file-name.tbz2\">this temporary link</a> to download Filename."),
          p("<a href=\"/man/?query=md5\">MD5</a>: (filename-0.6.iso) = 08f4fb31b1a33e126d1a1aa9315cb207"),
          end_html;
        exit 0;
      }
      print p("Sorry, please reenter the security string exactly as shown!");
    }
  }
}

my $verify = do {
  my @charset = grep !/[10joli]/i, 0..9, 'a'..'z', 'A'..'Z';
  join "", map { $charset[rand @charset] } 1..8;
};

my $session = do {
  require MD5;
  MD5->hexhash(MD5->hexhash(time.{}.rand().$$));
};
param('session', $session);
$cache->set($session, $verify);

print hr, startform;
print h3("You must first prove that you are human (not a bot)");
print p("Please choose your favorite color:");
print radio_group(-name => "flavor",
                  -values => [qw(None Other Purple Green Orange)],
                  -default => "None",
                  -columns => 1);
print p("then enter this verification string:",
        img({src => url()."/$session.png"}).":",
        textfield(-name => "verify")." (CasE sEnSitiVe)");
print hidden('session');
print submit(-name=>'continue'), endform, hr;
print end_html;
I could modify it to post a header of:
header(-type=>'application/x-bzip2');
or an HTTP 301, via Location:
print "Status: 301 Moved Permanently\n";
print "Location: http://host.domain.tld/filename.tbz2\n\n";
or some such thing, but this isn't quite what I'd hoped for.
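
For reference, the header approach mentioned above could look roughly like the sketch below. It is only an illustration, not part of the script: the helper name send_archive and the download name are made up, and the headers are written by hand so the block runs without CGI.pm. The script would call it in place of printing the HTML page once the CAPTCHA check passes.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): write the response
# headers and then the raw archive bytes to the filehandle $out.
# Returns 1 on success, 0 if the file cannot be opened.
sub send_archive {
    my ($out, $path, $download_name) = @_;

    open my $in, '<:raw', $path or return 0;
    binmode $out;                       # no newline translation on output

    my $size = -s $in;
    print {$out} "Content-Type: application/x-bzip2\r\n",
                 "Content-Disposition: attachment; filename=\"$download_name\"\r\n",
                 "Content-Length: $size\r\n\r\n";

    # Copy in chunks so a large archive is never held in memory at once.
    while (read $in, my $buf, 65536) {
        print {$out} $buf;
    }
    close $in;
    return 1;
}
```

In a CGI context this would be called as send_archive(\*STDOUT, $real_path, 'file-name.tbz2'), where $real_path is a location outside the document root. The real file never gets a public URL this way, at the cost of tying up a CGI process for the length of each download.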
I think this script is an ideal start for an ultimate solution.
I think this could/would be perfect, if I could create a temporary symlink(2)
to the actual file, much as the session image is created for the CAPTCHA.
Is this possible? Any ideas how to do it?

Thank you for all your time and consideration.

--chris

#!/usr/bin/perl -Tw
use perl::always;
my $perl_version = "5.12.4";
print $perl_version;

Re: Ultimate anti-leech, anti-proxy, anti-bot, CAPTCHA works, link does not (code included)
by taint (Chaplain) on Apr 19, 2013 at 20:35 UTC
    Hmmm, I'm currently looking at: File::Temp (Temp.pm) as a possible solution.
    Is this the best (or good) approach?
    Thanks.
Re: Ultimate anti-leech, anti-proxy, anti-bot, CAPTCHA works, link does not (code included)
by InfiniteSilence (Curate) on Apr 19, 2013 at 21:18 UTC
    Try copying your specialty file from a non-public location to a public one doing something like this:
    perl -e 'use Time::HiRes qw|gettimeofday|; my ($sec,$microsec) = gettimeofday(); if (-f qq|foo.file|){`cp foo.file $sec$microsec\.file`} '
    I would add some kind of session information to the filename so it could only be created by a logged-in user and then have a daemon that reaps files that were downloaded or are > 1 day old.

    Celebrate Intellectual Diversity

      Greetings InfiniteSilence, and thank you for your reply.
      Yes, this is the type of thing I'm looking towards right now.
      This is why I was looking at File::Temp, as opposed to performing the
      operation manually (as you suggested), because that would require additional
      "housekeeping" (deleting the files manually via a cron(8) job), whereas File::Temp
      appears to automatically remove (unlink) the file after the download has completed.
      But as I have never used it before, I'm still trying to figure out how to do that.
      Copying/associating the session hash to the <tempfilename> might be handy too.
      I could really do with a couple of examples tho. But, for now, I'm just reading, reading, reading. :)
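
On the File::Temp question, one thing worth checking before building on it: as far as its documentation goes, File::Temp removes a file when the object is destroyed or the program exits, not when a download finishes. A CGI script exits long before the visitor clicks the link. The sketch below only demonstrates that behaviour; the scratch directory stands in for a real download directory, and the .tbz2 suffix is just an example.

```perl
use strict;
use warnings;
use File::Temp ();

# A scratch directory stands in for a (hypothetical) public download
# directory; it is removed automatically when the program exits.
my $dir = File::Temp->newdir();

my $kept_name;
{
    # UNLINK => 0: the file stays on disk after the object goes away,
    # so something else (cron, a daemon) has to clean it up later.
    my $tmp = File::Temp->new(DIR => $dir->dirname, SUFFIX => '.tbz2', UNLINK => 0);
    $kept_name = $tmp->filename;
}

my $gone_name;
{
    # Default behaviour (UNLINK => 1): the file is unlinked as soon as
    # the object is destroyed, here at the end of this block.
    my $tmp = File::Temp->new(DIR => $dir->dirname, SUFFIX => '.tbz2');
    $gone_name = $tmp->filename;
}

print "UNLINK => 0 file still exists: ", (-e $kept_name ? "yes" : "no"), "\n";
print "default file still exists:     ", (-e $gone_name ? "yes" : "no"), "\n";
```

So File::Temp is handy for generating unique, hard-to-guess names, but for files that must outlive the CGI request some separate housekeeping still seems unavoidable.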

      Thanks again, for taking the time to respond InfiniteSilence!

      --chris


        Also, rather than copying the files -- which, as you say, are large, so copying could be prohibitive -- create symbolic links to them in a public "downloads" directory using your date/time/session/ip/userid/whatever in the name, and then have a background daemon that scans that directory once a minute or hour or 4 hours, and removes the links that have expired.
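
A minimal sketch of the symlink-plus-reaper idea described above, assuming the session id from the CAPTCHA script is used as the link name. The sub names make_link and reap_links, the .tbz2 suffix, and the directory layout are illustrative choices, not anything from the thread.

```perl
use strict;
use warnings;
use File::Spec;

# Create $public_dir/$token.tbz2 as a symlink to the real archive.
# Returns the link path, or nothing if the token looks wrong or
# the symlink cannot be created.
sub make_link {
    my ($public_dir, $target, $token) = @_;
    return unless $token =~ /\A[0-9a-f]+\z/i;   # hex only, like the session id
    my $link = File::Spec->catfile($public_dir, "$token.tbz2");
    symlink $target, $link or return;
    return $link;
}

# Remove symlinks in $public_dir whose own mtime (via lstat, so the
# link itself rather than its target) is more than $max_age seconds old.
# Returns the number of links removed.
sub reap_links {
    my ($public_dir, $max_age) = @_;
    my $removed = 0;
    opendir my $dh, $public_dir or return 0;
    for my $name (readdir $dh) {
        my $path = File::Spec->catfile($public_dir, $name);
        next unless -l $path;                   # only touch symlinks
        my $mtime = (lstat $path)[9];
        next unless defined $mtime;
        if (time - $mtime > $max_age) {
            $removed++ if unlink $path;
        }
    }
    closedir $dh;
    return $removed;
}
```

The CGI script would call make_link() on success and print the resulting URL, while a cron job or daemon calls something like reap_links($public_dir, 600) every few minutes. The web server has to be allowed to follow symlinks in that directory (under Apache, Options FollowSymLinks), and a link stays valid for anyone who learns its name until it is reaped.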


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
