Re: Duplicate Randoms with SSI
by perrin (Chancellor) on Oct 26, 2007 at 16:23 UTC
Here's an idea. First, use an algorithm rather than a random selection. Take some inputs like the current time and the client's IP address and apply something like a hashing algorithm to generate a number from them. Use that number as an index into your file, wrapping as necessary. Then, add offsets to the SSI calls, e.g. /my/cgi.pl?offset=1, /my/cgi.pl?offset=2, etc. In each call, add this offset number to your computed index. If showing the ads in sequence is a problem, multiply the offset by something, but keep it small enough to avoid wrapping around and possibly showing the first one again.
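That scheme can be sketched roughly like this (a sketch only; the one-minute bucket, the x3 multiplier, and the use of REMOTE_ADDR are illustrative assumptions, not perrin's exact values):

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# deterministic index: hash a time bucket plus the client address,
# then add a scaled per-call offset, wrapping around the ad count
sub ad_index {
    my ($n_ads, $offset) = @_;
    my $bucket = int(time() / 60);   # every call in the same minute agrees
    my $seed   = unpack 'N', md5($bucket . ($ENV{REMOTE_ADDR} || ''));
    return ($seed + 3 * $offset) % $n_ads;   # keep 3 * max offset < $n_ads
}

# e.g. /my/cgi.pl?offset=1 and /my/cgi.pl?offset=2 land on different ads
print ad_index(10, $_), "\n" for 1 .. 3;
```

Because every SSI call on the page computes the same seed, the offsets alone decide which distinct ads appear, with no shared state at all.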
Re: Duplicate Randoms with SSI
by amarquis (Curate) on Oct 26, 2007 at 14:28 UTC
I cannot think of an elegant solution to your question, but have you considered dropping the SSI and just going with a full Perl solution? You can add a handler so that the server knows to run your .html files through the Perl interpreter, so you don't have to change the URLs or anything (there are other ways around this as well). It seems easier to have one script do everything than to put together some method of saving state between SSI calls.
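For reference, the handler mapping might look something like this in httpd.conf (a sketch only, using mod_perl 2 syntax; `My::PageHandler` is a hypothetical module name, and the exact directives vary by Apache and mod_perl version):

```apache
# hand .html files to a mod_perl handler instead of the default one;
# URLs stay exactly as they are today
<FilesMatch "\.html$">
    SetHandler perl-script
    PerlResponseHandler My::PageHandler   # hypothetical handler module
</FilesMatch>
```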
I have considered it, but part of the content is legacy Perl applications and a couple of purchased PHP apps; it's your typical 10-year-old site, a little of this, some of that. It is far from elegant, and the number of effort gnomes required to convert the entire site to Perl is way beyond the number I have.
Re: Duplicate Randoms with SSI
by Rhandom (Curate) on Oct 26, 2007 at 16:24 UTC
Each of those SSI calls forks off another process, which has no way to communicate back up to the parent process that something has happened.
You do have something else that could potentially work. It has been long enough since I last used SSI that I don't remember whether REMOTE_ADDR, HTTP_REFERER, or REMOTE_USER is set inside an SSI process - I think they should be (REMOTE_USER would only be set in an htauthed area). As long as even one of those is set, you can use a solution similar to the following. It isn't 100% accurate, but it should be good enough:
use Cache::Memcached;
use Digest::MD5 qw(md5_hex);

sub get_random {
    my ($pick_list, $unique_key) = @_;

    # add uniqueness to our key
    $unique_key .= $ENV{'REMOTE_USER'}
                || $ENV{'REMOTE_ADDR'}   # less reliable
                || $ENV{'HTTP_REFERER'}  # least reliable
                || '';                   # no per-visitor info available
    $unique_key = md5_hex($unique_key);

    # this solution requires a memcached server
    # or some other cache that handles volatility
    my $mem = Cache::Memcached->new({servers => ['localhost:11211']});

    # see what has already been used
    my $used = $mem->get($unique_key) || [];
    $used = [] if @$used >= @$pick_list;    # reset when full
    my %used  = map { $_ => 1 } @$used;
    my @avail = grep { !$used{$_} } 0 .. $#$pick_list;

    # pick a random item and add it to the list of used items
    my $index = $avail[ rand @avail ];
    push @$used, $index;
    $mem->set($unique_key, $used);

    return $pick_list->[$index];
}

# use it like this
my @items = ("http://foo", "http://bar", "http://baz");
my $page  = 'pagefoo';
print get_random(\@items, $page), "\n";
print get_random(\@items, $page), "\n";
print get_random(\@items, $page), "\n";

# it will automatically reset
print "Reset\n";
print get_random(\@items, $page), "\n";
print get_random(\@items, $page), "\n";
print get_random(\@items, $page), "\n";

__END__
Prints something like:
http://bar
http://baz
http://foo
Reset
http://baz
http://foo
http://bar
You should note that I have used memcached here. My reasoning is that you can allocate a small, localized chunk of memory for this very temporary, very dynamic data, and you can insert entries into memcached and then forget about them. Old and unused entries are automatically dropped as new entries use up the available memcached space. For the use you have described, a memcached daemon with only 1MB of allocation should be sufficient.
Oh - and this solution should add very little overhead to your process.
my @a=qw(random brilliant braindead); print $a[rand(@a)];
Re: Duplicate Randoms with SSI
by dwm042 (Priest) on Oct 26, 2007 at 17:00 UTC
Truly random sequences will contain stretches with repeats. Therefore, you don't actually want truly random numbers. You want a nonrandom function whose period is as long as the number of ads you serve and whose values map one-to-one to your ads over that period. The piece of code that generates this function has to be persistent.
By way of example, if you have 10 ads, you could start with an array of the numbers 1 through 10 and shuffle it. Each number maps to an ad, and your persistent piece hands the required content to the display code. When it reaches the end of the array it starts over, and after serving some number of ads it reshuffles the array of ad-order (so that people don't see too obvious a pattern).
Your persistent piece could be a separate process, using sockets or a FIFO for communication, or simply a file containing an index that you read, increment, and write back after each use.
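A minimal sketch of the file-backed variant (the state file name and the reshuffle-when-exhausted policy are illustrative assumptions, not a definitive implementation):

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use List::Util qw(shuffle);

# shuffled round-robin: the remaining order lives in a tiny state file,
# so independent SSI/CGI processes can share it
sub next_ad {
    my ($ads, $state_file) = @_;
    open my $fh, '+<', $state_file
        or open $fh, '+>', $state_file
        or die "can't open $state_file: $!";
    flock $fh, LOCK_EX;                    # guard against concurrent requests
    my $line  = <$fh>;
    my @order = split ' ', defined $line ? $line : '';
    @order = grep { $_ < @$ads } @order;   # drop stale indexes if the list shrank
    @order = shuffle(0 .. $#$ads) unless @order;   # new period: reshuffle
    my $index = shift @order;
    seek $fh, 0, 0;
    truncate $fh, 0;
    print $fh "@order\n";
    close $fh;                             # releases the lock
    return $ads->[$index];
}

my @ads = ('left1.html', 'left2.html', 'left3.html');
print next_ad(\@ads, '/tmp/ad_order.idx'), "\n" for 1 .. 6;
```

The `grep { $_ < @$ads }` line also quietly handles the case where ads are added or removed between periods.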
I have considered having sets of block numbers to pull, indexed on the current seconds (instead of on IPs and ENV variables as previously suggested). But I add and delete blocks anywhere from daily to a couple of times a month, and the number of blocks in the text file can inflate or deflate by 15-20 per change. Having to constantly update my chain of numbers seems to defeat the reason to have SSI.
Re: Duplicate Randoms with SSI
by duff (Parson) on Oct 26, 2007 at 18:28 UTC
Crazy idea: Make an image server daemon and make your SSI calls to a light-weight client that just requests an image from the daemon. The daemon would keep track of the last 5 (for instance) images that it just served and know not to serve them again.
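The "last 5" bookkeeping such a daemon would need is small; here is a sketch of just that part, independent of the socket plumbing (function and variable names are illustrative, and the window size of 5 is just duff's example):

```perl
use strict;
use warnings;

# pick an image that is not among the last $n served, and record it
sub pick_fresh {
    my ($images, $recent, $n) = @_;
    my %recent = map { $_ => 1 } @$recent;
    my @avail  = grep { !$recent{$_} } @$images;
    @avail = @$images unless @avail;       # safety valve: $n >= list size
    my $pick = $avail[ rand @avail ];
    push @$recent, $pick;
    shift @$recent while @$recent > $n;    # keep only the last $n
    return $pick;
}

my @images = map { "ad$_.png" } 1 .. 8;
my @recent;                                # the daemon keeps this across requests
print pick_fresh(\@images, \@recent, 5), "\n" for 1 .. 10;
```

Because the daemon is a single long-running process, `@recent` is ordinary in-memory state; no file or cache locking is needed.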
Not sure it is crazy; I have thought about implementing this with mod_perl so it is constantly running. With the quantity of visitors I get, I would need to keep a list of IP addresses or session IDs or something to track what went to each user. I think what you are suggesting is not too far off from that.
Re: Duplicate Randoms with SSI
by Krambambuli (Curate) on Oct 26, 2007 at 15:50 UTC
I'm not sure I really understand how things are working, so my idea might be totally wrong.
I'm thinking about introducing (text, md5_checksum) pairs into the equation: give the Perl script the already-served md5 checksums as arguments, and have it return not only the text but also the checksum that goes with it.
Calculating md5 checksums isn't cheap, but it isn't really expensive either - so maybe it might work?
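A minimal sketch of that idea (assuming the already-served checksums are passed to the script as arguments and threaded from one SSI call to the next; all names are illustrative):

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# return a (text, checksum) pair whose checksum is not in the list
# of checksums the previous SSI calls already produced
sub pick_unseen {
    my ($blocks, $seen) = @_;
    my %seen  = map { $_ => 1 } @$seen;
    my @avail = grep { !$seen{ md5_hex($_) } } @$blocks;
    @avail = @$blocks unless @avail;       # everything seen: start over
    my $text = $avail[ rand @avail ];
    return ($text, md5_hex($text));        # caller passes the checksum onward
}

my @blocks = ('ad text one', 'ad text two', 'ad text three');
my @seen;
for (1 .. 3) {
    my ($text, $sum) = pick_unseen(\@blocks, \@seen);
    push @seen, $sum;
    print "$text\n";
}
```

One nice property: the checksums identify the text itself, so the scheme keeps working even when blocks are added to or removed from the file between calls.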
Re: Duplicate Randoms with SSI
by Aim9b (Monk) on Oct 26, 2007 at 17:46 UTC
Would it be possible to determine ahead of time how many ads you need to display, then retrieve them all in a single access? They wouldn't necessarily need to be in sequence. Just a thought.
I have considered this: one call that would create the maximum number of blocks, each as its own -div-, and then place the -div- blocks. I don't know if I'd run into CSS issues, since a left-column block needs to be formatted differently than a right-column or center-column block. I may have to ask a CSS whiz how to get it done correctly.
Re: Duplicate Randoms with SSI
by jethro (Monsignor) on Oct 27, 2007 at 22:34 UTC
Lowest-tech solution:
Shuffle the ads into 5 directories. Get the first block on the page out of the first dir, and so on.
Have a cron job either rename the dirs in round-robin fashion or shuffle the ads between the dirs every 15 minutes. This makes sure that in the long run all the ads have an equal chance to show up.
BUT: both would need a locking mechanism, so that the SSI is not reading at the moment of the renaming or shuffling.
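The rename variant could be as small as this cron-driven script (a sketch only; the demo setup at the top just creates five ad directories, which in real use would already exist under the document root, and flock(1) provides the locking; all names are illustrative):

```shell
#!/bin/sh
# demo setup: five ad directories, each holding one banner
AD_ROOT=${AD_ROOT:-./ad_demo}
mkdir -p "$AD_ROOT"
cd "$AD_ROOT" || exit 1
for i in 1 2 3 4 5; do
    [ -d "ads$i" ] || { mkdir "ads$i"; touch "ads$i/banner$i.gif"; }
done

# the cron job proper: rotate one step, ads1 <- ads2 <- ... <- ads5 <- ads1;
# flock(1) serializes this against anything else taking the same lock
flock ./ad_rotate.lock sh -c '
    mv ads1 ads_tmp &&
    mv ads2 ads1 &&
    mv ads3 ads2 &&
    mv ads4 ads3 &&
    mv ads5 ads4 &&
    mv ads_tmp ads5
'
```

The SSI side then only ever references fixed paths like ads1/, ads2/, etc., and the cron job changes what those paths contain.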