Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

Re^2: A distributed design challenge (scale)

by tye (Sage)
on Jun 29, 2012 at 20:25 UTC ( #979176=note: print w/replies, xml ) Need Help??

in reply to Re: A distributed design challenge
in thread A distributed design challenge

I agree (except for the minor confusion with some numbers in the last paragraph).

Bids probably need to be subtracted from a budget in Redis. You might have a separate "budget" that includes bids while keeping the budget that only includes payments for winning bids.

You might also hedge and say that each bid only has a (say) 60% chance of winning so bidding $50 subtracts only $30 from the "budget less outstanding bids", realizing that this risks going over the budget if you win more than an average of 60% of bids (weighted by bid size).

There are several complications of this approach:

  1. Each bid must talk to the single Redis master
    • This prevents horizontal scaling of Redis capacity
    • This causes an outage in bidding when the master fails (and it may take a while to promote one of the slaves to be the new master)
  2. You probably need to deal with a bidding host dieing and then failing to remove a bid in a timely manner, leaving the available budget unnecessarily low for too long

Also, the example Ovid gives seems way too optimistic and thus to greatly understate the risk. If the budget is $100 and $50 of it has been spent and there are no outstanding bids, then 36 systems could each bid $50 and end up spending $1800 and exceeding the budget by 1750%, not a measly 35%. So that makes me suspect that the explanation was incomplete.

But winning bids already have to access the Redis master, so unless winning bids are a small fraction of regular bids, there isn't much benefit to horizontal scaling with Redis slaves. But I could certainly see the timing being less critical for recording winning bids (and thus an outage on the master would prevent bidding but not be as serious of a problem for the recording of winning bids).

I can see a lot of approaches for dealing with better scaling and higher availability while still subtracting each bid from a central balance. But I don't think a simple Redis set-up would be the solution. I'd probably want multiple masters so you might want to look at membase or riak, for example (though that might not really address scaling).

You could also shard your requests so that bids against budget $X only go to (say) 3 of the available servers and then each server can bid upto 1/3 of the available budget (using a server-local central bid amount tracker, perhaps a local Redis instance). Then your 100s of budgets spread load across your dozens of servers but each budget only spreads over a smaller subset. That obviously can get complicated to maintain that mapping in a way where load is sufficiently balanced.

You can also just roll your own somewhat trivial multi-master infrastructure where each master is responsible for broadcasting updates to other masters. You can even use "broadcast" or "multicast" packets for notifying all masters "at once", if you seriously trust the capacity and reliability of your network. This also brings to mind pubsub.

- tye        

  • Comment on Re^2: A distributed design challenge (scale)

Replies are listed 'Best First'.
Re^3: A distributed design challenge (scale)
by Ovid (Cardinal) on Jul 02, 2012 at 18:50 UTC

    I had considered subtracting bid amounts from the budget, but rejected it for several reasons, all of which are problematic. First, if 36 boxes bid simultaneously, we could overshoot our budget unless we have boxes "lock" the bid amount, but that changes us from parallel to serial bidding and increases response time to the point where bids that should have been one would be lost.

    Serial bidding might be appropriate as we get close to the budget, but that raises another issue: the box that receives the notification that we lost is usually not the box that made the corresponding bid. Because we don't receive the bid amount in the loss notification, we'd have to search all 36 boxes to find out who bid what (or tremendously complicate our Redis setup by trying to store huge amounts of bid information in Redis and we're expecting our volume of data to increase by a couple of orders of magnitude).

    As for your suggestion of me being overly optimistic, that might be the case, but the bid amounts are (fortunately) small fractions of the budget, so it won't be catastrophic until we scale up.

    Riak is not an option because the response times on Riak are slow enough that we'll lose bids we should have one (though we're probably going to put Redis in front of Riak). I don't know about membase.

    We're probably going to go with a strategy of allocating separate budgets to separate boxes. It's more complicated, but it seems much safer. We lose a lot of bidding capacity for individual campaigns, but for many campaigns, I think it's doable.

      First, if 36 boxes bid simultaneously, we could overshoot our budget unless we have boxes "lock" the bid amount, but that changes us from parallel to serial bidding and increases response time to the point where bids that should have been one would be lost.

      I don't see where you need locking. Naturally the check "if ($budget>0-$overshootallowance) { $budget-=$bid }" must be an atomic operation and this is a minimal lock, but this is not a lock over the duration of a bid. The case that only one machine can bid can only occur if the total budget is already so low as to allow exactly one last bid

      UPDATE: To clarify, if the subtract operation is the most time-consuming part in a single bid operation, then having a central budget is practically serializing the bidding per campaign. I'm not sure whether you meant that or a lock over a complete bid-and-wait-for-success-transaction

      You either have the ability/funds to allow *all* your boxes to overshoot or you don't. In the first case you don't have a problem. In the second case, whatever scheme you are using, you obviously have to lose some bids as the budget gets low.

      the box that receives the notification that we lost is usually not the box that made the corresponding bid

      Ok, this is a real problem for the central budget idea. But since you want to go with the separate budget idea, note that you have a smaller but similar problem: Since the successful bid arrives at a different machine, that machine has to deduce the amount from its own budget (if you don't want to search all 36 boxes). But what happens if that machine has no budget left? Either it overshoots or it searches the other boxes for budget left over (which takes enough time for the machine with the budget left over to bid although it should not). In the worst case all successful bid notifications go to the same machine with budget==0 and all the other machines can still place further bids until they are notified by that one machine.

      You have to be careful that the negotiation traffic between the machines doesn't throttle your network when lots of machines have no budget left and are repeatedly contacting other machines to get another bid out.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://979176]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (9)
As of 2016-10-26 07:43 GMT
Find Nodes?
    Voting Booth?
    How many different varieties (color, size, etc) of socks do you have in your sock drawer?

    Results (336 votes). Check out past polls.