PerlMonks  

Re: Daily Counters

by BrowserUk (Pope)
on May 06, 2013 at 16:55 UTC (#1032349)


in reply to Daily Counters

I had a similar requirement a few years ago, and back then the fastest mechanism available to me that provided both shared access and fast lookup was the file system.

For the sake of discussion, assume that your userids consist of mixed-case ANSI alphanumerics -- i.e. 62 chars. Three characters give 62**3 = 238,328 possible prefixes, so if you have 10 million users and use the first 3 characters of their names as an index into a first level of subdirectories, you'll have (on average) about 42 user directories under each first-level subdirectory -- so lookup is fast.
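The arithmetic behind that estimate can be checked with a few lines of Perl. This is only a sketch: the helper name `avg_users_per_prefix` is made up for illustration, and it assumes userids are spread uniformly over all possible prefixes, which real names will not be.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post.
# Assumes every prefix is equally likely, which real userids won't quite be.
sub avg_users_per_prefix {
    my( $alphabet, $prefixLen, $users ) = @_;
    my $dirs = $alphabet ** $prefixLen;    # number of first-level directories
    return ( $dirs, $users / $dirs );
}

# Figures from the discussion: 62 characters, 3-character prefix, 10 million users.
my( $dirs, $avg ) = avg_users_per_prefix( 62, 3, 10_000_000 );
printf "%d first-level directories, about %.0f users in each\n", $dirs, $avg;
# prints: 238328 first-level directories, about 42 users in each
```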

The directory structure looks like this:

    /yourapp/index/ash/ashford/7/
                  /bre/brent/3/
                  /cra/crawford/4/

And the process of lookup/increment is:

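    ## LIMIT is assumed to be a constant defined elsewhere
    my $prefix = '/yourapp/index';
    my $userid = ...;
    my $idx    = substr $userid, 0, 3;
    my $dir    = "$prefix/$idx/$userid";

    my $limitReached = 1;
    {
        opendir my $dh, $dir or die "$dir: $!";
        ## skip . and ..; the one remaining entry is named for the current count
        my( $count ) = grep { /^\d+$/ } readdir $dh;
        closedir $dh;
        last if $count >= LIMIT;
        ## if another process renamed it first, this fails: re-read and retry
        rename "$dir/$count", "$dir/" . ( $count + 1 ) or redo;
        $limitReached = 0;
    }
    ## use $limitReached to decide further action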

If your data is to persist, you are going to have to do at least one directory lookup to find the DB file -- and usually more than one -- so the directory lookup is effectively free. And as rename is atomic, the shared data problems are taken care of without the need for time-costly locking and polling.
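For anyone who wants to try the idea in isolation, here is a minimal, self-contained sketch that runs against a temporary directory. The helpers `init_counter` and `bump_counter` are hypothetical names, not part of the original code. The sketch assumes the per-user directory is created exactly once before any counting starts, and it swaps the open-ended redo for an arbitrary cap of 100 attempts.

```perl
use strict;
use warnings;
use File::Path qw( make_path );
use File::Temp qw( tempdir );

# Hypothetical setup step, assumed to run once per user before any counting:
# creates <prefix>/<first 3 chars>/<userid>/0
sub init_counter {
    my( $prefix, $userid ) = @_;
    my $dir = join '/', $prefix, substr( $userid, 0, 3 ), $userid;
    make_path( "$dir/0" );
    return $dir;
}

# Hypothetical helper: add one to the count held in $dir.
# Returns the new count, or undef if $limit has already been reached.
sub bump_counter {
    my( $dir, $limit ) = @_;
    for my $try ( 1 .. 100 ) {    # arbitrary retry cap (an assumption)
        opendir my $dh, $dir or die "$dir: $!";
        my( $count ) = grep { /^\d+$/ } readdir $dh;
        closedir $dh;
        next unless defined $count;    # entry not seen in this scan; look again
        return if $count >= $limit;
        my $next = $count + 1;
        # Fails if another process renamed the entry first; re-read and retry.
        rename "$dir/$count", "$dir/$next" or next;
        return $next;
    }
    die "gave up updating $dir";
}

my $root    = tempdir( CLEANUP => 1 );
my $dir     = init_counter( $root, 'ashford' );
my @results = map { bump_counter( $dir, 3 ) // 'limit' } 1 .. 4;
print "@results\n";    # prints: 1 2 3 limit
```

The counter is a subdirectory here rather than a file, matching the trailing slashes in the layout above; an empty file would work the same way with rename.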

The more characters in the alphabet available for your userids, the more evenly spread your directory structure and the faster the lookups. The only real restriction is that the alphabet must be compatible with your file system's naming conventions, which isn't usually a problem.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: Daily Counters
by docbrown25 (Initiate) on May 06, 2013 at 17:53 UTC

    Interesting. Thanks for the reply. I'm going to look into implementing like this.

    My user ids will be all numerics. Should I break each user_id up by each digit of the id?

    For example: $user_id = 5989358

    /pathtocountdir/$date/5/9/8/9/3/5/8/5989358/$count/

    this will also allow me to just clear out the whole /pathtocountdir/$date/ dir for previous days

    thoughts?
      Should I break each user_id up by each digit of the id?

      No. It just creates extra levels for the filesystem to look up, which slows things down for no benefit.

      Adding the date into the path however is a brilliant idea.
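A rough sketch of how the dated layout might be handled, so that earlier days can be dropped in one go. Everything here is an assumption made for illustration: the helper names `day_dir` and `purge_old_days`, the `YYYY-MM-DD` directory naming, and the use of UTC for the day boundary.

```perl
use strict;
use warnings;
use File::Path qw( make_path remove_tree );
use File::Temp qw( tempdir );
use POSIX      qw( strftime );

# Hypothetical helper: the directory for the day containing $time (default: now).
# UTC and the YYYY-MM-DD format are assumptions, not something the thread specifies.
sub day_dir {
    my( $root, $time ) = @_;
    return "$root/" . strftime( '%Y-%m-%d', gmtime( $time // time ) );
}

# Hypothetical helper: remove every dated directory under $root except $keep.
# Returns the names of the directories that were removed.
sub purge_old_days {
    my( $root, $keep ) = @_;
    opendir my $dh, $root or die "$root: $!";
    my @old = sort grep { /^\d{4}-\d\d-\d\d$/ && "$root/$_" ne $keep } readdir $dh;
    closedir $dh;
    remove_tree( "$root/$_" ) for @old;
    return @old;
}

my $root      = tempdir( CLEANUP => 1 );
my $today     = day_dir( $root, 1367859300 );             # 2013-05-06 16:55 UTC
my $yesterday = day_dir( $root, 1367859300 - 86_400 );
make_path( "$_/5989358/0" ) for $today, $yesterday;
print join( ' ', purge_old_days( $root, $today ) ), "\n";    # prints: 2013-05-05
```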



      By the way, as you only have 10 characters in your alphabet, you might want to consider using the first 4 digits split into two groups of 2:

      /pathtocountdir/date/11/22/1122333/

      Or perhaps two groups of 3:

      /pathtocountdir/date/111/222/1112223/
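Building either layout from a numeric id might look something like the sketch below. The helper name `counter_path` is invented for illustration, and it assumes every id has at least width * groups digits (shorter ids would need zero-padding first).

```perl
use strict;
use warnings;

# Hypothetical helper: use the first $groups * $width digits of $userid
# as $groups directory levels of $width digits each.
# Assumes $userid has at least $groups * $width digits.
sub counter_path {
    my( $base, $userid, $width, $groups ) = @_;
    my @levels = unpack "(A$width)$groups", $userid;
    return join '/', $base, @levels, $userid;
}

print counter_path( '/pathtocountdir/date', '1122333', 2, 2 ), "\n";
# prints: /pathtocountdir/date/11/22/1122333
print counter_path( '/pathtocountdir/date', '1112223', 3, 2 ), "\n";
# prints: /pathtocountdir/date/111/222/1112223
```

With a 10-character alphabet each of these levels holds at most 10**width entries: 100 for groups of 2, 1,000 for groups of 3.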


        OK thanks. Just curious if that will give me enough directories to avoid hitting any filesystem limits?
