Should I break each user_id up by each digit of the id?
No. It just creates extra levels for the filesystem to look up, which slows things down for no benefit.
Adding the date to the path, however, is a brilliant idea.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
By the way, as you only have 10 characters in your alphabet, you might want to consider using the first 4 digits split into two groups of 2:
/pathtocountdir/date/11/22/1122333/
Or perhaps two groups of 3:
/pathtocountdir/date/111/222/1112223/
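For illustration, here is a minimal sketch of building such a path from an id. The sub name counter_dir, the root and the date format are my own assumptions, not anything from the thread; it just splits the leading digits into two groups of the chosen width.

```perl
use strict;
use warnings;

# Build the counter directory for an all-digit id, using the first
# 2 * $width digits as two directory levels.
# counter_dir, $root and the date string are hypothetical names/values.
sub counter_dir {
    my ( $root, $date, $id, $width ) = @_;
    $width //= 2;    # digits per level: 2 or 3 in the examples above
    die "id '$id' is not all digits\n" unless $id =~ /\A\d+\z/;
    die "id '$id' is too short\n" if length($id) < 2 * $width;
    my ( $l1, $l2 ) = unpack "a$width a$width", $id;
    return join '/', $root, $date, $l1, $l2, $id;
}

print counter_dir( '/pathtocountdir', '2013-05-07', '1122333' ), "\n";
print counter_dir( '/pathtocountdir', '2013-05-07', '1112223', 3 ), "\n";
```

The two calls give the 2 x 2 and 2 x 3 layouts respectively, with the date standing in for whatever format you settle on.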
OK thanks. Just curious if that will give me enough directories to avoid hitting any filesystem limits?
Depends upon your filing system, but with 7-digit ids, 2 x 2 digits means a max of 100 entries in each of the two upper levels and 1,000 id directories in each lowest-level directory. 2 x 3 digits gives 1,000 in each upper level and 10 at the bottom. Every filesystem I am aware of will handle those numbers with ease.
The latter probably works better for performance, but you'd have to do a few tests to be sure.
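If you want to check the directory counts for a different id length, the arithmetic can be sketched like this. It assumes all-digit ids of a fixed length (7 in the example paths) and the sub name fanout is just for illustration.

```perl
use strict;
use warnings;

# Maximum entries per directory for all-digit ids of $id_len digits,
# with the first 2 * $width digits used as two directory levels.
# fanout is a hypothetical helper name.
sub fanout {
    my ( $id_len, $width ) = @_;
    my $upper = 10**$width;                      # entries in each of the two upper levels
    my $leaf  = 10**( $id_len - 2 * $width );    # id directories under each second level
    return ( $upper, $leaf );
}

for my $width ( 2, 3 ) {
    my ( $upper, $leaf ) = fanout( 7, $width );
    printf "2 x %d digits: up to %d per upper level, %d id dirs per lowest level\n",
        $width, $upper, $leaf;
}
```

The date component in the path only adds one more entry per day at the top, so it does not change these per-directory maxima.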
On my system it is much of a muchness at around 3 milliseconds per lookup/increment either way:
#! perl -slw
use strict;
use Time::HiRes qw[ time ];

## Test 1: two 2-digit levels above each 7-digit id directory
mkdir 'myroot';
for my $l1 ( '00' .. '09' ) {
    mkdir "myroot/$l1";
    for my $l2 ( '00' .. '09' ) {
        mkdir "myroot/$l1/$l2";
        mkdir "myroot/$l1/$l2/$l1$l2$_"   for '000' .. '999';
        mkdir "myroot/$l1/$l2/$l1$l2$_/1" for '000' .. '999'; ## counter starts at 1
    }
}
my $start = time;
for( 1 .. 1000 ) {
    my $id = sprintf "%02u%02u%03u", int( rand 10 ), int( rand 10 ), int( rand 1000 );
    my( $l1, $l2 ) = unpack 'a2a2', $id;
    opendir D, "myroot/$l1/$l2/$id/" or warn "$! : myroot/$l1/$l2/$id/";
    readdir D; readdir D; ## get rid of '.' & '..'
    my $count = readdir D; ## the counter is the name of the only remaining entry
    rename "myroot/$l1/$l2/$id/" . $count, "myroot/$l1/$l2/$id/" . ++$count
        or warn "$! : $id : $count";
    closedir D;
}
printf "2x2x7 took %f secs/lookup&increment\n", ( time() - $start ) / 1000;
system "rd /q /s myroot"; ## Windows-only cleanup

## Test 2: two 3-digit levels above each 7-digit id directory
mkdir 'myroot';
for my $l1 ( '000' .. '009' ) {
    mkdir "myroot/$l1";
    for my $l2 ( '000' .. '009' ) {
        mkdir "myroot/$l1/$l2";
        mkdir "myroot/$l1/$l2/$l1$l2$_"   for '0' .. '9';
        mkdir "myroot/$l1/$l2/$l1$l2$_/1" for '0' .. '9'; ## counter starts at 1
    }
}
$start = time;
for( 1 .. 1000 ) {
    my $id = sprintf "%03u%03u%01u", int( rand 10 ), int( rand 10 ), int( rand 10 );
    my( $l1, $l2 ) = unpack 'a3a3', $id;
    opendir D, "myroot/$l1/$l2/$id/" or warn "$! : myroot/$l1/$l2/$id/";
    readdir D; readdir D; ## get rid of '.' & '..'
    my $count = readdir D;
    rename "myroot/$l1/$l2/$id/" . $count, "myroot/$l1/$l2/$id/" . ++$count
        or warn "$! : $id : $count";
    closedir D;
}
printf "3x3x7 took %f secs/lookup&increment\n", ( time() - $start ) / 1000;
system "rd /q /s myroot"; ## Windows-only cleanup
__END__
C:\test>1032393
2x2x7 took 0.002671 secs/lookup&increment
3x3x7 took 0.002761 secs/lookup&increment
C:\test>1032393
2x2x7 took 0.002628 secs/lookup&increment
3x3x7 took 0.003111 secs/lookup&increment
BTW: Please note the additional readdir D; readdir D; ## get rid of '.' & '..' which I forgot above.
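The two bare readdir calls assume '.' and '..' always come back first, which held on my filesystem but is not something readdir promises. A slightly more defensive sketch filters them out by name instead. The sub name increment_counter and the temporary directory are illustrative assumptions, and this makes no attempt to handle two processes incrementing the same counter at once.

```perl
use strict;
use warnings;
use File::Temp qw[ tempdir ];

# Increment a counter stored as the name of the single entry in $dir.
# Skips '.' and '..' by name rather than by position.
# increment_counter is a hypothetical helper, not code from the benchmark.
sub increment_counter {
    my ($dir) = @_;
    opendir my $dh, $dir or die "$! : $dir";
    my ($count) = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    die "no counter entry in $dir" unless defined $count;
    my $next = $count + 1;
    rename "$dir/$count", "$dir/$next" or die "$! : $dir : $count";
    return $next;
}

# Demonstration in a throwaway directory whose counter starts at 1.
my $dir = tempdir( CLEANUP => 1 );
mkdir "$dir/1" or die "$! : $dir/1";
for ( 1 .. 3 ) {
    my $n = increment_counter($dir);
    print "$n\n";
}
```

Reading the whole directory costs a little more than three readdir calls, but with a single entry per counter directory the difference should be negligible.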