Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Base36 numbers: speed and golf

by nop (Hermit)
on Feb 27, 2002 at 19:11 UTC ( #147999=perlquestion: print w/ replies, xml ) Need Help??
nop has asked for the wisdom of the Perl Monks concerning the following question:

Given an n-digit long number in base36 [0-9A-Z], what is the fastest way to generate the next number in the sequence? N is large enough that I can't cache the whole mess in some hashes and do fast lookups.
As a curiousity, anyone want to suggest the golfiest way to generate the next number? Thanks -- nop

Comment on Base36 numbers: speed and golf
Download Code
Re: Base36 numbers: speed and golf
by japhy (Canon) on Feb 27, 2002 at 19:52 UTC
    Well, you could convert the base 36 number to Perl's representation, add one, and then convert it back to base 36...
    { my @to_b36 = (0 .. 9, 'A' .. 'Z'); my %to_num; @to_num{@to_b36} = 0 .. 35; sub b36_to_num { my $n = shift; my ($i, $s) = (0, 0); $s += $to_num{chop $n} * 36**$i++ while length $n; return $s; } sub num_to_b36 { use integer; # so that /= 36 is easy my $n = shift; my $s = ""; do { $s = $to_b36[$n % 36] . $s, $n /= 36 } while $n; return $s; } sub inc_b36 { num_to_b36(&b36_to_num + 1) } }
    That's the complete solution. You don't need to do all that work, though. Here's a much simpler regex solution:
    { # in the same block as the previous code sub rx_inc_b36 { my $n = shift; $n =~ s/([^Z])(Z*)$/$to_b36[$to_num{$1}+1] . 0 x length $2/e or $n = 1 . 0 x length $n; return $n; } }
    You could also use my reversed regex approach:
    { # ditto sub rx_inc_b36 { my $n = reverse shift; $n =~ s/^(Z*)([^Z])/(0 x length $1) . $to_b36[$to_num{$2}+1]/e or $n = (0 x length $n) . 1; return scalar reverse $n; } }
    And here is my golf code, which registers 94 characters:
    sub GOLF { local$_=pop;@_{@_=( 0..9,A..Z)}=0..35;s /([^Z])(Z*)$/$_[$_{ $1}+1].$-x length$2 /e?$_:1 .$-x y!!!c }

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a (from-home) job
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Base36 numbers: speed and golf
by Ido (Hermit) on Feb 27, 2002 at 20:16 UTC
    It's ugly, not golfy, and not fast, but that's the best I could come up with:
    my $number="8i8jkwdz4fks4d1z"; chg(1); sub chg{(substr($number,-1*$_[0],1)=dig(substr($number,-1*$_[0],1))) e +q '0' and chg($_[0]+1)} sub dig{my $d=shift; if($d==9){return 'a'}elsif($d=~/\d/){return $d+1} +elsif(lc($d) eq 'z'){return 0}else{return chr(ord($d)+1)}} print $number;
(tye)Re: Base36 numbers: speed and golf
by tye (Cardinal) on Feb 27, 2002 at 22:04 UTC

    Of course, I can't say whether this is the fastest but it was designed to be fast.

    #!/usr/bin/perl -w use strict; my %next; BEGIN { @next{'0'..'9','A'..'Z'}= ('1'..'9','A'..'Z','0'); } sub inc36 { my $add= ""; my $dig; while( not $dig= $next{chop($_[0])} ) { $add .= $dig; if( ! length($_[0]) ) { $dig= '1'; last; } } $_[0] .= $dig . $add; } while( <DATA> ) { for( split " " ) { print "$_ + 1 = "; inc36( $_ ); print $_,$/; } } __END__ 0 1 9 A Y Z 10 19 1Z 9Z AZ ZZ YZZ XYZZY XYZZZ
    which outputs:
    0 + 1 = 1
    1 + 1 = 2
    9 + 1 = A
    A + 1 = B
    Y + 1 = Z
    Z + 1 = 10
    10 + 1 = 11
    19 + 1 = 1A
    1Z + 1 = 20
    9Z + 1 = A0
    AZ + 1 = B0
    ZZ + 1 = 100
    YZZ + 1 = Z00
    XYZZY + 1 = XYZZZ
    XYZZZ + 1 = XZ000
    

            - tye (but my friends call me "Tye")

      Well done tye. This is an elegant solution and the fastest by a mile.

      There is a lot of good code produced here at PerlMonks but I wonder how much of it is actually studied or even looked at. At the time of writing two other solutions that contain bugs have higher reps than this node.

      --
      John.

      Well, this was interesting. I wrote a version of your code that used an array and ord instead of the hash. It was 10% slower in my tests which quite suprised me. I really thought the difference in lookup time between an array and a hash would outway the cost of the ord function, but not so.

      Thanks, very educational.

      Yves / DeMerphq
      --
      When to use Prototypes?
      Advanced Sorting - GRT - Guttman Rosler Transform

Re: Base36 numbers: speed and golf
by jmcnamara (Monsignor) on Feb 27, 2002 at 22:05 UTC

    The POSIX function strtol provides a straightforward method of converting from a Base36 number. Unfortunately, there isn't an equivalent ltostr.

    Here is an un-golfed solution. It edits the number in-place.

    There is a optimisation at the beginning that uses a simple string increment for numbers that don't end in 9 or Z. This gives a speed-up for 17/18 of the Base36 range.

    #!/usr/bin/perl -w use strict; use POSIX 'strtol'; my @chars = (0..9, 'A'..'Z'); sub inc36 { my $num = $_[0]; if ($num !~ /[9Z]$/) { $num =~ tr/0-9/a-j/; ++$num; $num =~ tr/a-j/0-9/; return $_[0] = $num; } my $int = 1 + strtol($_[0], 36); my $base36 = ''; while ($int) { my $frac = $int % 36; $int = int ($int / 36); $base36 = $chars[$frac] . $base36; } $_[0] = $base36; }

    --
    John.

(Duplicate) Re: Base36 numbers: speed and golf
by NodeReaper (Curate) on Feb 27, 2002 at 22:06 UTC

    Reason: (jmcnamara) Delete. This is a duplicate. Evidently I'm not as patient as I thought.

    For more information on this node visit: this

Re: Base36 numbers: speed and golf
by Anonymous Monk on Feb 27, 2002 at 22:51 UTC
    some code borrowed from a previous game..
    sub addone{ # 1 2 3 4 5 6 #23456789_123456789_123456789_123456789_123456789_123456789_123456789 ($_,$s)=@_;y/A-Z/:-T/;$s=36*$s-48+ord for/./g;$s++;$t.=chr(($s%36)+48 # 8 9 10 11 12 #23456789_123456789_123456789_123456789_123456789_12 ),$s=($s-$s%36)/36while$s>0;$t=~y/:-T/A-Z/;reverse$t }
      Hey!! someone edited out my little ♣ dohickey... whats up with that? I thought we were allowed to put little markers on our titles...
        out on a limb

        If anything, the "single char dohickey's" should be reserved for saints. Other monks should settle for using the "(foo)" notation. If your posts are important/frequent enough for you to want to track them, Create a new user.

        Why? How many dohickey's are there?

        Can someone please explain to me why the markers (♣) in my titles are being edited out? Both of the above two nodes had their titles edited w/o my consent and (from what I can tell) w/o a documented request to an editor. If this is a site policy and is applied fairly across the board, thats one thing... If only certain people get the privilege of marking their nodes with special chars, thats really a shame. ;-(
•Re: Base36 numbers: speed and golf
by merlyn (Sage) on Feb 27, 2002 at 23:37 UTC
    sub inc36 { my ($left, $right) = shift =~ /^(.*?)(z*)$/i; $left = "0" unless length $left; substr($left, -1) =~ tr/0-9a-yA-Y/1-9ab-zB-Z/; $right =~ tr/zZ/00/; $left . $right; }
    But I'm sure I typed this in before, either here or somewhere else.

    -- Randal L. Schwartz, Perl hacker


    update: minor typo (replacing range should be 1-9ab-zB-Z, not 1-ab-zB-Z, doh!) noted by jmcnamara and fixed, thanks!
Re: Base36 numbers: speed and golf
by jmcnamara (Monsignor) on Feb 28, 2002 at 11:13 UTC

    Since nop asked for a fast solution there can be only one arbiter: Benchmark.

    I benchmarked the non-golf solutions. The results show that tye's solution is the fastest by quite a margin.

    Test input is 'XYZZY' Benchmark: timing 1000000 iterations japhy: 28 secs (28.21 usr + 0.01 sys = 28.22 CPU) @ 35435.86/s) japhy2: 23 secs (23.57 usr + 0.00 sys = 23.57 CPU) @ 42426.81/s) japhy3: 22 secs (21.87 usr + 0.01 sys = 21.88 CPU) @ 45703.84/s) jmn: 8 secs ( 7.26 usr + 0.01 sys = 7.27 CPU) @ 137551.58/s) merlyn: 20 secs (20.54 usr + -0.02 sys = 20.52 CPU) @ 48732.94/s) petral: 22 secs (22.47 usr + 0.01 sys = 22.48 CPU) @ 44483.99/s) tye: 5 secs ( 4.84 usr + 0.00 sys = 4.84 CPU) @ 206611.57/s) Rate japhy japhy2 petral japhy3 merlyn jmn tye japhy 35436/s -- -16% -20% -22% -27% -74% -83% japhy2 42427/s 20% -- -5% -7% -13% -69% -79% petral 44484/s 26% 5% -- -3% -9% -68% -78% japhy3 45704/s 29% 8% 3% -- -6% -67% -78% merlyn 48733/s 38% 15% 10% 7% -- -65% -76% jmn 137552/s 288% 224% 209% 201% 182% -- -33% tye 206612/s 483% 387% 364% 352% 324% 50% -- Test input is 'XYZZZ' Benchmark: timing 1000000 iterations japhy: 28 secs (28.30 usr + 0.00 sys = 28.30 CPU) @ 35335.69/s) japhy2: 22 secs (22.45 usr + 0.02 sys = 22.47 CPU) @ 44503.78/s) japhy3: 22 secs (21.97 usr + 0.01 sys = 21.98 CPU) @ 45495.91/s) jmn: 27 secs (25.95 usr + 0.00 sys = 25.95 CPU) @ 38535.65/s) merlyn: 18 secs (18.30 usr + 0.00 sys = 18.30 CPU) @ 54644.81/s) petral: 18 secs (18.69 usr + 0.01 sys = 18.70 CPU) @ 53475.94/s) tye: 5 secs ( 4.85 usr + 0.00 sys = 4.85 CPU) @ 206185.57/s) Rate japhy jmn japhy2 japhy3 petral merlyn tye japhy 35336/s -- -8% -21% -22% -34% -35% -83% jmn 38536/s 9% -- -13% -15% -28% -29% -81% japhy2 44504/s 26% 15% -- -2% -17% -19% -78% japhy3 45496/s 29% 18% 2% -- -15% -17% -78% petral 53476/s 51% 39% 20% 18% -- -2% -74% merlyn 54645/s 55% 42% 23% 20% 2% -- -73% tye 206186/s 484% 435% 363% 353% 286% 277% --
    Update: Added japhy's other solutions. The benchmark code is on my scratchpad.

    Update 2: Added petral's solution.

    As a general observation. 97% of the Base36 range can be handled with a simple substitution of the last character in the string. This is the test case represented by 'XYZZY' above. The remaining numbers in the Base36 range (those ending in Z) require more work. This is the test case represented by 'XYZZZ'.

    All of the above solutions could be optimised for the substitution case. The Z increment is therefore the governing case in determining the speed of the solution.

    What's nice about tye's solution is that it takes these factors into account. He wrote it to be fast and it was.


    John.

      Which solution of mine did you use?

      _____________________________________________________
      Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a (from-home) job
      s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      What about using tye's hash incrementer with regexp processing?
      Eg, from Merlyn's:
      my %next; @next{0..9,'a'..'y','A'..'Y'} = (1..9,'a'..'z','B'..'Z'); sub inc36 { my ($l, $i, $r) = shift=~/^(.*?)([^zZ]?)([zZ]*)$/; $l . $next{ $i||0 } . (0 x length $r) }
        p
Re: Base36 numbers: speed and golf
by shotgunefx (Parson) on Feb 28, 2002 at 21:24 UTC
    Just a thought, are you going to use this for an order number of similar?

    One of our clients use this but the problem is that you would get orders with 2F*CKU, etc eventually.

    If you are going to use it for that purpose, you might be better off taking out vowels and making it base 31. That's what we did to avoid offending customers.

    -Lee

    "To be civilized is to deny one's nature."
      Nope. It is a tracking code we embed in links into our site to track efficacy of 3rd party online advertising. Advertising on search engines with many many words and many many destination URLS chews up codes, so we go base 36 to pack in as much as we can in.

      nop

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://147999]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (6)
As of 2014-07-26 05:52 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (175 votes), past polls