There's more than one way to do things PerlMonks

### Base36 numbers: speed and golf

by nop (Hermit)
 on Feb 27, 2002 at 19:11 UTC Need Help??
nop has asked for the wisdom of the Perl Monks concerning the following question:

Given an n-digit long number in base36 [0-9A-Z], what is the fastest way to generate the next number in the sequence? N is large enough that I can't cache the whole mess in some hashes and do fast lookups.
As a curiousity, anyone want to suggest the golfiest way to generate the next number? Thanks -- nop

Replies are listed 'Best First'.
Re: Base36 numbers: speed and golf
by japhy (Canon) on Feb 27, 2002 at 19:52 UTC
Well, you could convert the base 36 number to Perl's representation, add one, and then convert it back to base 36...
```{
my @to_b36 = (0 .. 9, 'A' .. 'Z');
my %to_num;
@to_num{@to_b36} = 0 .. 35;

sub b36_to_num {
my \$n = shift;
my (\$i, \$s) = (0, 0);
\$s += \$to_num{chop \$n} * 36**\$i++ while length \$n;
return \$s;
}

sub num_to_b36 {
use integer;  # so that /= 36 is easy
my \$n = shift;
my \$s = "";
do { \$s = \$to_b36[\$n % 36] . \$s, \$n /= 36 } while \$n;
return \$s;
}

sub inc_b36 { num_to_b36(&b36_to_num + 1) }
}
That's the complete solution. You don't need to do all that work, though. Here's a much simpler regex solution:
```{
# in the same block as the previous code

sub rx_inc_b36 {
my \$n = shift;
\$n =~ s/([^Z])(Z*)\$/\$to_b36[\$to_num{\$1}+1] . 0 x length \$2/e
or \$n = 1 . 0 x length \$n;
return \$n;
}
}
You could also use my reversed regex approach:
```{
# ditto

sub rx_inc_b36 {
my \$n = reverse shift;
\$n =~ s/^(Z*)([^Z])/(0 x length \$1) . \$to_b36[\$to_num{\$2}+1]/e
or \$n = (0 x length \$n) . 1;
return scalar reverse \$n;
}
}
And here is my golf code, which registers 94 characters:
```sub GOLF {
local\$_=pop;@_{@_=(
0..9,A..Z)}=0..35;s
/([^Z])(Z*)\$/\$_[\$_{
\$1}+1].\$-x length\$2
/e?\$_:1 .\$-x y!!!c
}

_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a (from-home) job
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

(tye)Re: Base36 numbers: speed and golf
by tye (Sage) on Feb 27, 2002 at 22:04 UTC

Of course, I can't say whether this is the fastest but it was designed to be fast.

```#!/usr/bin/perl -w
use strict;
my %next;
BEGIN {
@next{'0'..'9','A'..'Z'}= ('1'..'9','A'..'Z','0');
}
sub inc36 {
my \$dig;
while(  not \$dig= \$next{chop(\$_[0])}  ) {
if(  ! length(\$_[0])  ) {
\$dig= '1';
last;
}
}
}
while(  <DATA>  ) {
for(  split " "  ) {
print "\$_ + 1 = ";
inc36( \$_ );
print \$_,\$/;
}
}
__END__
0 1 9 A Y Z 10 19 1Z 9Z AZ ZZ YZZ XYZZY XYZZZ
which outputs:
```0 + 1 = 1
1 + 1 = 2
9 + 1 = A
A + 1 = B
Y + 1 = Z
Z + 1 = 10
10 + 1 = 11
19 + 1 = 1A
1Z + 1 = 20
9Z + 1 = A0
AZ + 1 = B0
ZZ + 1 = 100
YZZ + 1 = Z00
XYZZY + 1 = XYZZZ
XYZZZ + 1 = XZ000
```

- tye (but my friends call me "Tye")

Well done tye. This is an elegant solution and the fastest by a mile.

There is a lot of good code produced here at PerlMonks but I wonder how much of it is actually studied or even looked at. At the time of writing two other solutions that contain bugs have higher reps than this node.

--
John.

Well, this was interesting. I wrote a version of your code that used an array and ord instead of the hash. It was 10% slower in my tests which quite suprised me. I really thought the difference in lookup time between an array and a hash would outway the cost of the ord function, but not so.

Thanks, very educational.

•Re: Base36 numbers: speed and golf
by merlyn (Sage) on Feb 27, 2002 at 23:37 UTC
```sub inc36 {
my (\$left, \$right) = shift =~ /^(.*?)(z*)\$/i;
\$left = "0" unless length \$left;
substr(\$left, -1) =~ tr/0-9a-yA-Y/1-9ab-zB-Z/;
\$right =~ tr/zZ/00/;
\$left . \$right;
}
But I'm sure I typed this in before, either here or somewhere else.
update: minor typo (replacing range should be 1-9ab-zB-Z, not 1-ab-zB-Z, doh!) noted by jmcnamara and fixed, thanks!
Re: Base36 numbers: speed and golf
by jmcnamara (Monsignor) on Feb 28, 2002 at 11:13 UTC

Since nop asked for a fast solution there can be only one arbiter: Benchmark.

I benchmarked the non-golf solutions. The results show that tye's solution is the fastest by quite a margin.

```Test input is 'XYZZY'

Benchmark: timing 1000000 iterations
japhy: 28 secs (28.21 usr +  0.01 sys = 28.22 CPU) @ 35435.86/s)
japhy2: 23 secs (23.57 usr +  0.00 sys = 23.57 CPU) @ 42426.81/s)
japhy3: 22 secs (21.87 usr +  0.01 sys = 21.88 CPU) @ 45703.84/s)
jmn:  8 secs ( 7.26 usr +  0.01 sys =  7.27 CPU) @ 137551.58/s)
merlyn: 20 secs (20.54 usr + -0.02 sys = 20.52 CPU) @ 48732.94/s)
petral: 22 secs (22.47 usr +  0.01 sys = 22.48 CPU) @ 44483.99/s)
tye:  5 secs ( 4.84 usr +  0.00 sys =  4.84 CPU) @ 206611.57/s)
Rate  japhy japhy2 petral japhy3 merlyn    jmn    tye
japhy   35436/s     --   -16%   -20%   -22%   -27%   -74%   -83%
japhy2  42427/s    20%     --    -5%    -7%   -13%   -69%   -79%
petral  44484/s    26%     5%     --    -3%    -9%   -68%   -78%
japhy3  45704/s    29%     8%     3%     --    -6%   -67%   -78%
merlyn  48733/s    38%    15%    10%     7%     --   -65%   -76%
jmn    137552/s   288%   224%   209%   201%   182%     --   -33%
tye    206612/s   483%   387%   364%   352%   324%    50%     --

Test input is 'XYZZZ'

Benchmark: timing 1000000 iterations
japhy: 28 secs (28.30 usr +  0.00 sys = 28.30 CPU) @ 35335.69/s)
japhy2: 22 secs (22.45 usr +  0.02 sys = 22.47 CPU) @ 44503.78/s)
japhy3: 22 secs (21.97 usr +  0.01 sys = 21.98 CPU) @ 45495.91/s)
jmn: 27 secs (25.95 usr +  0.00 sys = 25.95 CPU) @ 38535.65/s)
merlyn: 18 secs (18.30 usr +  0.00 sys = 18.30 CPU) @ 54644.81/s)
petral: 18 secs (18.69 usr +  0.01 sys = 18.70 CPU) @ 53475.94/s)
tye:  5 secs ( 4.85 usr +  0.00 sys =  4.85 CPU) @ 206185.57/s)
Rate  japhy    jmn japhy2 japhy3 petral merlyn    tye
japhy   35336/s     --    -8%   -21%   -22%   -34%   -35%   -83%
jmn     38536/s     9%     --   -13%   -15%   -28%   -29%   -81%
japhy2  44504/s    26%    15%     --    -2%   -17%   -19%   -78%
japhy3  45496/s    29%    18%     2%     --   -15%   -17%   -78%
petral  53476/s    51%    39%    20%    18%     --    -2%   -74%
merlyn  54645/s    55%    42%    23%    20%     2%     --   -73%
tye    206186/s   484%   435%   363%   353%   286%   277%     --

Update: Added japhy's other solutions. The benchmark code is on my scratchpad.

As a general observation. 97% of the Base36 range can be handled with a simple substitution of the last character in the string. This is the test case represented by 'XYZZY' above. The remaining numbers in the Base36 range (those ending in Z) require more work. This is the test case represented by 'XYZZZ'.

All of the above solutions could be optimised for the substitution case. The Z increment is therefore the governing case in determining the speed of the solution.

What's nice about tye's solution is that it takes these factors into account. He wrote it to be fast and it was.

John.

Which solution of mine did you use?

_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a (from-home) job
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

What about using tye's hash incrementer with regexp processing?
Eg, from Merlyn's:
```my %next;
@next{0..9,'a'..'y','A'..'Y'} = (1..9,'a'..'z','B'..'Z');

sub inc36 {
my (\$l, \$i, \$r) = shift=~/^(.*?)([^zZ]?)([zZ]*)\$/;

\$l . \$next{ \$i||0 } . (0 x length \$r)
}
p
Re: Base36 numbers: speed and golf
by Ido (Hermit) on Feb 27, 2002 at 20:16 UTC
It's ugly, not golfy, and not fast, but that's the best I could come up with:
```my \$number="8i8jkwdz4fks4d1z";
chg(1);
sub chg{(substr(\$number,-1*\$_[0],1)=dig(substr(\$number,-1*\$_[0],1))) e
+q '0' and chg(\$_[0]+1)}
sub dig{my \$d=shift; if(\$d==9){return 'a'}elsif(\$d=~/\d/){return \$d+1}
+elsif(lc(\$d) eq 'z'){return 0}else{return chr(ord(\$d)+1)}}
print \$number;
Re: Base36 numbers: speed and golf
by jmcnamara (Monsignor) on Feb 27, 2002 at 22:05 UTC

The POSIX function strtol provides a straightforward method of converting from a Base36 number. Unfortunately, there isn't an equivalent ltostr.

Here is an un-golfed solution. It edits the number in-place.

There is a optimisation at the beginning that uses a simple string increment for numbers that don't end in 9 or Z. This gives a speed-up for 17/18 of the Base36 range.

```#!/usr/bin/perl -w

use strict;
use POSIX 'strtol';

my @chars = (0..9, 'A'..'Z');

sub inc36 {
my \$num = \$_[0];

if (\$num !~ /[9Z]\$/) {
\$num =~ tr/0-9/a-j/;
++\$num;
\$num =~ tr/a-j/0-9/;
return \$_[0] = \$num;
}

my \$int = 1 + strtol(\$_[0], 36);

my \$base36 = '';

while (\$int) {
my \$frac = \$int % 36;
\$int     = int (\$int / 36);
\$base36  = \$chars[\$frac] . \$base36;
}
\$_[0] = \$base36;
}

--
John.

Re: Base36 numbers: speed and golf
by Anonymous Monk on Feb 27, 2002 at 22:51 UTC
some code borrowed from a previous game..
```sub addone{
#        1         2         3         4         5         6
#23456789_123456789_123456789_123456789_123456789_123456789_123456789
(\$_,\$s)=@_;y/A-Z/:-T/;\$s=36*\$s-48+ord for/./g;\$s++;\$t.=chr((\$s%36)+48
#        8         9        10        11        12
#23456789_123456789_123456789_123456789_123456789_12
),\$s=(\$s-\$s%36)/36while\$s>0;\$t=~y/:-T/A-Z/;reverse\$t
}
Hey!! someone edited out my little ♣ dohickey... whats up with that? I thought we were allowed to put little markers on our titles...
Can someone please explain to me why the markers (♣) in my titles are being edited out? Both of the above two nodes had their titles edited w/o my consent and (from what I can tell) w/o a documented request to an editor. If this is a site policy and is applied fairly across the board, thats one thing... If only certain people get the privilege of marking their nodes with special chars, thats really a shame. ;-(
out on a limb

If anything, the "single char dohickey's" should be reserved for saints. Other monks should settle for using the "(foo)" notation. If your posts are important/frequent enough for you to want to track them, Create a new user.

Why? How many dohickey's are there?

Re: Base36 numbers: speed and golf
by shotgunefx (Parson) on Feb 28, 2002 at 21:24 UTC
Just a thought, are you going to use this for an order number of similar?

One of our clients use this but the problem is that you would get orders with 2F*CKU, etc eventually.

If you are going to use it for that purpose, you might be better off taking out vowels and making it base 31. That's what we did to avoid offending customers.

-Lee

"To be civilized is to deny one's nature."
Nope. It is a tracking code we embed in links into our site to track efficacy of 3rd party online advertising. Advertising on search engines with many many words and many many destination URLS chews up codes, so we go base 36 to pack in as much as we can in.

nop

Create A New User
Node Status?
node history
Node Type: perlquestion [id://147999]
Approved by root
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (1)
As of 2018-05-22 23:00 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
World peace can best be achieved by:

Results (166 votes). Check out past polls.

Notices?