In Fastest way to lookup a point in a set we concluded that, measured in isolation, hash lookups were faster
with split/join than with pack/unpack.
With nothing to lose, though, I tried changing split /:/, $z to unpack 'ii', $z, and changing:
join ':', $x - 1, $y - 1
join ':', @_
to:
pack 'ii', $x - 1, $y - 1
pack 'ii', @_
... and almost fell off my chair when the run time for three million
cells dropped from 287 seconds way down to 204 seconds!!
What gives?
As shown below, there is no difference in the number of op codes:
> perl -MO=Terse -e "split /:/, $z"
LISTOP (0x2bf88f8) leave [1]
OP (0x25b59a0) enter
COP (0x2bf8940) nextstate
LISTOP (0x25b59d8) split [2]
PMOP (0x25b5a60) pushre
UNOP (0x25b5a20) null [15]
PADOP (0x25b5ad0) gvsv GV (0x25b0660) *z
SVOP (0x25b5938) const [3] IV (0x25b0d80) 0
> perl -MO=Terse -e "unpack 'ii', $z"
LISTOP (0x2c387f8) leave [1]
OP (0x645908) enter
COP (0x645940) nextstate
LISTOP (0x6459d8) unpack
OP (0x6459a0) null [3]
SVOP (0x645aa0) const [2] PV (0x63fd40) "ii"
UNOP (0x645a20) null [15]
PADOP (0x645a60) gvsv GV (0x63fe00) *z
> perl -MO=Terse -e "join ':', @_"
LISTOP (0x2a9d798) leave [1]
OP (0x24f5938) enter
COP (0x24f5970) nextstate
LISTOP (0x24f5a08) join [3]
OP (0x24f59d0) pushmark
SVOP (0x24f5ad0) const [4] PV (0x24f0960) ":"
UNOP (0x24f5a50) rv2av [2]
PADOP (0x24f5a90) gv GV (0xe9b2c8) *_
> perl -MO=Terse -e "pack 'ii', @_"
LISTOP (0x2c4e318) leave [1]
OP (0x645938) enter
COP (0x645970) nextstate
LISTOP (0x645a08) pack [3]
OP (0x6459d0) pushmark
SVOP (0x645ad0) const [4] PV (0x640a80) "ii"
UNOP (0x645a50) rv2av [2]
PADOP (0x645a90) gv GV (0xffb2c8) *_
There didn't appear to be any significant difference in memory consumption either.
Anyone got any ideas?
One possibility is that the slowdown comes from split using
a /:/ regex, since regexes are comparatively slow.
Note that in Fastest way to lookup a point in a set, we were measuring lookups in isolation
and lookups use join, not split.
How to investigate further?
Devel::NYTProf?
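Before reaching for a profiler, one option is to time the four operations in isolation with the core Benchmark module. This is only a rough micro-benchmark sketch: the sample coordinates and variable names are made up for illustration, and the numbers will vary with Perl version and platform.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample coordinates (arbitrary values chosen for illustration).
my ($x, $y) = (-123456789, 987654321);

# Pre-built keys for the two decoding tests.
my $joined = join ':', $x, $y;
my $packed = pack 'ii', $x, $y;

# Each sub performs a single encode or decode.
my %tests = (
    'join'   => sub { my $k = join ':', $x, $y;      return $k },
    'pack'   => sub { my $k = pack 'ii', $x, $y;     return $k },
    'split'  => sub { my @c = split /:/, $joined;    return @c },
    'unpack' => sub { my @c = unpack 'ii', $packed;  return @c },
);

# Run each test for at least 1 CPU second and print a comparison table.
cmpthese(-1, \%tests);
```

The table mixes encoders and decoders, so the meaningful comparisons are join against pack and split against unpack. A statement-level profile from Devel::NYTProf would still be the better tool for seeing where the full program spends its time.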
Update: From running:
# Compare key lengths and check that both encodings round-trip.
for my $r ( [-123456789, 987654321], [1, 2] ) {
    my $pp = pack 'ii', @{$r};    # fixed-width binary key
    my $jj = join ':', @{$r};     # variable-width text key
    my $pplen = length $pp;
    my $jjlen = length $jj;
    print "$r->[0]:$r->[1] packlen=$pplen joinlen=$jjlen\n";
    my ($xpp, $ypp) = unpack 'ii', $pp;
    my ($xjj, $yjj) = split /:/, $jj;
    # Both decodings must recover the original coordinates.
    $xpp == $r->[0] or die;
    $ypp == $r->[1] or die;
    $xjj == $r->[0] or die;
    $yjj == $r->[1] or die;
}
we see:
-123456789:987654321 packlen=8 joinlen=20
1:2 packlen=8 joinlen=3
That is, pack/unpack always produces a hash key of 8 bytes here (two 4-byte native ints),
while with split/join the key length varies, depending on
the size of the x and y coordinates.
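The 8-byte figure follows from the 'i' template being a native signed int, so it depends on the perl build. A minimal sketch to check this on a given machine, assuming only the core Config module (the variable names are illustrative):

```perl
use strict;
use warnings;
use Config;

# Size in bytes of a native C int for this perl build.
my $intsize = $Config{intsize};

# Two native ints packed together: the key width does not depend on the values.
my $small = pack 'ii', 1, 2;
my $large = pack 'ii', -123456789, 987654321;

printf "intsize=%d small=%d large=%d\n",
    $intsize, length($small), length($large);
```

On a build where int is 4 bytes, both keys should come out at 8 bytes, matching the packlen values above.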
Update: As discovered by marioroy, 'i2' is faster than 'ii' in pack and unpack.
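To try this locally, a rough sketch along these lines first checks that the two templates produce identical keys (the 2 in 'i2' is just a repeat count) and then times them with the core Benchmark module. The coordinates and labels are made up for illustration, and any speed difference will depend on Perl version and platform.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample coordinates (arbitrary values chosen for illustration).
my ($x, $y) = (-123456789, 987654321);

# 'i2' is 'i' with a repeat count of 2, so the bytes should match 'ii'.
my $key_ii = pack 'ii', $x, $y;
my $key_i2 = pack 'i2', $x, $y;
print "same bytes: ", ($key_ii eq $key_i2 ? "yes" : "no"), "\n";

# Time both templates for packing and unpacking (1 CPU second each).
cmpthese(-1, {
    'pack ii'   => sub { my $k = pack 'ii', $x, $y },
    'pack i2'   => sub { my $k = pack 'i2', $x, $y },
    'unpack ii' => sub { my @c = unpack 'ii', $key_ii },
    'unpack i2' => sub { my @c = unpack 'i2', $key_ii },
});
```

Because the keys are byte-for-byte the same, switching templates should not change which hash entries are found, only how quickly the keys are built and decoded.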