XP is just a number PerlMonks

### Yet more While issues

by Dandello (Scribe)
 on Mar 16, 2011 at 19:11 UTC Need Help??
Dandello has asked for the wisdom of the Perl Monks concerning the following question:

Well, sort of

I'm trying to speed up a subroutine by going from this

```sub popnum3 {
my ( \$x, \$y, \$z, \$zazb ) = @_;
if ( \$y == 0 ) {
\$aob[\$x][0] = \$initial * ( 1 + \$z );
}
else {
while (1) {
my \$xda = int rand( \$total + 1 );

if ( substr( \$zazb, \$xda, 1 ) eq 'c' ) {
\$aob[\$x][\$y] = \$aob[\$xda][ \$y - 1 ] * ( 1 + \$z );
last;
}
}
}
return \$aob[\$x][\$y];
}
Which gets - for obvious reasons - REALLY slow on large strings, to something like this:
```sub popnum3 {
my ( \$x, \$y, \$z, \$zazb ) = @_;
if ( \$y == 0 ) {
\$aob[\$x][0] = \$initial * ( 1 + \$z );
}
else {
while (1) {
my @cPosns;
push @cPosns, pos \$zazb while \$zazb =~ m{(?=c)}g;
my \$offset = splice @cPosns, rand @cPosns, 1;
\$aob[\$x][\$y] = \$aob[\$offset][ \$y - 1 ] * ( 1 + \$z );
last;
}
}
return \$aob[\$x][\$y];
}
Which I just can't seem to get my head around to get it to work without throwing 'undefined value' warnings all over the place.

For both these, \$x and \$y are 2d array references, \$z is a predetermined random number and \$zazb is a string containing a more or less random number of 'a's, 'c's, 'y's and 'x's (and may be over than 25,000 characters long). This subroutine is supposed to locate one of the 'c's in \$zazb, (without iterating over the entire string repeatedly) grab its mate from the @aob array (which is a number) and determine the value of \$aob[\$x][\$y] based on the value of 'c's mate.

This code

```my @cPosns;
push @cPosns, pos \$zazb while \$zazb =~ m{(?=c)}g;
my \$offset = splice @cPosns, rand @cPosns, 1;

Update

Despite ikegami's input, I still can't get anything even closely related to the above code to work for this instance. However, since the parameters of the model data has changed, it's no longer an issue as all the indexes for 'c' need to saved to a file for data-munging.

So, this does a good and fast job finding the indices for 'c' from @aod (the array of strings containing 'c'.)

```tie my @aod, 'Tie::File', 'bias/array.txt', recsep => "\n";

my \$letter = 'c';
my \$result = 0;
open my \$DATABASE, '>', \$datafileout or croak 'dataout not made.';
foreach my \$r (0 .. \$gener){
\$result = index(\$aod[\$r], \$letter);
while (\$result != -1){
print {\$DATABASE} qq{\$result,};
my \$offset = \$result + 1;
\$result = index(\$aod[\$r], \$letter, \$offset);
}
print {\$DATABASE} qq{\n} or croak 'unable to print';
}

close \$DATABASE or croak 'data1 not closed.';
then for the random indexes (\$off is the array of 'c' indices.)
```my @offst = split /\,/xsm, \$off[\$y - 1];
then to get the random offset
``` my \$index  = int rand (\$#offst);
\$offset = \$offst[\$index];
It may not be pretty - but it does work.

Replies are listed 'Best First'.
Re: Yet more While issues
by ikegami (Pope) on Mar 16, 2011 at 19:22 UTC

For clarity more than for speed,

```my \$offset = splice @cPosns, rand @cPosns, 1;
should be
```my \$offset = \$cPosns[rand @cPosns];
and
```while (1) {
...
last;
}
should simply be
```...

For speed, don't copy the zazb argument into a local variable. Access \$_[3] directly.

A possible tweak:

```push @cPosns, pos \$zazb while \$zazb =~ m{(?=c)}g;

might be faster as

```push @cPosns, \$-[0] while \$zazb =~ m{c}g;

or maybe even

```my \$pos = -1;
push @cPosns, \$pos while (\$pos = index(\$zazb, 'c', \$pos+1)) >= 0;

Benchmark and find out.

Which I just can't seem to get my head around to get it to work without throwing 'undefined value' warnings all over the place.

This will occur when 'c' doesn't occur in \$zazb. Based on your first snippet, you want

```sub popnum3 {
my ( \$x, \$y, \$z ) = @_;  # \$_[3] is \$zazb
if ( \$y == 0 ) {
\$aob[\$x][0] = \$initial * ( 1 + \$z );
}
else {
my @cPosns;
push @cPosns, \$-[0] while \$_[3] =~ m{c}g;
if (@cPosns) {
my \$offset = \$cPosns[rand @cPosns];
\$aob[\$x][\$y] = \$aob[\$offset][ \$y - 1 ] * ( 1 + \$z );
}
}
return \$aob[\$x][\$y];
}

I got it - at least it's stopped throwing warnings and it's a whole lot faster that what I had before.

Since popnum3 is called from inside an if/else

```else {
my \$pos = -1;
push @cPosns, \$pos while (\$pos = index(\$zazb, 'c', \$pos+1)) >= 0;
my \$offset = \$cPosns[rand @cPosns];
\$cell = sprintf '%.2f', popnum3( \$x, \$y, \$copycop, \$offset, );
}
then
```sub popnum3 {
my ( \$x, \$y, \$z, \$offset ) = @_;
if ( \$y == 0 ) {
\$aob[\$x][0] = \$initial * ( 1 + \$z );
}
else {
while (1) {
\$aob[\$x][\$y] = \$aob[\$offset][ \$y - 1 ] * ( 1 + \$z );
last;
}
}
return \$aob[\$x][\$y];
}

So far it's looking good. Thanks

What's with the useless while(1) { ...; last; } again?

Back to square one I guess. This

``` sub popnum3 {
my ( \$x, \$y, \$z ) = @_;  # \$_[3] is \$zazb
if ( \$y == 0 ) {
\$aob[\$x][0] = \$initial * ( 1 + \$z );
}
else {
my @cPosns;
push @cPosns, \$-[0] while \$_[3] =~ m{c}g;
if (@cPosns) {
my \$offset = \$cPosns[rand @cPosns];
\$aob[\$x][\$y] = \$aob[\$offset][ \$y - 1 ] * ( 1 + \$z );
}
}
return \$aob[\$x][\$y];
}
only seems to work relatively short strings, but not when the string gets much more than 500 characters. Then it throws 'uninitialized value in \$offset' errors which I'm guessing means the algorithm didn't find the one 'c' that had to be there.

I admit I'm at my wit's end here. This looks like it should work, but it doesn't. There is a 'c' in every row, so that's not the problem.

THIS works

```sub popnum3 {
my ( \$x, \$y, \$z, \$zazb ) = @_;
if ( \$y == 0 ) {
\$aob[\$x][0] = \$initial * ( 1 + \$z );
}
else {
while (1) {
my \$xda = int rand( \$total + 1 );

if ( substr( \$zazb, \$xda, 1 ) eq 'c' ) {
\$aob[\$x][\$y] = \$aob[\$xda][ \$y - 1 ] * ( 1 + \$z );
last;
}
}
}
return \$aob[\$x][\$y];
}
But takes about 5 seconds per row on the full array and since there's 8400 rows in the full run, that's about 12 hours.

Getting the run time down would be very helpful.

Thanks

Then it throws 'uninitialized value in \$offset' errors

I don't believe you when you say that snippet produced that warning.

Then it throws 'uninitialized value in \$offset' errors which I'm guessing means the algorithm didn't find the one 'c' that had to be there.

No. If it doesn't find any 'c', it never uses \$offset.

PS - Don't count out half the warning (not even counting the line number).

Create A New User
Node Status?
node history
Node Type: perlquestion [id://893618]
Approved by ikegami
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (11)
As of 2017-06-27 14:43 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
How many monitors do you use while coding?

Results (608 votes). Check out past polls.