http://www.perlmonks.org?node_id=11132707

Marshall has asked for the wisdom of the Perl Monks concerning the following question:

I wrote some code today that worked most of the time but not all of the time -> different runs of Perl produced different results.
I have fixed the code so that it is reliable. My question is "what happens inside Perl such that my first version was unreliable?"

I started playing a scrabble like game on my cellphone and once I got to level 108, things got hard. So I whipped out a cheater program! Pretty easy to do and my quick hack worked fine except that it made mistakes but only some of the time.

Here is the output of a "good" run:

list of letters or pattern: :otrwreh
master_letter_freq: $VAR1 = {
  'w' => 1,
  'h' => 1,
  't' => 1,
  'o' => 1,
  'r' => 2,
  'e' => 1
};
list of letters or pattern: ---w
thew
trow
list of letters or pattern:
The letters "otrwreh" are its "vocabulary" for forming words. "r" can be repeated, but none of the other letters.
The pattern "---w" means: show all 4-letter words in the dictionary that end in "w" and are formed using only the letters "otrweh".

All well and good until I saw this run:

list of letters or pattern: :otrwreh
list of letters or pattern: ---w
thew
trow
whew    ## this is the problem!! "w" is not allowed to repeat!
list of letters or pattern: exit
Ok, the offending unreliable code snippet:
RESULT: foreach (@result)
{
    my %seen;
    $seen{$_}++ for (split //,lc $_);
    foreach (keys %seen)
    {
        next RESULT if ($seen{$_} > $master_letter_freq{$_});
    }
    print "$_\n";
}

Once the regex does its thing on a huge list of possible words, I take out any results where a letter occurs more often in the result than in the pattern list of characters. So "whew" should get thrown away. But evidently the "next RESULT" is not being executed. Sometimes the code falls through the loop and "whew" gets printed even though "w" occurs too often for the "rules".

Now arguably using "next" in this way is not the brightest thing I've done. But when hacking, stuff happens and sometimes I try something new with Perl. This was just "throw away" code for my own amusement.

Of course I rewrote the code using a more traditional approach like this:

foreach (@result)
{
    my %seen;
    $seen{$_}++ for (split //,lc $_);
    my $no_print = 0;
    foreach (keys %seen)
    {
        $no_print++ if ($seen{$_} > $master_letter_freq{$_});
    }
    print "$_\n" unless $no_print;
}

So, I get the answer of "hey, don't do that!". I am curious why my hack was unreliable? It worked often enough that I figured that it was ok, until problems showed up later on. I had never used next in this way, although I have used a labeled redo before.

Update: Now that I think about it, it could be that using the default variable $_ in both loops is an issue. I normally would assign an explicit my variable for the loop variable. But in quick, "just barf it out" code, that didn't happen here.

UPDATE
I think tybalt89 came up "with the ball" at Re^3: Next from inner loop only works "most of the time".

Replies are listed 'Best First'.
Re: Next from inner loop only works "most of the time"
by hippo (Bishop) on May 18, 2021 at 12:47 UTC

    I am unable to reproduce your problem. This SSCCE never prints any output for me.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @result = 'whew';
    my %master_letter_freq = (
        w => 1,
        h => 1,
        t => 1,
        o => 1,
        r => 2,
        e => 1,
    );

    OUTER: foreach (@result) {
        my %seen;
        $seen{$_}++ for (split //,lc $_);
        foreach (keys %seen) {
            next OUTER if ($seen{$_} > $master_letter_freq{$_});
        }
        print "$_\n";
    }

    🦛

      Thanks for the effort! I'm glad that my problem statement was clear enough for you to generate this code. I should have added that I'm using AS Perl, 64bit, v5.24.

      Basically what I have is a UFO report. I do have a picture of the UFO. But in order to convince anybody that this UFO actually exists, I need to be able to make the UFO appear on demand!

      I suspect that there is something that varies between runs of Perl. What that is, I don't know. The program is a command line thing using STDIN and STDOUT. So my thinking now is to make a Perl program that itself launches another instance of Perl with my script, "perl wm.pl" (I call my program "word master"). This new test program would take over STDIN and STDOUT and run some test scenario, quit the program and then repeat this process a whole bunch of times looking for the UFO. I believe that I have saved a complete command line session from Perl launch to UFO appearance. I suppose there could be some state dependent thing so I would run all of the commands leading up to the failure.

      As I said in the title of my post, the "problem" only happens some of the time! This is not a "hard fail" and as such, I expect there to be some difficulty in making this repeatable. Anyway suggestions about how best to write such a "beat it up" test script would be welcome!

      This thing actually works well enough for my purposes "as is". But it has gotten my Perl inquisitiveness tweaked.

        Basically what I have is a UFO report. I do have a picture of the UFO. But in order to convince anybody that this UFO actually exists, I need to be able to make the UFO appear on demand!

        Quite so. Intermittent bugs are undoubtedly the worst. If your own code analysis doesn't find it then more often than not the only thing that will help is to log everything and then pore over those logs when the bug manifests.

        As I said in the title of my post, the "problem" only happens some of the time! This is not a "hard fail" and as such, I expect there to be some difficulty in making this repeatable. Anyway suggestions about how best to write such a "beat it up" test script would be welcome!

        I ran my SSCCE script there 1000 times just in case. As expected, there was no output from any of the runs. I think that we can safely say that the code there, in isolation, is not at fault and you will have to look elsewhere for the cause of the bugs. Presumably you are on MSWin32 since you mention ActiveState? I can't really help you with automating large runs of the code in that case, sorry.

        In your shoes I would slowly expand the SSCCE adding more and more of the surrounding code and running it as many times in a loop as you think prudent. Once you can reproduce the bug (at all, not every time) you can then concentrate on the last thing you added to the SSCCE and try to spot what change that has made.

        Good luck with your debugging.


        🦛

Re: Next from inner loop only works "most of the time"
by GrandFather (Saint) on May 18, 2021 at 21:49 UTC

    I don't have an answer, but I have part of an answer - hash keys are not ordered and the order changes each time the script is run. Another observation is that characters in the target word that are not in the letter frequency hash cause uninitialized value warnings (but you'd have noticed those so I'm guessing it's not that).
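
Both observations can be checked with a small sketch. The helper name exceeds is made up for illustration; the hash is the one from the "good" run. Since any single over-used letter rejects the word, the order in which keys come back should not change the verdict. A letter missing from the frequency hash is treated here as "allowed zero times", which is an assumption about the intended rules and also silences the uninitialized value warning.

```perl
use strict;
use warnings;

my %master_letter_freq = (w => 1, h => 1, t => 1, o => 1, r => 2, e => 1);

# Hypothetical helper: returns 1 if $word uses any letter more often than
# the frequency hash allows, 0 otherwise.
sub exceeds {
    my ($word) = @_;
    my %seen;
    $seen{$_}++ for split //, lc $word;
    for my $letter (sort keys %seen) {    # sort only makes the order repeatable
        # "// 0" treats a letter that is not in the hash as allowed 0 times
        return 1 if $seen{$letter} > ($master_letter_freq{$letter} // 0);
    }
    return 0;
}

print "$_: ", (exceeds($_) ? "rejected" : "kept"), "\n"
    for qw(thew trow whew thaw);
```

Here "whew" is rejected for its second "w" and "thaw" for the "a" that is not in the letter list, whatever order keys returns.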

    I strongly recommend you post a complete runnable example that demonstrates the problem. It may be that you find the issue along the way - that's not a bad thing (see I know what I mean. Why don't you?).

    I agree with the others who have suggested that the loop variables should be named. The line $seen{$_}++ for (split //,lc $_); uses the default variable in two different roles. Are you deliberately trying to make your code obscure, or are you just "saving time" (see my current sig)? If you need to revisit the code even once, you have lost all the time you saved, and more, by not using named variables.

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: Next from inner loop only works "most of the time"
by BillKSmith (Monsignor) on May 18, 2021 at 13:22 UTC
    All three of your loops use the same global variable ($_). Declare a lexical variable for at least two of them.

    #UNTESTED
    RESULT: foreach my $wrd (@result)
    {
        my %seen;
        $seen{$_}++ for (split //,lc $wrd);
        foreach my $ltr (keys %seen)
        {
            next RESULT if ($seen{$ltr} > $master_letter_freq{$ltr});
        }
        print "$wrd\n";
    }

    UPDATE: Thanks to jo37 (below),I now believe that the original code 'should' work.

    Bill

      I don't think the multiple usage of $_ is problematic here. According to perlsyn the global $_ gets localized in a foreach loop. Indeed:

      #!/usr/bin/perl
      use v5.16;
      use warnings;

      foreach (1, 2) {
          say for qw(a b);
          say;
          foreach (10, 20) {
              say;
          }
          say;
      }

      __DATA__
      a
      b
      1
      10
      20
      1
      a
      b
      2
      10
      20
      2

      Greetings,
      -jo

      $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
        Nonetheless, "for(each) my $var ..." is what's recommended, and with very good reason. For one thing, inner loops (or the code they contain) usually need to be able to refer separately to the loop variables of one or more of the outer loops. This achieves that without polluting the namespace.
Re: Next from inner loop only works "most of the time"
by jo37 (Deacon) on May 18, 2021 at 21:03 UTC

    Still puzzled.
    We seem to agree that the given code is bad style, but should work. If it doesn't, the problem might be hidden in another piece of code you thought to be irrelevant. From your "run" output it is not clear to me, if there is some kind of user interaction involved in your loops. This could change the game because

    while (<$fh>) { ... }

    would modify a non-localized $_.
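
A minimal sketch of that trap, not taken from the original program: read_unsafe and read_safe are hypothetical stand-ins for a routine that reads user input, and an in-memory filehandle replaces STDIN so the example is self-contained. Inside foreach, $_ is an alias for the current array element, so a bare while (<$fh>) called from the loop body writes into the array itself.

```perl
use strict;
use warnings;

# Hypothetical input routine: "while (<$fh>)" assigns each line to the
# global $_ without localizing it, and leaves it undef at end of input.
sub read_unsafe {
    my $input = "a\nb\n";
    open my $fh, '<', \$input or die "open failed: $!";
    while (<$fh>) { }
    close $fh;
}

# Same routine with a lexical line variable; $_ is left alone.
sub read_safe {
    my $input = "a\nb\n";
    open my $fh, '<', \$input or die "open failed: $!";
    while (my $line = <$fh>) { }
    close $fh;
}

my @safe   = qw(thew trow);
my @unsafe = qw(thew trow);

for (@safe)   { read_safe() }      # $_ aliases each element, nothing changes
for (@unsafe) { read_unsafe() }    # each aliased element gets overwritten

print "safe:   ", join(',', map { defined($_) ? $_ : 'undef' } @safe),   "\n";
print "unsafe: ", join(',', map { defined($_) ? $_ : 'undef' } @unsafe), "\n";
```

Using a lexical variable in the read loop (or a named foreach variable in the caller) sidesteps the problem. Whether the real script does anything like this inside its loops is only a guess.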

    Just a wild guess, but I ran into this trap some time ago...

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
Re: Next from inner loop only works "most of the time"
by GrandFather (Saint) on May 19, 2021 at 01:36 UTC

    I'd replace the whole inner loop with next if grep {$seen{$_} > $masterLetterFreq{$_}} keys %seen; giving:

    use strict;
    use warnings;

    my @result = ('thew', 'trow', 'whew ');
    my %masterLetterFreq = ('w' => 1, 'h' => 1, 't' => 1, 'o' => 1, 'r' => 2, 'e' => 1);

    for my $word (@result) {
        my %seen;

        $seen{$_}++ for split //, lc $word;
        next if grep {$seen{$_} > $masterLetterFreq{$_}} keys %seen;
        print "$word\n";
        next;
    }

    which avoids the goto label nonsense and avoids silly flag twiddling. Note that this code often generates a warning, but I haven't seen it print 'whew '. There is something more going on in your real code that gets 'whew' printed, but we don't have that code or a sample we can run to reproduce that error.

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: Next from inner loop only works "most of the time"
by kcott (Archbishop) on May 19, 2021 at 07:32 UTC

    G'day Marshall,

    I added strict and warnings: both missing from your code. I also show assignments to %master_letter_freq and @result: also missing from your code. The remainder is a verbatim copy of the code you posted.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    my %master_letter_freq = qw{w 1 h 1 t 1 o 1 r 2 e 1};
    my @result = qw{thew trow whew};

    RESULT: foreach (@result)
    {
        my %seen;
        $seen{$_}++ for (split //,lc $_);
        foreach (keys %seen)
        {
            next RESULT if ($seen{$_} > $master_letter_freq{$_});
        }
        print "$_\n";
    }

    foreach (@result)
    {
        my %seen;
        $seen{$_}++ for (split //,lc $_);
        my $no_print = 0;
        foreach (keys %seen)
        {
            $no_print++ if ($seen{$_} > $master_letter_freq{$_});
        }
        print "$_\n" unless $no_print;
    }

    Here's the output:

    thew
    trow
    thew
    trow

    Please run that code, exactly as is, and report the output.

    If you still see "whew" in the output, then there's potentially a bug in Perl. I'm using 5.32.0. What are you using? Post an SSCCE so we can investigate.

    If you don't see "whew" in the output, then either the code you're running is not the code you posted, or there's something else going on in the code before this that you haven't shown.

    Add code to query every variable before and/or after it changes. For instance, with Data::Dump:

    dd \%master_letter_freq;              # <-- QUERY
    dd \@result;                          # <-- QUERY
    RESULT: foreach (@result)
    {
        print "|$_|\n";                   # <-- QUERY
        my %seen;
        $seen{$_}++ for (split //,lc $_);
        dd \%seen;                        # <-- QUERY
        # ... and so on ...

    Note the pipe characters (|$_|) that may help to identify whitespace or control characters that aren't readily visible.

    If that doesn't help, try (platform-dependent):

    ./script.pl | cat -vet

    That's something of a last resort, but it may show up something if all else has failed to identify the problem.

    — Ken

      Hello kcott,

      I can confirm that my distros run your code with expected output

      general>check_perl_distro check-inner-loop-label.pl
      | thew
      | trow
      | thew
      | trow
      [OK]    \perl5.26.64bit\perl\bin\perl.exe
      | thew
      | trow
      | thew
      | trow
      [OK]    \perl5.20.64bit\perl\bin\perl.exe
      | thew
      | trow
      | thew
      | trow
      [OK]    \perl5.22.64bit\perl\bin\perl.exe
      | thew
      | trow
      | thew
      | trow
      [OK]    \perl5.24.64bit\perl\bin\perl.exe
      | thew
      | trow
      | thew
      | trow
      [OK]    \perl5.26.64bit\perl\bin\perl.exe
      | thew
      | trow
      | thew
      | trow
      [OK]    \perl-5.26.64bit-PDL\perl\bin\perl.exe

      L*

      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
      Hi Ken!

      Ok, I didn't post the entire code. I did archive a version with the labeled next, which I will post below, sans the dictionary file, which is 150+K lines. Right now I don't remember where I got this word list from - sorry. I don't claim that this is a masterpiece of Perl code. I didn't write this with the intention of posting it here and as such it contains some things that I wouldn't want new Perlers to emulate. There are some unnecessary lc operations, I would use explicit my variables for the loops, etc.

      Here is how this code came about... I was playing "Word Nut" on my cellphone. I got up to a high level with the addition of turning the difficulty level up to the max. I was confronted with a grid of boxes like a crossword puzzle, except that there are no sentences for clues! The puzzle that "triggered" me had just 5,6,7 letter boxes, no "easy" 3 or 4 letter words. I typed in something more than 20 completely valid 5 letter words (formed according to the 7 letters given). None of these words appeared in the puzzle, but the program counts them and I got bonus points for finding valid words that don't appear.

      You can get a hint from the program, but that means that you are required to watch a bunch of ads. For some reason that offended me and I decided to retaliate with some Perl code!

      In order to "get started", it is important to find a word, any word that appears in the crossword grid. Once that is done, there will be an intersecting word and you will know that for example, for that word, the 2nd letter is an "r" or whatever.

      For the "triggering" puzzle, I typed in 32 valid 5 letter words with the help of my "cheater" program before I got the first "hit" in the crossword grid! Before completely solving the puzzle, there were 42 valid words that did not appear in the grid!

      These puzzles use uncommon words like "resaw". I am a native born, reasonably read English speaker and although I understand that word, I've never used it myself in a sentence, read a sentence with it or heard anybody else say that word!

      Here is the code... I don't see any obvious way that this can "fail" other than the code that I did post. Please also note that this code works "most of the time".

      Added: sorry if I offended anybody by describing the origins of the code. That is relevant here. I wanted to post some code that I have seen fail. If I change the code, that might affect its ability to "fail". I am not asking for help to make the code "better". I just don't understand why this can fail "some, but not all of the time". I do acknowledge that there are style failings in this code. I was just explaining that I did this quickly and hence some style failings.

        First guess:

        $master_letter_list = lc $1;  ## Force all letter lists to LOWER CASE only
        %master_letter_freq = ();     # <--- ** MISSING LINE **
        for (split //,$master_letter_list)

        Every time $master_letter_list is changed, the frequency hash must be cleared before being added to.

        The occasional "failure" is due to the differing sequences of user interaction.
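
A small sketch of how that plays out, assuming the frequency hash lives for the whole interactive session as the snippet above suggests. The helper set_letters and its $reset flag are invented for illustration and are not from the original script.

```perl
use strict;
use warnings;

my %master_letter_freq;    # lives for the whole session, like a file-level variable

# Hypothetical helper mimicking the entry of a new letter list.
# $reset says whether the hash is cleared before it is refilled.
sub set_letters {
    my ($letters, $reset) = @_;
    %master_letter_freq = () if $reset;
    $master_letter_freq{$_}++ for split //, lc $letters;
}

set_letters('otrwreh', 0);
set_letters('otrwreh', 0);    # a second list typed later in the session, no reset
print "without reset: w => $master_letter_freq{w}\n";    # counts have accumulated

set_letters('otrwreh', 1);    # cleared first, so counts start from zero
print "with reset:    w => $master_letter_freq{w}\n";
```

Once "w" has accumulated a count of 2, "whew" passes the frequency check, which would match the "bad" run. Any earlier letter list containing a "w" would have the same effect, so whether the bug shows up depends on what was typed earlier in the session.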

        I crunched your code down a little to turn it into a test script. You may like to try running it to see if it reproduces the issue. If it doesn't you could try adding back elements of your script until you get back to a UFO. At that point you will have identified a critical component (whatever you added last) and that may help us figure out where the issue is.

        use strict;
        use warnings;

        print "Starting\n";

        for (1 .. 100000) {
            my @results = draw();

            print "@results\n" if 2 != @results;
        }

        print "Done\n";

        sub draw {
            my @dic = ('thew', 'trow', 'whew');
            my %master_letter_freq;
            my $master_letter_list = 'otrwreh';
            my $regex = '';

            ++$master_letter_freq{$_} for split //, $master_letter_list;

            for (split //, '---w') {
                $regex .= $_ eq '-' ? "[$master_letter_list]" : $_;
            }

            my @result = grep {/^$regex$/i} @dic;
            my @results;

            RESULT: foreach (@result) {
                my %seen;
                $seen{$_}++ for (split //, lc $_);

                foreach (keys %seen) {
                    next RESULT if ($seen{$_} > $master_letter_freq{$_});
                }

                push @results, $_;
            }

            return @results;
        }

        On my machine, running Perl 5.30.1, this prints:

        Starting
        Done

        Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

        Firstly, apologies for the slow response: I've been ill and haven't logged in for some days.

        I received a couple of private messages from you, possibly (and I'm completely guessing here) due to frustration if you thought I was ignoring you: not the case. And regarding the downvote: can't help; wasn't me; I've only just seen your post.

        I do tend to avoid multiple uses of $_ in the same construct. Even when it's not a problem with actual logic, it does reduce readability and maintainability. I generally concur with what others have already written about that, so I'll leave it there.

        I was genuinely interested in where your problem might lie. My posted code, using your exact code, was an attempt to identify the issue: as shown, I couldn't reproduce it.

        I see there's been a lot more discussion since I was last here. I don't believe I can add anything to that.

        — Ken

Re: Next from inner loop only works "most of the time"
by tybalt89 (Monsignor) on May 19, 2021 at 19:52 UTC

    Why does your 'bad' run not show the %master_letter_freq hash like the 'good' run does? Can there be some problem in the calculation of that hash? Does the 'w' from the pattern get counted somehow? Is the 'w' from the pattern already on the board? If so, with a 'w' on your 'rack', words should be allowed to have two 'w's.

    As a side note, my favorite way of answering the question "What words can be made with these letters?" is to match against a regex. Here's an example:

    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11132707
    use warnings;

    my $letters = 'otrwreh';
    my $have = '...w';  # NOTE: using . instead of -

    my $pattern = join '', map "$_?", sort split //, $letters;
    my $regex = qr/^$pattern$/im;
    print "regex: $regex\n\n";

    @ARGV = '/usr/share/dict/words';
    /^$have$/i && (join '', sort /./g) =~ $regex && print while <>;

    which outputs the computed regex and the matching words:

    regex: (?^mi:^e?h?o?r?r?t?w?$)

    thew
    trow

    This allows for only as many of a letter as initially specified.
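
The same sorted-letters idea can be wrapped in a function to make the per-word check easier to follow. This is only a sketch: can_make is a hypothetical name, and it assumes the rack contains plain letters only, since regex metacharacters are not escaped.

```perl
use strict;
use warnings;

# Hypothetical helper: can $word be spelled from the letters in $rack,
# using each letter at most as often as it appears there?
sub can_make {
    my ($word, $rack) = @_;
    # 'otrwreh' becomes the pattern 'e?h?o?r?r?t?w?'
    my $pattern = join '', map "$_?", sort split //, lc $rack;
    # 'whew' becomes 'ehww', which needs two w's and so cannot match
    my $sorted  = join '', sort split //, lc $word;
    return $sorted =~ /^$pattern$/ ? 1 : 0;
}

print "$_: ", (can_make($_, 'otrwreh') ? 'ok' : 'no'), "\n"
    for qw(thew trow whew);
```

Sorting both the rack and the candidate word puts repeated letters next to each other, so each optional slot in the pattern can be used at most once. That is what caps the count per letter.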

      Good questions! A version of the full code using labeled next is posted at Re^2: Next from inner loop only works "most of the time".

      I actually first thought that there was some problem in the hashes. So I added some debug code to look at that issue. You can see what I did.

      I saw the error and then added debug stuff. That's why the bad run doesn't have that info.

      You will see how I generated the regex. I did this in a straightforward way. My way generates some results that are not valid, but it was very easy to do while "hacking". Taking out a few extraneous results seemed to be a good way to go at the time. And it still seems like a good idea. A completely optimal regex is not necessary here.

      I made the interface so that I type in exactly the list of letters the puzzle presents. If there are 2 "e"'s, I type in 2 "e"'s.

      Now that I think about it, I make a separate reply, re: regex sets:

      I am assuming that a repeated letter in a regex set is "harmless". That /[icee][icee][icee]/ should match "ice".

      I am wondering if there is some regex weirdness if a letter repeats within a set?

        ... is [there] some regex weirdness if a letter repeats within a set?

        No. [icee] is exactly equivalent to [ice], [iceeee], [eiece], etc.
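
A quick way to convince yourself, as a sketch with made-up test words: each of the three classes below is applied three times in a row, and they all accept and reject exactly the same strings.

```perl
use strict;
use warnings;

# Three spellings of the same character class, each repeated three times.
for my $class ('[ice]', '[icee]', '[eiece]') {
    my $re   = qr/^(?:$class){3}$/;
    my @hits = grep { $_ =~ $re } qw(ice eee icy);
    print "$class matches: @hits\n";
}
```

All three classes match "ice" and "eee" and reject "icy". The "eee" match also shows why a class-based regex alone cannot limit how often a letter is used, which is why a separate frequency check is needed afterwards.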


        Give a man a fish:  <%-{-{-{-<

Re: Next from inner loop only works "most of the time"
by karlgoethebier (Abbot) on May 19, 2021 at 10:03 UTC

    BTW, what about singular and plural?

    for my $result ( @results ) {
        …;
        for my $key ( keys %seen ) {
            …,
        }
        …;
    }

    N.B.: Untested

    «The Crux of the Biscuit is the Apostrophe»

      My "cheater program" is not designed to solve the puzzle automatically. It is to give me hints when I get "stuck". My 150K line dictionary contains lots of plural words. "car", "cars" etc. If on an easy puzzle, I hit on "saw", I will immediately type in "saws", but I very likely might miss say, "resaw" or "resawer".

      Some of these puzzles are ridiculously hard. I think you have to have a mutant brain, a heck of a lot of experience, or a Perl program assistant to solve some of these things. Or, as the program authors want you to do, watch a lot of ads in order to get hints.

Re: Next from inner loop only works "most of the time"
by BillKSmith (Monsignor) on May 20, 2021 at 10:37 UTC
    Did you consider hardware failure? I once had a program that failed intermittently on only one of six computers. (All other programs ran correctly on the offending computer.) I know that this is extremely rare, but it is probably easy to test.
    Bill
Re: Next from inner loop only works "most of the time"
by karlgoethebier (Abbot) on May 20, 2021 at 17:58 UTC

    One more thing: As far as I understand, you play «something» on your mobile device and wrote a command line tool for support or cheating, right? A quite bizarre scenario. You should probably consider writing an app for mobile devices. See this for Android and this as well as this for Apple devices.

    «The Crux of the Biscuit is the Apostrophe»

      No. I was just running "Word Nut" on my android phone from the Google Play store.

      I am actually a very inexperienced smart phone user. I have to go to my doc's office once a month for basically an infusion deal. This does hurt a bit and I got onto this thing as a brain distraction. Then I got "hooked" and wanted to screw with this thing when I had access to my home computer.

      The current puzzle word is:
      :lceyols
      --s-
      The only suggestion my program yields is "lose", but I could already see that for myself.
      If I watch enough ads, I'll get another letter and will figure it out.

        cosy

        The way forward always starts with a minimal test.