Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number

Brute force perl

by jagh (Monk)
on Jun 12, 2007 at 14:31 UTC ( [id://620735]=perlmeditation: print w/replies, xml ) Need Help??

After reading ``This turned out not to be the disaster I expected'', I got to thinking. Instead of generating all the strings of a given length and evaling them, could I generate a list of all the valid perl of a given length?

I do, of course, want to include punctuation and numbers (perl isn't all that exciting without them, mind-bending japhs aside), so the magic string autoincrement won't work for me. The code I ended up with is below.

#!/usr/bin/perl -w use strict; # redirect stderr to /dev/null when run, if you'd # like output of any use. my $chars= "abcdefghijklmnopqrstuvwxyz0123456789 `~!@#\$%^&*()-_=+[{]}\\|;:'\",<. +>/?"; my $f = substr($chars, 0, 1); my $l = substr($chars, -1, 1); my $s = ""; while(length($s) < 5) { my @tmp; my $x; if (system("perl", "-ce", $s) == 0) { print $s . "\n"; } if ($s eq "") { $s = $f; next; } elsif ($s eq ($l x length($s))) { $s = $f x (length($s) + 1); next; } @tmp = split(//, $s); $x = @tmp - 1; while (1) { # loop through the string, starting at the end if ($tmp[$x] eq $l) { # Is it the last char? $tmp[$x] = $f; # If so, make it the first $x--; # And move on to the next one } else { # If not, just increment it. $tmp[$x] = substr($chars, index($chars, $tmp[$x]) + 1, 1); last; } } $s = join('', @tmp); }

There's a big problem with this code, though. It's slow. Like, really slow. I think most of the slowdown comes from having to spawn perl every iteration of the loop[0]. I'm confident there's a better way to do the syntax checking, but I don't know what it is. So, due to that limitation the results aren't very exciting.

I think a more intelligent algorithm, something that's aware of perl's syntax, could be used to get better results faster. At the very least, random generation would be more interesting on the longer strings (if the length is 8, every string length 7 needs to be generated before the front character changes...)

-- [0] The rest is from my poor code :)

Replies are listed 'Best First'.
Re: Brute force perl
by jdporter (Paladin) on Jun 12, 2007 at 15:56 UTC

    If you're looking for perl programs of a certain specified length, then genetic algorithms might work well. The building block hypothesis seems particularly applicable.

    (Note: even though you'd be evolving code, it's not really genetic programming, since that does functional composition at the semantic level, whereas here we're mucking with characters of raw source code.)

    A word spoken in Mind will reach its own level, in the objective world, by its own weight
      Hm, I really don't see the point in using GAs for such an enumeration. In contrast, you would have a lot more expensive fitness evaluations than necessary (Except when you use a cache, but then the GA overhead is still a waste of CPU time).
        you would have a lot more expensive fitness evaluations than necessary

        Why? With the OP's brute force method, you're fully enumerating the search space. With a GA, you're "intelligently" pruning, or picking a path through the search space. That's the whole point, actually. As I said, the building block hypothesis makes sense here. Building blocks such as $_ or print are more likely to survive, for example.

        A word spoken in Mind will reach its own level, in the objective world, by its own weight

      I thought about GAs for a bit. What I'm missing is a good fitness function..

      I suppose something like the following psuedocode might produce results:

      return ( valid_perl_p($individual) ? length($individual) : 0 );

      It looks like I have some work to do :)

        What I'm missing is a good fitness function.

        eval? Safe?

        A word spoken in Mind will reach its own level, in the objective world, by its own weight
Re: Brute force perl
by toma (Vicar) on Jun 12, 2007 at 17:16 UTC
    Challenge: generate the numbers 0 to 100 using math expressions composed of only the characters +-/*4() . Find the minimum length expression for each number from 0 to 100. These might be the first few numbers:
    0: 4-4 1: 4/4 2: 4/4+4/4 3: 4-4/4 4: 4 ...

    The minimum length of these expressions would make an easy fitness function if you wanted to experiment with GA.

    I originally heard this problem as a math-golf-type challenge to a fifth and sixth grade math club.

    I don't remember the algorithm that I ended up using, but I do recall writing a generator that created valid expressions, so that the code wouldn't try to eval things like 4-( as an expression.

    It should work perfectly the first time! - toma

      This is actually a textbook example of a genetic programming problem. Of course, you could attack it with GA, but GP is a much better fit.

GA perl (was: Brute force perl)
by jagh (Monk) on Jun 12, 2007 at 17:42 UTC

    Ok. A semi-working GA implementation. This fitness function was the best I could think of, and suggestions are very welcome. Also, sorry for not DRYing the code. That's what happens when you get me excited :)

    You will, again, want to redirect stderr to /dev/null :). With 50 generations, it took about 17 seconds wall-clock time to run on my machine.

    Update: Hmm. One thing I notice is that things like '<text>' or q~<text>~ get generated a lot. I've subtracted points in my fitness function for that sort of ``cheating'', but as I do so, the GA just ``finds'' more ways to ``cheat''. As it is, unless I can think of some more sophisticated fitness function (one that somehow calculates actual usefulness of the code), the current one will just keep growing as I try to (clumsily) parse the perl. (And we all know that goes. It isn't "Nothing but perl and jagh can parse Perl", is it? :)

    Update2: Fixed the mess that was the call to $ga->init(). Thank you, jdporter.

    Update3: Tweaked the fitness function. I'm also putting the code at my site, to avoid spamming PM with trivial updates as I play. Anything significant will end up back here.


      my @chars = ( 32..34,36..64,91..126 ); $_ = chr($_) for @chars; $ga->init([ map { [@chars] } 1..20 ]);
Re: Brute force perl (ex)
by tye (Sage) on Jun 12, 2007 at 19:55 UTC

    See all valid Perl code ("use strict" is in force, else the results are less interesting) consisting of 3 typical characters with optional whitespace:

    use strict; use Algorithm::Loops 'NestedLoops'; my @chars= map chr($_), ord(' ') .. ord('~'); my @spaces= ( "", "\n", shift @chars ); my $iter= NestedLoops( [ ( \@chars, \@spaces ) x 2, \@chars, ], { OnlyWhen => sub { @_ % 2 }, }, ); sub dump { 1; } sub exit { 1; } $SIG{__WARN__}= sub { 1; }; close STDIN; my @code; while( @code= $iter->() ) { my $code= join '', @code; if( eval "$code; 1" ) { print "($code)\n"; } }

    It is trivial to expand this to look at larger code strings... but the results are less interesting, again. :)

    - tye        

Re: Brute force perl
by duelafn (Parson) on Jun 12, 2007 at 19:17 UTC

    In Chapter 6 of Higher-Order Perl, Dominus builds all strings that match a given regular expression. One could generalize/modify this to generate valid perl code.

    Good Day,

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://620735]
Approved by Corion
Front-paged by Old_Gray_Bear
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others taking refuge in the Monastery: (5)
As of 2024-04-16 05:00 GMT
Find Nodes?
    Voting Booth?

    No recent polls found