Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Transcramble - Random text generator

by OeufMayo (Curate)
on Jul 09, 2001 at 02:06 UTC ( #94856=CUFP: print w/ replies, xml ) Need Help??

This small script, apart from being mostly useless, generates credible sentences from a source text.

The idea is to keep track of all the possible words following a given group of words. Once this hash of arrays has been constructed, we pick a random starting point and construct the new text by looking for words that may follow the preceding group.

The input text should be large enough to have a broad vocabulary and sentences range. You can find good samples on the Project Gutenberg site.
You can also try to create interesting mixes of texts, by feeding two novels in the input file!

Have fun!

#!/usr/bin/perl -w # # transcramble - random text generator # author: Briac Pilpré # # v.0.1 : initial release. # v.0.2 : Added Storable support and Getopt::Mixed # for a nicer interface. # # idea 'borrowed' from : http://www.eblong.com/zarf/markov/ use strict; use locale; use Storable; use Getopt::Mixed; use vars qw($VERSION $opt_g $opt_w $opt_s $opt_i $opt_h $opt_v ); $VERSION = 0.2; Getopt::Mixed::getOptions( 'g=i group>g w=i words>w s=s store>s i=s input h help>h v verbose> +v' ); $|++; # usage: transcramble group words stored_file > output if ($opt_h){ print <<"_HELP_"; transcrambler v$VERSION usage: transcramble -g 3 -w 500 -i stored_chains > output usage: transcramble -g 2 -w 400 -s chains_to_store < input > outpu +t -g --group : How to group words (1, 2 or 3) -w --words : Number of words to generate -i --input : Input file generated by the -s switch -s --store : Name of the file where the chains generated are to be stored -h --help : Display this help message and exit _HELP_ exit 0; } # How to group words in the text # 1: Group words one by one, makes nonsenses # 2: List every word following each preceding couple of words # 3: List every word following each preceding triplet of words # (make the text really like the original) my $group = $opt_g || 2; # Number of words to spew my $words = $opt_w || 1000; # Stored chains file my $in_file = $opt_i; # File to store the generated chains (can't be used with -i though) my $out_file = $opt_i ? undef : $opt_s; # Frequency list creation my ( %tuple, @remains ); if ($in_file){ print STDERR "Retrievieng the chains from $in_file:" if $opt_v; %tuple = %{ retrieve($in_file) }; print STDERR " OK\n" if $opt_v; } else { print STDERR "Creating the chains. This may take a while\n" if $op +t_v; while (<>) { # Split the current line for every 'word', including punctuati +on my @w = split ( /[^\w\n,.!"?;’'-]+/, $_ ); # If there was some words left of the previous line, add them @w = ( @remains, @w ); while (@w) { # If we have enough words to fill the group if ( scalar @w >= $group + 1 ) { # Shove the next word in the word array push @{ $tuple{ join ( ' ', @w[ 0 .. $group - 1 ] ) } +}, $w[$group]; # And move to the next word shift (@w); } else { # At the end of the line, we keep the remaining words. @remains = @w; undef @w; } } } print STDERR "Saving chains to $out_file:" if $opt_v; store \%tuple, $out_file; print STDERR " OK\n" if $opt_v; } # Used to pick a random key in the hash my $lh = scalar keys %tuple; print STDERR "Done constucting the tuples. $lh tuples\n"; # Select a random key to begin my $key = ( keys %tuple )[ int rand($lh) ]; while ( $words-- ) { my @first = split (/ /, $key ); shift @first; # Pick a word at random in the possible following words my $last = @{ $tuple{$key} }[ int rand( @{ $tuple{$key} } ) ]; print " $last"; # And see if we can continue with the sentence or if ( defined $tuple{ join ( ' ', @first, $last ) } ) { $key = join ( ' ', @first, $last ); } # Start a new one by picking a new starting group. else { $key = ( keys %tuple )[ int rand($lh) ]; } } print STDERR "\nTranscrambling done.\n\n" if $opt_v;
<kbd>--
my $OeufMayo = new PerlMonger::Paris({http => 'paris.mongueurs.net'});</kbd>

Comment on Transcramble - Random text generator
Download Code
Re: Transcramble - Random text generator
by larsen (Parson) on Jul 09, 2001 at 15:30 UTC
    Very nice :)

    And very interesting, making some considerations.
    Grouping words one by one makes nonsense, but we have a text with the same stathistical properties of the original text. Grouping word two by two of three by three is the same, but we don't lost meaning (well, better, we don't lost too much meaning).

    Using letters instead of words as units, I think (this is only a conjecture) we could "mimic" the sound of the original language, realizing a new kind of your Acme::Translator :). A kind that could be difficult to obtain even making Acme::Translator having Sex with Transcramble - Random text generator (brrr....).

    Even more interesting experiments could be done using music instead of texts. Using something from the MIDI suite we could import a melody and use it to generate other similar (in the sense described by your program) melodies.

    Well, just a thought, but could drive to interesting projects (I've always been interested in generative arts).

Re: Transcramble - Random text generator
by thraxil (Prior) on Jul 10, 2001 at 00:09 UTC

    cool.

    i did a very similar thing with markov chains on my weblog. whenever someone posts an entry, it updates the frequency count table (it all runs off a postgres db). then, from that, it generates a random post at every time the page is loaded. the idea was that it would be sort of a "hive-mind" of all the people who post there.

    when i was writing it i had an idea that i still haven't had time to follow through on, but that your code is probably better suited for. i wanted to create a bot that would go through the discussions on slashdot, and for each thread, pick the highest rated posts, generate a frequency table from that and then post a random message to that discussion. then i wanted to watch how those posts get moderated. i'm pretty sure the markov script could get some pretty good karma.

    anders pearson

Re: Transcramble - Random text generator
by John M. Dlugosz (Monsignor) on Jul 10, 2001 at 02:48 UTC
    Many years ago, I saw a program called Bable (or something spelled similarly to that) which did this based on sample files. I think it used a Markov chain of two previous words, and punctuation was part of the word.

    The best use of it was when some crazy posts started appearing in the discussion group I hung out in. We couldn't figure out at first if this person was totally psycotic or just wrote so badly that nobody could figure out what she was trying to say! After a few attempts to communicate with her didn't work out, I used that program to mimic her writing style and replied to one. It drew a lengthy reply! We seem to have carried on a conversation, though I have no idea what it meant. It was fascenating in that the program included her pet tag-phrases and other affectations, as well as the overall flavor.

    —John

(tye)Re: Transcramble - Random text generator
by tye (Cardinal) on Jul 10, 2001 at 09:15 UTC

    Quoting some random web source whose URL I didn't bother to note:

    There's an example of a travesty generator implemented in Perl in the *first* edition of the manual by Larry Wall and Randal L. Schwartz, on page 326 (sadly, it's been deleted from the 2nd ed).
    So we see that those who forget history are doomed to reinvent it. ;)

            - tye (but my friends call me "Tye")
      And here it is:

      from the first Programming Perl (back when it was still pink)

      #!/usr/bin/perl # Usage travesty [files] # First analyse the input. while(<>) { # handle article headers and ">>>" quotes. next if /^\./; next if /^From / .. /^$/; next if /^Path: / .. /^$/; s/^\W+//; # Do each word. push(@ary,split(' ')); while ($#ary > 1) { $a = $p; $p = $n; $w = shift(@ary); $n = $num{$w}; if ($n eq '') { push(@word,$w); $n = pack('S',$#word); $num{$w} = $n; } $lookup{$a . $p} .= $n; } } # Now spew out the words, based on the frequencies. If there # is more than one possibility to choose from choose one # randomly. for (;;) { $n = $lookup{$a . $b}; ($foo,$n) = each(lookup) if $n eq ''; # A bootstrap. $n = substr($n,int(rand(length($n))) & 0177776,2); $a = $p; $p = $n; ($w) = unpack('S',$n); $w = $word[$w]; # See if word fits on line; $col += length($w) +1; if ($col >= 65) { $col = 0; print "\n"; } else { print ' '; } print $w; # Paragraph every 10 sentences or so. if ($w =~ /\.$/) { if (rand() < .1) { print "\n"; $col = 80; } } }
      p.s. These little gems are the reason I keep that old pink book around.
        I tried this using a large text file from Writing of the Early Church Father", at Christian Classics Ethereal Library. The results were pretty amazing! Nice that you kept the "old book" around.
Re: Transcramble - Random text generator
by beretboy (Chaplain) on Jul 12, 2001 at 16:28 UTC
    You may want to checkout Games::Dissociate

    "Sanity is the playground of the unimaginative" -Unknown
      This node was taken out by the NodeReaper on Tue Oct 26 21:05:22 2004 (EST)
      Reason: Limbic~Region delete - excerpt from a CB conversation that has no relation to thread

      For more information on this node visit: this

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: CUFP [id://94856]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (7)
As of 2014-07-30 01:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (229 votes), past polls