Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

This small script, apart from being mostly useless, generates credible sentences from a source text.

The idea is to keep track of all the possible words following a given group of words. Once this hash of arrays has been constructed, we pick a random starting point and construct the new text by looking for words that may follow the preceding group.

The input text should be large enough to have a broad vocabulary and sentences range. You can find good samples on the Project Gutenberg site.
You can also try to create interesting mixes of texts, by feeding two novels in the input file!

Have fun!

#!/usr/bin/perl -w # # transcramble - random text generator # author: Briac Pilpré # # v.0.1 : initial release. # v.0.2 : Added Storable support and Getopt::Mixed # for a nicer interface. # # idea 'borrowed' from : http://www.eblong.com/zarf/markov/ use strict; use locale; use Storable; use Getopt::Mixed; use vars qw($VERSION $opt_g $opt_w $opt_s $opt_i $opt_h $opt_v ); $VERSION = 0.2; Getopt::Mixed::getOptions( 'g=i group>g w=i words>w s=s store>s i=s input h help>h v verbose> +v' ); $|++; # usage: transcramble group words stored_file > output if ($opt_h){ print <<"_HELP_"; transcrambler v$VERSION usage: transcramble -g 3 -w 500 -i stored_chains > output usage: transcramble -g 2 -w 400 -s chains_to_store < input > outpu +t -g --group : How to group words (1, 2 or 3) -w --words : Number of words to generate -i --input : Input file generated by the -s switch -s --store : Name of the file where the chains generated are to be stored -h --help : Display this help message and exit _HELP_ exit 0; } # How to group words in the text # 1: Group words one by one, makes nonsenses # 2: List every word following each preceding couple of words # 3: List every word following each preceding triplet of words # (make the text really like the original) my $group = $opt_g || 2; # Number of words to spew my $words = $opt_w || 1000; # Stored chains file my $in_file = $opt_i; # File to store the generated chains (can't be used with -i though) my $out_file = $opt_i ? undef : $opt_s; # Frequency list creation my ( %tuple, @remains ); if ($in_file){ print STDERR "Retrievieng the chains from $in_file:" if $opt_v; %tuple = %{ retrieve($in_file) }; print STDERR " OK\n" if $opt_v; } else { print STDERR "Creating the chains. This may take a while\n" if $op +t_v; while (<>) { # Split the current line for every 'word', including punctuati +on my @w = split ( /[^\w\n,.!"?;’'-]+/, $_ ); # If there was some words left of the previous line, add them @w = ( @remains, @w ); while (@w) { # If we have enough words to fill the group if ( scalar @w >= $group + 1 ) { # Shove the next word in the word array push @{ $tuple{ join ( ' ', @w[ 0 .. $group - 1 ] ) } +}, $w[$group]; # And move to the next word shift (@w); } else { # At the end of the line, we keep the remaining words. @remains = @w; undef @w; } } } print STDERR "Saving chains to $out_file:" if $opt_v; store \%tuple, $out_file; print STDERR " OK\n" if $opt_v; } # Used to pick a random key in the hash my $lh = scalar keys %tuple; print STDERR "Done constucting the tuples. $lh tuples\n"; # Select a random key to begin my $key = ( keys %tuple )[ int rand($lh) ]; while ( $words-- ) { my @first = split (/ /, $key ); shift @first; # Pick a word at random in the possible following words my $last = @{ $tuple{$key} }[ int rand( @{ $tuple{$key} } ) ]; print " $last"; # And see if we can continue with the sentence or if ( defined $tuple{ join ( ' ', @first, $last ) } ) { $key = join ( ' ', @first, $last ); } # Start a new one by picking a new starting group. else { $key = ( keys %tuple )[ int rand($lh) ]; } } print STDERR "\nTranscrambling done.\n\n" if $opt_v;
<kbd>--
my $OeufMayo = new PerlMonger::Paris({http => 'paris.mongueurs.net'});</kbd>

In reply to Transcramble - Random text generator by OeufMayo

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (7)
As of 2024-03-19 01:59 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found