PerlMonks  

Re^2: bracketology was Re^2: making a markovian "mad lib"

by Aldebaran (Deacon)
on Mar 28, 2019 at 22:59 UTC (#1231840)


in reply to Re: bracketology was Re^2: making a markovian "mad lib"
in thread making a markovian "mad lib"

I didn't see any probability calculations in your script that I looked at. You asked for equiprobable.

I did. That was meant to start things off, to get on the proverbial scoreboard. Equiprobable picks still work for the naive use of appositives, which remains a component of this output. The equiprobable outcomes do sum to one, so we aren't too far afield of bliako's development with cumulative probability. I've now included another way to generate the probabilities. The working version of what I have now follows, with abridged output and source:

From the command ./7.mm.pl hoops/ >1.txt:

sub_dir is hoops/
path1 is /home/bob/Documents/meditations
path2 is /home/bob/Documents/meditations/hoops
out_file is /home/bob/Documents/meditations/hoops/my_data/28-03-2019-14-46-37/28-03-2019-14-46-37.1.txt
r is east
pairs are 1.duke vs 16.ndST 8.vcu vs 9.ucf 5.msST vs 12.lib 4.vaTech vs 13.stlouis 6.maryland vs 11.belmont 3.lsu vs 14.yale 7.louisville vs 10.mn 2.miST vs 15.bradley
in play_game
matched
1 duke 16 ndST
ratio was 0.941176470588235
matched
8 vcu 9 ucf
ratio was 0.529411764705882
matched
5 msST 12 lib
ratio was 0.705882352941177
matched
4 vaTech 13 stlouis
ratio was 0.764705882352941
matched
6 maryland 11 belmont
ratio was 0.647058823529412
matched
3 lsu 14 yale
ratio was 0.823529411764706
matched
7 louisville 10 mn
ratio was 0.588235294117647
matched
2 miST 15 bradley
ratio was 0.882352941176471
[
  "1.duke",
  "8.vcu",
  "5.msST",
  "4.vaTech",
  "6.maryland",
  "14.yale",
  "10.mn",
  "2.miST",
]
winners are 1.duke 8.vcu 5.msST 4.vaTech 6.maryland 14.yale 10.mn 2.miST
string sieger is 1.duke, 8.vcu, 5.msST, 4.vaTech, 6.maryland, 14.yale, 10.mn, 2.miST
-------system out---------
It is March Madness again, and I, JAPH, wanted to make some predictions.
I pick 1.duke, 8.vcu, 5.msST, 4.vaTech, 6.maryland, 14.yale, 10.mn, 2.miST to win in this round of the . Their cardinality is 8. The Tall People looks particularly likely to win, while the rodent mascots the underdog. The current state of March Madness is .
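
The "ratio was" figures in this run are consistent with the seed arithmetic in play_game: the first team listed wins with probability equal to the opponent's seed divided by the sum of both seeds, so 1 vs 16 gives 16/17. A minimal sketch of just that calculation follows; first_team_win_prob is a name made up for illustration and is not part of 7.mm.pl.

```perl
use strict;
use warnings;
use 5.011;

# Hypothetical helper (not in 7.mm.pl): probability that the first-listed
# team wins, using the same ratio as play_game, i.e. the opponent's seed
# divided by the sum of both seeds.
sub first_team_win_prob {
    my ( $first_seed, $second_seed ) = @_;
    return $second_seed / ( $first_seed + $second_seed );
}

# First-round seed pairings in bracket order.
for my $pair ( [ 1, 16 ], [ 8, 9 ], [ 5, 12 ], [ 4, 13 ] ) {
    my ( $first, $second ) = @$pair;
    my $p = first_team_win_prob( $first, $second );
    printf "%2d vs %2d: first team %.3f, second team %.3f\n",
      $first, $second, $p, 1 - $p;
}
```

The two probabilities for a game always sum to one, so a single rand() draw compared against the ratio is enough to pick a winner.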

This output seems about right. I have tried not to fight too much in my head about what the current state of this tournament is. There's quite a bit of output, even with trials limited to 2, so let me just show the tail end and be done with it:

[
  "1.nc",
  "8.utST",
  "5.auburn",
  "13.ne",
  "6.iowaST",
  "3.houston",
  "10.setonhall",
  "2.ky",
]
winners are 1.nc 8.utST 5.auburn 13.ne 6.iowaST 3.houston 10.setonhall 2.ky
string sieger is 1.nc, 8.utST, 5.auburn, 13.ne, 6.iowaST, 3.houston, 10.setonhall, 2.ky
-------system out---------
It is March Madness again, and I, Dick Vitale, wanted to make some predictions.
I pick 1.duke, 9.ucf, 5.msST, 13.stlouis, 6.maryland, 3.lsu, 7.louisville, 2.miST to win in this round of the east. Their cardinality is 8. The Bigger, stronger, faster looks particularly likely to win, while the washouts the underdog. The current state of March Madness is .
It is hoops, baby again, and I, Dick Vitale, wanted to make some predictions.
I pick 1.gonzaga, 8.syracuse, 5.marquette, 13.vermont, 6.buffalo, 3.texTech, 10.fla, 2.miST to win in this round of the west. Their cardinality is 8. The Giants looks particularly likely to win, while the washouts the underdog. The current state of hoops, baby is .
It is hoops, baby again, and I, JAPH, wanted to make some predictions.
I pick 1.va, 9.ok, 5.wi, 4.ksST, 11.stmarys, 3.purdue, 10.iowa, 2.tn to win in this round of the south. Their cardinality is 8. The Giants looks particularly likely to win, while teams under indictment the underdog. The current state of hoops, baby is .
It is the Big Dance again, and I, {$content_provider}, wanted to make some predictions.
I pick 1.nc, 8.utST, 5.auburn, 13.ne, 6.iowaST, 3.houston, 10.setonhall, 2.ky to win in this round of the midwest. Their cardinality is 8. The Bigger, stronger, faster looks particularly likely to win, while the washouts the underdog. The current state of the Big Dance is .

What this shows is that I have too much repetition of the introduction and the summary. I also don't have any mechanism for going beyond round one. (Gladly taking suggestions on how I might do that.)

Current source:

#!/usr/bin/perl -w
use 5.011;
use Path::Tiny;
use utf8;
use open OUT => ':utf8';
use Data::Dump;
use Text::Template;
use POSIX qw(strftime);
binmode STDOUT, 'utf8';

my ($sub_dir) = $ARGV[0];
say "sub_dir is $sub_dir";
my $path1 = Path::Tiny->cwd;
say "path1 is $path1";
my $path2 = path( $path1, $sub_dir );
say "path2 is $path2";

## populate hash for appositives

my $data = [
  [ 'protaganist', '{$content_provider}', 'JAPH', 'Dick Vitale' ],
  [ 'event', 'March Madness', 'the Big Dance', 'hoops, baby' ],
  [ 'strongest_team', 'The Giants', , 'The Tall People', 'The Bigger, stronger, faster' ],
  [ 'weakest_team', 'teams under indictment', 'the washouts', 'the rodent mascots' ],
];

my @region = ( 'east', 'west', 'south', 'midwest' );    #4 different brackets

# unique point at which probability is assigned for teams.
my $ref_bracket = pop_brackets();

#dd $ref_bracket;

## main loop

# set trials
my $trials = 2;
my $dummy = 1;

while ( $trials > 0 ) {

  # create an output file
  my $first_second = strftime( "%d-%m-%Y-%H-%M-%S", localtime );
  my $out_file = $path2->child( 'my_data', "$first_second", "$first_second\.$dummy.txt" )
    ->touchpath;
  say "out_file is $out_file";

  for my $r (@region) {
    say "r is $r";
    my $ref_calc = calc_winners( $ref_bracket, $r );
    dd $ref_calc;
    my @sieger = @$ref_calc;
    say "winners are @sieger";
    my $anzahl = scalar @sieger;
    my $string_sieger = join( ', ', @sieger );
    say "string sieger is $string_sieger";

    # stochastic input of appositives for every "story"
    my %vars = map { $_->[0], $_->[ rand( $#{$_} ) + 1 ] } @{$data};

    # further stochastic output from "playing" the games
    $vars{"winners"}=$string_sieger;
    $vars{"cardinality"}=$anzahl;
    $vars{"region"}=$r;

    my $rvars = \%vars;    #important

    my @pfade = $path2->children(qr/\.txt$/);
    @pfade = sort @pfade;

    #say "paths are @pfade";

    for my $file (@pfade) {

      #say "default is $file";
      my $template = Text::Template->new(
        ENCODING => 'utf8',
        SOURCE => $file,
      ) or die "Couldn't construct template: $!";

      my $result = $template->fill_in( HASH => $rvars );
      $out_file->append_utf8($result);
    }
    say "-------system out---------";
    system("cat $out_file");
    say "----------------";
    $trials -= 1;
    $dummy += 1;
  }    #end for loop
}    # end while condition

sub pop_brackets {

  use 5.016;
  use warnings;

  my %vars;

  my @east = qw(1.duke 16.ndST 8.vcu 9.ucf 5.msST 12.lib 4.vaTech 13.stlouis
    6.maryland 11.belmont 3.lsu 14.yale 7.louisville 10.mn 2.miST 15.bradley);
  my @west = qw(1.gonzaga 16.farleigh 8.syracuse 9.baylor 5.marquette 12.murrayST
    4.flaST 13.vermont 6.buffalo 11.azST 3.texTech 14.noKY 7.nevada 10.fla 2.miST 15.montana);
  my @south = qw(1.va 16.gardner 8.ms 9.ok 5.wi 12.or 4.ksST 13.UCirv
    6.nova 11.stmarys 3.purdue 14.olddominion 7.cincy 10.iowa 2.tn 15.colgate);
  my @midwest = qw(1.nc 16.iona 8.utST 9.wa 5.auburn 12.nmST 4.ks 13.ne
    6.iowaST 11.ohST 3.houston 14.gaST 7.wofford 10.setonhall 2.ky 15.abilene);

  $vars{east} = \@east;
  $vars{west} = \@west;
  $vars{south} = \@south;
  $vars{midwest} = \@midwest;
  return \%vars;
}

sub calc_winners {
  use 5.016;
  use warnings;
  use Data::Dump;

  my ( $rvars, $region ) = (@_);
  my %vars = %$rvars;
  my $new_ref = $vars{$region};
  my @teams = @$new_ref;

  #say "east is @east";
  my @pairs;
  while (@teams) {
    my $first = shift @teams;
    my $next = shift @teams;
    push @pairs, "$first vs $next";
  }
  say "pairs are @pairs";
  my $ref_pairs = \@pairs;

  #dd $ref_pairs;
  my $ref_winners = play_game($ref_pairs);
  return $ref_winners;

  # end calc_winners
}

sub play_game {
  use 5.016;
  use warnings;

  my $ref_pairs = shift;
  my @pairs = @$ref_pairs;
  say "in play_game";

  #say "pairs are @pairs";
  my @winners;
  for my $line (@pairs) {
    if ( $line =~ /^(\d+)\.(\w+) vs (\d+)\.(\w+)$/ ) {
      say "matched";
      say "$1 $2 $3 $4";
      my $denominator = $1 + $3;
      my $ratio = $3 / $denominator;
      say "ratio was $ratio";
      my $random_number = rand();
      if ( $random_number < $ratio ) {
        push @winners, "$1.$2";
      }
      else {
        push @winners, "$3.$4";
      }

    }
  }

  my $ref_winners = \@winners;
  return $ref_winners;
}    # end play_game

__END__

So if you want to utilize Bliakos methods you need a corpus that relates. Like, previous scores. You are using rankings (seeds), but that will just give you what you already have.

The problem with that is that most of these teams will not have played each other. Trying to impose some type of transitive property on whether one team is better than another is fraught, but there are statistical ways of ranking teams based on strength of schedule. I think the thing to consult is previous years of this bracket. It has been 64 teams since the 80's. The data are all readily available. The selection committee divides the teams into 4 regional brackets, with teams ranked 1 to 16. I can't recall the last time I saw a number one lose to a 16. Outsmarting the selection committee is the passion of every serious March Madness student. It can be attempted with conference strengths, points, rebounds, assists, anything quantitative, and it just became something that I am allowed to bet on. (You likely are too.)
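
One way to use previous brackets, as a rough sketch: tally how often the better seed has won each seed matchup and use that frequency in place of the seed ratio. The record layout and the matchup_rates name are assumptions for illustration, and the rows in @toy_games are made up purely to show the shape of the input; they are not real tournament results.

```perl
use strict;
use warnings;
use 5.011;

# Hypothetical record layout: [ better_seed, worse_seed, seed_that_won ].
# Returns a hash ref mapping "better-worse" to the fraction of games
# the better seed won.
sub matchup_rates {
    my (@games) = @_;
    my ( %played, %won );
    for my $game (@games) {
        my ( $better, $worse, $winner ) = @$game;
        my $key = "$better-$worse";
        $played{$key}++;
        $won{$key}++ if $winner == $better;
    }
    my %rate;
    $rate{$_} = ( $won{$_} // 0 ) / $played{$_} for keys %played;
    return \%rate;
}

# Made-up rows, only to show the shape of the input. NOT real results.
my @toy_games = ( [ 8, 9, 8 ], [ 8, 9, 9 ], [ 8, 9, 9 ], [ 5, 12, 5 ], [ 5, 12, 12 ] );

my $rates = matchup_rates(@toy_games);
printf "%s: better seed won %.2f of the games\n", $_, $rates->{$_}
  for sort keys %$rates;
```

With real historical results loaded in that shape, the returned rate could be compared against rand() exactly as the seed ratio is in play_game.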

Relevant files available here.

Thanks for your comments,

Replies are listed 'Best First'.
Re^3: bracketology was Re^2: making a markovian "mad lib"
by trippledubs (Deacon) on Mar 30, 2019 at 16:32 UTC

    Ok, I sent you a pull request. Here are some changes I would make to the syntax:

    • default options in case user does not specify
    • hash slice
    • ternary operator
    • eliminate superfluous variables

    diff --git a/7.mm.pl b/7.mm.pl
    old mode 100644
    new mode 100755
    index 84efc24..7db8750
    --- a/7.mm.pl
    +++ b/7.mm.pl
    @@ -8,7 +8,7 @@ use Text::Template;
     use POSIX qw(strftime);
     binmode STDOUT, 'utf8';
     
    -my ($sub_dir) = $ARGV[0];
    +my ($sub_dir) = $ARGV[0] || 'out';
     say "sub_dir is $sub_dir";
     my $path1 = Path::Tiny->cwd;
     say "path1 is $path1";
    @@ -61,14 +61,11 @@ while ( $trials > 0 ) {
         my %vars = map { $_->[0], $_->[ rand( $#{$_} ) + 1 ] } @{$data};
     
         # further stochastic output from "playing" the games
    -    $vars{"winners"}=$string_sieger;
    -    $vars{"cardinality"}=$anzahl;
    -    $vars{"region"}=$r;
    +    @vars{qw/winners cardinality region/} = ($string_sieger,$anzahl,$r);
     
         my $rvars = \%vars;    #important
     
    -    my @pfade = $path2->children(qr/\.txt$/);
    -    @pfade = sort @pfade;
    +    my @pfade = sort $path2->children(qr/\.txt$/);
     
         #say "paths are @pfade";
     
    @@ -80,8 +77,7 @@ while ( $trials > 0 ) {
             SOURCE => $file,
           ) or die "Couldn't construct template: $!";
     
    -      my $result = $template->fill_in( HASH => $rvars );
    -      $out_file->append_utf8($result);
    +      $out_file->append_utf8($template->fill_in( HASH => $rvars));
         }
         say "-------system out---------";
         system("cat $out_file");
    @@ -167,14 +163,7 @@ sub play_game {
           my $denominator = $1 + $3;
           my $ratio = $3 / $denominator;
           say "ratio was $ratio";
    -      my $random_number = rand();
    -      if ( $random_number < $ratio ) {
    -        push @winners, "$1.$2";
    -      }
    -      else {
    -        push @winners, "$3.$4";
    -      }
    -
    +      push @winners, rand() < $ratio ? "$1.$2" : "$3.$4"
         }
       }
     

    None of those really change anything, just make it "cleaner". What's better, a doctor that cures with more medicine or less?

    The data structure you use is a string '$rank.$team', that's not good! The rank should be a property of the team and division. And really if you think object oriented you have teams, divisions, games, lots of directions to go. You need a really flexible win_predictor() function or class, because it's likely to grow a lot. And then the templating is almost a totally separate thing, which is also going to change.
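
To make the data-structure point concrete, here is a minimal sketch in which the seed and region are properties of a team record and the predictor is passed in as a code reference. The names make_team, seed_predictor and play_one are hypothetical, and the predictor shown only reuses the seed ratio from play_game; it is a sketch of the idea, not a proposed final design.

```perl
use strict;
use warnings;
use 5.011;

# Hypothetical record layout: seed and region are properties of the team,
# not parts of a "seed.name" string.
sub make_team {
    my ( $region, $seed, $name ) = @_;
    return { region => $region, seed => $seed, name => $name };
}

# Pluggable predictor: returns the probability that $team_a beats $team_b.
# This one looks only at seeds, like the current play_game.
sub seed_predictor {
    my ( $team_a, $team_b ) = @_;
    return $team_b->{seed} / ( $team_a->{seed} + $team_b->{seed} );
}

# Play one game. The predictor and the random source are both arguments,
# so either can be swapped without touching this sub.
sub play_one {
    my ( $team_a, $team_b, $predictor, $rng ) = @_;
    $rng //= sub { rand() };
    return $rng->() < $predictor->( $team_a, $team_b ) ? $team_a : $team_b;
}

my $duke = make_team( 'east', 1,  'duke' );
my $ndst = make_team( 'east', 16, 'ndST' );
my $winner = play_one( $duke, $ndst, \&seed_predictor );
say "$winner->{seed}.$winner->{name} advances in the $winner->{region}";
```

Because the winner is still a full record, the template variables can be built from it afterwards, which keeps the prediction and the templating separate.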

    What this shows is that I have too much repetition of the introduction and the summary. I also don't have any mechanism for going beyond round one. (Gladly taking suggestions on how I might do that.)

    In play_game(), you have to decouple the data to do the prediction, which is bad. But it's almost recursive already. If you don't change the data structure, you need to put the winners back in the form the sub expects and just call play_game(\@winners). I guess bye rounds will make that a little trickier.
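
A rough sketch of that idea, keeping the "seed.name" strings: pair up neighbours in bracket order, pick winners with the same seed ratio, and feed the winners back in until one team is left. The names play_round and play_region are made up for illustration, and letting an odd team out advance unopposed is only an assumption about how a bye might be handled.

```perl
use strict;
use warnings;
use 5.011;

# One round over "seed.name" strings in bracket order: neighbours play
# each other. Uses the same seed ratio as play_game; $rng is injectable.
sub play_round {
    my ( $teams, $rng ) = @_;
    $rng //= sub { rand() };
    my @field = @$teams;
    my @winners;
    while ( @field >= 2 ) {
        my ( $first, $second ) = splice( @field, 0, 2 );
        my ($seed_1) = $first  =~ /^(\d+)\./;
        my ($seed_2) = $second =~ /^(\d+)\./;
        my $ratio = $seed_2 / ( $seed_1 + $seed_2 );
        push @winners, $rng->() < $ratio ? $first : $second;
    }
    push @winners, @field;    # assumption: an odd team out gets a bye
    return \@winners;
}

# Keep feeding the winners back in until a single team remains.
# Returns every round's field, starting with the original teams.
sub play_region {
    my ( $teams, $rng ) = @_;
    my @rounds = ($teams);
    while ( @{ $rounds[-1] } > 1 ) {
        push @rounds, play_round( $rounds[-1], $rng );
    }
    return \@rounds;
}

my @east = qw(1.duke 16.ndST 8.vcu 9.ucf 5.msST 12.lib 4.vaTech 13.stlouis
  6.maryland 11.belmont 3.lsu 14.yale 7.louisville 10.mn 2.miST 15.bradley);
my $rounds = play_region( \@east );
say "after round $_: @{ $rounds->[$_] }" for 1 .. $#$rounds;
```

Since the first-round winners come out in bracket order, pairing neighbours gives the 1/16 winner against the 8/9 winner and so on, which matches how a region normally advances.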
