Greetings Fellow Monks,

Later this month, I will be giving a talk titled "Machine Learning Made Easy with Perl". A preliminary outline:

Part I: Exploratory Data Analysis

Part II: Decision Support Systems

Part III: Pattern Recognition

In each part, I plan to discuss the problem, the strategy to solve it, the choice of machine learning technique and the main configuration issues the participants need to understand to successfully deploy machine learning applications. I will also show snippets of the code used. For example:

For data gathering using Finance::YahooQuote:

#!/usr/bin/perl
use strict;
use warnings;
use Finance::YahooQuote;

my @symbols = ("IBM","DELL","GOOG","YHOO","MSFT","ORCL","SAP","COGN","BOBJ");
my @columns = ("Last Trade (Price Only)","Last Trade Date","Last Trade Time",
               "Day's Range","52-week Range","EPS Est. Next Year","P/E Ratio",
               "PEG Ratio","Dividend Yield");

my $arrptr = getcustomquote(\@symbols, \@columns);

my $i = 0;
foreach my $symbol (@symbols){
    my @quotes = @{$arrptr->[$i++]};
    print "$symbol\t@quotes\n";
}

For the fuzzy c-means (FCM) clustering:

use strict;
use warnings;
use PDL;
use PDL::NiceSlice;

# ================================
# fcm
# ( $performance_index, $prototypes, $current_partition_matrix) =
#     fcm( $patterns, $partition_matrix, $fuzzification_factor,
#          $tolerance, $max_iter )
# ================================
sub fcm {
    #
    # fuzzy c means implementation
    #
    my ( $patterns, $current_partition_matrix, $fuzzification_factor,
         $tolerance, $max_iter ) = @_;

    my ( $number_of_patterns, $number_of_clusters ) = $current_partition_matrix->dims();

    my ( $prototypes, $performance_index );

    my $iter = 0;
    while (1) {
        # computing each prototype
        my $temporal_partition_matrix = $current_partition_matrix ** $fuzzification_factor;
        my $temp_prototypes = ($temporal_partition_matrix x $patterns)->xchg(1,0)
                              / sumover($temporal_partition_matrix);
        $prototypes = $temp_prototypes->xchg(1,0);

        # copying partition matrix
        my $previous_partition_matrix = $current_partition_matrix->copy;

        # updating the partition matrix
        my $dist = zeroes($number_of_patterns, $number_of_clusters);

        for my $j (0..$number_of_clusters - 1){
            my $diff = $patterns - $prototypes(:,$j)->dummy(1, $number_of_patterns);
            $dist(:,$j) .= (sumover( $diff ** 2 )) ** 0.5;
        }

        my $temp_variable = $dist ** (-2/($fuzzification_factor - 1));
        $current_partition_matrix = $temp_variable / sumover($temp_variable->xchg(1,0));

        #
        # Performance Index calculation
        #
        $temporal_partition_matrix = $current_partition_matrix ** $fuzzification_factor;
        $performance_index = sum($temporal_partition_matrix * ( $dist ** 2 ));

        # checking stop conditions
        my $diff_partition_matrix = $current_partition_matrix - $previous_partition_matrix;

        $iter++;

        # compare the largest absolute change, so that decreases in
        # membership count as much as increases
        if ( (abs($diff_partition_matrix)->max < $tolerance) || ($iter > $max_iter) ) {
            last;
        }

        print "iter = $iter\n";
    }
    return ( $performance_index, $prototypes, $current_partition_matrix );
}
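
For anyone reading the sub without running it, the three quantities it computes in each pass appear to be the usual fuzzy c-means updates. The sketch below restates them in math; the notation is an assumption introduced here, not something taken from the code: x_i is pattern i (of N), v_j is prototype j (of c), u_ij is the membership of pattern i in cluster j, m is the fuzzification factor, and d_ij is the Euclidean distance between x_i and v_j.

```latex
% Notation assumed for illustration (not from the original code):
%   x_i  pattern i,       i = 1..N
%   v_j  prototype j,     j = 1..c
%   u_{ij} membership of pattern i in cluster j
%   m    fuzzification factor,  d_{ij} = \lVert x_i - v_j \rVert_2
\begin{align}
  v_j    &= \frac{\sum_{i=1}^{N} u_{ij}^{m}\, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}
           && \text{(prototype update)} \\
  u_{ij} &= \frac{d_{ij}^{-2/(m-1)}}{\sum_{k=1}^{c} d_{ik}^{-2/(m-1)}}
           && \text{(partition matrix update)} \\
  Q      &= \sum_{i=1}^{N} \sum_{j=1}^{c} u_{ij}^{m}\, d_{ij}^{2}
           && \text{(performance index)}
\end{align}
```

Two things follow directly from these expressions: the exponent -2/(m-1) is undefined at m = 1, and the membership update divides by powers of d_ij, so a pattern that coincides exactly with a prototype (d_ij = 0) would need special handling.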

I expect the audience to be mainly Perl-savvy people. However, the talk is open to everyone attending the conference, so some people in the audience might not be familiar with Perl.

The talk is scheduled to last 45 minutes. I plan to cover each part in about 10 minutes to leave between 5 and 10 minutes for questions and answers. I do not plan to explain the snippets in detail because I do not have enough time. However, I will make the code available to anyone interested. My questions for you, Fellow Monks, are:

  1. If you were attending this session, would you expect me to describe the code in detail?
  2. Do you think it is a good strategy to concentrate on the machine learning part rather than on the Perl part?
  3. Which points would you suggest I cover, and which should I leave out?
  4. Any other suggestions or thoughts?

Thank you,


Update: Fixed typo in header of FCM sub