Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Re: RFC: Machine Learning Development with Perl

by kvale (Monsignor)
on Sep 13, 2007 at 00:04 UTC ( #638705=note: print w/ replies, xml ) Need Help??


in reply to RFC: Machine Learning Development with Perl

This looks like a nice introductory talk for programmers who want to dip their toes into business intelligence and data mining techniques using Perl. Thanks for sharing it with us.

Coming from a machine learning background, I have a few comments that might be useful to add to your presentation.

First, developing a machine learning approach to a problem doesn't really stop at the implementation phase. Once a model is fit to the data, you need to test how well the model fits. To do that, you not only need to test how well the model fit compares to the correct answers, you need to test the model's generalizability. Here, generalizability means 'How well does a this model perform on new data, .i.e, what is its predictive power?" Typically, cross validation, bootstrap or Bayesian methods are used to test predictive power. I have seen many machine learning implementations fail miserably because the programmers didn't realize that testing model fits on new data was also needed.

Second, you give a nice exposition on clustering, but regression problems are nearly as important in the machine learning literature. Regression differs from classification in that models are created to predict "how much?" rather than "what class?". Giving a separate regression example would probably be too much for an introductory lecture, but mentioning that there are also machine learning approaches to regression problems would be useful.

Third, PDL is a wonderful tool for numerical computation, but it is also a language within a language. If you have time, it would help new users to explain the few PDL constructs you use in your modeling program. Otherwise the operator overloading will just be confusing.

-Mark


Comment on Re: RFC: Machine Learning Development with Perl

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://638705]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (13)
As of 2014-07-30 12:01 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (231 votes), past polls