PerlMonks  

This looks like a nice introductory talk for programmers who want to dip their toes into business intelligence and data mining techniques using Perl. Thanks for sharing it with us.

Coming from a machine learning background, I have a few comments that might be useful to add to your presentation.

First, developing a machine learning approach to a problem doesn't really stop at the implementation phase. Once a model is fit to the data, you need to test how well it fits. That means testing not only how closely the model's output matches the correct answers, but also the model's generalizability. Here, generalizability means "How well does this model perform on new data, i.e., what is its predictive power?" Typically, cross validation, bootstrap, or Bayesian methods are used to test predictive power. I have seen many machine learning implementations fail miserably because the programmers didn't realize that testing model fits on new data was also needed.
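
To make the cross validation idea concrete, here is a minimal sketch in plain Perl. It is not taken from the talk: the straight-line model, the sub names (fit_line, mse, cross_validate) and the data are all made up for illustration. Each fold holds out part of the data, fits on the rest, and scores the fit only on the held-out points.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical stand-in model: fit y = intercept + slope * x by least squares.
sub fit_line {
    my ($xs, $ys) = @_;
    my $n      = @$xs;
    my $mean_x = sum(@$xs) / $n;
    my $mean_y = sum(@$ys) / $n;
    my ($sxy, $sxx) = (0, 0);
    for my $i (0 .. $n - 1) {
        $sxy += ($xs->[$i] - $mean_x) * ($ys->[$i] - $mean_y);
        $sxx += ($xs->[$i] - $mean_x) ** 2;
    }
    my $slope = $sxx ? $sxy / $sxx : 0;
    return ($mean_y - $slope * $mean_x, $slope);
}

# Mean squared error of the fitted line on a set of points.
sub mse {
    my ($intercept, $slope, $xs, $ys) = @_;
    my $total = 0;
    $total += ($ys->[$_] - ($intercept + $slope * $xs->[$_])) ** 2
        for 0 .. $#$xs;
    return $total / @$xs;
}

# k-fold cross validation: every k-th point is held out in turn, the model
# is fit on the remaining points, and the held-out errors are averaged.
sub cross_validate {
    my ($xs, $ys, $k) = @_;
    my @fold_errors;
    for my $fold (0 .. $k - 1) {
        my (@train_x, @train_y, @test_x, @test_y);
        for my $i (0 .. $#$xs) {
            if ($i % $k == $fold) {
                push @test_x, $xs->[$i];
                push @test_y, $ys->[$i];
            }
            else {
                push @train_x, $xs->[$i];
                push @train_y, $ys->[$i];
            }
        }
        my ($intercept, $slope) = fit_line(\@train_x, \@train_y);
        push @fold_errors, mse($intercept, $slope, \@test_x, \@test_y);
    }
    return sum(@fold_errors) / @fold_errors;
}

# Made-up data: a straight line plus a small deterministic wiggle.
my @x = (1 .. 12);
my @y = map { 2 * $_ + 1 + (($_ % 3) - 1) * 0.5 } @x;

my ($intercept, $slope) = fit_line(\@x, \@y);
printf "training MSE:        %.4f\n", mse($intercept, $slope, \@x, \@y);
printf "cross-validated MSE: %.4f\n", cross_validate(\@x, \@y, 4);
```

The first number measures the fit on the data the model has already seen; the second estimates how it would do on data it has not seen. A cross-validated error much larger than the training error is the usual sign of overfitting.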

Second, you give a nice exposition on clustering, but regression problems are nearly as important in the machine learning literature. Regression differs from classification in that models are created to predict "how much?" rather than "what class?". Giving a separate regression example would probably be too much for an introductory lecture, but mentioning that there are also machine learning approaches to regression problems would be useful.

Third, PDL is a wonderful tool for numerical computation, but it is also a language within a language. If you have time, it would help new users to explain the few PDL constructs you use in your modeling program. Otherwise the operator overloading will just be confusing.
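
As a rough idea of what such an explanation could cover, here is a short sketch of a few common PDL constructs. It assumes PDL is installed and is not drawn from the talk's modeling program; the variable names and numbers are made up.

```perl
use strict;
use warnings;
use PDL;

# pdl() builds a PDL array object (historically called a "piddle").
my $x = pdl(1, 2, 3, 4);
my $y = pdl(10, 20, 30, 40);

# Arithmetic operators are overloaded to work element by element.
my $sum    = $x + $y;       # element-wise addition
my $prod   = $x * $y;       # element-wise product, NOT a matrix product
my $scaled = 2 * $x + 1;    # plain scalars are applied to every element

# Reductions collapse an array to a single value.
my $dot = sum($x * $y);     # inner product of $x and $y

# Matrix multiplication uses the overloaded 'x' operator instead of '*'.
my $m  = pdl([1, 2], [3, 4]);
my $mm = $m x $m;

print "sum:    $sum\n";
print "prod:   $prod\n";
print "scaled: $scaled\n";
print "dot:    $dot\n";
print "m x m:  $mm\n";
```

The point most likely to trip up newcomers is that `*` is element-wise while `x` (Perl's string repetition operator outside PDL) is the matrix product.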

-Mark


In reply to Re: RFC: Machine Learning Development with Perl by kvale
in thread RFC: Machine Learning Development with Perl by lin0
