I work in machine learning and use Perl for most of my scripting, but have never bothered to use CPAN's machine learning modules. First, you often need to do some additional linear algebra on your data (e.g. centering, finding eigenvalues, SVD, etc.), and these modules don't share a common matrix representation. Without a common format for compact storage and a rich library of numerical algorithms, it is hard to do things quickly in pure Perl. Second, many CPAN modules I've looked at seem to have been written either for their authors' edification or without large datasets in mind (e.g. Algorithm::SVMLight requires you to add your datapoints one at a time in bulky hash-refs), while most of the problems I care about involve huge amounts of data.
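To make the "compact storage plus numerical library" point concrete, here is a minimal sketch of centering and an SVD done with PDL instead of per-datapoint hash-refs. The toy data and variable names are made up for illustration, and it assumes PDL is installed with its stock PDL::MatrixOps module; treat it as a rough starting point rather than a recommended pipeline.

```perl
use strict;
use warnings;
use PDL;
use PDL::MatrixOps;   # provides svd(); assumed to ship with your PDL install

# Toy data: 4 observations (rows) by 2 features (columns).
# Note that PDL's dim 0 is the column index, so this piddle has dims (2, 4).
my $x = pdl([ [1, 2], [3, 4], [5, 6], [7, 8] ]);

# Per-feature mean: swap the observation axis into dim 0, then average over it.
my $mean = $x->xchg(0, 1)->average;   # one mean per feature, dims (2)

# Center every observation at once; $mean is applied across all rows.
my $centered = $x - $mean;

# Singular value decomposition of the centered data.
my ($u, $s, $v) = svd($centered);

print "centered data: $centered\n";
print "singular values: $s\n";
```

The whole dataset lives in one dense piddle, so centering is a single vectorized expression rather than a Perl-level loop over datapoints.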
I think the PDL statistics paper someone else mentioned is the best "perl for statistics" resource I've seen. Depending on your problems and level of familiarity with the field, there may be some articles on Perl.com of interest. As much as I loathe Java, I would actually recommend Weka as an implementation of lots of machine learning algorithms that work well together. But unless PDL does what you want, I'd suggest something other than Perl (including CPAN modules) for your core algorithms.