
using Statistics::Regression

by Random_Walk (Prior)
on Apr 19, 2017 at 14:07 UTC (#1188279=perlquestion)

Random_Walk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Folks

Does anyone know the above module and what its inputs mean? The docco is rather terse ...

use Statistics::Regression;

# Create regression object
my $reg = Statistics::Regression->new( "sample regression", [ "const", "someX", "someY" ] );

# Add data points
$reg->include( 2.0, [ 1.0, 3.0, -1.0 ] );
$reg->include( 1.0, [ 1.0, 5.0, 2.0 ] );
$reg->include( 20.0, [ 1.0, 31.0, 0.0 ] );
$reg->include( 15.0, [ 1.0, 11.0, 2.0 ] );

I was expecting to give a simpler two-dimensional input of X and Y values, then get back a vector of Theta values such that...

$predictedY = $Theta[0] + $X * $Theta[1] + $X**2 * $Theta[2] + ... + $X**$n * $Theta[$n];

How can I reconcile my expectations with the multidimensional data points in the example?

Thanks in Advance,

Pereant, qui ante nos nostra dixerunt!

Replies are listed 'Best First'.
Re: using Statistics::Regression
by Anonymous Monk on Apr 19, 2017 at 15:08 UTC

    Statistics::Regression is ten years old and has a horrible interface and useless documentation. Maybe someone can suggest a better module.

    I haven't tested it, but it looks like the way you would have to use it to get a cubic fit is:

    my $reg = Statistics::Regression->new("Pain", ["C", "X", "X**2", "X**3", "Y"]);
    $reg->include(1.0, [1.0, $x, $x**2, $x**3, $y]);

      And there I was thinking of the pain I would save myself using a ready cut module ;)

      I must say your two lines of code do rather beat the existing docco. Off to give it a go, thanks.


      Pereant, qui ante nos nostra dixerunt!
        Ah, I've got it. Try this instead:
        my $reg = Statistics::Regression->new("Pain", ["C", "X", "X**2", "X**3"]);
        $reg->include($y, [1.0, $x, $x**2, $x**3]);
      Nope. I figured the first 1.0 was a weight, and you could just leave it like that if you didn't care about weighting. No go. The module crashes if the first number is the same for all data points. Hm.
        Argh! The weight is an optional third argument to include(), which isn't mentioned anywhere in the documentation.
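Pulling the sub-thread above together, here is a minimal, untested-against-your-data sketch of a cubic fit. It follows the calling convention worked out above: the dependent value goes first in include(), followed by the row of regressors. The helper poly_row and the sample points are made up for illustration and are not part of the module. It also assumes theta() returns the coefficient list in list context, as the module's POD describes; check your installed version before relying on that.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::Regression): build one
# regressor row [1, x, x**2, ..., x**degree] for a polynomial fit.
sub poly_row {
    my ($x, $degree) = @_;
    return [ map { $x**$_ } 0 .. $degree ];
}

# Made-up sample data lying exactly on y = 1 + 2x - x**2 + 0.5x**3.
my @points = map { [ $_, 1 + 2*$_ - $_**2 + 0.5*$_**3 ] } -3 .. 3;

my @theta;

# Statistics::Regression is not a core module, so load it at run time
# and skip the fit if it is not installed.
if ( eval { require Statistics::Regression; 1 } ) {
    my $reg = Statistics::Regression->new(
        "cubic fit", [ "const", "X", "X**2", "X**3" ] );

    # Dependent value first, then the row of regressors.
    $reg->include( $_->[1], poly_row( $_->[0], 3 ) ) for @points;

    # Assumption: theta() returns the coefficients as a list here.
    @theta = $reg->theta();
    printf "theta[%d] = %.4f\n", $_, $theta[$_] for 0 .. $#theta;
}
else {
    warn "Statistics::Regression is not installed; skipping the fit\n";
}
```

With the coefficients in @theta, the prediction from the original question is just the sum of $theta[$_] * $X**$_ over 0 .. $#theta. If you need per-point weights, the optional third argument to include() mentioned above is where they would go.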
Re: using Statistics::Regression
by Random_Walk (Prior) on Apr 20, 2017 at 16:11 UTC

    Hi folks,

    After 24 hours' struggle with this module, I feel I could update the docco, at least for the benefit of those who come after me. What is the Perl way to get such things done? Should I file a bug on CPAN, try to talk to the one author email listed in the module, or is there a way to mail the authors directly via CPAN?


    Pereant, qui ante nos nostra dixerunt!
      Create a patch and upload that patch to the module's bug tracker. That way, others will be able to see your patch and apply it locally even if the author does not apply it. If you don't get a response within a few days, you could send an email to the author. There is one author listed in the POD for Statistics::Regression, but two are listed in RT.

      Refer to:

Node Type: perlquestion [id://1188279]
Approved by haukex
Front-paged by haukex