
Re^2: AI::NNFlex::Backprop error not decreasing

by caedes (Pilgrim)
on Mar 21, 2005 at 21:31 UTC


in reply to Re: AI::NNFlex::Backprop error not decreasing
in thread AI::NNFlex::Backprop error not decreasing

I have some experience with what you are referring to as an analog network. In fact I've only ever used analog output, so digital-output networks are the exception for me. Actually, there is little difference between the two other than the interpretation of the values that the network produces. How well the network can approximate your analog training sample depends on how well the underlying network's multi-dimensional nonlinear polynomial (of sorts) can approximate the function. Since a sinusoidal function can be approximated pretty well (close to the origin) by a low-order Taylor series expansion, I would expect a suitably designed and trained NN to perform nearly as well. A look at the number of free parameters in such a series expansion would give you a good hint at the size of network you would need (my guess is not very large).
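For instance, here's a throwaway little script (nothing to do with NNFlex, just to illustrate the point) that compares sin(x) with the three-term expansion x - x^3/6 + x^5/120. Over 0..pi/2 the worst error is only a few thousandths, with just three free coefficients:

    use strict;
    use warnings;

    # Compare sin(x) with its truncated Taylor series over [0, pi/2].
    my $pi = 3.14159265358979;
    my $worst = 0;
    for (my $x = 0; $x <= $pi / 2; $x += 0.01) {
        my $taylor = $x - $x**3 / 6 + $x**5 / 120;
        my $diff   = abs(sin($x) - $taylor);
        $worst = $diff if $diff > $worst;
    }
    printf "worst error over [0, pi/2]: %.5f\n", $worst;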

From looking at your training data and the error over training iterations, I'd say that what you see doesn't look very odd. You can see that the error does indeed decrease from the first training iteration; it then reaches a low point and levels out at a slightly higher value. This behavior is expected for a network whose number of weights and number of training samples are roughly the same order of magnitude. It shows a tendency for the network to become overtrained: for the training samples to become hardwired into the network's weights. To fix this you would either have to add many more training samples, or reduce the number of layers in your network. To me the network you've chosen looks too complex for the task at hand, and is therefore much more likely to become overtrained. Try a single hidden layer of 4-5 nodes, along the lines of the sketch below.
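For what it's worth, the sort of topology I have in mind would be set up something like this (untested sketch, assuming three inputs and two outputs as in your data; the learning rate is arbitrary):

    use AI::NNFlex::Backprop;

    # Hypothetical smaller network: 3-node input layer, one hidden layer of
    # 5 tanh nodes, 2-node tanh output layer (tanh because the targets
    # include -1, which a sigmoid output can never reach).
    my $network = AI::NNFlex::Backprop->new(learningrate => .3, bias => 1);
    $network->add_layer(nodes => 3, activationfunction => 'tanh');
    $network->add_layer(nodes => 5, activationfunction => 'tanh');
    $network->add_layer(nodes => 2, activationfunction => 'tanh');
    $network->init();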

Another way to avoid overtraining is to partition your sample data into two sets: train on one set, and after each epoch test the error on the other set. You should aim for a minimized error in the second data set, something like the sketch below.
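Something along these lines (untested sketch; it only uses the Dataset new/learn/run calls discussed in this thread, keeps the raw pairs in plain arrays so the held-out error can be computed by hand, and the tiny data split, epoch cap and 5% tolerance are all arbitrary):

    use strict;
    use warnings;
    use AI::NNFlex::Backprop;
    use AI::NNFlex::Dataset;

    # Toy split just to make the sketch self-contained - substitute your data.
    my @train_pairs = (
        [0,0,0],        [0,1],
        [0,0,1.570795], [1,0],
        [0,0,3.14159],  [0,-1],
        [0,0,4.712385], [-1,0],
    );
    my @valid_pairs = (
        [0,0,6.28318],  [0,1],
        [0,0,7.853975], [1,0],
    );

    my $train = AI::NNFlex::Dataset->new([@train_pairs]);
    my $valid = AI::NNFlex::Dataset->new([@valid_pairs]);

    my $network = AI::NNFlex::Backprop->new(learningrate => .3, bias => 1);
    $network->add_layer(nodes => 3, activationfunction => 'tanh');
    $network->add_layer(nodes => 5, activationfunction => 'tanh');
    $network->add_layer(nodes => 2, activationfunction => 'tanh');
    $network->init();

    my $best = 1e9;
    for my $epoch (1 .. 200) {
        my $train_err = $train->learn($network);

        # Run the held-out patterns and compute their RMS error by hand.
        my $outputs = $valid->run($network);
        my ($sum, $count) = (0, 0);
        for my $i (0 .. $#{$outputs}) {
            my $target = $valid_pairs[ 2 * $i + 1 ];
            for my $j (0 .. $#{$target}) {
                $sum += ($outputs->[$i][$j] - $target->[$j]) ** 2;
                $count++;
            }
        }
        my $valid_err = sqrt($sum / $count);
        printf "epoch %3d  train %.4f  held-out %.4f\n",
            $epoch, $train_err, $valid_err;

        # Stop once the held-out error starts creeping back up.
        last if $valid_err > $best * 1.05;
        $best = $valid_err if $valid_err < $best;
    }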

-caedes


Re^3: AI::NNFlex::Backprop error not decreasing
by thealienz1 (Pilgrim) on Mar 21, 2005 at 22:40 UTC

    I have taken your recommendations into consideration. But now I am having a problem with handling the output of the test set of examples on the network. It appears that the network returns an array of values, but all the values are the same.

    use strict;
    use AI::NNFlex::Backprop;
    use AI::NNFlex::Dataset;
    use Data::Dumper;

    my $n = 0.4;             # not used below
    my $num_epochs = 100;    # not used below

    # Network layers: 3-node input, 5-node tanh hidden, 2-node sigmoid output.
    my $network = AI::NNFlex::Backprop->new(learningrate=>.9, bias=>1);
    $network->add_layer(nodes=>3, activationfunction=>'tanh');
    #$network->add_layer(nodes=>3, activationfunction=>'tanh');
    #$network->add_layer(nodes=>2, activationfunction=>'tanh');
    #$network->add_layer(nodes=>3, activationfunction=>'tanh');
    $network->add_layer(nodes=>5, activationfunction=>'tanh');
    $network->add_layer(nodes=>2, activationfunction=>'sigmoid');
    $network->init();

    # Held-out patterns: alternating [input], [target] pairs.
    my $test_set = AI::NNFlex::Dataset->new([
        [6.28318,1.570795,0], [1,0], [6.28318,1.570795,1.570795], [0,-1], [6.28318,1.570795,3.14159], [-1,0], [6.28318,1.570795,4.712385], [0,1], [6.28318,1.570795,6.28318], [1,0], [6.28318,1.570795,7.853975], [0,-1],
        [6.28318,3.14159,0], [0,-1], [6.28318,3.14159,1.570795], [-1,0], [6.28318,3.14159,3.14159], [0,1], [6.28318,3.14159,4.712385], [1,0], [6.28318,3.14159,6.28318], [0,-1], [6.28318,3.14159,7.853975], [-1,0],
        [6.28318,4.712385,0], [-1,0], [6.28318,4.712385,1.570795], [0,1], [6.28318,4.712385,3.14159], [1,0], [6.28318,4.712385,4.712385], [0,-1], [6.28318,4.712385,6.28318], [-1,0], [6.28318,4.712385,7.853975], [0,1],
        [6.28318,6.28318,0], [0,1], [6.28318,6.28318,1.570795], [1,0], [6.28318,6.28318,3.14159], [0,-1], [6.28318,6.28318,4.712385], [-1,0], [6.28318,6.28318,6.28318], [0,1], [6.28318,6.28318,7.853975], [1,0],
        [6.28318,7.853975,0], [1,0], [6.28318,7.853975,1.570795], [0,-1], [6.28318,7.853975,3.14159], [-1,0], [6.28318,7.853975,4.712385], [0,1], [6.28318,7.853975,6.28318], [1,0], [6.28318,7.853975,7.853975], [0,-1],
        [7.853975,0,0], [1,0], [7.853975,0,1.570795], [0,-1], [7.853975,0,3.14159], [-1,0], [7.853975,0,4.712385], [0,1], [7.853975,0,6.28318], [1,0], [7.853975,0,7.853975], [0,-1],
        [7.853975,1.570795,0], [0,-1], [7.853975,1.570795,1.570795], [-1,0], [7.853975,1.570795,3.14159], [0,1], [7.853975,1.570795,4.712385], [1,0], [7.853975,1.570795,6.28318], [0,-1], [7.853975,1.570795,7.853975], [-1,0],
        [7.853975,3.14159,0], [-1,0], [7.853975,3.14159,1.570795], [0,1], [7.853975,3.14159,3.14159], [1,0], [7.853975,3.14159,4.712385], [0,-1], [7.853975,3.14159,6.28318], [-1,0], [7.853975,3.14159,7.853975], [0,1],
        [7.853975,4.712385,0], [0,1], [7.853975,4.712385,1.570795], [1,0], [7.853975,4.712385,3.14159], [0,-1], [7.853975,4.712385,4.712385], [-1,0], [7.853975,4.712385,6.28318], [0,1], [7.853975,4.712385,7.853975], [1,0],
        [7.853975,6.28318,0], [1,0], [7.853975,6.28318,1.570795], [0,-1], [7.853975,6.28318,3.14159], [-1,0], [7.853975,6.28318,4.712385], [0,1], [7.853975,6.28318,6.28318], [1,0], [7.853975,6.28318,7.853975], [0,-1],
        [7.853975,7.853975,0], [0,-1], [7.853975,7.853975,1.570795], [-1,0], [7.853975,7.853975,3.14159], [0,1], [7.853975,7.853975,4.712385], [1,0], [7.853975,7.853975,6.28318], [0,-1], [7.853975,7.853975,7.853975], [-1,0]
    ]);

    # Training patterns, same format.
    my $train_set = AI::NNFlex::Dataset->new([
        [0,0,0], [0,1], [0,0,1.570795], [1,0], [0,0,3.14159], [0,-1], [0,0,4.712385], [-1,0], [0,0,6.28318], [0,1], [0,0,7.853975], [1,0],
        [0,1.570795,0], [1,0], [0,1.570795,1.570795], [0,-1], [0,1.570795,3.14159], [-1,0], [0,1.570795,4.712385], [0,1], [0,1.570795,6.28318], [1,0], [0,1.570795,7.853975], [0,-1],
        [0,3.14159,0], [0,-1], [0,3.14159,1.570795], [-1,0], [0,3.14159,3.14159], [0,1], [0,3.14159,4.712385], [1,0], [0,3.14159,6.28318], [0,-1], [0,3.14159,7.853975], [-1,0],
        [0,4.712385,0], [-1,0], [0,4.712385,1.570795], [0,1], [0,4.712385,3.14159], [1,0], [0,4.712385,4.712385], [0,-1], [0,4.712385,6.28318], [-1,0], [0,4.712385,7.853975], [0,1],
        [0,6.28318,0], [0,1], [0,6.28318,1.570795], [1,0], [0,6.28318,3.14159], [0,-1], [0,6.28318,4.712385], [-1,0], [0,6.28318,6.28318], [0,1], [0,6.28318,7.853975], [1,0],
        [0,7.853975,0], [1,0], [0,7.853975,1.570795], [0,-1], [0,7.853975,3.14159], [-1,0], [0,7.853975,4.712385], [0,1], [0,7.853975,6.28318], [1,0], [0,7.853975,7.853975], [0,-1],
        [1.570795,0,0], [1,0], [1.570795,0,1.570795], [0,-1], [1.570795,0,3.14159], [-1,0], [1.570795,0,4.712385], [0,1], [1.570795,0,6.28318], [1,0], [1.570795,0,7.853975], [0,-1],
        [1.570795,1.570795,0], [0,-1], [1.570795,1.570795,1.570795], [-1,0], [1.570795,1.570795,3.14159], [0,1], [1.570795,1.570795,4.712385], [1,0], [1.570795,1.570795,6.28318], [0,-1], [1.570795,1.570795,7.853975], [-1,0],
        [1.570795,3.14159,0], [-1,0], [1.570795,3.14159,1.570795], [0,1], [1.570795,3.14159,3.14159], [1,0], [1.570795,3.14159,4.712385], [0,-1], [1.570795,3.14159,6.28318], [-1,0], [1.570795,3.14159,7.853975], [0,1],
        [1.570795,4.712385,0], [0,1], [1.570795,4.712385,1.570795], [1,0], [1.570795,4.712385,3.14159], [0,-1], [1.570795,4.712385,4.712385], [-1,0], [1.570795,4.712385,6.28318], [0,1], [1.570795,4.712385,7.853975], [1,0],
        [1.570795,6.28318,0], [1,0], [1.570795,6.28318,1.570795], [0,-1], [1.570795,6.28318,3.14159], [-1,0], [1.570795,6.28318,4.712385], [0,1], [1.570795,6.28318,6.28318], [1,0], [1.570795,6.28318,7.853975], [0,-1],
        [1.570795,7.853975,0], [0,-1], [1.570795,7.853975,1.570795], [-1,0], [1.570795,7.853975,3.14159], [0,1], [1.570795,7.853975,4.712385], [1,0], [1.570795,7.853975,6.28318], [0,-1], [1.570795,7.853975,7.853975], [-1,0],
        [3.14159,0,0], [0,-1], [3.14159,0,1.570795], [-1,0], [3.14159,0,3.14159], [0,1], [3.14159,0,4.712385], [1,0], [3.14159,0,6.28318], [0,-1], [3.14159,0,7.853975], [-1,0],
        [3.14159,1.570795,0], [-1,0], [3.14159,1.570795,1.570795], [0,1], [3.14159,1.570795,3.14159], [1,0], [3.14159,1.570795,4.712385], [0,-1], [3.14159,1.570795,6.28318], [-1,0], [3.14159,1.570795,7.853975], [0,1],
        [3.14159,3.14159,0], [0,1], [3.14159,3.14159,1.570795], [1,0], [3.14159,3.14159,3.14159], [0,-1], [3.14159,3.14159,4.712385], [-1,0], [3.14159,3.14159,6.28318], [0,1], [3.14159,3.14159,7.853975], [1,0],
        [3.14159,4.712385,0], [1,0], [3.14159,4.712385,1.570795], [0,-1], [3.14159,4.712385,3.14159], [-1,0], [3.14159,4.712385,4.712385], [0,1], [3.14159,4.712385,6.28318], [1,0], [3.14159,4.712385,7.853975], [0,-1],
        [3.14159,6.28318,0], [0,-1], [3.14159,6.28318,1.570795], [-1,0], [3.14159,6.28318,3.14159], [0,1], [3.14159,6.28318,4.712385], [1,0], [3.14159,6.28318,6.28318], [0,-1], [3.14159,6.28318,7.853975], [-1,0],
        [3.14159,7.853975,0], [-1,0], [3.14159,7.853975,1.570795], [0,1], [3.14159,7.853975,3.14159], [1,0], [3.14159,7.853975,4.712385], [0,-1], [3.14159,7.853975,6.28318], [-1,0], [3.14159,7.853975,7.853975], [0,1],
        [4.712385,0,0], [-1,0], [4.712385,0,1.570795], [0,1], [4.712385,0,3.14159], [1,0], [4.712385,0,4.712385], [0,-1], [4.712385,0,6.28318], [-1,0], [4.712385,0,7.853975], [0,1],
        [4.712385,1.570795,0], [0,1], [4.712385,1.570795,1.570795], [1,0], [4.712385,1.570795,3.14159], [0,-1], [4.712385,1.570795,4.712385], [-1,0], [4.712385,1.570795,6.28318], [0,1], [4.712385,1.570795,7.853975], [1,0],
        [4.712385,3.14159,0], [1,0], [4.712385,3.14159,1.570795], [0,-1], [4.712385,3.14159,3.14159], [-1,0], [4.712385,3.14159,4.712385], [0,1], [4.712385,3.14159,6.28318], [1,0], [4.712385,3.14159,7.853975], [0,-1],
        [4.712385,4.712385,0], [0,-1], [4.712385,4.712385,1.570795], [-1,0], [4.712385,4.712385,3.14159], [0,1], [4.712385,4.712385,4.712385], [1,0], [4.712385,4.712385,6.28318], [0,-1], [4.712385,4.712385,7.853975], [-1,0],
        [4.712385,6.28318,0], [-1,0], [4.712385,6.28318,1.570795], [0,1], [4.712385,6.28318,3.14159], [1,0], [4.712385,6.28318,4.712385], [0,-1], [4.712385,6.28318,6.28318], [-1,0], [4.712385,6.28318,7.853975], [0,1],
        [4.712385,7.853975,0], [0,1], [4.712385,7.853975,1.570795], [1,0], [4.712385,7.853975,3.14159], [0,-1], [4.712385,7.853975,4.712385], [-1,0], [4.712385,7.853975,6.28318], [0,1], [4.712385,7.853975,7.853975], [1,0],
        [6.28318,0,0], [0,1], [6.28318,0,1.570795], [1,0], [6.28318,0,3.14159], [0,-1], [6.28318,0,4.712385], [-1,0], [6.28318,0,6.28318], [0,1], [6.28318,0,7.853975], [1,0]
    ]);

    # Train until the error drops below .001 or 100 epochs have passed,
    # dumping the test-set outputs after every epoch.
    my $epoch = 1;
    my $err = 1;
    while ($err > .001 && $epoch < 100) {
        $err = $train_set->learn($network);
        my $outputsRef = $test_set->run($network);
        print Dumper($outputsRef);
        print "Error: $err\n";
        $epoch++;
    }

    Running the test set through the network gives the following output.

    $ perl test1.pl
    $VAR1 = [
              [ '2.22776546277668e-07', '0.011408329955622' ],
              [ '2.22776546277668e-07', '0.011408329955622' ],
              [ '2.22776546277668e-07', '0.011408329955622' ],
              [ '2.22776546277668e-07', '0.011408329955622' ],
              [ '2.22776546277668e-07', '0.011408329955622' ],
              [ '2.22776546277668e-07', '0.011408329955622' ],
              ...
            ];

    Am I handling the output of the network incorrectly? The module's documentation says that run() "Runs the dataset through the network and returns a reference to an array of output patterns." I guess I am just not handling the array reference correctly.
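    Just to show how I'm reading the docs, this is roughly how I expected to be able to walk that structure (sketch only, carrying on from the script above):

    # $outputsRef is what $test_set->run($network) returned: one output
    # pattern (an array ref) per input pattern, in dataset order.
    for my $i (0 .. $#{$outputsRef}) {
        my ($out1, $out2) = @{ $outputsRef->[$i] };
        printf "pattern %2d: %.4f  %.4f\n", $i, $out1, $out2;
    }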

    Thanks for all the help.

      First of all, thanks to caedes for giving such a detailed answer to the question. I haven't done much with analog target output, and am no sort of mathematician, so that was really helpful to me as well as hopefully to the OP.

      You are definitely handling the outputs correctly; I suspect the reason you are getting the same values for every pattern during training is mathematical rather than programmatic. Setting the network to debug=>[4] shows that the network is adjusting weights successfully, but it seems unable to learn the data as it stands. If you look at the output from the beginning, it changes during the first few epochs and then settles on a consistent value. Combined with the fact that the RMS error is very high, that suggests that an identical response of [x,y] for every pattern is the best solution the network has been able to come up with, as it is currently set up.

      Correctly set up, backprop will find a solution (not necessarily the optimum) provided one exists. It looks to me like a backprop solution for your dataset as it stands doesn't exist.

      BUT, I don't want to mislead you. As I said above, I'm no kind of mathematician (psychologist really - I wrote this module for cognitive modelling), and my understanding of backprop is pretty much empirical. Some more mathematical monk may be able to give you (and for that matter me) better guidance on this.

      Update: FWIW I've taken another look at this. I switched on debug (debug=>[5]) to get the return values from the activation functions, and since the inputs are mostly greater than 1, tanh is returning 1 or close to it for all of them.
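      If that is the culprit, it might be worth squashing the inputs into a range tanh is happier with before building the datasets. Something like this untested sketch (dividing by 2*pi is an arbitrary choice on my part; the targets are left alone):

      # Hypothetical rescaling step: divide every input component by 2*pi so
      # the inputs fall roughly in 0..1.25 instead of 0..7.85.
      my $two_pi = 6.28318;
      my @raw_pairs = (
          [0,0,0],        [0,1],
          [0,0,1.570795], [1,0],
          # ... the rest of the training pairs from the script above
      );
      my @scaled_pairs;
      my @work = @raw_pairs;
      while (@work) {
          my $input  = shift @work;
          my $target = shift @work;
          push @scaled_pairs, [ map { $_ / $two_pi } @$input ], $target;
      }
      my $train_set = AI::NNFlex::Dataset->new([@scaled_pairs]);

      Whatever scaling you settle on, the test patterns would of course need the same treatment before you run them through the network.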

      g0n, backpropagated monk
