Re: AI NEWBIE -- Neural Net problem/question... um Tutorial request?? :)
by gjb (Vicar) on May 24, 2004 at 18:59 UTC
The reason this isn't working is quite simple: you're asking the NN to do something that is impossible.
The network you use consists of 5 input units and 1 output unit. Let's simplify this to 2 input units and 1 output unit so that we can visualize what's happening. This implies:
0 0 -> 1
1 0 -> 0
0 1 -> 0
1 1 -> 1

Now let's draw this:

  |
1 0- - - -1
  |       |
  |       |
0 1-------0--
  0       1

This is a visual representation of the table above: (0,0) yields 1, (0,1) yields 0, (1,0) yields 0, (1,1) yields 1.
Now there's one more thing you should realize, and that's the maths behind the thing. For the case of two input units and one output unit, the output of the network is given by: o = f(x_1 w_1 + x_2 w_2), where x_1 and x_2 represent the values of the first and second input unit, w_1 and w_2 are the weights to be determined by the backpropagation algorithm during the training phase of the network, f is some transfer function (typically nonlinear but monotone), and o is the output value.
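To make the formula concrete, here is a minimal Python sketch of o = f(x_1 w_1 + x_2 w_2). The sigmoid is only one common choice of monotone transfer function, and the weights are hypothetical values picked for illustration, not anything a trained network produced.

```python
import math

def sigmoid(z):
    """One common monotone, nonlinear transfer function f (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-z))

def output(x1, x2, w1, w2, f=sigmoid):
    """Output o = f(x1*w1 + x2*w2) of the two-input, one-output network."""
    return f(x1 * w1 + x2 * w2)

# Hypothetical weights, chosen only for illustration (not learned values).
w1, w2 = 2.0, -3.0
for x1, x2 in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print((x1, x2), round(output(x1, x2, w1, w2), 3))
```

Note that the input (0, 0) always produces f(0), whatever the weights are, because the weighted sum is then 0.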
Now since the transfer function f is monotone, deciding between output 1 and output 0 comes down to whether x_1 w_1 + x_2 w_2 lies above or below some constant, and that boundary is a line in the plane of the plot above. All input tuples that are to be mapped to 1 should be on one side of that line, all those to be mapped to 0 on the other. Aha, that's exactly the root of the problem! Just try to draw a line that separates the 1s and 0s in the drawing above: you simply can't!
  |
1 0- - - -1
 \|
  \       |
  |\
0 1-\-----0--
  0  \    1
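The same point can be checked by brute force. The sketch below is an illustration, not a proof: it assumes an increasing transfer function, so that cutting the output at some level is the same as comparing the weighted sum with a threshold, and it tries every weight pair and threshold on a coarse, arbitrarily chosen grid.

```python
import itertools

# Target table from the post: (x1, x2) -> desired output.
TARGETS = {(0, 0): 1, (1, 0): 0, (0, 1): 0, (1, 1): 1}

def classify(x1, x2, w1, w2, threshold):
    """Threshold unit: 1 if the weighted sum exceeds the threshold, else 0."""
    return 1 if x1 * w1 + x2 * w2 > threshold else 0

def count_correct(w1, w2, threshold):
    """Number of the four input patterns that this single line gets right."""
    return sum(
        classify(x1, x2, w1, w2, threshold) == target
        for (x1, x2), target in TARGETS.items()
    )

# Assumed search grid: -5.0 .. 5.0 in steps of 0.25 for both weights and the threshold.
grid = [i / 4 for i in range(-20, 21)]
best = max(
    count_correct(w1, w2, th)
    for w1, w2, th in itertools.product(grid, repeat=3)
)
print(best)  # prints 3: no single line gets all four patterns right
```

Even with a free threshold, which is more than the formula o = f(x_1 w_1 + x_2 w_2) allows, at most three of the four patterns come out right.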
The problem you're trying to solve simply can't be solved by a two-layer network. You'll need one with three layers; that will (almost) do the trick. The reason is that now the plane is divided not by one line, but by two. Now map those points that are to the left of both lines to 1, those that are to the right of both lines to 1, and everything in between to 0, and you're done.
\ |   \
1\0- - \ -1
  \     \
  |\     \|
  | \     \
0 1--\----0\--
  0   \   1 \
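The two-line idea can be written out as a tiny hand-built three-layer network. This is a sketch under assumptions: the weights and offsets are picked by hand rather than learned by backpropagation, a hard threshold stands in for the transfer function, and the constant offsets 0.5 and 1.5 play the role of a bias that places the two lines.

```python
def step(z):
    """Threshold transfer function: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def three_layer(x1, x2):
    # Hidden unit 1: fires for points to the right of the line x1 + x2 = 0.5.
    h1 = step(x1 + x2 - 0.5)
    # Hidden unit 2: fires for points to the right of the line x1 + x2 = 1.5.
    h2 = step(x1 + x2 - 1.5)
    # Output: 1 when left of both lines (h1 = h2 = 0) or right of both
    # (h1 = h2 = 1), and 0 in between (h1 = 1, h2 = 0).
    return step(-h1 + h2 + 0.5)

for x1, x2 in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print((x1, x2), "->", three_layer(x1, x2))
```

Each hidden unit draws one of the two lines, and the output unit combines them into "left of both or right of both".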
A three-layer network will do better, but there's still an extra trick to apply to get good results. The problem is too symmetric and will be very hard to learn, so the trick is to break the symmetry in the input, and that's simple. Rather than using the network above as is, use one with an extra input unit and pad the inputs you have with a 1.
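A minimal sketch of the padding step follows. The helper names `pad` and `weighted_sum` are hypothetical, appending the 1 at the end is an assumption (the position is not specified here), and the weights are illustrative values, not learned ones.

```python
def pad(inputs):
    """Append a constant 1 to an input tuple (the position is an assumption)."""
    return tuple(inputs) + (1,)

def weighted_sum(inputs, weights):
    """Weighted sum that is fed into the transfer function."""
    return sum(x * w for x, w in zip(inputs, weights))

patterns = [(0, 0), (1, 0), (0, 1), (1, 1)]
padded = [pad(p) for p in patterns]
print(padded)

# Hypothetical weights: the last one multiplies the constant 1, so it acts as an offset.
weights = (1.0, 1.0, -0.5)
print([weighted_sum(p, weights) for p in padded])
```

Without the padding, the pattern (0, 0) always gives a weighted sum of 0. With it, the sum equals the weight on the constant input, so the dividing line no longer has to pass through the origin.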
Note, however, that it will still take a considerable number of learning steps to train this very hard example (yes, you picked a tough one). It could take a few thousand steps, depending on the version of the backprop algorithm used.
One further point: it would be better to use +/- 1 values rather than 0/1, but don't try that since I don't think this implementation supports that kind of data.
For tutorials you might have a look at http://www.calresco.org/tutorial.htm, but especially at Denni Rögnvaldsson's lecture notes, which I very much enjoyed (and I shamelessly adapted some of his slides for the course I teach on this subject ;)
Hope this helps, -gjb-
PS: not that it particularly matters in this context, but things get much more interesting when the transfer function is not monotone.