[pdp-discuss] Error-tweaking & scalarvallayerspecs

Randall C. O'Reilly Randy.OReilly at colorado.edu
Wed Mar 21 21:40:04 MDT 2007


Lowering the learning rate usually helps in my experience -- I typically train 
to a rough asymptote at .01, then drop to .001, and get a big drop in error, 
consistent with the idea that there has been a lot of weight thrashing 
(interference) at the higher lrate.  But you need the higher lrate to explore 
the space more thoroughly at the start -- if you just start at .001, it never 
does as well.  The lrate_sched can be configured to do this for you 
automatically.
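
As a rough illustration of the two-stage schedule described above, here is a minimal Python sketch. It is not the lrate_sched interface itself: the function name, the schedule format, and the switch point at epoch 500 are assumptions made for the example. In practice the drop would come once the error has reached a rough asymptote.

```python
def lrate_for_epoch(epoch, schedule):
    """Return the learning rate in effect at `epoch`.

    `schedule` is a list of (start_epoch, lrate) pairs sorted by
    start_epoch (hypothetical format, not the lrate_sched API).
    """
    current = schedule[0][1]
    for start_epoch, lrate in schedule:
        if epoch >= start_epoch:
            current = lrate
    return current


# Assumed schedule: explore at .01 first, then drop to .001.
# Epoch 500 is a placeholder for "rough asymptote reached".
SCHEDULE = [(0, 0.01), (500, 0.001)]
```

For example, lrate_for_epoch(100, SCHEDULE) gives the exploratory rate and lrate_for_epoch(600, SCHEDULE) gives the lowered one.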

I thought about doing the normalized width thing too... probably it is better 
that way.  Send me your patches and I'll put them in 4.0.
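
To make the normalized width idea concrete, here is a small Python sketch of a relative-width gaussian. It is only a sketch under stated assumptions, not the actual scalarvallayerspec code or Frank's patch: the function names are hypothetical, the gaussian form exp(-(d/width)^2) is assumed, and the width is scaled by half the range because that reproduces the example in the quoted message (0.1 in place of 10 for a range of -100 to 100).

```python
import math


def absolute_width(rel_width, val_min, val_max):
    """Convert a relative width to an absolute one.

    Assumption: the scale is half the represented range, chosen to
    match the example in the quoted message; the real patch may differ.
    """
    return rel_width * (val_max - val_min) / 2.0


def gaussian_activations(value, unit_values, rel_width, val_min, val_max):
    """Activation of each unit for a gaussian bump centered on `value`."""
    width = absolute_width(rel_width, val_min, val_max)
    return [math.exp(-((u - value) / width) ** 2) for u in unit_values]
```

With this scaling the same relative width can be kept when the range changes, and in a 2D layer each axis can derive its own absolute width from its own range.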

- Randy

On Wednesday 21 March 2007 09:44, Frank Leoné wrote:
> Dear all,
>
> First of all, I have a question that may be useful to others as well:
> how can I tweak the error in a Leabra network, in my case for scalar values
> represented by Gaussians?
>
> I know it helps to switch the opt thresh off and not use it for weight
> updating (in the unitspec settings). For me it also helped a lot to lower
> the dt vm to 0.02, which is really small. I also tweaked the inhibition
> values using batch runs and turned soft bounding and symmetric weights off
> (the first of these especially helped quite a lot). More hidden units do
> not seem to help, though. What else can I do to further reduce the error?
> I am not content with the result yet.
>
> Also, a more specific question: how can it be that my error rises when I
> lower the learning rate? I would expect it to go down, or maybe level off,
> but not rise. Hopefully someone can shed some light on this behaviour.
>
> In addition, something that may also be useful to others: I changed the
> code of the scalarvallayerspec and its 2D counterpart in order to remove
> the restriction on the activation, following the instructions given by
> O'Reilly (thanks once again). I also changed the code for both specs so
> that the width of the Gaussian no longer represents the absolute width, but
> the width relative to the range of values represented. So, if the layer
> represents values from -100 to 100 and you would normally fill in 10 as the
> width, you now fill in 0.1. If you then change the range of values, you
> don't have to change the width anymore if you want to keep the same ratio.
> This is especially useful in the 2D version, because you can only specify
> one width there. With an absolute width, that makes it impossible to get
> circular Gaussians in the 2D spec when the axes have different ranges. With
> the ratio instead of the absolute width, it does work.
>
> So, if anyone is interested in this minor functionality tweak, I am happy
> to share it with you.
>
> With kind regards,
>
> Frank
>
> _______________________________________________
> PDP-Discuss mailing list
> PDP-Discuss at psych.Colorado.EDU
> http://psych.colorado.edu/mailman/listinfo/pdp-discuss

