From ftmleone at hotmail.com Fri Apr 6 12:54:15 2007
From: ftmleone at hotmail.com (Frank Leoné)
Date: Fri Apr 6 12:54:19 2007
Subject: [pdp-discuss] Error-tweaking & scalarvallayerspecs
In-Reply-To: <200703212340.04151.Randy.OReilly@colorado.edu>
Message-ID:

Hello!

Thanks for the answer. Lowering the learning rate sometimes makes the error rise at first, without it going down afterwards. That is somewhat strange behavior.

I still have a problem with the error, though. I now use a bigger set with the same kind of data, but where the smaller set reached an average error of 2, the bigger set only gets down to 15. Even adding more hidden units hardly seems to help. Does anyone have any ideas on how to get the error lower?

Two things I want to try are:

- a batch run over different numbers of hidden units, but how do I automatically build and connect everything? I tried it in an init script, but it didn't work.
- a batch run over the width of the scalarvallayers, to give the network a higher/lower input and output resolution. They all have the same ratio, so I made a parent with that ratio and tried a batch on the width value for that parent, but the children didn't inherit it. Any idea how this can be fixed?

Are there any other variables I can try to change? Thanks in advance, any help is greatly appreciated!

With kind regards, happy Easter!

Frank

PS. Randy, I'll send you my leabra.cc later on; hotmail in Konqueror doesn't let me attach files. I hope it will be of use to you.

From Randy.OReilly at colorado.edu Mon Apr 9 02:35:50 2007
From: Randy.OReilly at colorado.edu (Randall C.
O'Reilly)
Date: Mon Apr 9 02:35:54 2007
Subject: [pdp-discuss] Error-tweaking & scalarvallayerspecs
In-Reply-To:
References:
Message-ID: <200704090235.50790.Randy.OReilly@colorado.edu>

Frank -- a script with the network->Build(); and network->Connect(); function calls in it should work.

For the width, you might try adding a script that calls UpdateAfterEdit() on the layerspec after setting the new value.

btw, this kind of thing should be much easier to do in the new 4.0 version of the software -- we're working hard on getting a public beta version available soon.

- Randy

On Friday 06 April 2007 12:54, Frank Leoné wrote:
> Hello!
>
> Thanks for the answer. Lowering the learning rate sometimes makes the error
> rise at first, without going down afterwards. A bit strange behavior.
>
> I still have a problem with the error though. Now I use a bigger set, with
> the same kind of data, but where it reached an average error of 2 with the
> smaller set, it only reaches 15 with the bigger set. Even more hidden units
> hardly seem to help. Anyone any ideas how to get the error any lower?
>
> Two things I want to try are:
>
> - a batch run over different numbers of hidden units, but how to
> automatically build and connect all? I tried it in an init script, but it
> didn't work.
> - a batch run over the width of scalarvallayers, to give the network a
> higher/lower input and output resolution. They all have the same ratio, so
> I made a parent with that ratio and tried a batch on the width value for
> that parent, but the children didn't inherit it. Any idea how this can be
> fixed?
>
> Any other variables I can try to change? Thanks in advance, any help is
> greatly appreciated!
>
> with kind regards, happy Easter!
>
> Frank
>
> PS. Randy, I'll send you my leabra.cc later on, hotmail in Konqueror
> doesn't let me attach files. I hope it will be of use for you.
>
> _______________________________________________
> PDP-Discuss mailing list
> PDP-Discuss@psych.Colorado.EDU
> http://psych.colorado.edu/mailman/listinfo/pdp-discuss

From s0675643 at sms.ed.ac.uk Tue Apr 10 17:54:13 2007
From: s0675643 at sms.ed.ac.uk (M Snel)
Date: Tue Apr 10 17:54:17 2007
Subject: [pdp-discuss] TD learning in PDP
Message-ID: <20070411005413.xuzslk934000088k@www.sms.ed.ac.uk>

Hi,

I am trying to construct a model of TD learning in a simple navigation task. The network should learn to select the optimal action to get to a goal state; thus, the weights between inputs (encoding location in the environment) and outputs (encoding navigational actions) should be updated based on reward.

I have connected the input units to the predicted reward layer, and I clamp the external reward upon reaching the goal state. With this construction the network accurately learns to represent the "value" of each input unit (i.e. higher expected reward closer to the goal). I have connected the TDlayer to the DaModUnit action units and have turned on the Da modulation and "p dwt" in those units so that they should learn from the modulation.

However, the results for learning in the action units are not what I expected: the action units don't learn to map an input state to a correct action. I was assuming that in PDP the modulation from the TDlayer would be "interpreted" by the DaModUnits as feedback on the PREVIOUS action (as per the "p dwt" parameter). Is this correct? Also, do the units learn using the Da modulation directly, or from the difference in Da modulation from one timestep to the next?

Thanks,
Matthijs

From Randy.OReilly at colorado.edu Wed Apr 18 23:40:54 2007
From: Randy.OReilly at colorado.edu (Randall C.
O'Reilly)
Date: Wed Apr 18 23:40:56 2007
Subject: [pdp-discuss] TD learning in PDP
In-Reply-To: <20070411005413.xuzslk934000088k@www.sms.ed.ac.uk>
References: <20070411005413.xuzslk934000088k@www.sms.ed.ac.uk>
Message-ID: <200704182340.54395.Randy.OReilly@colorado.edu>

Matthijs, this is a bit confusing, but p_dwt only works in conjunction with the TDRewPredConSpec to use prior activations. For a regular DaModUnitSpec guy, turning it on will ENABLE such learning by maintaining the appropriate variables and calling the dwt functions, but the ConSpec must also be configured to actually do the weight change using prior sending unit variables. Although TDRewPredConSpec is typically used only for that specific rew pred guy in TD, I'm pretty sure you can just use it on the action connections. Give that a try.

Meanwhile, here is some irrelevant info about the p_dwt variable in the connection that I typed before I realized what you were talking about -- it might be useful to someone.

p_dwt is a bit of a weird variable: the dwt variable reflects any currently accumulating weight changes, which may not actually be applied for several steps depending on the learning parameters (e.g., SMALL_BATCH in the epoch process). dwt is reset after the weight changes are applied to update the weights, and p_dwt is updated to reflect the value that was applied. So, in the usual ON_LINE mode, dwt is 0 and p_dwt shows the weight change that was computed on the trial that just finished.

- Randy

On Tuesday 10 April 2007 17:54, M Snel wrote:
> Hi,
>
> I am trying to construct a model of TD learning in a simple navigation
> task. The network should learn to select the optimal action to get to a
> goal state; thus, the weights between inputs (encoding location in the
> environment) and outputs (encoding navigational actions) should be
> updated based on reward.
>
> I have connected the input units to the predicted reward layer, and
> clamp the external reward upon reaching goal state. By this
> construction the network accurately learns to represent the "value" of
> each input unit (i.e. higher expected reward closer to goal). I have
> connected the TDlayer to the DaModUnit action units and have turned on
> the Da modulation and "p dwt" in those units so that they should learn
> from the modulation.
>
> However, results for learning in the action units are not as I
> expected: the action units don't learn to map an input state to a
> correct action. I was assuming that in PDP the modulation from the
> TDlayer would be "interpreted" by the DaModUnits as feedback on the
> PREVIOUS action (as per the "p dwt" parameter). Is this correct? Also,
> do the units learn using the Da modulation directly or by a difference
> in Da modulations from one timestep to the next?
>
> Thanks,
> Matthijs
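
For readers following the dwt / p_dwt part of Randy's reply, here is a minimal sketch of that bookkeeping in plain Python. It is not PDP++ code: the Connection class, the helper functions, and the batch_size parameter are hypothetical names made up for illustration, and the update rule is only my reading of the description in the reply (accumulate into dwt, apply it to the weight, copy it to p_dwt, reset dwt).

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Toy connection (hypothetical, not the PDP++ class)."""
    wt: float = 0.0     # current weight
    dwt: float = 0.0    # weight change accumulated since the last update
    p_dwt: float = 0.0  # weight change that was applied at the last update


def accumulate(con: Connection, delta: float) -> None:
    """Add one trial's computed weight change to the pending total."""
    con.dwt += delta


def update_weights(con: Connection) -> None:
    """Apply the pending change, remember it in p_dwt, then reset dwt."""
    con.wt += con.dwt
    con.p_dwt = con.dwt
    con.dwt = 0.0


def run(deltas, batch_size):
    """Update the weight every `batch_size` trials (assumed stand-in for
    ON_LINE when batch_size is 1, and for SMALL_BATCH otherwise)."""
    con = Connection()
    for trial, delta in enumerate(deltas, start=1):
        accumulate(con, delta)
        if trial % batch_size == 0:
            update_weights(con)
    return con


# Update after every trial: dwt ends at 0, p_dwt holds the last trial's change.
online = run([0.5, 0.25], batch_size=1)

# Update every 2 trials: the third trial's change is still pending in dwt.
batched = run([0.5, 0.25, 0.125], batch_size=2)
```

With an update after every trial, dwt reads 0 between trials and p_dwt holds the change from the trial that just finished, which matches the ON_LINE behavior described in the reply. With updates every few trials, dwt can be nonzero while p_dwt still shows the last applied total.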