This subclass of the Two-Layer ECM adds a variable for action priming: a list of floats, one element per action in the ECM. These "priming values" are summed with the h-values of the edges into the associated action nodes before walk probabilities are calculated with the softmax function.
|               | Type  | Default | Details |
|---------------|-------|---------|---------|
| num_actions   | int   |         | The number of available actions. |
| glow          | float | 0.1     | The glow (or eta) parameter. |
| damp          | float | 0.01    | The damping (or gamma) parameter. |
| softmax       | float | 0.5     | The softmax (or beta) parameter. |
| action_primes | list  | None    | Weights on the probability that deliberation steps into each action; defaults to 0 for each action. |
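To make the mechanism concrete, here is a minimal sketch (not the library's internals) of how the priming values would shift the walk probabilities out of a percept node; the h-values and beta below are illustrative assumptions:

```python
import numpy as np

beta = 0.5                            # softmax parameter
h_values = np.array([1.0, 1.0])       # h-values of the edges into each action node
action_primes = np.array([0.0, 1.5])  # one priming value per action

# Primes are summed with the h-values before the softmax is applied
logits = beta * (h_values + action_primes)
walk_probs = np.exp(logits) / np.exp(logits).sum()
print(walk_probs)  # ~[0.32, 0.68]: the primed action is favored
```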
Example
```python
import numpy as np
import matplotlib.pyplot as plt

# Priming_ECM and RLGL are assumed to be in scope (defined earlier in this package)
test_ECM = Priming_ECM(num_actions=2, action_primes=[0., 1.5])

# Number of steps to run simulation
T = 100
data_log = [None] * T
env = RLGL()  # create a default red-light-green-light environment

for t in range(T):
    observation = env.get_observation()
    action = test_ECM.sample(observation)
    reward = env.get_reward(action)
    test_ECM.learn(reward)
    data_log[t] = {"env_state": env.state, "action": action, "reward": reward}
    env.transition(action)

# Plot a 10-step trailing moving average of the reward
plt.plot(
    range(T),
    [np.mean([data_log[step]["reward"] for step in range(i - 9, i + 1) if step >= 0])
     for i in range(T)],
)
plt.show()
```
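If you prefer, the trailing average in the last step can be computed with `np.convolve`; this sketch drops the partial windows at the start of the run rather than averaging over fewer steps:

```python
window = 10
rewards = np.array([entry["reward"] for entry in data_log])
smoothed = np.convolve(rewards, np.ones(window) / window, mode="valid")
plt.plot(range(window - 1, T), smoothed)
plt.show()
```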