Q-Learning

Q-Learning Example 2

This example is simply a 16-node version of Example 1, using similar code with a few refinements.

Note how much memory is wasted on matrix Q (all the zeros in the results) for node pairs that have no links between them, and therefore no learning to record.  This suggests the need for a better method of storing learned information.  At first glance, the zeros may appear to form geometric patterns in matrix Q, but these are merely a consequence of how the node/link layout was designed.  Now consider that the indices of matrix Q are what we use to map the agent's progress.  Why not record just those index coordinates together with their learning scores, and discard anything with a zero?  In any case, this example illustrates the disadvantages of allocating an entire matrix for Q; a sketch of one such sparse alternative follows.
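
To make the idea concrete, here is a minimal sketch in Java of one way to store only the non-zero learning scores, keyed by the (state, action) index pair.  The class and method names are illustrative assumptions, not part of the example's actual code.

import java.util.HashMap;
import java.util.Map;

// Sparse Q-value storage: only state/action pairs that have actually
// been updated take up memory, unlike a mostly-zero n-by-n matrix.
public class SparseQ {
    private final Map<Long, Double> table = new HashMap<>();

    // Pack the (state, action) index pair into a single map key.
    private long key(int state, int action) {
        return ((long) state << 32) | (action & 0xFFFFFFFFL);
    }

    // Unrecorded pairs default to 0.0, just like the zero-filled matrix.
    public double get(int state, int action) {
        return table.getOrDefault(key(state, action), 0.0);
    }

    // Only non-zero learning scores are kept.
    public void set(int state, int action, double value) {
        if (value != 0.0) {
            table.put(key(state, action), value);
        }
    }
}

A full matrix answers lookups slightly faster, but for a sparsely linked node layout like this one, the map holds only the handful of entries that ever receive a learning score.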


Example Results 1

Shortest routes from initial states:
1, 0, 4, 8, 9, 5, 6, 2, 3, 7, 11, 15
3, 7, 11, 15
5, 6, 2, 3, 7, 11, 15
2, 3, 7, 11, 15
4, 8, 9, 5, 6, 2, 3, 7, 11, 15
0, 4, 8, 9, 5, 6, 2, 3, 7, 11, 15
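
For reference, here is a minimal sketch of how routes like those above can be read out of a trained Q matrix: from each starting node, repeatedly take the action with the highest Q value until the goal is reached.  The goal node (15) and the method name are assumptions for illustration, not taken from the example's actual code.

// Walk the greedy policy from start to goal and print the route.
public static void printRoute(double[][] q, int start, int goal) {
    int state = start;
    StringBuilder route = new StringBuilder(Integer.toString(state));
    // Cap the walk so an untrained table cannot loop forever.
    for (int step = 0; state != goal && step < q.length; step++) {
        int best = 0;
        for (int action = 1; action < q[state].length; action++) {
            if (q[state][action] > q[state][best]) {
                best = action;
            }
        }
        state = best; // in this example, the action index is the next node
        route.append(", ").append(state);
    }
    System.out.println(route);
}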
