Tuesday, August 23, 2011

Recurrent Neural Network State Trace Animation



Animation depicting the state of a recurrent neural network evolved to perform a balancing task (the non-Markovian double cart-pole). The hidden layer of the RNN consists of two sinusoidal nodes, whose states are plotted through time, with decaying traces to aid visualization.



Here is an attempt at a similar style of visualization, but using 5 rather than 2 hidden neurons. The traces show all pairwise combinations of the hidden neuron states: {(1,2), (1,3), (1,4), (1,5), (2,3), (2,4), (2,5), (3,4), (3,5), (4,5)}, for a total of 10 traces.



Above is a picture created by allowing a single trace to accumulate over about 6000 time steps.
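
For concreteness, here is a minimal sketch of this plotting scheme (not the original code): it assumes the hidden-state history is available as an array of shape (T, 5) and draws each of the 10 pairwise combinations, with older points rendered lighter to mimic the fading trails.

```python
# Minimal sketch, assuming the hidden-state history is stored as a (T, 5)
# array; the states generated here are placeholders, not an evolved RNN.
import itertools
import numpy as np
import matplotlib.pyplot as plt

T, H = 6000, 5
states = np.random.randn(T, H).cumsum(axis=0)        # placeholder hidden states

pairs = list(itertools.combinations(range(H), 2))    # 10 pairs for H = 5
fig, axes = plt.subplots(2, 5, figsize=(15, 6))
for ax, (i, j) in zip(axes.ravel(), pairs):
    # colour each point by its time index: old = light grey, recent = dark
    ax.scatter(states[:, i], states[:, j], s=1,
               c=np.linspace(0.05, 1.0, T), cmap='Greys')
    ax.set_title(f'({i + 1},{j + 1})')
    ax.set_xticks([]); ax.set_yticks([])
plt.tight_layout()
plt.show()
```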



This second image depicts the dynamics of an RNN evolved on a more complex task: the 89-state maze, described here [PDF]. As is readily apparent, the dynamics are far more complex than those of the pole-balancing task.



This third image depicts the RNN dynamics when evolved on the embedded Reber grammar.


Tuesday, June 21, 2011

Memory-Based Black-Box Optimization



Experimenting with a memory-based approach to black-box optimization to help avoid being trapped in local optima.

The image on the left is a two-dimensional slice of a (much higher-dimensional) fitness landscape of a recurrent neural network. The optimal point lies just above the lower left-hand corner and is discovered after exactly 808 samples. The sampling process is shown on the right.

The entire image being searched is 500-by-500 pixels (250,000 in total).
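
The mechanism itself isn't described in this post, but the following is one hedged sketch of how a memory of past samples might be used to escape local optima: every evaluated point is archived, and exploration steps prefer candidates far from anything already in the archive. The objective function, search ranges, and sampling rule are all illustrative assumptions.

```python
# Hedged sketch of a memory-based search; the objective and the sampling
# rule are placeholders, not the method actually used in the experiment.
import numpy as np

rng = np.random.default_rng(0)

def fitness(p):
    # placeholder objective standing in for the 2D slice of RNN parameters
    x, y = p
    return -((x - 60.0) ** 2 + (y - 440.0) ** 2)

archive = []                        # memory: every point evaluated so far
best, best_f = None, -np.inf

for step in range(808):
    if best is not None and step % 2 == 0:
        p = np.clip(best + rng.normal(0, 15, size=2), 0, 500)   # local step
    else:
        cand = rng.uniform(0, 500, size=(32, 2))                 # exploration
        if archive:
            A = np.array(archive)
            d = np.linalg.norm(cand[:, None, :] - A[None, :, :], axis=2).min(axis=1)
            p = cand[np.argmax(d)]  # candidate farthest from all past samples
        else:
            p = cand[0]
    f = fitness(p)
    archive.append(p)
    if f > best_f:
        best, best_f = p.copy(), f
```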

Thursday, March 17, 2011

Balancing Seven Carts



A single recurrent neural network (15 hidden neurons) evolved to balance seven double-pole carts simultaneously. No velocity information is available to the controller, making the task non-Markovian. For each of the seven carts, the controller has access to the cart position and the angles of both poles, for a total of 21 inputs. The controller has seven outputs, with which it applies a force to each cart at every time step. The force is continuously valued rather than "bang-bang".
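
As a rough illustration of the interface just described (the weights, activation function, and force scaling below are placeholders, not the evolved network):

```python
# Sketch of the controller's input/output layout: 21 inputs = 7 carts x
# (cart position, pole-1 angle, pole-2 angle), 15 recurrent hidden units,
# and 7 continuous force outputs. Weights and activations are placeholders.
import numpy as np

N_IN, N_HID, N_OUT = 21, 15, 7

class RNNController:
    def __init__(self, rng):
        self.W_in = rng.normal(0, 0.1, (N_HID, N_IN))
        self.W_rec = rng.normal(0, 0.1, (N_HID, N_HID))
        self.W_out = rng.normal(0, 0.1, (N_OUT, N_HID))
        self.h = np.zeros(N_HID)                      # recurrent hidden state

    def step(self, obs):
        # obs: concatenated [position, angle_1, angle_2] for each of 7 carts
        self.h = np.tanh(self.W_in @ obs + self.W_rec @ self.h)
        return np.tanh(self.W_out @ self.h) * 10.0    # continuous forces

controller = RNNController(np.random.default_rng(1))
forces = controller.step(np.zeros(N_IN))              # one force per cart
```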

It is able to keep the carts balanced for 5000 time steps, although it becomes unstable at the end. Evolving the controller took precisely 13,373,736 fitness evaluations (roughly 20 hours of CPU time).

For the sake of time, this controller was evolved on only a single initial condition (with slightly off-center pole angles), so it is unknown how well it generalizes to other initial conditions. Even so, it remains a very challenging task.

Tuesday, March 08, 2011

Balancing Four Carts



It grows unstable towards the end, as the carts sway further and faster from the center, but I think it is impressive nonetheless.

See here and here for details.

UPDATE: More videos balancing five and six carts!

Balancing Three Carts



Same as my previous post, except this time with three carts and the following changes to the experimental setup:

First, the forces applied to the carts are now continuous rather than "bang-bang".

Second, the random number generator is seeded so that fitness evaluations are always performed over the same set of 10 initial conditions (previously, an entirely new set of 10 was generated for each evaluation); a sketch of this setup appears below. The advantage is that this greatly reduces noise in the fitness function. The drawback is that it may harm generalization to some degree, although in this particular case I don't think it is much of an issue.
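
A sketch of the fixed-evaluation-set idea (the condition ranges and the `simulate` function are hypothetical):

```python
# Sketch: draw the 10 initial conditions once from a seeded generator and
# reuse them for every candidate, so fitness differences reflect the
# controllers rather than evaluation noise. Ranges here are placeholders.
import numpy as np

rng = np.random.default_rng(42)                      # fixed seed
INITIAL_CONDITIONS = [rng.uniform(-0.05, 0.05, size=9) for _ in range(10)]
# 9 state variables per condition: 3 carts x (position, pole-1, pole-2 angle)

def evaluate(controller, simulate):
    # `simulate` runs one balancing trial and returns the time steps survived
    return sum(simulate(controller, ic) for ic in INITIAL_CONDITIONS)
```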

Now to try four...

Sunday, March 06, 2011

Non-Markovian Double Pole Balancing with Multiple Carts



Above is an animation depicting a single recurrent neural network (with 10 hidden neurons) evolved to act as a balancing controller for two carts simultaneously.

The RNN has no access to velocity information, which makes the task non-Markovian and hence significantly more difficult.

Cart-pole balancing is a fairly standard benchmark problem in the Reinforcement Learning literature, although to the best of my knowledge this is the first example of controlling multiple carts at the same time.

Friday, February 18, 2011

Animated Fitness Landscapes







Now using time to visualize a third dimension of the high-dimensional space. See my previous posts for more details.

Tuesday, February 15, 2011

Fitness Landscapes

Here are some visualizations of fitness landscapes for recurrent neural networks. I have taken two-dimensional "slices" out of an otherwise very high-dimensional parameter space. To create a "slice", two links of the RNN are randomly selected and then systematically varied, with the network re-evaluated for fitness at each setting. Generating one of these 800-by-800 pixel images therefore requires 640,000 fitness evaluations.
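
As a rough sketch of the procedure (the network, evaluation function, and weight ranges below are placeholders):

```python
# Sketch of generating one 2D slice: pick two links, sweep each over a range
# on an 800x800 grid, and record the fitness at every grid point while all
# other weights stay fixed.
import numpy as np

RES = 800                                  # 800 x 800 = 640,000 evaluations

def landscape_slice(weights, i, j, evaluate, span=2.0):
    w = weights.copy()
    values = np.linspace(-span, span, RES)
    img = np.empty((RES, RES))
    for a, wi in enumerate(values):
        for b, wj in enumerate(values):
            w[i], w[j] = wi, wj            # vary only the two selected links
            img[a, b] = evaluate(w)
    return img

# Example with a toy evaluation function (the real one runs the RNN on the task):
rng = np.random.default_rng(0)
weights = rng.normal(size=50)
i, j = rng.choice(len(weights), size=2, replace=False)
toy_eval = lambda w: -np.sum(np.sin(w[:5]) ** 2)
# img = landscape_slice(weights, i, j, toy_eval)     # 640,000 calls; slow
```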













I had previously done some similar experiments, although I think the results above are a bit nicer, aesthetically.

Monday, January 31, 2011

High-Order Adaptive Mutation



This is a visualization of an experiment I'm doing involving high-order, self-adaptive mutation distributions. This idea has grown out of an ongoing conversation with the creator of floatworld, a really neat (and increasingly full-featured) Artificial Life simulator.

The basic idea is that over time, the environment an adaptive organism finds itself in will change, and it may do so at varying rates. Rather than allowing only a fixed size/rate of mutation, why not allow the parameters controlling mutation to themselves be folded into the evolutionary process?

The visualization above is the result of building such a system, which additionally makes use of a covariance matrix so that the mutation distribution can be stretched and rotated arbitrarily.
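
Here is a hedged sketch of the idea (not the actual implementation): each individual carries its own mutation covariance matrix, which is itself perturbed and inherited, so the scale, shape, and orientation of the mutation distribution can evolve alongside the solution.

```python
# Sketch of self-adaptive mutation with a full covariance matrix; the
# perturbation scheme and step sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

class Individual:
    def __init__(self, genome, cov):
        self.genome = genome              # the solution parameters
        self.cov = cov                    # self-adapted mutation covariance

    def mutate(self, cov_step=0.05):
        # perturb the covariance symmetrically, then project it back onto the
        # set of positive-definite matrices so it stays a valid covariance
        jitter = rng.normal(0, cov_step, self.cov.shape)
        new_cov = self.cov + (jitter + jitter.T) / 2
        vals, vecs = np.linalg.eigh(new_cov)
        new_cov = (vecs * np.clip(vals, 1e-6, None)) @ vecs.T
        # mutate the genome using the (newly perturbed) distribution
        new_genome = self.genome + rng.multivariate_normal(
            np.zeros(len(self.genome)), new_cov)
        return Individual(new_genome, new_cov)

parent = Individual(np.zeros(2), np.eye(2) * 0.1)
child = parent.mutate()                   # stretched/rotated mutations can emerge
```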