Stimulus -> Action <-> Reward Network (SARN)
by Ondrej Pacovsky
An on-line reinforcement learning algorithm for real-time applications
with real-valued inputs.
A prototype implementation was done in the Unreal 2004 environment. A
SARN-controlled bot learns to evade missiles during the game.

The thesis
Online learning in Real-time environments
Abstract
In this work, a novel reinforcement
learning algorithm, SARN, is developed. It is targeted for application
in real-time
domains where the inputs are usually continuous and adaptation must
proceed on-line, without separate training periods. Another objective
is to minimise the amount of problem-specific teacher (human) input
needed for successful application of the algorithm. The SARN
architecture combines a connectionist network and scalar reinforcement
feedback by employing Hebbian principles. By
adapting the network weights, connections are established between
stimuli and actions that lead to positive feedback. Since the links
between the input stimuli and the actions are formed quite rapidly, it
is possible to use a large number of stimuli. This leads to the idea of
using a recurrent random network (Echo State Network) as a pre-processing
layer. A prototype implementation is tested in the Unreal 2004 game
environment. A comparison with Q-learning shows that on the time
scale of tens of seconds to minutes, SARN typically achieves better
performance. When coupled with an Echo State Network, SARN requires a
uniquely low amount of problem-specific information supplied by the
teacher. These features make SARN useful for domains such as autonomous
robot control and game AI.
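The idea of linking stimuli to actions through reward-scaled Hebbian weight changes can be illustrated with a toy example. The sketch below is not the SARN update rule from the thesis; it is a generic reward-modulated Hebbian learner, and the class name `HebbianRewardNet`, the function `run_demo`, the learning rate, the exploration rate and the two-stimulus "missile from the left or right" environment are all assumptions made for illustration.

```python
import random


class HebbianRewardNet:
    """Toy stimulus -> action network with reward-scaled Hebbian updates.

    Illustrative only: this is NOT the SARN update rule from the thesis.
    """

    def __init__(self, n_stimuli, n_actions, learning_rate=0.1, seed=0):
        self.rng = random.Random(seed)
        self.learning_rate = learning_rate
        # weights[a][s] links stimulus s to action a; start with no preference.
        self.weights = [[0.0] * n_stimuli for _ in range(n_actions)]

    def scores(self, stimuli):
        """Weighted sum of the real-valued stimuli for every action."""
        return [sum(w * x for w, x in zip(row, stimuli)) for row in self.weights]

    def choose(self, stimuli, explore=0.1):
        """Pick the best-scoring action, or a random one with probability `explore`."""
        if self.rng.random() < explore:
            return self.rng.randrange(len(self.weights))
        s = self.scores(stimuli)
        return max(range(len(s)), key=s.__getitem__)

    def learn(self, stimuli, action, reward):
        """Hebbian step: stimulus activity times chosen-action activity (1.0),
        scaled by the scalar reward. Positive reward strengthens the link,
        negative reward weakens it."""
        row = self.weights[action]
        for i, x in enumerate(stimuli):
            row[i] += self.learning_rate * reward * x


def run_demo(steps=200, seed=0):
    """Learn on-line, one step at a time, with no separate training phase."""
    net = HebbianRewardNet(n_stimuli=2, n_actions=2, seed=seed)
    env_rng = random.Random(seed + 1)
    for _ in range(steps):
        side = env_rng.randrange(2)  # hypothetical: 0 = missile from left, 1 = from right
        stimuli = [1.0 if i == side else 0.0 for i in range(2)]
        action = net.choose(stimuli)  # hypothetical: 0 = dodge left, 1 = dodge right
        reward = 1.0 if action != side else -1.0  # dodging away is rewarded
        net.learn(stimuli, action, reward)
    return net
```

Calling `run_demo()` and then `net.choose(stimuli, explore=0.0)` shows which action the learned weights prefer for a given stimulus. The weights in this sketch grow without bound; a practical rule would normalise or decay them.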
Full text - PDF
Movies
In low quality for now. To be updated soon.
All were recorded with the SARN-80 controller (SARN coupled with an
80-node Echo State Network).
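To give a rough picture of what an 80-node Echo State Network pre-processing layer does, here is a minimal sketch of a fixed random recurrent reservoir. It is not taken from the SARN source code: the class name `EchoStateReservoir`, the connection density, the weight ranges and the `gain` parameter are assumptions. The recurrent weights are rescaled so that the largest absolute row sum stays below 1, a conservative sufficient condition for the influence of old inputs to fade; scaling the spectral radius is the more common practice.

```python
import math
import random


class EchoStateReservoir:
    """Fixed random recurrent layer that expands inputs into n_nodes features.

    Illustrative sketch only; all parameter values are assumptions.
    """

    def __init__(self, n_inputs, n_nodes=80, density=0.1, gain=0.9, seed=0):
        rng = random.Random(seed)
        # Random input weights, never trained.
        self.w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
                     for _ in range(n_nodes)]
        # Sparse random recurrent weights, never trained.
        w = [[rng.uniform(-1.0, 1.0) if rng.random() < density else 0.0
              for _ in range(n_nodes)] for _ in range(n_nodes)]
        # Rescale so the largest absolute row sum equals `gain` (< 1).
        max_row_sum = max(sum(abs(v) for v in row) for row in w) or 1.0
        self.w = [[v * gain / max_row_sum for v in row] for row in w]
        self.state = [0.0] * n_nodes

    def step(self, inputs):
        """Advance the reservoir by one time step and return the new state."""
        new_state = []
        for w_in_row, w_row in zip(self.w_in, self.w):
            drive = sum(a * x for a, x in zip(w_in_row, inputs))
            echo = sum(b * s for b, s in zip(w_row, self.state))
            new_state.append(math.tanh(drive + echo))
        self.state = new_state
        return new_state
```

A call such as `EchoStateReservoir(n_inputs=3).step([0.2, -0.5, 1.0])` returns 80 values that mix the current input with a fading trace of earlier inputs; a learner can then treat those values as its stimuli.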
Source Code
The source code is available here. Feel free to have a look, but
compiling and running it will be tricky, as I have only tested it in my
own environment. The high-level programmer documentation is contained in
the thesis (slightly outdated), and you can also look at the
auto-generated Doxygen pages.