title:
 
Feature Markov Decision Processes
publication:
 
AGI-09
part of series:
  Advances in Intelligent Systems Research
ISBN:
  978-90-78677-24-6
ISSN:
  1951-6851
DOI:
doi:10.2991/agi.2009.30
author(s):
 
Marcus Hutter
publication date:
 
May 2009
abstract:
 
General purpose intelligent learning agents cycle through (complex, non-MDP) sequences of observations, actions, and rewards. On the other hand, reinforcement learning is well-developed for small finite state Markov Decision Processes (MDPs). So far it is an art performed by human designers to extract the right state representation out of the bare observations, i.e. to reduce the agent setup to the MDP framework. Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion. I also integrate the various parts into one learning algorithm. Extensions to more realistic dynamic Bayesian networks are developed in the companion article [Hut09].
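The core idea the abstract alludes to, reducing a raw observation history to an MDP state via a feature map and then running standard reinforcement learning on the induced states, can be sketched as follows. This is a minimal illustrative toy, not the paper's algorithm: the environment, the feature map `phi` (last observation as state), and the tabular Q-learning loop are all assumptions chosen so that `phi` does yield a valid MDP.

```python
import random

def phi(history):
    """Feature map: reduce the full history to a candidate MDP state.
    Here (by construction of the toy domain) the last observation suffices."""
    return history[-1]

def step(state, action):
    """Toy dynamics: action 1 flips the state; reward 1 for leaving state 0."""
    next_obs = state ^ action
    reward = 1.0 if (state == 0 and action == 1) else 0.0
    return next_obs, reward

def q_learning(steps=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on the states induced by phi."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    history = [0]
    for _ in range(steps):
        s = phi(history)
        # epsilon-greedy action selection on the induced state
        if rng.random() < eps:
            a = rng.choice((0, 1))
        else:
            a = max((0, 1), key=lambda act: Q[(s, act)])
        obs, r = step(s, a)
        history.append(obs)
        s2 = phi(history)
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, 0)], Q[(s2, 1)]) - Q[(s, a)])
    return Q

Q = q_learning()
```

The paper's contribution is a criterion for choosing a good phi automatically when, unlike here, it is not obvious which summary of the history is Markovian; the sketch only shows the "reduce, then apply standard RL" pipeline that such a criterion would feed into.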
copyright:
 
Atlantis Press. This article is distributed under the terms of the Creative Commons Attribution License, which permits non-commercial use, distribution and reproduction in any medium, provided the original work is properly cited.