Proceedings of the 2nd Conference on Artificial General Intelligence (2009)

Feature Markov Decision Processes

Authors
Marcus Hutter
Corresponding Author
Marcus Hutter
Available Online June 2009.
DOI
10.2991/agi.2009.30
Abstract

General purpose intelligent learning agents cycle through (complex, non-MDP) sequences of observations, actions, and rewards. On the other hand, reinforcement learning is well-developed for small finite state Markov Decision Processes (MDPs). So far it is an art performed by human designers to extract the right state representation out of the bare observations, i.e. to reduce the agent setup to the MDP framework. Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion. I also integrate the various parts into one learning algorithm. Extensions to more realistic dynamic Bayesian networks are developed in the companion article [Hut09].
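The abstract's central idea, selecting a feature map that reduces the raw history to MDP states by minimizing a formal criterion, can be illustrated with a toy sketch. The paper's actual criterion is a Bayesian code-length cost; the snippet below substitutes a crude sequential Laplace code length for the induced state and reward sequences, and the names `mdp_cost` and `best_phi` are illustrative, not from the paper.

```python
import math
from collections import defaultdict

def mdp_cost(states, actions, rewards):
    """Sequential Laplace code length (in nats) of the state and reward
    sequences under the MDP induced by a candidate feature map.
    A simplified stand-in for the paper's code-length criterion."""
    n_s = len(set(states))   # state alphabet size under this map
    n_r = len(set(rewards))  # reward alphabet size
    trans = defaultdict(lambda: defaultdict(int))  # (s, a) -> s' counts
    rew = defaultdict(lambda: defaultdict(int))    # (s, a, s') -> r counts
    cost = 0.0
    for t in range(len(states) - 1):
        s, a, s2, r = states[t], actions[t], states[t + 1], rewards[t + 1]
        n = sum(trans[(s, a)].values())
        cost -= math.log((trans[(s, a)][s2] + 1) / (n + n_s))   # code next state
        m = sum(rew[(s, a, s2)].values())
        cost -= math.log((rew[(s, a, s2)][r] + 1) / (m + n_r))  # code reward
        trans[(s, a)][s2] += 1
        rew[(s, a, s2)][r] += 1
    return cost

def best_phi(history, candidates):
    """history: list of (observation, action, reward) triples.
    candidates: feature maps from observation prefixes to states.
    Returns the map whose induced MDP compresses the history best."""
    obs, acts, rews = zip(*history)
    def cost_of(phi):
        states = [phi(obs[:t + 1]) for t in range(len(obs))]
        return mdp_cost(states, list(acts), list(rews))
    return min(candidates, key=cost_of)
```

For example, on a history whose reward equals the last observation, a map that keeps the last observation as the state yields near-deterministic dynamics and a lower cost than a map that collapses everything to a single state.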

Copyright
© 2009, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2nd Conference on Artificial General Intelligence (2009)
Series
Advances in Intelligent Systems Research
Publication Date
June 2009
ISSN
1951-6851
DOI
10.2991/agi.2009.30

Cite this article

TY  - CONF
AU  - Marcus Hutter
PY  - 2009/06
DA  - 2009/06
TI  - Feature Markov Decision Processes
BT  - Proceedings of the 2nd Conference on Artificial General Intelligence (2009)
PB  - Atlantis Press
SP  - 138
EP  - 143
SN  - 1951-6851
UR  - https://doi.org/10.2991/agi.2009.30
DO  - 10.2991/agi.2009.30
ID  - Hutter2009/06
ER  -