Learning model-free robot control by a Monte Carlo EM algorithm

Toussaint Marc; Kontes Georgios; Piperidis Savvas (Πιπεριδης Σαββας); Vlassis Nikos

Files

Vlassis_et_al_Autonomous_Robots_27_2009.pdf (751.76 KB)

Date

2009

Authors

Toussaint Marc

Kontes Georgios

Piperidis Savvas (Πιπεριδης Σαββας)

Vlassis Nikos

Publisher

Springer Verlag

Abstract

We address the problem of learning robot control by model-free reinforcement learning (RL). We adopt the probabilistic model of Vlassis and Toussaint (2009) for model-free RL, and we propose a Monte Carlo EM algorithm (MCEM) for control learning that searches directly in the space of controller parameters using information obtained from randomly generated robot trajectories. MCEM is related to, and generalizes, the PoWER algorithm of Kober and Peters (2009). In the finite-horizon case MCEM reduces precisely to PoWER, but MCEM can also handle the discounted infinite-horizon case. An interesting result is that the infinite-horizon case can be viewed as a ‘randomized’ version of the finite-horizon case, in the sense that the length of each sampled trajectory is a random draw from an appropriately constructed geometric distribution. We provide some preliminary experiments demonstrating the effects of fixed (PoWER) vs randomized (MCEM) horizon length in two simulated and one real robot control tasks.
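
The 'randomized horizon' idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact form of the geometric distribution. The sketch assumes the common construction in which a trajectory continues with probability gamma (the discount factor) at every step, so that its length T satisfies P(T = t) = (1 - gamma) * gamma^t. Under that assumption, the undiscounted sum of rewards over a randomly truncated trajectory has the same expectation as the discounted infinite-horizon return. All function names and the toy reward signal are hypothetical.

```python
import random


def sample_horizon(gamma, rng):
    """Draw T with P(T = t) = (1 - gamma) * gamma**t for t = 0, 1, 2, ...

    Assumed construction: after every step the trajectory continues
    with probability gamma and stops otherwise.
    """
    t = 0
    while rng.random() < gamma:
        t += 1
    return t


def discounted_return(reward, gamma, n_steps=2000):
    """Discounted return of a reward signal, truncated at n_steps terms."""
    return sum(gamma ** t * reward(t) for t in range(n_steps))


def randomized_horizon_return(reward, gamma, rng):
    """Undiscounted return over one trajectory of random length T + 1."""
    horizon = sample_horizon(gamma, rng)
    return sum(reward(t) for t in range(horizon + 1))


def monte_carlo_estimate(reward, gamma, n_samples, seed=0):
    """Average of randomized-horizon returns over n_samples trajectories."""
    rng = random.Random(seed)
    total = sum(randomized_horizon_return(reward, gamma, rng)
                for _ in range(n_samples))
    return total / n_samples


def constant_reward(t):
    """Toy reward signal (hypothetical): reward 1.0 at every time step."""
    return 1.0


if __name__ == "__main__":
    gamma = 0.9
    print("discounted return:    ", discounted_return(constant_reward, gamma))
    print("randomized estimate:  ",
          monte_carlo_estimate(constant_reward, gamma, n_samples=20000))
```

The two printed values should agree up to Monte Carlo noise, because P(T >= t) = gamma^t weights the reward at step t exactly as discounting would.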

Keywords

Reinforcement learning

Citation

N. Vlassis, M. Toussaint, G. Kontes, and S. Piperidis, "Learning model-free robot control by a Monte Carlo EM algorithm," Autonomous Robots, vol. 27, no. 2, pp. 123-130, 2009.

URI

https://dspace.library.tuc.gr/handle/123456789/585

Collections

School of Production Engineering and Management -> Journal Publications
