Difference between revisions of «D-RR-QL»

Revision as of 10:24, 24 Oct 2014

Distributed Round-Robin Q-Learning (D-RR-QL) is a Reinforcement Learning algorithm that approximates the optimal joint policy of a multi-agent system in two steps. First, each agent learns over its own local state-action space following a round-robin schedule, which avoids the non-stationarity that arises when the other agents learn their own policies at the same time. Then, a coordination procedure approximates the optimal joint policy through greedy action selection based on message passing.
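
The two steps can be illustrated with a minimal sketch. This is not the source code used in the paper's experiments: the corridor task, the `step` dynamics, the `compatible` predicate and all parameter values are hypothetical choices made only for illustration, and the coordination step is a simplified stand-in for the message-passing procedure described above.

```python
import random

N_STATES, N_ACTIONS, GOAL = 5, 2, 4  # toy corridor; action 0 = left, 1 = right


def step(state, action):
    """Hypothetical local dynamics: walk a corridor, reward 1.0 on reaching GOAL."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def greedy(q_row, rng):
    """Pick a highest-valued action, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def train_round_robin(n_agents=2, episodes=400, alpha=0.5, gamma=0.9,
                      epsilon=0.3, max_steps=50, seed=0):
    rng = random.Random(seed)
    # One local Q-table per agent: q[agent][state][action].
    q = [[[0.0] * N_ACTIONS for _ in range(N_STATES)] for _ in range(n_agents)]
    for episode in range(episodes):
        learner = episode % n_agents  # round-robin turn: only this agent learns
        states = [0] * n_agents
        for _ in range(max_steps):
            for i in range(n_agents):
                s = states[i]
                if s == GOAL:
                    continue
                explore = i == learner and rng.random() < epsilon
                a = rng.randrange(N_ACTIONS) if explore else greedy(q[i][s], rng)
                s2, r, done = step(s, a)
                if i == learner:  # the other agents' tables stay frozen
                    target = r + (0.0 if done else gamma * max(q[i][s2]))
                    q[i][s][a] += alpha * (target - q[i][s][a])
                states[i] = s2
            if all(s == GOAL for s in states):
                break
    return q


def coordinate(q, states, compatible=lambda message, action: True):
    """Agents choose in turn; each forwards the partial joint action as a message."""
    message = []
    for i, s in enumerate(states):
        allowed = [a for a in range(N_ACTIONS) if compatible(message, a)]
        candidates = allowed or list(range(N_ACTIONS))  # fall back if all are ruled out
        message.append(max(candidates, key=lambda a: q[i][s][a]))
    return message


q_tables = train_round_robin()
joint_action = coordinate(q_tables, states=[0, 2])
```

Because only one agent updates its Q-table in any episode, the behaviour of the other agents is fixed from the learner's point of view during that episode. In this toy task the agents do not interact, so the sketch shows only the scheduling and the sequential greedy selection, not the constraints of a real linked system.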

The main advantage of D-RR-QL is that each agent can use Modular State-Action Vetoes, a technique that improves the exploration efficiency of RL agents in over-constrained systems, such as Linked Multicomponent Robotic Systems. The following source code was used in the experiments of this paper:

"Learning Multirobot Hose Transportation and Deployment by Round-Robin Distributed Q-Learning", Borja Fernandez-Gauna, Ismael Etxeberria-Agiriano and Manuel Graña, PLOS ONE

@@ Line 1: / Line 1: @@
 Distributed Round-Robin Q-Learning (D-RR-QL) is a Reinforcement Learning algorithm that allows to approximate the optimal joint-policy of a multi-agent system in a two-step fashion. First, each agent learns in its own local state-action following a round-robin schedule, thus avoiding non-stationarity due to the rest of agents learning their own policies. Then a coordination procedure approximates the optimal joint-policy by a greedy selection procedure using message passing.
-The main advantage of D-RR-QL is that it allows each agent to use Modular State-Action Vetoes, which is a technique that allows RL agents to boost their exploration efficiency when approaching over-constrained systems, such as Linked Multicomponent Robotic Systems. The code that follows was used in the experiments of the following paper:
+The main advantage of D-RR-QL is that it allows each agent to use Modular State-Action Vetoes, which is a technique that allows RL agents to boost their exploration efficiency when approaching over-constrained systems, such as Linked Multicomponent Robotic Systems. The following source-code was used in the experiments of the following paper:
 "Learning Multirobot Hose Transportation and Deployment by Round-Robin Distributed Q-Learning"
 Borja Fernandez-Gauna, Ismael Etxeberria-Agiriano and Manuel Graña
 Plos-One
