Diferencia entre revisiones de «D-RR-QL»
(Página creada con «Distributed Round-Robin Q-Learning (D-RR-QL) is a Reinforcement Learning algorithm that allows to approximate the optimal joint-policy of a multi-agent system in a two-step...») |
Sin resumen de edición |
||
(No se muestran 4 ediciones intermedias del mismo usuario) | |||
Línea 1: | Línea 1: | ||
Distributed Round-Robin Q-Learning (D-RR-QL) is a Reinforcement Learning algorithm that allows to approximate the optimal joint-policy of a multi-agent system in a two-step fashion. First, each agent learns in its own local state-action following a round-robin schedule, thus avoiding non-stationarity due to the rest of agents learning their own policies. Then a coordination procedure approximates the optimal joint-policy by a greedy selection procedure using message passing. | Distributed Round-Robin Q-Learning (D-RR-QL) is a Reinforcement Learning algorithm that allows to approximate the optimal joint-policy of a multi-agent system in a two-step fashion. First, each agent learns in its own local state-action following a round-robin schedule, thus avoiding non-stationarity due to the rest of agents learning their own policies. Then a coordination procedure approximates the optimal joint-policy by a greedy selection procedure using message passing. | ||
The main advantage of D-RR-QL is that it allows each agent to use Modular State-Action Vetoes, which is a technique that allows RL agents to boost their exploration efficiency when approaching over-constrained systems, such as Linked Multicomponent Robotic Systems. The code | The main advantage of D-RR-QL is that it allows each agent to use Modular State-Action Vetoes, which is a technique that allows RL agents to boost their exploration efficiency when approaching over-constrained systems, such as Linked Multicomponent Robotic Systems. The following source-code was used in the experiments of the following paper: | ||
"Learning Multirobot Hose Transportation and Deployment by Round-Robin Distributed Q-Learning" | ;"Learning Multirobot Hose Transportation and Deployment by Round-Robin Distributed Q-Learning" | ||
Borja Fernandez-Gauna, Ismael Etxeberria-Agiriano and Manuel Graña | :Borja Fernandez-Gauna, Ismael Etxeberria-Agiriano and Manuel Graña | ||
Plos-One | :Plos-One | ||
http://github.com/borjafdezgauna/D-RR-QL-PlosONE |
Revisión actual - 15:05 13 abr 2015
Distributed Round-Robin Q-Learning (D-RR-QL) is a Reinforcement Learning algorithm that allows to approximate the optimal joint-policy of a multi-agent system in a two-step fashion. First, each agent learns in its own local state-action following a round-robin schedule, thus avoiding non-stationarity due to the rest of agents learning their own policies. Then a coordination procedure approximates the optimal joint-policy by a greedy selection procedure using message passing.
The main advantage of D-RR-QL is that it allows each agent to use Modular State-Action Vetoes, which is a technique that allows RL agents to boost their exploration efficiency when approaching over-constrained systems, such as Linked Multicomponent Robotic Systems. The following source-code was used in the experiments of the following paper:
- "Learning Multirobot Hose Transportation and Deployment by Round-Robin Distributed Q-Learning"
- Borja Fernandez-Gauna, Ismael Etxeberria-Agiriano and Manuel Graña
- Plos-One