Learning state-action correspondence across reinforcement learning control tasks via partially paired trajectories


Research field: Artificial Intelligence



Venue: Applied Intelligence


Abstract

In many reinforcement learning (RL) tasks, the state-action space may change over time (e.g., an increased number of observable features, or a change in the representation of actions). Given these changes, the previously learnt policy will likely fail due to the mismatch of input and output features, and another policy must be trained from scratch, which is inefficient in terms of sample complexity. Recent works in transfer learning have succeeded in making RL algorithms more efficient by incorporating knowledge from previous tasks, thus partially alleviating this problem. However, such methods typically must be given an explicit state-action correspondence from one task to the other. An autonomous agent may not have access to such high-level information, but should be able to analyze its experience to identify similarities between tasks. In this paper, we propose a novel method for automatically learning a correspondence of states and actions from one task to another through an agent’s experience. In contrast to previous approaches, our method is based on two key insights: i) only the first state of the trajectories of the two tasks is paired, while the rest are unpaired and randomly collected, and ii) the transition model of the source task is used to predict the dynamics of the target task, thus aligning the unpaired states and actions. Additionally, this paper intentionally decouples the learning of the state-action correspondence from the transfer technique used, making it easy to combine with any transfer method. Our experiments demonstrate that our approach significantly accelerates transfer learning across a diverse set of problems, varying in state/action representation, physics parameters, and morphology, when compared to state-of-the-art algorithms that rely on cycle-consistency.
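
The two insights in the abstract can be illustrated with a toy alignment objective. The following is a minimal sketch under stated assumptions, not the authors' implementation: the linear source dynamics (A, B), the permuted and rescaled target task (P, c), the linear maps F and G, and the function names are all hypothetical choices made for illustration. The loss combines a dynamics term, in which the source transition model must predict the mapped target transitions, with a pairing term on the first states only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear source dynamics: s' = A s + B a (illustrative values).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.5]])

def source_model(s, a):
    """Source-task transition model, assumed known or already learned."""
    return s @ A.T + a @ B.T

# Hypothetical target task: same physics, but the state features are
# permuted (x = P s) and the action is rescaled (u = c * a).
P = np.array([[0.0, 1.0], [1.0, 0.0]])
c = 2.0

def target_step(x, u):
    """Target-task dynamics, used here only to generate data."""
    s = x @ np.linalg.inv(P).T
    a = u / c
    return source_model(s, a) @ P.T

# Unpaired, randomly collected target transitions (x, u, x').
x = rng.normal(size=(256, 2))
u = rng.normal(size=(256, 1))
x_next = target_step(x, u)

# Only the first states of the trajectories are paired across tasks.
s0 = rng.normal(size=(8, 2))
x0 = s0 @ P.T

def alignment_loss(F, G):
    """Loss for candidate linear maps F (target state -> source state)
    and G (target action -> source action)."""
    # Dynamics term: the source model should predict the mapped next state.
    predicted = source_model(x @ F.T, u @ G.T)
    dynamics = np.mean((predicted - x_next @ F.T) ** 2)
    # Pairing term: mapped first target states should match the source ones.
    pairing = np.mean((x0 @ F.T - s0) ** 2)
    return dynamics + pairing

# The true correspondence drives the loss to (numerically) zero,
# while a naive identity mapping does not.
loss_true = alignment_loss(np.linalg.inv(P), np.array([[1.0 / c]]))
loss_identity = alignment_loss(np.eye(2), np.eye(1))
print(loss_true, loss_identity)
```

In this sketch, minimizing alignment_loss over F and G (for example by gradient descent) would play the role of learning the correspondence. The learned maps could then be handed to any transfer method, which mirrors the decoupling described in the abstract.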


Author Profile
Javier García

Department of Electronics and Computer Science, Universidad de Santiago de Compostela, Lugo, Spain

Andorra
Author Profile
Iñaki Rañó

Department of Electronics and Computer Science, Universidad de Santiago de Compostela, Lugo, Spain

Andorra
Author Profile
J. Miguel Burés

CiTIUS (Centro de Investigación en Tecnoloxías Intelixentes), Universidad de Santiago de Compostela, Santiago de Compostela, Spain

Germany

Paper information

Publication year: 2024
Citations: 0
Publication country: Germany, Andorra
Site: Springer
