StARformer: Transformer With State-Action-Reward Representations for Robot Learning
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Shang, Jinghuan | - |
| dc.contributor.author | Li, Xiang | - |
| dc.contributor.author | Kahatapitiya, Kumara | - |
| dc.contributor.author | Lee, Yu-Cheol | - |
| dc.contributor.author | Ryoo, Michael S. | - |
| dc.date.accessioned | 2024-08-08T08:31:01Z | - |
| dc.date.available | 2024-08-08T08:31:01Z | - |
| dc.date.issued | 2023-11 | - |
| dc.identifier.issn | 0162-8828 | - |
| dc.identifier.issn | 1939-3539 | - |
| dc.identifier.uri | https://scholarworks.dongguk.edu/handle/sw.dongguk/20484 | - |
| dc.description.abstract | Reinforcement Learning (RL) can be considered as a sequence modeling task, where an agent employs a sequence of past state-action-reward experiences to predict a sequence of future actions. In this work, we propose State-Action-Reward Transformer (StARformer), a Transformer architecture for robot learning with image inputs, which explicitly models short-term state-action-reward representations (StAR-representations), essentially introducing a Markovian-like inductive bias to improve long-term modeling. StARformer first extracts StAR-representations using self-attending patches of image states, action, and reward tokens within a short temporal window. These StAR-representations are combined with pure image state representations, extracted as convolutional features, to perform self-attention over the whole sequence. Our experimental results show that StARformer outperforms the state-of-the-art Transformer-based method on image-based Atari and DeepMind Control Suite benchmarks, under both offline-RL and imitation learning settings. We find that models can benefit from our combination of patch-wise and convolutional image embeddings. StARformer is also more compliant with longer sequences of inputs than the baseline method. Finally, we demonstrate how StARformer can be successfully applied to a real-world robot imitation learning setting via a human-following task. © 1979-2012 IEEE. | - |
| dc.format.extent | 16 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | IEEE | - |
| dc.title | StARformer: Transformer With State-Action-Reward Representations for Robot Learning | - |
| dc.type | Article | - |
| dc.publisher.location | United States | - |
| dc.identifier.doi | 10.1109/TPAMI.2022.3204708 | - |
| dc.identifier.scopusid | 2-s2.0-85137939771 | - |
| dc.identifier.wosid | 001085050900010 | - |
| dc.identifier.bibliographicCitation | IEEE Transactions on Pattern Analysis and Machine Intelligence, v.45, no.11, pp 12862 - 12877 | - |
| dc.citation.title | IEEE Transactions on Pattern Analysis and Machine Intelligence | - |
| dc.citation.volume | 45 | - |
| dc.citation.number | 11 | - |
| dc.citation.startPage | 12862 | - |
| dc.citation.endPage | 12877 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
| dc.subject.keywordAuthor | imitation learning | - |
| dc.subject.keywordAuthor | reinforcement learning | - |
| dc.subject.keywordAuthor | robot learning | - |
| dc.subject.keywordAuthor | Transformer | - |
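
The abstract above describes a two-stream design: a short-window Transformer that self-attends over image-patch, action, and reward tokens to produce per-step StAR-representations, and a long-range Transformer that attends over those representations interleaved with convolutional state features. The sketch below is a minimal illustration of that layout in PyTorch, not the authors' implementation; all module names, dimensions, and the mean-pooled step summary are assumptions, and the causal masking needed for autoregressive action prediction is omitted for brevity.

```python
# Illustrative sketch only (assumed structure, not the paper's code).
import torch
import torch.nn as nn

class StARBlockSketch(nn.Module):
    """Short-window stream: self-attention over patch + action + reward tokens of one step."""
    def __init__(self, d_model=192, n_heads=4, patch_dim=3 * 16 * 16, n_actions=18):
        super().__init__()
        self.patch_embed = nn.Linear(patch_dim, d_model)    # flattened image patches
        self.action_embed = nn.Embedding(n_actions, d_model)
        self.reward_embed = nn.Linear(1, d_model)
        self.attn = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)

    def forward(self, patches, action, reward):
        # patches: (B, P, patch_dim); action: (B,) long; reward: (B, 1)
        tokens = torch.cat([
            self.patch_embed(patches),
            self.action_embed(action).unsqueeze(1),
            self.reward_embed(reward).unsqueeze(1),
        ], dim=1)                                  # (B, P + 2, d_model)
        star = self.attn(tokens)                   # step-local self-attention
        return star.mean(dim=1)                    # one StAR-representation per step (assumed pooling)

class StARformerSketch(nn.Module):
    """Long-range stream: StAR tokens interleaved with convolutional state features."""
    def __init__(self, d_model=192, n_heads=4, n_actions=18):
        super().__init__()
        self.star = StARBlockSketch(d_model, n_heads, n_actions=n_actions)
        self.conv = nn.Sequential(                 # pure-state convolutional embedding
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, d_model))
        self.temporal = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.action_head = nn.Linear(d_model, n_actions)

    def forward(self, images, patches, actions, rewards):
        # images: (B, T, 3, H, W); patches: (B, T, P, patch_dim)
        B, T = images.shape[:2]
        star = torch.stack([self.star(patches[:, t], actions[:, t], rewards[:, t])
                            for t in range(T)], dim=1)           # (B, T, d)
        state = self.conv(images.flatten(0, 1)).view(B, T, -1)   # (B, T, d)
        seq = torch.stack([star, state], dim=2).flatten(1, 2)    # interleave: (B, 2T, d)
        out = self.temporal(seq)                                 # whole-sequence attention
        return self.action_head(out[:, 1::2])                    # act from state positions: (B, T, n_actions)

# Hypothetical shapes: four 64x64 frames, each split into sixteen 16x16 patches.
model = StARformerSketch()
logits = model(torch.randn(2, 4, 3, 64, 64),
               torch.randn(2, 4, 16, 3 * 16 * 16),
               torch.randint(0, 18, (2, 4)),
               torch.randn(2, 4, 1))              # -> (2, 4, 18)
```

The interleaving step reflects the abstract's claim that StAR-representations are combined with pure convolutional state features before whole-sequence self-attention, which is how the short-term Markovian-like bias is injected into the long-term model.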
