The International Conference for High Performance Computing, Networking, Storage, and Analysis

Research and ACM SRC Posters Archive

Shortcut Mixup Policy: Toward Improving Robustness and Speed in Goal-Conditioned RL


Poster Type: Research Posters

Authors: Matthew Hyatt (Loyola University Chicago, Argonne National Laboratory (ANL)), Yassir Atlas (Argonne National Laboratory (ANL)), Hal Brynteson (Argonne National Laboratory (ANL)), Diego Roa Perdomo (Argonne National Laboratory (ANL)), Athena Angara (Argonne National Laboratory (ANL)), Mengjiao Han (Argonne National Laboratory (ANL)), Joseph Insley (Argonne National Laboratory (ANL)), Janet Knowles (Argonne National Laboratory (ANL)), Yongho Kim (Argonne National Laboratory (ANL)), Victor Mateevitsi (Argonne National Laboratory (ANL)), Michael Papka (Argonne National Laboratory (ANL)), Silvio Rizzi (Argonne National Laboratory (ANL)), George Thiruvathukal (Loyola University Chicago, Argonne National Laboratory (ANL)), Nicola Ferrier (Argonne National Laboratory (ANL))

Abstract: Neural networks trained on large datasets can be effective policies for controlling robotic manipulators. Using self-supervised learning, these networks can achieve near-perfect success rates on complex pick-and-place-style tasks. However, slow task completion is often a barrier to deploying learned policies in practice. For instance, a task that requires 500 distinct token predictions requires many forward passes through the network in real time. Moreover, learning optimal task behavior, as in reinforcement learning, requires assigning state values across a long time horizon, which is often an impediment to learning. To address these challenges, we present Shortcut Mixup Policy, a method that artificially reduces the task horizon length. The method trains a model on next-token prediction tasks, optionally conditioned on a target state-shortcut size. We present initial results using Shortcut Mixup Policy and propose future directions for improvement.
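
The abstract does not spell out how shortcut-conditioned training examples are built, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes a trajectory is a flat list of tokens, that a "shortcut size" k means predicting the token k steps ahead instead of the next one, and that the conditioning is dropped at random so the same model also learns ordinary next-token prediction. The names make_shortcut_examples, rollout_length, NO_SHORTCUT, max_shortcut, and drop_prob are all hypothetical, and goal (target state) conditioning is left out for brevity.

```python
import random

NO_SHORTCUT = 0  # hypothetical sentinel: "no shortcut conditioning supplied"


def make_shortcut_examples(trajectory, max_shortcut, drop_prob=0.5, rng=None):
    """Build (context, condition, target) triples from one token trajectory.

    Assumption (not from the poster): for each position t we either
      - drop the condition and target the ordinary next token (k = 1), or
      - sample a shortcut size k and target the token k steps ahead.
    """
    if rng is None:
        rng = random.Random()
    examples = []
    for t in range(len(trajectory) - 1):
        remaining = len(trajectory) - 1 - t  # steps left after position t
        if rng.random() < drop_prob:
            k, condition = 1, NO_SHORTCUT
        else:
            k = rng.randint(1, min(max_shortcut, remaining))
            condition = k
        context = trajectory[: t + 1]
        target = trajectory[t + k]
        examples.append((context, condition, target))
    return examples


def rollout_length(horizon, shortcut):
    """Forward passes needed to cover `horizon` steps, `shortcut` steps per pass."""
    return -(-horizon // shortcut)  # ceiling division


if __name__ == "__main__":
    toy_trajectory = list(range(10))  # stand-in for real state/action tokens
    for context, condition, target in make_shortcut_examples(
        toy_trajectory, max_shortcut=4, rng=random.Random(0)
    ):
        print(len(context), condition, target)
    print(rollout_length(500, 1), rollout_length(500, 8))
```

Under these assumptions, a 500-step task that needs 500 forward passes at shortcut size 1 would need 63 passes at shortcut size 8, which is the kind of horizon reduction the abstract describes. Whether the actual method samples shortcut sizes this way, or how the "mixup" component enters, is not stated in the abstract.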

Best Poster Finalist (BP): no
