Optimal demonstrations for behavior cloning

Abstract

Behavior cloning has become a popular framework for training visuomotor policies in robotics. Yet it depends heavily on the quality of human demonstrations, which tend to be both suboptimal and time-consuming to collect. We propose using Graphs of Convex Sets (GCS) to automatically generate optimal demonstrations, and we use these demonstrations to train a Diffusion Policy. We show that, without requiring any human demonstrations, the action trajectories executed by the Diffusion Policy are close to the optimal ones that GCS would have produced for the same initial conditions. In doing so, we reveal this novel paradigm’s potential to overcome the many downsides of human-generated demonstrations.