Motor adaptation via distributional learning

Abstract

Both artificial and biological controllers experience errors during learning that are probabilistically distributed. We develop a framework for modeling distributions of errors and relating deviations in these distributions to neural activity. The biological system we consider involves human subjects learning to minimize the roll of an inverted T-shaped object with an unbalanced weight (i.e., one side of the object is heavier than the other) during lift. We also record BOLD activity during this process. For our experimental setup, we define the state of the system to be the maximum roll magnitude of the object after lift onset and give subjects the goal of achieving the zero state. We derive a model for this problem from a variant of Temporal Difference Learning and then combine it with Distributional Reinforcement Learning (DRL), a framework in which a value distribution is defined by treating the reward as stochastic. This model transforms the goal of the controller from achieving a target state to achieving a distribution over distances from the target state. We call the resulting model a Distributional Temporal Difference Model (DTDM). The DTDM allows us to model errors in minimizing object roll as deviations in the value distribution when the center of mass of the unbalanced object is changed. We compute deviations in global neural activity and show that they vary continuously with deviations in the value distribution. Several factors might contribute to this global shift in the signal, including differences in grasp and lift force at lift onset, as well as sensory feedback of error/roll after lift onset. We predict that there exists a coordinated, global response to errors that incorporates all of this information, encodes the DTDM objective, and is used on subsequent trials to enable success. We validate the utility of the DTDM as a model for biological adaptation by using it to engineer a robotic controller that solves a similar problem. In summary, we develop a novel theoretical framework and show that it can be used to model a non-trivial motor learning task. Because this framework is consistent with state-of-the-art reinforcement learning, we can also use it to program a robot to perform a similar task. These results suggest a way to model the multiple subsystems that compose global neural activity in a manner that transfers well to the engineering of artificial intelligence.
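
To make the core idea concrete, the sketch below illustrates one way a distributional TD-style update over distances from the target state could be implemented. The class name DistributionalTD, the bin settings, the learning rate, and the projection rule are illustrative assumptions for exposition, not the authors' exact DTDM; it is a minimal sketch of the general technique of maintaining and updating a value distribution from stochastic error samples.

```python
import numpy as np

# Illustrative sketch (hypothetical names, not the authors' exact DTDM).
# The value distribution is a categorical distribution over bins of
# distance-from-target (e.g. peak object roll after lift onset); each
# trial's observed error nudges the distribution toward that sample,
# treating the "reward" as stochastic in the spirit of distributional RL.

class DistributionalTD:
    def __init__(self, n_bins=21, max_error=30.0, lr=0.1):
        self.support = np.linspace(0.0, max_error, n_bins)  # distance-from-target bins
        self.probs = np.full(n_bins, 1.0 / n_bins)          # current value distribution
        self.lr = lr

    def update(self, observed_error):
        """Move the value distribution toward the observed distance from the target."""
        # Project the stochastic sample onto the two nearest support atoms.
        e = np.clip(observed_error, self.support[0], self.support[-1])
        idx = np.searchsorted(self.support, e)
        target = np.zeros_like(self.probs)
        if idx == 0:
            target[0] = 1.0
        else:
            lo, hi = self.support[idx - 1], self.support[idx]
            w = (e - lo) / (hi - lo)
            target[idx - 1], target[idx] = 1.0 - w, w
        # Exponential-moving-average update keeps probs a valid distribution.
        self.probs = (1.0 - self.lr) * self.probs + self.lr * target

    def deviation_from(self, other_probs):
        """L1 deviation between value distributions, e.g. before vs. after a
        change in the object's center of mass."""
        return float(np.abs(self.probs - np.asarray(other_probs)).sum())


# Example: peak roll magnitudes (degrees) shrinking over successive lifts.
model = DistributionalTD()
baseline = model.probs.copy()
for roll in [12.0, 9.5, 7.0, 4.2, 2.1]:
    model.update(roll)
print(model.deviation_from(baseline))
```

Under these assumptions, the deviation between value distributions computed at the end plays the role of the quantity that the paper relates to deviations in global neural activity.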

ICB Affiliated Authors

Authors
Brian Mitchell, Michelle Marneweck, Scott Grafton and Linda Petzold
Date
Type
Peer-Reviewed Article
Journal
Journal of Neural Engineering
Volume
18
Number
4
Pages
04604