IBS Institute for Basic Science

A derivative-like prediction error computation by dopamine neurons in a dynamic environment


October 1 (Tue), 2019

16:00 - 17:00

# 86314

CNIR Seminar

Date: 4:00 pm, Tuesday, October 1st

Place: #86314

Speaker: HyungGoo Kim (김형구), Ph.D.

Harvard University, Naoshige Uchida's Lab



Title: "A derivative-like prediction error computation by dopamine neurons in a dynamic environment" 

Abstract: Previous studies have revealed an exceptional correspondence between the activity of midbrain dopamine neurons and a ‘teaching signal’ in reinforcement learning algorithms. In particular, the reward prediction error (RPE) used in the temporal difference (TD) learning algorithm captures aspects of phasic dopamine responses. However, this idea has been challenged by recent observations that dopamine signals ramp up gradually over the timescale of seconds as animals approach a reward location. It has been argued that these slow fluctuations of dopamine are inconsistent with the RPE model, and instead represent the state value, which gradually increases toward a reward location. Whether these slowly fluctuating dopamine signals represent value or RPE, and under what conditions a dopamine ramp occurs, remain elusive. As originally formulated, the TD RPE approximates the derivative of the value function. Here we developed a set of novel experimental paradigms that dissociate RPE from value. We employed visual virtual reality in mice to manipulate the location of the animal and the speed of scene movement independent of the animal’s locomotion. We found that the manipulation of scene movement – teleport and speed manipulations – caused dopamine responses in the ventral striatum that were consistent with TD RPEs but inconsistent with state values. Furthermore, we found that a more abstract, non-navigational stimulus that indicates temporal proximity to reward is sufficient to cause a dopamine ramp. These results support the previously untested central tenet of TD RPEs that dopamine neurons signal RPEs through a derivative-like computation over value on a moment-by-moment basis.
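The abstract's statement that the TD RPE approximates the derivative of the value function can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the talk: the track length `N`, the cubic shape of `value`, the discount factor of 1, and the helper names `value` and `td_rpe` are all hypothetical. The only standard ingredient is the TD error itself, delta = r + gamma * V(s') - V(s).

```python
# Illustrative sketch only; the value shape and constants are assumptions,
# not taken from the talk.
N = 50          # hypothetical number of positions on the track (reward at N)
GAMMA = 1.0     # assumed discount factor; 1.0 makes the RPE a pure difference


def value(pos):
    """Hypothetical convex state value that rises toward the reward at N."""
    return (pos / N) ** 3


def td_rpe(pos, next_pos, reward=0.0, gamma=GAMMA):
    """TD error: delta = r + gamma * V(next state) - V(current state)."""
    return reward + gamma * value(next_pos) - value(pos)


# Normal approach: one position per time bin. The RPE tracks the slope of V,
# so it grows as the (assumed convex) value function steepens.
ramp = [td_rpe(p, p + 1) for p in range(N - 1)]

# Forward teleport from position 20 to 30: one large jump in value.
teleport = td_rpe(20, 30)

# Doubled scene speed: two positions per time bin instead of one.
fast = td_rpe(20, 22)

if __name__ == "__main__":
    print("RPE early vs late in approach:", ramp[5], ramp[45])
    print("RPE at teleport 20 -> 30:", teleport)
    print("RPE at normal vs doubled speed:", td_rpe(20, 21), fast)
```

Under these assumptions, the RPE ramps during a steady approach, spikes at a forward teleport, and scales up when the scene moves faster. A signal that encoded state value alone would instead follow `value(pos)` directly, for example stepping to a new sustained level after a teleport, which is the kind of dissociation the abstract describes.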


Host: Prof. Joonyeol Lee