Abstract: Dr. David Poeppel

How complex auditory input in general, and speech in particular, is represented and processed in human cortex is a major interdisciplinary challenge for cognitive neuroscience. Adopting (and adapting) the perspective of Marr's (1982) approach to vision, a model is presented that formulates linking hypotheses between biological mechanisms and the representations that underlie auditory recognition.

At the implementational level, data from psychophysics, imaging, and electrophysiology suggest that speech perception is a multi-time-resolution process, with temporal analyses proceeding concurrently on (at least) syllabic (~200 ms) and segmental (~30 ms) scales. Neuronal mechanisms such as temporal integration and phase-locking in well-defined frequency bands are shown to be foundational for the construction of elementary auditory events.

At the algorithmic level, psychophysical and EEG data favor an analysis-by-synthesis interpretation, in the contemporary sense of an internal forward model. Sparse information suffices to trigger internal predictions about an auditory perceptual target.

At the computational level of description, it is argued that the stored representations that the perceptual process aims to recover are abstract and structured. In the case of speech, distinctive features are the elementary representational currency that can effectively mediate between perception and action.