Some behavioral and neurobiological constraints on theories of audiovisual speech integration: A review and suggestions for new directions

Nicholas Altieri, David Pisoni, James T. Townsend

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. The proposed accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and articulatory dynamics of the vocal tract. The latter two accounts assume that the representations of audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from visual and auditory modalities. Recent converging evidence from several different disciplines reveals that the general framework of Summerfield's feature-based theories should be expanded. An updated framework building upon the feature-based theories is presented. We propose a processing model arguing that auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and that auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and utilize dynamic modeling tools to further understand the timing and information processing mechanisms involved in audiovisual speech integration.

Original language: English
Pages (from-to): 513-539
Number of pages: 27
Journal: Seeing and Perceiving
Volume: 24
Issue number: 6
DOI: 10.1163/187847611X595864
State: Published - Nov 1 2011


Keywords

  • AUDIO-VISUAL SPEECH PERCEPTION
  • MCGURK EFFECT
  • MULTISENSORY ENHANCEMENT

ASJC Scopus subject areas

  • Sensory Systems
  • Cognitive Neuroscience
  • Ophthalmology
  • Experimental and Cognitive Psychology
  • Computer Vision and Pattern Recognition

Cite this

Some behavioral and neurobiological constraints on theories of audiovisual speech integration: A review and suggestions for new directions. / Altieri, Nicholas; Pisoni, David; Townsend, James T.

In: Seeing and Perceiving, Vol. 24, No. 6, 01.11.2011, p. 513-539.

Research output: Contribution to journal › Article

@article{24b99c3373f14c549c5157990b44502b,
title = "Some behavioral and neurobiological constraints on theories of audiovisual speech integration: A review and suggestions for new directions",
abstract = "Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. The proposed accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and articulatory dynamics of the vocal tract. The latter two accounts assume that the representations of audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from visual and auditory modalities. Recent converging evidence from several different disciplines reveals that the general framework of Summerfield's feature-based theories should be expanded. An updated framework building upon the feature-based theories is presented. We propose a processing model arguing that auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and that auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and utilize dynamic modeling tools to further understand the timing and information processing mechanisms involved in audiovisual speech integration.",
keywords = "AUDIO-VISUAL SPEECH PERCEPTION, MCGURK EFFECT, MULTISENSORY ENHANCEMENT",
author = "Altieri, {Nicholas} and Pisoni, {David} and Townsend, {James T.}",
year = "2011",
month = "11",
day = "1",
doi = "10.1163/187847611X595864",
language = "English",
volume = "24",
pages = "513--539",
journal = "Seeing and Perceiving",
issn = "2213-4794",
publisher = "Brill",
number = "6",

}

TY - JOUR

T1 - Some behavioral and neurobiological constraints on theories of audiovisual speech integration

T2 - A review and suggestions for new directions

AU - Altieri, Nicholas

AU - Pisoni, David

AU - Townsend, James T.

PY - 2011/11/1

Y1 - 2011/11/1

AB - Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. The proposed accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and articulatory dynamics of the vocal tract. The latter two accounts assume that the representations of audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from visual and auditory modalities. Recent converging evidence from several different disciplines reveals that the general framework of Summerfield's feature-based theories should be expanded. An updated framework building upon the feature-based theories is presented. We propose a processing model arguing that auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and that auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and utilize dynamic modeling tools to further understand the timing and information processing mechanisms involved in audiovisual speech integration.

KW - AUDIO-VISUAL SPEECH PERCEPTION

KW - MCGURK EFFECT

KW - MULTISENSORY ENHANCEMENT

UR - http://www.scopus.com/inward/record.url?scp=84858175159&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84858175159&partnerID=8YFLogxK

U2 - 10.1163/187847611X595864

DO - 10.1163/187847611X595864

M3 - Article

C2 - 21968081

AN - SCOPUS:84858175159

VL - 24

SP - 513

EP - 539

JO - Seeing and Perceiving

JF - Seeing and Perceiving

SN - 2213-4794

IS - 6

ER -