Multimodal perceptual organization of speech: Evidence from tone analogs of spoken utterances

Robert E. Remez, Jennifer M. Fellowes, David B. Pisoni, Winston D. Goh, Philip E. Rubin

Research output: Contribution to journal › Article

19 Citations (Scopus)

Abstract

Theoretical and practical motives alike have prompted recent investigations of multimodal speech perception. Theoretically, multimodal studies have extended the conceptualization of perceptual organization beyond the familiar modality-bound accounts deriving from Gestalt psychology. Practically, such investigations have been driven by a need to understand the proficiency of multimodal speech perception using an electrocochlear prosthesis for hearing. In each domain, studies have shown that perceptual organization of speech can occur even when the perceiver's auditory experience departs from natural speech qualities. Accordingly, our research examined auditory-visual multimodal integration of videotaped faces and selected acoustic constituents of speech signals, each realized as a single sinewave tone accompanying a video image of an articulating face. The single tone reproduced the frequency and amplitude of the phonatory cycle or of one of the lower three oral formants. Our results showed a distinct advantage for the condition pairing the video image of the face with a sinewave replicating the second formant, despite its unnatural timbre and its presentation in acoustic isolation from the rest of the speech signal. Perceptual coherence of multimodal speech in these circumstances is established when the two modalities concurrently specify the same underlying phonetic attributes.
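The abstract describes tone analogs in which a single sinewave reproduces the frequency and amplitude of one speech constituent (e.g., the second formant). As an illustration only, not the authors' actual synthesis procedure, the following minimal sketch shows how such a tone can be generated from frame-wise frequency and amplitude tracks; the `sinewave_replica` function, the sample rate, and the F2-like glide are all hypothetical stand-ins.

```python
import numpy as np

def sinewave_replica(freq_track, amp_track, fs=10000):
    """Synthesize a single time-varying tone from per-sample frequency (Hz)
    and amplitude tracks, such as estimates of one formant's center frequency.
    The frequency track is integrated into a running phase so the tone
    remains continuous even as its frequency changes."""
    freq = np.asarray(freq_track, dtype=float)
    amp = np.asarray(amp_track, dtype=float)
    phase = 2.0 * np.pi * np.cumsum(freq) / fs  # instantaneous phase (radians)
    return amp * np.sin(phase)

# Hypothetical F2-like contour: a glide from 1200 Hz to 1800 Hz over 0.5 s,
# shaped by a simple raised-cosine amplitude envelope.
fs = 10000
n = int(0.5 * fs)
f2 = np.linspace(1200.0, 1800.0, n)
env = np.hanning(n)
tone = sinewave_replica(f2, env, fs=fs)
```

Because the tone carries only the time-varying frequency and amplitude of one resonance, it lacks the harmonic structure and timbre of natural speech, which is what makes the reported audiovisual advantage for the second-formant analog notable.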

Original language: English (US)
Pages (from-to): 65-73
Number of pages: 9
Journal: Speech Communication
Volume: 26
Issue number: 1-2
DOI: 10.1016/S0167-6393(98)00050-8
State: Published - Oct 1998


Keywords

  • Auditory-visual speech perception
  • Intersensory integration
  • Multimodal speech perception
  • Perceptual organization
  • Sinewave speech
  • Speechreading

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Experimental and Cognitive Psychology
  • Linguistics and Language

Cite this

Multimodal perceptual organization of speech: Evidence from tone analogs of spoken utterances. / Remez, Robert E.; Fellowes, Jennifer M.; Pisoni, David B.; Goh, Winston D.; Rubin, Philip E.

In: Speech Communication, Vol. 26, No. 1-2, 10.1998, p. 65-73.

Remez, Robert E.; Fellowes, Jennifer M.; Pisoni, David B.; Goh, Winston D.; Rubin, Philip E. / Multimodal perceptual organization of speech: Evidence from tone analogs of spoken utterances. In: Speech Communication. 1998; Vol. 26, No. 1-2. pp. 65-73.
@article{bda73e2395724c259dd1e0203c2bb337,
title = "Multimodal perceptual organization of speech: Evidence from tone analogs of spoken utterances",
abstract = "Theoretical and practical motives alike have prompted recent investigations of multimodal speech perception. Theoretically, multimodal studies have extended the conceptualization of perceptual organization beyond the familiar modality-bound accounts deriving from Gestalt psychology. Practically, such investigations have been driven by a need to understand the proficiency of multimodal speech perception using an electrocochlear prosthesis for hearing. In each domain, studies have shown that perceptual organization of speech can occur even when the perceiver's auditory experience departs from natural speech qualities. Accordingly, our research examined auditory-visual multimodal integration of videotaped faces and selected acoustic constituents of speech signals, each realized as a single sinewave tone accompanying a video image of an articulating face. The single tone reproduced the frequency and amplitude of the phonatory cycle or of one of the lower three oral formants. Our results showed a distinct advantage for the condition pairing the video image of the face with a sinewave replicating the second formant, despite its unnatural timbre and its presentation in acoustic isolation from the rest of the speech signal. Perceptual coherence of multimodal speech in these circumstances is established when the two modalities concurrently specify the same underlying phonetic attributes.",
keywords = "Auditory-visual speech perception, Intersensory integration, Multimodal speech perception, Perceptual organization, Sinewave speech, Speechreading",
author = "Remez, {Robert E.} and Fellowes, {Jennifer M.} and Pisoni, {David B.} and Goh, {Winston D.} and Rubin, {Philip E.}",
year = "1998",
month = "10",
doi = "10.1016/S0167-6393(98)00050-8",
language = "English (US)",
volume = "26",
pages = "65--73",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",
number = "1-2",

}

TY - JOUR

T1 - Multimodal perceptual organization of speech

T2 - Evidence from tone analogs of spoken utterances

AU - Remez, Robert E.

AU - Fellowes, Jennifer M.

AU - Pisoni, David B.

AU - Goh, Winston D.

AU - Rubin, Philip E.

PY - 1998/10

Y1 - 1998/10

N2 - Theoretical and practical motives alike have prompted recent investigations of multimodal speech perception. Theoretically, multimodal studies have extended the conceptualization of perceptual organization beyond the familiar modality-bound accounts deriving from Gestalt psychology. Practically, such investigations have been driven by a need to understand the proficiency of multimodal speech perception using an electrocochlear prosthesis for hearing. In each domain, studies have shown that perceptual organization of speech can occur even when the perceiver's auditory experience departs from natural speech qualities. Accordingly, our research examined auditory-visual multimodal integration of videotaped faces and selected acoustic constituents of speech signals, each realized as a single sinewave tone accompanying a video image of an articulating face. The single tone reproduced the frequency and amplitude of the phonatory cycle or of one of the lower three oral formants. Our results showed a distinct advantage for the condition pairing the video image of the face with a sinewave replicating the second formant, despite its unnatural timbre and its presentation in acoustic isolation from the rest of the speech signal. Perceptual coherence of multimodal speech in these circumstances is established when the two modalities concurrently specify the same underlying phonetic attributes.

AB - Theoretical and practical motives alike have prompted recent investigations of multimodal speech perception. Theoretically, multimodal studies have extended the conceptualization of perceptual organization beyond the familiar modality-bound accounts deriving from Gestalt psychology. Practically, such investigations have been driven by a need to understand the proficiency of multimodal speech perception using an electrocochlear prosthesis for hearing. In each domain, studies have shown that perceptual organization of speech can occur even when the perceiver's auditory experience departs from natural speech qualities. Accordingly, our research examined auditory-visual multimodal integration of videotaped faces and selected acoustic constituents of speech signals, each realized as a single sinewave tone accompanying a video image of an articulating face. The single tone reproduced the frequency and amplitude of the phonatory cycle or of one of the lower three oral formants. Our results showed a distinct advantage for the condition pairing the video image of the face with a sinewave replicating the second formant, despite its unnatural timbre and its presentation in acoustic isolation from the rest of the speech signal. Perceptual coherence of multimodal speech in these circumstances is established when the two modalities concurrently specify the same underlying phonetic attributes.

KW - Auditory-visual speech perception

KW - Intersensory integration

KW - Multimodal speech perception

KW - Perceptual organization

KW - Sinewave speech

KW - Speechreading

UR - http://www.scopus.com/inward/record.url?scp=0032178685&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0032178685&partnerID=8YFLogxK

U2 - 10.1016/S0167-6393(98)00050-8

DO - 10.1016/S0167-6393(98)00050-8

M3 - Article

AN - SCOPUS:0032178685

VL - 26

SP - 65

EP - 73

JO - Speech Communication

JF - Speech Communication

SN - 0167-6393

IS - 1-2

ER -