Audio-visual speech perception without speech cues

Helena M. Saldana, David Pisoni, Jennifer M. Fellowes, Robert E. Remez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

A series of experiments was conducted in which listeners were presented with audio-visual sentences in a transcription task. The visual components of the stimuli consisted of a male talker's face. The acoustic components consisted of natural speech; envelope-shaped noise, which preserved the duration and amplitude of the original speech waveform; and various types of sinewave speech signals that followed the formant frequencies of a natural utterance. Further experiments demonstrated that the intelligibility of single tones increased differentially depending on which formant analog was presented. The increase in intelligibility for the sinewave speech with an added video display would be greater than the gain observed with envelope-shaped noise.

Original language: English
Title of host publication: International Conference on Spoken Language Processing, ICSLP, Proceedings
Editors: Anon
Publisher: IEEE
Pages: 2187-2190
Number of pages: 4
Volume: 4
State: Published - 1996
Event: Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4) - Philadelphia, PA, USA
Duration: Oct 3 1996 to Oct 6 1996

Other

Other: Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4)
City: Philadelphia, PA, USA
Period: 10/3/96 to 10/6/96

Fingerprint

Speech intelligibility
Transcription
Acoustics
Experiments
Display devices

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Saldana, H. M., Pisoni, D., Fellowes, J. M., & Remez, R. E. (1996). Audio-visual speech perception without speech cues. In Anon (Ed.), International Conference on Spoken Language Processing, ICSLP, Proceedings (Vol. 4, pp. 2187-2190). IEEE.

Audio-visual speech perception without speech cues. / Saldana, Helena M.; Pisoni, David; Fellowes, Jennifer M.; Remez, Robert E.

International Conference on Spoken Language Processing, ICSLP, Proceedings. ed. / Anon. Vol. 4 IEEE, 1996. p. 2187-2190.

Saldana, HM, Pisoni, D, Fellowes, JM & Remez, RE 1996, Audio-visual speech perception without speech cues. in Anon (ed.), International Conference on Spoken Language Processing, ICSLP, Proceedings. vol. 4, IEEE, pp. 2187-2190, Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4), Philadelphia, PA, USA, 10/3/96.
Saldana HM, Pisoni D, Fellowes JM, Remez RE. Audio-visual speech perception without speech cues. In Anon, editor, International Conference on Spoken Language Processing, ICSLP, Proceedings. Vol. 4. IEEE. 1996. p. 2187-2190
Saldana, Helena M. ; Pisoni, David ; Fellowes, Jennifer M. ; Remez, Robert E. / Audio-visual speech perception without speech cues. International Conference on Spoken Language Processing, ICSLP, Proceedings. editor / Anon. Vol. 4 IEEE, 1996. pp. 2187-2190
@inproceedings{ce8acf2d483540bf97daa6e47208c496,
title = "Audio-visual speech perception without speech cues",
abstract = "A series of experiments was conducted in which listeners were presented with audio-visual sentences in a transcription task. The visual components of the stimuli consisted of a male talker's face. The acoustic components consisted of natural speech; envelope-shaped noise, which preserved the duration and amplitude of the original speech waveform; and various types of sinewave speech signals that followed the formant frequencies of a natural utterance. Further experiments demonstrated that the intelligibility of single tones increased differentially depending on which formant analog was presented. The increase in intelligibility for the sinewave speech with an added video display would be greater than the gain observed with envelope-shaped noise.",
author = "Saldana, {Helena M.} and David Pisoni and Fellowes, {Jennifer M.} and Remez, {Robert E.}",
year = "1996",
language = "English",
volume = "4",
pages = "2187--2190",
editor = "Anon",
booktitle = "International Conference on Spoken Language Processing, ICSLP, Proceedings",
publisher = "IEEE",

}

TY - GEN

T1 - Audio-visual speech perception without speech cues

AU - Saldana, Helena M.

AU - Pisoni, David

AU - Fellowes, Jennifer M.

AU - Remez, Robert E.

PY - 1996

Y1 - 1996

N2 - A series of experiments was conducted in which listeners were presented with audio-visual sentences in a transcription task. The visual components of the stimuli consisted of a male talker's face. The acoustic components consisted of natural speech; envelope-shaped noise, which preserved the duration and amplitude of the original speech waveform; and various types of sinewave speech signals that followed the formant frequencies of a natural utterance. Further experiments demonstrated that the intelligibility of single tones increased differentially depending on which formant analog was presented. The increase in intelligibility for the sinewave speech with an added video display would be greater than the gain observed with envelope-shaped noise.

UR - http://www.scopus.com/inward/record.url?scp=0030363027&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0030363027&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0030363027

VL - 4

SP - 2187

EP - 2190

BT - International Conference on Spoken Language Processing, ICSLP, Proceedings

A2 - Anon

PB - IEEE

ER -