Multi-modal encoding of speech in memory

A first report

David Pisoni, Helena M. Saldana, Sonya M. Sheffert

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Why do people like to watch videos on TV? Why is there now increased interest in video telephones and multi-media technologies that were developed back in the 1960s? Obviously, the availability of new digital technology has played an enormous role in this transition. But we also believe this is in part due to the same operating principle that encourages listeners in noisy environments to orient toward a talker's face. A multi-modal speech signal is extremely robust and informative, providing information that perceivers are able to exploit during perceptual analysis. In this paper, we present results from two experiments that examined performance in immediate memory and serial recall tasks with normal-hearing listeners using unimodal (auditory-only) and multi-modal (auditory+visual) presentation. Our findings suggest that the addition of visual information about the speaker's articulation to the stimulus display affects the efficiency of initial encoding operations at the time of perception and also results in more detailed and robust representations of the stimulus events in memory. These results have implications for current theories of speech perception and spoken language processing.

Original language: English
Title of host publication: International Conference on Spoken Language Processing, ICSLP, Proceedings
Editors: Anon
Publisher: IEEE
Pages: 1664-1667
Number of pages: 4
Volume: 3
State: Published - 1996
Event: Proceedings of the 1996 International Conference on Spoken Language Processing, ICSLP. Part 1 (of 4) - Philadelphia, PA, USA
Duration: Oct 3, 1996 - Oct 6, 1996



ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Pisoni, D., Saldana, H. M., & Sheffert, S. M. (1996). Multi-modal encoding of speech in memory: A first report. In Anon (Ed.), International Conference on Spoken Language Processing, ICSLP, Proceedings (Vol. 3, pp. 1664-1667). IEEE.
