Coding of the speech spectrum in three time-varying sinusoids

R. E. Remez, P. E. Rubin, David Pisoni

Research output: Contribution to journal > Article

7 Citations (Scopus)

Abstract

Recent perceptual experiments with normal adult listeners show that phonetic information can readily be conveyed by sinewave replicas of speech signals. These tonal patterns are made of three sinusoids set equal in frequency and amplitude to the respective peaks of the first three formants of natural-speech utterances. Unlike natural and most synthetic speech, the spectrum of sinusoidal patterns contains neither harmonics nor broadband formants, and is identified as grossly unnatural in voice timbre. Despite this drastic recoding of the short-time speech spectrum, listeners perceive the phonetic content if the temporal properties of spectrum variation are preserved. These observations suggest that phonetic perception may depend on properties of coherent spectrum variation, a second-order property of the acoustic signal, rather than any particular set of acoustic elements present in speech signals.
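The construction described in the abstract, three sinusoids that follow the frequency and amplitude of the first three formant peaks, can be illustrated with a minimal sketch. This is not the authors' synthesis procedure. The function name `sinewave_replica`, the frame rate, the sample rate, the linear interpolation between frames, and the formant values in `tracks` are all assumptions made for illustration and do not come from the paper.

```python
import math

def sinewave_replica(formant_tracks, frame_rate=100, sample_rate=8000):
    """Sum one time-varying sinusoid per formant track.

    formant_tracks: list of tracks; each track is a list of
    (frequency_hz, amplitude) pairs, one pair per analysis frame.
    Frame rate, sample rate and linear interpolation are assumptions.
    """
    n_frames = len(formant_tracks[0])
    samples_per_frame = sample_rate // frame_rate
    n_samples = (n_frames - 1) * samples_per_frame
    out = [0.0] * n_samples
    for track in formant_tracks:
        phase = 0.0  # accumulate phase so frequency changes stay continuous
        for n in range(n_samples):
            frame, offset = divmod(n, samples_per_frame)
            t = offset / samples_per_frame
            f0, a0 = track[frame]
            f1, a1 = track[frame + 1]
            freq = f0 + t * (f1 - f0)  # interpolated frequency in Hz
            amp = a0 + t * (a1 - a0)   # interpolated amplitude
            out[n] += amp * math.sin(phase)
            phase += 2.0 * math.pi * freq / sample_rate
    return out

# Hypothetical formant tracks (three frames each); values are illustrative only.
tracks = [
    [(500.0, 1.0), (700.0, 1.0), (700.0, 0.8)],     # tone following F1
    [(1500.0, 0.5), (1200.0, 0.5), (1100.0, 0.4)],  # tone following F2
    [(2500.0, 0.25), (2600.0, 0.25), (2600.0, 0.2)],  # tone following F3
]
signal = sinewave_replica(tracks)
```

Because each tone is a single sinusoid, the resulting signal has no harmonics and no broadband formant structure; only the pattern of frequency and amplitude change over time is carried over from the formant tracks.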

Original language: English (US)
Pages (from-to): 485-489
Number of pages: 5
Journal: Annals of the New York Academy of Sciences
Volume: 405
State: Published - 1983

Fingerprint

Phonetics
Speech analysis
Acoustics
Formants
Listeners
Experiments
Timbre
Harmonics
Utterance
Tonal
Natural Speech

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)

Cite this

Coding of the speech spectrum in three time-varying sinusoids. / Remez, R. E.; Rubin, P. E.; Pisoni, David.

In: Annals of the New York Academy of Sciences, Vol. 405, 1983, p. 485-489.

@article{d679db4dfb16464195c227c45004170d,
title = "Coding of the speech spectrum in three time-varying sinusoids",
abstract = "Recent perceptual experiments with normal adult listeners show that phonetic information can readily be conveyed by sinewave replicas of speech signals. These tonal patterns are made of three sinusoids set equal in frequency and amplitude to the respective peaks of the first three formants of natural-speech utterances. Unlike natural and most synthetic speech, the spectrum of sinusoidal patterns contains neither harmonics nor broadband formants, and is identified as grossly unnatural in voice timbre. Despite this drastic recoding of the short-time speech spectrum, listeners perceive the phonetic content if the temporal properties of spectrum variation are preserved. These observations suggest that phonetic perception may depend on properties of coherent spectrum variation, a second-order property of the acoustic signal, rather than any particular set of acoustic elements present in speech signals.",
author = "Remez, {R. E.} and Rubin, {P. E.} and Pisoni, David",
year = "1983",
language = "English (US)",
volume = "405",
pages = "485--489",
journal = "Annals of the New York Academy of Sciences",
issn = "0077-8923",
publisher = "Wiley-Blackwell",

}

TY - JOUR

T1 - Coding of the speech spectrum in three time-varying sinusoids

AU - Remez, R. E.

AU - Rubin, P. E.

AU - Pisoni, David

PY - 1983

Y1 - 1983

N2 - Recent perceptual experiments with normal adult listeners show that phonetic information can readily be conveyed by sinewave replicas of speech signals. These tonal patterns are made of three sinusoids set equal in frequency and amplitude to the respective peaks of the first three formants of natural-speech utterances. Unlike natural and most synthetic speech, the spectrum of sinusoidal patterns contains neither harmonics nor broadband formants, and is identified as grossly unnatural in voice timbre. Despite this drastic recoding of the short-time speech spectrum, listeners perceive the phonetic content if the temporal properties of spectrum variation are preserved. These observations suggest that phonetic perception may depend on properties of coherent spectrum variation, a second-order property of the acoustic signal, rather than any particular set of acoustic elements present in speech signals.

UR - http://www.scopus.com/inward/record.url?scp=0020645030&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0020645030&partnerID=8YFLogxK

M3 - Article

C2 - 6575670

AN - SCOPUS:0020645030

VL - 405

SP - 485

EP - 489

JO - Annals of the New York Academy of Sciences

JF - Annals of the New York Academy of Sciences

SN - 0077-8923

ER -