Differences and distinguishability in the acoustic characteristics of hello in voices of similar-sounding speakers: A forensic phonetic investigation

Rose, Phil

doi:10.1075/aral.22.1.01ros

Article published in: Australian Review of Applied Linguistics
Vol. 22:1 (1999), pp. 1–42

Phil Rose | Australian National University

Published online: 1 January 1999

https://doi.org/10.1075/aral.22.1.01ros

Abstract

Forensic Phonetics is an important application of Linguistics that has emerged as a discipline over the last decade. This paper describes a Forensic Phonetic experiment which investigates the nature of within- and between-speaker variation in the acoustic characteristics of the word hello in demonstrably similar-sounding voices. The nature of within-segment variation is determined in repeats of the same word said under different prosodic conditions in order to exclude as much of the linguistically determined variation as is consistent with the realities of the forensic situation, thus providing a good estimate of variation associated with speakers. Intonationally varying tokens of the naturally produced single word utterance hello from six adult Australian males are compared with respect to fundamental frequency, and to centre frequencies and bandwidths of the F-pattern below 5 kHz. This comprises the first five formants and extra resonances, including a possible singer’s formant and tracheal resonance. Results show that between-speaker acoustic differences are pervasive, though not ubiquitous. Magnitudes of between-speaker differences are presented for all parameters, and their forensic significance is evaluated. ANOVA and chi-square tests show that even similar-sounding voices differ significantly in their acoustics, especially centre frequencies of F2-F4, formant bandwidth, and incidence of extra resonances. Simulated forensic conditions show that some of these differences are not realistically demonstrable. Nevertheless, there remain sufficient significant differences to distinguish 13 out of 15 pairs: a value of 13% for the denominator of the associated Bayesian Likelihood Ratio for the prosecution hypothesis. Directions for future research are indicated.
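The closing figures of the abstract can be checked with a short calculation. The sketch below is one possible reading, not the paper's own procedure: it assumes that the 15 pairs are all pairwise combinations of the six speakers, and that the 13% is the share of different-speaker pairs left undistinguished. The variable names are illustrative only.

```python
from math import comb

# Assumption: the 15 pairs are every pairwise combination of the six speakers.
n_speakers = 6
n_pairs = comb(n_speakers, 2)

# Figure stated in the abstract: 13 pairs could be distinguished.
n_distinguished = 13
n_undistinguished = n_pairs - n_distinguished

# Assumption: the 13% is the proportion of different-speaker pairs that
# could NOT be distinguished, used as the likelihood-ratio denominator.
share_undistinguished = n_undistinguished / n_pairs

print(f"{n_pairs} pairs, {n_undistinguished} undistinguished, "
      f"{share_undistinguished:.1%}")
# prints: 15 pairs, 2 undistinguished, 13.3%
```

On this reading, 2 of the 15 different-speaker pairs show no demonstrable difference, which rounds to the 13% quoted above.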

