Article published in: New Questions for the Next Decade
Edited by Gonia Jarema, Gary Libben † and Victor Kuperman
[The Mental Lexicon 11:3] 2016
pp. 375–400
Why we need to investigate casual speech to truly understand language production, processing and the mental lexicon
Published online: 31 December 2016
https://doi.org/10.1075/ml.11.3.03tuc
The majority of studies addressing psycholinguistic questions focus on speech produced and processed in a careful, laboratory speech style. This ‘careful’ speech is very different from the speech that listeners encounter in casual conversations. This article argues that research on casual speech is necessary to establish the validity of conclusions based on careful speech. Moreover, research on casual speech produces new insights and questions about the processes underlying communication and about the mental lexicon that cannot be revealed by research using careful speech. This article first places research on casual speech in its historical perspective. It then provides many examples of how casual speech differs from careful speech and shows that these differences may have important implications for psycholinguistic theories. Subsequently, the article discusses the challenges that research on casual speech faces, which stem from the high variability of this speech style, its necessarily casual context, and the fact that casual speech is connected speech. We also present opportunities for research on casual speech, mostly in the form of new experimental methods that facilitate research on connected speech. However, real progress can only be made if these new methods are combined with advanced (still to be developed) statistical techniques.
