Article published in: Review of Cognitive Linguistics
Vol. 13:1 (2015), pp. 191–219
Intonation unit boundaries and the storage of bigrams
Evidence from bidirectional and directional association measures
Published online: 23 June 2015
https://doi.org/10.1075/rcl.13.1.08wah
Much recent work on language and cognition has examined the psychological
status of collocations/formulas/multi-word expressions as mentally stored units.
These studies have used a variety of statistical metrics to quantify the degree of
strength or association of these sequences, and then they have correlated these
strengths with particular behavioral effects that evidence mental storage. However,
the relationship between intonational prosody and storage of collocations
has received little attention. Through a corpus-based approach, this study examines
the hypothesis that boundaries between successive intonation units avoid
splitting word bigrams that exhibit high statistical association, with such high
association taken to be an index of mental storage of these bigrams. Conversely,
bigrams exhibiting lower statistical association ought to be more likely to be
split by intonation unit boundaries under this hypothesis.
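
To make the distinction in the subtitle concrete, the following is a minimal sketch of how one bidirectional and one directional association score could be computed for word bigrams. The abstract does not name the measures the study used; pointwise mutual information (bidirectional) and ΔP (directional) are assumed here purely for illustration, and the function name `association_scores` and the toy token list are hypothetical. The sketch also treats the input as one continuous token stream and does not model intonation unit boundaries.

```python
import math
from collections import Counter


def association_scores(tokens):
    """Map each adjacent word pair to (pmi, delta_p_forward, delta_p_backward).

    Assumed measures, not necessarily those of the study:
      pmi              = log2( P(w1, w2) / (P(w1) * P(w2)) )
      delta_p_forward  = P(w2 | w1) - P(w2 | not w1)
      delta_p_backward = P(w1 | w2) - P(w1 | not w2)
    """
    bigrams = list(zip(tokens, tokens[1:]))
    n = len(bigrams)
    pair_counts = Counter(bigrams)
    first_counts = Counter(w1 for w1, _ in bigrams)
    second_counts = Counter(w2 for _, w2 in bigrams)

    scores = {}
    for (w1, w2), a in pair_counts.items():
        b = first_counts[w1] - a   # w1 followed by a word other than w2
        c = second_counts[w2] - a  # w2 preceded by a word other than w1
        d = n - a - b - c          # bigrams containing neither w1 first nor w2 second
        pmi = math.log2(a * n / (first_counts[w1] * second_counts[w2]))
        dp_forward = a / (a + b) - (c / (c + d) if c + d else 0.0)
        dp_backward = a / (a + c) - (b / (b + d) if b + d else 0.0)
        scores[(w1, w2)] = (pmi, dp_forward, dp_backward)
    return scores


# Toy data, invented for illustration only.
tokens = "you know i mean you know i think you know".split()
scores = association_scores(tokens)
for pair in [("you", "know"), ("i", "mean")]:
    pmi, dp_fwd, dp_bwd = scores[pair]
    print(pair, round(pmi, 3), round(dp_fwd, 3), round(dp_bwd, 3))
```

In this toy input, "i mean" receives different forward and backward ΔP values, which is the kind of asymmetry a directional measure can capture and a single bidirectional score cannot. Under the hypothesis stated above, bigrams with higher scores would be expected to straddle an intonation unit boundary less often.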
