In: How to do Linguistics with R: Data exploration and statistical analysis
Natalia Levshina
[Not in series 195] 2015
pp. 425–432
References
Published online: 25 November 2015
https://doi.org/10.1075/z.195.refs
Allan, L.G. (1980). A note on measurement of contingency between two binary variables in judgment tasks. Bulletin of the Psychonomic Society, 15, 147–149.
Anishchanka, A. (2013). Seeing it in color: A usage-based perspective on color naming in advertising. PhD diss., University of Leuven.
Arppe, A., Han, W., & Newman, J. (2013). Polytomous logistic regression with Shanghainese topic markers. Vignette, CRAN-R Project. [URL] (last access 13.12.2014).
Atkins, B.T.S. (1987). Semantic ID tags: Corpus evidence for dictionary senses. The uses of large text databases. Proceedings of the Third Annual Conference of the UW Centre for the New Oxford English Dictionary (pp. 17–36). Waterloo, Canada.
Baayen, R.H. (2008). Analyzing Linguistic Data. A Practical Introduction to Statistics Using R. Cambridge: Cambridge University Press.
Balota, D.A., Yap, M.J., Cortese, M.J., et al. (2007). The English Lexicon Project. Behavior Research Methods, 39(3), 445–459.
Barnbrook, G., Mason, O., & Krishnamurthy, R. (2013). Collocation: Applications and Implications. Basingstoke, Hampshire: Palgrave Macmillan.
Bates, E., & Goodman, J.C. (1997). On the inseparability of grammar and the lexicon: Evidence from acquisition, aphasia and real-time processing. Language and Cognitive Processes, 12(5/6), 507–586.
Berlin, B., & Kay, P. (1969). Basic Color Terms: Their Universality and Evolution. Berkeley, CA: University of California Press.
Borg, I., & Groenen, P. (1997). Modern Multidimensional Scaling: Theory and Applications. New York: Springer.
Boroditsky, L. (2001). Does language shape thought?: Mandarin and English speakers’ conceptions of time. Cognitive Psychology, 43, 1–22.
Bowerman, M., & Choi, S. (2003). Space under construction: Language-specific spatial categorization in first language acquisition. In D. Gentner & S. Goldin-Meadow (Eds.), Language in Mind: Advances in the Study of Language and Thought (pp. 387–427). Cambridge, MA: MIT Press.
Bresnan, J., & Hay, J. (2008). Gradient Grammar: An effect of animacy on the syntax of give in New Zealand and American English. Lingua, 118(2), 245–259.
Brugman, C. (1988 [1981]). The Story of Over: Polysemy, Semantics and the Structure of the Lexicon. New York: Garland.
Bullinaria, J.A., & Levy, J.P. (2007). Extracting semantic representations from word co-occurrence statistics: A Computational Study. Behavior Research Methods, 39, 510–526.
Conover, W.J., Johnson, M.E., & Johnson, M.M. (1981). A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. Technometrics, 23, 351–361.
Cox, T.F., & Cox, M.A.A. (2001). Multidimensional Scaling (2nd ed.). Boca Raton, FL: Chapman and Hall/CRC Press.
Dąbrowska, E. (2009). Words as constructions. In V. Evans & S. Pourcel (Eds.), New Directions in Cognitive Linguistics (pp. 201–223). Amsterdam: John Benjamins.
Davies, M. (2008). The Corpus of Contemporary American English: 450 million words, 1990–present. Available online at [URL].
Davies, M. (2011). N-grams and word frequency data from the Corpus of Historical American English (COHA). Available online at [URL].
Davies, M. (2013). Corpus of Global Web-Based English: 1.9 billion words from speakers in 20 countries. Available online at [URL].
de Leeuw, J. (1977). Applications of convex analysis to multidimensional scaling. In J. Barra, F. Brodeau, G. Romier, & B.V. Cutsem (Eds.), Recent Developments in Statistics (pp. 133–145). Amsterdam: North Holland Publishing Company.
Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., & Harshman, R. (1990). Indexing by Latent Semantic Analysis. Journal of the American Society for Information Science, 41, 391–407.
Diessel, H. (2007). Frequency effects in language acquisition, language use, and diachronic change. New Ideas in Psychology, 25, 108–127.
Divjak, D. (2003). On trying in Russian: A tentative network model for near(er) synonyms. In Belgian Contributions to the 13th International Congress of Slavicists, Ljubljana, 15–21 August 2003. Special issue of Slavica Gandensia (pp. 25–58).
Divjak, D., & Gries, S. Th. (2006). Ways of trying in Russian: Clustering behavioral profiles. Corpus Linguistics and Linguistic Theory, 2, 23–60.
Divjak, D., & Gries, S. Th. (2009). Corpus-based cognitive semantics: A contrastive study of phasal verbs in English and Russian. In K. Dziwirek & B. Lewandowska-Tomaszczyk (Eds.), Studies in Cognitive Corpus Linguistics (pp. 273–296). Frankfurt am Main: Peter Lang.
Dunning, T. (1993). Accurate methods for the statistics of surprise and coincidence. Computational Linguistics, 19(1), 61–74.
Ellis, N. (2006). Language acquisition as rational contingency learning. Applied Linguistics, 27(1), 1–24.
Ellis, N., & Ferreira-Junior, F.G. (2009). Constructions and their acquisition: Islands and the distinctiveness of their occupancy. Annual Review of Cognitive Linguistics, 7, 188–221.
Ember, C.R., & Ember, M. (2007). Climate, econiche, and sexuality: Influences on sonority in language. American Anthropologist, 109(1), 180–185.
Everett, D. (2005). Cultural Constraints on Grammar and Cognition in Pirahã: Another Look at the Design Features of Human Language. Current Anthropology, 46, 621–646.
Evert, S. (2004). The Statistics of Word Cooccurrences: Word Pairs and Collocations. IMS, University of Stuttgart.
Everitt, B., & Hothorn, T. (2011). An Introduction to Applied Multivariate Analysis with R. New York: Springer.
Everitt, B.S., Landau, S., Leese, M., & Stahl, D. (2011). Cluster Analysis (5th ed.). Chichester: Wiley.
Fox, J. (2008). Applied Regression Analysis and Generalized Linear Models (2nd ed.). Thousand Oaks, CA: Sage Publications.
Firth, J.R. (1957). A synopsis of linguistic theory 1930–1955. In J.R. Firth (Ed.), Studies in Linguistic Analysis (pp. 1–32). Oxford: Blackwell.
Friendly, M. (1996). Paivio, et al. Word List Generator, Online application. Retrieved April 28, 2013, from [URL]
Geeraerts, D. (1999). Idealist and empiricist tendencies in cognitive linguistics. In T. Janssen & G. Redeker (Eds.), Cognitive Linguistics: Foundations, Scope, and Methodology (pp. 163–194). Berlin/New York: Mouton de Gruyter.
Gilquin, G. (2006). The place of prototypicality in corpus linguistics: Causation in the hot seat. In S. Th. Gries & A. Stefanowitsch (Eds.), Corpora in Cognitive Linguistics: Corpus-Based Approaches to Syntax and Lexis (pp. 159–191). Berlin/New York: Mouton de Gruyter.
Gilquin, G. (2010). Corpus, Cognition and Causative Constructions. Amsterdam: John Benjamins.
Gipper, H. (1959). Sessel oder Stuhl? Ein Beitrag zur Bestimmung von Wortinhalten im Bereich der Sachkultur. In H. Gipper (Ed.), Sprache – Schlüssel zur Welt: Festschrift für Leo Weisgerber (pp. 271–92). Düsseldorf: Schwann.
Goldberg, A.E., Casenhiser, D., & Sethuraman, N. (2004). Learning argument structure generalizations. Cognitive Linguistics, 14(3), 289–316.
Gower, J.C. (1971). A general coefficient of similarity and some of its properties. Biometrics, 27, 857–874.
Greenacre, M. (2007). Correspondence Analysis in Practice (2nd ed.). Boca Raton, FL: Chapman and Hall/CRC Press.
Gries, S. Th. (2006). Corpus-based methods and Cognitive Semantics: The many senses of to run. In S. Th. Gries & A. Stefanowitsch (Eds.), Corpora in Cognitive Linguistics. Corpus-based Approaches to Syntax and Lexis (pp. 57–99). Berlin/New York: Mouton de Gruyter.
Gries, S. Th. (2008). Dispersions and adjusted frequencies in corpora. International Journal of Corpus Linguistics, 13(4), 403–437.
Gries, S. Th. (2009). Quantitative Corpus Linguistics with R: A Practical Introduction. New York/London: Routledge.
Gries, S. Th. (2012). Behavioral Profiles: A fine-grained and quantitative approach in corpus-based lexical semantics. In G. Jarema, G. Libben, & C. Westbury (Eds.), Methodological and Analytic Frontiers in Lexical Research (pp. 57–80). Amsterdam: John Benjamins.
Gries, S. Th., Hampe, B., & Schönefeld, D. (2005). Converging evidence: Bringing together experimental and corpus data on the association of verbs and constructions. Cognitive Linguistics, 16(4), 635–676.
Gries, S. Th., & Stefanowitsch, A. (2004). Extending collostructional analysis: A corpus-based perspective on ‘alternations’. International Journal of Corpus Linguistics, 9(1), 97–129.
Hanks, P. (1996). Contextual dependency and lexical sets. International Journal of Corpus Linguistics, 1(1), 75–98.
Harrell, F.E. (2001). Regression Modeling Strategies. With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York: Springer.
Hilpert, M. (2011). Dynamic visualizations of language change: Motion charts on the basis of bivariate and multivariate data from diachronic corpora. International Journal of Corpus Linguistics, 16(4), 435–461.
Hilpert, M. (2013). Constructional Change in English: Developments in Allomorphy, Word Formation, and Syntax. Cambridge: Cambridge University Press.
Hothorn, T., Hornik, K., & Zeileis, A. (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651–674.
Husson, F., Lê, S., & Pagès, J. (2010). Exploratory Multivariate Analysis by Example Using R. Boca Raton, FL: Chapman and Hall/CRC Press.
Itkonen, E. (1980). Qualitative vs. quantitative analysis in linguistics. In T.A. Perry (Ed.), Evidence and Argumentation in Linguistics (pp. 334–366). Berlin: Mouton.
Kaufman, L., & Rousseeuw, P.J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. New York: Wiley-Interscience.
Kay, P., & McDaniel, C.K. (1978). The linguistic significance of the meanings of Basic Color Terms. Language, 54(3), 610–646.
Kepser, S., & Reis, M. (2005). Evidence in Linguistics. In S. Kepser & M. Reis (Eds.), Linguistic Evidence: Empirical, Theoretical and Computational Perspectives (pp. 1–6). Berlin/New York: Mouton de Gruyter.
Keuleers, E., Lacey, P., Rastle, K., & Brysbaert, M. (2012). The British Lexicon Project: Lexical decision data for 28,730 monosyllabic and disyllabic English words. Behavior Research Methods, 44(1), 287–304.
Kortmann, B., & Lunkenheimer, K. (Eds.). (2013). The Electronic World Atlas of Varieties of English. Leipzig: Max Planck Institute for Evolutionary Anthropology. Retrieved from [URL]
Kruskal, J.B. (1964). Multidimensional Scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29(1), 1–27.
Kučera, H., & Francis, W.N. (1967). Computational Analysis of Present-day American English. Providence: Brown University Press.
Landauer, T.K., & Dumais, S.T. (1997). A solution to Plato’s problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge. Psychological Review, 104, 211–240.
Langacker, R.W. (1987). Foundations of Cognitive Grammar: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
Larson-Hall, J. (2010). A Guide to Doing Statistics in Second Language Research Using SPSS. New York: Routledge.
Lehrer, A. (1974). Semantic Fields and Lexical Structure. Amsterdam: North Holland Publishing Company.
Levshina, N. (2011). Doe wat je niet laten kan [Do what you cannot let]: A usage-based analysis of Dutch causative constructions. PhD diss., University of Leuven.
Levshina, N. (2014). Geographic variation of quite + ADJ in twenty national varieties of English: A pilot study. Yearbook of the German Cognitive Linguistics Association, 2, 109–126.
Levshina, N. (In preparation). Convergent evidence of divergent knowledge: A study of the associations between the Russian ditransitive construction and its collexemes.
Levshina, N., Geeraerts, D., & Speelman, D. (2011). Changing the world vs. changing the mind: Distinctive collexeme analysis of the causative construction with doen in Belgian and Netherlandic Dutch. In F. Gregersen, J. Parrot, & P. Quist (Eds.), Language variation - European perspectives III. Selected papers from the 5th International Conference on Language Variation in Europe, Copenhagen, June 2009 (pp. 111–123). Amsterdam: John Benjamins.
Levshina, N., Geeraerts, D., & Speelman, D. (2013). Towards a 3D-Grammar: Interaction of linguistic and extralinguistic factors in the use of Dutch causative constructions. Journal of Pragmatics, 52, 34–48.
Levshina, N., & Heylen, K. (2014). A radically data-driven construction grammar: Experiments with Dutch causative constructions. In R. Boogaart, T. Colleman, & G. Rutten (Eds.), Extending the Scope of Construction Grammar (pp. 17–46). Berlin/New York: Mouton de Gruyter.
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, 49, 764–766.
Lijffijt, J., & Gries, S. Th. (2012). Correction to “Dispersions and adjusted frequencies in corpora”. International Journal of Corpus Linguistics, 17(1), 147–149.
Lin, D. (1998). Automatic retrieval and clustering of similar words. Proceedings of the 17th International Conference on Computational Linguistics, Montreal, Canada, August 1998 (pp. 768–774).
Louviere, J.J., Hensher, D.A., & Swait, J.D. (2000). Stated Choice Methods: Analysis and application. Cambridge: Cambridge University Press.
Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrences. Behavior Research Methods, Instruments, & Computers, 28, 203–208.
Manning, C., & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
Matloff, N. (2011). The Art of R Programming: A Tour of Statistical Software Design. San Francisco: No Starch Press.
Michelbacher, L., Evert, S., & Schütze, H. (2011). Asymmetry in corpus-derived and human word associations. Corpus Linguistics and Linguistic Theory, 7(2), 245–276.
Miller, G.A., & Charles, W.G. (1991). Contextual correlates of semantic similarity. Language and Cognitive Processes, 6(1), 1–28.
Mitchell, J., & Lapata, M. (2010). Composition in distributional models of semantics. Cognitive Science, 34(8), 1388–1439.
Newman, J. (2011). Corpora and cognitive linguistics. Brazilian Journal of Applied Linguistics, 11(2), 521–559.
Núñez, R.E., & Sweetser, E. (2006). With the future behind them: Convergent evidence from Aymara language and gesture in the crosslinguistic comparison of spatial construals of time. Cognitive Science, 30, 401–450.
Padó, S., & Lapata, M. (2007). Dependency-based construction of Semantic Space Models. Computational Linguistics, 33(2), 161–199.
Peirsman, Y. (2008). Word Space Models of semantic similarity and relatedness. In Proceedings of the ESSLLI-2008 Student Session, Hamburg, Germany.
Peirsman, Y., Heylen, K., & Geeraerts, D. (2010). Applying Word Space Models to sociolinguistics. Religion names before and after 9/11. In D. Geeraerts, G. Kristiansen, & Y. Peirsman (Eds.), Recent Advances in Cognitive Sociolinguistics (pp. 111–137). Berlin/New York: Mouton de Gruyter.
Paivio, A., Yuille, J.C., & Madigan, S. (1968). Concreteness, imagery, and meaningfulness values for 925 nouns. Journal of Experimental Psychology, 76(1, Pt. 2), 1–25.
Paradis, C. (1997). Degree Modifiers of Adjectives in Spoken British English. Lund: Lund University Press.
Rosch Heider, E., & Olivier, D.C. (1972). The structure of the color space in naming and memory for two languages. Cognitive Psychology, 3, 337–345.
Rosch, E. (1975). Cognitive representation of semantic categories. Journal of Experimental Psychology, 104(3), 192–233.
Rosch, E., & Mervis, C.B. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573–605.
Salkind, N.J. (2011). Statistics for People Who (Think They) Hate Statistics (4th ed.). Los Angeles: Sage.
Schmid, H.-J. (2000). English Abstract Nouns as Conceptual Shells. From corpus to cognition. Berlin/New York: Mouton de Gruyter.
Schütze, H. (1992). Dimensions of meaning. In Proceedings of Supercomputing 92 (pp. 787–796). Minneapolis, MN.
Senghas, A., & Coppola, M. (2001). Children creating language: How Nicaraguan Sign Language acquired a spatial grammar. Psychological Science, 12(4), 323–328.
Senghas, A., Kita, S., & Özyürek, A. (2004). Children creating core properties of language: Evidence from an emerging Sign Language in Nicaragua. Science, 305(5691), 1779–1782.
Sheskin, D.J. (2011). Handbook of Parametric and Nonparametric Statistical Procedures. Boca Raton, FL: Chapman and Hall/CRC Press.
Speelman, D., & Geeraerts, D. (2009). Causes for causatives: The case of Dutch ‘doen’ and ‘laten’. In T. Sanders & E. Sweetser (Eds.), Causal Categories in Discourse and Cognition (pp. 173–204). Berlin/New York: Mouton de Gruyter.
Steels, L. (Ed.). (2012). Experiments in Cultural Language Evolution. Amsterdam: John Benjamins.
Steen, G.J., Dorst, A.G., Herrmann, J.B., Kaal, A.A., Krennmayr, T., & Pasma, T. (2010). A Method for Linguistic Metaphor Identification. From MIP to MIPVU. Amsterdam: John Benjamins.
Stefanowitsch, A. (2001). Constructing causation: A construction grammar approach to analytic causatives. PhD diss., Rice University.
Stefanowitsch, A. (2010). Empirical Cognitive Semantics: Some thoughts. In D. Glynn & K. Fischer (Eds.), Quantitative Methods in Cognitive Semantics: Corpus-driven Approaches (pp. 355–380). Berlin/New York: De Gruyter Mouton.
Stefanowitsch, A., & Gries, S. Th. (2003). Collostructions: Investigating the interaction of words and constructions. International Journal of Corpus Linguistics, 8(2), 209–243.
Szmrecsanyi, B. (2010). The English genitive alternation in a cognitive sociolinguistics perspective. In D. Geeraerts, G. Kristiansen, & Y. Peirsman (Eds.), Advances in Cognitive Sociolinguistics (pp. 141–166). Berlin/New York: Mouton de Gruyter.
Tagliamonte, S., & Baayen, R.H. (2012). Models, forests and trees of York English: Was/were variation as a case study for statistical practice. Language Variation and Change, 24(2), 135–178.
Talmy, L. (1985). Lexicalization patterns: Semantic structure in lexical forms. In T. Shopen (Ed.), Grammatical Categories and the Lexicon, Vol. III (pp. 57–149). Cambridge: Cambridge University Press.
Taylor, J. (2012). The Mental Corpus. How Language is Represented in the Mind. Oxford: Oxford University Press.
Turney, P.D., & Pantel, P. (2010). From frequency to meaning: Vector Space Models of semantics. Journal of Artificial Intelligence Research, 37, 141–188.
Verhagen, A., & Kemmer, S. (1997). Interaction and causation: Causative constructions in modern standard Dutch. Journal of Pragmatics, 24, 61–82.
Verhoeven, J., De Pauw, G., & Kloots, H. (2004). Speech rate in a pluricentric language: A comparison between Dutch in Belgium and the Netherlands. Language and Speech, 47(3), 297–308.
Wiechmann, D. (2008). On the computation of Collostruction Strength. Corpus Linguistics and Linguistic Theory, 4(2), 253–290.
Winke, P., Gass, S., & Sydorenko, T. (2010). The effects of captioning videos used for foreign language listening activities. Language Learning and Technology, 14(1), 65–86.
Wolk, C., Bresnan, J., Rosenbach, A., & Szmrecsanyi, B. (2013). Dative and genitive variability in Late Modern English: Exploring cross-constructional variation and change. Diachronica, 30(3), 382–419.
Wulff, S. (2006). Go-V vs. go-and-V in English: A case of constructional synonymy? In S. Th. Gries & A. Stefanowitsch (Eds.), Corpora in Cognitive Linguistics. Corpus-based Approaches to Syntax and Lexis (pp. 101–125). Berlin/New York: Mouton de Gruyter.
Wulff, S., Gries, S. Th., & Stefanowitsch, A. (2007). Brutal Brits and persuasive Americans: Variety-specific meaning construction in the into-causative. In G. Radden, K.-M. Köpcke, T. Berg, & P. Siemund (Eds.), Aspects of Meaning Construction (pp. 265–281). Amsterdam: John Benjamins.
