In: Spanish-English Codeswitching in the Caribbean and the US
Edited by Rosa E. Guzzardo Tamargo, Catherine M. Mazak and M. Carmen Parafita Couto
[Issues in Hispanic and Lusophone Linguistics 11] 2016
pp. 171–189
The stratification of English-language lone-word and multi-word material in Puerto Rican Spanish-language press outlets
A computational approach
Published online: 7 September 2016
https://doi.org/10.1075/ihll.11.07bul
This chapter considers the presence of English in a 3.3-million-word corpus of Puerto Rican news press addressed to distinct social classes: El Vocero, published for a working-class population; El Nuevo Día, for a mainstream market; and 80grados, for an intellectual readership. Statistical models reveal no significant differences between sub-corpora with respect to the frequency of English unigram and bigram tokens. However, significant differences are returned for English sequences of three or more words (3+grams): 80grados presents longer, more diverse, and more complex English spans than do El Nuevo Día and El Vocero. Interpreting the results in view of the social context, we suggest that, in Puerto Rico, the use of simplex and compound anglicisms might not signal prestige; rather, it could be code-switching that is linked with status.
Keywords: computational, corpus, loanwords, Spanish-English codeswitching
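The distinction the abstract draws between lone-word (unigram), two-word (bigram), and longer (3+gram) English material can be illustrated with a minimal sketch. It assumes that each token has already been labeled for language; the tag values "en" and "es", the function names, and the example sentence are hypothetical and are not taken from the chapter, whose actual pipeline is not described on this page.

```python
from itertools import groupby


def english_span_lengths(tagged_tokens):
    """Return the length of each maximal run of consecutive English tokens.

    tagged_tokens is a list of (token, language_tag) pairs; the tag
    values "en" and "es" are an assumption made for this sketch.
    """
    lengths = []
    for lang, run in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if lang == "en":
            lengths.append(sum(1 for _ in run))
    return lengths


def bucket_spans(lengths):
    """Count spans per size class: unigram, bigram, and 3+gram."""
    counts = {"1": 0, "2": 0, "3+": 0}
    for n in lengths:
        if n == 1:
            counts["1"] += 1
        elif n == 2:
            counts["2"] += 1
        else:
            counts["3+"] += 1
    return counts


def per_million(count, corpus_size):
    """Normalize a raw count to a rate per million words."""
    return count / corpus_size * 1_000_000


# Invented example sentence with one span of each size class.
tokens = [
    ("el", "es"), ("show", "en"), ("fue", "es"), ("un", "es"),
    ("home", "en"), ("run", "en"), ("y", "es"), ("dijo", "es"),
    ("see", "en"), ("you", "en"), ("later", "en"),
]
span_lengths = english_span_lengths(tokens)
span_counts = bucket_spans(span_lengths)
```

Normalizing the per-class counts with `per_million` would make sub-corpora of different sizes comparable before any statistical modeling; which normalization and which models the authors used is not stated in the abstract.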
