The C-ORAL-ROM book and DVD provide a unique set of comparable corpora of spontaneous speech for the main Romance languages: French, Italian, Portuguese and Spanish. The corpora are accompanied by comparative linguistic studies, models and standard linguistic measures of spoken language variability. Each corpus is built to the same design using identical sampling techniques, and each corpus is presented in multimedia format, allowing simultaneous access to aligned acoustic and textual information. Texts are headed with information about provenance, participants, etc., and the transcriptions show changes of speaker. Speech acts are tagged according to the evidence of prosodic criteria. Each corpus totals 300,000 words and presents formal and informal speech in a variety of contexts of use, dialogue structure and text genres, semantic domains and speech act typologies. The corpora have great statistical relevance for spoken language structures and can address key issues in human language technology such as speech recognition in unrestricted discourse, the suitability of speech synthesis in natural prosody, and multilingual applications of the spoken language interface. The work provides new data and innovative theoretical perspectives that are relevant for corpus linguistics, Romance linguistics, syntactic theory, speech and prosody research, and second language acquisition.
The original C-ORAL-ROM DVD was made to run under Windows XP when Windows 7 and 8 were not yet in existence. A new version of WINPITCH-C-ORAL-ROM makes it possible to run the C-ORAL-ROM DVD under Windows 7 and 8. It can be downloaded from www.winpitch.com/
“This is a great resource for researchers in the areas of Romance linguistics, corpus linguistics, syntax, second language acquisition, and speech and prosody research. The operation of the DVD and the tools included in is quite straightforward.”
Cornelia González, Florida State University, on eLanguage, 30 October 2008
“It will be clear even from this necessarily brief notice that C-ORAL-ROM is a very notable work, of excellent quality, at the forefront in methodological rigour as well, which enriches Romance linguistics in a field that can be expected to keep growing, namely corpus linguistics, and makes an important research tool available to the scientific community.”
Gaetano Berruto, Torino, in Romanische Forschungen 119(3), 2007
Cited by 130 other publications
Abouda, Lotfi, Florence Lefeuvre & Flora Badin
2025. Annotation de l’oral. Langages N° 238:2 ► pp. 9 ff.
Averyanov, Oleg
2025. La locution adversative par contre : éléments de structuration sémantique et énonciative (à l’usage des linguistes et des didacticiens). Langue française :1 ► pp. 77 ff.
2025. Prosody and co-speech gestures in a Brazilian audiovisual spontaneous speech corpus: the case of Parentheticals. Corpus Linguistics and Linguistic Theory
Raso, Tommaso, Saulo Mendes Santos, Albert Rilliard & João A. Moraes
2025. Defining and Identifying Discourse Markers in Spontaneous Speech. In Prosodic Interfaces, ► pp. 65 ff.
Rocha, João Victor Pessoa & Átila Augusto Soares Vital
2025. Comparisons between face-to-face and telephone speech. Journal of Speech Sciences 14 ► pp. e025003 ff.
Xi, Xiaorao
2025. Desvíos pragmáticos en la producción oral de los alumnos chinos en la comunicación intercultural. Círculo de Lingüística Aplicada a la Comunicación 103 ► pp. 289 ff.
You, Mu, Jing Zhang, Derek F. Wong & Kaixin Lan
2025. Umplc: the first longitudinal learner corpus of Portuguese. Language Resources and Evaluation 59:3 ► pp. 3353 ff.
de Albuquerque, Marianna Bicalho
2024. Gestures and multilevel discourse in spontaneous speech corpora: the case of reported speech. DILEF. Rivista digitale del Dipartimento di Lettere e Filosofia :3 ► pp. 243 ff.
Di Domenico, Elisa
2024. Narrow Focus Without Prosody: Some Observations from the Written Italian of University Students. Languages 9:12 ► pp. 357 ff.
Fliessbach, Jan, Lisa Brunetti & Hiyon Yoo
2024. On the overlapping discourse functions of Spanish ‘cómo que’ and French ‘comment ça’ interrogatives. Open Linguistics 10:1
Manzato, Typhaine
2024. Enseigner les langues voisines dans une modalité bi-plurilingue : quel appui sur la proximité linguistique des langues romanes pour leur enseignement ?. Contextes et didactiques 23
Manzato, Typhaine, F. Neveu, S. Prévost, A. Montébran, A. Steuckardt, G. Bergounioux, G. Merminod & G. Philippe
2024. Interroger la conscience normative des apprenant-e-s dans l’apprentissage de la langue voisine : un travail sur les usages oraux en français langue étrangère. SHS Web of Conferences 191 ► pp. 07007 ff.
2024. Dialogical and monological functions of the discourse marker bueno in spoken and written Spanish. Linguistics
Silvano, Purificação & María Gómez González
2024. Developing a Comparative Model of Predicted Associations for Invariable Question Tag Types in British English and European Portuguese. In Constructional and Cognitive Explorations of Contrastive Linguistics, ► pp. 173 ff.
Verdonik, Darinka, Mitja Trojar & Andreja Bizjak
2024. Prednosti in slabosti dvotirnega zapisovanja govora v slovenskih govornih virih. In Stanje in perspektive uporabe govornih virov v raziskavah govora, ► pp. 63 ff.
Ballarè, Silvia & Massimo Cerruti
2023. Sociolinguistic variation in spoken Italian: An introduction. Sociolinguistica 37:1 ► pp. 1 ff.
2022. A Protocol for Comparing Gesture and Prosodic Boundaries in Multimodal Corpora. In Computational Processing of the Portuguese Language [Lecture Notes in Computer Science, 13208], ► pp. 313 ff.
Ferroni, Roberta & Marilisa Birello
2022. L’insegnamento dei segnali discorsivi allora, dunque e beh: riflessioni metapragmatiche di studenti di italiano LS. Cuadernos de Filología Italiana 29 ► pp. 125 ff.
2022. Corpus Data and the Position of Information Focus in Spanish. Studies in Hispanic and Lusophone Linguistics 15:1 ► pp. 67 ff.
Moneglia, Massimo & Alessandro Panunzi
2022. Micro-Diachronic Corpora for Measuring the Lexical Change of Spontaneous Speech in Florence Compared to Standard Italian. Langages N° 226:2 ► pp. 41 ff.
Raso, Tommaso, Albert Rilliard & Saulo Mendes Santos
2022. Para uma modelagem das formas prosódicas dos Marcadores Discursivos. Domínios de Lingu@gem 16:4 ► pp. 1436 ff.
Rocha, Bruno, Tommaso Raso, Heliana Mello & Lucia Ferrari
2022. Information structure in the speech of individuals with schizophrenia. CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos 9 ► pp. 217 ff.
2021. Two Types of Constructionalization Processes in Spanish and Portuguese Clefted wh-interrogatives. Studies in Hispanic and Lusophone Linguistics 14:1 ► pp. 117 ff.
Saccone, Valentina & Chiara Trombetta
2021. Parenthetical Units and Structures in Italian and German spoken language: Prosodic and textual analysis. CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos 8 ► pp. 1 ff.
Lacheret-Dujour, Anne, Sylvain Kahane, F. Neveu, B. Harmegnies, L. Hriba, S. Prévost & A. Steuckardt
2020. Unités syntaxiques et unités intonatives majeures en français parlé : inclusion, fragmentation, chevauchement. SHS Web of Conferences 78 ► pp. 14005 ff.
Poggio, Anabella L.
2020. La Interfaz Sintaxis-Pragmática. Estudios teóricos, descriptivos y experimentales. Pragmática Sociocultural / Sociocultural Pragmatics 8:1 ► pp. 133 ff.
Santos, Giovani
2020. Designing and building SCoPE²: A spoken corpus of Brazilian Portuguese and L2-English. Research in Corpus Linguistics 8 ► pp. 49 ff.
Thaler, Verena
2020. Sí que als polyphone Struktur im gesprochenen Spanisch. Romanistisches Jahrbuch 71:1 ► pp. 305 ff.
Williams, Christopher
2020. Review of Fanego, Teresa and Paula Rodríguez-Puente eds. 2019. Corpus-based Research on Variation in English Legal Discourse. Amsterdam: John Benjamins. ISBN: 978-9-027-20235-2. https://doi.org/10.1075/scl.91. Research in Corpus Linguistics 8 ► pp. 178 ff.
Bomjardim da Silva Carmo, Crysna
2019. Cláusulas relativas na fala espontânea. Domínios de Lingu@gem 13:3 ► pp. 946 ff.
2023. The role of prosody for the expression of illocutionary types. The prosodic system of questions in spoken Italian and French according to Language into Act Theory. Frontiers in Communication 8
2018. MINI-CORPUS del español para IPIC. CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos 5:2 ► pp. 197 ff.
Sabio, Frédéric
2018. On the syntax of spoken French. Revue Romane. Langue et littérature. International Journal of Romance Languages and Literatures 53:1 ► pp. 6 ff.
Correia, Liliana & Cristina Flores
2017. The Role of Input Factors in the Lexical Development of European Portuguese as a Heritage Language in Portuguese–German Bilingual Speakers. Languages 2:4 ► pp. 30 ff.
Ferrari, Angela
2017. Leggere la virgola. Una prima ricognizione. CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos 4:2 ► pp. 145 ff.
Pecorari, Filippo
2017. Puntini di sospensione e mimesi del parlato. Le facce del rapporto tra punteggiatura e prosodia. CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos 4:2 ► pp. 175 ff.
Stark, Elisabeth
2017. Pertinence de l’analyse grammaticale en linguistique variationnelle. Langage et société N° 160-161:2 ► pp. 283 ff.
Roque Amaral, Eduardo Tadeu, Wiltrud Mihatsch, F. Neveu, G. Bergounioux, M.-H. Côté, J.-M. Fournier, L. Hriba & S. Prévost
2016. Le nom français personne en comparaison avec le portugais brésilien pessoa et l’allemand Person – des noms en voie de pronominalisation ?. SHS Web of Conferences 27 ► pp. 12015 ff.
Avanzi, Mathieu & Laure Anne Johnsen
2015. Asyndètes temporelles. Langages N° 200:4 ► pp. 103 ff.
Gaucher, Damien
2015. Sémantique temporelle et accord du participe passé en français parlé: une analyse variationniste. Journal of French Language Studies 25:1 ► pp. 65 ff.
2021. The Appendix of Comment according to Language into Act Theory. CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos 8 ► pp. 45 ff.
Cresti, Emanuela
2025. The affective foundation of speech. Journal of Speech Sciences 14 ► pp. e025005 ff.
De Cesare Greenwald, Anna-Maria & Margarita Borreguero Zuloaga
2014. La problématique de la liaison entre prédications à la lumière de la distinction entre construction et énoncé : intégration versus insertion. Langue française n° 182:2 ► pp. 59 ff.
2022. Le unità di informazione Parentetiche alla periferia destra del Comment nella Teoria della Lingua in Atto. DILEF. Rivista digitale del Dipartimento di Lettere e Filosofia :1 ► pp. 88 ff.
2013. Annette Gerstenberg, Generation und Sprachprofile im höheren Lebensalter, Frankfurt am Main, Klostermann, 2011, XII + 372 p. zrph 129:3 ► pp. 785 ff.
Garrido, Juan María, David Escudero, Lourdes Aguilar, Valentín Cardeñoso, Emma Rodero, Carme de-la-Mota, César González, Carlos Vivaracho, Sílvia Rustullet, Olatz Larrea, Yesika Laplaza, Francisco Vizcaíno, Eva Estebas, Mercedes Cabrera & Antonio Bonafonte
2013. Glissando: a corpus for multidisciplinary prosodic studies in Spanish and Catalan. Language Resources and Evaluation 47:4 ► pp. 945 ff.
Valentini, Cristina
2013. Phrasal verbs in Italian dubbed dialogues: a multimedia corpus-based study. Perspectives 21:4 ► pp. 543 ff.
Ågren, Malin & Joost van de Weijer
2013. Input frequency and the acquisition of subject-verb agreement in number in spoken and written French. Journal of French Language Studies 23:3 ► pp. 311 ff.
Abeillé, Anne & Danièle Godard
2012. La Grande Grammaire du Français et la variété des données. Langue française n°176:4 ► pp. 47 ff.
Deulofeu, Henri-José & Jeanne-Marie Debaisieux
2012. Une tâche à accomplir pour la linguistique française du XXIe siècle : élaborer une grammaire des usages du français. Langue française n°176:4 ► pp. 27 ff.
Raso, Tommaso & Heliana Mello
2012. The C-ORAL-BRASIL I: Reference Corpus for Informal Spoken Brazilian Portuguese. In Computational Processing of the Portuguese Language [Lecture Notes in Computer Science, 7243], ► pp. 362 ff.
Williams, Geoffrey
2012. Corpora: French‐Language. In The Encyclopedia of Applied Linguistics.
Avanzi, Mathieu & Elisabeth Delais-Roussarie
2011. Introduction: Regards croisés sur la prosodie du français – des données à la modélisation. Journal of French Language Studies 21:1 ► pp. 1 ff.
Cresti, Emanuela, Massimo Moneglia & Ida Tucci
2011. Annotation de l'entretien d'Anita Musso selon la Théorie de la langue en acte. Langue française n°170:2 ► pp. 95 ff.
De Cesare, Anna-Maria
2011. L'italien ecco et les français voici, voilà. Regards croisés sur leurs emplois dans les textes écrits. Langages n° 184:4 ► pp. 51 ff.
2011. Peut-on établir un système de ponctuation des transcriptions de textes oraux linguistiquement fondé ? Les propositions du groupe Rhapsodie. Langue française n°172:4 ► pp. 115 ff.
Lauwers, Peter & Claude Duée
2011. From aspect to evidentiality: The subjectification path of the French semi-copula se faire and its Spanish cognate hacerse. Journal of Pragmatics 43:4 ► pp. 1042 ff.
Marcol, Lucyna
2010. Semantica lessicale dei verbi sintagmatici in italiano. Analisi intra-/interlinguistica del VS mettere fuori. Neophilologica 22 ► pp. 82 ff.
Mihatsch, Wiltrud
2010. Les approximateurs quantitatifs entre scalarité et non-scalarité. Langue française n° 165:1 ► pp. 125 ff.
2009. 2009 Seventh Brazilian Symposium in Information and Human Language Technology, ► pp. 179 ff.
Baude, Olivier
2007. Aspects juridiques et éthiques de la conservation et de la diffusion des corpus oraux. Revue française de linguistique appliquée XII:1 ► pp. 85 ff.
Blanche-Benveniste, Claire
2007. Corpus de langue parlée et description grammaticale de la langue. Langage et société n° 121-122:3 ► pp. 129 ff.
Dostie, Gaétane & Claus D. Pusch
2007. Présentation. Les marqueurs discursifs. Sens et variation. Langue française n° 154:2 ► pp. 3 ff.
Pusch, Claus D.
2007. Faut dire : variation et sens d'un marqueur parenthétique entre connectivité et (inter)subjectivité. Langue française n° 154:2 ► pp. 29 ff.
Bambini, Valentina
2005. Corpus di italiano parlato: Vol. 1: Introduzione, Vol. 2: Corpora, CD-Rom. Journal of Pragmatics 37:6 ► pp. 949 ff.
2020. References. In Introduction to Corpus Linguistics, ► pp. 233 ff.
This list is based on CrossRef data as of 3 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers.
Any errors therein should be reported to them.