In: In Search of Basic Units of Spoken Language: A corpus-driven approach
Edited by Shlomo Izre'el, Heliana Mello, Alessandro Panunzi and Tommaso Raso
[Studies in Corpus Linguistics 94] 2020
pp. 309–326
Chapter 1. Segmentation and analysis of the two English excerpts
The Brazilian team proposal
Published online: 18 June 2020
https://doi.org/10.1075/scl.94.10ras
Abstract
This paper has a tripartite focus: (1) to establish the best
segmentation of two American English texts on the basis of inter-rater agreement measurements.
In doing so, we differentiate the behavior of expert and non-expert annotators; the
experts' annotations constitute the basis for the analysis; (2) to capture, measure, and
analyze the phonetic features that correlate with boundaries, as marked by the
expert annotators; (3) to annotate the informational value of prosodic units according to the
Language into Act Theory and to analyze the corresponding information structure. To do this,
we make and justify decisions in marking the reference units and assigning informational
value to prosodic units. We also discuss some cases of major disagreement.
Article outline
- 1. Introduction
- 2. Inter-rater agreement in the segmentation
- 3. Phonetic measurements
- 4. Reference unit and IUs: A functional analysis
- 4.1 The reference unit
- 4.2 Intonation units and information pattern
- 5. Final remarks
- Acknowledgements
- Notes
- References
- Appendix
