In: Rhapsodie: A prosodic and syntactic treebank for spoken French
Edited by Anne Lacheret-Dujour, Sylvain Kahane and Paola Pietrandrea
[Studies in Corpus Linguistics 89] 2019
pp. 127–146
Chapter 7. Annotation tools for syntax
Published online: 6 June 2019
https://doi.org/10.1075/scl.89.08ger
Abstract
This chapter presents the tools and methods used for the successive steps of the semi-automatic syntactic annotation: automatic preprocessing, microsyntactic parsing with the FRMG tool, correction of the parses with the Arborator tool, agreement analysis, post-validation correction, and development of the final format of the Rhapsodie syntactic treebank. Because FRMG is a parser for written French that was not configured to analyze disfluencies and reformulations, we used our manual pile marking to unfold the piles and produce a series of simplified “sentences” containing only government relations. Although two annotators and a validator worked on the corrections, the post-validation procedure, which applies a set of rules to check the well-formedness of the trees, still revealed a substantial number of errors.
Article outline
- 1. Introduction
- 2. Parsers for written and spoken French
- 2.1 Parsers for French
- 2.2 The difficulty of parsing spoken language
- 3. Segmentation and choice of a formalism
- 4. Manual annotation with Pilepilot
- 5. Unfolding-Refolding
- 6. Parsing with FRMG
- 7. Integration of FRMG into Rhapsodie’s annotation process
- 8. Correction with Arborator
- 9. Agreement analysis
- 10. Post-validation correction
- 11. The distributed treebank format
- 12. Conclusion
Notes
