Article published in: Lingvisticæ Investigationes
Vol. 47:1 (2024), pp. 97–120
Contrastive study of verbal structures in medical and general corpora in French
Published online: 31 October 2024
https://doi.org/10.1075/li.00105.ivc
Summary
This study examines the syntactic features of French medical and general texts to clarify their complexity and
implications for comprehension. By analysing corpora from both domains, we found significant differences in the use of passive
voice, present participles, negation and gerund constructions. In medical texts, the passive voice and present participles are
more frequent, reflecting specialized discourse and precision. In contrast, negation and gerunds are more common in general texts,
which points to a greater diversity of syntactic structures and to specific stylistic and argumentative effects. Our findings underline the
need for clear communication in medical texts and provide empirical evidence to support their simplification.
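
As an illustration of how a frequency difference between two corpora can be checked for significance, the sketch below computes Pearson's chi-squared statistic on a 2x2 contingency table. It is a minimal example under stated assumptions: the summary does not say which test the authors applied, the function name `chi_squared_2x2` is introduced here for illustration, and the sentence counts are hypothetical placeholders, not results from the study.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]].

    Rows are corpora, columns are "sentences with the structure" and
    "sentences without the structure".
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected counts if the structure were equally frequent in both corpora.
    expected = [
        row1 * col1 / n,
        row1 * col2 / n,
        row2 * col1 / n,
        row2 * col2 / n,
    ]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# HYPOTHETICAL counts, for illustration only (not taken from the article):
# 300 of 1000 medical sentences and 200 of 1000 general sentences
# contain a passive construction.
medical_with, medical_without = 300, 700
general_with, general_without = 200, 800

chi2 = chi_squared_2x2(medical_with, medical_without,
                       general_with, general_without)

# Critical value of the chi-squared distribution at alpha = 0.05, 1 degree of freedom.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
is_significant = chi2 > CRITICAL_VALUE_DF1_ALPHA_05

print(f"chi-squared = {chi2:.3f}, significant at 0.05: {is_significant}")
```

With real data, the four counts would come from the number of sentences in each corpus that do and do not contain the structure under study.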
Article outline
- 1. Introduction
- 2. Related work
- 2.1 Readability of texts
- 2.2 Simplification of texts
- 3. Objectives
- 4. Description of the data analyzed
- 4.1 Corpora
- 4.1.1 General-Language corpus
- 4.1.2 Medical-Language corpus
- 4.2 Syntactic structures
- 4.2.1 Passive voice
- 4.2.2 Present participle
- 4.2.3 Gerund
- 4.2.4 Syntactic negation
- 5. Approach for the contrastive analysis of corpora
- 5.1 Identification of sentences with the four syntactic structures
- 5.1.1 Passive voice
- 5.1.2 Gerund
- 5.1.3 Present participle
- 5.1.4 Syntactic negation
- 5.2 Analysis and evaluation
- 6. Results on extraction of syntactic structures and their comparison
- 6.1 Preprocessing
- 6.2 Passive voice
- 6.3 Gerund
- 6.4 Present participle
- 6.5 Syntactic negation
- 6.6 Comparison between syntactic structures
- 7. Conclusion and future work
- Notes
