Article published in: Languages in Contrast
Vol. 21:2 (2021), pp. 298–322
Paraphrase and parallel treebank for the comparison of French and Chinese syntax
Published online: 20 April 2021
https://doi.org/10.1075/lic.20002.poi
Abstract
This paper proposes to study the contrastive syntax of French and Chinese through the lens of syntactic mismatches, using parallel treebanks. A syntactic mismatch is a difference between the syntactic structure of a linguistic unit and that of its translation. Syntactic mismatches are formalized using the notion of paraphrase from the Meaning-Text Theory, which makes it possible to capture mismatches at different levels of linguistic description (e.g. Semantic, Deep-Syntactic, and Surface-Syntactic). We report in detail on the types of paraphrases found in the seed corpus and demonstrate that Deep-Syntactic paraphrases constitute the best starting point for our study. We then show how, starting from the seed corpus, we semi-automatically constructed a multi-layer parallel treebank with the alignment and annotation of paraphrases.
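The definition of a syntactic mismatch given in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the tree encoding, the function name `find_mismatches`, the node names, and the relation labels are all hypothetical, chosen only to show how a structural difference between two aligned dependency trees might be detected.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: node -> (governor, relation); the root has governor None.
Tree = Dict[str, Tuple[Optional[str], str]]


def find_mismatches(
    src: Tree, tgt: Tree, alignment: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return aligned (source, target) node pairs whose governor or relation differs."""
    mismatches = []
    for s_node, (s_head, s_rel) in src.items():
        t_node = alignment.get(s_node)
        if t_node is None or t_node not in tgt:
            continue  # unaligned nodes are skipped in this sketch
        t_head, t_rel = tgt[t_node]
        # Where the target governor should be if both structures were parallel.
        expected_head = alignment.get(s_head) if s_head is not None else None
        if t_head != expected_head or t_rel != s_rel:
            mismatches.append((s_node, t_node))
    return mismatches


# Toy placeholder trees (not taken from the corpus): the modifier attaches to
# the predicate in the source but to the first actant in the target.
src = {"PRED": (None, "ROOT"), "ARG1": ("PRED", "I"), "MOD": ("PRED", "ATTR")}
tgt = {"PRED'": (None, "ROOT"), "ARG1'": ("PRED'", "I"), "MOD'": ("ARG1'", "ATTR")}
alignment = {"PRED": "PRED'", "ARG1": "ARG1'", "MOD": "MOD'"}

print(find_mismatches(src, tgt, alignment))
```

In this toy case only the modifier pair is reported, because its governor differs across the two trees while the other aligned nodes keep parallel attachments.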
Article outline
- 1. Introduction
- 2. Background and theoretical framework
  - 2.1 Contrastive syntax and syntactic mismatches
  - 2.2 The formalization of syntactic mismatches
  - 2.3 The Meaning-Text Theory
    - 2.3.1 Representation levels
    - 2.3.2 Actantial relations in the Meaning-Text Model
    - 2.3.3 The modules of the MTM
- 3. French-Chinese paraphrases in a literary corpus
  - 3.1 Definition of a paraphrase
  - 3.2 Semantic paraphrase
    - 3.2.1 Semantic-Propositional paraphrase
      - a. Expansion/reduction
      - b. Addition/subtraction
    - 3.2.2 Semantic-Communicative paraphrase
  - 3.3 Deep-Syntactic paraphrase
    - 3.3.1 Synonymy: L ≡ Syn(L)
    - 3.3.2 Antonymy: L ≡ Anti(L) + NOT
    - 3.3.3 Conversion: L ≡ Conv_ijkl(L)
    - 3.3.4 Derivation: L ≡ Der(L)
  - 3.4 Surface-Syntactic paraphrase
    - 3.4.1 Different realization of a DSyntA
    - 3.4.2 Pronominalization
- 4. French-Chinese multi-layer parallel treebank construction
  - 4.1 Automatic annotation of surface and deep syntactic layers
  - 4.2 Deep-Syntactic paraphrases annotation
- 5. Conclusion
- Acknowledgements
- Notes
