Article published in: International Journal of Corpus Linguistics
Vol. 12:3 (2007) ► pp.335–374
The corpus, its users and their needs
A user-oriented evaluation of COMPARA
Published online: 16 October 2007
https://doi.org/10.1075/ijcl.12.3.03san
COMPARA is a bidirectional parallel corpus of English and Portuguese, currently with 3 million words. The corpus was launched in 2000 and at present it is possibly the largest edited parallel corpus publicly available on the Web, with roughly 6,000 corpus queries per month. This paper summarizes an analysis of six years of corpus use. We begin by looking at user studies for language resources, especially corpora, and then we provide a snapshot of COMPARA’s users and their behaviour based on log analysis. Particular emphasis is given to the language interface preferred by users (Portuguese and English are possible), the choice between the Simple and Complex Search modes, the reasons underlying null-results and behaviour after restricted output. The data has pointed us to cases where COMPARA’s Web interface can be improved, and provided insights about our users and the problems they face, although further studies that distinguish between different kinds of users remain necessary.
Keywords: Portuguese, English, parallel corpora, usability, evaluation, log analysis, interface design, error analysis
