The corpus, its users and their needs: A user-oriented evaluation of COMPARA

Santos, Diana; Frankenberg-Garcia, Ana

doi:10.1075/ijcl.12.3.03san

Article published in: International Journal of Corpus Linguistics
Vol. 12:3 (2007), pp. 335–374

Diana Santos | SINTEF ICT, Norway

Ana Frankenberg-Garcia | Instituto Superior de Línguas e Administração (ISLA), Portugal

Published online: 16 October 2007

https://doi.org/10.1075/ijcl.12.3.03san

COMPARA is a bidirectional parallel corpus of English and Portuguese, currently containing 3 million words. Launched in 2000, it is at present possibly the largest edited parallel corpus publicly available on the Web, and it receives roughly 6,000 corpus queries per month. This paper summarizes an analysis of six years of corpus use. We begin by reviewing user studies of language resources, especially corpora, and then provide a snapshot of COMPARA’s users and their behaviour based on log analysis. Particular emphasis is given to the interface language users prefer (Portuguese or English), the choice between the Simple and Complex Search modes, the reasons underlying null results, and user behaviour after restricted output. The data have pointed us to cases where COMPARA’s Web interface can be improved and have provided insights into our users and the problems they face, although further studies that distinguish between different kinds of users remain necessary.

Keywords: Portuguese, English, parallel corpora, usability, evaluation, log analysis, interface design, error analysis

Cited by nine other publications

Liu, Kanglong, Yanfang Su, Chun Lai & Tan Jin

2024. How do students engage with parallel corpora in translation? A multiple case study approach. International Journal of Applied Linguistics 34:4, pp. 1746 ff.

Liu, Kanglong, Yanfang Su & Dechao Li

2023. How Do Students Perform and Perceive Parallel Corpus Use in Translation Tasks? Evidence from an Experimental Study. In Corpora and Translation Education [New Frontiers in Translation Studies], pp. 135 ff.

Önal, Erdem & Bahtiyar Makaroğlu

2021. A Lexicological Approach to Look-up Frequency of Turkish Sign Language Dictionary Users. Dil Eğitimi ve Araştırmaları Dergisi 7:1, pp. 193 ff.

Arhar Holdt, Špela, Kaja Dobrovoljc & Nataša Logar

2019. Simplicity matters: user evaluation of the Slovene reference corpus. Language Resources and Evaluation 53:1, pp. 173 ff.

Frankenberg-Garcia, Ana

2012. Raising teachers' awareness of corpora. Language Teaching 45:4, pp. 475 ff.

Frankenberg-Garcia, Ana

2020. Combining user needs, lexicographic data and digital writing environments. Language Teaching 53:1, pp. 29 ff.

Warren, Martin

2012. Corpora: Specialized. In The Encyclopedia of Applied Linguistics.

Pérez-Paredes, Pascual, María Sánchez-Tornel, Jose María Alcaraz Calero & Pilar Aguado Jiménez

2011. Tracking learners' actual uses of corpora: guided vs non-guided corpus consultation. Computer Assisted Language Learning 24:3, pp. 233 ff.

Nebot, Esther Monzó

2008. Corpus-based Activities in Legal Translator Training. The Interpreter and Translator Trainer 2:2, pp. 221 ff.

This list is based on CrossRef data as of 12 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.