Exploitation and Evaluation of an Arabic-English Composite Learner Translator Corpus




Arabic multimodal parallel learner corpus, process and product-oriented translation, triangulation


This paper describes in depth the data collection and exploitation stages in the construction of the undergraduate learner translator corpus (ULTC), a 75 million-word, sentence-aligned, bidirectional parallel corpus of Arabic, English, and French, with Arabic as its central language. We focus on the methodological challenges, describing the compilation process and the problems encountered in the first phase of the project. Our aim is to inform future compilers of similar projects that integrate learner corpus research (LCR) and corpus-based translation studies (CBTS). In the first part, we present design considerations, data collection criteria, and the exploitation of the corpus; in the second part, we evaluate the systems we used and consider possible improvements.

Author Biographies

Reem F. Alfuraih, Princess Norah bint Abdul Rahman University, Saudi Arabia

(Lecturer in Applied Linguistics)

Department of Applied Linguistics, College of Languages

Princess Norah bint Abdul Rahman University, Saudi Arabia

Noha M. El-Jasser, Princess Noura bint Abdulrahman University, Saudi Arabia

(Lecturer in Translation)

Department of Translation, College of Languages

Princess Noura bint Abdulrahman University, Saudi Arabia


