Towards error annotation in a learner corpus of Portuguese

Abstract

In this article, we present COPLE2, a new corpus of Portuguese that encompasses written and spoken data produced by learners of Portuguese as a foreign or second language (FL/L2). Following the trend towards learner corpus research applied to less commonly taught languages, we aim to expand the learner data available for L2 Portuguese. These data may be useful not only for educational purposes (design of learning materials, curricula, etc.) but also for the development of NLP tools to support students in their learning process. The corpus is available online through the TEITOK environment, a web-based framework for corpus treatment that provides several built-in NLP tools and a rich set of functionalities (multiple orthographic transcription layers, lemmatization and POS tagging, token normalization, error annotation) to automatically process and annotate texts in XML format. A CQP-based search interface allows searching the corpus by different fields, such as words, lemmas, POS tags or error tags. We describe the work in progress regarding the constitution and linguistic annotation of this corpus, focusing in particular on error annotation.
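
To illustrate the kind of field-based search the abstract mentions, the following is a minimal sketch that assembles CQP-style token queries from attribute constraints. The attribute names (`word`, `lemma`, `pos`, `error`) and the example values are hypothetical; the actual attribute names and tagset in COPLE2 may differ, and the sketch only builds query strings without contacting any corpus.

```python
def cqp_token(**constraints):
    """Build a single-token CQP expression, e.g. [lemma="ser" & pos="V.*"].

    Attribute names are passed as keyword arguments; they are assumptions
    here and must match whatever the target corpus actually defines.
    """
    if not constraints:
        return "[]"  # in CQP, an empty bracket pair matches any token
    parts = []
    for attr, pattern in constraints.items():
        if '"' in pattern:
            # Keep the sketch simple: refuse values that would need escaping.
            raise ValueError("double quotes in patterns are not supported")
        parts.append(f'{attr}="{pattern}"')
    return "[" + " & ".join(parts) + "]"


def cqp_query(*tokens):
    """Join token expressions into a sequence query ending with ';'."""
    return " ".join(tokens) + ";"


if __name__ == "__main__":
    # A form of the (hypothetical) lemma "ser" followed by any token
    # carrying a hypothetical error tag that starts with "G".
    print(cqp_query(cqp_token(lemma="ser"), cqp_token(error="G.*")))
```

Combining constraints inside one bracket pair restricts a single token, while placing bracket pairs side by side describes a sequence of tokens, which is how a query over words, lemmas, POS tags and error tags can be mixed in one search.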

Citation

Río, Iria del; Antunes, Sandra; Mendes, Amália & Janssen, Maarten (2016). Towards error annotation in a learner corpus of Portuguese. 5th NLP4CALL and 1st NLP4LA workshop in Sixth Swedish Language Technology Conference (SLTC). Umeå University, Sweden, 17-18 November.

Publisher

Linköping University Electronic Press