Publication

A Portuguese Native Language Identification Dataset

dc.contributor.author: del Río, Iria
dc.contributor.author: Zampieri, Marcos
dc.contributor.author: Malmasi, Shervin
dc.date.accessioned: 2018-05-24T10:58:38Z
dc.date.available: 2018-05-24T10:58:38Z
dc.date.issued: 2018
dc.description.abstract: In this paper we present NLI-PT, the first Portuguese dataset compiled for Native Language Identification (NLI), the task of identifying an author’s first language based on their second language writing. The dataset includes 1,868 student essays written by learners of European Portuguese, native speakers of the following L1s: Chinese, English, Spanish, German, Russian, French, Japanese, Italian, Dutch, Tetum, Arabic, Polish, Korean, Romanian, and Swedish. NLI-PT includes the original student text and four different types of annotation: POS, fine-grained POS, constituency parses, and dependency parses. NLI-PT can be used not only in NLI but also in research on several topics in the field of Second Language Acquisition and educational NLP. We discuss possible applications of this dataset and present the results obtained for the first lexical baseline system for Portuguese NLI. [pt_PT]
dc.description.version: info:eu-repo/semantics/publishedVersion [pt_PT]
dc.identifier.citation: del Río, Iria; Zampieri, Marcos; Malmasi, Shervin (2018): A Portuguese Native Language Identification Dataset in "The Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications". The Association for Computational Linguistics: New Orleans [pt_PT]
dc.identifier.uri: http://hdl.handle.net/10451/33644
dc.language.iso: eng [pt_PT]
dc.publisher: The Association for Computational Linguistics [pt_PT]
dc.relation: DETECÇÃO E CORREÇÃO AUTOMÁTICA DE ERROS EM PORTUGUÊS SEGUNDA LÍNGUA/LÍNGUA ESTRANGEIRA
dc.title: A Portuguese Native Language Identification Dataset [pt_PT]
dc.type: conference object
dspace.entity.type: Publication
oaire.awardNumber: SFRH/BPD/109914/2015
oaire.awardTitle: DETECÇÃO E CORREÇÃO AUTOMÁTICA DE ERROS EM PORTUGUÊS SEGUNDA LÍNGUA/LÍNGUA ESTRANGEIRA
oaire.awardURI: info:eu-repo/grantAgreement/FCT/OE/SFRH%2FBPD%2F109914%2F2015/PT
oaire.citation.conferencePlace: New Orleans [pt_PT]
oaire.citation.title: The Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications [pt_PT]
oaire.fundingStream: OE
project.funder.identifier: http://doi.org/10.13039/501100001871
project.funder.name: Fundação para a Ciência e a Tecnologia
rcaap.rights: openAccess [pt_PT]
rcaap.type: conferenceObject [pt_PT]
relation.isProjectOfPublication: a663ff8b-f624-4c4d-af6f-69ab41fcbe3b
relation.isProjectOfPublication.latestForDiscovery: a663ff8b-f624-4c4d-af6f-69ab41fcbe3b

Files

Main
Name: A Portuguese Native Language Identification Dataset.pdf
Size: 97.93 KB
Format: Adobe Portable Document Format
License
Name: license.txt
Size: 1.2 KB
Format: Item-specific license agreed upon to submission