Publication

Shallow Processing of Portuguese: From Sentence Chunking to Nominal Lemmatization

dc.contributor.advisor: Branco, António [por]
dc.contributor.author: Silva, João [por]
dc.date.accessioned: 2009-02-10T13:12:58Z [por]
dc.date.accessioned: 2014-11-14T16:22:18Z
dc.date.available: 2009-02-10T13:12:58Z [por]
dc.date.available: 2014-11-14T16:22:18Z
dc.date.issued: 2007-06 [por]
dc.description.abstract: This dissertation proposes a set of procedures for the computational processing of Portuguese. Five tasks are covered: Sentence Segmentation, Tokenization, Part-of-Speech Tagging, Nominal Featurization and Nominal Lemmatization. These are some of the initial steps producing linguistic information (such as POS categories or lemmas) that is important to most subsequent processing (e.g. syntactic and semantic analysis). I follow a shallow processing approach, where linguistic information is associated with text based on local information (i.e. using the word itself or perhaps a limited window of context containing just a few words). I begin by identifying and describing the key problems raised by each task, with special focus on the problems that are specific to Portuguese. After an overview of existing approaches and tools, I describe the solutions I followed to the issues raised previously. I then report on my implementation of these solutions, which are found either to yield state-of-the-art performance or, in some cases, to advance the state of the art. The major result of this dissertation is thus threefold: a description of the problems found in NLP of Portuguese, a set of algorithms and the corresponding tools to tackle those problems, together with their evaluation results. [por]
dc.identifier.uri: http://hdl.handle.net/10451/14016 [por]
dc.identifier.uri: http://repositorio.ul.pt/handle/10455/3095 [por]
dc.language.iso: por [por]
dc.publisher: Department of Informatics, University of Lisbon [por]
dc.relation.ispartofseries: di-fcul-tr-07-16 [por]
dc.subject: Natural language processing [por]
dc.subject: Shallow processing [por]
dc.subject: Sentence segmentation, Tokenization [por]
dc.subject: Morphosyntactic annotation [por]
dc.subject: Morphological analysis [por]
dc.subject: Lemmatization [por]
dc.title: Shallow Processing of Portuguese: From Sentence Chunking to Nominal Lemmatization [por]
dc.type: master thesis
dspace.entity.type: Publication
rcaap.rights: openAccess [por]
rcaap.type: masterThesis [por]

Files

Main
Name: 07-16.pdf
Size: 1.29 MB
Format: Adobe Portable Document Format