Please use this identifier to cite or link to this item: http://hdl.handle.net/10451/31093
Title: Revising the Annotation of a Broadcast News Corpus: a Linguistic Approach
Author: Cabarrão, Vera
Moniz, Helena
Batista, Fernando
Ribeiro, Ricardo
Mamede, Nuno
Meinedo, Hugo
Trancoso, Isabel
Mata, Ana Isabel
Matos, David
Keywords: Speech annotation
Metadata
Broadcast news
Issue Date: 2014
Publisher: European Language Resources Association (ELRA)
Citation: Cabarrão, V., Moniz, H., Batista, F., Ribeiro, R., Mamede, N., Meinedo, H., Trancoso, I., Mata, A. I. & de Matos, D. (2014) Revising the Annotation of a Broadcast News Corpus: a Linguistic Approach, in Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), European Language Resources Association (ELRA), 3908-3913.
Abstract: This paper presents a linguistic revision process of a speech corpus of Portuguese broadcast news, focusing on metadata annotation for rich transcription, and reports on the impact of the new data on the performance of several modules. The revision process consisted mainly of annotating and revising structural metadata events, such as disfluencies and punctuation marks. The revised data is now being used extensively and has been of great importance for improving the performance of several modules, especially the punctuation and capitalization modules, but also the speech recognition system and all subsequent modules. The revised data has also recently been used in disfluency studies across domains.
URI: http://hdl.handle.net/10451/31093
Appears in Collections: FL - CLUL - Livros de Actas

Files in This Item:
File: 10178.pdf
Size: 887.58 kB
Format: Adobe PDF


