Research Project
COPAS - Contrast and Parallelism in Speech
Publications
Extending AuToBI to prominence detection in European Portuguese
Publication . Moniz, Helena; Mata, Ana Isabel; Hirschberg, Julia; Batista, Fernando; Rosenberg, Andrew; Trancoso, Isabel
This paper describes our exploratory work in applying the Automatic ToBI annotation system (AuToBI), originally developed for Standard American English, to European Portuguese. This work is motivated by the current availability of large amounts of (highly spontaneous) transcribed data and the need to further enrich those transcripts with prosodic information. Manual prosodic annotation, however, is hardly feasible for extensive data sets. For that reason, automatic systems such as AuToBI stand as an alternative solution. We started by applying the AuToBI prosodic event detection system, using the existing English models, to the prediction of prominent prosodic events (accents) in European Portuguese. This approach achieved an overall accuracy of 74% for prominence detection, similar to state-of-the-art results for other languages. We then trained new models using prepared and spontaneous Portuguese data, achieving a considerable improvement of about 6 percentage points in accuracy (absolute) over the existing English models. The results are quite encouraging and provide a starting point for automatically predicting prominent events in European Portuguese.
Stylistic variation in the intonation of European Portuguese teenagers and adults
Publication . Mata, Ana Isabel; Moniz, Helena; Batista, Fernando
The present study investigates intonation contours in phrase-final position in a corpus of spontaneous and prepared unscripted presentations by teenagers (14-15 years old) and adults, collected in a school context. Taking into account the differences between phrasing levels (ToBI breaks 3 and 4), we show that the frequency of low/falling vs. high/rising contours – mainly (H+)L* L and (L+)H* H – varies across oral presentation types. Adults and teenagers follow distinct strategies, though cross-gender differences are also a source of variation. We interpret these changes as an adaptation effect to the speaking styles specifically required at school, which call for the speaker's effort to speak clearly and to keep the listeners' attention, and ultimately as “intelligibility-oriented” speaking style changes.
Teenage and Adult Speech in School Context: Building and Processing a Corpus of European Portuguese
Publication . Mata, Ana Isabel; Moniz, Helena; Batista, Fernando; Hirschberg, Julia
We present CPE-FACES, a corpus of European Portuguese spoken by teenagers and adults in a school context, with an overview of the distinguishing characteristics of high school oral presentations and the challenges these data pose to automatic speech processing. The CPE-FACES corpus was created with two main goals: to provide a resource for the study of prosodic patterns in both spontaneous and prepared unscripted speech, and to capture the inter-speaker and speaking style variations common at school, for research on oral presentations. Research on speaking styles is still largely based on adult speech. References to teenagers are sparse, and cross-analyses of speech types comparing teenagers and adults are rare. We expect that CPE-FACES, currently a unique resource in this domain, will contribute to filling this gap for European Portuguese. Focusing on disfluencies and phrase-final phonetic-phonological processes, we show the impact of teenage speech on the automatic segmentation of oral presentations. Analyzing fluent final intonation contours in declarative utterances, we also show that the specificities of the communicative situation, speaker status and cross-gender differences are key factors in speaking style variation at school.
Prosodic, Syntactic, Semantic Guidelines for Topic Structures Across Domains and Corpora
Publication . Mata, Ana Isabel; Moniz, Helena; Móia, Telmo; Gonçalves, Anabela; Silva, Fátima; Batista, Fernando; Duarte, Inês; Oliveira, Fátima; Falé, Isabel
This paper presents the annotation guidelines applied to naturally occurring speech, aiming at an integrated account of contrast and parallel structures in European Portuguese. These guidelines were defined to allow for the empirical study of interactions between intonation and syntax-discourse patterns in selected sets of different corpora (monologues and dialogues, by adults and teenagers). In this paper we focus on the multilayer annotation process of left periphery structures, using a small sample of highly spontaneous speech in which the distinct types of topic structures are displayed. The analysis of this sample provides fundamental training and testing material for further application in a wider range of domains and corpora. The annotation process comprises the following time-linked levels (manual and automatic): phone, syllable and word level transcriptions (including co-articulation effects); tonal events and break levels; part-of-speech tagging; syntactic-discourse patterns (construction type; construction position; syntactic function; discourse function); and disfluency events. Speech corpora with such a multi-level annotation are a valuable resource for examining the relations between grammar modules in language use from an integrated viewpoint. Such a viewpoint is innovative for European Portuguese and has not often been adopted in studies of other languages.
Marcação Explícita de Tópicos com a Locução Prepositiva quanto a e Afins
Publication . Móia, Telmo; Gonçalves, Anabela; Duarte, Inês
In this paper, a general semantic and syntactic characterisation of the topics introduced by connectives such as quanto a – the Portuguese counterpart of English as for – will be attempted. The grammatical differences between these topics and those that are merely signaled via syntactic ordering or intonational devices will be highlighted. These differences include both semantic properties (e.g. the type of reference chains allowed) and syntactic properties (e.g. recursion, position, sensitivity to island constraints). A connection with discourse mechanisms – of the cohesion-coherence type – in the constructions with lexically explicit topic markers will also be underlined.
Funders
Funding agency
Fundação para a Ciência e a Tecnologia
Funding programme
3599-PPCDT
Funding Award Number
PTDC/CLE-LIN/120017/2010
