FreP

Frequency Patterns of Phonological Objects in Portuguese - Research and Applications

 
Workshop on Frequency in Phonology

Organized by the FreP Project (PTDC/LIN/70367/2006)
Laboratório de Fonética
Centro de Linguística da Universidade de Lisboa
http://labfon.letras.ulisboa.pt/FreP

October 26
Faculdade de Letras da Universidade de Lisboa
Room 5.2

Abstracts

Joan Bybee (University of New Mexico)
A typology of sound change: phonetic properties and frequency effects

From a theoretical point of view, the significance of frequency patterns comes from their impact on cognitive representations and the consequent effect cognitive representation has on the structure of language. When we see high or low frequency items changing we can use that information to form hypotheses about the causes of linguistic change. Frequency data allows us to distinguish between articulatory and perceptual motivation for sound change and variation as well as to identify change based on structural or lexical analogy. The presentation provides a typology of sound change (applicable also to synchronic variation) which is based on the underlying causes, as identified from the patterns of diffusion from high to low frequency or the opposite, and the nature of the phonetic properties that change.

Niels Schiller (Universiteit Leiden)
Psycholinguistic studies of frequency effects in language production: Behavioral evidence

In this talk, I will first give an overview of the language production process and word form encoding, also called phonological encoding. Then I will present examples of behavioral studies on frequency effects in single word production. Besides word frequency, I will focus on syllable frequency. Syllable frequency turned out to be an important factor for word production latencies, independent of word (and segment) frequency. I will show that in Germanic languages such as English, Dutch or German, speakers cover a large proportion of their speech output with a relatively small set of syllables, the basic units of articulatory-motor behavior. However, while syllable frequency has been reliably shown to affect speech onset latencies, supporting the idea of a mental syllabary, i.e. a store of prefabricated articulatory-motor programs, the representation of syllables is still under debate due to contradictory findings from priming studies.

Marina Vigário & Fernando Martins (Universidade de Lisboa)
The FreP tool

Frequency information available for phonological units in Portuguese is still scarce, non-replicable, corpus dependent, and hard to obtain because no free tool exists for public use. FreP is a new electronic tool that provides frequency counts of phonological units at the word level and below from Portuguese written text. The FreP approach is largely based on descriptions of the lexical phonology of the language. Frequency information is computed for segmental features, segments, major classes of segments, syllables and syllable types, phonological clitics, clitic type and size, prosodic words and their shape, and word stress location. Segmental features, segments and syllable types are also computed by position within the word and/or status relative to word stress. FreP provides a phonetic transcription output from the written text, together with a lexical frequency list (type and token). It also generates .txt files separating prosodic words and clitics.
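As a rough illustration of what a lexical frequency list (type and token) involves, the following Python sketch counts word forms in a short written sample. It is not the FreP implementation: the function name lexical_frequency, the regular-expression tokenizer and the sample sentence are all assumptions made for this example, and no phonetic transcription or prosodic analysis is attempted.

```python
import re
from collections import Counter

def lexical_frequency(text):
    """Return (number of tokens, number of types, per-word counts).

    Assumption: a word is a run of letters (accented letters included),
    optionally joined by hyphens. This is a simplification, not FreP's rule.
    """
    words = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    counts = Counter(words)
    return len(words), len(counts), counts

# Hypothetical sample sentence, used only to exercise the function.
sample = "A casa da Maria é a casa mais bonita da rua."
tokens, types, counts = lexical_frequency(sample)

print(tokens, types)           # running words vs. distinct word forms
print(counts.most_common(3))   # most frequent word forms in the sample
```

Tokens are the running words of the text, while types are the distinct word forms; the counter maps each type to its number of tokens.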

Marisa Cruz, Sónia Frota & Marina Vigário (Universidade de Lisboa)
The FrePOP database

Information on the frequency of phonological units is relevant for fundamental research in areas such as the phonology of particular languages, linguistic universals/general trends, language change, acquisition and development of phonology, L2 learning, or bilingual studies. Frequency is also an obvious source for establishing regular and deviating patterns in language use. Knowledge of such patterns is essential in domains like the diagnosis, evaluation and therapy of deviant/pathological speech, evaluation of proficiency of first and second language learners, or forensics. The FrePOP database (Frequency of Phonological Objects in Portuguese - http://frepop.fl.ul.pt/) contains frequency information on phonological objects in different types of corpora, which may be used as reference information. Frequency values may be obtained for different regions (including the varieties of Portuguese spoken in Brazil and in Africa) and for specific population groups (by age, sex, education and profession), as well as for different historical periods (from the 16th century onwards) and types of corpora (e.g. written, spoken, adult speech, child-directed speech, child speech). The FrePOP includes information on prosodic words, clitics, syllables, segments and stress patterns. The basis of the FrePOP is a set of corpora with over 3.5 million orthographic words, and the frequency data in the FrePOP were obtained with the FreP tool.

Marina Vigário, Sónia Frota & Fernando Martins (Universidade de Lisboa)
Frequency in language acquisition: tokens or types?

In this talk, we examine the frequency of a number of phonological units and patterns in European Portuguese adult speech, computed over tokens and over types, and compare it with the frequency and/or order of emergence of those units and patterns in children’s early speech. We conclude that, whenever frequency information based on tokens and on types does not converge, it is always the frequency computed over tokens that correlates with the frequency patterns and/or order of emergence of those units/patterns in child speech. This investigation contributes to the understanding of the role of frequency in language acquisition, in addition to providing new frequency data for Portuguese.
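The difference between counting over tokens and counting over types can be sketched in Python as follows. This is only a toy illustration under stated assumptions: the word list, the hand-assigned stress labels and the function name pattern_frequencies are hypothetical, the sample is invented for the example, and nothing here reproduces the data or the method of the talk.

```python
from collections import Counter

# Hand-annotated stress position for a few Portuguese words (toy lexicon).
STRESS = {
    "está": "final",
    "casa": "penultimate",
    "bonita": "penultimate",
    "médico": "antepenultimate",
}

# Hypothetical running sample: one entry per token, invented for illustration.
sample = ["está", "casa", "está", "bonita", "está", "médico", "está"]

def pattern_frequencies(tokens, pattern_of):
    """Count a pattern over tokens (every occurrence) and over types (each word once)."""
    by_token = Counter(pattern_of[word] for word in tokens)
    by_type = Counter(pattern_of[word] for word in set(tokens))
    return by_token, by_type

by_token, by_type = pattern_frequencies(sample, STRESS)
print(by_token)
print(by_type)
```

In this toy sample one very frequent word makes final stress the most common pattern over tokens, whereas penultimate stress is the most common pattern over types, which is the kind of non-convergence the comparison addresses.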

Sónia Frota, Charlotte Galves, Marina Vigário & Verónica Gonzalez-Lopez (Universidade de Lisboa, Universidade de Campinas)
Frequency and the phonology of rhythm from Classical to Modern Portuguese

It is known that in the history of Portuguese, between the 17th and 19th centuries, there was both a syntactic and a prosodic change: the former gave rise to a new pattern in clitic pronoun location, and the latter yielded modern Portuguese pronunciation. The syntactic change seems to have taken place at the beginning of the 18th century; the prosodic change is much harder to locate, and evidence from grammarians' reports is scarce and contradictory. Building on proposals that see language rhythm as a constellation of phonological properties, we examine the frequency patterns of some of these properties (distribution of syllable types, proportion of C and V, stress properties and word size) from 1500 onwards. Frequency was computed with FreP over texts from the Tycho Brahe corpus (http://www.ime.usp.br/~tycho/corpus). Changes in the frequency patterns indicate that Portuguese rhythm has evolved to integrate aspects of accentual timing, and that the prosodic change was progressive, with different properties changing at different moments in time (the first change occurring at the beginning of the 17th century).

Sónia Frota, Marina Vigário, Fernando Martins, Marisa Cruz, Nuno Matos & Nuno Paulino (Universidade de Lisboa)
Frequency patterns of European Portuguese: spoken/written language, modality, region, education, age and gender

Using the FrePOP database, we examine the frequency patterns of prosodic words, clitics, syllables, segments and stress patterns by corpus size, corpus type (spoken, written, modality), and speaker-related factors such as region, education, age and gender. The presentation has two goals: first, to establish which frequency patterns are more robust, i.e. less affected by factors like corpus size/type and dialectal/sociolectal factors, and thus part of the language profile and most useful for cross-language comparison; second, to describe the properties that may best indicate particular within-language groupings and establish their patterns of variation according to the factors observed.