The Boarnsterhim Corpus
Synchrony and diachrony in language
The Boarnsterhim Corpus (BHC) is a corpus for researchers and students who want to study the phonetics, phonology, variation, and change of spoken Frisian. The BHC will be embedded in a larger database of Frisian corpora to make it possible to study the effects of lexical frequency. The first data will be published in 2018, and the corpus will be expanded in the coming years.
Frisian: Eke Born, Kobe Flapper, Renske Hooijenga, Hilde de Jong BA, Dik Nauta, Wytse Willem Pel, Janneke Spoelstra MA, Tineke Tamminga, Helga Zandberg
Dutch: Grietje Keizer-Heeringa, Theresia Schreiber, Edmee Valk-Boon BA, Rick Weggen
Andrea Garcia Ariza MA, Tessa Hummel BA, Mirthe Koppenberg
Is Frisian moving towards Dutch? That question has often been asked, and some evidence seems to support the idea. To study whether the sound system of Frisian was really changing towards Dutch, the Boarnsterhim Corpus (henceforth BHC) was recorded in 1982-1984 under the supervision of Tony Feitsma. The studies that followed suggest that the Frisian sound system was stable. In some respects, the distinction between Frisian and Dutch even became stronger. To investigate whether this trend continues, the BHC2 is being recorded in 2017-2019. Recordings and analyses of four generations of speech provide the opportunity to investigate the stability, variation, and change of the Frisian sound system over 100 years.
In both periods, speakers of three generations of the same families were recorded: grandmother, mother, and daughter; or grandfather, father, and grandson. The two younger generations of the first period overlap with the oldest two generations of the second period. A unique property of this corpus is that as far as possible, half of the overlapping generations in the BHC1 and the BHC2 consists of speech of the same individuals.
All speakers were recorded twice: once in Frisian with a native interviewer, to ensure informal Frisian speech, and once in Dutch with a monolingual Dutch interviewer, to avoid Frisian. Each recording consists of 20 read sentences, a read story (2-3 minutes), and an interview of about 40 minutes about the speaker’s use of Frisian, language attitude, and daily life activities. In the BHC1, data were recorded on cassette tapes, which were digitized in 2016. The BHC2 is a replication of the BHC1, with the same number of speakers and the same age groups.
With the help of research assistants, interns, and volunteers, the data are annotated in the Praat speech processing software. The annotation segments the recordings into phrases, words, and sounds (with millisecond accuracy). There are separate tiers (levels) for:
- phonetic realization
- deletion of speech sounds
- specific phonological processes
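As a rough illustration of what time-aligned tiers make possible, the following Python sketch models two tiers as lists of labelled intervals and looks up the sounds that fall inside one word. The `Interval` class, the `labels_within` helper, and all example values are hypothetical and invented for illustration. They are not taken from the corpus or from Praat's own file format.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One labelled stretch of time on an annotation tier (hypothetical model)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    label: str


def labels_within(tier, start, end):
    """Return the labels of all intervals lying fully inside [start, end]."""
    return [iv.label for iv in tier if iv.start >= start and iv.end <= end]


# Invented example data: a word tier and a sound tier sharing one time axis.
words = [Interval(0.000, 0.412, "it"), Interval(0.412, 0.950, "hûs")]
sounds = [
    Interval(0.000, 0.200, "ɪ"),
    Interval(0.200, 0.412, "t"),
    Interval(0.412, 0.560, "h"),
    Interval(0.560, 0.800, "u"),
    Interval(0.800, 0.950, "s"),
]

# Look up the sounds aligned with the second word.
word = words[1]
print(word.label, labels_within(sounds, word.start, word.end))
```

Because every tier shares the same time axis, a query on one tier (for example, deleted sounds) can be related to units on another tier (for example, the word in which the deletion occurs).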
The corpus size of the complete BHC is nearly 125 hours of speech for both West Frisian and Dutch.
Embedding and tools
Since lexical frequency plays an important role in language variation and change, the BHC offers frequency information for Frisian. The size of the BHC is limited, however, and so is the variety of topics in the recordings. That may bias lexical frequency counts. We therefore connect the BHC to the other Frisian corpora at the Fryske Akademy, such as the Frisian Audio Mining Enterprise (FAME!) (Yilmaz et al. 2016), which consists of more than 2600 hours of West Frisian radio broadcasts from 1950-2016. We will provide both token and lemma frequencies of words. The technical details are found in Sloos, Drenth & Heeringa (2018).
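The difference between token and lemma frequency can be sketched in a few lines of Python. The word forms, the lemma assignments, and the `frequencies` helper below are invented for illustration. They are assumptions and do not reflect the corpus data or the actual tooling.

```python
from collections import Counter

# Hypothetical (token, lemma) pairs, invented for illustration only.
annotated = [
    ("huzen", "hûs"),
    ("hûs", "hûs"),
    ("rint", "rinne"),
    ("hûs", "hûs"),
    ("rinne", "rinne"),
]


def frequencies(pairs):
    """Count how often each word form (token) and each lemma occurs."""
    token_freq = Counter(token for token, _ in pairs)
    lemma_freq = Counter(lemma for _, lemma in pairs)
    return token_freq, lemma_freq


token_freq, lemma_freq = frequencies(annotated)
print(token_freq["hûs"])  # occurrences of the form "hûs" only
print(lemma_freq["hûs"])  # occurrences of all forms of the lemma "hûs"
```

In this toy data the lemma count is higher than the token count because the lemma groups several inflected forms together, which is why the two measures are reported separately.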
This corpus is highly suitable for research in the following fields:
- bilingualism and code-switching
- long-term language change, especially in bilingualism and minority languages
- the phonetics and phonology of Frisian
- real-time vs. apparent time studies into language change
- studies into the development of reading competence in Frisian
- frequency effects in language
- language and ageing
- language attitude over time
Sloos, Marjoleine, Eduard Drenth & Wilbert Heeringa (in press). The Boarnsterhim Corpus: A Bilingual Frisian-Dutch Panel and Trend Study. In Proceedings of the 11th edition of the Language Resources and Evaluation Conference, 7-12 May 2018, Miyazaki (Japan).
Feitsma, Antonia. 1989. Changes in the pronunciation of Frisian under the influence of Netherlandic. In Deprez, K. (ed.), Language and Intergroup Relations in Flanders and in the Netherlands, 181-193. Dordrecht: Foris.
Meekma, Irénke. 1989. Frouljuspraat en it lytse ferskil. Oer útspraakferoaring yn 'e sandhi by froulju en manlju. It Beaken 51, 115-29.
Feitsma, Tony, Els van der Geest, Frits J. van der Kuip & Irénke Meekma. 1987. Variations and development in Frisian sandhi phenomena. International Journal of the Sociology of Language 64, 81-94.
van der Kuip, Frits J. 1986. Syllabisearring yn it Frysk en it Hollânsk fan Fryskpraters. Tydskrift foar Fryske Taalkunde 2, 69-92.