Mercator/Fryske Akademy researchers Charlie Robinson-Jones and Ydwine Scarse have written a report on language technology resources for Frisian, commissioned by the European Language Equality (ELE) project. The report, 'Report on the West Frisian Language (Language Technology Support of Europe's Languages in 2020/2021 - European Language Equality project)', is free to download.
About European Language Equality (ELE)
With a large consortium of 52 partners covering all European countries, research and industry, and all major pan-European initiatives, the European Language Equality (ELE) project developed a strategic research, innovation and implementation agenda as well as a roadmap for achieving full digital language equality in Europe by 2030. This report is one of (currently) 39 language reports produced for the ELE project.
In this report, we provide an overview of existing Language Technology (LT) for West Frisian, with reference to the language data, tools, and services listed in the European Language Grid. Since LT is central in our everyday lives, such as in spell checkers, search engines, translation software, and virtual assistant technology, it is essential that not only national or majority languages, but also regional or minority languages enjoy digital equality and a high level of digital vitality, a core component of language maintenance and revitalisation. At risk of not achieving digital language equality is West Frisian, an autochthonous minority language spoken in the officially bilingual province of Fryslân (Friesland) in the north of the Netherlands. In an effort to improve the LT situation for West Frisian, we present an analysis of the current data and resources, as well as identify the gaps and challenges that require urgent attention to ensure West Frisian is not negatively affected by digital diglossia (i.e., its exclusion from digital contexts in favour of Dutch, the national language).
Our analysis of current LT resources for West Frisian revealed that there have been significant developments regarding machine translation, spell checkers, monolingual and multimodal corpora, and lexical resources in particular. Crowdsourcing efforts for initiatives such as Mozilla Common Voice have been relatively successful in collecting data that can be utilised in future voice technology projects for West Frisian. There is, however, a lack of bilingual and multilingual text corpora (parallel data), language models, computational grammars, human-computer interaction services, and social media data, among others, as well as of transcribed materials that can be used for the development of speech recognisers and other resources.
Furthermore, although there have been LT programmes for Dutch in recent years, such as the STEVIN programme, regional or minority languages in the Netherlands (e.g., West Frisian) have thus far not received any attention in these projects. There is also no journal or event dedicated specifically to West Frisian LT or artificial intelligence, whereas these are available for Dutch. Moreover, we identified gaps and challenges that highlight the need for increased funding and improved training and retention of skilled LT developers in the province of Fryslân. Overall, despite the impressive progress that has been made to date, there is an evident lack of West Frisian LT resources compared to Dutch and other (official EU) languages.
To improve West Frisian’s digital vitality and LT resources, we recommend, among other things, that:
- (a) a centre or network should be created by core LT stakeholders in Fryslân to safeguard West Frisian in the digital age,
- (b) focus should now shift towards collecting data for the development of more advanced tools and services, such as automatic subtitling and voice recognition software, as well as screen readers for visually impaired West Frisian speakers, and
- (c) West Frisian LT stakeholders should develop initiatives to raise younger generations’ awareness of the resources available and to increase the opportunities to use West Frisian LT in everyday life, with the aim of reducing the impact of digital diglossia.
We further propose that our recommendations should be integrated into a long-term, national digital language strategy that seeks to facilitate the development of more (multilingual) LT resources, enhance the quality of current West Frisian LT, and enable improved cooperation between the main LT stakeholders in the province of Fryslân, which will collectively contribute to achieving digital language equality for West Frisian.