Linguatec-IA - Cross-border network for technological cooperation in artificial intelligence applied to language for the construction of a trans-Pyrenean linguistic infrastructure
Specific programme: Interreg VI-A Spain-France-Andorra POCTEFA 2021-2027
Project title: Cross-border network for technological cooperation in artificial intelligence applied to language for the construction of a trans-Pyrenean linguistic infrastructure
Acronym: Linguatec-IA
Objective: The objective of the project is to develop knowledge in artificial intelligence (AI) on new neural language models applicable to languages with few resources, to advance the digitalisation of the languages of the POCTEFA territory (Aragonese, Catalan, Basque and Occitan) and the construction of a cross-border linguistic and intelligent infrastructure that facilitates communication between speakers of the different languages and multilingual access to information.
Research team:
- Patxi Xabier Arregi Iparragirre (Principal Investigator)
- Oier López de Lacalle Lecuona
- Adrián Núñez Marcos
- María Jesús Aranzabe Urruzola
- José María Arriola Egurrola
- Ander Barrena Madinabeitia
- Nerea Ezeiza Ramos
- Itziar Aduriz Agirre
Partner Entities:
- Elhuyar Fundazioa (Project Leader)
- Lo Congrès Permanent de la Lenga Occitana
- University of the Basque Country/Euskal Herriko Unibertsitatea
- Université de Toulouse 2 Jean Jaurès
- Université de Perpignan Via Domitia
- Centre National de la Recherche Scientifique
- General Directorate of Linguistic Policy of the Government of Aragon
- Universitat de Lleida
Total budget: €1,545,953.42
FEDER (ERDF) budget: €1,004,867.00
Project Start: 01/01/2024
Project End: 31/12/2026
Total Project Duration: 2 years 11 months and 30 days
Project Summary: The “LINGUATEC-IA” project contributes to two challenges of the POCTEFA territory: (1) increasing the innovation effort by investing in applied research in artificial intelligence for natural language processing (NLP), and (2) contributing to the social and cultural articulation of the cross-border territory by reinforcing a key element of local culture: its languages. The aim is to develop AI knowledge on new autoregressive language models applicable to low-resource languages, and to use them to advance the digitalisation of the languages of the POCTEFA territory (Aragonese, Catalan, Basque and Occitan) and to build a cross-border intelligent linguistic infrastructure that facilitates communication between speakers of different languages and multilingual access to information.
The main results of the project will be:
- New algorithms and neural architectures for generating autoregressive language models suited to settings with limited computing power and limited linguistic resources.
- Improvements in transcription, neural machine translation and voice synthesis systems for Basque, Catalan, Occitan, Aragonese and their dialectal variants, combined with French and Spanish.
- Development of a multilingual language platform for automatic subtitling and dubbing.
- An online platform and repository of resources, technologies and applications for the languages of the Pyrenees.
- Consolidation of the “Cross-border Network of Excellence in Language Technologies”.
The main beneficiaries will be (1) researchers and professionals working in the field of languages and their digitalisation, (2) public and private entities, which will be able to improve their services and make them accessible in different languages, and (3) citizens, who will be able to communicate more easily in a multilingual environment.
The LINGUATEC-IA project has been 65% co-financed by the European Union through the Interreg VI-A Spain-France-Andorra Programme (POCTEFA 2021-2027). The objective of POCTEFA is to strengthen the economic and social integration of the Spain-France-Andorra border area.
More information about the programme:
https://www.poctefa.eu/
https://www.europarl.europa.eu/factsheets/en/sheet/95/european-federal-regional-development-fund