News
April 18, 2024
Veps and Karelian open corpus VepKar is an essential tool for preserving the ethnic languages

Fundamental problems of linguistics and the tasks of language corpus studies were discussed at a RAS Presidium meeting in Moscow. Irma Mullonen and Irina Novak, researchers from the Institute of Linguistics, Literature and History (ILLH) KarRC RAS, presented the main results of the work on the Veps and Karelian Open Corpus (VepKar), which has been underway in Karelia for 15 years. Its database currently contains 6 thousand texts of varying size.
On April 9, the RAS Presidium met to discuss fundamental problems of linguistics and the tasks of language corpus studies. Irma Mullonen and Irina Novak, researchers from the Institute of Linguistics, Literature and History (ILLH) KarRC RAS, gave a presentation on the results of the work on the Veps and Karelian Open Corpus, VepKar. Work on the corpus began 15 years ago with the aim of preserving and systematically studying the languages of the Balto-Finnic peoples of Karelia. The programming is done by specialists from the Institute of Applied Mathematical Research (IAMR) KarRC RAS.

VepKar serves several main purposes. In addition to supporting research, it preserves and accumulates written texts and samples of spoken Karelian and Veps. It currently contains 6 thousand texts with 2 million word occurrences.

"Anyone can use VepKar as a digital library and as a full-fledged electronic dictionary. In addition, applications with a simpler interface, such as the Multimedia Dictionary of Karelian, are being developed from the corpus data for a wide range of users. The corpus is thus a tool for preserving the Karelian and Veps languages, and it offers great opportunities for those learning them. One can look up a word and check how it sounds, how it is spelled correctly, and what grammatical characteristics it has," explained Irma Mullonen, RAS Corresponding Fellow, Chief Researcher at the Linguistics Section, ILLH KarRC RAS.

The keynote lecture at the session dealt with the current stage in the development of corpus linguistics. In his talk, RAS Academician Vladimir Plungian paid special attention to the terminology and methodology of the field. The presentation included a brief overview of the history of corpus linguistics in Russia and worldwide, and outlined the field's current research priorities. The speaker also noted the high demand for the Russian National Corpus, the main project of Russian corpus linguistics.
Read more about the topics discussed at the session on the RAS website.

"More than one ethnic language corpus is currently being created in Russia. However, VepKar, which is being developed at KarRC RAS, has advanced further than the others both in the volume of its content and in its grammatical and semantic markup. It can now serve as a full-fledged platform for scientific research. Apparently, that is why we were invited to make the presentation. As a takeaway from the meeting, the RAS Presidium emphasized the need to support corpus studies in our country. One of the upcoming concrete support measures is the Russian Science Foundation's thematic competition for projects to create corpus resources for the languages of Russia. We hope such a special RSF program will be announced soon, and we will take part in it," Irma Mullonen summarized.
