Research news

Thesaurus of Modern Slovene

Publish Date: 03.12.2018

Category: Outstanding research achievements

The Thesaurus of Modern Slovene is the most comprehensive open-source and automatically generated collection of Slovene synonyms, developed with advanced computational methods that are innovative in the field of lexicography not just in Europe, but globally as well.

Authors: Simon Krek, Adam Cyprian Laskowski, Marko Robnik-Šikonja, Iztok Kosem, Špela Arhar Holdt, Apolonija Gantar, Jaka Čibej, Vojko Gorjanc, Bojan Klemenc, Kaja Dobrovoljc

The Thesaurus of Modern Slovene is the result of a research project that produced an advanced online application and an open-access database of the same name (https://viri.cjvt.si/sopomenke/slv/). It was developed by researchers from the Faculty of Arts and the Faculty of Computer and Information Science of the University of Ljubljana (Simon Krek, Adam Cyprian Laskowski, Marko Robnik-Šikonja, Iztok Kosem, Špela Arhar Holdt, Apolonija Gantar, Jaka Čibej, Vojko Gorjanc, Bojan Klemenc, Kaja Dobrovoljc).

The Thesaurus was compiled in line with the principles of open science. It introduced a new type of dictionary, the responsive dictionary. The initial database of such a dictionary is generated with advanced computational methods, so the language community receives a large amount of relevant, though still somewhat noisy, language data as soon as the database is generated. Crucially, the database is open-access and allows the entire language community to help improve and clean it. This is the most important characteristic of the new type of dictionary: it is never complete, and the data in its entries can be adapted quickly when language use changes. All changes can be tracked through the timestamps in individual dictionary entries and through the archived versions of the database. The name "responsive dictionary" reflects the fact that the data constantly responds to the views of the participating language community and to changes in language use evident in the texts the community produces. At its core, the Thesaurus is a dictionary by the community, for the community.

References: Krek S., Laskowski C., Robnik-Šikonja M. From translation equivalents to synonyms: creation of a Slovene thesaurus using word co-occurrence network analysis. In: Kosem I. (ed.) et al., Proceedings of eLex 2017: Lexicography from Scratch, 19-21 September 2017, Leiden, Netherlands (2017).

Arhar Holdt Š., Čibej J., Dobrovoljc K., Gantar P., Gorjanc V., Klemenc B., Kosem I., Krek S., Laskowski C., Robnik-Šikonja M. Thesaurus of Modern Slovene: By the Community for the Community. In: Čibej J. (ed.) et al., Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts, Ljubljana University Press, Faculty of Arts, Ljubljana (2018).

Gorjanc V., Gantar P., Kosem I., Krek S. (ed.) Glossary of Contemporary Slovene: Problems and Solutions, Scientific Publishing House of the Faculty of Arts, Ljubljana (2015). Partially translated as: Gorjanc V., Gantar P., Kosem I., Krek S. (ed.) Dictionary of Modern Slovene: Problems and Solutions, Ljubljana University Press, Faculty of Arts, Ljubljana (2017).

Home page of the Thesaurus of Modern Slovene
Image source: https://viri.cjvt.si/sopomenke/slv/

