181 Research products, page 1 of 19

  • Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave

  • Open Access English
    Authors: 
    Maja Bitenc; Marko Stabej; Nataša Gliha Komac; Matejka Grgič; Monika Kalin Golob; Karmen Kenda-Jež; Albina Nećak Lük; Sonja Novak Lukanovič; Krištof Savski;
    Publisher: Znanstvena založba Filozofske fakultete Univerze v Ljubljani (Ljubljana University Press, Faculty of Arts)

    A record of the panel discussion on current sociolinguistic challenges and priority research topics, organized by Asst. Prof. Dr. Maja Bitenc and Full Prof. Dr. Marko Stabej of the Department of Slovenian Studies. It took place on Monday, 27 September 2021, at the Faculty of Arts, University of Ljubljana, and was streamed via Zoom. In the first part, invited experts presented their views on the opening questions; the second part was a discussion among all participants. The speakers edited the transcript of the recording at their own discretion, in principle with as few interventions as possible, while from the discussion the more substantive replies were adapted for reading and published.

  • Open Access
    Authors: 
    Eva Trivunović;
    Publisher: University of Ljubljana

    The paper presents an overview of the variants and modifications of seven biblical (or Bible-derived) idioms in contemporary Slovene and of their presence in the contemporary language. The findings are compared with the treatment of these idioms in existing dictionaries, which reveals a large gap between the dictionary description and the situation shown by corpus material. To determine more reliably in which cases one can speak of already established variation, the study used three corpora of different genres: Gigafida 2.0, Janes and slWaC. In addition to established variants, non-established modifications are presented, with special emphasis on renewals. However, the clearly defined typology proved too rigid in places, since in some borderline cases it was not possible to distinguish unambiguously between established variants and non-renewal modifications, or between non-renewal and renewal modifications. All the selected idioms and their renewals are most frequent in the Janes corpus, which demonstrates the need to include a larger number of diverse corpora in linguistic research.

  • Publication . Article . 2021
    Open Access English
    Authors: 
    Nikola Ljubešić; Nataša Logar; Iztok Kosem;
    Publisher: Znanstvena založba Filozofske fakultete Univerze v Ljubljani (Ljubljana University Press, Faculty of Arts)

    Collocations play a very important role in language description, especially in identifying the meanings of words. Lists of collocates ranked by some statistical measure are an indispensable part of meaning deduction in modern lexicography. In the paper, we present a comparison between two approaches to the ranking of collocates: (a) the logDice method, which is predominantly used and frequency-based, and (b) the fastText word embeddings method, which is new and semantic-based. The comparison was made on two Slovene datasets, one representing general language headwords and their collocates, and the other representing headwords and their collocates extracted from a language for special purposes corpus. The evaluation had two parts: for the quantitative part, we used supervised machine learning with a support-vector machine (SVM) algorithm, scored by the area under the ROC curve (AUC), and in the qualitative part the ranking results of the two methods were evaluated by lexicographers. The results were somewhat inconsistent: while the quantitative evaluation confirmed that the machine-learning-based approach produced better collocate ranking results than the frequency-based one, lexicographers in most cases considered the collocate listings of the two methods very similar.
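
    To make the frequency-based side of this comparison concrete, the following is a minimal sketch of logDice ranking, using the standard definition logDice = 14 + log2(2 · f_xy / (f_x + f_y)). The function names, the placeholder collocate labels and all counts are hypothetical illustrations, not data or code from the paper.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """Standard logDice score: 14 + log2(2 * f_xy / (f_x + f_y)).

    f_xy: co-occurrence frequency of headword and collocate
    f_x:  corpus frequency of the headword
    f_y:  corpus frequency of the collocate
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("all frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


def rank_collocates(headword_freq: int, candidates: dict) -> list:
    """Rank collocates of one headword by logDice, highest score first.

    candidates maps collocate -> (collocate frequency, co-occurrence frequency).
    """
    scored = [
        (collocate, log_dice(f_xy, headword_freq, f_y))
        for collocate, (f_y, f_xy) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts (assumption, for illustration only).
toy_candidates = {
    "rare_but_typical": (2_000, 300),     # infrequent word, often next to the headword
    "frequent_function": (50_000, 400),   # very frequent word, slightly more co-occurrences
}
ranking = rank_collocates(1_000, toy_candidates)
for collocate, score in ranking:
    print(f"{collocate}: {score:.2f}")
```

    In this toy input the second candidate has more raw co-occurrences, yet it ranks lower because logDice normalizes by the individual frequencies of both words. An embedding-based ranking would instead order candidates by vector similarity, which this sketch does not attempt.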

  • Open Access
    Authors: 
    Mojca Stritar Kučuk;
    Publisher: University of Ljubljana

    Full-time international students at the University of Ljubljana who learn Slovene in their first year of study within the Leto plus (Year Plus) module attend a special workshop in the second semester, where they become more closely acquainted with online language resources and technologies for Slovene. The paper describes how this workshop was run in the 2019/20 academic year, when, because of the coronavirus pandemic, it took place remotely in the form of interactive videos with tasks checking comprehension of the material. The second part of the paper focuses on the students' opinions of such language resources. Using an online survey, I analysed the attitudes and experiences of two generations of students: the 2018/19 generation learned about the online tools in class, the 2019/20 generation remotely. Judging by the survey results, the younger generation of students uses online language resources more often. Students in both groups most frequently use Google Translate, followed by Sloleks, the Besana inflection tool, Fran and Pons. As reasons for using these resources they mainly cite speed or ease of use and being accustomed to a particular resource.

  • Open Access
    Authors: 
    Magdalena Gapsa;
    Publisher: University of Ljubljana

    A report on two important lexicographic conferences: the seventh biennial conference of the Electronic lexicography in the 21st century association (eLex for short), held from 5 to 7 July 2021, and the nineteenth biennial conference of the European Association for Lexicography (EURALEX), held from 7 to 9 September 2021.

  • Open Access English
    Authors: 
    Darinka Verdonik; Simona Majhenič; Špela Antloga; Sandi Majninger; Marko Ferme; Kaja Dobrovoljc; Simona Pulko; Mira Krajnc Ivič; Natalija Ulčnik;
    Publisher: Znanstvena založba Filozofske fakultete Univerze v Ljubljani (Ljubljana University Press, Faculty of Arts)

    The paper describes three types of challenges that were detected in teaching Slovene as a mother tongue at schools. First, a number of orthographic and grammatical mistakes can be detected in pupils’ writing (see Kosem et al., 2012; Križaj and Bester Turk, 2018; Gomboc, 2019). Second, low phraseological literacy was noticed, and pupils often have problems understanding phrasemes (Vorsic, 2018). Third, the challenges of communicative competence were addressed, referring to the production and interpretation of different written, spoken and multimedia genres, as only appropriate genre literacy enables efficient use of different genres (Nidorfer Siskovic, 2013). To address these challenges, we have developed a complex e-learning environment for improving the writing and communication skills of Slovene pupils: “Slovenscina na dlani”. The environment is divided into four general topics: orthography, grammar, phrasemes and texts. Each topic covers a number of subtopics, and for each subtopic a number of exercises is available, along with explanations. We have used the most up-to-date language technologies and programming solutions in order to automate the e-environment. The user’s knowledge is automatically evaluated, and on that basis the user is automatically guided through the environment so as to improve their writing and communication skills. The e-environment also has a special user interface for teachers, which offers an easy way to assign tasks and to track the performance of each pupil individually or of a group of pupils as a whole. Gamification and professional graphic design round out the user experience. “Slovenscina na dlani” will be freely available at https://slo-na-dlani.si from September 2021 on.

  • Open Access English
    Authors: 
    Darja Fišer; Tomaž Erjavec; Ajda Pretnar;
    Publisher: Znanstvena založba Filozofske fakultete Univerze v Ljubljani (Ljubljana University Press, Faculty of Arts)

  • Open Access
    Authors: 
    Lucia Vlášková; Hana Strachoňová;
    Publisher: University of Ljubljana

    As a growing field of study within sign language linguistics, sign language lexicography faces many challenges that have already been answered for audio-oral language material. In this paper, we present some of these challenges and methods developed to help navigate the complex lexical classification field. The described methods and strategies are implemented in the first Czech sign language (ČZJ) online dictionary, a part of the platform Dictio, developed at Masaryk University in Brno. We cover the topic of lemmatisation and how to decide what constitutes a lexeme in sign language. We introduce four types of expressions that qualify for a dictionary entry: a simple lexeme, a compound, a derivative, and a set phrase. We address the question of the place of classifier constructions and shape and size specifiers in a dictionary, given their peculiar semantic status. We maintain the standard classification of classifiers (whole entity and holding classifiers) and size and shape specifiers (SASSes; static and tracing specifiers). We provide arguments for separating the category of specifiers from the category of classifiers. We discuss the proper treatment of mouthings and mouth gestures concerning citation forms, derivation and translation. We show why it is difficult in sign language to distinguish synonyms from variants and how our proposed phonological criteria can help. We explain how to construct a semantic definition in a sign language and what is the solution for multiple meanings of one form. We offer simple guidelines for forming proper examples of use in a sign language. And finally, we briefly comment on the process of the translation between sign and spoken languages. We conclude the paper with a summary of roles that Dictio plays in the ČZJ-signing community.

  • Open Access English
    Authors: 
    Matej Ulčar; Anka Supej; Marko Robnik-Šikonja; Senja Pollak;
    Project: EC | EMBEDDIA (825153)

    In recent years, the use of deep neural networks and dense vector embeddings for text representation has led to excellent results in the field of computational understanding of natural language. It has also been shown that word embeddings often capture gender, racial and other types of bias. The article focuses on evaluating Slovene and Croatian word embeddings in terms of gender bias using word analogy calculations. We compiled a list of masculine and feminine nouns for occupations in Slovene and evaluated the gender bias of fastText, word2vec and ELMo embeddings with different configurations and different approaches to analogy calculations. The lowest occupational gender bias was observed with the fastText embeddings. Similarly, we compared different fastText embeddings on Croatian occupational analogies.
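
    As an illustration of what a word analogy calculation looks like, here is a minimal sketch of one common variant, vector-offset analogy ("a is to b as c is to ?"), solved by cosine similarity to b − a + c. The abstract does not say which analogy methods were used, so this choice is an assumption; the function names and the three-dimensional toy vectors are hypothetical and hand-built, not real fastText, word2vec or ELMo embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding a, b and c."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [word for word in vectors if word not in {a, b, c}]
    return max(candidates, key=lambda word: cosine(vectors[word], target))


# Toy vectors (assumption): axis 0 encodes grammatical gender,
# axes 1 and 2 encode two occupations.
toy_vectors = {
    "moški": [1.0, 0.0, 0.0],        # man
    "ženska": [-1.0, 0.0, 0.0],      # woman
    "zdravnik": [1.0, 1.0, 0.0],     # doctor (masculine)
    "zdravnica": [-1.0, 1.0, 0.0],   # doctor (feminine)
    "učitelj": [1.0, 0.0, 1.0],      # teacher (masculine)
    "učiteljica": [-1.0, 0.0, 1.0],  # teacher (feminine)
}

print(analogy("moški", "ženska", "zdravnik", toy_vectors))
```

    With these idealized vectors the masculine occupation noun maps to its feminine counterpart. In embeddings trained on real text, an analogy that returns a different occupation instead would be one possible signal of occupational gender bias.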

  • Open Access
    Authors: 
    Jakob Lenardič; Darja Fišer;
    Publisher: University of Ljubljana

    This paper first presents a comparative analysis of modal adverbs in doctoral theses in the humanities and social sciences on the one hand, and in natural and technical sciences on the other from the 1.7-billion-token corpus of Slovenian academic texts KAS (Erjavec et al., 2019a). Using a randomized concordance analysis, we observe the epistemic and non-epistemic usage of the modal adverbs and show that epistemic adverbs are more characteristic of the humanities and social sciences theses. We also show that the non-epistemic dispositional meaning of possibility, which is most commonly used in natural and technical sciences theses, is not used as a hedging device. In the second part of the paper we compare the usage of a selected set of modals in bachelor’s, master’s and doctoral theses in order to chart how researchers’ approach to stance-taking changes at different proficiency levels in academic writing, showing that the observed increase in hedging devices in doctoral theses seems to be less a function of an increased proficiency level in academic writing as such and more the result of conceptual differences between undergraduate and postgraduate theses, only the latter of which are original research contributions with extensive discussion of the results.
