LANGUAGE TECHNOLOGIES & DIGITAL HUMANITIES CONFERENCE 2024

September 19-20, 2024
Faculty of Electrical Engineering, University of Ljubljana

 

The Slovenian Language Technologies Society (SDJT), the Centre for Language Resources and Technologies at the University of Ljubljana (CJVT), and the research infrastructures CLARIN.SI and DARIAH-SI are organising the biennial conference “Language Technologies and Digital Humanities”. The conference has more than 20 years of tradition, and was thematically expanded in 2016 to include digital humanities. This year, the organizational committee of the conference is led by ZRC SAZU. The event is organized in collaboration with the Faculty of Electrical Engineering of the University of Ljubljana, which will host the event on September 19 and 20, 2024.

  1. Thematic areas
  2. Important dates
  3. Invited speakers
  4. Pre-conference events
  5. Panels
  6. Instructions for authors
  7. Organisation

Thematic areas of the conference

The conference aims to bring together researchers from various backgrounds and methodological frameworks. The main topics will include but are not limited to:

  • Speech and other mono- and multilingual language technologies
  • Digital linguistics: translation studies, corpus linguistics, lexicology and lexicography, standardisation
  • Digital humanities and historical studies, ethnology, literary studies, musicology, cultural heritage, archaeology, and fine arts
  • Digital humanities in education and digital publishing

We welcome submissions that present guidelines, research, good practices, projects and results in these areas. The conference will also include invited lectures, a student section, and roundtables on topics related to the conference. The official languages of the conference will be Slovene and English.


Invited speakers


Barbara McGillivray

“Exploring language change computationally: lessons from interdisciplinary collaborations”

 

Abstract: Advanced computational methods allow us to analyse vast datasets and uncover previously inaccessible patterns. However, few natural language processing algorithms properly account for the dynamic nature of language, particularly its semantics, which is crucial to humanistic inquiry. Efforts are underway to improve AI systems’ understanding of historical context and language dynamics, such as in the automatic detection of semantic change, but human annotation and interpretation are still needed to capture the nuances of language and its cultural context. In this talk I will report on a collaborative project involving digital humanists, computational linguists, software engineers and library curators to analyse the effects of mechanisation on the English language of the nineteenth century. I will discuss the challenges and insights gained from combining voluntary crowdsourcing for historical language annotation with algorithms and design experiments. Integrating these approaches allows us to reach a nuanced understanding of language evolution in response to mechanisation and, more broadly, contribute to interdisciplinary research at the intersection of AI and the humanities.

Bio: Barbara McGillivray is Lecturer in Digital Humanities and Cultural Computation in the Department of Digital Humanities of King’s College London and Turing fellow at The Alan Turing Institute. She is Editor in Chief of the Journal of Open Humanities Data and convenor of the MA programme in Digital Humanities at King’s, as well as president of the ACL Special Interest Group on language technologies for the socio-economic sciences and humanities and convenor of the Turing special interest group “Humanities and data science”. Her research focusses on computational methods for the study of language change in both historical languages and contemporary data. She has been co-Investigator of the Living with Machines project, a very large collaboration involving The Alan Turing Institute and the British Library that aimed to investigate the effects of mechanisation via the analysis of British historical newspaper collections. Her most recent book is “Applying Language Technology in Humanities Research. Design, Application, and the Underlying Logic” (co-authored with Gábor Mihály Tóth, Palgrave Macmillan 2020).



Simon Dobnik

“Beyond pixels and words”

 

Abstract: Words are not used in isolation. When we communicate we relate them to our background knowledge, the intent of interaction – what is the purpose of what we want to say, who is our partner, what has been said before – our common ground, our senses and perception of the physical world and situations around us. Speech is also not the only way to convey information: we interact in writing, symbols, with different kinds of texts, with eye-gaze, gestures and other properties of our bodies. Language models in language technology extract meaning primarily from text and sometimes a few other modalities such as images and acoustic signal. This poses two questions: (i) to what extent can these modalities be a proxy for representing semantic knowledge for different natural language processing tasks and applications; and (ii) how can we port semantic knowledge captured in these modalities to different modalities – how can we bring large language models to the real world and take them for a walk? In this talk I will describe our research towards answering these questions and outline the challenges that lie ahead.

Bio: Simon Dobnik is a Professor of Computational Linguistics at the Department of Philosophy, Linguistics and Theory of Science (FLoV) at the University of Gothenburg, Sweden. He is a member of the Centre for Language Technology (CLT) and the Centre for Linguistic Theory and Studies in Probability (CLASP) where he leads the Cognitive Systems research group. He has worked on (i) data models and machine learning of meaning representations for language, action and perception, (ii) semantic models for language, action and perception (computational semantics), (iii) representation learning in language, inference and interpretability, (iv) interpretation and generation of spatial descriptions and reference, (v) interactive learning with small data, (vi) data bias and privacy, and (vii) multimodal dialogue, robotics and related topics.

Pre-conference events

Two pre-conference events will be held on September 18, 2024:

Preliminary schedule:
09:00-13:00 CLASSLA-Express
09:00-13:00 An Introduction to LaTeX for Humanities Scholars
14:00-15:00 Round table on LLMs and corpus linguistics
15:00-17:00 Networking of researchers from the South Slavic space and the ReLDI and CLASSLA centres
 

CLASSLA-Express: this is the final stop of a series of workshops on investigating South Slavic corpora using CLARIN.SI concordancers (Skopje TBA, Zagreb 19 Apr, Rijeka 26 Apr, Belgrade 29 May, Ljubljana 18 Sep). The workshop will be conducted in English, and registration is required to participate. More information about the workshop can be found here: https://www.clarin.si/info/wp-content/uploads/2024/05/Call-for-participation_CLASSLA-Express_LJ.docx.pdf.


No More Document-Editing Nightmares: An Introduction to LaTeX for Humanities Scholars

This hands-on tutorial serves as an introduction to LaTeX, a high-quality typesetting system that is increasingly used and promoted as a document editing program in digital humanities and linguistics. LaTeX facilitates efficient bibliography management and makes complex document structure easy to handle by automatically keeping in-document references up to date. It also enables the seamless adaptation of your document to the existing formatting templates of conferences and journals. LaTeX, like many other widely used document editing systems, can be used in environments that support collaborative work.

The tutorial is primarily aimed at beginners who have no or little experience with the system. The goal is to teach the participants how to create a document in LaTeX from the ground up. We will show the core basics in document preparation as well as introduce bibliography management, but will also tailor the tutorial to the participants’ research needs.


A round table on the use of large language models in corpus-linguistic research, a crucial question for today’s corpus linguists that was identified during the first two stops of the CLASSLA-Express workshop series.


Networking of researchers from the CLASSLA knowledge centre for South Slavic languages and the ReLDI Centre Belgrade, intended for discussing future organisational, infrastructural and research directions of both organisations, as well as for general networking of researchers interested in the South Slavic language group.

Panels

Frontiers in Speech Communication Research

Chaired by Darinka Verdonik

The realm of speech communication research spans traditional linguistic disciplines and cutting-edge communication technologies, creating a rich tapestry of exploration and innovation. This panel convenes active researchers from computational linguistics, speech technologies, corpus linguistics and traditional linguistic disciplines to discuss the latest advancements and challenges, the motives that underpin their research goals, and how speech communication research can address the societal challenges that confront us today. Join us for a comprehensive journey through the frontiers of speech communication research, where theoretical insights meet practical applications, illuminating the future of this dynamic field.


Important dates

  • March 1, 2024: First call for papers
  • May 17, 2024: Deadline for abstract/paper submission
  • May 31, 2024: Extended deadline for abstract/paper submission
  • July 5, 2024: Notification of acceptance
  • August 23, 2024: Final abstract/paper submission
  • August 23, 2024: Registration deadline
  • September 18, 2024: Pre-conference events and workshops
  • September 19 & 20, 2024: JTDH 2024 Conference

Instructions for authors

Authors are invited to submit either a full paper or an extended abstract. Extended abstracts will be published in the book of abstracts and full papers in the conference proceedings, both of which will be published on the conference website under a Creative Commons license at the beginning of the conference. Authors may choose whether or not to anonymize their submissions.

The official languages of the conference are Slovene and English.

Full papers should contain 4000 to 6500 words, while extended abstracts should contain 2000 to 3000 words. For submissions in English, please use the Word template, the LaTeX template [.zip] or the Overleaf LaTeX template (Note that the .zip file and Overleaf template contain LaTeX templates for both Slovene and English). The Slovene Word template is available on the Slovene version of the site.

Please submit your paper on the EasyChair platform by clicking on this link.

Student authors of full papers should mark their submission as a student contribution by adding “student paper” to the list of keywords. All co-authors of student papers should be students (PhD or Master’s). These papers will be presented in a separate student session and will be eligible for the best student paper award.

Organisation

Organisation committee

  • Jerneja Fridl (OC chair, DARIAH-SI), Research Centre of the Slovenian Academy of Sciences and Arts (ZRC SAZU)
  • Mojca Šorn (DARIAH-SI), Institute of Contemporary History
  • Ana Cvek (DARIAH-SI), Institute of Contemporary History
  • Simon Dobrišek (CJVT), Faculty of Electrical Engineering, University of Ljubljana
  • Katja Meden (CLARIN.SI), Jožef Stefan Institute
  • Kaja Dobrovoljc (SDJT), Faculty of Arts, University of Ljubljana
  • Miha Peče (DARIAH-SI), Research Centre of the Slovenian Academy of Sciences and Arts (ZRC SAZU)
  • Miha Seručnik (DARIAH-SI), Research Centre of the Slovenian Academy of Sciences and Arts (ZRC SAZU)

Programme committee

Steering committee

  • Špela Arhar Holdt (chair, CJVT), Faculty of Arts and Faculty of Computer and Information Science, University of Ljubljana
  • Slavko Žitnik (SDJT), Faculty of Computer and Information Science, University of Ljubljana
  • Tomaž Erjavec (CLARIN.SI), Dept. of Knowledge Technologies, Jožef Stefan Institute
  • Jakob Lenardič (DARIAH-SI), Institute of Contemporary History
  • Matej Klemen (Student Section), Faculty of Computer and Information Science, University of Ljubljana
  • Tina Munda (Student Section), Faculty of Arts, University of Ljubljana
  • David Bordon (Student Section), Faculty of Arts, University of Ljubljana

Members of the programme committee

  • Saša Babič, Institute of Slovenian Ethnology, ZRC SAZU
  • Petra Bago, Faculty of Humanities and Social Sciences, University of Zagreb
  • Vuk Batanović, Innovation Center of the School of Electrical Engineering in Belgrade
  • Narvika Bovcon, Faculty of Computer and Information Science, University of Ljubljana
  • Václav Cvrček, Institute of the Czech National Corpus, Charles University in Prague
  • Jaka Čibej, Faculty of Computer and Information Science, University of Ljubljana
  • Simon Dobrišek, Faculty of Electrical Engineering, University of Ljubljana
  • Helena Dobrovoljc, Fran Ramovš Institute of the Slovenian Language, ZRC SAZU
  • Kaja Dobrovoljc, Faculty of Arts, University of Ljubljana
  • Jerneja Fridl, ZRC SAZU
  • Polona Gantar, Faculty of Arts, University of Ljubljana
  • Vojko Gorjanc, Faculty of Arts, University of Ljubljana
  • Jurij Hadalin, Institute of Contemporary History
  • Ivo Ipšić, University of Rijeka
  • Mateja Jemec Tomazin, Fran Ramovš Institute of the Slovenian Language, ZRC SAZU
  • Alenka Kavčič, Faculty of Computer and Information Science, University of Ljubljana
  • Iztok Kosem, Faculty of Arts, University of Ljubljana
  • Simon Krek, Faculty of Arts & Faculty of Computer and Information Science, University of Ljubljana
  • Drago Kunej, Institute of Ethnomusicology, ZRC SAZU
  • Nikola Ljubešić, Department of Knowledge Technologies, Jožef Stefan Institute
  • Nataša Logar, Faculty of Social Sciences, University of Ljubljana
  • Matija Marolt, Faculty of Computer and Information Science, University of Ljubljana
  • Sanda Martinčić Ipšić, University of Rijeka
  • Mirjam Sepesy Maučec, Faculty of Electrical Engineering and Computer Science, University of Maribor
  • Maja Miličević Petrović, University of Bologna
  • Dunja Mladenić, Artificial Intelligence Laboratory, Jožef Stefan Institute
  • Andrej Pančur, Institute of Contemporary History
  • Matevž Pesek, Faculty of Computer and Information Science, University of Ljubljana
  • Karmen Pižorn, Faculty of Education, University of Ljubljana
  • Senja Pollak, Department of Knowledge Technologies, Jožef Stefan Institute
  • Ajda Pretnar, Institute of Contemporary History
  • Marko Robnik Šikonja, Faculty of Computer and Information Science, University of Ljubljana
  • Tanja Samardžić, University of Zurich
  • Miha Seručnik, Milko Kos Historical Institute, ZRC SAZU
  • Marko Stabej, Faculty of Arts, University of Ljubljana
  • Janez Štebe, Faculty of Social Sciences, University of Ljubljana
  • Mojca Šorn, Institute of Contemporary History
  • Daniel Vasić, University of Mostar
  • Darinka Verdonik, Faculty of Electrical Engineering and Computer Science, University of Maribor
  • Jerneja Žganec Gros, Alpineon d.o.o.
  • Andrej Žgank, Faculty of Electrical Engineering and Computer Science, University of Maribor
  • Aleš Žagar, Faculty of Computer and Information Science, University of Ljubljana
  • Branko Žitko, Faculty of Science, University of Split