LANGUAGE TECHNOLOGIES & DIGITAL HUMANITIES CONFERENCE 2024 | SDJT – Slovensko društvo za jezikovne tehnologije

September 19-20, 2024
Faculty of Electrical Engineering, University of Ljubljana

https://www.sdjt.si/jtdh-2024-en

🇸🇮 Slovenian page:

https://www.sdjt.si/jtdh-2024

The Slovenian Language Technologies Society (SDJT), the Centre for Language Resources and Technologies at the University of Ljubljana (CJVT), and the research infrastructures CLARIN.SI (Jožef Stefan Institute) and DARIAH-SI (Institute of Contemporary History and ZRC SAZU) organised the biennial conference “Language Technologies and Digital Humanities”. The conference has more than 20 years of tradition, and was thematically expanded in 2016 to include digital humanities. This year, the organising committee of the conference was led by ZRC SAZU. The event was organised in collaboration with the Faculty of Electrical Engineering of the University of Ljubljana, which hosted the event on September 19 and 20, 2024.

Conference contributions
Thematic areas
Timetable of the conference
Invited speakers
- Barbara McGillivray: “Exploring language change computationally: lessons from interdisciplinary collaborations”
- Simon Dobnik: “Beyond pixels and words”
Pre-conference events
Expert panel
Organisation
Photo gallery

Thematic areas of the conference

The conference aims to bring together researchers from various backgrounds and methodological frameworks. The main topics will include but are not limited to:

Speech and other mono- and multilingual language technologies
Digital linguistics: translation studies, corpus linguistics, lexicology and lexicography, standardisation
Digital humanities and historical studies, ethnology, literary studies, musicology, cultural heritage, archaeology, and fine arts
Digital humanities in education and digital publishing

We welcome submissions that present guidelines, research, good practices, projects and results in these areas. The conference will also include invited lectures, a student section, and roundtables on topics related to the conference. The official languages of the conference will be Slovene and English.

Registration

Registration form
Registration fee
- Regular attendees: 100 EUR
- Students, co-authors and participants without contribution: free of charge

Information for authors

Presentations:

Full paper presentations will have 15 minutes + 5 minutes for questions.
Extended abstract presentations will have 10 minutes + 5 minutes for questions.

Presentations should be in the same language as the paper.

Posters:

Maximum poster size is A0 (84.1 x 118.9 cm)
Poster layout: Portrait format

Due to the size of the poster holders (90 x 150 cm), landscape posters cannot be mounted.

Invited speakers

Barbara McGillivray

“Exploring language change computationally: lessons from interdisciplinary collaborations”

Abstract: Advanced computational methods allow us to analyse vast datasets and uncover previously inaccessible patterns. However, few natural language processing algorithms properly account for the dynamic nature of language, particularly its semantics, which is crucial to humanistic inquiry. Efforts are underway to improve AI systems’ understanding of historical context and language dynamics, such as in the automatic detection of semantic change, but human annotation and interpretation is still needed to capture the nuances of language and its cultural context. In this talk I will report on a collaborative project involving digital humanists, computational linguists, software engineers and library curators to analyse the effects of mechanisation on the English language of the nineteenth century. I will discuss the challenges and insights gained from combining voluntary crowdsourcing for historical language annotation with algorithms and design experiments. Integrating these approaches allows us to reach a nuanced understanding of language evolution in response to mechanization and, more broadly, contribute to interdisciplinary research at the intersection of AI and the humanities.

Bio: Barbara McGillivray is Lecturer in Digital Humanities and Cultural Computation in the Department of Digital Humanities of King’s College London and Turing fellow at The Alan Turing Institute. She is Editor in Chief of the Journal of Open Humanities Data and convenor of the MA programme in Digital Humanities at King’s, as well as president of the ACL Special Interest Group on language technologies for the socio-economic sciences and humanities and convenor of the Turing special interest group “Humanities and data science”. Her research focusses on computational methods for the study of language change in both historical languages and contemporary data. She has been co-Investigator of the Living with Machines project, a very large collaboration involving The Alan Turing Institute and the British Library and aimed at investigating the effects of mechanisation via the analysis of British historical newspaper collections. Her most recent book is “Applying Language Technology in Humanities Research. Design, Application, and the Underlying Logic” (co-authored with Gábor Mihály Tóth, Palgrave Macmillan 2020).

Simon Dobnik

“Beyond pixels and words”

Abstract: Words are not used in isolation. When we communicate we relate them to our background knowledge, the intent of interaction – what is the purpose of what we want to say, who is our partner, what has been said before – our common ground, our senses and perception of the physical world and situations around us. Speech is also not the only way to convey information: we interact in writing, symbols, with different kinds of texts, with eye-gaze, gestures and other properties of our bodies. Language models in language technology extract meaning primarily from text and sometimes a few other modalities such as images and acoustic signal. This poses two questions: (i) to what extent can these modalities be a proxy for representing semantic knowledge for different natural language processing tasks and applications; and (ii) how can we port semantic knowledge captured in these modalities to different modalities – how can we bring large language models to the real world and take them for a walk? In this talk I will describe our research towards answering these questions and outline the challenges ahead.

Bio: Simon Dobnik is a Professor of Computational Linguistics at the Department of Philosophy, Linguistics and Theory of Science (FLoV) at the University of Gothenburg, Sweden. He is a member of the Centre for Language Technology (CLT) and the Centre for Linguistic Theory and Studies in Probability (CLASP) where he leads the Cognitive Systems research group. He has worked on (i) data models and machine learning of meaning representations for language, action and perception, (ii) semantic models for language, action and perception (computational semantics), (iii) representation learning in language, inference and interpretability, (iv) interpretation and generation of spatial descriptions and reference, (v) interactive learning with small data, (vi) data bias and privacy, and (vii) multimodal dialogue, robotics and related topics.

Pre-conference events

Two pre-conference events will be held on the 18th September 2024:

Preliminary schedule:

09:00-13:00 CLASSLA-Express

09:00-13:00 An Introduction to LaTeX for Humanities Scholars

14:00-15:00 Round table on LLMs and corpus linguistics

15:00-17:00 Networking of researchers from the South Slavic space and the ReLDI and CLASSLA centres

The CLASSLA-Express workshop:

The final stop of CLASSLA-Express – a series of workshops on investigating South Slavic corpora using CLARIN.SI concordancers (Skopje TBA, Zagreb 19 Apr, Rijeka 26 Apr, Belgrade 29 May, Ljubljana 18 Sep). The workshop will be conducted in English, and registration is required to participate. More information about the workshop can be found here: https://www.clarin.si/info/wp-content/uploads/2024/05/Call-for-participation_CLASSLA-Express_LJ.docx.pdf.

No More Document-Editing Nightmares: An Introduction to LaTeX for Humanities Scholars

This hands-on tutorial serves as an introduction to LaTeX, which is a high-quality typesetting system that is being increasingly used and promoted as a document editing program in digital humanities and linguistics. LaTeX facilitates efficient bibliography management, as it makes the handling of complex document structure easy by continuously and automatically updating in-document references. It also enables the seamless adaptation of your document to the existing formatting templates of conferences and journals. LaTeX, like many other widely used document editing systems, can be used in environments that support collaborative work.

The tutorial is primarily aimed at beginners who have no or little experience with the system. The goal is to teach the participants how to create a document in LaTeX from the ground up. We will show the core basics in document preparation as well as introduce bibliography management, but will also tailor the tutorial to the participants’ research needs.

A round table on the usage of large language models in corpus-linguistic research – a crucial question of today’s corpus linguists identified during the first two stops of the CLASSLA-Express workshop.

Networking of researchers from the CLASSLA knowledge centre for South Slavic languages and the ReLDI Centre Belgrade, intended for discussing future organisational, infrastructural and research directions of both organisations, as well as for general networking of researchers interested in the South Slavic language group.

Expert panel

Frontiers in Speech Communication Research

Chaired by Darinka Verdonik

The realm of speech communication research spans traditional linguistic disciplines and cutting-edge communication technologies, creating a rich tapestry of exploration and innovation. This panel convenes active researchers from computational linguistics, speech technologies, corpus linguistics and traditional linguistic disciplines to discuss the latest advancements and challenges, the motives that underpin their research goals, and how speech communication research can address the societal challenges that confront us today. Join us for a comprehensive journey through the frontiers of speech communication research, where theoretical insights meet practical applications, illuminating the future of this dynamic field.

The panel will be multilingual; speakers may use their mother tongue or English.

Panel speakers:

- Simon Dobnik, University of Gothenburg (computational models of language and perception, human-robot interaction, situated spoken dialogue systems, and computational representations of meaning)
- Tanja Samardžić, University of Zurich (computational linguistics, computational text processing methods)
- Nikola Ljubešić, IJS (natural language processing, computational linguistics and computational social science)
- Andrej Žgank, UM (automatic speech recognition, speech signal processing)
- Danila Zuljan Kumar, ZRC SAZU (dialectology)
- Kaja Dobrovoljc, UL (corpus linguistics)

Important dates

~~March 1, 2024: First call for papers~~
~~May 17, 2024: Deadline for abstract/paper submission~~
~~May 31, 2024: Extended deadline for abstract/paper submission~~
~~July 5, 2024: Notification of acceptance~~
~~August 23, 2024: Final abstract/paper submission~~
~~September 8, 2024: Registration deadline~~
~~September 18, 2024: Pre-conference events and workshops~~
~~September 19 & 20, 2024: JTDH 2024 Conference~~

Instructions for authors

The authors are invited to submit either a full paper or an extended abstract. The extended abstract will be published in the book of abstracts and the full papers in the conference proceedings, both of which will be published on the conference website under the Creative Commons license at the beginning of the conference. We leave it up to the authors whether to submit their contributions anonymized or not.

The official languages of the conference are Slovene and English.

Full papers should contain 4000 to 6500 words, while extended abstracts should contain 2000 to 3000 words. For submissions in English, please use the Word template, the LaTeX template [.zip] or the Overleaf LaTeX template (Note that the .zip file and Overleaf template contain LaTeX templates for both Slovene and English). The Slovene Word template is available on the Slovene version of the site.

Please submit your paper on the EasyChair platform by clicking on this link.

The student authors of (full) papers should indicate if it is a student contribution by adding “student paper” to the list of keywords. All the co-authors of student papers should be students (PhD, Master’s). These papers will be presented in a separate student session and will be eligible for the best student paper award.

Conference venue

The pre-conference workshops (18 September 2024) and the conference (19-20 September 2024) will be held at the Faculty of Electrical Engineering, Tržaška cesta 25, 1000 Ljubljana.

How to get there?

BicikeLJ
Public transport (LPP bus): get off at the Hajdrihova bus stop – lines 1, 6 and 6b
The conference venue is about 20 minutes’ walk from the city centre.

Wednesday 18 September 2024 (pre-conference workshops):

From 13:00 to 14:00 lunch will be served in the UL FE Restaurant.

Thursday 19 and Friday 20 September 2024

Coffee breaks and lunches: Lunches and coffee breaks will be served on site
Lunch will be served daily from 12:30 to 14:00 in the UL FE Restaurant
Conference Dinner: Thursday 19:00 to 21:00 at the Breg Restaurant

Organisation

Organisation committee

Jerneja Fridl (OC chair, DARIAH-SI), Research Centre of the Slovenian Academy of Sciences and Arts (ZRC SAZU)
Mojca Šorn (DARIAH-SI), Institute of Contemporary History
Ana Cvek (DARIAH-SI), Institute of Contemporary History
Simon Dobrišek (CJVT), Faculty of Electrical Engineering, University of Ljubljana
Katja Meden (CLARIN.SI), “Jožef Stefan” Institute
Kaja Dobrovoljc (SDJT), Faculty of Arts, University of Ljubljana
Miha Peče (DARIAH-SI), Research Centre of the Slovenian Academy of Sciences and Arts (ZRC SAZU)
Miha Seručnik (DARIAH-SI), Research Centre of the Slovenian Academy of Sciences and Arts (ZRC SAZU)

Programme committee

Steering committee

Špela Arhar Holdt (chair, CJVT), Faculty of Arts and Faculty of Computer and Information Science, University of Ljubljana
Slavko Žitnik (SDJT), Faculty of Computer and Information Science, University of Ljubljana
Tomaž Erjavec (CLARIN.SI), Dept. of Knowledge Technologies, Jožef Stefan Institute
Jakob Lenardič (DARIAH-SI), Institute of Contemporary History
Matej Klemen (Student Section), Faculty of Computer and Information Science, University of Ljubljana
Tina Munda (Student Section), Faculty of Arts, University of Ljubljana
David Bordon (Student Section), Faculty of Arts, University of Ljubljana

Members of the programme committee

Saša Babič, Institute of Slovenian Ethnology, ZRC SAZU
Petra Bago, Faculty of Humanities and Social Sciences, University of Zagreb
Vuk Batanović, Innovation Center of the School of Electrical Engineering in Belgrade
Narvika Bovcon, Faculty of Computer and Information Science, University of Ljubljana
Václav Cvrček, Institute of the Czech National Corpus, Charles University in Prague
Jaka Čibej, Faculty of Computer and Information Science, University of Ljubljana
Simon Dobrišek, Faculty of Electrical Engineering, University of Ljubljana
Helena Dobrovoljc, Fran Ramovš Institute of the Slovenian Language, ZRC SAZU
Kaja Dobrovoljc, Faculty of Arts, University of Ljubljana
Jerneja Fridl, ZRC SAZU
Polona Gantar, Faculty of Arts, University of Ljubljana
Vojko Gorjanc, Faculty of Arts, University of Ljubljana
Jurij Hadalin, Institute of Contemporary History
Ivo Ipšić, University of Rijeka
Mateja Jemec Tomazin, Fran Ramovš Institute of the Slovenian Language, ZRC SAZU
Alenka Kavčič, Faculty of Computer and Information Science, University of Ljubljana
Iztok Kosem, Faculty of Arts, University of Ljubljana
Simon Krek, Faculty of Arts & Faculty of Computer and Information Science, University of Ljubljana
Drago Kunej, Institute of Ethnomusicology, ZRC SAZU
Nikola Ljubešić, Department of Knowledge Technologies, Jožef Stefan Institute
Nataša Logar, Faculty of Social Sciences, University of Ljubljana
Matija Marolt, Faculty of Computer and Information Science, University of Ljubljana
Sanda Martinčić Ipšić, University of Rijeka
Mirjam Sepesy Maučec, Faculty of Electrical Engineering and Computer Science, University of Maribor
Maja Miličević Petrović, University of Bologna
Dunja Mladenić, Artificial Intelligence Laboratory, Jožef Stefan Institute
Andrej Pančur, Institute of Contemporary History
Matevž Pesek, Faculty of Computer and Information Science, University of Ljubljana
Karmen Pižorn, Faculty of Education, University of Ljubljana
Senja Pollak, Department of Knowledge Technologies, Jožef Stefan Institute
Ajda Pretnar, Institute of Contemporary History
Marko Robnik Šikonja, Faculty of Computer and Information Science, University of Ljubljana
Tanja Samardžić, University of Zurich
Miha Seručnik, Milko Kos Historical Institute, ZRC SAZU
Marko Stabej, Faculty of Arts, University of Ljubljana
Janez Štebe, Faculty of Social Sciences, University of Ljubljana
Mojca Šorn, Institute of Contemporary History
Daniel Vasić, University of Mostar
Darinka Verdonik, Faculty of Electrical Engineering and Computer Science, University of Maribor
Jerneja Žganec Gros, Alpineon d.o.o.
Andrej Žgank, Faculty of Electrical Engineering and Computer Science, University of Maribor
Aleš Žagar, Faculty of Computer and Information Science, University of Ljubljana
Branko Žitko, Faculty of Science, University of Split
