Field Matters

FieldMatters+SigTyp 2025 workshop program

Vienna, CEST (UTC+2)

August 1

08:50 – 09:00 Welcome (10 min)

Morning Plenary – Keynotes

Block 1: Language documentation, archives and algorithms

09:00 – 09:40 Keynote 1: Archives, Algorithms, and Alliances: Grounding NLP in the Realities of Language Documentation – Alexis Michaud (40 min)

09:40 – 10:00 Oral 1: Searchable Language Documentation Corpora: DoReCo meets TEITOK (20 min)

10:00 – 10:20 Oral 2: Automatic Phone Alignment of Code-switched Urum–Russian Field Data (20 min)

10:20 – 10:30 Buffer / transition (10 min)

10:30 – 11:00 Coffee break

Block 2: LLMs and multilinguality

11:00 – 11:40 Keynote 2: A Few Good Texts: How Small Sets of High-Quality Linguistic Data Power Massive Multilinguality in Language Models – Eduardo Sanchez (40 min)

11:40 – 13:00 Poster time (both workshops)

13:00 – 14:00 Lunch (on your own)

14:00 – 14:20 Oral 3: High-Dimensional Interlingual Representations of Large Language Models (20 min)

14:20 – 14:40 Oral 4: A Practical Tool to Help Automate Interlinear Glossing: A Study on Mukrī Kurdish (20 min)

Block 3: Typology and cross-linguistic data

14:40 – 15:20 Keynote 3: Connecting the Dots – Growing an Eco-system for Cross-linguistic Data – Robert Forkel (40 min)

15:20 – 15:40 Oral 5: Are Translated Texts Useful for Gradient Word Order Extraction? (20 min)

15:40 – 16:00 Coffee break

16:00 – 16:40 Keynote 4: (L)LMs and Language Theory – Lisa Bylinina (40 min)

16:40 – 17:00 Buffer / transition (20 min)

17:00 – 17:20 Oral 6: Beyond the Data: The Impact of Annotation Inconsistencies in UD Treebanks on Typological Universals and Complexity Assessment (20 min)

17:20 – 17:40 Oral 7: Beyond Cognacy (20 min)

17:40 – 18:00 Oral 8: Token-level Semantic Typology without a Massively Parallel Corpus (20 min)

18:00 The End