Field linguistics plays a crucial role in the development of linguistic theory and universal language modelling, as it is uncontestedly the only way to obtain structural data about the rapidly diminishing diversity of natural languages.
The Field Matters workshop aims to bring together field linguists, with their urgent needs, and the vast community of NLP practitioners, in order to develop up-to-date NLP tools for easier, faster, and more reliable data collection and annotation.
The workshop will take place at EACL 2023.
Lane Schwartz, Associate Professor of Computer Science, University of Alaska-Fairbanks
Dr. Schwartz's research centers on computational linguistics for endangered languages, with a focus on St. Lawrence Island Yupik. This includes work in polysynthetic language modelling, cognitively-motivated unsupervised grammar induction, compiler-based deep learning, and machine translation. He is one of the original developers of Joshua, an open source toolkit for tree-based statistical machine translation, and was a frequent contributor to Moses, the de facto standard for phrase-based statistical machine translation. Lane previously joined us at the last workshop, held at COLING in 2022.
Emmanuel Schang, Associate Professor in the Department of Linguistics, University of Orléans (France)
Dr. Schang is an expert in creole languages and their documentation. He is the Primary Investigator of the CREAM project (machine-assisted creole languages documentation) and the coordinator of the International Research Group on Structure, Emergence and Evolution of Pidgin and Creole Languages.
is an NLP researcher. Oleg is currently writing his PhD thesis at HSE University. His main interests are ASR and language modelling for under-resourced languages, and the linguistic interpretation of language models. He co-organized Field Matters 2022, SIGTYP 2021, and the LowResourceEval 2021 shared tasks on ASR for under-resourced languages.
(Institute of Linguistics RAS, HSE University,
is a PhD student at HSE. Her main interests are Tungusic languages, which she has been studying during her fieldwork, and low-resource NLP. She co-organized the SIGMORPHON 2020-2021 shared tasks on morphological reinflection, the LowResourceEval 2019 and 2021 shared tasks on NLP for field linguistic data, and the SIGTYP 2021 shared task on ASR for under-resourced languages.
(Indiana University, HSE University,
has extensive experience in processing under-resourced languages. His research interests include modelling the grammar of polysynthetic languages and the application of finite-state methods to NLP. Francis is one of the core contributors to the Apertium machine translation project. He has experience co-organizing workshops and shared tasks, including SIGMORPHON 2020-2021 and CoNLL 2018.
is an expert in multidisciplinary applications of NLP and the methodology of science. Her research focuses on the evaluation of language models, following the topic of her PhD.
(University of Melbourne,
is a Lecturer and a Postdoctoral Fellow at the University of Melbourne. Her research focuses on the modelling of morphology and computational approaches to linguistic typology. She is the president of SIGTYP and co-organized the SIGTYP 2019-2021 workshops and the SIGMORPHON 2017-2021 shared tasks on morphological reinflection.
Éric Le Ferrand
(Université Grenoble Alpes, Université d’Orléans,
is a post-doctoral researcher. His main interest is speech recognition for under-resourced languages. His current research project explores modern speech recognition methods applied to Creole languages. He built linguistic fieldwork expertise during his PhD, deploying speech recognition-based transcription tools in Australian Aboriginal Country.
is a researcher pursuing her master's degree in NLP. Her research interests are mainly computational and quantitative approaches to language description and modelling.
is an independent researcher. Her research interests focus on field linguistics and the documentation of under-resourced languages, particularly Tungusic languages.
is a student-researcher at HSE University. Her research interests include approaches to automatic speech processing, especially for field speech, and the evaluation of the quality of such processing.
We invite both archival and non-archival submissions. Non-archival submissions are 2-page abstracts that may present already published work or work in progress. Archival submissions should be either 4 or 8 pages long.
Dual submissions with the main conference are allowed, but authors must declare dual submission by entering the paper’s main conference submission ID. The reviews from the main conference will be automatically forwarded to the workshop and taken into consideration when your paper is evaluated. Authors of dual-submission papers accepted to the main conference should withdraw them from the workshop by March 13.
Papers posted to preprint servers such as arXiv can be submitted without any restrictions on when they were posted.
Authors of accepted archival papers should upload the final version of their paper to the submission system by the camera-ready deadline. Authors may use one extra page to address reviewer comments, for a total of nine pages.
Field Matters 2023 adheres to the ACL Anti-Harassment Policy.
We encourage diversity in all forms. The workshop organizers make freedom of thought and expression, as well as respectful scientific debate, their top priority. As an organizing team, we are committed to the principles of gender and sociodemographic diversity and are guided by these principles in forming the workshop team, including the selection of invited speakers and the PC. We will also make sure that the ACL Anti-Harassment Policy is respected during the organization and execution of the event.