Field matters

Workshop on NLP Applications to Field Linguistics

Field Matters 2023

Field Matters 2022

Field Matters

Field linguistics plays a crucial role in the development of linguistic theory and universal language modelling, as it provides uncontested, the only way to obtain structural data about the rapidly diminishing diversity of natural languages.

The Field matters workshop aims to bring together the urgent needs of field linguists and the vast community of NLP practitioners, developing up-to-date NLP tools for easier, faster, more reliable data collection and annotation.

Invited speakers


Lane Schwartz (University of Alaska-Fairbanks)

Dr. Schwartz centers on computational linguistics for endangered languages, with a focus on St. Lawrence Island Yupik. This includes work in polysynthetic lang modelling, cognitively-motivated unsupervised grammar induction, compiler-based deep learning, and machine translation. He is one of the original developers of Joshua, an open source toolkit for tree-based statistical machine translation, and was a frequent contributor to Moses, the de-facto standard for phrase-based statistical machine translation. Lane already joined us at the last Workshop at COLING in 2022.

Emmanuel Schang (University of Orléans (France))

Dr. Schang is an expert in creole languages and their documentation. He is a the Primary Investigator of the CREAM project (machine-assisted creole languages documentation) and the coordinator of The International Research Group on Structure, Emergence and Evolution of Pidgin and Creole Languages.


Antonios Anastasopoulos (George Mason University)

Antonis Anastasopoulos is an assistant professor at George Mason Computer Science Natural Language Processing Group. His interests include various aspects of multilingual Natural Language Processing and Machine Learning, with the main focus being Machine Translation and Speech Recognition for endangered languages and low-resource settings in general. He completed his Computer Science PhD at the University of Notre Dame, with a dissertation on “Computational Tools for Endangered Language Documentation”. He has been involved with documentation efforts on Griko, an endangered Greek dialect spoken in south Italy. He co-organized the workshop on Language Technology for Language Documentation and Revitalization, hosted in Pittsburgh in August 2019.

Steven Bird (Charles Darwin University)

Steven Bird is conducting social and technological experiments in the future evolution of the world’s languages. Together with his students and colleagues, he is developing scalable methods for preserving disappearing words and worldviews for future generations of speakers and scholars. He is collaborating with speech communities in diasporas and ancestral homelands to design new approaches to language maintenance and revitalisation.

Steven studied computer science at the University of Melbourne before completing a PhD in computational linguistics at the University of Edinburgh. He has conducted fieldwork on endangered languages in West Africa, South America, Central Asia, Melanesia, and Australia. He has held academic positions at the Universities of Edinburgh, Pennsylvania, Melbourne, and UC Berkeley. He holds a secondary appointment as Senior Research Scientist at the International Computer Science Institute, UC Berkeley. He serves as Linguist at the Nawarddeken Academy in West Arnhem.

Steven is leading the Top End Language Lab

Program Committee (participants since ‘22)