Field matters

Workshop on NLP Applications to Field Linguistics

Field Matters

Field linguistics plays a crucial role in the development of linguistic theory and universal language modelling, as it is indisputably the only way to obtain structural data about the rapidly diminishing diversity of natural languages.

The Field Matters workshop aims to bring together field linguists, with their urgent needs, and the vast community of NLP practitioners in order to develop up-to-date NLP tools for easier, faster, and more reliable data collection and annotation.

Invited speakers


Emily Prud’hommeaux (Boston College)

Dr. Emily Prud’hommeaux is a researcher working on NLP in low-resource settings, with a particular focus on endangered languages and the language of individuals with conditions impacting communication and cognition. Her latest research includes ASR for low-resource languages, as well as for field data. She is the Gianinno Family Sesquicentennial Assistant Professor in the Department of Computer Science at Boston College.

Genta Indra Winata (Bloomberg LP)

Genta Indra Winata is a Senior Research Scientist at Bloomberg LP. His primary interests include language models, multilingual and cross-lingual NLP, code-switching, dialogue systems, and speech. His research includes several projects on NLP for the languages of South-East Asia. Together with Alham Fikri Aji, he chaired the workshop on SEA NLP held at AACL in 2023 and gave an ACL tutorial on the Current Status of NLP in South East Asia.

Alham Fikri Aji (MBZUAI)

Alham Fikri Aji is an assistant professor at MBZUAI. His research focuses on multilingual and low-resource NLP, particularly for Indonesian and the languages of South-East Asia. Together with Genta Indra Winata, he chaired the workshop on SEA NLP held at AACL in 2023 and gave an ACL tutorial on the Current Status of NLP in South East Asia.


Lane Schwartz (University of Alaska-Fairbanks)

Dr. Schwartz's research centers on computational linguistics for endangered languages, with a focus on St. Lawrence Island Yupik. This includes work in polysynthetic language modelling, cognitively-motivated unsupervised grammar induction, compiler-based deep learning, and machine translation. He is one of the original developers of Joshua, an open source toolkit for tree-based statistical machine translation, and was a frequent contributor to Moses, the de-facto standard for phrase-based statistical machine translation. Lane previously joined us at the workshop held at COLING in 2022.

Emmanuel Schang (University of Orléans, France)

Dr. Schang is an expert in creole languages and their documentation. He is the Primary Investigator of the CREAM project (machine-assisted creole languages documentation) and the coordinator of the International Research Group on Structure, Emergence and Evolution of Pidgin and Creole Languages.


Antonios Anastasopoulos (George Mason University)

Antonios Anastasopoulos is an assistant professor in the Natural Language Processing Group of the Computer Science department at George Mason University. His interests include various aspects of multilingual Natural Language Processing and Machine Learning, with a main focus on Machine Translation and Speech Recognition for endangered languages and low-resource settings in general. He completed his Computer Science PhD at the University of Notre Dame, with a dissertation on “Computational Tools for Endangered Language Documentation”. He has been involved in documentation efforts on Griko, an endangered Greek dialect spoken in southern Italy. He co-organized the workshop on Language Technology for Language Documentation and Revitalization, hosted in Pittsburgh in August 2019.

Steven Bird (Charles Darwin University)

Steven Bird is conducting social and technological experiments in the future evolution of the world’s languages. Together with his students and colleagues, he is developing scalable methods for preserving disappearing words and worldviews for future generations of speakers and scholars. He is collaborating with speech communities in diasporas and ancestral homelands to design new approaches to language maintenance and revitalisation.

Steven studied computer science at the University of Melbourne before completing a PhD in computational linguistics at the University of Edinburgh. He has conducted fieldwork on endangered languages in West Africa, South America, Central Asia, Melanesia, and Australia. He has held academic positions at the Universities of Edinburgh, Pennsylvania, Melbourne, and UC Berkeley. He holds a secondary appointment as Senior Research Scientist at the International Computer Science Institute, UC Berkeley. He serves as Linguist at the Nawarddeken Academy in West Arnhem.

Steven leads the Top End Language Lab.

Program Committee (participants since ’22)