Workshop Program

Oral presentations are allotted 20 minutes, followed by 8-9 minutes for questions and answers and 1-2 minutes for the transition between speakers. This gives oral presentations 5 minutes more than at many ACL conferences. We strongly encourage presenters to use those minutes to provide background that will make their talks more accessible to audience members from the target communities outside their own (computational linguists, documentary linguists, non-academic community members).

All times are Dublin time, GMT+1

May 26

9:00-9:30
Welcome and Introduction
9:00-10:30
Session A
9:30-10:00
Learning Through Transcription
Mat Bettinson, Steven Bird
10:00-10:30
CLD²: Language Documentation Meets Natural Language Processing for Revitalising Endangered Languages
Roberto Zariquiey, Arturo Oncevay, Javier Vera
10:30-11:00
Break
11:00-12:30
Session B
11:00-11:30
Developing a Part-Of-Speech tagger for te reo Māori
Aoife Finn, Peter-Lucas Jones, Keoni Mahelona, Suzanne Duncan, Gianna Leoni
11:30-12:00
Using Graph-Based Methods to Augment Online Dictionaries of Endangered Languages
Khalid Alnajjar, Mika Hämäläinen, Niko Tapio Partanen, Jack Rueter
12:00-12:30
A Word-and-Paradigm Workflow for Fieldwork Annotation
Maria Copot, Sara Court, Noah Diewald, Stephanie Antetomaso, Micha Elsner
12:30-2:00
Lunch
2:00-3:00
Session C
2:00-2:30
Closing the NLP Gap: Documentary Linguistics and NLP Need a Shared Software Infrastructure
Luke Gessler
2:30-3:00
Automated Speech tools for helping communities process restricted-access corpora for language revival efforts
Nay San, Martijn Bartelds, Tolulope Ogunremi, Alison Mount, Ruben Thompson, Michael Higgins, Roy Barker, Jane Helen Simpson, Dan Jurafsky
3:00-3:30
Break
3:30-5:00
Poster Session
Shallow Parsing for Nepal Bhasa Complement Clauses
Borui Zhang, Abe Kazemzadeh, Brian Reese

Can We Use Word Embeddings for Enhancing Guarani-Spanish Machine Translation?
Santiago Góngora, Nicolás Giossa, Luis Chiruzzo

Morphologically annotated corpora of Pomak
Ritván Jusúf Karahóǧa, Panagiotis G. Krimpas, Vivian Stamou, Vasileios Arampatzakis, Dimitrios Karamatskos, Vasileios Sevetlidis, Nikolaos Constantinides, Nikolaos Kokkas, George Pavlidis, Stella Markantonatou

Using Speech and NLP Resources to build an iCALL platform for a minority language: the story of An Scéalaí, the Irish experience to date
Neasa Ní Chiaráin, Oisín Nolan, Madeleine Comtois, Neimhin Robinson Gunning, Harald Berthelsen, Ailbhe Ní Chasaide

Enhancing Documentation of Hupa with Automatic Speech Recognition
Zoey Liu, Justin Spence, Emily Tucker Prud'hommeaux

Recovering Text from Endangered Languages Corrupted PDF documents
Nicolas Stefanovitch

Bigalgu-Na’hik: The Language Basket. A collaborative application to support the learner-oriented collection, management, and exchange of language materials for Gathang and Ktunaxa
Kathrin Kaiser, Gulwanyang Moran

Reusing a Multi-lingual Setup to Bootstrap a Grammar Checker for a Very Low Resource Language without Data
Inga Lill Sigga Mikkelsen, Linda Wiechetek, Flammie A Pirinen

Using LARA to create image-based and phonetically annotated multimodal texts for endangered languages
Branislav Bédi, Hakeem Beedar, Belinda Chiera, Nedelina Ivanova, Christèle Maizonniaux, Neasa Ní Chiaráin, Manny Rayner, John Sloan, Ghil'ad Zuckermann

New syntactic insights for automated Wolof Universal Dependency parsing
Bill Dyer

Faoi Gheasa: an adaptive game for Irish language learning
Liang Xu, Elaine Uí Dhonnchadha, Monica Ward

Development of the Siberian Ingrian Finnish Speech Corpus
Ivan Ubaleht, Taisto-Kalevi Raudalainen



May 27

9:00-10:30
Session E
9:00-9:30
Fine-tuning pre-trained models for Automatic Speech Recognition: experiments on a fieldwork corpus of Japhug (Trans-Himalayan family)
Séverine Guillaume, Guillaume Wisniewski, Cécile Macaire, Guillaume Jacques, Alexis Michaud, Benjamin Galliot, Maximin Coavoux, Solange Rossato, Minh-Châu Nguyên, Maxime Fily
9:30-10:00
Challenges and Perspectives for Innu-Aimun within Indigenous Language Technologies
Antoine Cadotte, Tan Le Ngoc, Mathieu Boivin, Fatiha Sadat
10:00-10:30
Gᵢ2Pᵢ: Rule-based, index-preserving grapheme-to-phoneme transformations
Aidan Pine, Patrick William Littell, Eric Joanis, David Huggins-Daines, Christopher D Cox, Fineen Davis, Eddie Antonio Santos, Shankhalika Srikanth, Delasie Torkornoo, Sabrina Yu
10:30-11:00
Break
11:00-12:00
Session F
11:00-11:30
One Wug, Two Wug+s: Transformer Inflection Models Hallucinate Affixes
Farhan Samir, Miikka Silfverberg
11:30-12:00
Corpus Development of Kiswahili
Kathleen Siminyu, Kibibi Mohamed Amran, Abdulrahman Ndegwa Karatu, Mnata Resani, Mwimbi Makobo Junior, Rebecca Ryakitimbo, Britone Mwasaru
12:00-1:30
Lunch
1:30-3:00
Special Session A
Balancing long-term technological interventions vs. immediate community needs
The Labrador Languages Preservation Database: Building a Language Archive
Nicholas Welch

A Partnership for Building Verb Conjugators for an Iroquoian Language
Anna Kazantseva, Aidan Pine, Roland Kuhn, Akwiratékha Martin, Owennatékha Brian Maracle, Rohahí:yo Jordan Brant, Patrick William Littell

A Collaborative Approach to Developing Language Technology Interventions for Endangered Languages
Harshita Diddee, Ishani Mondal, Pamir Gogoi, Anurag Shukla, Ananya Saxena, Kalika Bali, Vivek Seshadri, Monojit Choudhury, Tanuja Ganu, Manu Chopra, Dripta Piplai Mondal, Bornini Lahiri, Manjira Sinha, Sebastin Santy, Devansh Mehta
3:00-3:30
Break
3:30-5:00
Special Session B
Better integration of technology and ongoing revitalization efforts in community; Efficiently aligning and catalyzing language pedagogy design and curriculum development


How Language Data Comes to Be: Understanding Issues and Problems in Endangered Indigenous Language Datasets
Andrew Cowell, Sarah Moeller, Changbing Yang

Developing Optical Character Recognition for Kwak'wala
Daisy Rosenblum, Shruti Rijhwani, Michayla Giesbrecht, Antonios Anastasopoulos, Graham Neubig

Michif and the challenges of natural language generation in under-documented languages
Patrick William Littell, Fineen Davis, Heather Souter, Delaney Alexa Lothian

êkosi ê-nêhiyawi-pîkiskwêcik maskwacîsihk – Towards a Spoken Dictionary of Maskwacîs Cree
Antti Arppe, Atticus Galvin Harrigan, Katherine Schmirler, Daniel Dacanay, Jolene Poulin, Rose Makinaw
5:00-5:15
Closing Remarks