Language Documentation

Resources & Languages

Linguistics resources for students and language enthusiasts are few and far between. They are often hard to read, hard to access, and at times expensive to purchase: on average they cost over $200 each. Many linguistics textbooks use inaccessible terminology or are out of print.

Lost Worlds Institute, along with preserving endangered languages, seeks to bridge the gap between high school and academia in linguistics. We do that through our youth-led model and speaker forums, which connect dozens of students with professors. We are also launching our own resources: the papers we produce and a more accessible series we call "Linguistics Care Kits."

Below, take a look at the care kits our team has built out so far. More to come soon!

6 Active Languages · 5 Grammar Papers · 120+ Pages Written · 26,000+ Words
Status: Critically Endangered · Endangered · Revitalizing
Six Active Research Languages

The Languages We Research

1

Taa

ǃKhong · ǃXoon
Status
Critically Endangered
Location
Botswana · Namibia
Family
Khoisan (Tuu)
Speakers
~1,000

Taa is often cited as having the largest phoneme inventory of any known language: by some analyses, over 100 distinct consonants, including more than 20 click consonants. Clicks (written with symbols such as ǃ, ǂ, ǁ, ǀ) are produced by trapping a pocket of air between two closures in the mouth, lowering the tongue to create suction, and then releasing the front closure. They function as ordinary consonants. The language is also tonal: pitch distinguishes the meanings of words. Spoken by San communities of the Kalahari in Botswana and Namibia, it faces severe pressure from Tswana and Afrikaans.

2

Yuchi

Euchee · Tsoyaha
Status
Critically Endangered
Location
Oklahoma, USA
Family
Language Isolate
Speakers
~5

Yuchi is one of the world’s rarest linguistic phenomena: a complete language isolate with no known relatives anywhere on Earth. It cannot be shown to share a common ancestor with any other language family. Yuchi marks gender on verbs (not nouns), and its verbs encode subject, object, spatial orientation, and direction of motion simultaneously. The Yuchi people - the Tsoyaha (“People of the Sun”) - were originally from Tennessee and Georgia before forced removal to Oklahoma.

3

Warlpiri

Walpiri · Warlbiri
Status
Endangered
Location
Northern Territory, AU
Family
Pama-Nyungan
Speakers
~3,000

Warlpiri is non-configurational: word order is nearly free because grammatical roles are marked by case suffixes rather than by position in the sentence. It has a rich case system and complex tense-aspect-mood morphology. Disrupted transmission between generations has been a primary driver of its decline, and an emergent contact language, Light Warlpiri, is developing among children. Our Warlpiri Grammar paper, one of our most detailed, was praised by mentor Kevin Morand (UMass) as “very thorough and well-written.”

4

Manx

Gaelg · Gailck
Status
Revitalizing
Location
Isle of Man
Family
Celtic (Goidelic)
Speakers
~2,000

In 1974, Ned Maddrell - Manx’s last native speaker - died, and the language was declared extinct. Then a community decided to bring it back. Using historical recordings and old texts, the Isle of Man government and local advocates launched a revival that has now produced hundreds of fluent speakers and thousands more at various proficiency levels. Children today grow up attending Bunscoill Ghaelgagh, the Manx-medium primary school. LWI studies Manx as a model of successful revitalization. Our research team presented at the 2026 LWI Conference on Endangered Languages at Harvard University - examining Manx phonology, orthography, morphology, and VSO syntax.

5

Wampanoag

Wôpanâak · Massachusett
Status
Endangered
Location
Massachusetts, USA
Family
Algonquian
Speakers
Active Revival

Wôpanâak (Wampanoag/Massachusett) is an Algonquian language of Massachusetts, spoken by the people who met the Mayflower. Population loss and community fragmentation drove its original decline. After going unspoken as a first language for over a century, it is now undergoing one of the most ambitious revivals in American history - led by Jessie Little Doe Baird, a Wampanoag Nation member and MacArthur Fellow. Today, children in the community grow up speaking Wôpanâak as a first language again.

6

Wukchumni

Wikchamni · Yokuts
Status
Critically Endangered · Newest Initiative
Location
San Joaquin Valley, CA
Family
Yokutsan (Tule-Kaweah)
Speakers
Revitalizing

Wukchumni (also spelled Wikchamni) is a Yokuts language belonging to the Tule–Kaweah subgroup of the Foothill branch, traditionally spoken by the Wikchamni people of south-central California in the San Joaquin Valley and Sierra Nevada foothills. It is closely related to Yawelmani and Chukchansi within the Yokuts family. Like many California Indigenous languages, Wukchumni declined severely in the 20th century as English became dominant in education and employment. Marie Wilcox, who learned Wukchumni as a first language from her grandparents, became the last fluent elder speaker and dedicated decades to creating the first comprehensive Wukchumni dictionary and audio recordings before her passing. Her daughter Jennifer Malone continues this revival work today.

Linguistically, Wukchumni is remarkable for its complex ablaut system (internal vowel shifts that mark grammatical distinctions), a three-way distinction among stop consonants, and intricate verbal morphology in which a single verb base generates up to nine distinct stem types. Verbs are primarily suffixing, and a single lexical verb can encode tense, aspect, voice, valency, and other grammatical categories simultaneously. Wukchumni is the newest of LWI’s language research initiatives.

Research With Us

Join a Language Documentation Team

If you're passionate about any of these languages or communities, we want you on our team. No prior linguistics experience required - just dedication and curiosity. Research members are paired with PhD mentors from our partner universities.
