Speaking, Talking, Telling

Spoken Language and Text Corpora


Bislama, Vanuatu

Bislama, also known by its earlier French name, bichelamar, is a creole language and one of the official languages of Vanuatu. It is the first language of many urban ni-Vanuatu (those who live in Port Vila and Luganville) and the second language of much of the rest of the country's population. The image here is of the late Kalsarap Namaf, who is recorded below telling a story in Bislama.

Cook Islands Māori, Southern Cook Islands and diaspora populations, mainly in New Zealand and Australia

Cook Islands Māori is an East Polynesian language originating in the Southern Cook Islands. Its closest relatives are the other varieties of Cook Islands Māori found in the Northern Cook Islands, New Zealand Māori, and Tahitian. Most speakers now live in New Zealand and Australia. Few children are currently learning the language, except in the Pā ꞌEnua (the smaller islands outside of Rarotonga) of the Cook Islands, where the language is strongest. Cook Islands Māori has a great literary tradition.

Dalabon, Australia

Dalabon is spoken in central Arnhem Land by a dwindling population, now reduced to fewer than half a dozen fluent speakers, although many middle-aged people and young adults understand the language to varying extents. It belongs to the Gunwinyguan language family.

Matukar Panau, Papua New Guinea

Matukar Panau is an Oceanic language spoken near Madang in Madang Province, Papua New Guinea. Since 2010, the language has been documented in an ongoing project led by Danielle Barth and community members. Linguistic work can be found under resources.

Murrinhpatha, Australia

Murrinhpatha is an Australian Aboriginal language spoken in a region of tropical savannah and tidal inlets on the north coast of the continent.

Nafsan (South Efate), Efate, Vanuatu

The Nafsan language, also known as South Efate, is a Southern Oceanic language spoken on the island of Efate in central Vanuatu. As of 2005, there were approximately 6,000 speakers living in coastal villages from Pango to Eton.

Nen, Papua New Guinea

Nen (NQN), also known as Nen Ym or Nen Zi, is a language of the Yam or Morehead-Maro Family of Southern New Guinea, spoken in the Morehead District, Western Province, Papua New Guinea.

Nmbo, Papua New Guinea

Warlpiri, Australia

ANNIS Corpus Viewer

ANNIS is an open-source, cross-platform (Linux, Mac, Windows), web browser-based search and visualization architecture for complex multi-layer linguistic corpora with diverse types of annotation. ANNIS, which stands for ANNotation of Information Structure, was originally designed to provide access to the data of SFB 632, “Information Structure: The Linguistic Means for Structuring Utterances, Sentences and Texts”.
