Language documentation index

The map below is built on information produced by a group of linguists working in Vanuatu. It is a sample documentation index that visualises what is known about each language. Note that this is not a language vitality index of the kind outlined in Harmon and Loh (2010). Leaving aside thorny questions of what constitutes a language and a language name (see Good and Cysouw 2013), and choosing to use a given set of language names (one not limited to ISO 639-3), this exercise produced a map of the languages of Vanuatu. Each language is assigned an index number on a 21-point scale (0-20), with up to 5 points for each of four categories: Grammar; Lexicon; Texts; Media corpus. The icons are colour-coded (white = 0; red = 1-5; purple = 6-10; yellow = 11-15; green = 16-20). 54 languages in this list have a zero rating, indicating that virtually nothing is known about those languages.
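To make the arithmetic concrete, here is a minimal sketch of how an index number and icon colour could be derived from the four category scores. It is not the code behind the map (which is driven by a spreadsheet); the names `DocumentationScores` and `icon_colour` are invented for illustration, the example scores are made up, and I am assuming each category can score anywhere from 0 to 5.

```python
from dataclasses import dataclass

# The four categories named in the post; identifiers are illustrative only.
CATEGORIES = ("grammar", "lexicon", "texts", "media_corpus")
MAX_PER_CATEGORY = 5  # assumption: each category is scored 0-5


@dataclass(frozen=True)
class DocumentationScores:
    """Hypothetical container for one language's four category scores."""
    grammar: int = 0
    lexicon: int = 0
    texts: int = 0
    media_corpus: int = 0

    def __post_init__(self):
        # Reject scores outside the assumed 0-5 range.
        for name in CATEGORIES:
            value = getattr(self, name)
            if not 0 <= value <= MAX_PER_CATEGORY:
                raise ValueError(f"{name} must be 0-{MAX_PER_CATEGORY}, got {value}")

    def index(self) -> int:
        """Sum the four category scores into a single 0-20 index number."""
        return sum(getattr(self, name) for name in CATEGORIES)


def icon_colour(index: int) -> str:
    """Map an index number to the colour bands listed in the post."""
    if not 0 <= index <= 20:
        raise ValueError(f"index must be 0-20, got {index}")
    if index == 0:
        return "white"
    if index <= 5:
        return "red"
    if index <= 10:
        return "purple"
    if index <= 15:
        return "yellow"
    return "green"


# Made-up scores for an unnamed language, not taken from the map.
example = DocumentationScores(grammar=3, lexicon=2, texts=1, media_corpus=0)
print(example.index(), icon_colour(example.index()))  # prints "6 purple"
```

A language with nothing recorded in any category sums to 0 and gets a white icon, which is the case for the 54 zero-rated languages mentioned above.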

Each language icon also links to further information about that language and, in some cases, to media so you can hear samples of the language being spoken (see the Araki or South Efate links, for example). [Let me know if you can’t see the icons in the map below and I will give the system a kick, which it seems to need from time to time.]

The information can be downloaded for use in Google Earth here (you need to change the extension to “.kmz”). As of February 2015 Google no longer supports KML in Google Maps, so the map below no longer works. Sorry!
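If you would rather script the extension change than rename the file by hand, something like the following should do it. This is only a sketch: `rename_to_kmz` is a name I have made up, and the file name in the usage comment is a placeholder, since the downloaded file may be called something else on your machine.

```python
from pathlib import Path


def rename_to_kmz(downloaded: Path) -> Path:
    """Rename a downloaded file so it ends in .kmz and return the new path."""
    if downloaded.suffix.lower() == ".kmz":
        return downloaded  # nothing to do
    target = downloaded.with_suffix(".kmz")
    if target.exists():
        # Avoid silently overwriting an earlier download.
        raise FileExistsError(f"{target} already exists")
    downloaded.rename(target)
    return target


# Usage (placeholder file name):
# rename_to_kmz(Path("vanuatu-languages.zip"))
```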

This map draws from a spreadsheet (using Spreadsheet Mapper v3.1) and is updated periodically.

Building an index of what is known about the world’s more than 7,000 languages is a big task, and one that would benefit from automated processes, ideally drawing on OLAC’s existing aggregated information about each language (as discussed on this blog here). There are existing sites that give such an index for particular regions. An early index described by Wurm (1963:137) set out a scale similar to the one discussed here, but it was applied to only a few Australian languages. McConvell and Thieberger (2001) implemented this index for Australian languages, and Austlang subsequently assigns up to 16 points to each Australian language, depending on the amount and quality of each of four features: Word list; Text collection; Grammar; Audio-visual. Another example is Lynch and Crowley (2001:17-19), who provide a five-star system for documentation of languages of Vanuatu.


  • Good, Jeff and Michael Cysouw. 2013. Languoid, Doculect, and Glossonym: Formalizing the Notion ‘Language’. LD&C 7.
  • Harmon, David and Jonathan Loh. 2010. The index of linguistic diversity: A new quantitative measure of trends in the status of the world’s languages. LD&C 4. 97-151.
  • Lynch, John and Terry Crowley. 2001. Languages of Vanuatu: a new survey and bibliography. Canberra: Pacific Linguistics.
  • McConvell, Patrick and Nicholas Thieberger. 2001. State of Indigenous languages in Australia – 2001. Australia State of the Environment Second Technical Paper Series (Natural and Cultural Heritage). Canberra: Department of the Environment and Heritage.
  • Wurm, Stephen A. 1963. Aboriginal languages. In W.E.H. Stanner and H. Sheils (eds.), Australian Aboriginal Studies: A Symposium of Papers presented at the 1961 Research Conference, 125-165. Melbourne: OUP.
