Author Archive

Toolbox to Elan

In the spirit of solving small frustrations I offer my weekend experience of getting Toolbox files into Elan. I have over a hundred texts in Nafsan, most of which are time-aligned and interlinearised. I am working with Stefan Schnell on adding GRAID annotation to some of these texts and the preferred way of doing this is in Elan, with the GRAID annotation at the morphemic level. I tried importing Toolbox files using the Elan ‘Import’ menu, and had listed all field markers in Toolbox, together with their internal dependencies (which should then map to Elan’s relationship between tiers). These settings are stored in an external file. Unfortunately, the import failed several times, despite changing the settings slightly after each attempt.
Continue reading ‘Toolbox to Elan’ »

Songs of the Empty Place

Jimmy Weiner and Don Niles have published Songs of the Empty Place: The Memorial Poetry of the Foi of the Southern Highlands Province of Papua New Guinea. This new book contains songs recorded by Weiner between 1979 and 1995 and can be downloaded from ANU E-Press here. All audio was digitised by PARADISEC and is available in the collection JW1. The songs are organised under three main categories: 7 Women’s Sago Songs (Obedobora), 44 Men’s Songs (Sorohabora), and 7 Women’s Songs (Sorohabora), and are accompanied by some 40 photographs.
Continue reading ‘Songs of the Empty Place’ »

Generating word forms

Have you ever wanted to create a list of possible words in a language you are working on? Have you started creating a dictionary but now need to find words that are not yet recorded? This could be the app for you. Word Generator is a free web service that lets you upload a list of words that you know, together with a list of consonants and vowels, like this:

Consonants: b, rd, d, k, g, j, rl, l, lh, ly, m, n, nh, ng, ny, rn, yh, r, rr, n, ng, y, th, w
Vowels: a, aa, i, ii, u, uu

[ … ]

Word Generator will generate a list of possible words based on this information. It has a number of settings you can alter to adjust the degree of probability, the number and the length of words you want to produce. You can then ask speakers to look through the list to help them think of words that are not already in the dictionary, and it could provoke useful discussion about other forms and meanings.
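As a rough illustration of the idea (not Word Generator's actual method, which is not described here), the sketch below splits known words into units from the consonant and vowel lists, records which unit follows which, and then builds new forms that follow the same patterns. The sample words are invented placeholders rather than real words of any language, and the greedy longest-match segmentation, the function names and the parameters are all assumptions made for this example.

```python
import random
from collections import defaultdict

CONSONANTS = ["b", "rd", "d", "k", "g", "j", "rl", "l", "lh", "ly", "m", "n",
              "nh", "ng", "ny", "rn", "yh", "r", "rr", "y", "th", "w"]
VOWELS = ["a", "aa", "i", "ii", "u", "uu"]


def segment(word, inventory):
    """Split a word into inventory units, trying longer units first (an assumption)."""
    units = sorted(inventory, key=len, reverse=True)
    out, i = [], 0
    while i < len(word):
        for unit in units:
            if word.startswith(unit, i):
                out.append(unit)
                i += len(unit)
                break
        else:
            raise ValueError(f"cannot segment {word!r} at position {i}")
    return out


def train(known_words, inventory):
    """Record which unit follows which ('^' marks word start, '$' word end)."""
    follows = defaultdict(list)
    for word in known_words:
        units = ["^"] + segment(word, inventory) + ["$"]
        for a, b in zip(units, units[1:]):
            follows[a].append(b)
    return follows


def generate(follows, known_words, n=10, max_units=8, seed=0, max_tries=10000):
    """Return up to n candidate forms that are not already in the word list."""
    rng = random.Random(seed)
    known = set(known_words)
    found = set()
    for _ in range(max_tries):
        if len(found) >= n:
            break
        units, current = [], "^"
        while len(units) <= max_units:
            current = rng.choice(follows[current])
            if current == "$":
                break
            units.append(current)
        else:
            continue  # candidate grew too long; discard it
        word = "".join(units)
        if word and word not in known:
            found.add(word)
    return sorted(found)


# Invented placeholder forms, NOT real words of any language.
sample_words = ["kama", "mala", "karu", "wirra"]
model = train(sample_words, CONSONANTS + VOWELS)
candidates = generate(model, sample_words, n=5, seed=1)
print(candidates)
```

Because the candidates only recombine sequences already attested in the word list, a larger and more varied list should give more plausible forms. A real tool would presumably also weight candidates by probability, as the settings mentioned above suggest.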

Please try Word Generator and post any feedback here or by email to me.

Word Generator is being written by Andreas Scherbakov as part of a project funded by ARC Future Fellowship FT140100214.

Seeking your help with tool development

We are in the process of identifying gaps in tools for fieldwork and data analysis that can be filled as part of the Centre of Excellence for the Dynamics of Language. I’d like to ask for your input into the requirements for a metadata entry tool. In part, this analysis asks for your opinions on the value of existing tools (listed below) and their relative strengths and weaknesses, and asks if it may be worth putting effort into developing any of them further, rather than starting from scratch.

The high-level requirement of this tool is to make it easy to describe files created in fieldwork, to be available both off- and on-line, and to deliver the description as a text file for upload to an archive. This includes capturing as much metadata as possible from the files themselves; providing controlled vocabularies of terms to select from (preferably via drag-and-drop rather than keyboard entry); allowing the metadata to be exported in a range of formats to suit whichever archive will host the collection; allowing the metadata to be imported to the tool for use by collaborating team members; and allowing controlled vocabularies to be amended to suit the local situation. This tool could also allow users to visualise the state of a collection: which media files have been transcribed, which have been interlinearised, whether text files have been scanned or OCRed, what other processes have been applied, which files have been archived, and what the rights are for each file. Users should also be able to specify what these criteria are for their own type of collection.
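To make two of these requirements concrete, here is a minimal sketch of harvesting basic metadata from a file, checking one field against a controlled vocabulary, and exporting the result as text. It is only an illustration under stated assumptions: the genre vocabulary, the field names and the CSV layout are hypothetical, not the format of any particular archive or existing tool.

```python
import csv
import io
import mimetypes
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical controlled vocabulary; a real tool would let users amend it.
GENRES = {"narrative", "song", "elicitation", "conversation"}

# Hypothetical field names for the exported description.
FIELDS = ["filename", "bytes", "modified", "mime_type", "genre"]


def describe(path, genre):
    """Build one metadata record, reading what it can from the file itself."""
    if genre not in GENRES:
        raise ValueError(f"{genre!r} is not in the controlled vocabulary")
    p = Path(path)
    stat = p.stat()
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "filename": p.name,
        "bytes": stat.st_size,
        "modified": modified.isoformat(timespec="seconds"),
        "mime_type": mimetypes.guess_type(p.name)[0] or "unknown",
        "genre": genre,
    }


def to_csv(records):
    """Serialise records as CSV text, ready to save and upload."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Calling `describe()` on each file in a fieldwork folder and passing the list of records to `to_csv()` gives a plain-text description. Other export formats would need their own serialiser alongside `to_csv()`.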

These are the currently available tools; please let us know of any others (especially those created for fieldwork in other disciplines):
CMDI Maker

You can either add comments below or write to me separately (thien [at] …) with ideas that can contribute to how we develop this tool.

Grammar writing: where are we now?

Ruth Singer recaps last week’s Linguistics in the Pub, a monthly informal gathering of linguists in Melbourne to discuss topical areas in our field.

Linguistics in the Pub on Tuesday the 24th of February, 2015 centred around the theme: grammar writing. Harriet Sheppard (Monash University) led the discussion. The announcement and short background reading are here.

The descriptive grammar, although often reported to be dead, is a form of scholarship that is still very much alive. And although e-grammars are said to be the way of the future, most grammars still take the form of a hard copy, whether a PhD thesis or a published book. This session of Linguistics in the Pub was kicked off with a discussion of the article by Ulrike Mosel cited below, part of a special publication of LDC on grammar writing.
Continue reading ‘Grammar writing: where are we now?’ »

Playing texts and media—EOPAS again

While I obviously like EOPAS as a model for corpus presentation (see the earlier blog post about it here), I found a renewed enthusiasm for it today as I was checking the meaning of a word in a text I was translating from South Efate. The word lunak does not appear in any of my notes nor in the dictionary, but appears a few times in a story told by the late Kalsarap Namaf. I wrote to Joel Kalpram, who is from Erakor village and speaks the language, and asked him if he knew the word.

Continue reading ‘Playing texts and media—EOPAS again’ »

Language documentation index

The map below is built on information produced by a group of linguists working in Vanuatu. It is a sample documentation index that provides a visualisation of what is known about each language. Note that this is not a language vitality index of the kind outlined in Harmon and Loh (2010). Leaving aside thorny questions of what constitutes a language and language name (see Good and Cysouw 2013) and choosing to use a given set of language names (that is not limited to ISO-639-3), this exercise produced a map of the languages of Vanuatu, with each language assigned an index number on a 21-point scale (0-20), with up to 5 points for each of four categories: Grammar; Lexicon; Texts; Media corpus. The icons are colour-coded (white = 0; red = 1-5; purple = 6-10; yellow = 11-15; green = 16-20). 54 languages in this list have a zero rating, indicating that virtually nothing is known about those languages.
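The scoring and colour-coding can be sketched in a few lines. This is an illustration rather than the code behind the map: the function and category names are made up for the example, and each category is assumed to run from 0 to 5, since a zero rating is possible and the scale has 21 points.

```python
CATEGORIES = ("grammar", "lexicon", "texts", "media_corpus")


def index_score(points):
    """Sum the four category scores (each assumed to be between 0 and 5)."""
    for name in CATEGORIES:
        if not 0 <= points[name] <= 5:
            raise ValueError(f"{name} must be between 0 and 5")
    return sum(points[name] for name in CATEGORIES)


def icon_colour(score):
    """Map an index score (0-20) to the icon colour bands given in the post."""
    if not 0 <= score <= 20:
        raise ValueError("score must be between 0 and 20")
    if score == 0:
        return "white"
    if score <= 5:
        return "red"
    if score <= 10:
        return "purple"
    if score <= 15:
        return "yellow"
    return "green"


# Hypothetical ratings for an unnamed language.
example = {"grammar": 3, "lexicon": 2, "texts": 1, "media_corpus": 0}
print(index_score(example), icon_colour(index_score(example)))
```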

Continue reading ‘Language documentation index’ »

Somali phone dictionary

On June 13th Arnold Zable will launch a Somali-English Dictionary app for both Android and iOS platforms, using the successful Ma! Iwaidja dictionary model. This is the product of a collaboration between the School of Languages and Linguistics at the University of Melbourne and Burji Arts, a Melbourne-based Somali arts and cultural organisation. The app contains some 26,000 Somali words and English equivalents with audio for selected items, so users can hear words or phrases. The app will also accept input from users, who can contribute items and suggest alternate pronunciations.

Continue reading ‘Somali phone dictionary’ »

First Footprints, Farsd Fatbrontz, Verst Pitprands: Spelling as if the language matters

I have watched the excellent series First Footprints a couple of times. It is a great overview of the origins of human occupation of Australia, with fantastic visual effects and photography. It starts with the declaration that “First Footprints seeks to treat Indigenous cultures and beliefs with respect”. Respecting Indigenous Australian languages should involve at least treating them the way you would any other language and checking that words in Australian Indigenous languages were written accurately. Think of the times you have watched a film that had misspelled English subtitles in it and what it makes you think of the care the subtitler took. It only took me a little effort to check on the following mistakes by web-browsing and by talking to people with experience in the particular languages.
Continue reading ‘First Footprints, Farsd Fatbrontz, Verst Pitprands: Spelling as if the language matters’ »

ARC Centre of Excellence for the Dynamics of Language

We have great pleasure in announcing that the ARC has funded a Centre of Excellence for the Dynamics of Language over seven years. This project will be led by Nick Evans at ANU with a collaborative team from there, the University of Western Sydney, the University of Queensland and the University of Melbourne, and with many partners from other universities and institutions including AIATSIS and Appen.

We want this to be a centre for collaboration, for generating ideas and inspiration for linguistics in Australia and the world. In the New Year we’ll be putting up a web-page to give more information. In the meantime, here’s an overview of what we are planning.

Continue reading ‘ARC Centre of Excellence for the Dynamics of Language’ »