Archive for the ‘corpora’ Category.

Merging SayMore audio snippets into a single wav file

SayMore is a piece of software developed by SIL that (among other things) allows you to annotate a primary audio file with audio annotations. This means that speakers can add information by carefully re-speaking an utterance, or by giving an oral translation. However, each annotation segment is saved as a separate file, so you end up having to manage or archive hundreds or even thousands of audio files that are each only 1-2 seconds long.
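As a rough illustration of the merging idea (not necessarily the method described in the full post), here is a minimal Python sketch that concatenates a list of snippets into one wav file using the standard library `wave` module. It assumes that all snippets are uncompressed PCM wav files sharing the same channel count, sample width and sample rate; the function name `merge_wav_snippets` is hypothetical.

```python
import wave
from pathlib import Path


def merge_wav_snippets(snippet_paths, output_path):
    """Concatenate PCM wav files, in the order given, into one wav file.

    Hypothetical helper: assumes every snippet has the same channel count,
    sample width and sample rate. Returns the number of frames written.
    """
    snippet_paths = [Path(p) for p in snippet_paths]
    if not snippet_paths:
        raise ValueError("no snippet files given")

    # Take the audio format from the first snippet.
    with wave.open(str(snippet_paths[0]), "rb") as first:
        expected = (first.getnchannels(), first.getsampwidth(), first.getframerate())

    total_frames = 0
    with wave.open(str(output_path), "wb") as out:
        out.setnchannels(expected[0])
        out.setsampwidth(expected[1])
        out.setframerate(expected[2])
        for path in snippet_paths:
            with wave.open(str(path), "rb") as snippet:
                fmt = (snippet.getnchannels(), snippet.getsampwidth(), snippet.getframerate())
                if fmt != expected:
                    raise ValueError(f"{path} has format {fmt}, expected {expected}")
                frames = snippet.getnframes()
                # Append the raw audio frames of this snippet to the output.
                out.writeframes(snippet.readframes(frames))
                total_frames += frames
    return total_frames
```

The order of the merged audio is simply the order of the list passed in, so the caller needs to put the snippet files in the intended sequence before calling the function. The sketch also does not record where each snippet starts in the merged file, which you would want to keep if the snippets need to be traced back to their annotation segments.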

Continue reading ‘Merging SayMore audio snippets into a single wav file’ »

Categories in language descriptions and linguistic typology – Melbourne Linguistics in the Pub May 2017

Stefan Schnell (University of Melbourne) recaps last month’s Linguistics in the Pub (Melbourne)

Leading the discussion was Ana Krajinović (University of Melbourne / Humboldt University)


The relationship between language-specific descriptive-analytical categories and the categories used in cross-language comparative studies, and in particular the nature of the latter, has been the subject of intensive and recurrent debate over the years, most recently in a dedicated discussion at last year’s SLE conference in Naples and a focused discussion in last October’s issue of Linguistic Typology (Vol 20, issue 2, 2016). In this LiP session, we focused on the practical research aspects of the issue from a descriptive point of view, asking how researchers go about identifying relevant categories in the languages they describe, and how they capture and describe the functions of those categories and label them. What criteria and concepts do researchers apply when going about these tasks? A notoriously difficult area is research into systems of tense-mood-aspect (TMA), which illustrated some of the points raised during our discussion.

Continue reading ‘Categories in language descriptions and linguistic typology – Melbourne Linguistics in the Pub May 2017’ »

The challenge that language variation poses to language description – a LIP recap

The July edition of LIP was led by David Gil from the Max Planck Institute for the Science of Human History, Jena. The night was well attended with representatives from all the usual suspects: University of Melbourne, Monash, and La Trobe. Attendees this month also came from the University of New England, ANU, as well as from SOAS, London and NTNU, Norway. The evening’s discussion centred on issues related to Malay and Indonesian languages and varieties, but also included discussion of language documentation and description in general.

Continue reading ‘The challenge that language variation poses to language description – a LIP recap’ »

Playing texts and media—EOPAS again

While I obviously like EOPAS as a model for corpus presentation (see the earlier blog post about it here), I found a renewed enthusiasm for it today as I was checking the meaning of a word in a text I was translating from South Efate. The word lunak does not appear in any of my notes or in the dictionary, but it appears a few times in a story told by the late Kalsarap Namaf. I wrote to Joel Kalpram, who is from Erakor village and speaks the language, and asked him if he knew the word.

Continue reading ‘Playing texts and media—EOPAS again’ »