Technology – Endangered Languages and Cultures

Redeveloped uploading for PARADISEC

29 April 2025 by Nick Thieberger

This week I received a set of six collections from Masayuki Onishi. Three were from his fieldwork, mainly in Bougainville (with Baitsi, Naasioi, and Motuna (Siwai)), two were his reworking of Douglas Oliver’s records dating back to the 1930s, one in a range of languages, and another on Siwai, and the sixth was from … Read more

Running repairs on a Raspberry Pi in the field

21 January 2024 by Nick Thieberger

Recall that we have been developing a way of returning materials from the PARADISEC collection to source communities using the Raspberry Pi (see the earlier posts here and here). This work is predicated on our catalog writing a text file into an item’s directory, a text file that contains all the information in the … Read more

Converting docx to FLEx format for dictionaries

29 October 2021 by Nick Thieberger

Following the previous blog post I had requests for more detail on how to convert a word-processor dictionary into the format needed to put the text into the software Fieldworks Language Explorer (FLEx). I’ll set out the steps below, but it does require some knowledge of regular expressions that I’ll explain as I go (you … Read more
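To give a flavour of the kind of regular expression involved, here is a minimal sketch in Python. It is not the procedure from the full post: the entry layout ("headword (part of speech) gloss"), the sample entry, and the function name `entry_to_sfm` are all assumptions made for illustration. The output uses backslash-coded Standard Format markers (`\lx`, `\ps`, `\ge`), on the assumption that the dictionary is brought into FLEx through a Standard Format import.

```python
import re

# Hypothetical entry layout: "headword (part of speech) gloss".
# A real word-processor dictionary will need its own pattern.
ENTRY = re.compile(r"^(?P<lx>[^(]+?)\s*\((?P<ps>[^)]+)\)\s*(?P<ge>.+)$")


def entry_to_sfm(line):
    """Turn one plain-text entry into backslash-coded fields, or None if it does not fit."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None  # leave unmatched lines for manual checking
    return "\\lx {lx}\n\\ps {ps}\n\\ge {ge}".format(**match.groupdict())


# Invented example entry, not taken from any real dictionary.
print(entry_to_sfm("kal (n) dog"))
```

Returning `None` for lines that do not fit the pattern makes it easy to list the entries that still need attention by hand.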

Honiara language workshop, August 2019

25 October 2019 / 29 August 2019 by Nick Thieberger

The Solomon Islands Kulu Language Institute (KLI) organised a workshop in August this year that attracted 100 participants representing 44 languages of the Solomon Islands.

The venue was the leaf house at Saint Barnabas Anglican Cathedral Grounds, Honiara. The workshop was sponsored by the ARC Centre of Excellence for the Dynamics of Language, the Kulu Language Institute, the University of Melbourne, The Research Unit for Indigenous Language, and Islands Bible Ministries.

Merging SayMore audio snippets into a single wav file

14 March 2018 / 12 March 2018 by Sasha Wilmoth

SayMore is a piece of software developed by SIL that (among other things) allows you to annotate a primary audio file with audio annotations. This means that speakers can add information by carefully re-speaking an utterance, or giving an oral translation. However, this becomes a problem because each annotation segment is saved as a separate file, which means you have to manage or archive hundreds or even thousands of 1-2 second audio files.
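As a rough illustration of the merging idea, the sketch below joins a list of WAV snippets into one file using Python's standard `wave` module. It is not necessarily the method described in the full post. The function name `merge_wav_snippets` is hypothetical, and the sketch assumes that all snippets share the same channel count, sample width, and sample rate, and that the caller supplies them in the right order.

```python
import wave


def merge_wav_snippets(snippet_paths, output_path):
    """Concatenate WAV snippets, in the order given, into a single WAV file."""
    snippet_paths = list(snippet_paths)
    if not snippet_paths:
        raise ValueError("no snippets to merge")
    fmt = None
    with wave.open(str(output_path), "wb") as out:
        for path in snippet_paths:
            with wave.open(str(path), "rb") as snippet:
                # (channels, sample width in bytes, frame rate)
                current = tuple(snippet.getparams()[:3])
                if fmt is None:
                    fmt = current
                    out.setnchannels(current[0])
                    out.setsampwidth(current[1])
                    out.setframerate(current[2])
                elif current != fmt:
                    raise ValueError(f"{path} has a different audio format")
                out.writeframes(snippet.readframes(snippet.getnframes()))
```

A call might look like `merge_wav_snippets(sorted(Path("annotations").glob("*.wav")), "merged.wav")`, where the `annotations` folder is a made-up name and sorting by file name is only an assumption about how the snippets should be ordered.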

Results of the metadata survey

7 June 2016 / 6 June 2016 by Nick Thieberger

Keeping track of what is recorded in the course of fieldwork is critical, both for your own future work and for long-term archiving. Recordings of dynamic performance (audio or video) are easy to misplace or misidentify and very difficult to locate once you forget what a file was named and what you recorded on a particular day. We ran a survey about how people record their metadata from January 21st to April 25th, 2016 and had 142 responses (see also the earlier blog post here). There were two multiple-choice questions, each allowing selection of more than one checkbox and the entry of free-text responses. I can send the full results of the survey on request. This information will help inform the development of new tools for metadata entry. The responses are summarised below.

Reading HyperCard stacks in 2016

4 November 2021 / 20 February 2016 by Nick Thieberger

HyperCard (HC) was a brilliant program that came free with every Macintosh computer from 1987 and was in development until around 2004. It made it possible to create multimedia ‘stacks’ (of cards) and was very popular with linguists. For example, Peter Ladefoged produced an IPA HyperCard stack and SIL had stacks for drawing syntactic trees or for exploring the history of Indo-European (see their listing here). Texas and FreeText, created by Mark Zimmerman, allowed you to create quick indexes of very large text files (maybe even into the megabytes! Remember this is the early 1990s). I used FreeText when I wrote Audiamus, a corpus exploration tool that let me link text and media and then cite the text/media in my research.

My favourite HC linguistic application was J. Randolph Valentine’s Rook, which presented a speaker telling an Ojibwe story (with audio), with interlinear text linked to a grammar sketch of the language. I adapted that model for a story in Warnman, told by Waka Taylor, produced as part of a set of HC stacks called ‘Australia’s languages’ and released in 1994.

Toolbox to Elan

1 February 2016 by Nick Thieberger

In the spirit of solving small frustrations I offer my weekend experience of getting Toolbox files into Elan. I have over a hundred texts in Nafsan, most of which are time-aligned and interlinearised. I am working with Stefan Schnell on adding GRAID annotation to some of these texts and the preferred way of doing this is in Elan, with the GRAID annotation at the morphemic level. I tried importing Toolbox files using the Elan ‘Import’ menu, and had listed all field markers in Toolbox, together with their internal dependencies (which should then map to Elan’s relationship between tiers). These settings are stored in an external file. Unfortunately, the import failed several times, despite changing the settings slightly after each attempt.
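For readers unfamiliar with the format, the sketch below shows how backslash-coded Toolbox text can be grouped into records before any mapping to Elan tiers. It is only an illustration and not the workaround described in the full post. The function name `parse_toolbox` is hypothetical, and the choice of `\ref` as the record marker is an assumption; your own Toolbox project may use different markers.

```python
def parse_toolbox(text, record_marker="ref"):
    """Group backslash-coded lines into records, each a list of (marker, value) pairs."""
    records = []
    current = None
    for line in text.splitlines():
        if not line.startswith("\\"):
            continue  # sketch only: blank and continuation lines are skipped
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = []  # a new record starts at each record marker
            records.append(current)
        if current is not None:
            current.append((marker, value.strip()))
    return records
```

Anything before the first record marker, such as a file header line, is ignored. Each record keeps its fields in their original order, which is the order a tier mapping would need to respect.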

Seeking your help with tool development

1 May 2015 / 27 March 2015 by Nick Thieberger

We are in the process of identifying gaps in tools for fieldwork and data analysis that can be filled as part of the Centre of Excellence for the Dynamics of Language. I’d like to ask for your input into the requirements for a metadata entry tool. In part, this analysis asks for your opinions on … Read more

David Nathan on EL Publishing’s first month, about Open Access, and being Open about Access

13 December 2014 / 9 September 2014 by Jane Simpson

David Nathan writes

EL Publishing is a new online publisher which was launched on 18th July and which will publish a journal, multimedia, and monographs, focussing on documentation and description of endangered languages. EL Publishing has an international editorial board and operates a fully double-blind peer-review process for all submitted materials.