Archiving – Page 2 – Endangered Languages and Cultures

Merging SayMore audio snippets into a single wav file

14 March 2018 / 12 March 2018 by Sasha Wilmoth

SayMore is a piece of software developed by SIL that (among other things) allows you to annotate a primary audio file with audio annotations. This means that speakers can add information by carefully re-speaking an utterance, or giving an oral translation. However, this becomes a problem because each annotation segment is saved as a separate file, which means you have to manage or archive hundreds or even thousands of 1-2 second audio files.
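The merging step named in the title can be sketched roughly as follows. This is a minimal illustration using Python's standard `wave` module, not necessarily the method used in the post. It assumes every snippet is an uncompressed PCM WAV file with the same channel count, sample width and sample rate, and that the caller supplies the files in the intended order. The function name `merge_wav_snippets` and the folder name in the usage note are hypothetical.

```python
import wave


def merge_wav_snippets(snippet_paths, output_path):
    """Concatenate WAV files, in the order given, into a single WAV file.

    Assumption: all snippets share one audio format (channels, sample
    width, sample rate). A ValueError is raised if they do not.
    """
    snippet_paths = list(snippet_paths)
    if not snippet_paths:
        raise ValueError("no snippet files given")

    expected = None  # format of the first snippet, used to check the rest
    with wave.open(str(output_path), "wb") as merged:
        for path in snippet_paths:
            with wave.open(str(path), "rb") as snippet:
                fmt = (
                    snippet.getnchannels(),
                    snippet.getsampwidth(),
                    snippet.getframerate(),
                )
                if expected is None:
                    expected = fmt
                    merged.setnchannels(fmt[0])
                    merged.setsampwidth(fmt[1])
                    merged.setframerate(fmt[2])
                elif fmt != expected:
                    raise ValueError(f"{path} has a different audio format")
                # Append this snippet's audio frames to the merged file.
                merged.writeframes(snippet.readframes(snippet.getnframes()))
    return output_path
```

A call might look like `merge_wav_snippets(sorted(Path("annotations").glob("*.wav")), "merged.wav")` (with `from pathlib import Path`), where `annotations` is a placeholder folder name. Plain alphabetical sorting may not match the time order of the segments, so the sort key would need to follow however the snippet files are actually named.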

Working together to bring legacy Pacific language recordings to light

11 September 2016 / 9 September 2016 by juliam

This is the story of institutional collaboration at its best.

In 2013 Bill Palmer sent through a list of 78 rpm discs held by the National Library of Australia, summarised in their catalog as follows:

“The collection consists of two albums and 20 single sound discs, word lists, slides and photographs. Records include specimens of native languages of the British Solomon Islands Protectorate; speech of Hagen natives; gospel recordings; and titles in Fijian, Babatana, Owa Raha, Bilua, Marovo, Dobu, Ungarinyin, Hula, Tavara, Motu, Johore Malay, Western Sumatra Malay, Wedau and Police Motu. Brief typescript word lists are included with the Motu, Hula, Tavara, Dobu and Babatana sound discs. There is an English-Owa Rahan vocabulary for the Owa Raha disc.”

We sent a request to the NLA, with whom PARADISEC has always had a close working relationship. They agreed in principle, and we then had periodic contact about this. In July 2015 we approached the National Film and Sound Archive, who have the necessary playback equipment. Further to-ing and fro-ing of emails finally resulted in agreement from the NLA in June 2016.

Mouldy Mayhem

25 August 2016 by Jodie Kell

Recently the call came to the Sydney office of PARADISEC that a collection of tapes had arrived in Melbourne that needed some cleaning (see the earlier post here). The tapes were from Madang in Papua New Guinea and had been recorded in the 1960s. They contained valuable and rare records of language and music of PNG.

When the tapes arrived they were visibly covered in a white mould and so the PARADISEC audio preservation team moved into action to remediate the tapes ready for digitisation.

Mould is a common form of contamination of magnetic analogue tape that creates problems as the infected tape will not give a clear signal when played back. Even a small speck of dust or mould can cause a gap between the tape and the head resulting in a drop out of sound.

Finding what is not there

25 March 2017 / 18 July 2016 by Nick Thieberger

A major part of PARADISEC’s effort goes into finding and digitising audio tapes that record performance in the many small languages of the world. As discussed in a number of posts on this blog, it is becoming urgent that these tapes are digitised while they are still playable. Of the tapes described in this earlier post about tapes from Madang in PNG, some are already so badly damaged by mould that they can’t be played anymore.

To find more tapes we run a survey (http://www.delaman.org/project-lost-found/) that, unfortunately, has only ever had sixteen responses. We have managed to negotiate with these respondents to digitise five of their collections so far (see also the earlier blogpost ‘Where are the records?’).

A more focussed way of finding out what recordings there are is to compare what is published about a language with what primary records are listed as being in an archive. Assuming that someone doing fieldwork and writing a grammar of a language in the past fifty years must have made some recordings, then the mission (should we choose to accept it) is to find those recordings.
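The comparison described above amounts to checking one list against another. As a rough sketch only: the function name, the data layout and the placeholder language labels below are all hypothetical, and real catalogue data would first need language names or codes reconciled between the two sources.

```python
def find_unarchived(grammars, archive_holdings):
    """Return languages that have a published grammar but no archived records.

    grammars: dict mapping a language label to a list of grammar authors.
    archive_holdings: dict mapping a language label to a list of archived
        items (a missing key or an empty list both mean "nothing archived").
    """
    return {
        language: authors
        for language, authors in grammars.items()
        if not archive_holdings.get(language)
    }


# Placeholder data for illustration only, not real catalogue entries.
grammars = {
    "lang-a": ["Author One"],
    "lang-b": ["Author Two"],
    "lang-c": ["Author Three"],
}
archive_holdings = {
    "lang-b": ["item-001"],
    "lang-c": [],
}

# Languages whose fieldworkers might be worth contacting about recordings.
to_chase = find_unarchived(grammars, archive_holdings)
print(sorted(to_chase))  # ['lang-a', 'lang-c']
```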

Pacific Manuscripts now in PARADISEC

23 June 2016 / 22 June 2016 by Nick Thieberger

After some discussion between PARADISEC and the Pacific Manuscripts Bureau (PAMBU) we now have access to linguistic records in the PAMBU microfilm collection, either for tagging in the PARADISEC catalog, or as digital versions of the microfilm in the PARADISEC collection.
Kylie Maloney at PAMBU kindly made available a list of items in PAMBU that have linguistic content (about 70 items). I sent this list to linguists interested in this field and got a priority list from them. PAMBU then entered into negotiations with their depositors to allow the microfilms to be digitised and produced as PDF files for distribution via PARADISEC’s repository.

Results of the metadata survey

7 June 2016 / 6 June 2016 by Nick Thieberger

Keeping track of what is recorded in the course of fieldwork is critical, both for your own future work and for long-term archiving. Recordings of dynamic performance (audio or video) are easy to misplace or misidentify and very difficult to locate once you forget what a file was named and what you recorded on a particular day. From January 21st to April 25th, 2016 we ran a survey about how people record their metadata and had 142 responses (see also the earlier blog post here). There were two multiple-choice questions, each allowing selection of more than one checkbox and the entry of free-text responses. I can send the full results of the survey on request. This information will help inform the development of new tools for metadata entry. The responses are summarised below.

Chasing John Z’graggen’s records

29 February 2016 / 28 February 2016 by Nick Thieberger

This week a suitcase of audio tapes will arrive in Melbourne from Madang in PNG. While a lot of the effort of building collections in PARADISEC goes into finding tapes and encouraging people to deposit their recordings, there are some collections that stand out for the amount of work required. This is the story of one of them.

David Nathan on EL Publishing’s first month, about Open Access, and being Open about Access

13 December 2014 / 9 September 2014 by Jane Simpson

David Nathan writes:

EL Publishing is a new online publisher which was launched on 18th July and which will publish a journal, multimedia, and monographs, focussing on documentation and description of endangered languages. EL Publishing has an international editorial board and operates a fully double-blind peer-review process for all submitted materials.

Open access and intimate fieldwork

15 March 2014 / 14 March 2014 by rsinger

A report on the Linguistics in the Pub discussion, Tuesday 11th March, Prince Alfred Hotel, Grattan St, Melbourne.

This Linguistics in the Pub discussion brought together fieldworkers who do research in Indigenous Australia, Africa, South Asia, Papua New Guinea and Nepal, as well as a computational linguist who has developed software to automate language documentation. The linguists were not all Australian; in fact, we were lucky to have four participants who identify as European and who are living in Australia, temporarily or permanently. The linguists’ experience in language documentation ranged from 6 to 30 years, and between them they had deposited in the digital archives DoBeS, PARADISEC and ELAR. The timeliness of this discussion is demonstrated by David Nathan’s very recent ELAC post on the same topic.

Research, records and responsibility conference: Ten years of PARADISEC

11 December 2013 by Nick Thieberger

The conference celebrating ten years of PARADISEC in early December had a suitably interdisciplinary mix of presentations. Joining in the reflection on building records of the world’s languages and cultures were musicologists, linguists, and archivists from India, Hong Kong, Poland, Canada, Alaska, Hawai’i, Australia, the UK and Russia. The range of topics covered can be seen in the program: http://paradisec.org.au/RRRProgram.html

The conference ended with a discussion of what was missing in our current tools and methods. While it is clear that linguists have done pretty well at using appropriate tools for transcribing and annotating text, and building repositories to provide long-term citation and access to the material, there is still a long way to go.