Tape gumshoe

Finding tapes that need to be digitised often involves some detective work. Recently, while waiting for a dropoff of tapes (yes, in car park 3), I mused on the noir nature of the work and came up with this vignette. Perhaps the trickiest collection I’ve dealt with was one created by Fr John Z’graggen in …

Read more

From film to file: historical manuscripts released by PARADISEC

We are pleased to announce the release of a number of historical manuscripts in and about languages of the Pacific. We worked with the National Library of Australia to digitise microfilms and have now made them available as PDF files for download from our catalog. This work was supported by the ARC Centre of Excellence for the Dynamics of Language.

Read more

Honiara language workshop, August 2019

The Solomon Islands Kulu Language Institute (KLI) organised a workshop in August this year that attracted 100 participants representing 44 languages of the Solomon Islands.

The venue was the leaf house at Saint Barnabas Anglican Cathedral Grounds, Honiara. The workshop was sponsored by the ARC Centre of Excellence for the Dynamics of Language, the Kulu Language Institute, the University of Melbourne, the Research Unit for Indigenous Language, and Islands Bible Ministries.

Read more

50 words of Australian languages project

The Research Unit for Indigenous Language is running a project in 2019/2020 to collect and present words in as many Australian Indigenous languages as possible. Please consider contributing to this project.

This project aims to provide resources for schools to teach at least fifty words in their local language.

We are asking for contributions of at least fifty words in as many Australian Indigenous languages as possible. The typed words need to be listed in a spreadsheet, with audio file recordings attached. Full instructions on capturing the details are on this website.

Read more

Local wifi versions of PARADISEC?

Getting records back to the places they came from is a major motivation for what we do at PARADISEC. Repatriation of unique analog artefacts is an important model, and digital records should, in principle, be easier to move to any place. However, not every place has capacity for access to or storage of digital files. In the Pacific there are few reliable digital repositories and the cultural agencies I know have little capacity to store or disseminate digital files. Internet connections are usually expensive and so discourage download of large files.

Earlier I talked about using iTunes to get records back to Erakor, the village where I work in Vanuatu. The computers that held the iTunes installation eventually stopped working and were replaced, but the language files were not copied over to the new computers.

Read more

Texts and more texts: corpora in the CoEDL

Corpus development is one of the goals of the ARC Centre of Excellence for the Dynamics of Language (see this web page for more details). We have run a number of workshops on corpus-related themes (e.g. the 2017 workshop that included a day on converting early sources).

In addition to creating useable materials for the source communities (which we have a strong commitment to supporting) we are archiving records that include primary media, transcripts and associated annotations. We aim to produce from this material a subset of accessible texts for a number of languages.
Here it is worth noting that we have come up with this terminology (thanks to Jane Simpson for the formulation) to distinguish the objects we have collected:
Assemblage – all material collected, working files, early sources, multiple versions and drafts
Collection – the archived material, a subset of the above, but curated with sufficient metadata to allow the user to know what all items are
Corpus – a crafted set of texts in the language that can be used for further analysis

Read more

A WEBSITE IS NOT AN ARCHIVE!!!!!!

I had a message from the ‘pop up archive’ to say they are closing down and I should download my data. They were a website that allowed users to upload audio files that were then meant to be prepared for searching via automated recognition of features in the file.

Leaving aside the functionality of the site (I admit I did not get it to work with my files), I want to reiterate my frustration with websites that call themselves archives (ok, so in this case the title ‘pop up’ should have been a giveaway), only to disappear at the end of a funding cycle or the retirement of the researcher.

In part this frustration is also motivated by a recent project in which I identified languages that have had a grammar written recently but have little representation in the OLAC listing of holdings in the world’s language archives (see the earlier discussion of this here). If a linguist has worked on a language in the past thirty or so years then it would be reasonable to expect that some primary records were produced, and that they should be in an archive. They may be in a repository that is not part of OLAC, in which case we can create a record to point to that collection. If they are not in any archive, the task is to ask the linguist if they need help to get the records into an archive. At PARADISEC we have been doing this, partly through our ‘Lost and Found’ survey, which has resulted in a number of collections of analog tapes being digitised and made available.

Read more

Finding what is not there

A major part of PARADISEC’s effort goes into finding and digitising audio tapes that record performance in the many small languages of the world. As discussed in a number of posts on this blog, it is becoming urgent that these tapes are digitised while they are still playable. Of the tapes described in this earlier post about tapes from Madang in PNG, some are already so badly damaged by mould that they can’t be played anymore.

In order to find more tapes we run a survey (http://www.delaman.org/project-lost-found/) that, unfortunately, has only ever had sixteen responses. We have managed to negotiate with these respondents to digitise five of their collections so far (see also the earlier blogpost ‘Where are the records?’).

A more focussed way of finding out what recordings there are is to compare what is published about a language with what primary records are listed as being in an archive. Assuming that someone doing fieldwork and writing a grammar of a language in the past fifty years must have made some recordings, the mission (should we choose to accept it) is to find those recordings.

Read more

Pacific Manuscripts now in PARADISEC

After some discussion between PARADISEC and the Pacific Manuscripts Bureau (PAMBU) we now have access to linguistic records in the PAMBU microfilm collection, either for tagging in the PARADISEC catalog, or as digital versions of the microfilm in the PARADISEC collection.
Kylie Maloney at PAMBU kindly made available a list of items in PAMBU that have linguistic content (about 70 items). I sent this list to linguists interested in this field and got a priority list from them. PAMBU then entered into negotiations with their depositors to allow the microfilms to be digitised and produced as PDF files for distribution via PARADISEC’s repository.

Read more

Results of the metadata survey

Keeping track of what is recorded in the course of fieldwork is critical, both for your own future work and for long-term archiving. Recordings of dynamic performance (audio or video) are easy to misplace or misidentify and very difficult to locate once you forget what a file was named and what you recorded on a particular day. From January 21st to April 25th, 2016, we ran a survey about how people record their metadata and had 142 responses (see also the earlier blog post here). There were two multiple-choice questions, each allowing selection of more than one checkbox and the entry of free text responses. I can send the full results of the survey on request. This information will help inform the development of new tools for metadata entry. The responses are summarised below.

Read more