How can we get the material we have used in our research back to the people we recorded?

Every time I revisited my fieldsite I was asked for copies of photos or recordings, and I wanted some way for these to be accessed without my having to be present. When I started visiting Erakor village in central Vanuatu, electricity was available only intermittently, usually just in the evenings, in the house I lived in.

Wagiman electronic dictionary

Aidan Wilson went up to Pine Creek and Kybrook Farm in the Northern Territory last week to deliver the various versions of the Wagiman electronic dictionary to the Wagiman community. You can read about it at the Project for Free Electronic Dictionaries blog.

Wunderkammer in Canberra

Dearest Canberrans, I’ll be giving a presentation of the Wunderkammer mobile phone dictionary software at the ANU in Canberra at 11 am on 18 September. If you’re interested and in the area, come by. Full details, including the exact location, can be found here.

New ELAR publications

The Endangered Languages Archive (ELAR), based at SOAS, has recently published two new articles on the Endangered Languages Project website that may be of interest to readers of this blog: Bernard Howard’s detailed review of the new Zoom H4n audio recorder. Bernard puts the machine through its paces and concludes his review with the words: … Read more

Look What They’ve Done to My Song (and other time-aligned data and analysis), Ma

At the Linguistic Society of America Summer Institute in Berkeley last week (17th-19th July) the National Science Foundation sponsored Cyberling 2009, a workshop exploring how computational infrastructure (called “cyberinfrastructure” in the US, and e-Science or e-Humanities in the UK) can support linguistic research in a variety of fields. One panel discussed data sharing, taking the following proposal as its starting point:

“A cyberinfrastructure for linguistic data would allow unprecedented access [to] the empirical base of our field, but only if we collectively build that empirical base by contributing data. This panel addresses the benefits of data sharing and the obstacles to the widespread adoption of sharing practices, from the perspective of a variety of subfields”

But the bulk of the workshop was given over to closed discussion sessions by seven working groups looking at annotation standards, other standards, new multi-purpose software (so-called “killer apps”), data reliability and provenance, models from other fields, funding sources, and collaboration structure. The group discussions and resulting final day presentations are available on the Cyberling Wiki.

I was co-chair of Working Group 4, which was charged with discussing “protecting data reliability and provenance”, i.e. how to keep track of the creation of data and analysis and its passage through the electronic infrastructure as researchers access and use each other’s materials. As the Cyberling Wiki says, this is crucial

“for data creators (who need credit for the work they have done and the academic contribution of collecting, curating and annotating data) and the data users (who need to know where the data has come from so they can form an opinion of how much credence to give it and how to give proper credit to the originator of the data)”.

We also looked at how to establish a culture of data sharing and what mechanisms might be put in place to encourage people to share data. Clearly, for endangered language research where data are unique and fragile, these are very important issues.

After two and a half days of intense discussions, our group came up with a set of proposals relating to data reliability and provenance that can be summarised as follows:

Read more

How to import a basic transcript into ELAN

The problem: you have text files and audio files, but the text files are not aligned to the audio files.

I imagine there are a few readers out there who have transcriptions of audio files that never made it past an utterance-per-line text file. This post is for you if you’d like to know how to import and time-align those files in ELAN.
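
As a rough illustration of the starting point (not necessarily the method the full post describes), here is a minimal Python sketch that turns an utterance-per-line transcript into tab-delimited rows with placeholder begin and end times. It assumes a version of ELAN that can import tab-delimited text and lets you choose which columns hold the times and the annotation. The function names and the even spacing of utterances are my own assumptions for illustration; the times are only provisional and would still need adjusting by hand against the audio.

```python
def provisional_rows(utterances, total_seconds):
    """Spread utterances evenly over the recording as placeholder intervals.

    The even spacing is an assumption for illustration only; real
    boundaries still have to be corrected by hand after import.
    """
    # Drop empty lines and replace tabs so each utterance stays in one column.
    lines = [u.strip().replace("\t", " ") for u in utterances if u.strip()]
    if not lines:
        return []
    step = total_seconds / len(lines)
    rows = []
    for i, text in enumerate(lines):
        begin = round(i * step, 3)
        end = round((i + 1) * step, 3)
        rows.append((begin, end, text))
    return rows


def to_tab_delimited(rows):
    """Format rows as 'begin<TAB>end<TAB>text', times in seconds."""
    return "\n".join(f"{b:.3f}\t{e:.3f}\t{t}" for b, e, t in rows)


if __name__ == "__main__":
    # Hypothetical three-line transcript (one line blank) for a 10-second recording.
    sample = ["first utterance", "", "second utterance"]
    print(to_tab_delimited(provisional_rows(sample, 10.0)))
```

The output can be saved to a text file and offered to ELAN's tab-delimited import, with the first two columns mapped to begin and end time and the third to the annotation tier.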

Cold dead media

PARADISEC’s director Linda Barwick has been raising the alarm for years about the way media are becoming obsolete because the machines to read them are dying. So it was very sad to hear the death-rattle on the CHILDES list in this message from Brian MacWhinney: “Dear Colleagues, It appears that we are now just about … Read more