How much room is there in the arc(hive)?

Forty-five years ago the annual fieldwork reports of some of the researchers funded by the then Australian Institute of Aboriginal Studies (now AIATSIS) included specifications of how much research had been completed in terms of the number of feet of tapes that had been recorded during the project year (“this year was especially productive with 45 feet 3 inches of tape being recorded”). The modern measure of this kind of quantitative nonsense is the number of gigabytes of digital files (soon to be terabytes) created by the researcher. Don’t mind the quality, it’s the length/bytes that count.
My colleague David Nathan, Director of the Endangered Languages Archive (ELAR) at SOAS, has been approached on several occasions by researchers (both those funded by ELDP and those not (yet)) asking how much data they would be allowed to deposit in the archive. “Would it be OK if I deposit 500 gigabytes of data?” they ask. When you think about it for a moment or two, this is a truly odd request, but one driven by part of what David (in Nathan 2004, see also Dobrin, Austin and Nathan 2007, 2009) has termed “archivism”. This is the tendency for researchers to think that an archive should determine their project outcomes. Parameters stated in terms of audio resolution and sampling rate, file format, and encoding standards take the place of discussions of documentation hypotheses, goals, or methods that are aligned with a project’s actual needs and intentions. David’s response to such a question is usually: if the material to be deposited is “good quality” (stated in terms of some parameters (not volume!) established by the project in discussion with ELAR) then the archive will be interested in taking it.
Another quantity that comes up in this context (and in the context of grant applications as well) is the statement that “10% of the deposited archival data will be analysed”. The remainder of the archive deposit will be, in the worst case, a bunch of media files, or in the best case, media files plus transcription (and/or translation). Where does this magical 10% come from? It seems to have originated around 10 years ago with the DOBES project which established a set of guidelines for language documentation during its pilot phase in 2000. As Wittenburg and Mosel (2004:1) state:

“During a pilot year intensive discussions … took place amongst the participants. The participants agreed upon a number of basic guidelines for language documentation projects. … For some material a deep linguistic analysis should be provided such that later researchers will be able to reconstruct the (grammar of the) language”

Similarly, the guidelines for ELDP grant applications (downloadable here) include the following:

“Note that audio and video are not usable, accessible or archivable without accompanying textual materials such as transcription, annotation, or notes about content and participants. While you are encouraged to transcribe and annotate as much of the material as possible, we recognise that this is very time-consuming and you may not be able to do this for all recorded materials. However, you must provide some text indication of the content of all recordings. This does not have to be the linguistic content and could include, for example, description of the topics or events (e.g. names of songs), or names of participants, preferably with time alignment (indication of where they occur in the recording).”

No actual figure is given of how much “some material” (for DOBES) or “as much of the material as possible” (for ELDP) amounts to. In earlier published versions of advice to applicants both DOBES and ELDP did mention 10%.
Interestingly, Wittenburg (2009, slide 34) has done an analysis of the language documentation data collected by DOBES projects between 2000 and 2009, and he notes that the average project team has recorded 131 hours of media (59 hours of audio, 72 hours of video), transcribed 50 hours of this, and translated 29 hours. Linguistic analysis exists on average for 14 hours of recordings; strikingly, this is about 10.7% of the average corpus!
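For anyone who wants to check the proportions, here is a minimal sketch that recomputes them from the averages quoted above. The hour figures are the ones cited from Wittenburg; the dictionary keys and the function name are just illustrative labels, not anything used by DOBES.

```python
# Average per-project figures as quoted above from Wittenburg (2009, slide 34).
# The stage names are illustrative labels, not DOBES terminology.
hours = {"recorded": 131, "transcribed": 50, "translated": 29, "analysed": 14}


def share_of_corpus(stage_hours, total_hours):
    """Return stage_hours as a percentage of total_hours."""
    return 100 * stage_hours / total_hours


# Percentage of the recorded corpus reached by each processing stage.
shares = {
    stage: round(share_of_corpus(h, hours["recorded"]), 1)
    for stage, h in hours.items()
    if stage != "recorded"
}

for stage, pct in shares.items():
    print(f"{stage}: {pct}% of recorded hours")
```

On these rounded hour counts the analysed share comes out at roughly 10.7%, with about 38% transcribed and 22% translated.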
How much of the corpus needs to be linguistically annotated so that “later researchers will be able to reconstruct the (grammar of the) language” or indeed so that the rest of the corpus can be parsed? Well, it depends on a range of factors, including the nature of the language(s) being documented. Some Austronesian languages, like Sasak or Toratan, have relatively little morphology with pretty straightforward morpho-phonemics of such morphology that does exist, and so a relatively small amount of morpheme-by-morpheme glossed materials in conjunction with a lexicon would enable users to bootstrap the morphological analysis of other parts of a transcribed corpus in those languages. Other languages, like Athapaskan tongues with their fiendishly complex verb morphology, might need more annotated data to help the user deal with the whole corpus.
This is however an empirical question, and one that to my knowledge has not been addressed so far. There are now a number of documentary corpora available, with more coming on stream, and it should be possible to establish whether the “magical 10%” is a real goal to be aimed for, or just a figure that researchers have created and continue to repeat to one another.


Life among the Warlpiri

I’ve just been devouring Andrew (‘Yakajirri’) Stojanovski’s 2010 book Dog ear cafe: how the Mt Theo program beat the curse of petrol sniffing. Melbourne: Hybrid Publishers. It’s a terrific read (you can download a sample from the publisher’s webpage).
[UPDATE: 2/9/2010:
This book is being launched “in conversation with Rachel Perkins” on Wednesday, September 22, 2010, 6.00 for 6.30pm.
Venue: gleebooks, 49 Glebe Point Rd, Glebe
Cost: Free
RSVP: gleebooks – 9660 2333 or request a place via the gleebooks secure server
Or you can buy it from gleebooks here.]
As a portrait of life among the Warlpiri, it’s up there with Yasmine Musharbash’s Yuendumu everyday: contemporary life in remote Aboriginal Australia. She talks about Yuendumu from the point of view of an anthropologist living in the single women’s camp; he does it as a community worker trying to balance his marriage with throwing himself into helping Warlpiri people work with petrol sniffers. (For other earlier excellent ethnographies see the list David Nash maintains.)
In its astonishing honesty about the author’s feelings and actions (the good, the silly and the dangerous), Dog ear cafe is up there with the honesty of Neil Murray’s autobiography, Sing for me, countryman (Rydalmere, N.S.W.: Sceptre 1993)* (and see my blogpost).
Here are some of the many things I liked about Stojanovski’s book:

  • the reflections on the intercultural teamwork needed to create Mount Theo outstation as a place to allow petrol sniffers to regain their lives.
  • the recognition that intercultural misunderstanding works both ways – most notably in the incident where a young Warlpiri boy says in shock when criticised for upsetting Andrew: “Kardiya [white people] don’t have feelings”.
  • the suggestion that compassion is a defining Warlpiri characteristic (as exemplified by the ubiquity of the “poor thing” (wiyarrpa) word in modern songs). At the same time he recognises that of course not all Warlpiri show it.
  • the discussion of humbug (demand sharing) as mutual obligation, as ‘teamwork’.
  • the account of how to reconcile everyone’s need and desire for vehicles with the need for an emergency vehicle at the outstation.
  • the discussion of how hard it is for Yapa (Warlpiri people) to reconcile the obligations of family life with the impartiality demanded of workers in most Australian organisations. (He argues that whitefellas are seen as neutral like Switzerland; I’d go for ‘maybe more neutral’ rather than ‘neutral’.)
  • the importance and difficulty of having D&Ms (deep and meaningful conversations) with petrol sniffers, and the generous recognition that another of his associates, Karissa Preuss, is very, very good at this – in fact the book is filled with generous recognition of the skills of his associates. No wonder the team worked well.
  • the breathtaking exuberant desire to Get Things Done, save petrol sniffers from themselves. This led the Government to award OAMs to Stojanovski and his colleagues Japangardi and Peggy Nampijinpa Brown. It also led to all sorts of things that would have him hung, drawn and quartered by all but the most enlightened ethics committee and government agency. He knows this, but justifies it from the fairly unarguable position that the alternatives would have been more harmful. (Read the book to find out more…)
  • having a glossary at the back which contains many accurately spelled Warlpiri words.

The book leaves me with a great deal of admiration for what Nampijinpa, Japangardi, Stojanovski and their associates achieved, a lot of sympathy for the women and the managers and Government people in Stojanovski’s life, and above all with gratitude to him for telling the story his way.


Wunderkammer Import Package 2 final release

The final release of Wunderkammer Import Package 2 is now available for download. Check out the Wunderkammer website for more info. Thanks to everyone who pointed out bugs and made suggestions for improvement. In this release several bugs have been squished and a bit of input validation and some friendlier error messages have been added. …

Dancing, naming and writing at Alekarenge – Myfany Turpin

[from Myfany Turpin, our person in the Northern Territory]
Last Sunday I was fortunate to attend the “2010 Ali Curung traditional Dance festival” in the NT organised by the Arlpwe Art and Culture centre. It appeared that the whole community turned out for the show, and staff from DesArt and Winanjjikari Music Centre no doubt worked tirelessly to put on this great event.
I arrived on the second day, when a group of about 6 men sang a ceremony described by Geoffrey Small as jarda malya-malya, a Warlpiri ceremony that involved a Dreaming track from Yuendumu to Hatches Creek. Following this, Fanny Purrurla led Jiparanpa Yawulyu, from Warlpiri country. Then Mona Haywood led the singing of Tyarre-tyarre awelye, joined by Nancy and Trixie. This is a women’s ceremony from the Kaytetye country called Tyarre-Tyarre (more commonly spelt in the Warlpiri orthography, Jarra-Jarra). All three ceremonies had some 20 dancers, both young and old.
In between the ceremonies were break-dancing competitions for children. It took me a moment to adjust to the contrast in music, but not the children, who moved seamlessly from dancing ceremony to break-dancing in ochre. The day also involved spear-throwing and ‘flour’ races.
An interesting feature was the women’s painting preparations that went on outside earlier that day. Instead of singing, a recording of the previous night’s singing (again Mona, Nancy and Trixie) made by one of her relatives, was played on a tape recorder to accompany the painting up. With around 30 dancers to paint up, and Mona being the main singer (and she’s no spring chicken) perhaps this was to give her voice a break before the afternoon’s performance.
The last time a similar event was held at Alekarenge was at the Arlpwe Art and Culture centre opening in 2008. Before then, perhaps not since the Land Claim hearing or Purlapa Wiri in the 1980s. Those who witnessed the ceremonies at these events may have been disappointed yesterday with the numbers of singers. However, I think it’s amazing that there is anyone who still knows and sings these ceremonies at all, considering some historical factors, briefly mentioned below.


ELAR launches access-enabled site – David Nathan

from David Nathan, SOAS, London
29 June 2010
On Wednesday 30 June, the Endangered Languages Archive (ELAR) at SOAS, University of London, will launch the new version of our website. The site now offers access to endangered languages (EL) resources, subject to conditions applied by depositors. ELAR implements a new approach to the archiving and …

New website: Aboriginal Languages Network, Port Augusta

The Aboriginal Languages Network is a team of teachers and Aboriginal language and culture experts in Port Augusta, South Australia, working on language revitalisation and materials development for threatened languages spoken in northern South Australia. Mohamed Azkour at Augusta Park Primary School in Port Augusta has developed a website of Aboriginal language materials …

Amurdak inyman alamuniyi wayunan – the Amurdak language is not dead – Robert Mailhammer

from Robert Mailhammer
14 June 2010
When I started working on Amurdak in 2007, I was told that the last speaker of that language had just passed away. I wasn’t discouraged by that at all, since I had spent virtually all of my previous linguistic life examining ‘real’ dead languages, some of which we don’t even have records of. However, it soon became apparent that it was very frustrating trying to make sense of Amurdak without being able to go to a speaker and ask them about who killed whom in a particular story or what the 2nd person non-singular future tense of a particular verb was, all of which slowed down the investigation of Amurdak considerably.
However, it was known that there was at least one partial speaker of Amurdak, who lived on Croker Island, and who was also an accomplished songman of an Amurdak song series, but I never got a chance to go and work with him.
Then in late 2009, there was some indication that there might be another (partial) speaker of Amurdak in Darwin, and there was also some money to go and find out. With the kind and generous help of Bruce Birch, Nick Evans and Sabine Hoeng, supported by the DobeS Iwaidja Documentation Project, plans were made to travel up to Croker Island, firstly to help Bruce with some Iwaidja transcriptions, secondly to find out about this ‘new’ speaker, and thirdly to see whether I could work with Charlie Mangulda, the Amurdak songman.
When I arrived in Darwin in early May 2010, Bruce and I met up with that potential last speaker and it became quite clear that I wouldn’t get very far. On top of this we received news that Charlie Mangulda wouldn’t be available for consultation, which was particularly disappointing. But we had heard that a relative of one of Bruce’s consultants could supposedly translate into Iwaidja, when she listened to the CD, the stories from the text collection that Rob Handelsmann and I had published a few weeks earlier* and that Sabine and Bruce had distributed among the Amurdak-affiliated community. So the plan was at least to see about that.
With Bruce as an extremely generous and kind host and expert mentor I set out on my first fieldtrip…
After the first session with Rae Giribug, the above-mentioned relative, it became obvious that the story was true. Much like a professional interpreter, she was translating a 20-year-old recording from Amurdak into Iwaidja, one of the local languages. She could say back the words in Amurdak, translate words from Iwaidja into Amurdak, and I was even able to ask about specific grammatical forms! So, working nearly every day, we managed to transcribe and translate three narratives, which had been previously untouched, and we also filled in some blanks in existing transcriptions. On top of that I started trying out my theoretically and passively acquired Amurdak, and by the final day of my stay we had little conversations in a language that I had only known from recordings from last century. We had started the resuscitation of Amurdak as a means of communication!


Wunderkammer Import Package 2

The latest version of the Wunderkammer mobile phone dictionary software, Wunderkammer Import Package 2 Beta, is now available for download. The major advance in this distribution is a new easy-to-use graphical user interface. There’s also a new set of documentation to go with the new user interface. This is a beta release. We …