Is Toolbox the linguistic equivalent of Nietzsche’s typewriter?

There is an aphorism (apparently derived from Maslow 1966) that goes “if all you have is a hammer, everything looks like a nail”. For some documentary linguists, reliance on the Toolbox software program means that everything linguistic looks like an interlinear gloss.

Toolbox (developed originally in 1987 as Shoebox by the Summer Institute of Linguistics) is a widely used data management and analysis tool for field linguists. It is designed for researchers to take units of transcribed text (typically sentences) and semi-automatically “gloss” them to create multi-tier interlinearised text broken into words, which are then broken into constituent morphemes with aligned annotations such as sentence translations, morphemic translations, part of speech designations, and so on (for further discussion of interlinear text models see Bow, Hughes and Bird 2003).

Because Toolbox is free and widely recommended for use in language analysis (it is commonly taught in training courses such as InField or ELDP grantee training), it has had a large and constraining impact on how documentary linguists think they should do their research. I would suggest that it is a tool that has had a narrowing effect, like Nietzsche’s typewriter, as described by Carr 2008:

Sometime in 1882, Friedrich Nietzsche bought a typewriter—a Malling-Hansen Writing Ball, to be precise. His vision was failing, and keeping his eyes focused on a page had become exhausting and painful, often bringing on crushing headaches. He had been forced to curtail his writing, and he feared that he would soon have to give it up. The typewriter rescued him, at least for a time. Once he had mastered touch-typing, he was able to write with his eyes closed, using only the tips of his fingers. Words could once again flow from his mind to the page.

But the machine had a subtler effect on his work. One of Nietzsche’s friends, a composer, noticed a change in the style of his writing. His already terse prose had become even tighter, more telegraphic. “Perhaps you will through this instrument even take to a new idiom,” the friend wrote in a letter, noting that, in his own work, his “‘thoughts’ in music and language often depend on the quality of pen and paper.”

“You are right,” Nietzsche replied, “our writing equipment takes part in the forming of our thoughts.” Under the sway of the machine, writes the German media scholar Friedrich A. Kittler, Nietzsche’s prose “changed from arguments to aphorisms, from thoughts to puns, from rhetoric to telegram style.”

I believe that how annotation is conceptualised in language documentation, and presented in reference works like Schultze-Berndt 2006, reflects the narrowing influence of software tools like Toolbox and the dominance of interlinear glossing as an analytical method.

An alternative, developed originally by David Nathan, that we recommend at SOAS for corpus creation, is summary or overview annotation:

An overview annotation can be considered as a kind of “roadmap” or index of a recording. It could consist of approximately time-aligned information about what is in the recording, who is participating, and other interesting phenomena. For example, you could write:

“from 1 to 3 mins Auntie Freda is singing the song called Fat frog; from 3-7 mins Harry Smith is telling a story about joining the army; from 7-10 mins there is some interesting use of applicative morphology; from 15-18 mins contains rude content that should not be used for teaching children”
This could be written as prose (as above) or, better, structured into a table.

If you are familiar with software such as Transcriber or ELAN, you can do an overview annotation by marking breaks in topics/speakers etc, and typing descriptive text into the segments between breaks. Another strategy is to simply type a number into the time-aligned segment and then create a table which links the numbers with the overview information categories.
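To make the table idea concrete, here is a minimal sketch in Python of the example overview above held as structured rows. The field names, the category labels and the `unannotated_gaps` helper are my own illustrative assumptions, not part of Transcriber, ELAN or any SOAS template; times are in whole minutes as in the prose version.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start_min: int   # approximate start time in minutes
    end_min: int     # approximate end time in minutes
    category: str    # hypothetical category label
    note: str        # free-text overview information


# Rows taken from the prose example; the category labels are invented.
segments = [
    Segment(1, 3, "song", "Auntie Freda sings the song called Fat frog"),
    Segment(3, 7, "narrative", "Harry Smith tells a story about joining the army"),
    Segment(7, 10, "grammar", "interesting use of applicative morphology"),
    Segment(15, 18, "restricted", "rude content, not for teaching children"),
]


def as_table(rows):
    """Render the overview as a plain-text table, one row per segment."""
    lines = [f"{'Start':>5}  {'End':>5}  {'Category':<10}  Note"]
    for s in rows:
        lines.append(f"{s.start_min:>5}  {s.end_min:>5}  {s.category:<10}  {s.note}")
    return "\n".join(lines)


def unannotated_gaps(rows, total_min):
    """List (start, end) stretches of the recording with no overview entry."""
    gaps = []
    cursor = 0
    for s in sorted(rows, key=lambda r: r.start_min):
        if s.start_min > cursor:
            gaps.append((cursor, s.start_min))
        cursor = max(cursor, s.end_min)
    if cursor < total_min:
        gaps.append((cursor, total_min))
    return gaps


print(as_table(segments))
```

Once the overview is held as rows rather than prose, it becomes easy to sort or filter by category, or to check which stretches of a recording nobody has described yet (assuming you know the total length of the recording).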

Interlinearisation of the Toolbox type is very time-consuming (see my blog post on how much time transcription and interlinear annotation takes), while overview annotation can be done rapidly and relatively richly for a whole corpus, rather than for the magical 10% of it too frequently referred to in the literature on linguistic annotation. This makes it a potentially good alternative to the restricted representations that, like Nietzsche’s prose under the influence of his typewriter, have been shaped by the very tool that documenters have come to rely upon.

References

Bow, Cathy, Baden Hughes and Steven Bird. 2003. Towards a general model of interlinear text. EMELD paper. [available online at http://emeld.org/workshop/2003/bowbadenbird-paper.pdf, accessed 2012-04-21]

Carr, Nicholas. 2008. Is Google making us stupid? What the internet is doing to our brains. Atlantic Magazine July/August 2008.

Maslow, Abraham. 1966. The Psychology of Science: A Reconnaissance. New York: Harper Collins.

Schultze-Berndt, Eva. 2006. Linguistic annotation. In Jost Gippert, Nikolaus P. Himmelmann and Ulrike Mosel (eds.) Essentials of Language Documentation, 213-251. Berlin: Mouton de Gruyter.

The living archive of Aboriginal languages – call for expressions of interest

CALL FOR EXPRESSIONS OF INTEREST
DEVELOP A USER-FRIENDLY SEARCH INTERFACE AND TOUCHPAD APP FOR A DIGITAL ARCHIVE OF LITERATURE IN ABORIGINAL LANGUAGES
THE LIVING ARCHIVE PROJECT

Submission date: 30 April 2012

During the era of bilingual education in the NT, books were produced in 25 Literature Production Centres in more than 16 languages. These materials are widely dispersed and endangered, and contain interesting and significant stories in indigenous Australian languages, often beautifully illustrated. This is an important collection and must be preserved for the future. We are creating a living archive of these endangered materials, in partnership with the communities of origin. This archive will be stored in the Charles Darwin University eSpace repository (http://espace.cdu.edu.au/). With permission from the language owners, materials in the archive will be accessible to Aboriginal communities, academics and the world. As some users may not have high levels of text literacy or technical ability the archive will require a user-friendly visual interface to allow searches beyond the conventional database search capabilities.

  • browse by image
    users view thumbnails of the covers of books and roll-over to see a larger image with basic metadata and select items to view
  • search by text
    users start typing a word and resources are selectively filtered to retain only those with that sequence of characters in their metadata. For example typing dja would retain books in Djapu and Djambarrpuyŋu, as well as books by Djäwa and books about djamarrkuli or with the word djamarrkuli in their title.
  • search by location
    users click on an area on a map to retrieve all materials in that language or from that region
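As a rough illustration of the "search by text" behaviour described above, the following Python sketch filters records as the user types. It is only a sketch under stated assumptions: the record fields and the placeholder values are invented, and the diacritic folding (so that typing dja also retains Djäwa) is one possible design choice, not something the call specifies.

```python
import unicodedata


def fold(text):
    """Lower-case the text and strip combining diacritics (ä becomes a)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Hypothetical metadata records; titles, authors and "Language X/Y" are placeholders.
books = [
    {"title": "Placeholder title 1", "language": "Djapu", "author": "Author One"},
    {"title": "Placeholder title 2", "language": "Djambarrpuyŋu", "author": "Author Two"},
    {"title": "Placeholder title 3", "language": "Language X", "author": "Djäwa"},
    {"title": "djamarrkuli", "language": "Language X", "author": "Author Three"},
    {"title": "Placeholder title 5", "language": "Language Y", "author": "Author Four"},
]


def filter_books(records, typed):
    """Keep records where any metadata field contains the typed characters."""
    query = fold(typed)
    return [r for r in records if any(query in fold(v) for v in r.values())]


# Narrowing as the user types more characters.
for typed in ("d", "dj", "dja", "djam"):
    print(typed, [r["title"] for r in filter_books(books, typed)])
```

Folding both the query and the metadata means users do not need to type special characters to find a match, which seems in keeping with the aim of serving users with limited text literacy; whether letters such as ŋ should also match a plain n is a further decision this sketch leaves open.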

See the attached call for expressions of interest: EoI LAAL User-friendly search interface.

Read more about the project at the living archive of Aboriginal languages

Submit applications to livingarchive AT cdu.edu.au including samples of references from clients and an estimate of cost, by 30 April 2012.
 

More stories from ELAP graduates

Back in January I wrote a post about stories we were publishing on the SOAS Endangered Languages Project website about some of our alumni and the interesting lives they are leading after having graduated from the University of London. There are now nine stories available, with several more in the pipeline. If you want some ideas about how studying endangered languages and language documentation and support can lead to enriched careers, have a look at what our former students say.

Another one bites the dust (with apologies to Queen)

The situation with projects focussed on the documentation and support of endangered and minority languages is starting to look, well, endangered, if not downright moribund.

Apparently, UNESCO shut down its project on endangered languages within the intangible cultural heritage area towards the end of last year. Volkswagen Foundation held its last DoBeS grants committee meeting in February, and the project will wind up in 2014 when the current round of grants comes to an end. The Sorosoro project based in Paris seems to have ground to a halt last December (the last news update on their website is 14th December 2011).

And today comes the news that, after six and a half years devoted to working enthusiastically to present linguistic diversity as part of the world’s cultural pluralism, Linguamón-House of Languages will cease its activity in two days’ time, following the decision in December last year by the Government of Catalonia and the Barcelona City Council to shut it down. As a member of the Linguamón International Scientific Committee, I find this a great disappointment.

At the upcoming 3L Summer School in Lyon there is to be a 20th Anniversary Conference (“1992-2012: twenty years of research on language endangerment”) looking at what has happened to the field of endangered language studies since the landmark 1992 symposium at the Linguistic Society of America. Looks like the participants should start penning some funeral dirges if things keep going along as they have in the last few months!

Update: On a more optimistic note, Gabriela Pérez Báez reminds me that the Smithsonian Institution’s Recovering Voices project “is actively growing and building its infrastructure and extending its reach”. I can also mention that the University of Iceland (Háskóli Íslands, Reykjavík) has established the Vigdis International Centre for Multilingualism and Intercultural Understanding as a Category 2 centre under the auspices of UNESCO and with support from the Government of Iceland. Building works have begun, and the new centre will feature space and activities for the promotion of minority and endangered languages. The Hans Rausing Endangered Languages Project at SOAS was recently given a 10-year Review and under the leadership of Professor Anne Pauwels over the next six months we will be putting in place plans for the future of the project. Watch this space.

ELAR update

As of this week the Endangered Languages Archive (ELAR) at SOAS has 52 online deposits available comprising around 51,000 files. There are 12,700 data bundles in the online collection, of which 6,000 are available to any registered user and a further 5,000 require access approval from the depositors. The number of users is now 515 with one or two people registering (via this web form) each week.

Recently we have been looking in the cupboards around SOAS and uncovering some interesting and valuable materials that we are digitising and hope to add to our online collection in ELAR. For example, one cupboard contained two tin cases with a set of 78rpm vinyl recordings of Zuaran Berber from eastern Libya recorded by T.F. (Terence Frederick) Mitchell (1919–2007) in the late 1950s. One of the speakers on the records is probably Mr. Ramadan Hadji Azzabi (cf. Mitchell 1953:28), who was T.F. Mitchell’s research assistant and who studied with him in London. Some of the recordings are conversations and these were published in Mitchell 2009. British Academy post-doctoral researcher Lameen Souag remarks that:

“Zuaran Berber is spoken only around the town of Zuwara in northeastern Libya. While its status has recently been improved by the removal of Qaddafi, its small population and the national and regional dominance of Arabic should qualify it as “threatened” at the least. I’m not aware of any work on whether the language is being retained; such research has been quite impossible for the past fifty years or so.”

Bernard Howard, the Linguistics Department technical officer, has digitised the Zuaran Berber materials and we are investigating adding them to the collection in ELAR.

It will be interesting to see if there is material on other endangered languages in the SOAS cupboards.

References

Mitchell, T.F. 1953. Particle-Noun Complexes in a Berber Dialect (Zuara). Bulletin of the School of Oriental and African Studies 15:375-390.
Mitchell, T.F. 2009. Zuaran Berber (Libya) Grammar and Texts. Cologne: Ruediger Koeppe.

Cultural Encounters in New Guinea 1920s & 1930s

Michael Waterhouse talks about historical black-white encounters in the New Guinea goldfields from the late 1920s to the early 1930s, with rare photographs, and shows a short silent film, Cultural Encounters, from the early 1930s, including possibly the earliest footage from the Sepik and footage of early Gasmata and Bulolo singsings.
Time: 6:30pm-8:45pm, April 11. Cost: $15; $10 members; $5 students with ID, including refreshments at 6:30pm
Venue: Jurassic Theatre, Australian Museum, Williams St., Sydney
Enquiries: Oceanic Art Society oceanicartsociety.org.au (02) 9332 3984

Best and worst practice in language documentation: LIP discussion

Ruth Singer recaps some of the interesting points of last night’s Linguistics in the Pub, an informal gathering of linguists and language activists that is held monthly in Melbourne.

The announcement for this month’s Linguistics in the Pub outlined the topic as follows:

“There is much discussion of best practice in language documentation but as we all know, no language documentation project is perfect: each is the result of collaboration between researchers and a community with restrictions on time, money and many unforeseen circumstances. There is always a gap between what we achieve and the most wonderful project of our dreams.

Come and tell us about your experiences. What aspects of your language documentation work are you most proud of? What will you do differently next time? And what are some of the great things you have planned that you just couldn’t get off the ground?”

The idea behind this discussion topic is that language documentation projects tend to aim high, and this can leave those leading them feeling disappointed. Spurred on by hearing about innovative projects, and egged on by others in the field to follow best practice in an ever-growing number of areas, we sit at our computers concocting new language documentation projects that will create years of recordings, miles of transcripts and beautiful metadata, as well as lovely outputs that suit the needs of the language speaker community. In the process we will develop wondrous collaborations with language speakers, supporting them to develop the capacity to carry out language documentation work themselves, and also meaningful collaborations with other academics such as musicologists, anthropologists and ethnobiologists.

Continue reading ‘Best and worst practice in language documentation: LIP discussion’ »

Registration: Digital Humanities Australasia 2012: Building, Mapping, Connecting

PARADISEC has organised a panel at this conference on ‘Fieldwork in the digital humanities’

DIGITAL HUMANITIES AUSTRALASIA 2012: Building, Mapping, Connecting

Venue: Shine Dome, Australian Academy of Science, Canberra ACT and Sir Roland Wilson Building #120, Australian National University.

The inaugural conference of the Australasian Association for Digital Humanities, Australian National University, Canberra, Australia, 28-30 March 2012.

The conference will feature papers, panels, posters and associated workshops, including presentations on digital humanities in Australia, New Zealand and internationally showcasing new research and developments in the field and/or responding to the conference theme of ‘Building, Mapping, Connecting’. Draft program at http://aa-dh.org/conference/program/.
Continue reading ‘Registration: Digital Humanities Australasia 2012: Building, Mapping, Connecting’ »

Retrofitting a collection? I’d rather not

I just had a visit from a student wanting to deposit in the PARADISEC archive a collection of recordings made in the course of PhD fieldwork. It is a great shame that they are only just now thinking about how to deposit this material, as it will need considerable work to make it archivable. If they had sought advice before doing all of the research (or looked at the PARADISEC page ‘Depositing with PARADISEC’, or looked at the RNLD pages, e.g. http://www.rnld.org/node/40) it would have been so much easier for all of us. Why?

Continue reading ‘Retrofitting a collection? I’d rather not’ »

New Guinea Between the Wars

Michael Waterhouse will be presenting a talk on New Guinea between the wars at the State Library of NSW on 21st March, based on his recently published book “Not a Poor Man’s Field. The New Guinea Goldfields to 1942 – An Australian Colonial History”. It will be accompanied by a film taken by his grandfather and one of his associates in the 1930s, which documents the gold dredging and associated aviation activities at a time when New Guinea led the world in commercial aviation. There is also some ethnographic footage, including what is most likely the first film footage ever taken on the Sepik River. This short segment in particular is in excellent condition. There is also a segment showing several kukukuku (Anga) who had been arrested early in 1931 for murdering a miner and some of his carriers.

Details are at http://www.sl.nsw.gov.au/events/events_talks/events/adventures_of_the_new_guinea_goldfields.html

Michael will be repeating the talk and film the following night (22 March) at Dymocks in the city. Details are at http://www.dymocks.com.au/LiteraryEvents/Default.aspx?s=1 [scroll down]