A workshop on Language Documentation and Language Description was held at Uppsala University (30 September – 1 October 2008) as part of the 23rd Scandinavian Conference of Linguistics. Uppsala University was established in 1477 and the town has grown around the university, famous also as the place where Carl Linnaeus taught in the eighteenth century.
The keynote speakers for the workshop were Michael Noonan and Donald Stilo, and the list of presenters can be found here. Of interest to ELAC readers will be the paper by Michael Riesler and Jacquelijn Ringersma on the software tools used to annotate Kildin Saami lexical data. They are using LEXUS, a lexical database created by the MPI for their DoBES teams. It has lots of nice features for building various kinds of lexicons, so long as you don't need to gloss texts (this team uses Toolbox for that), but it is only available as an online tool at the moment. It conforms to the Lexical Markup Framework (LMF) and also incorporates ViCOS (Visualising Conceptual Spaces), which provides semantic domains and lets you navigate a word-net through the lexicon.
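To give a feel for what LMF conformance means in practice, here is a minimal sketch of an LMF-style lexical entry built with Python's standard library. The element names (LexicalResource, Lexicon, LexicalEntry, Lemma, Sense) follow the LMF (ISO 24613) core model; the form and gloss are generic placeholders, not data from the Kildin Saami project.

```python
# A minimal, illustrative LMF-style lexical entry serialized with the
# Python standard library. Element names follow the LMF core model;
# the form and gloss values are placeholders, not real project data.
import xml.etree.ElementTree as ET

def make_entry(written_form, gloss):
    """Build one LMF-style LexicalEntry element."""
    entry = ET.Element("LexicalEntry")
    lemma = ET.SubElement(entry, "Lemma")
    # LMF expresses attributes as feature structures: feat att/val pairs
    ET.SubElement(lemma, "feat", att="writtenForm", val=written_form)
    sense = ET.SubElement(entry, "Sense")
    ET.SubElement(sense, "feat", att="gloss", val=gloss)
    return entry

resource = ET.Element("LexicalResource")
lexicon = ET.SubElement(resource, "Lexicon")
lexicon.append(make_entry("example_form", "example gloss"))

print(ET.tostring(resource, encoding="unicode"))
```

The point of such a standards-based serialization is exactly the reuse theme of the workshop: any LMF-aware tool can read entries like this without knowing anything about the database that produced them.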
Anju Saxena, Lars Borin, Ljuba Veselinova & Santosh Negi talked about ‘GIS supported language documentation of the Himalayas’. Since so little is known about the languages of this region, they are building a database of geocoded information, using standards to allow the material to interoperate with other systems, and then exploring the relationships between language varieties via geographic representation of the information.
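One common way to make geocoded records like these interoperable is to export them as GeoJSON, a widely supported GIS interchange format. The sketch below is a guess at what such an export could look like; the field names and the sample record are hypothetical, not the project's actual schema.

```python
# Hypothetical geocoded survey records exported as a GeoJSON
# FeatureCollection. Field names and values are invented for
# illustration, not taken from the Himalayas project.
import json

records = [
    {"variety": "Variety A", "village": "Village 1",
     "lat": 31.1, "lon": 77.2, "feature": "word order: SOV"},
]

def to_geojson(rows):
    """Convert flat records into a GeoJSON FeatureCollection."""
    features = []
    for r in rows:
        features.append({
            "type": "Feature",
            # GeoJSON coordinate order is [longitude, latitude]
            "geometry": {"type": "Point",
                         "coordinates": [r["lon"], r["lat"]]},
            "properties": {k: v for k, v in r.items()
                           if k not in ("lat", "lon")},
        })
    return {"type": "FeatureCollection", "features": features}

print(json.dumps(to_geojson(records), indent=2))
```

Because GeoJSON is a standard, the same file can be dropped into most mapping tools to plot the varieties geographically, which is the kind of reuse the authors are aiming for.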
Dorothee Beermann from Trondheim University showed TypeCraft, an interlinearising tool that uses an SQL relational database embedded in MediaWiki. This is a new tool that has some nice features, but is limited by not linking to a lexicon. Now if only LEXUS could talk to TypeCraft. And TypeCraft is currently only available online (World of TypeCraft?), but they are planning a standalone version.
Daniel Paul, of the University of Manchester, demonstrated the use of the SIL tool FieldWorks Language Explorer (or FLEx, see a review here). This is a comprehensive working environment for textual annotation and dictionary construction, but with no reference to primary media yet available. I have seen a few demos of FLEx and it has invariably crashed in each of them, so it was a relief to see it make it all the way through the demo without crashing once, saving itself instead for the question period. Florian Siegl of the Siberian Enets DoBES project then talked about the use of ELAN. A continual theme through these presentations was the need for standards so that the data we produce can be reused in future, and this tied in to the presentation that I gave with Andrea Berez about standards and training for language documentation. This was in part a response to several recent publications which decry what they claim is an excessive focus on technology in language documentation.
Bill McGregor discussed the need for what he called ‘metadocumentation’, which refers to the apparatus that is required to understand historical linguistic records and to render them faithfully. This is especially important for languages for which the only information is in historical sources. Thus it is the documentation itself that must first be addressed, and only then, through that effort, the language. He gave the example of Nyulnyul, which now has no full speakers, but which has records including those by Bates, Elkin and McGregor himself.
The conference program lists all the papers from the workshop.