PARADISEC activity update

PARADISEC continues to grow! In the last year 63 new collections have been added and the archive has grown to 9.04TB with 12,489 items (made up of 73,496 files). We are currently reworking the catalog to make it easier to use.

We have added more items from Stephen Wurm’s (collection SAW4) and Don Laycock’s (DL2) papers.

Added collections include Gavan Breen’s written materials, transcripts and notes of vocabulary and grammar on 49 Australian languages and dialects, mainly from far north Queensland and the central Northern Territory (collections GB01-50). Almost all the languages described are no longer spoken.


The 2014 ARC Cup

An astonishingly good ARC Cup run for Indigenous Australian languages. Onya! Good news for horses from PARADISEC, ELAC blog contributors and the new Centre of Excellence for the Dynamics of Language.

Amidst this joy, deep sympathy to the many people working in linguistics who put in terrific projects that didn’t get funded.

This is the field for Indigenous language work as I see it – if I’ve missed anyone, lemme know.


Language documentation index

The map below is built on information produced by a group of linguists working in Vanuatu. It is a sample documentation index that provides a visualisation of what is known about each language. Note that this is not a language vitality index of the kind outlined in Harmon and Loh (2010). Leaving aside thorny questions of what constitutes a language and language name (see Good and Cysouw 2013) and choosing to use a given set of language names (that is not limited to ISO-639-3), this exercise produced a map of the languages of Vanuatu, with each language assigned an index number on a 21-point scale (0-20), with up to 5 points awarded in each of four categories: Grammar; Lexicon; Texts; Media corpus. The icons are colour-coded (white = 0; red = 1-5; purple = 6-10; yellow = 11-15; green = 16-20). 54 languages in this list have a zero rating, indicating that virtually nothing is known about those languages.
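
The scoring and colour-coding described above can be sketched in a few lines of Python. This is only an illustration, not the code used to build the map: the function names and category keys are made up here, and it assumes each category can score anywhere from 0 to 5, so that the four categories sum to the 21 possible values 0-20.

```python
# Hypothetical category keys; the post names the categories as
# Grammar, Lexicon, Texts and Media corpus.
CATEGORIES = ("grammar", "lexicon", "texts", "media_corpus")


def documentation_index(scores):
    """Sum the per-category points (assumed 0-5 each) into a 0-20 index."""
    total = 0
    for category in CATEGORIES:
        points = scores.get(category, 0)  # a missing category counts as 0
        if not 0 <= points <= 5:
            raise ValueError(f"{category} must be between 0 and 5, got {points}")
        total += points
    return total


def icon_colour(index):
    """Map an index to the icon colour bands given in the post."""
    if not 0 <= index <= 20:
        raise ValueError(f"index must be between 0 and 20, got {index}")
    if index == 0:
        return "white"
    if index <= 5:
        return "red"
    if index <= 10:
        return "purple"
    if index <= 15:
        return "yellow"
    return "green"


# Invented example scores for a single language.
example = {"grammar": 3, "lexicon": 2, "texts": 1, "media_corpus": 0}
print(documentation_index(example), icon_colour(documentation_index(example)))
```

A language with no points in any category gets an index of 0 and a white icon, which corresponds to the zero rating mentioned above.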


PARADISEC stats for 2014

It has been quite some time since our last update on the contents of the PARADISEC archive. Since our report on this blog two years ago, we have added 88 collections bringing the total to 265 collections. There are now 9,836 items and 60,516 digitised recordings, images and videos in the archive, which is now …

ARC Centre of Excellence for the Dynamics of Language

We have great pleasure in announcing that the ARC has funded a Centre of Excellence for the Dynamics of Language over seven years. This project will be led by Nick Evans at ANU with a collaborative team from there, the University of Western Sydney, the University of Queensland and the University of Melbourne, and with many partners from other universities and institutions including AIATSIS and Appen.

We want this to be a centre for collaboration, for generating ideas and inspiration for linguistics in Australia and the world. In the New Year we’ll be putting up a web-page to give more information. In the meantime, here’s an overview of what we are planning.


Research, records and responsibility conference: Ten years of PARADISEC

RRRReception

The conference celebrating ten years of PARADISEC in early December had a suitably interdisciplinary mix of presentations. Joining in the reflection on building records of the world’s languages and cultures were musicologists, linguists, and archivists from India, Hong Kong, Poland, Canada, Alaska, Hawai’i, Australia, the UK and Russia. The range of topics covered can be seen in the program: http://paradisec.org.au/RRRProgram.html

The conference ended with a discussion of what was missing in our current tools and methods. While it is clear that linguists have done pretty well at using appropriate tools for transcribing and annotating text, and building repositories to provide long-term citation and access to the material, there is still a long way to go.


Imagine … a world without PARADISEC

Imagine … a world without memories is the evocative and chilling title of a project organised by the National Committee of Australia for the UNESCO Memory of the World. Through the Australian Memory of the World Register, the Committee, mostly volunteers, are building public awareness of the importance of maintaining records and objects associated with …

PARADISEC’s decade celebration conference

Announcing the conference “Research, records and responsibility (RRR): Ten years of the Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC)”

Dates: 2nd-3rd December 2013
Venue: University of Melbourne, Australia
Keynote speaker: Shubha Chaudhuri, Associate Director General (Academic), Archives and Research Centre for Ethnomusicology, American Institute of Indian Studies, Gurgaon, India

For details …

Fieldwork helper – ExSite9

ExSite9 is an open-source cross-platform tool for creating descriptions of files created during fieldwork. We have been working on the development of ExSite9 over the past year and it is now ready for download and use: http://www.intersect.org.au/exsite9 https://github.com/IntersectAustralia/exsite9/wiki/Install-packages

ExSite9 collects information about files from a directory that you select on your laptop and presents it onscreen for annotation, as can be seen in the following screenshot. The top left window shows the filenames, and the righthand window shows metadata characteristics that can be clicked once a file or set of files is selected. The manual is here: http://bit.ly/ExSite9Manual
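
To give a rough idea of what collecting information about files from a directory involves, here is a minimal Python sketch. It is not ExSite9's code and does not use ExSite9's data model: the function name `collect_file_info` and the record fields are hypothetical, chosen only to show the general approach of walking a folder and building one record per file with an empty slot for annotations.

```python
import os
from datetime import datetime, timezone


def collect_file_info(directory):
    """Walk a directory and return one descriptive record per file found."""
    records = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            stat = os.stat(path)
            records.append({
                "filename": name,
                "relative_path": os.path.relpath(path, directory),
                "size_bytes": stat.st_size,
                # Last-modified time as an ISO 8601 string in UTC.
                "modified": datetime.fromtimestamp(
                    stat.st_mtime, tz=timezone.utc
                ).isoformat(),
                # Empty slot for descriptive metadata added later by hand.
                "metadata": {},
            })
    return records


if __name__ == "__main__":
    # List the files in the current directory as a quick demonstration.
    for record in collect_file_info("."):
        print(record["relative_path"], record["size_bytes"])
```

Each record could then be annotated, for example by setting `record["metadata"]["language"]`, before the files and their descriptions are packaged for an archive.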

Researchers who undertake fieldwork, or capture research data away from their desks, can use ExSite9 to support the quick application of descriptive metadata to the digital data they capture. This also enables researchers to prepare a package of metadata and data for backup to a data repository or archive for safekeeping and further manipulation.

Scholars in the Humanities, Arts and Social Sciences (HASS) typically need to organise heterogeneous file-based information from a multitude of sources, including digital cameras, video and sound recording equipment, scanned documents, files from transcription and annotation software, spreadsheets and field notes.

The aim of this tool is to facilitate better management and documentation of research data close to the time it is created. An easy to use interface enables researchers to capture metadata that meets their research needs and matches the requirements for repository ingestion.


Counting Collections

As will be clear to regular readers of this blog, we are concerned here to encourage the creation of the best possible records of small languages. Since much of this work is done by researchers (linguists, musicologists, anthropologists etc.) within academia, there needs to be a system for recognising collections of such records in themselves as academic output. This question is being discussed more widely in academia and in high-level policy documents as can be seen by the list of references given below.

The increasing importance of language documentation as a paradigm in linguistic research means that many linguists now spend substantial amounts of time preparing corpora of language data for archiving. Scholars would of course like to see appropriate recognition of such effort in various institutional contexts. Preliminary discussions between the Australian Linguistic Society (ALS) and the Australian Research Council (ARC) in 2011 made it clear that, although the ARC accepted that curated corpora could legitimately be seen as research output, it would be the responsibility of the ALS (or the scholarly community more generally) to establish conventions to accord scholarly credibility to such products. Here, we report on some of the activities of the authors in exploring this issue on behalf of the ALS and discuss issues in two areas: (a) what sort of process is appropriate in according some form of validation to corpora as research products, and (b) what are the appropriate criteria against which such validation should be judged?

“Scholars who use these collections are generally appreciative of the effort required to create these online resources and reluctant to criticize, but one senses that these resources will not achieve wider acceptance until they are more rigorously and systematically reviewed.” (Willett, 2004)
