A WEBSITE IS NOT AN ARCHIVE!!!!!!

I had a message from the ‘pop up archive’ to say they are closing down and that I should download my data. It was a website that allowed users to upload audio files, which were then meant to be prepared for searching via automated recognition of features in the file.

Leaving aside the functionality of the site (I admit I did not get it to work with my files), I want to reiterate my frustration with websites that call themselves archives (ok, so in this case the title ‘pop up’ should have been a giveaway), only to disappear at the end of a funding cycle or the retirement of the researcher.

This frustration is also motivated in part by a recent project in which I compared languages that have little representation in the OLAC listing of holdings in the world’s language archives (see the earlier discussion of this here) but have had a grammar written recently. If a linguist has worked on a language in the past thirty or so years, then it is reasonable to expect that some primary records were produced and that they should be in an archive. They may be in a repository that is not part of OLAC, in which case we can create a record to point to that collection. If they are not in any archive, the task is to ask the linguist if they need help to get the records into an archive. At PARADISEC we have been doing this, partly through our ‘Lost and Found’ survey, which has resulted in a number of collections of analog tapes being digitised and made available.

When I sent out a message to each of the authors of these grammars asking about the location of primary records, the responses were split between those who have made provision for their records in an archive that may or may not be in OLAC, those who have put some examples on a website (and apparently consider that to be an archive), and those who do not think they need to do anything at all. The vast majority did not respond at all. The problem seems to be that most people involved in documenting languages do not prioritise archiving of their primary records.

The following is a useful guide to archives, produced by Susan Kung at the Digital Endangered Languages and Musics Archives Network (DELAMAN): Finding an Archive for your (Endangered) Language Research Data

The PARADISEC Deposit page also discusses archival formats for files.

A WEBSITE IS NOT AN ARCHIVE!!!!!!

Archives curate files by:
– applying standards for data formats both to ensure longevity and to migrate files to new formats over time
– using community-agreed metadata standards that export to the Open Language Archives Community (to increase findability)
– providing backups in several locations
– providing access conditions for the contents of the collection as specified by the depositor in a deposit agreement
– providing persistent identification of the parts of the collection
– making items available in formats suitable for web-delivery (downsampled versions)
– providing a catalog that uses language identifiers and other terms for finding participant names, their roles, the place associated with the records, and when they were produced, and that may also allow for parts of the catalog to be written in the language in question.
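
To make a few of the points above concrete, here is a minimal sketch in Python of what a catalog entry and a check on backup copies might look like. Everything in it is an assumption for illustration: the `CatalogRecord` fields, the `verify_copies` helper, the identifier and the location names are hypothetical, and this is not the schema or workflow of OLAC, PARADISEC or any other archive. It only shows the general idea that an item carries a persistent identifier, descriptive metadata and a checksum recorded at deposit, so that copies held in several locations can be compared against it later.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CatalogRecord:
    """Hypothetical catalog entry; field names are illustrative only."""
    identifier: str               # persistent identifier for the item
    language_code: str            # language identifier, e.g. an ISO 639-3 code
    participants: dict[str, str]  # participant name -> role
    place: str                    # place associated with the record
    date_created: str             # when the record was produced
    access_conditions: str        # as specified in the deposit agreement
    sha256: str                   # checksum recorded when the file was deposited


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def verify_copies(record: CatalogRecord, copies: dict[str, bytes]) -> list[str]:
    """Return the backup locations whose copy no longer matches the record."""
    return [
        location
        for location, data in copies.items()
        if checksum(data) != record.sha256
    ]


# Example with made-up values: one intact copy and one damaged copy.
audio = b"stand-in for the bytes of an archival audio file"
record = CatalogRecord(
    identifier="EXAMPLE-001",     # hypothetical identifier
    language_code="qaa",          # placeholder from the ISO 639 local-use range
    participants={"Speaker A": "speaker", "Linguist B": "recorder"},
    place="Example village",
    date_created="1998-07-14",
    access_conditions="open, subject to deposit agreement",
    sha256=checksum(audio),
)
damaged = verify_copies(record, {"site_a": audio, "site_b": audio[:-1]})
print(damaged)
```

A website that simply hosts files has none of this: when it closes, there is no identifier that keeps resolving, no second copy, and no recorded checksum against which a rescued copy could be verified.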

One Comment

  1. This is outrageous and I give your analysis my total support, Nick. It is terribly important that the knowledge and effort that has gone into this important and highly respected body of work be retained, and I urge those involved in the so-called “pop-up” archive to reconsider their decision to remove it as a matter of urgency. Aboriginal languages are already in dire straits in terms of the prevalent attitudes towards cultural and historical amnesia and such an act entrenches this possibility.

    It is redolent of Fahrenheit 451, via digital means.

    I hope that those responsible for the maintenance of this site will come to their senses and make efforts to retain this historically important site, which will remain an important resource not for the next decade or so, but for centuries into the future.
