Rights, responsibilities, and data duffers

In a recent blog post James McElvenny presents a broad-ranging discussion about copyright, in response to my earlier post about the use of materials from my published work without attribution by the PanLex project. James covers a lot of ground and brings in many different aspects, including his frustrations that he “can’t play region-coded DVDs that [he] … bought in Europe on [his] Australian DVD player” and “‘licence agreements’ consisting of several thousand words of legalese gibberish”. I won’t attempt to cover all the topics he mentions (indeed, I feel unqualified to do so) but I do want to go back to the narrower issue of rights and responsibilities in relation to (linguistic) research that I originally raised, and to highlight a sociological aspect that seems to have been missed in the discussion so far.

Yesterday I was fortunate to be invited to attend a seminar organised by the UK Arts and Humanities Research Council (AHRC) Beyond Text project on the topic “Beyond Copyright”. The seminar brought together 20 scholars and researchers from the UK and elsewhere in Europe representing a range of disciplines in arts and humanities, media, archiving, and intellectual property (IP) law. I found it an incredibly stimulating day where I learned a lot, especially about the complexities of IP, copyright, and the massive economic and social changes currently going on in this area. It also helped me to clarify my thinking about what it is that I was trying to present in my original blog post published on 4th April.

Let’s start with copyright. Yesterday’s presentation by Jeremy Silver, self-described “entrepreneur, digital media adviser and thought-leader”, set out some of the basic issues that are relevant to our current discussion. He explained that copyright exists to:

  1. reward creators, and
  2. provide consumer access

through control of:

  • reproduction
  • attribution
  • remuneration
  • integrity

using mechanisms for encoding and managing:

  1. primary rights (and moral rights)
  2. secondary exploitation rights
  3. licensing

There are tensions between these three aspects that get played out through the economic system — these tensions have been exacerbated by the rise of the internet and the current rapidly changing digital landscape.

I would like to clarify that my original blog post was about copyright as a means for dealing with primary rights (and moral rights) with a particular emphasis on reproduction, attribution and integrity. I was not writing about remuneration, as I made clear in my comment responding to James’ comment on my post.1 I am not averse to PanLex reproducing Diyari materials and including them in their database2; what I object to is lack of attribution. I am also concerned that materials are being presented in an orthography that is 30 years old and does not represent the spelling that has been agreed with the Dieri Aboriginal Corporation for current use. If PanLex had consulted me, I would have suggested revising and updating the included data to the current spellings, thus improving the integrity of my and the Dieri community’s contribution to their work. (At the time I posted I also sent an email to Emily Bender who is a Board member of Utilika Foundation and she promised to take these issues up with its Director.) There may be similar problems with other data in the PanLex collection and I would encourage interested readers to check on their favourite languages from this perspective.3

Now for the sociology. Over the past several years, linguistics as a discipline has been undergoing a range of socio-cultural shifts, including changing emphases on the rôle of primary data collection and corpus curation, especially in terms of their relationship to linguistic analysis and theorisation. This is reflected in the impact of seminal works like Bird and Simons (2003)4, last year’s LSA motion (see also Jane Simpson’s report) on recognising the scholarly merit of language documentation, and the growth of digital language archives like ELAR at SOAS that now has terabytes of documentation materials available for registered users to browse and download (subject to the ELAR terms and conditions of use5). These changes and developments can be encouraged and protected so long as researchers feel that their work will be properly recognised and referenced, especially when employed by others in their own research and publications. Indeed, I have heard on numerous occasions expressions of reluctance by scholars unwilling to deposit their data and analyses in an archive because of a fear of being “ripped off”. Such fears are real and need to be addressed. One way of doing so is to publicly highlight instances of apparent abuse, as I did in the Prof Parker case and Nick Thieberger did in the Stolen Grammars case. As Nick pointed out:

“there has to be a mechanism for recognising creative effort, otherwise no-one will put their work online. ‘Stolen Grammars’ did not link to existing open access resources, but copied them without proper attribution. … Linguistic archives rely on the good faith of those signing agreements about how they will use data from the archive. Depositors have a right to trust that the material they deposit will not be misused.”

So, I believe that Nick Thieberger, David Nash, and I (and possibly others6) are being vigilant and following up on cases of ‘data duffers’7 like these in a spirit of service to the field, aimed at ensuring that copyright and licensing agreements are respected, and that fears are allayed for existing and potential contributors/publishers about their rights being violated and materials being misused. I see it as a way to encourage more and better access to linguistic research, not what James calls “a greater sin” of “[u]sing copyright to stop or hinder other research projects”.

At yesterday’s workshop there were repeated calls for ‘creative thinking’ about IP and copyright. I applaud the President of the Utilika Foundation for setting aside “funds for legal services in the 2012 budget [that] reflects an assumption that intellectual-property issues, as well as contractual issues more generally, will likely become more complex as resource deployment progresses”8 and I encourage him to engage in creative thinking about new mechanisms for licensing and attribution arrangements for the PanLex project, rather than complaining that “creators of many resources assert rights that, taken literally, would prohibit a person reading a resource from later making use of what he or she had learned from it”. I am confident that if the right environment can be established, I and others will happily contribute to this and other projects. It’s the way of the future.


  1. In his post, James says I “would presumably not want anyone to make money from it (except his publisher, Cambridge University Press)”. In fact, my 1981 book is now out of print, and 20 years ago I asked CUP to cede to me the copyright on the book, which they did, so they currently have no rights in the work and remuneration is not an issue for them, or indeed, anyone.
  2. In fact I have contributed Sasak materials to the Austronesian Basic Vocabulary Database, which contribution is fully acknowledged in the relevant entry.
  3. Angela Terrill has commented to me via Facebook: “I had a quick look and found they have Lavukaleve data there, but in an orthography I have not seen. Don’t know where they got it from”.
  4. Bird, Steven and Gary Simons. 2003. Seven dimensions of portability for language documentation and description. Language 79(3): 557-582.
  5. written in 10 sentences of plain English, not “legalese gibberish”
  6. The ELAR Terms and Conditions of Use explicitly state: “I understand that ELAR may take legal action on behalf of owners of materials in the case of serious infringement of this agreement.”
  7. My use of “duffers” here is intended as a pun, relying on its polysemy in Australian English. For US readers, one sense, as in Australian “cattle duffers”, corresponds to your “cattle rustlers”.
  8. see here

3 thoughts on “Rights, responsibilities, and data duffers”

  1. David Nash has reminded me that David Nathan and Nick Thieberger were involved in discussions back in 2002 with the creator of a website called “The House of Small Languages” that had incorporated copyright materials on Australian Aboriginal languages without attribution. This resulted in the shutdown of the website; however, courtesy of the Wayback Machine, we can catch a glimpse of its dying days, expressed in rather incoherent English:

    “From the first of May until 15th. september, the monument for languages threatened with extinction: “the House of the Small Languages” had to be closed down. … the obstacles to its set up forwarded by many specialists, linguists, ethnologists and anthropologists have rendered any pursuit of the enterprise impossible.”

    Like James McElvenny, the author blamed copyright, as well as “those who, by theats and legal proceedings, try to impose prohibitions”, missing the point about the importance of attribution and recognition of moral rights.

  2. I’m glad to see the discussion coming out from all sides in this series of blog posts and accompanying comments. It’s also good to see that in the past few days we’ve progressed from

    Well, as far as I can see there is no “complex[ity]” surrounding “intellectual-property issues” here — the Diyari materials (and possibly lots more on lots more languages) are copyright and subject to fair dealing. Anything else is theft.

    (original post)

    to


    I applaud the President of the Utilika Foundation for setting aside “funds for legal services in the 2012 budget [that] reflects an assumption that intellectual-property issues, as well as contractual issues more generally, will likely become more complex as resource deployment progresses”…

    (this post)

    I still wonder if this discussion would be better conducted in terms of academic ethics and conduct rather than in terms of copyright, considering the bad associations copyright has and the outright abuses it is being used to perpetrate. This was one of the points raised in my response to the original post, which Peter Austin has chosen to ignore here, along with my discussion of attribution in the particular case of PanLex and Diyari.

    It would also perhaps be nice to hear some more discussion of the separate but related issue of rights of actual language speakers rather than just the interests of academics.

  3. David Nathan has reminded me that I omitted an important sociological point in my post above. He says: “it is not only researchers who wish to “feel that their work will be properly recognised and referenced” but also that community members want recognition, control, reporting, benefit etc – and also that these two constituencies compound each other in that (some) documenters feel that their reputation, moral stance, continued relations etc with communities depends on what they do to protect and/or manage materials on communities’ behalf. There is a way of seeing this as patronising, of course, but it is a better reality than a “system” that does not factor in these wishes and expressions at all.”
