{"id":3726,"date":"2008-09-07T22:19:56","date_gmt":"2008-09-07T22:19:56","guid":{"rendered":"http:\/\/www.paradisec.org.au\/blog\/2008\/09\/glossed-texts-the-fiddle-factor\/"},"modified":"2011-02-05T07:49:19","modified_gmt":"2011-02-05T07:49:19","slug":"glossed-texts-the-fiddle-factor","status":"publish","type":"post","link":"https:\/\/www.paradisec.org.au\/blog\/2008\/09\/glossed-texts-the-fiddle-factor\/","title":{"rendered":"Glossed texts &#8212; the fiddle factor"},"content":{"rendered":"<p>In a recent <a href=\"\/blog\/2008\/09\/grammarwriting-group-2-general-properties\/\">blog post<\/a>, Jane Simpson reported on opinions expressed by a group at ANU meeting to discuss grammar writing:<\/p>\n<blockquote><p>&#8220;We all agree it&#8217;s a good thing to publish glossed texts so that readers can check out the hypotheses proposed in the grammar, and expressed by the glossing.&#8221;<\/p><\/blockquote>\n<p>I&#8217;d like to inject a note of caution here. It seems to me that published texts, with or without interlinear glossing, and especially those that derive from transcriptions of spoken language, have often been fiddled with (or, to put it more politely, &#8216;edited&#8217;) on their way from recording to printed page. This is also often true of published texts that are based on written originals produced by literate native speakers. 
It is rarely the case that, as Wamut commented about Jeffrey Heath&#8217;s work on <a href=\"http:\/\/linguistlist.org\/forms\/langs\/LLDescription.cfm?code=nid\">Ngandi<\/a> at the end of Jane&#8217;s blog post:<\/p>\n<blockquote><p>\n&#8220;What is especially great, is that when you go back to Heath&#8217;s archived field recordings, the spoken texts are there in pristine form, that is, the spoken text and written text <b>correlate perfectly<\/b>&#8221; [emphasis added]<\/p><\/blockquote>\n<p>Heath adopted the same principle of &#8220;perfect correlation&#8221; in his published work on other languages, such as his 1980 <a href=\"http:\/\/catalogue.nla.gov.au\/Record\/19825\">Nunggubuyu Myths and Ethnographic Texts<\/a>, which clearly states in the introduction: &#8220;in the texts presented here I have not &#8216;weeded out&#8217; false starts, intrusive English words, or grammatical errors by the narrators&#8221;.<br \/>\nIn many other cases of text publication, I know editing has taken place &#8212; I have done it myself, and some other researchers have admitted to it (though rarely indicating <b>exactly what<\/b> editorial changes were made &#8212; more on this below). The texts in my 1997 book <i>Texts in the Mantharta Languages, Western Australia<\/i> [Tokyo: ILCAA, Tokyo University of Foreign Studies] were heavily edited, though I didn&#8217;t mention that in print at the time. It was only when it came to creating a multimedia <a href=\"http:\/\/www.linguistics.unimelb.edu.au\/research\/jiwarli\/index.html\">Jiwarli website<\/a>, where both published texts and original recordings were presented, that I had to <a href=\"http:\/\/www.linguistics.unimelb.edu.au\/research\/jiwarli\/stories.html\">confess<\/a>: &#8220;[y]ou may also notice that the Jiwarli texts are not word for word identical to the sound files, as Jack Butler, after recording the stories, made his own corrections in the texts&#8221;. 
There was no attempt to deceive here; rather, it was Jack&#8217;s explicit wish that the stories be edited for publication.<br \/>\nAs an example, consider published <i>Text 50<\/i> (which appears on the website <a href=\"http:\/\/www.linguistics.unimelb.edu.au\/research\/jiwarli\/ethno.html\">here<\/a>) and the way it corresponds to the original recording (<u>underlining<\/u> indicates material on the tape which was deleted in the editing process, <b>bold<\/b> indicates text added during editing, and {x == y} indicates substitution during editing):<\/p>\n<p><!--more--><br \/>\nNhukuramartuthu ngurrunyjarri julyumartu ngunha nhanyaartu {<u>porcupinemanha<\/u> == <b>jiriparrinha<\/b>} puniyanha. {<u>porcupine<\/u> == <b>Jiriparri<\/b>} ngunha jakuparlarrirarru. Ngurntirarri jakuparlarru parnajipi<u>thu<\/u> ngunha warrirru nhanyapuka. Ngurrunyjarrilu yarnararnilaartu ngurntapuka ngunha<u>pa<\/u> jakuparla. Wangkirarringu. Yarnararrima nhurra. <u>Yarnararrima nhurra<\/u>. Ngatha {<b>nhurranha<\/b> murrurrpa manara <u>nhurranha<\/u>}. <u>Yarnararrima<\/u>. Ngatha {<b>nhurranha<\/b> murrurrpa manara <u>nhurranha<\/u>}. <u>Yarnararrima. Ngatha murrurpa manara nhurranha<\/u>. Kunyarnurru ngunha kumpanhu. {<u>Porcupinemanha<\/u> == <b>Jiriparri<\/b>} ngunha kurlkanyunthurru yarnararrira. <u>When he<\/u> Yarnararrira<u>thu<\/u> parnarru thangkalpuka wurungku wirntupinyangurru pirrurru yanararri thikaru.<br \/>\nThe editorial changes that Jack and I made are the following:<\/p>\n<ul>\n<li>replacement of the loan word &#8216;porcupine&#8217; with the indigenous word <i>jiriparri<\/i>, and deletion of the English expression &#8216;when he&#8217;\n<li>omission of the enclitics <i>-thu<\/i> &#8216;old information&#8217; and <i>-pa<\/i> &#8216;specific referent&#8217;, in order to decontextualise reference\n<li>omission of repetition: three repeats of &#8216;Lie on your back. 
I&#8217;ll get you cicatrices&#8217;\n<li>reordering of constituents: the possessor &#8216;your&#8217; and &#8216;cicatrices&#8217; are separated on the tape but were made adjacent in the editing for publication<\/ul>\n<p>\nWamut also mentions in his comment on Jane&#8217;s post another possible way in which published texts can differ from recordings:<\/p>\n<blockquote><p>&#8220;I&#8217;ve heard other spoken texts vary from the published text because the field worker has interrupted the speaker for clarification etc.&#8221;<\/p><\/blockquote>\n<p>There are also cases I know of where speakers &#8220;interrupt&#8221; themselves. My colleague <a href=\"http:\/\/www.hrelp.org\/aboutus\/staff\/index.php?cd=davidnathan\">David Nathan<\/a> tells me that when he was working with Luise Hercus to produce a multimedia CD-ROM of Baagandji materials, he found Luise&#8217;s audio recordings of stories also contained interpolations and explanations in English by the speaker which do not appear in the published texts.<br \/>\nI think descriptive linguists and language documenters could well take some guidance in this area from the work of epigraphers who have been developing a <a href=\"http:\/\/www.tei-c.org\/release\/doc\/tei-p4-doc\/html\/\">TEI\/XML<\/a> markup for epigraphy called <a href=\"http:\/\/epidoc.sourceforge.net\/\">EpiDoc<\/a>. 
Some of the EpiDoc proposals are concerned with adaptation of the TEI guidelines to deal with a range of issues such as legibility of characters on stone, missing elements or partially represented signs, but in addition there are several issues that I think should equally be of concern to language documentation:<\/p>\n<ul>\n<li>additions and deletions to the text\n<li>editorial supplements, observations, and hypotheses, including:\n<ul>\n<li>identification and expansion of abbreviations understood by the editor\n<li>identification of abbreviations not understood by the editor\n<li>editorial supplement in which the editor makes a &#8220;subaudible&#8221; word manifest\n<li>editorial supplement in which the editor explains a &#8220;breviatio&#8221; or note\n<li>editorial supplement for characters wholly lost\n<li>letters omitted because the stonecutter did not carry out the text to the end\n<\/ul>\n<li>editorial corrections\n<ul>\n<li>letters erroneously included in the text, which the editor suppresses\n<li>letters erroneously omitted from the text, which the editor adds\n<li>letters erroneously substituted in the text, which the editor corrects <\/ul>\n<\/ul>\n<p>\nThe EpiDoc guidelines contain explicit recommendations on how to encode these as markup annotations to the text. For work on endangered languages, I think there are some additional aspects that should be encoded, especially because we typically need to distinguish at least three participants in the process of creating a published text, namely the original speaker, the transcriber, and the linguist-editor. 
We should pay attention to:<\/p>\n<ul>\n<li>encoding code-switching, code-mixing and borrowing, ideally by coding for the language (or variety) of the items transcribed\n<li>puristic editorial amendments on the part of the transcriber\n<li>puristic editorial amendments on the part of the linguist\n<li>deletions by the transcriber\n<li>additions by the transcriber\n<li>reorderings by the transcriber\n<li>additions and clarifications (editorial comments) by the linguist-editor\n<li>when the transcriber is not the originally recorded speaker, we need to deal with (1) inter-speaker variation at the dialect or idiolect level and (2) inter-speaker variation arising from language loss, e.g. phonemic or grammatical reduction among semi-speakers in a later generation transcribing earlier recorded texts<\/ul>\n<p>To my mind, only when linguists make available marked-up documents encoding these aspects along with the published texts, <b>and<\/b> the original media recordings (ideally publicly available through an archive or distributed on CD or DVD along with the published texts), can we truly start talking about &#8220;falsifiability&#8221; of grammars and other analytical claims about languages. The &#8220;published texts&#8221; alone are often simply not enough.<\/p>\n<hr>\n<p>\n<b>Notes<\/b>:<br \/>\n1. The ideas presented here have been fermenting since they were first publicly presented at an <a href=\"http:\/\/www.hrelp.org\/events\/workshops\/elap2005\/\">ELAP Workshop<\/a> at SOAS in February 2005. 
At the <a href=\"http:\/\/www.caicyt.gov.ar\/eventos\/ii-simposio-internacional-documentacion-lingueistica-y-cultural-en-america-latina\">Simposio Internacional: Contacto de Lenguas y Documentaci\u00f3n<\/a> (International Symposium on Language Contact and Documentation) held in Buenos Aires <a href=\"\/blog\/2008\/08\/indigenous-languages-in-argentina\/\">last month<\/a>, Ulrike Mosel presented a paper entitled &#8220;Putting oral narratives into writing: experiences from a language documentation project in Papua New Guinea&#8221; in which she explored the issue of editing recorded <a href=\"http:\/\/www.mpi.nl\/DOBES\/projects\/teop\">Teop<\/a> texts for publication. She independently identified many of the same issues I outline here.<br \/>\n2. I have been unable to find any discussion of the importance of explicit encoding of transcriptional and analytical editing decisions among the list of &#8220;best practices&#8221; promoted, e.g. by <a href=\"http:\/\/emeld.org\/school\/what.html\">the E-MELD School of Best Practice<\/a>, despite the fact that, to me at least, they play an important role in &#8220;practices which are intended to make digital language documentation optimally longlasting, accessible, and re-usable by other linguists and speakers&#8221;.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In a recent blog post, Jane Simpson reported on opinions expressed by a group at ANU meeting to discuss grammar writing: &#8220;We all agree it&#8217;s a good thing to publish glossed texts so that readers can check out the hypotheses proposed in the grammar, and expressed by the glossing.&#8221; I&#8217;d like to inject a note &#8230; <a title=\"Glossed texts &#8212; the fiddle factor\" class=\"read-more\" href=\"https:\/\/www.paradisec.org.au\/blog\/2008\/09\/glossed-texts-the-fiddle-factor\/\" aria-label=\"Read more about Glossed texts &#8212; the fiddle factor\">Read 
more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[5],"tags":[],"class_list":["post-3726","post","type-post","status-publish","format-standard","hentry","category-linguistics"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/posts\/3726","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/comments?post=3726"}],"version-history":[{"count":2,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/posts\/3726\/revisions"}],"predecessor-version":[{"id":4517,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/posts\/3726\/revisions\/4517"}],"wp:attachment":[{"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/media?parent=3726"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/categories?post=3726"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.
paradisec.org.au\/blog\/wp-json\/wp\/v2\/tags?post=3726"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}