{"id":3855,"date":"2009-12-13T11:53:57","date_gmt":"2009-12-13T11:53:57","guid":{"rendered":"http:\/\/www.paradisec.org.au\/blog\/2009\/12\/finding-language-material-web2-or-wikipedia\/"},"modified":"2011-02-05T07:38:53","modified_gmt":"2011-02-05T07:38:53","slug":"finding-language-material-web2-or-wikipedia","status":"publish","type":"post","link":"https:\/\/www.paradisec.org.au\/blog\/2009\/12\/finding-language-material-web2-or-wikipedia\/","title":{"rendered":"Finding language material, Web2 or Wikipedia?"},"content":{"rendered":"<p>[From <a href=\"http:\/\/www.linguistics.unimelb.edu.au\/thieberger\/\">Nick Thieberger, University of Melbourne<\/a>]<br \/>\nOn the topic of trying to locate material in a small language, I was reading Kaisa Maliniemi&#8217;s 2009 article on the discovery of new linguistic material in Kven and S&aacute;mi in Norway&#8217;s public records archives. She discusses the fact that the records have been publicly available for some time and that a number of researchers must have worked with them in the past, but that this activity left no trace of the fact that the records included considerable amounts of information in these two minority languages. She argues that archives can make available to &#8216;the other&#8217; those voices and knowledge marginalized by the western-dominated global mainstream. But the point that the article made strongly for me is that we should be able to provide a means for tagging such collections so that they can be located by others interested in those languages (this was also a topic at the ELIIP conference reported on by Jane Simpson <a href=\"\/blog\/2009\/11\/our-language-our-flower-day-1-of-eliip\/\">here<\/a> and <a href=\"\/blog\/2009\/11\/concluding-the-eliip-workshop\/\">here<\/a>).<br \/>\nThe suggestion that we can use Wikipedia [in Peter Austin&#8217;s <a href=\"\/blog\/2009\/11\/our-language-our-flower-day-1-of-eliip\/#c581927\">reply<\/a> to Jane&#8217;s blog] is only part of a solution. 
I have put links to South Efate material into a Wikipedia entry <a href=\"http:\/\/en.wikipedia.org\/wiki\/South_Efate_language\">here<\/a> as a way to make the information available. We can, however, do better than the Wikipedia approach, in which an unstructured language page is made by hand rather than being automatically populated from web-based information in Web2 style. Using Web2 technologies, the <a href=\"http:\/\/www.language-archives.org\/\">Open Language Archives Community<\/a> (OLAC) harvests information from participating collections and then establishes a page for every language represented in those collections, like <a href=\"http:\/\/www.language-archives.org\/language\/erk\">this one<\/a>, where the three-letter language code (ISO-639-3) designates the language, in this case &#8216;erk&#8217; = South Efate (Vanuatu). Of course, there are languages without ISO standard codes and they need to be brought into the system too.<br \/>\nA focus of our archive, <a href=\"http:\/\/www.paradisec.org.au\/\">PARADISEC<\/a>, is to make previously unlocatable material available, and we have done this in several ways. The first, and most straightforward, is to provide an online catalog of material in our own collection. The catalog, using standard terms like country names, language names and the metadata given by the Open Language Archives Community, allows depositors to enter their own metadata. For many, this is the first time they have actually systematised their collection. 
Because the catalog is part of the OLAC federation, it is accessible via their search mechanisms, and is also locatable via Google.<br \/>\nSecond, we have made material available by taking scans of around 14,000 pages of notes and placing them online, with enough contextual information to allow them to be located [see <a href=\"http:\/\/paradisec.org.au\/fieldnotes\/AC2.htm\">Arthur Capell&#8217;s notes here<\/a>, or <a href=\"http:\/\/paradisec.org.au\/fieldnotes\/SAW2\/SAW2.htm\">Stephen Wurm&#8217;s notes here<\/a>, or <a href=\"http:\/\/paradisec.org.au\/fieldnotes\/ROES\/web\/roes.htm\">Calvin Roesler&#8217;s notes here<\/a>]. If you look at the OLAC page with South Efate material listed you will also find a number of references and links to Arthur Capell&#8217;s notes which we put online.<br \/>\nThird, we can enter a record in our catalog to make an existing resource more widely available, and, as our catalog is harvested by the Open Language Archives Community, it will then be more generally locatable. For example, George Grace is a linguist who has worked in various parts of the western Pacific, and his fieldnotes have been scanned and <a href=\"http:\/\/digicoll.manoa.hawaii.edu\/grace\/Pages\/PDFlist.html\">put online<\/a> at the University of Hawai&#8217;i (UH) library. If you know that they are there and you search for his name, then you can find them in Google. However, UH makes no provision for standardising language names with the three-letter ISO-639-3 code, which reduces ambiguity in searching. The UH library catalog currently does not list these items, nor does their &#8216;Online resources&#8217; catalog. 
By entering a record into the PARADISEC catalog (<a href=\"http:\/\/azoulay.arts.usyd.edu.au\/paradisec\/edit_item.php?item_pid=GG1-01\">here<\/a>) the information is then propagated through to OLAC:<br \/>\n<a href=\"\/blog\/Waropen%20olac%20search.jpg\"><img loading=\"lazy\" decoding=\"async\" alt=\"Waropen olac search.jpg\" src=\"\/blog\/images\/Waropen%20olac%20search-thumb.jpg\" width=\"376\" height=\"300\" \/><\/a><br \/>\nA Google search for one of the languages mentioned in this collection, &#8216;Waropen&#8217;, locates our record (hit number 3) in OLAC:<br \/>\n<a href=\"\/blog\/Waropen%20google%20olac%20find.jpg\"><img loading=\"lazy\" decoding=\"async\" alt=\"Waropen google olac find.jpg\" src=\"\/blog\/images\/Waropen%20google%20olac%20find-thumb.jpg\" width=\"413\" height=\"300\" \/><\/a><br \/>\nThe item at UH comes in at hit number 57:<br \/>\n<a href=\"\/blog\/Waropen%20hit%20at%20UH%20google%20number%2057.jpg\"><img loading=\"lazy\" decoding=\"async\" alt=\"Waropen hit at UH google number 57.jpg\" src=\"\/blog\/images\/Waropen%20hit%20at%20UH%20google%20number%2057-thumb.jpg\" width=\"409\" height=\"300\" \/><\/a><br \/>\nOLAC&#8217;s language pages are an excellent source of information, and if we can add to each page by providing a fairly minimal pointer in an OLAC-compliant record then that may also solve the problem for the Kven and S&aacute;mi material that Maliniemi discovered.<\/p>\n<hr>\n<p>Maliniemi, Kaisa. 2009. Public records and minorities: problems and possibilities for S&aacute;mi and Kven. <em>Archival Science<\/em>. Vol. 9, Numbers 1-2: 15-27 DOI 10.1007\/s10502-009-9104-3<\/p>\n","protected":false},"excerpt":{"rendered":"<p>[From Nick Thieberger, University of Melbourne] On the topic of trying to locate material in a small language, I was reading Kaisa Maliniemi&#8217;s 2009 article on the discovery of new linguistic material in Kven and S&aacute;mi in Norway&#8217;s public records archives. 
She discusses the fact that the records have been publicly available for some time &#8230; <a title=\"Finding language material, Web2 or Wikipedia?\" class=\"read-more\" href=\"https:\/\/www.paradisec.org.au\/blog\/2009\/12\/finding-language-material-web2-or-wikipedia\/\" aria-label=\"Read more about Finding language material, Web2 or Wikipedia?\">Read more<\/a><\/p>\n","protected":false},"author":13,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[9],"tags":[],"class_list":["post-3855","post","type-post","status-publish","format-standard","hentry","category-archiving"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/posts\/3855","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/users\/13"}],"replies":[{"embeddable":true,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/comments?post=3855"}],"version-history":[{"count":1,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/posts\/3855\/revisions"}],"predecessor-version":[{"id":4081,"href":"https:\/\/www.paradisec.org.au\/blo
g\/wp-json\/wp\/v2\/posts\/3855\/revisions\/4081"}],"wp:attachment":[{"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/media?parent=3855"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/categories?post=3855"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.paradisec.org.au\/blog\/wp-json\/wp\/v2\/tags?post=3855"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}