On June 13th Arnold Zable will launch a Somali-English Dictionary app for both Android and iOS platforms, built on the successful Ma! Iwaidja dictionary model. The app is the product of a collaboration between the School of Languages and Linguistics at the University of Melbourne and Burji Arts, a Melbourne-based Somali arts and cultural organisation. It contains some 26,000 Somali words and their English equivalents, with audio for selected items so that users can hear words or phrases. It will also accept input from users, who can contribute items and suggest alternate pronunciations.
After passing quality controls, these additions can be added to the database. That is, Somali speakers around the world will be able to use their smartphones and tablets to record the Somali word for each entry, as well as comments on each entry. The recordings are then uploaded to a website for editing before being added to the next version of the app. The app is available worldwide and is free to download. It will help Somali speakers learning English and English speakers working with Somali speakers. As others may want to build their own dictionary apps, I’ll outline how the content of the dictionary, in particular the text and the audio, was developed and converted from other formats.
Converting the text
We are very grateful to Dunwoody Press for supplying the dictionary as an editable document, and for giving us permission to re-use the text. As the original was not structured in any way (it was a Microsoft Word document) it was necessary to use clues in the format of the document to impose an explicit structure (required by the app). So, for example, the text looks like this in the dictionary:
Bear in mind that this is a huge dictionary (530 pages and over 26,000 headwords) so the work of converting it had to be done automatically over the whole file and not by hand. The whole work was first put through the conversion service called OxGarage and the snippet in the box here came out something like this:
<text>
<body>
<p><hi rend="bold">aqoonsi</hi> (-ga) <hi rend="italic">vn2a-m, vn7-m</hi> knowledge; recognition, identification, identity Alt: aqoonsasho (/da) [vn-f]</p>
<p><hi rend="bold">aqoonsii</hi> (aqoonsiiyey) <hi rend="italic">v2a=caus</hi> make s.o. acknowledge s.t.; cause to recognize</p>
</body>
</text>
Having each formatted part of an entry marked by tags like this makes it possible to identify the headword, the part of speech and the definition, and these can then be put into the format required by the app, as follows:
<p class=EntryParagraph><span class=Lexeme>aqoonsi</span><span class=suffix>(-ga)</span><span class=DefinitionE>knowledge; recognition, identification, identity</span><span class=Alt>aqoonsasho (/da) [vn-f]</span><span class=Partofspeech>vn2a-m, vn7-m</span></p>
<p class=EntryParagraph><span class=Lexeme>aqoonsii</span><span class=Homonym>(aqoonsiiyey)</span><span class=DefinitionE>make s.o. acknowledge s.t.; cause to recognize</span><span class=Partofspeech>v2a=caus</span></p>
While more could be done to automate the cross-references and relationships between words in the dictionary, the present app does not provide for this functionality (yet).
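For anyone attempting a similar conversion, here is a minimal sketch in Python of the step from the tagged output to the app format. It is not the script we used. It assumes each entry sits in its own <p> element with no XML namespace, that the bold run is the headword and the italic run is the part of speech, and that a bracketed form starting with a hyphen is a suffix while any other bracketed form goes in the Homonym class (a guess based only on the two entries shown above). The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Two sample entries in the shape shown above (namespace omitted by assumption).
SAMPLE = """<text><body>
<p><hi rend="bold">aqoonsi</hi> (-ga) <hi rend="italic">vn2a-m, vn7-m</hi> knowledge; recognition, identification, identity Alt: aqoonsasho (/da) [vn-f]</p>
<p><hi rend="bold">aqoonsii</hi> (aqoonsiiyey) <hi rend="italic">v2a=caus</hi> make s.o. acknowledge s.t.; cause to recognize</p>
</body></text>"""


def span(css_class, text):
    """Wrap text in a span with the given class, escaping &, < and >."""
    return f"<span class={css_class}>{escape(text, quote=False)}</span>"


def convert_entry(p):
    """Turn one <p> entry into an app-style paragraph, or None if it
    lacks the bold headword or the italic part of speech."""
    bold = p.find("hi[@rend='bold']")
    italic = p.find("hi[@rend='italic']")
    if bold is None or italic is None:
        return None

    headword = (bold.text or "").strip()
    bracket = (bold.tail or "").strip()      # text between headword and POS
    pos = (italic.text or "").strip()
    # Everything after the POS is the definition, optionally followed by "Alt:".
    definition, _, alt = (italic.tail or "").partition("Alt:")

    parts = [span("Lexeme", headword)]
    if bracket:
        # Assumption: "(-ga)" style is a suffix, anything else is Homonym.
        css_class = "suffix" if bracket.startswith("(-") else "Homonym"
        parts.append(span(css_class, bracket))
    parts.append(span("DefinitionE", definition.strip()))
    if alt.strip():
        parts.append(span("Alt", alt.strip()))
    parts.append(span("Partofspeech", pos))
    return "<p class=EntryParagraph>" + "".join(parts) + "</p>"


def convert(tei_xml):
    """Convert every well-formed entry in the document, skipping the rest."""
    root = ET.fromstring(tei_xml)
    entries = (convert_entry(p) for p in root.iter("p"))
    return [entry for entry in entries if entry is not None]


if __name__ == "__main__":
    for line in convert(SAMPLE):
        print(line)
```

In a file of 26,000 headwords there will be entries that do not fit the pattern, so in practice it is worth logging the paragraphs that convert_entry skips and checking those by hand.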
Getting the words recorded
Nadia Faragaab, who initiated the Somali dictionary work, read a script of Somali words, recorded it, and then used the script to transcribe the recording with time-alignment. The time-aligned text was then used to segment the audio file into units, each named for the word that it represented (as described in an earlier blog post here) and converted to .m4a format. These filenames were added to the list provided for import to the app.
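The segmentation step can be sketched roughly as follows. This is an illustration rather than the exact procedure we followed: it assumes the time-aligned transcript has been exported as tab-separated lines of start time, end time and word (in seconds), and that the cutting and conversion to .m4a is done with the ffmpeg command-line tool. The function names and the "clips" output folder are hypothetical. The sketch only builds the commands, so they can be inspected before anything is run.

```python
import re
from pathlib import Path


def parse_labels(label_text):
    """Parse 'start<TAB>end<TAB>word' lines into (start, end, word) tuples."""
    segments = []
    for line in label_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        start, end, word = line.split("\t", 2)
        segments.append((float(start), float(end), word.strip()))
    return segments


def safe_filename(word):
    """Name the clip after the word, replacing spaces and punctuation."""
    stem = re.sub(r"[^\w-]+", "_", word.strip()).strip("_")
    return stem + ".m4a"


def ffmpeg_commands(source_audio, segments, out_dir="clips"):
    """Build one ffmpeg command per segment (assumes ffmpeg is installed)."""
    commands = []
    for start, end, word in segments:
        target = (Path(out_dir) / safe_filename(word)).as_posix()
        commands.append([
            "ffmpeg", "-i", source_audio,
            "-ss", f"{start:.3f}",   # start of the word in the recording
            "-to", f"{end:.3f}",     # end of the word in the recording
            "-c:a", "aac",           # AAC audio, the usual codec in .m4a files
            target,
        ])
    return commands


if __name__ == "__main__":
    labels = "0.50\t1.20\taqoonsi\n1.80\t2.65\taqoonsii\n"
    for command in ffmpeg_commands("session.wav", parse_labels(labels)):
        print(" ".join(command))
```

Each command can then be passed to subprocess.run, and the list of generated filenames is what gets added to the import list for the app.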
See the Burji website where you can contribute to the development of the app.
We gratefully acknowledge the support of the University of Melbourne’s Vice Chancellor’s Staff Engagement Awards, the Melbourne Social Equity Institute, the Arts Faculty Engagement Committee and the Research Unit for Indigenous Language. We also acknowledge Dunwoody Press for providing the content of the app.
The app is available at the iTunes app store for Apple and Google Play for Android.
Photos of the Somali app launch (13/6/14) by Fabien Zuffo Deschamps photography