Somali phone dictionary

On June 13th Arnold Zable will launch a Somali-English Dictionary app for both Android and iOS platforms, using the successful Ma! Iwaidja dictionary model. This is the product of a collaboration between the School of Languages and Linguistics at the University of Melbourne and Burji Arts, a Melbourne-based Somali arts and cultural organisation. The app contains some 26,000 Somali words and their English equivalents, with audio for selected items so that users can hear words or phrases. The app will also accept input from users, who can contribute items and suggest alternate pronunciations.

Opening screen of the Somali dictionary app

Somali speakers around the world will be able to use their smartphones and tablets to record the Somali word for each entry, as well as to record comments on each entry. These recordings are uploaded to a website for editing and, after passing quality control, can be added to the database for the next version of the app. The app is available worldwide and is free to download. It will help Somali speakers learning English and English speakers working with Somali speakers.

As others may want to build their own dictionary apps, I'll give an outline of how the content of the dictionary, in particular the text and the audio, was developed and converted from other formats.

Converting the text

We are very grateful to Dunwoody Press for supplying the dictionary as an editable document, and for giving us permission to re-use the text. The original was a Microsoft Word document with no explicit structure, so we had to use clues in the formatting of the document (bold headwords, italic parts of speech) to impose the explicit structure required by the app. For example, the text looks like this in the dictionary:

Text as it appears in the original dictionary

Bear in mind that this is a huge dictionary (530 pages and over 26,000 headwords) so the work of converting it had to be done automatically over the whole file and not by hand. The whole work was first put through the conversion service called OxGarage and the snippet in the box here came out something like this:

<text> <body>
<p><hi rend="bold">aqoonsi</hi> (-ga) <hi rend="italic">vn2a-m, vn7-m</hi> knowledge; recognition, identification, identity Alt: aqoonsasho (/da) [vn-f]</p>
<p><hi rend="bold">aqoonsii</hi> (aqoonsiiyey) <hi rend="italic">v2a=caus</hi> make s.o. acknowledge s.t.; cause to recognize</p>
</body> </text>

Having each formatted part of an entry marked by tags like this lets you identify the headword, the part of speech and the definition. These can then be put into the format required by the app, as follows:

<p class=EntryParagraph><span class=Lexeme>aqoonsi</span><span class=suffix>(-ga)</span><span class=DefinitionE>knowledge; recognition, identification, identity</span><span class=Alt>aqoonsasho (/da) [vn-f]</span><span class=Partofspeech>vn2a-m, vn7-m</span></p>

<p class=EntryParagraph><span class=Lexeme>aqoonsii</span><span class=Homonym>(aqoonsiiyey)</span><span class=DefinitionE>make s.o. acknowledge s.t.; cause to recognize</span><span class=Partofspeech>v2a=caus</span></p>
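For anyone wanting to try something similar, here is a minimal sketch in Python of this kind of conversion. It is not the script we used. It assumes that each entry sits in its own <p> element, that the bold span is the headword and the italic span is the part of speech, and that an alternate form is introduced by "Alt:". The function names parse_entry and to_app_html are invented for the illustration, and the rule for telling a suffix from the other bracketed form is only a guess based on the two sample entries above.

```python
import xml.etree.ElementTree as ET
from html import escape

# Two sample entries, assuming each one is wrapped in its own <p> element.
SAMPLE = """<text><body>
<p><hi rend="bold">aqoonsi</hi> (-ga) <hi rend="italic">vn2a-m, vn7-m</hi> knowledge; recognition, identification, identity Alt: aqoonsasho (/da) [vn-f]</p>
<p><hi rend="bold">aqoonsii</hi> (aqoonsiiyey) <hi rend="italic">v2a=caus</hi> make s.o. acknowledge s.t.; cause to recognize</p>
</body></text>"""


def parse_entry(p):
    """Split one <p> element into named fields using the bold/italic cues."""
    entry = {}
    for hi in p.findall("hi"):
        text = (hi.text or "").strip()
        tail = (hi.tail or "").strip()  # text between this tag and the next
        rend = hi.get("rend")
        if rend == "bold":
            entry["Lexeme"] = text
            if tail:
                # Assumption: "(-xx)" is a suffix; any other bracketed form
                # gets the class used in the second sample entry.
                key = "suffix" if tail.startswith("(-") else "Homonym"
                entry[key] = tail
        elif rend == "italic":
            entry["Partofspeech"] = text
            definition, _, alt = tail.partition("Alt:")
            entry["DefinitionE"] = definition.strip()
            if alt.strip():
                entry["Alt"] = alt.strip()
    return entry


def to_app_html(entry):
    """Write the fields out in the span order shown in the samples above."""
    order = ("Lexeme", "suffix", "Homonym", "DefinitionE", "Alt", "Partofspeech")
    spans = [
        f"<span class={cls}>{escape(entry[cls], quote=False)}</span>"
        for cls in order
        if entry.get(cls)
    ]
    return "<p class=EntryParagraph>" + "".join(spans) + "</p>"


if __name__ == "__main__":
    for p in ET.fromstring(SAMPLE).iter("p"):
        print(to_app_html(parse_entry(p)))
```

In a dictionary of this size there will be entries that do not fit the pattern (several senses, missing part of speech, and so on), so a real conversion would also need to log the entries it cannot parse for checking by hand.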

While more could be done to automate the cross-references and relationships between words in the dictionary, the present app does not provide for this functionality (yet).

Getting the words recorded

Nadia Faragaab, who initiated the Somali dictionary work, read a script of Somali words, recorded it, and then used the script to transcribe the recording with time-alignment. The time-aligned text was then used to segment the audio file into units, each named for the word that it represented (as described in an earlier blog post), and the units were converted to .m4a format. These filenames were added to the list provided for import to the app.
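As a rough illustration of the segmentation step, the sketch below builds one ffmpeg command per time-aligned word. This is not necessarily how our files were cut. It assumes that the alignment has already been exported as (start, end, word) tuples in seconds, that ffmpeg is installed, and that the file and folder names are placeholders. The function segment_commands is a name made up for the example.

```python
import subprocess
from pathlib import Path


def segment_commands(source_audio, segments, out_dir):
    """Return one ffmpeg command per (start_sec, end_sec, word) tuple.

    Each clip is named for the word it contains and encoded as AAC in an
    .m4a container.
    """
    commands = []
    for start, end, word in segments:
        out_file = Path(out_dir) / f"{word}.m4a"
        commands.append([
            "ffmpeg", "-y",            # overwrite existing clips
            "-i", str(source_audio),   # the long recording
            "-ss", f"{start:.3f}",     # clip start, in seconds
            "-to", f"{end:.3f}",       # clip end, in seconds
            "-vn", "-c:a", "aac",      # audio only, AAC encoding
            str(out_file),
        ])
    return commands


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical alignment data; real times come from the transcript.
    aligned = [(0.50, 1.32, "aqoonsi"), (2.10, 3.05, "aqoonsii")]
    for cmd in segment_commands("session1.wav", aligned, "clips"):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to spot-check a few clips before running the whole batch with run_all. Headwords containing characters that are not allowed in filenames would need to be renamed before this step.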

Further information

See the Burji website where you can contribute to the development of the app.

We gratefully acknowledge the support of the University of Melbourne’s Vice Chancellor’s Staff Engagement Awards, the Melbourne Social Equity Institute, the Arts Faculty Engagement Committee and the Research Unit for Indigenous Language. We also acknowledge Dunwoody Press for providing the content of the app.

The app is available from the iTunes App Store for iOS and from Google Play for Android.

Photos of the Somali app launch (13/6/14) by Fabien Zuffo Deschamps photography 
