Somali phone dictionary

On June 13th Arnold Zable will launch a Somali-English Dictionary app for both Android and iOS, built on the successful Ma! Iwaidja dictionary model. The app is the product of a collaboration between the School of Languages and Linguistics at the University of Melbourne and Burji Arts, a Melbourne-based Somali arts and cultural organisation. It contains some 26,000 Somali words with English equivalents, and audio for selected items so that users can hear words or phrases. Users will also be able to contribute new items and suggest alternate pronunciations.

Opening screen of the Somali dictionary app

Somali speakers around the world will be able to use their smartphones and tablets to record the Somali word for each entry, and to record comments on it. These recordings are uploaded to a website for editing and, after passing quality controls, are added to the database for the next version of the app. The app is available worldwide and is free to download. It will help Somali speakers who are learning English, and English speakers who work with Somali speakers. As others may want to build their own dictionary apps, I’ll give an outline of how the content of the dictionary, in particular the text and the audio, was developed and converted from other formats.


Converting the text

We are very grateful to Dunwoody Press for supplying the dictionary as an editable document, and for giving us permission to re-use the text. As the original was not structured in any way (it was a Microsoft Word document) it was necessary to use clues in the format of the document to impose an explicit structure (required by the app). So, for example, the text looks like this in the dictionary:

Text as it appears in the original dictionary


Bear in mind that this is a huge dictionary (530 pages and over 26,000 headwords), so the conversion had to be done automatically over the whole file and not by hand. The whole file was first put through the OxGarage conversion service, and the snippet in the box here came out something like this:

<text>
<body>
<p><hi rend="bold">aqoonsi</hi> (-ga) <hi rend="italic">vn2a-m, vn7-m</hi> knowledge; recognition, identification, identity Alt: aqoonsasho (/da) [vn-f]</p>
<p><hi rend="bold">aqoonsii</hi> (aqoonsiiyey) <hi rend="italic">v2a=caus</hi> make s.o. acknowledge s.t.; cause to recognize</p>
</body>
</text>

Because each formatted part of an entry is now marked by tags, the headword (bold), the part of speech (italic) and the definition (the plain text that follows) can be identified automatically and put into the format required by the app, as follows:

<p class=EntryParagraph><span class=Lexeme>aqoonsi</span><span class=suffix>(-ga)</span><span class=DefinitionE>knowledge; recognition, identification, identity</span><span class=Alt>aqoonsasho (/da) [vn-f]</span><span class=Partofspeech>vn2a-m, vn7-m</span></p>

<p class=EntryParagraph><span class=Lexeme>aqoonsii</span><span class=Homonym>(aqoonsiiyey)</span><span class=DefinitionE>make s.o. acknowledge s.t.; cause to recognize</span><span class=Partofspeech>v2a=caus</span></p>
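
For anyone wanting to try a similar conversion, the following Python sketch shows one way the step from the tagged output to the app format could be done. It is not the script used for the Somali dictionary. The regular expression, the function names (convert_entry, convert) and the rule that a bracketed form starting with a hyphen is a suffix, while any other bracketed form goes into the Homonym span, are all assumptions based only on the two entries shown above. A real dictionary of 26,000 headwords will contain entry shapes that this pattern does not cover.

```python
import re

# One entry: bold headword, optional bracketed form, italic part of speech,
# then free text holding the definition and an optional "Alt:" section.
ENTRY_RE = re.compile(
    r'<hi rend="bold">(?P<lexeme>.*?)</hi>\s*'
    r'(?P<paren>\([^)]*\))?\s*'
    r'<hi rend="italic">(?P<pos>.*?)</hi>\s*'
    r'(?P<rest>.*)',
    re.DOTALL,
)


def span(cls, text):
    """Wrap text in a span using the unquoted class style of the app format."""
    return f"<span class={cls}>{text}</span>"


def convert_entry(chunk):
    """Convert one tagged entry to an EntryParagraph, or None if no match."""
    m = ENTRY_RE.search(chunk)
    if m is None:
        return None
    definition, _, alt = m.group("rest").partition("Alt:")
    parts = [span("Lexeme", m.group("lexeme").strip())]
    paren = m.group("paren")
    if paren:
        # Assumption: "(-ga)" style forms are suffixes; anything else is
        # placed in the Homonym span, as in the second example entry.
        cls = "suffix" if paren.startswith("(-") else "Homonym"
        parts.append(span(cls, paren))
    parts.append(span("DefinitionE", definition.strip()))
    if alt.strip():
        parts.append(span("Alt", alt.strip()))
    parts.append(span("Partofspeech", m.group("pos").strip()))
    return "<p class=EntryParagraph>" + "".join(parts) + "</p>"


def convert(tagged_text):
    """Convert every entry in the tagged text; entries end at </p>."""
    # Normalise typographic quotes that may have crept into attribute values.
    tagged_text = tagged_text.replace("\u201c", '"').replace("\u201d", '"')
    chunks = tagged_text.split("</p>")
    return [html for html in map(convert_entry, chunks) if html]
```

Calling convert() on the tagged snippet above returns a list with one EntryParagraph string per entry. Chunks that do not match the pattern are silently skipped here, so in practice it would be worth logging them and checking them by hand.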

While more could be done to automate the cross-references and relationships between words in the dictionary, the present app does not provide for this functionality (yet).

Getting the words recorded

Nadia Faragaab, who initiated the Somali dictionary work, read a script of Somali words, recorded it, and then used the script to transcribe the recording with time-alignment. The time-aligned text was then used to segment the audio file into units, each named for the word that it represented (as described in an earlier blog post here) and converted to .m4a format. These filenames were added to the list provided for import to the app.
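
As a rough illustration of the segmentation step, here is a minimal Python sketch that cuts a WAV recording into one file per word using only the standard library. It is not the tool that was actually used. It assumes the time-aligned transcript has been exported as a tab-separated file with a start time in seconds, an end time in seconds and the word on each line, and the function names (read_alignment, segment_wav) are made up for the example. Conversion to .m4a is not shown, as it needs an external encoder.

```python
import wave
from pathlib import Path


def read_alignment(path):
    """Yield (start_s, end_s, word) from a tab-separated file (assumed layout)."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            start, end, word = line.split("\t")
            yield float(start), float(end), word


def segment_wav(wav_path, segments, out_dir):
    """Write one WAV file per segment, named for its word; return the names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(wav_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        for start, end, word in segments:
            first = round(start * rate)               # first frame of the word
            count = max(0, round(end * rate) - first)  # number of frames to copy
            src.setpos(first)
            frames = src.readframes(count)
            # Assumption: the word is safe to use as a filename as it stands.
            out_path = out_dir / f"{word}.wav"
            with wave.open(str(out_path), "wb") as dst:
                dst.setparams(params)    # same channels, sample width and rate
                dst.writeframes(frames)  # header length is corrected on close
            written.append(out_path.name)
    return written
```

The list of names returned by segment_wav() corresponds to the list of filenames supplied for import to the app. Note that two entries with the same spelling would overwrite each other in this sketch, so homonyms would need a distinguishing suffix in their filenames.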

Further information

See the Burji website where you can contribute to the development of the app.

We gratefully acknowledge the support of the University of Melbourne’s Vice Chancellor’s Staff Engagement Awards, the Melbourne Social Equity Institute, the Arts Faculty Engagement Committee and the Research Unit for Indigenous Language. We also acknowledge Dunwoody Press for providing the content of the app.

The app is available at the iTunes app store for Apple and Google Play for Android.

Photos of the Somali app launch (13/6/14) by Fabien Zuffo Deschamps photography 

