Following on from the progress on speech recognition for Bislama and Nafsan reported two posts ago, Aso Mahmudi has now created a desktop app (called Easper – Elan Automated Speech Recognition) that takes a WAV file as input, segments it, performs speaker diarisation, and transcribes it, delivering an Elan file as the output. All this is done on a laptop computer and needs no internet connection.
As can be seen in the image below, you can set the number of speakers, and you can adjust the silence recognition (‘gap between segments’) and the minimum segment length to suit the characteristics of the recording and the rate of speech. You select the language model (which has to have been created in advance) and then run the process. For example, with a 44-minute file the first pass of segmentation took less than three minutes, and transcription took seven minutes on my laptop (MacBook Pro 2021, 16 GB RAM, System 26.3). The output always needs checking and correcting, but it is remarkably good (as reported earlier, around 10% character error rate for Nafsan).
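To give a feel for what the two segmentation settings do, here is a minimal Python sketch. It is not Easper's code, and how Easper actually segments is not described in this post. It simply assumes that a voice activity detector has already produced a list of speech regions, then merges regions separated by a pause shorter than the gap setting and discards anything shorter than the minimum length. The function name `merge_and_filter`, its parameters, and the sample timings are all made up for illustration.

```python
from typing import List, Tuple

# A speech region as (start_seconds, end_seconds).
Segment = Tuple[float, float]


def merge_and_filter(speech: List[Segment], max_gap: float,
                     min_length: float) -> List[Segment]:
    """Merge regions separated by a pause shorter than max_gap,
    then drop merged segments shorter than min_length.

    Hypothetical illustration only, not Easper's implementation.
    """
    merged: List[Segment] = []
    for start, end in sorted(speech):
        if merged and start - merged[-1][1] < max_gap:
            # Pause is too short to count as a boundary: extend the previous segment.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return [(s, e) for s, e in merged if e - s >= min_length]


# Invented example: four detected speech regions, in seconds.
regions = [(0.0, 1.2), (1.4, 3.0), (4.5, 4.6), (6.0, 9.0)]
segments = merge_and_filter(regions, max_gap=0.5, min_length=0.3)
print(segments)
```

Under these assumptions, a larger gap value yields fewer, longer segments (which may suit slow speech with long pauses), while a larger minimum length discards short blips such as coughs or clicks.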

Easper also provides Elan file analysis, as seen below, with the following results given:
- characters used in the file (can be corrected by the user)
- how many words in the file
- a frequency list of all words
- how many times speakers overlap
- how many long segments (these can be problematic and result in artefacts like repetition of the same word many times in the transcript)
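For readers who want to run similar checks on their own files, the counts above can be approximated with a short Python script. This is a rough sketch rather than Easper's own analysis: it reads the standard Elan (.eaf) XML structure, and it assumes that each speaker has their own tier, that words are separated by whitespace, and that a "long" segment is anything over 10 seconds (an arbitrary threshold chosen for illustration). The function name `analyse_eaf` and the tiny sample document with placeholder words are invented.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Tiny invented sample in the Elan XML layout (placeholder words, not real data).
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2000"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="1500"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="14000"/>
  </TIME_ORDER>
  <TIER TIER_ID="Speaker1">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>one two two</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="Speaker2">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>two three</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def analyse_eaf(root, long_ms=10_000):
    """Summarise an Elan document (hypothetical helper, not Easper's code)."""
    # Map each time slot ID to its time in milliseconds.
    slots = {ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
             for ts in root.iter("TIME_SLOT")
             if ts.get("TIME_VALUE") is not None}

    # Collect (tier, start, end, text) for every time-aligned annotation.
    segments = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            if start is None or end is None:
                continue
            text = ann.findtext("ANNOTATION_VALUE") or ""
            segments.append((tier.get("TIER_ID"), start, end, text))

    words = Counter(w for _, _, _, text in segments for w in text.lower().split())
    chars = sorted({c for _, _, _, text in segments for c in text if not c.isspace()})
    # Assumption: one tier per speaker, so an overlap is two segments on
    # different tiers whose time spans intersect.
    overlaps = sum(1 for a, b in combinations(segments, 2)
                   if a[0] != b[0] and a[1] < b[2] and b[1] < a[2])
    long_segments = sum(1 for _, start, end, _ in segments if end - start > long_ms)

    return {"characters": chars, "word_count": sum(words.values()),
            "frequencies": words, "overlaps": overlaps,
            "long_segments": long_segments}


report = analyse_eaf(ET.fromstring(SAMPLE))
print(report["word_count"], report["frequencies"].most_common(3))
print(report["overlaps"], report["long_segments"])
```

To analyse a real file rather than the sample string, replace `ET.fromstring(SAMPLE)` with `ET.parse("yourfile.eaf").getroot()`.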

Aso Mahmudi is currently working on extending the Nafsan model to neighbouring languages. We can then determine how useful it will be to continue extending existing models to other languages with similar phonological systems and orthographies.