Sinhala, an official language of Sri Lanka, is one of the less privileged languages: it still has no established text input method. Like many Asian languages, Sinhala has a large character set, so an input method must convert a key sequence into a character or word. This paper proposes a novel word-based predictive text input system named SriShell Primo. The system lets the user enter a Sinhala word with a key sequence that closely matches his or her intuition of its pronunciation. The key to this approach is a pre-compiled table that lists the conceivable roman character sequences that a wide range of users employ to represent a consonant, a consonant sign, or a vowel. As the user enters each key, the system refers to this table and generates possible character strings as candidate Sinhala words. Using a trie-structured word dictionary and a fast search algorithm, the system successively and efficiently narrows the candidates down to possible Sinhala words. The experimental results show that the system greatly improves user friendliness compared with earlier character-based input systems while maintaining high efficiency.
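The candidate-narrowing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the SriShell Primo implementation: the variant table `ROMAN_VARIANTS`, the three-word dictionary, and the function names `build_trie` and `candidates` are all assumptions made for illustration, and the real table and search algorithm are defined in the paper itself. The sketch walks a trie of Sinhala words and, at each child character, tries every roman spelling that the table allows for that character against the remaining keystrokes.

```python
from dataclasses import dataclass, field

# Sinhala code points used by this toy example.
A, MA, LA = "\u0d85", "\u0db8", "\u0dbd"   # vowel a, consonants ma and la
AA_SIGN, HAL = "\u0dcf", "\u0dca"          # vowel sign aa, hal kirima (vowel killer)

# Hypothetical variant table (an assumption, not the paper's table):
# each Sinhala character maps to the roman spellings a user might type for it.
ROMAN_VARIANTS = {
    A: ("a",),
    MA: ("m", "ma"),       # with or without the inherent vowel
    LA: ("l", "la"),
    AA_SIGN: ("a", "aa"),  # long vowel typed as one or two letters
    HAL: ("",),            # the vowel killer consumes no keystroke
}


@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)
    is_word: bool = False


def build_trie(words):
    """Build a trie keyed on Sinhala code points."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root


def _collect(node, prefix, out):
    """Add every dictionary word at or below `node` to `out`."""
    if node.is_word:
        out.add(prefix)
    for ch, child in node.children.items():
        _collect(child, prefix + ch, out)


def candidates(root, keys):
    """Return dictionary words whose romanization can start with `keys`."""
    found = set()

    def walk(node, prefix, rest):
        if not rest:
            # All keystrokes consumed: everything below is a prediction.
            _collect(node, prefix, found)
            return
        for ch, child in node.children.items():
            for variant in ROMAN_VARIANTS.get(ch, ()):
                if rest.startswith(variant):
                    walk(child, prefix + ch, rest[len(variant):])

    walk(root, "", keys)
    return sorted(found)


# Toy dictionary: roughly "amma", "mala" and "maala" when romanized.
AMMA = A + MA + HAL + MA + AA_SIGN
MALA = MA + LA
MAALA = MA + AA_SIGN + LA
root = build_trie([AMMA, MALA, MAALA])

for typed in ("am", "mala", "maala"):
    print(typed, candidates(root, typed))
```

In this toy setup the key sequence `mala` matches both the short-vowel and the long-vowel word, which shows why a word-based system needs a candidate list, while typing `maala` narrows the list to a single entry.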
|Publication status||Published - 2008|
|Event||2008 Workshop on NLP for Less Privileged Languages, held in conjunction with the 3rd International Joint Conference on Natural Language Processing, IJCNLP 2008 - Hyderabad, India|
Duration: 11 Jan 2008 → …