Roadmap for an Arabic Controlled Language

Hoyam Salah El Fahal, Mohammed Nasri, Karim Bouzoubaa, Adil Kabbaj


Controlled Natural Languages (CNLs) are engineered subsets of natural languages that aim to make communication clearer and more precise. CNLs are used in communication between humans or with computers, particularly where clarity and unambiguity are required. Existing CNLs have been developed for applications such as technical documentation, machine translation, and database querying. So far, many CNLs have been developed for Western languages, especially English, but no concrete CNL has yet been proposed for Arabic, despite the growing number of Arabic Internet users over the last two decades. In this paper, we propose a roadmap for developing an Arabic CNL that would provide new and more advanced natural language services for Arabic speakers. Methodologically, we review the most important existing CNLs in English and other languages; this review yields statistics on vocabulary size and number of grammar rules that can inform the design of the new CNL. We consider two major approaches: one builds on existing CNLs, whereas the other starts from scratch. A survey of Arabic NLP challenges, together with the available resources and tools, led us to favor the second approach as the basis for the proposed roadmap.
