Language Preservation and Semantization: Prototyping Automated Glossing of an Endangered Mixed Language Corpus

Karim Tharani


This article discusses the prototyping of an online vocabulary learning tool for the oral language of the ginans, a corpus of gnostic hymn-like poems of the Ismaili community. The language of the ginans is mixed and borrows vocabulary from various Indo-Aryan and Perso-Arabic dialects. The teachings encoded in the oral language of the ginans, therefore, remain foreign to the English-speaking community members living in the Western diaspora. This study is based on the premise that for the tradition and the teachings of the ginans to be preserved in the diaspora, successive English-speaking generations of the Ismaili community must learn and understand the vocabulary of the ginans. The process through which humans learn and understand the vocabulary of a language is called semantization. The glossing of foreign language (L2) materials with meanings in the native language (L1) of learners has proven to be an effective enabler of semantization. The prototype glossed ginans using lexical resources, including a concordance and an English glossary, to facilitate semantization of the ginan vocabulary. Using the design-based research (DBR) methodology, the prototype was implemented over two iterative design cycles. During the evaluation of the prototype by target learners, over 90% of the participants indicated that they would use the prototype once it is made publicly available.
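
The glossing idea described above can be illustrated with a minimal sketch. It is not the prototype's implementation: the glossary entries, the placeholder tokens, and the function names `build_concordance` and `gloss_line` are all hypothetical, and the sketch assumes whitespace-separated, transliterated text.

```python
from collections import defaultdict

# Hypothetical glossary: placeholder L2 tokens mapped to English (L1) meanings.
# These are invented stand-ins, not real ginan vocabulary.
GLOSSARY = {"tokena": "meaning A", "tokenb": "meaning B"}


def build_concordance(lines):
    """Map each lowercased token to the indices of the lines that contain it."""
    concordance = defaultdict(list)
    for index, line in enumerate(lines):
        # set() ensures a line is recorded once per token, even if it repeats.
        for token in set(line.lower().split()):
            concordance[token].append(index)
    return dict(concordance)


def gloss_line(line, glossary):
    """Pair every token in a line with its L1 gloss, or None if it is unknown."""
    return [(token, glossary.get(token.lower())) for token in line.split()]


# Invented two-line corpus used only to exercise the functions.
corpus = ["tokena tokenb", "tokenb tokenc"]
concordance = build_concordance(corpus)
glossed = gloss_line(corpus[0], GLOSSARY)
```

In this sketch the concordance lets a learner jump from a word to every line in which it occurs, while the glossary lookup supplies the English meaning; a token missing from the glossary is returned with `None` so that it can be flagged rather than silently skipped.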
