Arabic text summarization approaches: A comparison study

Authors

  • Hani S. AlGhanem, Faculty of Engineering & IT, The British University in Dubai, Dubai, UAE
  • Rashan H. Ajamiah, Faculty of Engineering & IT, The British University in Dubai, Dubai, UAE

Abstract

Text summarization is one of the essential tasks in NLP and has received attention since the 1950s. Although it has evolved rapidly in recent decades for Latin languages, Arabic text summarization remains a largely unexplored area for researchers. Many algorithms can be used to generate Arabic text summaries. The analysis shows that the best algorithm for single-document summarization in Arabic is Arabic summarization using the clustering technique with word rooting capability. The unique algorithm for multi-document summarization is text summarization using the centrality concept. A detailed literature review covers text summarization in general, and Arabic text summarization and its challenges in particular.

Published

2020-12-23

Section

Articles