Staying at the front line of literature: How can topic modelling help researchers follow recent studies?

Joni Lämsä
Catalina Espinoza
Ari Tuhkala
Raija Hämäläinen


Staying at the front line in learning research is challenging because many fields are rapidly developing. One such field is research on the temporal aspects of computer-supported collaborative learning (CSCL). To obtain an overview of these fields, systematic literature reviews can capture patterns of existing research. However, conducting systematic literature reviews is time-consuming, and such reviews do not reveal future developments in the field. This study proposes a machine learning method based on topic modelling that takes articles from a systematic literature review on the temporal aspects of CSCL (49 original articles published before 2019) as a starting point to describe the most recent developments in this field (52 new articles published between 2019 and 2020). We aimed to explore how to identify new relevant articles in this field and how to relate the original articles to the new articles. First, we trained the topic model with the Results, Discussion, and Conclusion sections of the original articles, enabling us to correctly identify 74% (n = 17) of new and relevant articles. Second, clusterisation of the original and new articles indicated that the field has advanced in its new and relevant articles because the topics concerning the regulation of learning and collaborative knowledge construction related 26 original articles to 10 new articles. New irrelevant studies typically emerged in clusters that did not include any specific topic with a high topic occurrence. Our method may provide researchers with resources to follow the patterns in their fields instead of conducting repetitive systematic literature reviews.
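The general workflow described above (fit a topic model on the original articles, then inspect the topic occurrences of new articles) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a small latent Dirichlet allocation model fitted by collapsed Gibbs sampling in plain Python, toy token lists in place of article sections, and a hypothetical cut-off (`threshold=0.8`) for what counts as a "high topic occurrence". All function names, hyperparameters, and data here are illustrative assumptions.

```python
import random
from collections import Counter

# Toy stand-ins for the tokenised sections of the original articles (assumption).
ORIGINAL_DOCS = [
    ["regulation", "learning", "group", "regulation", "metacognition"],
    ["regulation", "group", "monitoring", "learning", "metacognition"],
    ["knowledge", "construction", "discourse", "knowledge", "argument"],
    ["discourse", "argument", "construction", "knowledge", "chat"],
]

def train_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling; return the topic-word statistics."""
    rng = random.Random(seed)
    vocab = {w for doc in docs for w in doc}
    v = len(vocab)
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    doc_topic = [[0] * num_topics for _ in docs]
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            topic_word[z][w] += 1
            topic_total[z] += 1
            doc_topic[d][z] += 1
        assign.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                doc_topic[d][z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + v * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights)[0]
                assign[d][i] = z
                topic_word[z][w] += 1
                topic_total[z] += 1
                doc_topic[d][z] += 1
    return {"topic_word": topic_word, "topic_total": topic_total, "vocab": vocab}

def infer_topics(doc, model, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Estimate a new document's topic shares with the trained topics held fixed."""
    rng = random.Random(seed)
    topic_word, topic_total = model["topic_word"], model["topic_total"]
    k_topics, v = len(topic_total), len(model["vocab"])
    words = [w for w in doc if w in model["vocab"]]  # unseen words are ignored
    zs = [rng.randrange(k_topics) for _ in words]
    counts = [zs.count(k) for k in range(k_topics)]
    for _ in range(iters):
        for i, w in enumerate(words):
            counts[zs[i]] -= 1
            weights = [
                (counts[k] + alpha)
                * (topic_word[k][w] + beta) / (topic_total[k] + v * beta)
                for k in range(k_topics)
            ]
            zs[i] = rng.choices(range(k_topics), weights)[0]
            counts[zs[i]] += 1
    n = len(words)
    return [(counts[k] + alpha) / (n + k_topics * alpha) for k in range(k_topics)]

def looks_relevant(theta, threshold=0.8):
    """Hypothetical rule: flag a document if one topic clearly dominates."""
    return max(theta) >= threshold

model = train_lda(ORIGINAL_DOCS, num_topics=2)
new_doc = ["regulation", "learning", "group", "monitoring"]
theta = infer_topics(new_doc, model)
print(theta, looks_relevant(theta))
```

A new document whose words never occur in the training vocabulary ends up with a uniform topic distribution and is therefore not flagged, which loosely mirrors the observation that irrelevant studies lacked any topic with a high occurrence. In practice, a dedicated topic modelling library and a validated cut-off would replace this toy sampler and threshold.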

How to Cite
Lämsä, J., Espinoza, C., Tuhkala, A., & Hämäläinen, R. (2021). Staying at the front line of literature: How can topic modelling help researchers follow recent studies? Frontline Learning Research, 9(3), 1–12.

