Frontline Learning Research Vol. 9 No. 4 (2021) 116 - 140
ISSN 2295-3159

How teachers interpret displays of students’ gaze in reading comprehension assignments

Carolien A. N. Knoop-van Campen1*, Ellen Kok2,3*, Roos van Doornik2**, Pam de Vries2**,
Marleen Immink2, Halszka Jarodzka3 & Tamara van Gog2

1Behavioral Science Institute, Radboud University, Netherlands
2Department of Education, Utrecht University, Netherlands
3Department of Online Learning and Instruction, Open University of the Netherlands

Article received 3 June 2021/ revised 16 September/ accepted 4 November/ available online 7 December


Reading comprehension is a central skill in secondary education. To be able to provide adaptive instruction, teachers need to be able to accurately estimate students’ reading comprehension. However, they tend to experience difficulties doing so. Eye tracking can uncover these reading processes by visualizing what a student looked at, in what order, and for how long, in a gaze display. The question is, however, whether teachers could interpret such displays. We, therefore, examined how teachers interpret gaze displays and perceived their potential use in education to foster tailored support for reading comprehension. Sixty teachers in secondary education were presented with three static gaze displays of students performing a reading comprehension task. Teachers were asked to report how they interpreted these gaze displays and what they considered to be the promises and pitfalls of gaze displays for education. Teachers interpreted in particular reading strategies in the gaze displays quite well, and also interpreted the displays as reflecting other concepts, such as motivation and concentration. Results showed that teachers’ interpretations of the gaze displays were generally consistent across teachers and that teachers discriminated well between displays of different strategies. Teachers were generally positive about potential applications in educational practice. This study provides first insights into how teachers experience the utility of gaze displays as an innovative tool to support reading instruction, which is timely as rapid technological developments already enable eye tracking through webcams on regular laptops. Thus, using gaze displays in an educational setting seems to be an increasingly feasible scenario.

Keywords: Gaze displays; Teachers; Reading strategies; Secondary education; Eye tracking

Info corresponding author email: Doi:

1. Introduction

Reading comprehension is a crucial skill for academic success (Murnane et al., 2012). As such, reading comprehension plays an important role in nearly all subjects in secondary education. To optimally support students’ reading comprehension, teachers should ideally provide students with adaptive support that is tailored to their personal needs (Van de Pol et al., 2019). However, since reading is mainly a covert process, in practice it can be difficult for teachers to get insight into students’ needs and tailor their instructions accordingly. Eye tracking might provide an innovative tool to visualize those covert processes, as it allows for constructing gaze displays. Gaze displays are visualizations of students’ eye movements during reading. Such displays have the potential to provide teachers with insights in the reading strategy used by a student, and as such, in the efficiency of the reading process. While eye tracking is currently not yet widely available to teachers, it can be expected to become a viable option in the near future: for instance, rapid technological developments already enable eye tracking through webcams on regular laptops (Madsen et al., 2021). Eye-tracking is in fact already being used in the educational practice in a school in Germany (Böhm, 2021). As such, investigating whether gaze displays could be a useful tool to support reading instruction is timely. Whether it can be a useful tool for educational practice, however, depends first of all on how teachers interpret the displays and how they translate the information provided by the gaze display into action (Molenaar & Knoop-van Campen, 2019), and secondly, on what promises and pitfalls of gaze displays teachers perceive. Therefore, the present study aims to explore how teachers interpret students’ gaze displays of reading assignments and perceive their usefulness.

1.1 Reading comprehension

Reading comprehension can be defined as “the understanding of the meanings of words as they are used in sentence contexts, comprehension of sentences, and the acquisition of new information from passages of prose” (Guthrie & Mosenthal, 1987, pp. 291-292). In this process, a mental construction of the meaning of the text is built with new information acquired from the text (Guthrie & Mosenthal, 1987; Pikulski & Chard, 2005). Efficient reading comprehension combines accuracy and reading time: Students quickly and correctly comprehend a text.

In building such a mental construction, reading strategies are important (Afflerbach et al., 2008). Reading strategies can be defined as purposeful, goal-directed actions which readers perform to achieve an established reading objective (Afflerbach et al., 2008). The choice for a certain reading strategy depends largely on the reading objective (Guthrie & Mosenthal, 1987; Urquhart & Weir, 1998). Reading strategies can be divided into two main categories: selective reading strategies and intensive reading strategies (Liu, 2010).

The selective strategy is used when the reader is looking for certain information (e.g., the answer to a specific question) without having to understand the rest of the text (Guthrie & Mosenthal, 1987; Krishnan, 2011). The parts of the text that do not contain the desired information are skipped (Liu, 2010). The intensive reading strategy is often used for making a summary of the text or when remembering a text for learning purposes (Katalayi & Sivasubramaniam, 2013; Krishnan, 2011; Liu, 2010). When applying this strategy, the reader reads the entire text as all information is necessary (Liu, 2010; Urquhart & Weir, 1998).

Effective regulation of reading strategies is an important aspect of academic achievement (Andreassen et al., 2017; Salmerón et al., 2017). There are various evidence-based best practices for reading comprehension instruction (Gambrell et al., 2011). Ideally, teachers have a balanced reading comprehension program (Duke & Pearson, 2009) which includes both explicit strategy instruction (Duffy, 2002) as well as support in the form of modelling reading strategies in action (Schutz & Rainey, 2020) and is tailored to students’ needs.

1.2 Visualizing reading comprehension with gaze displays

In practice, it can be difficult for teachers to tailor their instruction to the needs of the students. Since reading is a covert process, it is difficult for teachers to determine how students read texts and whether they apply reading strategies correctly and efficiently. The automaticity of viewing behavior makes it hard to report (Kok et al., 2017; Võ et al., 2016), so students’ self-reports are relatively susceptible to either purposefully or accidentally false answers (Benfatto et al., 2016; Shute & Zapata-Rivera, 2012). Teachers can have students read text aloud, but this is a time-consuming intervention and not suitable for classroom lessons. Therefore, the information that teachers can draw upon to tailor their instruction is often limited.

A way to uncover reading strategies is by capturing readers’ eye movements during reading using eye tracking and then visualize their eye movements in gaze displays (De Koning & Jarodzka, 2017; Knoop-van Campen et al., 2021; Van Gog & Jarodzka, 2013). A gaze display of a student’s reading behavior is a condensed visualization of large data sets of his/her eye-movement recordings, which are x- & y-coordinates of the focus of the eyes on a computer screen. Like learning analytics, they provide condensed summaries of large amounts of information (Almosallam & Ouertani, 2013).

In recent years, gaze displays have found several applications in education and educational research. For instance, visualizations of teachers’ gaze have been used to show learners where the teacher is looking (eye movement modeling examples), and this has been shown to often improve their learning from video examples (Chisari et al., 2020; Jarodzka et al., 2013; Mason et al., 2015; Scheiter et al., 2018; Van Gog et al., 2009). Furthermore, displays of students’ own gaze have been used to improve their self-assessment and self-regulated learning (Donovan et al., 2008; Eder et al., 2020; Henneman et al., 2014; Kok et al., 2017; Kostons et al., 2009).

Thus far, however, research on the use of gaze displays as a tool for teachers is limited. Existing research shows that people can interpret gaze displays of others to a certain extent, but interpretation performance differs between different tasks and stimuli. Specifically, studies indicate that people can identify search goals (Zelinsky et al., 2013), preferences (Foulsham & Lock, 2015), and deduce a given task (Bahle et al., 2017; Van Wermeskerken et al., 2018) from displays of someone else’s gaze. Emhardt et al. (2020) presented participants with gaze displays of students who answered multiple-choice questions about graphs. They found that participants could infer from those displays which answer a student selected, and if students were confident in their answers. In a study with teachers, Špakov et al. (2017) developed several different visualizations of gaze behavior of students learning to read and found that overall, teachers appreciated them, and considered them informative. However, they did not investigate how teachers interpreted those gaze displays, i.e., which information they extracted from the display, whereas teachers’ interpretations of this type of information are critical as they influence how they integrate the information in their professional routines (cf. Knoop-van Campen & Molenaar, 2020). To dive deeper into this question, we approach the interpretation of gaze displays with the Learning Analytics Process Model (Verbert et al., 2014). This provides a suitable framework for systematically analyzing teachers’ interpretations (Verbert et al., 2014).

The Learning Analytics Process Model distinguishes various stages (Van Leeuwen et al., 2021; Verbert et al., 2014). The awareness stage entails teachers becoming aware of the information. The interpretation stage involves teachers asking themselves questions and reflecting on the information they see, and trying to provide answers to these questions by interpreting and creating new insights into the data. Lastly, the enactment stage entails teachers using their interpretations and insights to induce new meaning or even change behavior (Verbert et al., 2014). How teachers interpret and use (learning analytics) information is strongly impacted by their experiences and routines (Molenaar & Knoop-van Campen, 2019). Gaze displays are currently not used in secondary education, so teachers do not have any experience in the use of gaze displays. Their interpretations might be arbitrary and undirected, possibly resulting in varied and contradicting interpretations of gaze displays. These interpretations are important, as they determine the contents of teachers’ actions realizing tailored reading instruction (cf. Jivet et al., 2017; Molenaar & Knoop-van Campen, 2019).

In sum, gaze displays seem to be a promising tool to provide information on covert processes like reading comprehension. Findings on how people in general interpret gaze displays, suggest that teachers might be able to extract information on students’ reading strategies from their gaze displays and compare that to their idea of an effective (desired) strategy. However, making inferences about students’ reading comprehension is arguably a much more complex task than e.g. inferring their preference for an object or multiple-choice answer from a gaze display (in which people tend to look more towards the preferred object or answer option), as it requires interpreting patterns of eye movements. In addition, teachers are known to differ largely in their way they interpret process data (Molenaar & Knoop-van Campen, 2019) which, in the case of reading comprehension tasks, may depend even more based on their own vision of (the importance of) efficient reading and reading strategies. Thus, it is unclear which information teachers would be able to extract from gaze displays of students’ reading behavior, and whether this is consistent between teachers.

1.3 The present study

The present study investigated how teachers interpret gaze displays and whether and how they would want to use gaze displays in teaching reading comprehension. Gaze displays are an innovative tool that could potentially support reading instruction, and the questions addressed in this study are timely as eye-tracking technology is rapidly developing and might become available for use in schools in the near future (as mentioned earlier rudimentary eye tracking is presently already enabled through webcams on regular laptops). Moreover, these questions are not only relevant for reading comprehension research and practice, but will also add to eye-tracking research on the use of gaze displays, by investigating whether people can also interpret gaze displays of more complex tasks.

Specifically, we examined whether teachers could discriminate between gaze displays that show different reading strategies and whether there was consistency among teachers in interpreting the same gaze displays. If teachers discriminate between gaze displays that reflect different reading strategies (i.e., distinguish between displays in their interpretation of them), this reflects that the visual information in the displays is interpreted in terms of (reading) processes that differ between students. If teachers are consistent in their interpretation of the same display (i.e., teachers agree in their interpretation of a display), this indicates that extracted information is interpreted similarly between teachers. Finally, we examined the promises and pitfalls teachers reported in working with gaze displays.

2. Method

2.1 Participants

Participants were teachers who teach the subject Dutch at secondary schools (in the senior general and/or university preparatory tracks) across the Netherlands and as such, are knowledgeable ofreading comprehension strategies. All schools in the Netherlands in which senior general and/or university preparatory is taught were invited by e-mail to participate. Informed consent was provided by 125 participants. Only participants who answered all questions for at least one display were included, which totals 60 participants (46 female). The exact number of participants differs between analyses and is reported in the results section. For the full sample, teachers’ mean age was 45.4 (SD = 12.8) and they reported an average of 13.1 (SD = 9.2) years of working experience (range 0-38 years).

2.2 Design

An explorative approach with open questions was used to elicit thoughts from teachers about gaze displays. In a questionnaire, secondary school teachers were presented with three different gaze displays of students who completed a reading comprehension assignment (see Figure 1). Teachers typed a response to three open questions per display and two open questions about displays in general. The Learning Analytics Process Model (Verbert et al., 2014) was used to query teachers’ awareness of the gaze displays (what did teachers observe), their interpretation (how did teachers interpret this information), and their enactment (how did teachers intend to act upon this information).

Concerning the question of how teachers interpret gaze displays, we coded their answers and examined whether teachers discriminate between gaze displays of different students and whether there is consistency among teachers regarding interpreting the same gaze displays. Finally, we examined the promises and pitfalls teachers reported in working with gaze displays by asking them about their views on implementing gaze displays in general in their reading comprehension education.

Figure 1c. Gaze display Z.

2.2 Instruments

2.2.1 Gaze displays

Three static displays were used (see Figure 1). The displays were images of the eye-movement locations during the full trial. The gaze displays were taken from an earlier study (Knoop-van Campen et al., 2021). In that study, students received reading comprehension assignments which they made voluntary without time limit. Students received a reading comprehension task (“Which advantages of self-driving trucks are mentioned in the text?”), and subsequently were presented with the text in which they had to find the answer. The text consisted of three paragraphs (three Areas of Interest: AOIs) and was preceded by an open-ended question of which the answer is situated in the second paragraph (bottom left). After reading the text, they typed in their answer (without the opportunity to visit the text again).

An SMI RED-500 eye-tracker (SMI vision, Berlin, Germany) captured students’ eye movements during reading at 250 Hz. Gaze displays were developed in BeGaze 3.7 (SMI vision, Berlin, Germany) using the scanpath utility. Fixations were shown as circles with the size of the circle corresponding to the fixation duration (500 ms = 1 degree). Saccades were shown as lines between fixations. Fixations and saccades were defined using SMI’s standard velocity-based event-detection algorithm with peak velocity threshold of 40 degrees/second and minimum fixation duration of 50 ms.

The three gaze displays varied on the type of reading strategy used. The reading strategy (intensive or selective) was quantified with a so-called disparity score, which captures the duration of fixations across different parts of the text (see (Knoop-van Campen et al., 2021, for more information). This disparity score was calculated based on the standard deviation of the weighted fixation duration times in the three paragraphs. When the standard deviation is low, an equal amount of attention was given to all three paragraphs, while a high standard deviation indicates that the student focused mostly on one part of the text. This disparity score could vary between 0 (attention evenly divided over the three AOI’s: SD of 33%, 33%, and 33% is 0) and 57 (attention 100% focused on one AOI: SD of 100%, 0%, and 0% is 57). Based on an experts’ cut-off point and hand-coded validation, disparity scores below 14 were considered intensive reading strategies, and above 14 were considered selective reading strategies.

The three selected displays showed a clear intensive strategy (display X: disparity score = 1.42), a clear selective strategy (Display Y: disparity score = 25.25) and a borderline intensive strategy (display Z: disparity score = 11.73). Data quality was expressed in the tracking ratio (percentage of time samples where a valid x- & y-coordinate was recorded: display X: 97%, display Y: 98%, display Z: 100%) and deviation scores as retrieved using a four-point validation procedure after nine-point calibration (M = .62, SD = .53.

2.2.2 Measures

To elicit thoughts from teachers about their awareness, interpretation, and enactment of the gaze displays, three open-ended questions reflected these phases of the Learning Analytics Process Model (Verbert et al., 2014), see Table 1. To optimally grasp teachers’ motives, opinions on and attitudes towards gaze displays, the open-ended questions were phrased according to the interpretive paradigm (Tijmstra & Boeije, 2011) in such a way that allowed teachers to elaborate on their thoughts (e.g., “Please feel free to describe everything that comes to mind.”).

Table 1

Questions for Awareness, Interpretation, and Enactment

To collect teachers’ perceived promises and pitfalls of gaze displays, also open-ended questions asked whether they thought gaze displays could be useful in their educational practices. First, they were asked “Do you feel that gaze displays of your students would help you as a teacher to have a deeper, better, or other insight into the reading behavior of your students? Why do you think so?”. Then, they were asked “Would you use gaze displays of your students (in the future) if this would be easy to realize? For which goal would you use those displays? Which concrete actions would you want to realize?”. Again, teachers were invited to write down their answers.

2.2.4 Procedure

The online survey was developed in Qualtrics (Qualtrics, Provo, UT). The online survey mode enabled teachers to complete the survey at any time and place that was suitable for them within three weeks after receiving the invitation. Teachers spent on average approximately 23 minutes to complete the survey.

In this questionnaire, informed consent was given, following which participants answered demographic questions. Next, the different components of gaze displays were shortly explained (i.e., circles show where a person looks, larger fixations represent longer reading times, lines shown the route over the stimulus). Then the reading comprehension question and the accompanying text were displayed so teachers could become familiar with the material and knew students’ reading objective. Next, teachers were presented with consecutively the three gaze displays with corresponding questions (displays were anonymized, so teachers did not receive any student characteristic). The order in which the three gaze displays were presented, was randomized. Last, teachers were asked to comment on the usability of gaze displays in education in general.

2.2.5 Data analyses

The Learning Analytics Process Model (Verbert et al., 2014) was used as the basis for coding the written responses of awareness, interpretation, and enactment of gaze displays. Additionally, open coding of all answers was conducted, to which (if necessary), theoretically relevant codes were added (such as those for strategies). To enable expressing consistency and discrimination, we created contrasting code scores for awareness and interpretation. For example, a score of 1 was given on the code ‘Title’ if the teacher mentioned that the student looked at the title, and a score of -1 was given if the teacher mentioned that the student did not look at the title. If the teacher did not mention the title at all, no score was given for that code. See Appendix A for the codebook including coding instructions and examples.

Coding took place in three rounds and by four coders. First, all four coders each coded the same ten cases and compared and discussed their answers, based on which the codebook was adapted. In round two, each coder coded 25% of the data plus 20% (in total 31 cases) as a second coder. The average Krippendorffs Alpha1 was 0.56 and the average percentage overlap was 92%. All variables (12 variables in total) with a Krippendorffs Alpha <.60 or percentage overlap <.80 were discussed. In round 3, these variables were recoded and a new 20% of overlapping cases were coded. One variable with a Krippendorffs Alpha of 0 was removed because only a few participants mentioned it. The final average Krippendorffs Alpha was 0.7 and the average percentage overlap was 94.1%2 .

For analyses, scores of 0 were defined as system-missing values, and subsequently, chi-squared tests were used to quantify teachers’ discrimination of displays. We considered a significant chi-square test as support that teachers discriminate between gaze displays on that code because it means that the pattern of -1 scores and 1 scores differs between gaze displays and that enough teachers mentioned this code to have the statistical power to detect differences. The average consistency within a code was the percentage of teachers who mentioned the most-mentioned score (either -1 or 1), averaged over the three displays. Consistency ranged between 50% (equal numbers of answers were scored as -1 or 1) to 100% (all answers were scored the same, as either -1 or 1).

The coding process for the perceived promises and pitfalls of gaze displays was similar to that of awareness, interpretation, and enactment. Each of the four coders coded 25% of the data and an additional 31 cases as a second coder (63% of data double coded). The average Krippendorffs Alpha was 0.71 and the percentage overlap was 93%. The three variables with a Krippendorffs Alpha < .60 or a percentage overlap of < .80 were discussed. Two of those were removed because the code was only used four times. The third variable was coded again and all cases were double-coded. However, Krippendorffs Alpha was negative after recoding, and since this code was also quite uncommon, it was dropped too. The final Krippendorffs Alpha thus stayed 0.71 with 93% overlap. Answers to the two questions were originally coded separately for each question, but most codes were combined since teachers reported pitfalls and promises as a response to both questions. See Appendix A for the final codebook.

3. Results

3.1 Awareness phase

3.1.1 Discrimination

Table 2 provides an overview of the frequency of scores per code per display.

Table 2

Frequency of Scores per Code about Awareness

The main differences between the three displays were in the way the respective students looked at the title, paragraph 3, and the image (see Figure 1). These differences were reflected in the remarks of the teachers (see Table 2): They discriminated between the three gaze displays for the title, paragraph 3, the image, and whether or not the student looked at everything, but not for subheading 1, subheading 2, paragraph 2, important words, and rereading. Surprisingly, teachers also discriminated between the three displays for paragraph 1, whereas the displays showed no obvious difference. This seems to be caused by more teachers mentioning paragraph 1 in displays Y and Z versus only a few teachers mentioning paragraph 1 in display X.

3.1.2 Consistency

Figure 2 shows the consistency (averaged over three displays) in each of the ‘awareness’ codes. A high consistency indicates that most teachers expressed observing the same viewing behavior. The three most consistent codes were image, paragraph 2, and title. For each of those, almost 100 % of teachers’ remarks were in the same direction. Variables in which teachers are least consistent were subheadings 1 and 2, and important words.

Figure 2. The Average Consistency within Awareness Codes. Note. Average consistency within a code is expressed as the average percentage of teachers who mention the most-mentioned score (either -1 or 1). A higher percentage means more consistency in answers, minimum consistency is 50% (equal number of remarks scores as -1 and as 1).

3.2 Interpretation phase

3.2.1 Discrimination

The main difference between gaze displays was in the reading strategy (see method section). Table 3 shows the frequencies with which each of the reading strategies were mentioned for each display. A Chi-squared test showed a significant discrimination between displays, χ2(6) = 37.12, p < .001, which is in line with the eye-tracking data: Display X shows an intensive strategy, display Y shows a selective strategy, and display Z showed a borderline intensive strategy.

Table 3

Frequencies for the Reading Strategies

Apart from those overall reading strategies, teachers made remarks about more specific reading strategies, like the use of headings and signal words as strategies (see Table 4). Teachers discriminated between displays for the use of signal words but not for the use of headings.

Next to reading strategies, teachers also wrote down other interpretations of the gaze displays (see Table 4). Teachers discriminated between displays for confidence and reading time but not for task performance, efficiency, concentration, and motivation.

Table 4

Frequency of Scores for each Code about Interpretation

3.2.2 Consistency

Figure 3 shows the average consistency within the interpretation codes. Consistency was highest for concentration, motivation, and confidence. Consistency was lowest (but still relatively high) for task performance, use of headings, and the use of signal words.

Figure 3. The Average Consistency within Interpretation Codes.

Note. Consistency (averaged over the three displays) within a code is expressed as the average percentage of teachers who mention the most-mentioned score (-1 or 1). Higher means more consistency in answers.

Table 3 shows consistency in the remarks regarding reading strategies. Display X shows relatively low consistency in codes (41%), in contrast to displays Y and Z (consistency respectively 76% and 73%), where most teachers remarked that a selective strategy was used. The relative low consistency on display X could be explained by the finding that the score ‘no strategy’ was given to sentences like ‘reads the full text without a clear strategy’ (P017X). In display X, the scores ‘intensive strategy’ and ‘no strategy’ together make up 79% of the remarks. 3.3 Enactment phase

In the enactment phase, the main contrast was found between teachers who deem no actions necessary and those that would act on the gaze display. Additionally, several teachers mentioned that they need more information than just the gaze display. Table 5 provides an overview of the scores per display.

Table 5

Frequency of Enactment Codes

3.3.1 Discrimination

A chi-squared test was executed to investigate whether teachers would discriminate between displays regarding either initiating or not initiating action (any action vs. no action needed). The chi-squared test showed significant discrimination between the three displays, χ2(2) = 10.610, p = .005. Teachers were most likely to act on displays X and Z, and less so on display Y.

3.3.2 Consistency

The consistency was lowest for display Y, 72% of teachers would act on the displays, whereas for both displays X and Z almost all teachers (93%) would act on the displays.

3.4. Promises and pitfalls of gaze displays in reading education

Finally, we asked teachers which general promises and pitfalls they saw in working with gaze displays to support reading instruction. Most teachers responded positively (n = 31, 61%) or moderately positive (n = 15, 29%) to the question of whether the gaze displays would help them gather a deeper, better, or other insight into the reading behavior of their students. Only five teachers (10%) said displays would not support them. Furthermore, most teachers expressed a willingness to use gaze displays if this would be easy to realize. A total of 29 teachers (57%) would use them, 17 (33%) would maybe use them, 3 (6%) would not use them, and 2 teachers (4%) did not answer this question.

Promises of gaze displays mentioned by teachers were to give them insight into students reading behavior (n = 44, 86%). Additionally, part of the teachers said they would use displays to provide students with an insight into their own reading behavior, for example as feedback (n = 18, 35%). Next to that, several other applications were mentioned, such as supporting reading motivation and stimulating parent engagement. An important pitfall voiced by teachers was that they lack knowledge about eye tracking and feel that they need more training in gaze display interpretation (n = 11, 22%). Teachers also voiced practical concerns such as lack of time and resources (n = 9, 18%).

4. Discussion

We investigated how teachers interpret and would implement gaze displays in teaching reading comprehension. Using the Learning Analytics Process Model (Verbert et al., 2014) we examined whether teachers discriminated between gaze displays and whether there was consistency among teachers.

4.1 Discrimination and Consistency in Remarks about Gaze Displays

In the awareness phase, the teachers’ pattern of remarks largely mirrors the actual pattern of eye-tracking data. This is reflected in the discrimination between gaze displays that differ and the limited discrimination in codes if gaze displays are minimally different. This suggests that teachers are indeed able to interpret the gaze displays in terms of the coded concepts. Inconsistency among teachers seems to stem from expectations of teachers and/or from data-quality issues. For example, whereas most teachers said that paragraph 3 is hardly read, some teachers said that paragraph 3 was read more than expected. E.g., “[the student] looks at section 4 of 5 more intensively, whereas those are not necessary to answer the question” (P004Z). Thus, whereas remarks coded as awareness might reflect just what can be seen, they are already influenced by the expectations that teachers have about how the student should look at the text. This is in line with the findings of Knoop-van Campen and Molenaar (2020) that teachers’ use of learning analytics information is embedded in their professional routines and expectations.

In the interpretation phase, many teachers interpreted gaze displays in terms of reading strategies. They discriminated correctly between displays and they were quite consistent in that. This supports the work of Knoop-van Campen and colleagues (2021) by showing that teachers can extract reading strategies from gaze displays. More detailed information about reading strategies such as the use of headings and signal words is gathered from the gaze displays, although not very consistently. Interestingly, teachers also made inferences about other aspects of students’ performance from the displays. For instance, concentration was quite consistently mentioned, although teachers did not discriminate between displays and generally interpreted the displays as showing low concentration. As we did not have information about students’ concentration, it is not possible to judge the accuracy of these aspects of teachers’ interpretations. It is important to investigate in future research to what extent gaze displays reliably reflect other aspects of study behavior or task performance, such as concentration, confidence, or invested effort and to what extent teachers can reliably pick up on this. On the one hand, it is not unlikely that these aspects of study behaviour can be inferred to a certain extent from specific eye movement patterns (cf. Emhardt et al., 2020; Rosengrant et al., 2021). On the other hand, we also know from research on teachers’ judgments of students’ performance, that teachers have a strong inclination to use more generic student characteristics in making those judgments (e.g., the motivation or effort they generally display in class), and that they will even ‘fabricate’ student characteristics when they lack information on a student’s identity. That is, when asked to judge students’ text comprehension or math performance based on performance data (i.e., a causal diagram the student filled out about the text), teachers tend to use the information they have about student characteristics, such as motivation, gender, intelligence, or concentration in class, to inform their judgments. However, teachers still do so even when they are presented with anonymized performance data, making statements about students having sloppy habits, having low concentration, being clever or uncertain, even though they did not know who the student was (Oudman et al., 2018; Van de Pol et al., 2021). Previous research in the classroom has shown that learning analytics can decrease teachers’ bias towards students (Knoop-van Campen et al., 2021). If teachers could reliably infer other aspects of study behavior or task performance (such as students concentration or effort on that specific task) from gaze displays, this could support teachers in providing better feedback and guidance on students learning processes.

In the enactment phase, most teachers would act on the displayed viewing behavior in some way, although teachers did discriminate between the three displays. Consistency was thus relatively high. Most actions that teachers mentioned concerned reading strategies. For example, explaining or modelling reading strategies to the student or adapting the task such that other reading strategies would be useful. These results support the notion of Xhakaj et al. (2017) that teachers can act on information on a display by altering their instruction, and that this positively affects their teaching (Knoop-van Campen et al., 2021). Apart from that, several teachers mention that the use of gaze displays alone does not provide enough information to act on.

Whereas teachers showed a certain amount of discrimination between displays and consistency within displays in all three stages, discrimination and consistency are still far from perfect in our sample. However, note that we did not train teachers to interpret gaze displays. Most teachers had never seen gaze displays before. This interpretation should thus be considered the lower boundary, and with more information, training, and practice, performance is likely to improve (Rienties et al., 2018).

4.2 Promises and pitfalls of gaze displays in reading education

Overall, the teachers were positive about the application of gaze displays in practice. They see many possible applications, but also acknowledge practical problems, even though these might decrease as time passes by and with technological improvements. Teachers saw applications of gaze displays both for their monitoring of student’s reading behavior (our intended application), but also considered gaze displays as potential feedback to students, which is an interesting idea that warrants further research (see e.g., Henneman et al., 2014, for an example of gaze as feedback in emergency medicine).

4.3 Limitations and Future Research

The present study had some limitations and also leads to suggestions for future research. First, interpretation performance is dependent on the display itself (Van Leeuwen et al., 2019). In this study, we selected displays of data that were not perfect. This is likely to impacte performance, but is also likely to happen in classroom situations, thus providing a realistic reflection of classroom practice. Also, the design of the display may have impacted how teachers interpreted it. For example, Van Wermeskerken and colleagues (2018) found differences between static and dynamic displays regarding participants’ interpretation performance. Furthermore, gaze behavior is not always straightforward. For example, displays X and Y showed clear examples of the two reading strategies, and indeed, teachers were relatively consistent in describing these strategies. Display Z showed more of a borderline strategy, which was reflected in a lower consistency in teachers’ remarks. As such, it would be relevant to investigate how characteristics of gaze displays impact teachers’ interpretation performance.

Secondly, due to the current set-up with open questions, it was not possible to assess the accuracy of most statements that teachers made regarding the eye-tracking data. Since the study aimed to unveil teachers’ thoughts about gaze displays (i.e., what information they would extract from a gaze display), we do not have the information from students regarding, for example, their motivation or concentration. In addition, as we asked teachers to elaborate on the presented displays, it was not feasible to use more than a few displays in the present study. Future research could seek a more confirmative approach, in which more displays are used that vary systematically over a larger set of measurements (e.g., strategies, reading performance, motivation, confidence) and in which teachers are asked to rate these measures (see for example the work of Emhardt et al., 2020). Future research could examine whether patterns of viewing behavior teachers distinguish relate to actual student behavior. For instance, teachers interpreted the use of headings and signal words from the gaze displays. Think-aloud studies could be used to investigate whether students used those strategies (Ericsson & Simon, 1993).

Third, our use of contrasting codes allowed us to make tentative statements about teachers’ judgment accuracy: High consistency combined with high discrimination is necessary if teachers were to validly use gaze displays to act upon. While this approach sensitized us to those contrasts, at the same time it may have diminished our attention to comments that were hard to capture in contrasts.

Finally, teachers were generally very positive about the gaze displays. However, participants who did not like the displays had probably selected themselves out earlier in the questionnaire. Indeed, only 41% of the participants who provided informed consent finished the survey. Further research is required to understand the willingness to use gaze displays in a representative sample of the population.

4.4 Practical applications

This study provides initial steps towards using gaze displays in the classroom to improve students’ reading comprehension, a skill that is central in many school subjects. The findings are promising, showing that even untrained teachers seem able to interpret gaze displays of reading assignments to some extent. Training and guidelines could be developed to aid teachers in the interpretation of gaze displays as many teachers mentioned that they would like to know more about how to interpret gaze displays. Although future research is needed (e.g., on optimal display design, impact of gaze displays on teachers’ actions in practice, and consecutive on student achievement), our results suggest that in the near future, gaze displays may indeed be used to support teachers in determining which reading strategy a student uses, information that cannot easily or reliably be retrieved another way. Although eye tracking technology is at present not yet widely available to schools, this can be expected to change in the near future, as low-cost webcam-based eye tracking solutions are being developed that can be used with regular laptops with camera-function (e.g., Madsen et al., 2021) and the quality of such build-in cameras increases swiftly. It is thus certainly plausible that in the near future, students’ eye movements can be tracked during reading tasks (or other learning tasks) and that information on their eye movements can be visualized for teachers and used in educational practices.

4.5 Conclusion

This study provides first insights into how teachers interpret gaze displays as a form of learning analytics in educational practice. Earlier literature showed that gaze displays can be interpreted in terms of the search target, viewing task, preference, or answer to a multiple-choice question (Bahle et al., 2017; Emhardt et al., 2020; Foulsham & Lock, 2015; Van Wermeskerken et al., 2018; Zelinsky et al., 2013). Our results add to and extend findings from earlier research by showing that gaze displays of reading strategies can be meaningfully interpreted by teachers, which is arguably a substantially more complex task than making inferences about people’s objects or answer preferences from gaze displays.

Teachers’ remarks about what they saw in the gaze displays (awareness phase) mostly overlapped with the actual pattern of eye-tracking data. We found that teachers interpreted reading strategies in the gaze displays (interpretation phase) quite well, and also interpreted them as reflecting other concepts, such as motivation and concentration. Teachers based their initiated actions on the gaze displays (enactment phase). Participants were generally positive about using gaze displays in education, both to inform themselves as well as feedback for students, although they mentioned practical issues and a need for more information as areas for improvement.

To conclude, gaze displays in reading education seem to be a promising form of learning analytics in education, by providing teachers insight into the reading process of their students, and teachers seem to be willing to incorporate gaze displays in their (future) education. Future research should further explore the possibilities of the use of gaze displays in educational practice.



*Shared first authorship

** Authors contributed equally

1Krippendorffs Alpha was calculated using the KALPHA macro in SPSS at the nominal measurement level (Hayes & Krippendorff, 2007) because it enables the inclusion of four coders.

2Note that Krippendorffs alpha is sensitive to low-prevalent scores, especially with binary variables (like our enactment codes). In this study, almost all codes have low-prevalent scores. So, in combination with the high percentage overlap, we consider the reliability acceptable.


Afflerbach, P., Pearson, P. D., & Paris, S. G. (2008). Clarifying differences between reading skills and reading strategies. The Reading Teacher, 61(5), 364-373.
Almosallam, E. A., & Ouertani, H. C. (2013). Learning analytics: Definitions, applications and related fields. Proceedings of the first international conference on advanced data and information engineering (DaEng-2013),
Andreassen, R., Jensen, M. S., & Bråten, I. (2017). Investigating self-regulated study strategies among postsecondary students with and without dyslexia: A diary method study. Reading and Writing, 30(9), 1891-1916.
Bahle, B., Mills, M., & Dodd, M. D. (2017). Human classifier: Observers can deduce task solely from eye movements. Attention Perception & Psychophysics, 79(5), 1415-1425.
Benfatto, M. N., Öqvist Seimyr, G., Ygge, J., Pansell, T., Rydberg, A., & Jacobson, C. (2016). Screening for dyslexia using eye tracking during reading. Plos One, 11(12), e0165508.
Böhm, M. (2021, September 14). Lesediagnostik und Leseförderung.
Chisari, L. B., Mockevičiūtė, A., Ruitenburg, S. K., van Vemde, L., Kok, E. M., & van Gog, T. (2020). Effects of prior knowledge and joint attention on learning from eye movement modelling examples. Journal of Computer Assisted Learning, 36(4), 569-579.
De Koning, B. B., & Jarodzka, H. (2017). Attention guidance strategies for supporting learning from dynamic visualizations. In R. Lowe & R. Ploetzner (Eds.), Learning from Dynamic Visualization: Innovations in Research and Application (pp. 255-278). Springer International Publishing.
Donovan, T., Manning, D. J., & Crawford, T. (2008). Performance changes in lung nodule detection following perceptual feedback of eye movements. Proceedings of SPIE - The International Society for Optical Engineering in San Diego, CA,
Duffy, G. G. (2002). The case for direct explanation of strategies. In C. C. Block & M. Pressley (Eds.), Comprehension instruction: Research-based best practices (pp. 28-41). Guilford.
Duke, N. K., & Pearson, P. D. (2009). Effective practices for developing reading comprehension. Journal of education, 189(1-2), 107-122.
Eder, T. F., Richter, J., Scheiter, K., Keutel, C., Castner, N., Kasneci, E., & Huettig, F. (2020). How to support dental students in reading radiographs: effects of a gaze based compare and contrast intervention. Advances in Health Sciences Education: Theory and Practice.
Emhardt, S. N., van Wermeskerken, M., Scheiter, K., & van Gog, T. (2020). Inferring task performance and confidence from displays of eye movements. Applied Cognitive Psychology, 34(6), 1430-1443.
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data. MIT Press.
Foulsham, T., & Lock, M. (2015). How the eyes tell lies: Social gaze during a preference task. Cognitive science, 39(7), 1704-1726.
Gambrell, L. B., Malloy, J. A., & Mazzoni, S. A. (2011). Evidence-based best practices for comprehensive literacy instruction. In L. M. Morrow & L. B. Gambrell (Eds.), Best practices in literacy instruction (fourth edition) (Vol. 4, pp. 11-56). The Guilford Press.
Guthrie, J. T., & Mosenthal, P. (1987). Literacy as multidimensional: Locating information and reading comprehension. Educational Psychologist, 22(3-4), 279-297.
Hayes, A. F., & Krippendorff, K. (2007). Answering the call for a standard reliability measure for coding data. Communication Methods and Measures, 1(1), 77-89.
Henneman, E. A., Cunningham, H., Fisher, D. L., Plotkin, K., Nathanson, B. H., Roche, J. P., Marquard, J. L., Reilly, C. A., & Henneman, P. L. (2014). Eye tracking as a debriefing mechanism in the simulated setting improves patient safety practices. Dimensions of Critical Care Nursing, 33(3), 129-135.
Jarodzka, H., van Gog, T., Dorr, M., Scheiter, K., & Gerjets, P. (2013). Learning to see: Guiding students' attention via a model's eye movements fosters learning. Learning and Instruction, 25 (0), 62-70.
Jivet, I., Scheffel, M., Drachsler, H., & Specht, M. (2017). Awareness is not enough: Pitfalls of learning analytics dashboards in the educational practice. European Conference on Technology Enhanced Learning, Tallinn, Estonia.
Katalayi, G. B., & Sivasubramaniam, S. (2013). Careful reading versus expeditious reading: Investigating the construct validity of a multiple-choice reading test. Theory and Practice in Language Studies, 3, 877-884.
Knoop-van Campen, C. A. N., Doest., D. ter, Verhoeven, L., & Segers, E. (2021). The effect of audiosupport on strategy, time, and performance on reading comprehension in secondary school students with dyslexia. Annals of Dyslexia, 1-20
Knoop-van Campen, C. A. N. & Molenaar, I. (2020). How teachers integrate dashboards into their feedback practices. Frontline Learning Research, 8(4), 37-51.
Knoop-van Campen, C. A. N., Wise, A., & Molenaar, I. (2021). The equalizing effect of teacher dashboards on feedback in K-12 classrooms. Interactive Learning Environments, 1-17.
Kok, E. M., Aizenman, A. M., Võ, M. L.-H., & Wolfe, J. M. (2017). Even if I showed you where you looked, remembering where you just looked is hard. Journal of Vision, 17(12), 1-11.
Kostons, D., van Gog, T., & Paas, F. (2009). How do I do? Investigating effects of expertise and performance-process records on self-assessment. Applied Cognitive Psychology, 23(9), 1256–1265.
Krishnan, K. S. D. (2011). Careful versus expeditious reading: The case of the IELTS reading test. Academic Research International, 1(3), 25.
Liu, F. (2010). Reading abilities and strategies: A short introduction. International Education Studies, 3(3), 153-157.
Madsen, J., Julio, S. U., Gucik, P. J., Steinberg, R., & Parra, L. C. (2021). Synchronized eye movements predict test scores in online video education. Proceedings of the National Academy of Sciences, 118(5).
Mason, L., Pluchino, P., & Tornatora, M. C. (2015). Eye-movement modeling of integrative reading of an illustrated text: Effects on processing and learning. Contemporary Educational Psychology, 41, 172-187.
Molenaar, I., & Knoop-van Campen, C. (2019). How teachers make dashboard information actionable. IEEE Transactions on Learning Technologies, 12(3), 347-355.
Murnane, R., Sawhill, I., & Snow, C. (2012). Literacy challenges for the twenty-first century: Introducing the issue. The Future of Children, 3-15.
Oudman, S., van de Pol, J., Bakker, A., Moerbeek, M., & van Gog, T. (2018). Effects of different cue types on the accuracy of primary school teachers' judgments of students' mathematical understanding. Teaching and Teacher Education, 76, 214-226.
Pikulski, J. J., & Chard, D. J. (2005). Fluency: Bridge between decoding and reading comprehension. The Reading Teacher, 58(6), 510-519.
Rienties, B., Herodotou, C., Olney, T., Schencks, M., & Boroowa, A. (2018). Making sense of learning analytics dashboards: A technology acceptance perspective of 95 teachers.International Review of Research in Open and Distributed Learning, 19(5).
Rosengrant, D., Hearrington, D., & O’Brien, J. (2021). Investigating Student Sustained Attention in a Guided Inquiry Lecture Course Using an Eye Tracker. Educational Psychology Review, 33(1), 11-26.
Salmerón, L., Naumann, J., García, V., & Fajardo, I. (2017). Scanning and deep processing of information in hypertext: An eye tracking and cued retrospective think‐aloud study. Journal of Computer Assisted Learning, 33(3), 222-233.
Scheiter, K., Schubert, C., & Schüler, A. (2018). Self-regulated learning from illustrated text: Eye movement modelling to support use and regulation of cognitive processes during learning from multimedia. British Journal of Educational Psychology, 88(1), 80-94.
Schutz, K. M., & Rainey, E. C. (2020). Making sense of modeling in elementary literacy instruction. The Reading Teacher, 73 (4), 443-451.
Shute, V. J., & Zapata-Rivera, D. (2012). Adaptive educational systems. In P. J. Durlach & A. Lesgold (Eds.), Adaptive technologies for training and education (Vol. 7, pp. 1-35). Cambridge University Press.
Špakov, O., Siirtola, H., Istance, H., & Räihä, K. (2017). Visualizing the reading activity of people learning to read. Journal of Eye Movement Research, 10(5), 1-12.
Tijmstra, J., & Boeije, H. (2011). Wetenschapsfilosofie in de context van de sociale wetenschappen. Boom Lemma.
Urquhart, S., & Weir, C. (1998). Reading in a Second Language: Process, Product and Practice. Routledge.
Van de Pol, J., de Bruin, A. B. H., van Loon, M. H., & van Gog, T. (2019). Students’ and teachers’ monitoring and regulation of students’ text comprehension: Effects of comprehension cue availability. Contemporary Educational Psychology, 56, 236-249.
Van de Pol, J., Van den Boom-Muilenburg, S. N., & Van Gog, T. (2021). Exploring the relations between teachers’ cue-utilization, monitoring and regulation of students’ text learning. Metacognition and Learning, 1-31.
Van Gog, T., & Jarodzka, H. (2013). Eye tracking as a tool to study and enhance cognitive and metacognitive processes in computer-based learning environments. In R. Azevedo & V. Aleven (Eds.), International Handbook of metacognition and learning technologies (pp. 143-156). Springer Science+ Business media.
Van Gog, T., Jarodzka, H., Scheiter, K., Gerjets, P., & Paas, F. (2009). Attention guidance during example study via the model’s eye movements. Computers in Human Behavior, 25(3), 785-791.
Van Leeuwen, A. van, Knoop-van Campen, C. A. N., Molenaar, I., & Rummel, N. (2021). How teacher characteristics relate to how teachers use dashboards: Results from two case studies in K-12. Journal of Learning Analytics, 8(2), 6-21.
Van Leeuwen, A., Rummel, N., & van Gog, T. (2019). What information should CSCL teacher dashboards provide to help teachers interpret CSCL situations? International Journal of Computer-Supported Collaborative Learning , 14(3), 261-289.
Van Wermeskerken, M., Litchfield, D., & van Gog, T. (2018). What am I looking at? Interpreting dynamic and static gaze displays. Cognitive Science, 42(1), 220-252.
Verbert, K., Govaerts, S., Duval, E., Santos, J. L., Van Assche, F., Parra, G., & Klerkx, J. (2014). Learning dashboards: An overview and future research opportunities. Personal and Ubiquitous Computing, 18(6), 1499-1514.
Võ, M. L. H., Aizenman, A. M., & Wolfe, J. M. (2016). You think you know where you looked? You better look again. Journal of Experimental Psychology: Human Perception and Performance , 42(10), 1477-1481.
Xhakaj, F., Aleven, V., & McLaren, B. M. (2017). Effects of a teacher dashboard for an intelligent tutoring system on teacher knowledge, lesson planning, lessons and student learning. European Conference on Technology Enhanced Learning,
Zelinsky, G. J., Peng, Y., & Samaras, D. (2013). Eye can read your mind: Decoding gaze fixations to reveal categorical search targets. Journal of Vision
, 13 (14), 1-13.

Appendix A

Codebook and detailed overview coding process

For awareness, we coded whether teachers mentioned that the student looked at or paid attention to the separate AOIs (i.e., title, subheadings 1 and 2, section 1, 2, 3, and the image). A score of 1 was given if the teacher mentioned that the student looked (a lot) at this part of the stimulus, and a score of -1 was given if the teacher mentions that the student did not look at this part of the stimulus (also if the teacher mentioned ‘missing’, ‘looking only a little bit at’). In addition to the AOIs, we also coded looking at ‘everything’, ‘important words/sentences’, ‘rereading’ with scores 1 and -1. If a code was not mentioned or if, in rare cases, both scores could be applied, the score 0 (missing value) was applied.

A similar approach was used for interpretation, but here 1 refers to high (e.g., high performance) and -1 refers to low (e.g., low confidence). These scores were used for ‘confidence’, ‘performance’, ‘reading time’, ‘efficiency’, ‘use of signal words’, ‘use of headings’, ‘concentration’, and ‘motivation’. Note that for reading time, score 1 refers to quick reading and score -1 refers to slow reading. Finally, the type of strategy was scored as intensive, selective, general (any strategy, but unclear which strategy), or no strategy.

For enactment, the form of contrast described above was not possible, so for those codes, only 1 (code mentioned) or 0 (code not mentioned) were scored. The codes were ‘no action is needed’, ‘give a compliment’, ‘provide instruction (in the form of an explanation / modelling / another form)’, ‘adapt the assignment’, ‘need more information (from a conversation with student / the performance / other information)’, ‘I do not know how to act on this’, ‘other actions’. To make contrasting codes, we additionally recoded codes into ‘any action’ (any form of enactment mentioned) versus ‘no action’ (teacher writes that no action is needed or only the action ‘give a compliment’).

Instructions awareness, interpretation, and enactment

Code the three questions together, i.e., codes for awareness can also be scored if this information is provided in the second question. Code what is actually said and avoid making inferences, i.e., do not code ‘everything’ if a participant summed up all different parts, but only if the person actually said that ‘everything’ was read.

For awareness and interpretation, two scores are available: -1 and 1. Score 0 if this code is not mentioned (missing value). Score -1 if the negative form (see definitions) is mentioned and 1 if the positive form (see definitions) is mentioned. If something is mentioned, but it is unclear whether -1 or 1 should be coded, score as 0. For Enactment, only 0 (not mentioned) and 1 (mentioned) are possible.

If a person refers back to a previous answer, simply code that as everything not mentioned (as it is impossible to know which aspects of the other answer the person refers to).

All codes are identical for displays X, Y, and Z.  


Actual quotes as examples are italicized. Translations to Dutch can be found in brackets.