Koponen & Kokkonen

A systemic view of the learning and differentiation of scientific concepts: The case of electric current and voltage revisited

Article received 12 February 2014 / revised 14 April 2014 / accepted 29 June 2014 / available online 3 July 2014

In learning conceptual knowledge in physics, a common problem is the incompleteness of a learning process in which students’ personal, often undifferentiated concepts take on a more scientific and differentiated form. With regard to such concept learning and differentiation, this study proposes a systemic view in which concepts are considered as complex, dynamically evolving structures. The dynamics of concept learning and differentiation is driven by the competition of model utility in explaining the evidence. Based on the systemic view, we introduce a computational model, which represents the essential features of the conceptual system in the form of a directed graph model (DGM), where concepts are nodes connected to other conceptual elements (nodes) in the graph. The results of the DGM are then compared to the empirical findings to identify differentiation between the concepts of electric current and voltage, based on a re-analysis of previously published empirical findings on upper secondary school students’ learning paths in the context of DC circuits. The comparison shows that the model predicts and explains many relevant, empirically observed features of the learning paths of concept learning and differentiation, such as: 1) context-dependent dynamics, 2) the persistence of ontological shift and concept differentiation, and 3) the effects of communication on individual learning paths. The systemic view and the DGM based on it make these generic features of interest in concept learning and differentiation understandable and show that these features are associated with the guidance of theoretical knowledge. Finally, we briefly discuss the implications of the results for teaching and instruction.

Keywords: Concept learning; concept differentiation; ontological shift; complex system; directed graph model

Corresponding author: Ismo T. Koponen, Department of Physics, P.O. Box 64, FI-00014 University of Helsinki, Finland. ismo.koponen@helsinki.fi

Learning scientific concepts is a demanding and lengthy process, in which the learner’s initial and personal concepts and conceptions gradually change towards concepts that are more scientific in that they are part of an extensive and coherent knowledge system (theory), which regulates and constrains their use. Previous research (Lee & Law, 2001; Reiner, Slotta, Chi & Resnick, 2000; Smith, Carey & Wiser, 1985) has raised the notion that learners seldom use concepts in the same sense as they are used in scientific knowledge. One particular but central question is proper concept differentiation. When two closely related concepts are linked to the same phenomenon, novice learners do not always properly understand them as different concepts. Rather, the concepts are confused and used in undifferentiated ways (Lee & Law, 2001; Reiner et al., 2000; Smith et al., 1985). The aim of the learning process then is to produce a clearer and more scientific understanding not only of how such concepts differ, but also of how they are related, a process referred to here as concept differentiation.

Concept differentiation has often been discussed from the viewpoint of “ontological shift”, which holds that ontological attributions are at the centre of concept learning and that changes in those attributions are the main mechanisms behind differentiation (Chi & Slotta, 1993; Chi, 2005, 2008). This position finds support in the notion that ontological commitments in concept development are deeply rooted in the psychological aspects of concepts (Murphy, 2004; Keil, 1989). However, the ontological shift view has been criticized for overemphasising the role of static ontologies (Gupta, Hammer & Redish, 2010) and failing to pay proper attention to the role of theory in learning (Ohlsson, 2011).

In addition, when studying concept differentiation, one should understand that concepts must be shared and be communicable to other learners. Communicating and sharing concepts is closely related to the problem of knowledge convergence, discussed mostly in terms of the convergence of explanations and of seeking consensus and a common way to understand concepts and terms (Jeong & Chi, 2007; Weinberger, 2007). However, how communication affects the learning of scientific concepts and the differentiation process or, more generally, which stages or steps of the learning paths communication could affect, remains unclear.

Consequently, our understanding of the learning path in concept differentiation remains partially incomplete. One promising way to remedy this lack of understanding views learners’ concepts as complex structures and the learning process itself as a systemic process in which different conceptual elements interact (Brown & Hammer, 2008; Koponen & Huttunen, 2013). The present study proposes a new way of synthesising different views by focusing explicit attention on concept learning and concept differentiation, so that the synthesis takes into account aspects of interest for personal concepts, such as the role of ontological attributions, and aspects relevant to scientific concepts, such as the communicability of concepts and their constrained, law-like use. Such a synthesis, referred to here as the systemic view, sees concepts as complex structures. Different stages of concept learning, with partially differentiated concepts, are then seen as partial projections of the structure in different real situations; the projections are partial and incomplete mappings of more complete systems. On the level of personal concepts, the systemic view stems from recent views of the heterogeneity of concepts, which emphasise the diversity of roles of concepts in different cognitive processes (Machery, 2009). On the level of scientific concepts, the systemic view borrows much from the “dynamic frames” view of scientific concepts, where both ontological attributions and theoretical, law-like (nomic) knowledge are considered central to concept development (Andersen & Nersessian, 2000; Andersen, Barker & Chen, 2006). In the systemic view, the learning process also requires a driving force or mechanism; this study suggests that the utility of models, through which the concepts are used, and the competition of models based on utility provide that mechanism (Ohlsson, 2009, 2011).

The systemic model is applied here to discuss and simulate concept differentiation and its generic features in one empirically well-studied case: the concepts of electric current and voltage. The generic features of interest are the robustness of certain simple forms of the concepts (often called misconceptions or intuitive conceptions), the strong context dependence of these conceptions, the occurrence of ontological shift and its persistence once achieved, and the role of theoretical knowledge in concept differentiation and ontological shift. This study focuses on developing the theoretical background of the systemic model. To that end, we further develop the directed graph model that we recently introduced (Koponen, 2013). We embody the theoretical model by re-analysing empirical data on the learning processes of nine students working in groups of three (Koponen & Huttunen, 2013). We introduce a simulation model, based on directed graphs, to model the learning path and to reproduce the most important generic features of the empirical findings of concept differentiation.

Finally, we discuss some interesting implications for teaching that the model raises. The model presented here supports the view that ontological shift is not the primary agent in learning scientific concepts; rather, it stems from theoretical learning, driven by model utility. This means that instead of focusing on ontological training and on developing instructional methods based on it, attention should focus on how theoretical knowledge is introduced and applied in the learning process. Another important notion is the role of context in learning and how students are gradually introduced to more demanding tasks. The model results show that overly complex tasks cannot promote learning if students lack sufficiently advanced concepts; yet overly simple tasks lead to stagnation, where a learner gets stuck on simple models and unsophisticated concepts. What is needed is a learning path that is progressive and that demands the use of complex models and concepts. According to the view presented here, the learner needs to receive theoretical knowledge through instruction and to see its utility in sufficiently complex situations, thus avoiding “overlearning” of simple cases. This emphasises not only the teacher’s role, but also the importance of variation in the contexts in which the knowledge is applied. These notions, based on the systemic view and on its computational embedding, therefore have direct practical consequences for how one should design learning paths and the role of the teacher in them.

Pre-scientific concepts are often idiosyncratic, context dependent and difficult to communicate. Of course, scientific concepts, as used by advanced learners and experts, not only share some aspects with “personal” pre-scientific concepts, but also differ from them in important ways. One of the most important differences is that scientific concepts often refer to categories (entities or objects) that − like models − are themselves purely conceptual rather than categories within the reach of experience or observation, and their use is law-like (nomological) and constrained (Andersen & Nersessian, 2000; Andersen, Barker & Chen, 2006; Hoyningen-Huene, 1993). Nevertheless, personal and scientific concepts share features, in particular on the level of how theory or theory-like knowledge structures the sets of attributes that characterise the concepts.

Concept differentiation is a process where the sets of attributes that characterise and typify concepts become structured so that no other concept shares the same set and values of its attributes. In the case of scientific concepts, this requires that a given concept have law-like (i.e. nomological) relationships to other concepts. It is through these features that the concepts acquire the sharp descriptive power they hold in scientific theories (Andersen & Nersessian, 2000; Andersen et al., 2006; Hoyningen-Huene, 1993). At the core of learning scientific concepts is a transformation process where personal, individual concepts, which are meaningful to the learner himself or herself but not easily communicable or meaningful to other learners, are transformed into concepts that are more communicable and for which consensus exists on how they can be used in relation to other concepts. This latter way of using concepts is already scientific in that normative and law-like rules govern the use of the concepts (the nomological use of concepts), and the attributes that characterise them are sharply identified. These brief notions suggest that a suitable theoretical underpinning must link the individual learning and use of concepts to the shared, public use of concepts; intra- and interpersonal levels of concept use and learning must be coupled.

2.1. Concepts: Personal and shared

Discussions of the learning process, where a student learns and acquires scientific concepts and becomes a fluent user of such concepts, must focus on differences in the ways in which concepts are understood when seen from the viewpoint of an individual’s personal cognition and learning, and when concepts are discussed as they are shared and used in scientific communities. In the former case, concepts are personal, often un-explicated, and seen as strongly context dependent (Carey, 2010; Gopnik & Meltzoff, 1997), whereas in the latter case, concepts are explicated elements of scientific theories, and their proper use is constrained by the knowledge system as a whole (Andersen, 2006; Andersen & Nersessian, 2000; Hoyningen-Huene, 1993). To highlight this difference between personal and shared scientific concepts, we use the terms “intrapersonal concepts” and “interpersonal concepts”. Of these, the interpersonal concepts can be shared on the level of small and local groups, such as study groups in learning, or on the level of extended and global groups, such as scientific communities, in which case interpersonal concepts are simply called scientific concepts. The learning process, where an individual’s intrapersonal concepts acquire scientific character and are transformed into interpersonal ones, involves epistemic dimensions (the use of concepts in the context of explanation) and communicative dimensions, where concepts are used in communication and in finding consensus on what is explained and how. A description of the learning process from intrapersonal to interpersonal concepts requires a model of concepts which, at one end of the continuum, takes the form of an intrapersonal concept and, at the other end, the form of an interpersonal, scientific concept.

In psychology and cognitive science, two important viewpoints of interest here regarding intrapersonal concepts are concepts as prototypes and concepts as theories (Machery, 2009; Murphy, 2004; Smith & Medin, 1981). In the concepts-as-prototypes view, the prototype represents a certain class of entities or objects to which the concept refers, and the prototype is understood as a body of knowledge about the properties of the members in that class. However, such properties are assumed to be only statistical or probabilistic, and are not strictly necessary or sufficient by themselves to determine membership (Machery, 2009; Murphy, 2004). The statistical or probabilistic knowledge contained in the prototypes can be about either 1) the typicality of the category or 2) its cue-like properties. In both cases, a set of properties or attributes and their values indicate how likely or significant a given property is in regard to the identification of the concept (Smith & Medin, 1981). In concept learning, where concepts develop and are transformed, as in, for example, a concept combination process, new concepts emerging from the combination process inherit some − but not necessarily all − of the properties of the ancestor prototypes (Murphy, 2004). A reverse process of concept differentiation can be understood as a process where new concepts inherit partial or split sets of the properties of the original concepts.

From the viewpoint of concepts as prototypes, an important part of concept learning is to learn the concept’s ontological attributions, which determine to which ontological categories the concept refers (Keil, 1989; Murphy, 2004). The ontological shift theory of conceptual change (Chi & Slotta, 1993; Reiner et al., 2000; Slotta & Chi, 2006) addresses the way in which learners associate substance- and process-like attributes with the concepts they use. According to the ontological shift theory, many students’ learning difficulties originate from a misconceived ontological class (Chi & Slotta, 1993; Slotta & Chi, 2006).

The view of concepts-as-theories is based on psychological research, which views conceptual knowledge as theory-like (Carey, 2010; Gopnik & Meltzoff, 1997). The concepts-as-theory view focuses on the role of causal knowledge in the categorisation process and in concept learning. Concepts, in this view, are first and foremost carriers of causal knowledge about the properties of the members of classes to which the concepts refer. Therefore, causal knowledge is considered crucial for concept recognition and differentiation. Quite often, the role of causal knowledge is discriminative with regard to the attributes or properties attached to a concept (Machery, 2009; Murphy, 2004; Rehder, 2003).

These two different views of intrapersonal concepts can be thought of as two different ways to use concepts (Machery, 2009), thus reflecting the multifaceted aspects of concepts. Such multifacetedness can be considered as a sign of a real cognitive difference between the various ways of using concepts (Machery, 2009) or as different projections (or mappings) of a more integrated, complex and generic system that projects differently in different real situations (Danks, 2010). Here, we adopt and further develop the latter viewpoint of concepts as systems projecting differently in different situations. The aspects of greatest interest in developing such a systemic view are: 1) attributes and sets of attribute values (as in the prototype view) and 2) causal and theoretical knowledge and its role in distinguishing the attributes in concept combination (as in the concepts-as-theories view).

In learning, concepts must be shared with other learners, instructors and teachers; concepts must be interpersonal. When concepts are shared, there must be common agreement on the referents of the concepts, the ways in which they refer and the ways to use concepts; there must be certain norms of usage. In particular, when concepts are scientific concepts, they are “public” in that members (scientists) of institutional groups (scientific communities) share these interpersonal concepts. Crucial to this is not only to agree on the norms, but also to link the norms to accepted verification methods, such as observations, experiments and models (Andersen et al., 2006; Andersen & Nersessian, 2000; Hoyningen-Huene, 1993).

There are relatively few attempts to discuss scientific concepts so that a connection is made to a psychological understanding of concepts. One notable exception, however, is a view which sees scientific concepts as dynamic frames embracing conceptual knowledge (Andersen et al., 2006). The dynamic frame view assumes that advanced scientific concepts are acquired by the same process of categorisation as everyday concepts. The categorisation of interest here is how different exemplar-type problems fall into the same classes based on how different types of models serve in solving those problems. Then, the characteristic (but not defining) features of concepts emerge from the reference to classes of models, or clusters of models. Scientific concepts, where the models form the classes relevant for learning concepts, are also regulated and constrained by certain rules for applying concepts in the construction of models; the norms guide how to use the concepts. The dynamic frames incorporate the attributes of concepts, in much the same way as in the prototype theory, and the theoretical knowledge as in the theory-theory approach, but now theoretical knowledge has the role of organising the attributes and imposing constraints on their co-variation (Andersen & Nersessian, 2000; Andersen et al., 2006).

2.3. Concept learning as convergence process

The focal point of this study is the individual learner’s process of learning scientific concepts, where personal concepts are transformed into scientific concepts. However, because learning takes place in a community of students and teachers, we must also understand how communication affects the learning process. Collaborative learning and sharing ideas in small groups have been shown to enhance student learning. Such learning is beneficial when the members’ knowledge supplements others’, but differs only slightly from it; members show some knowledge equivalence and knowledge sharing within the group (Jeong & Chi, 2007; Weinberger, 2007). Here, however, the re-analysed and revisited empirical data contain little information about the communication, and although the effect is evident, very simple models serve here to estimate the effects of communication on concept differentiation.

2.4. Systemic view

The systemic view sees concepts as part of a knowledge system, where the “concept” as a part of the operation of the system may have a plurality of appearances and project differently in different contexts, yet the parts of the system remain unchanged. Recently, some have suggested a different but related type of systemic view. It employs the ideas of dynamic complex systems, where robust and persistent conceptual patterns can arise in emergent fashion from interactions of elemental pieces of the dynamic system (Brown & Hammer, 2008). These interactions can thus give rise to a full spectrum of different projections of concepts, some of which are simple and some, complex. The systemic view is also adopted here, so concepts are considered functional parts of the system, affected by the system and its evolution.

The systemic view requires specific representations of concepts which can capture their complex, multifaceted and dynamic nature. A suitable model of such concepts should address at least the following features:

1) Attributes and sets of attribute values as in prototype and dynamic frame views.

2) Theoretical knowledge in the role of constraining and guiding the use of concepts.

4) Model competition and utility as mechanisms affecting the evolution of concepts.

Requirements 1-2 are essential to retaining a connection to a psychological view of concepts and concept learning, which addresses intrapersonal concepts and the transition from intra- to interpersonal concepts. Requirements 2-4 are essential to describing how intrapersonal concepts develop or change into scientific concepts. In what follows, we introduce just such a systemic model in section 3 and then, in section 4, discuss how empirical results concerning the differentiation of the concepts electric current and voltage can be embedded within it. Finally, in section 5, we present a computational embedding of the systemic view and use the computational model to simulate the process of concept differentiation.

The systemic view of concept learning and differentiation sees concepts as constructs which, on the one hand, take the form of intrapersonal concepts, and on the other hand, the form of interpersonal concepts. Such constructs are embedded in a conceptual system which evolves and affects the constructs as part of the system’s own evolution. The evolution of the concept system consists of changes in the connectedness of the concepts and in the strength of those connections. In that change, models play a central role, because through models, the concepts become projected onto actual, real situations.

3.1. Structure: Constructs

Concepts as complex structures are called here C-constructs. C-constructs are first and foremost connected to sets of attributes, where connecting links carry information about attributes and the strength of associations with those attributes. Other elements of the system carry knowledge of regularities and relate concepts to each other in different ways: causally, through constrained determination (constrained co-variation in a law-like manner), or by constraining the use of concepts (e.g. conservation laws). These schemes are called determination constructs or, in shorthand, D-constructs.

C- and D-constructs are the most elemental conceptual constructs of the systemic model, and as such, they offer no explanations or predictions by themselves. The task of explaining or predicting falls on models, which utilise the C- and D-constructs as their constituents. Models project concepts (C-constructs) onto phenomena to be explained, and through the success or failure of this projection, C-constructs are altered. The models, called here M-constructs, are also conceptual constructs, but unlike C- and D-constructs, they are context dependent.

The relationship of C-constructs to characteristic attributes is familiar from the prototype theory of concepts and is not only essential in describing personal concepts, but also important in describing scientific concepts. The basic level of attributes consists of simple and unstructured sets of attributes {a1, a2, ..., ak}, but more structured combinations can fall under more general schemes. These more general sets are subordinated under a more general (e.g. constraining) scheme, which here is typically a D-construct. In the course of concept development, these attributes are inherited, although some of the inherited attributes can be discarded as the concept evolves. These features resemble the dynamic frames approach to concepts (Andersen & Nersessian, 2000; Andersen et al., 2006). The attributes are not strictly mutually exclusive, especially when C-constructs are not used together. However, the more important it is to use two C-constructs together, the more difficult it is to maintain dissonant attributes; C-constructs become differentiated with regard to their attributes, as they should be if they represent scientific concepts.

D-constructs are general schemes which relate C-constructs to each other, typically in the form of causal connections or in the form of constrained determination (i.e. constrained co-variance with no causal dependence). In some cases, a D-construct can simply constrain how a single C-construct can be applied. Therefore, these constructs are essentially the carriers of theoretical knowledge (cf. Machery, 2009; Rehder, 2003). The D-construct is the general template of the form of determination and is largely independent of the context, yet it prescribes how one can legitimately apply C-constructs to a given context through models. D-constructs play a crucial role in discriminating between attributes, because D-constructs connect C-constructs. Through D-constructs, the dissonances between attribute associations are revealed.

M-constructs are designed so that they serve as models which explain phenomena or their selected properties. They use C-constructs, because concepts are needed to build models (Nersessian, 2008). In some cases, M-constructs are also related to D-constructs, which then specify the relationship between C-constructs when more than one C-construct is involved. M-constructs are the basic vehicles for explaining or matching predictions with observable features of phenomena or, if one so wants, for selecting certain features of phenomena which fall under the explanatory power of a given M-construct. On the most advanced level, M-constructs are full-fledged scientific models. On the most basic level, M-constructs are simple and even self-explanatory. However, in either case, M-constructs are evaluated only against observational evidence {e1, e2, ..., ek}, which may either lend them support or lead to their rejection or inhibition. M-constructs compete against other available M-constructs in providing the most likely explanation of the evidence.

Figure 1. (see pdf file) A schematic diagram (right) of C- and D-constructs and their connections to attributes {a1, a2, ..., ak}, and (left) C-, D- and M-constructs connected to each other and to sets of evidence {e1, e2, ..., ek}. Links can be congruent (solid line) or dissonant (dashed line).

The systemic view sees the knowledge system as a connected network of C-, D- and M-constructs, where connections between these constructs can continuously change when the evidence changes. Of course, the system can also reach stable states in which connections no longer change when additional evidence becomes available. Changes in connections are based on locally effective rules, but the total effect depends on the global state of the system as a whole (due to the connectedness of the system). Different concepts can then be expressed as different relational structures of the pieces or as different constellations of elements. Within the systemic view, connections can be a type of positive constraint so that a connection strengthens the role of a given element. The connections can also be a type of negative constraint so that the connections weaken the role of the element. Identifying connections and determining whether they are negative or positive must be based on empirical evidence.
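The network of constructs described above can be illustrated with a small data structure. The following is only a minimal sketch under stated assumptions, not the DGM itself: the class name ConstructGraph, the node labels D1 and M1, the link directions and the numerical weights are all hypothetical, chosen to show how congruent (positive) and dissonant (negative) links could be stored and summed.

```python
from dataclasses import dataclass, field


@dataclass
class ConstructGraph:
    """Directed graph of constructs with signed links (illustrative sketch only)."""
    kinds: dict = field(default_factory=dict)  # node name -> "C", "D", "M", "attribute", "evidence"
    links: dict = field(default_factory=dict)  # (source, target) -> signed weight

    def add_node(self, name, kind):
        self.kinds[name] = kind

    def connect(self, source, target, weight):
        # Assumption: weight > 0 is a congruent link, weight < 0 a dissonant link.
        if source not in self.kinds or target not in self.kinds:
            raise KeyError("both endpoints must be added before linking")
        self.links[(source, target)] = weight

    def dissonant_links(self):
        """Return all links with a negative weight, sorted by endpoint names."""
        return sorted(pair for pair, w in self.links.items() if w < 0)

    def support(self, node):
        """Sum of the signed weights on the links pointing to a node."""
        return sum(w for (_, target), w in self.links.items() if target == node)


# Hypothetical example: two C-constructs, one attribute, one D-construct,
# one M-construct and one piece of evidence.
graph = ConstructGraph()
graph.add_node("current", "C")
graph.add_node("voltage", "C")
graph.add_node("a1", "attribute")
graph.add_node("D1", "D")
graph.add_node("M1", "M")
graph.add_node("e1", "evidence")
graph.connect("current", "a1", 1.0)   # congruent attribute association
graph.connect("voltage", "a1", -0.5)  # dissonant attribute association
graph.connect("D1", "M1", 1.0)
graph.connect("current", "M1", 1.0)
graph.connect("e1", "M1", -0.5)       # evidence that weakens the model

print(graph.support("M1"))
print(graph.dissonant_links())
```

In this sketch the local rule is simply the addition of signed weights; any rule that updates link strengths when the evidence changes would have to be supplied separately.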

3.2 Dynamics: Model competition and utility

The evolution of the concept system is driven by the utility of M-constructs in explaining evidence (cf. Henderson, Goodman, Tenenbaum & Woodward, 2010; Ohlsson, 2009). The explanatory power of an M-construct changes with the changing amount of evidence to be explained. Because different M-constructs can explain the same evidence, M-constructs compete against each other. If the context is simple and only a little evidence needs to be explained, one can achieve this by using simple and only partially correct models that correspond to simple M-constructs. Then, the utility of the simple M-construct is better than the utility of more complex ones (e.g. scientific ones), and the simpler ones are therefore more likely to be adopted. However, with the increasing complexity of the context and greater amounts of evidence to be explained, the complex models (M-constructs), which explain more, gain utility and become adopted. Thus, it is important to note that in learning, the adoption of a model is a question not only of its correctness, but also of its utility (Ohlsson, 2009, 2011). It is assumed here that the different models and the evidence to be explained are known in advance. Many of the models may be inactive and much of the evidence unknown to the learner in the initial stages of learning. From the point of view of modelling the learning, one can assume a finite collection of possible models, some of them active and some inactive (cf. Henderson et al., 2010).
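The competition described above can be illustrated with a toy calculation. The utility function below is purely an assumption made for illustration (the number of evidence items explained minus a complexity penalty); it is not the utility measure of the DGM, and the model names, complexity values and the cost parameter are hypothetical.

```python
def utility(explains, complexity, evidence, cost=0.5):
    """Toy utility (assumed form): evidence items explained minus a complexity penalty."""
    return len(explains & evidence) - cost * complexity


def best_model(models, evidence):
    """Return the name of the model with the highest toy utility for the given evidence."""
    return max(models, key=lambda name: utility(*models[name], evidence))


# Hypothetical models: name -> (set of evidence items explained, complexity).
models = {
    "simple": ({"e1"}, 1),
    "complex": ({"e1", "e2", "e1'", "e2'"}, 3),
}

# Little evidence to explain: the simple model has the better utility.
print(best_model(models, {"e1"}))

# More evidence to explain: the complex model gains utility and wins.
print(best_model(models, {"e1", "e2", "e1'", "e2'"}))
```

With these assumed numbers, the simple model is preferred as long as only e1 needs explaining, whereas the complex model is adopted once the full evidence set of contexts I and II is included, which mirrors the qualitative behaviour described in the text.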

3.3 Concept convergence

In a learning situation where concept learning and differentiation take place, learners share concepts in group discussions within small groups. In learning, knowledge convergence is often considered crucial to forming shared, public concepts (Weinberger et al., 2007; Jeong & Chi, 2007). The empirical data of concept differentiation indicate that knowledge also converges during this process; the ways in which one uses and understands the concepts become similar, at least to a certain degree (Koponen & Huttunen, 2013). In the systemic view, this kind of knowledge convergence means that the ways in which the C-constructs link to other elements of knowledge and their attributes become more similar among the members of the learning group during the learning process. Here, knowledge convergence is discussed only to the extent that it concerns concept differentiation and takes place via communication in small groups of three. Therefore, in what follows, we concentrate on simple triadic communication patterns (discussed in more detail in section 4) and assume that the effect of convergence takes place mainly through the utility of models. The communication is described simply as consensus-based knowledge sharing where, through communication, all group members always adopt the model with the strongest utility. There is no threshold effect on the adoption of the model. Such a convergence model exaggerates the effect of communication on learning, but it is an adequate model for estimating the maximal expected effect of communication on concept differentiation.
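The consensus rule described above (all group members adopt the model with the strongest utility, with no threshold) can be sketched in a few lines. This is only an illustration under assumptions: the function name converge, the member labels A, B and C, the model labels M1 and M2 and the utility values are hypothetical.

```python
def converge(group):
    """Consensus-based sharing: every member adopts the model with the
    strongest utility found anywhere in the group (no adoption threshold).

    group: dict mapping member -> {model name: utility}
    returns: dict mapping member -> adopted model name
    """
    strongest_model, _ = max(
        ((model, value)
         for utilities in group.values()
         for model, value in utilities.items()),
        key=lambda pair: pair[1],
    )
    return {member: strongest_model for member in group}


# Hypothetical triad with individually held models and utilities.
triad = {
    "A": {"M1": 0.2, "M2": 0.1},
    "B": {"M1": 0.3},
    "C": {"M2": 0.9},
}

print(converge(triad))
```

Because the rule ignores how far each member's own utilities are from the group maximum, it represents the upper bound of the communication effect, in line with the remark that such a model exaggerates the effect of communication.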

Research on learning scientific concepts and concept differentiation has been conducted in several ways and on different topics, but perhaps most extensively on the concepts electric current and voltage (Cohen, Eylon & Ganiel, 1983; Shipstone, 1984; Engelhardt & Beichner, 2004; Koumaras, Kariotoglou & Psillos, 1997; Lee & Law, 2001; McDermott & Shaffer, 1992; Reiner et al., 2000). Some of the studies have focused on students’ explanatory models (Cohen et al., 1983; Engelhardt & Beichner, 2004; McDermott & Shaffer, 1992; Koumaras et al., 1997; Shipstone, 1984), while some other studies have focused on ontological attributions (Lee & Law, 2001; Reiner et al., 2000). The general outcome of these studies is that the concepts electric current and voltage are often mixed with personal, intuitive concepts or conceptions (or intrapersonal concepts) and, furthermore, are poorly differentiated.

The impact of numerous empirical studies on deeper theoretical understanding of concept learning and differentiation, however, has been relatively modest for at least two reasons. First, although these studies have identified a variety of different types of models, intuitive conceptions and ontological attributions, they have failed to abstract from the empirical details general and generic features which could provide a broad enough theoretical perspective to understand the relationship between different views. Therefore, for lack of a sufficiently broad theoretical perspective, discussions have often focused on the differences of a preferred theoretical perspective over that of some other perspective, rather than trying to provide a more integrated, progressive and broader theoretical framework that makes the partial accounts understandable (see e.g. Chi & Brem, 2009; Gupta et al., 2010; Ohlsson, 2009). We believe that many empirical findings can be captured within the systemic model when suitably idealised to reveal the essential generic features behind the multitude of details. Furthermore, the systemic view can help us to understand how different aspects of concept learning are related. In what follows, we focus only on features of interest to advanced learners, typically those on an upper-secondary school level or first-year university level.

4.1. Empirical results revisited and re-interpreted

The purpose of the present work is to provide a new theoretical framework to discuss concept differentiation when learners’ concepts take on a scientific character. Rather than report new empirical results, this study re-uses and re-analyses already published empirical data (Koponen & Huttunen, 2013). The empirical data consist of interviews with nine upper secondary school students about their conceptions of electric current and voltage in DC circuits. The students built DC circuits, observed their behaviour, and then proposed explanations for the observed brightness of light bulbs. The interviews were transcribed and analysed to identify the models students use to explain the behaviour. The nine students discussed their explanations in groups of three. The study consisted of three different contexts I-III:

I: Light bulbs in series. The participants compared two variants (a single light bulb and two light bulbs) in terms of the brightness of the bulbs. This comparison produces evidence e1 and e2.

II: Light bulbs in parallel. The first variant again involves a single light bulb. The second variant involves two light bulbs in parallel. Comparing the two variants yields evidence e1’ and e2’.

III: Comparison of the brightness of light bulbs in series (I) and in parallel (II). In the first variant, participants compare the brightness of light bulbs in series and in parallel circuits to the one-bulb case only. In the second variant, participants compare series and parallel cases to each other. This produces evidence e1’’ and e2’’.

These six types of evidence, together with the single-bulb observations, form an evidence set E = {e0, e1, e2, e0’, e1’, e2’, e0’’, e1’’, e2’’}, with e0, e0’ and e0’’ representing observations of the brightness of a single light bulb in each context (the brightest light bulb). Further details about the empirical setup, design and excerpts from the student interviews are reported by Koponen and Huttunen (2013). These empirical studies reveal some common features, which answer the following questions:

2. What are the determination (constraining or causal) schemes students employ as part of their models?

3. What attributes do students associate with the concepts, models or determination schemes they use?

4. How does communication in a small group (3 students) affect the relationships between concepts, models and determination schemes?

The results of the re-analysis serve here to construct idealised sets of the M- and D-constructs in contexts I-III, with a summary of the results in Table 1. A summary of the attributes revealed by the analysis is in Table 2.

M-constructs M1 and M2 are well-known electric current-based intuitive models found in many empirical studies (see Koponen & Huttunen, 2013, and references therein), while constructs M1’ and M2’ represent corresponding models, but are based on voltage (undifferentiated from current). These appear in relatively fewer cases, but are taken into account here. Constructs M3 and M3’ are partially correct explanations, which take into account the role of components in determining the current. Construct M3’, however, appears only once in the empirical data. Construct M4 is the correct scientific model based on Ohm’s law (D3) and Kirchhoff’s laws I (D1) and II (D2) which correctly differentiates between electric current and voltage.

Table 1. The M- and D-constructs inferred from the empirical study (Koponen & Huttunen, 2013).

Construct	Description	Construct	Description
M1	The battery as a source of current.	M1’	The battery as a source of voltage.
M2	M1+ components consume current.	M2’	M1’+ components consume voltage.
M3	M1 + voltage over components creates current.	M3’	M1’ + current over components creates voltage.
M4	Model based on Ohm’s law + Kirchhoff’s laws KI and KII.
D0	Constraining laws: Conservation (of “electricity” or current).	D1	Constraint: Current is conserved in junctions/branches (Kirchhoff I).
D2	Constraint: Voltages around a closed loop sum to zero (Kirchhoff II).	D3	Ohm’s law: U = RI or U/I = R.

Table 2. Attributes a1-a9 inferred from the empirical study, with key word(s) used to characterise and identify each attribute.

Attribute	Key word	Attribute	Key word
a1	Stored	a2	Contained
a3	Consumed	a4	Conserved
a5	Degraded or diminished	a6	Divided and diminished
a7	Maintained	a8	Partitioned and conserved
a9	Generated, supported

4.2. Representations as directed graphs

Most of the relevant elements found in the interviews can now be represented according to the systemic view and by using a Directed Graph Model (DGM) to relate different C-, D- and M-constructs to sets of attributes and evidence. The DGM is a representation, where connections between different elements are related through directed links which are either congruent or dissonant. The links and their direction provide information on how the elements interact. This has the advantage that DGM can serve as a computational template (Koponen, 2013). An example of how DGM relates to different constructs and the most important links connecting them appears in Figure 2. The links shown as solid lines are mutually supporting, congruent links; dissonant links, shown as dotted lines, point out contradictions. Congruent links were recognised on the basis of how students combined these elements in different situations. The recognition of dissonant links was more problematic. Most of the dissonant links are the interviewers’ interpretation of unavoidable logical contradictions rather than notions expressed by the students themselves (Koponen & Huttunen, 2013).

Different students’ conceptions can now be visualised as graphs with different node strengths. Some typical students’ conceptions A-D found in the interviews and represented in this way appear in Figure 3. Cases A and B are the most common in contexts I and II, while C and D usually occur only in context III. D occurred in only two of the nine cases studied, while C (or constellations close to it) occurred in four cases (Koponen & Huttunen, 2013). An important aspect of the DGM representations is that they represent the students’ understanding as a constellation of C-, D- and M-constructs and associated attributes with various strengths. Of course, these strengths are idealisations of the systemic model, which only phenomenologically represents the apparent importance of a given construct as it can be identified in interviews, and only partial information about such strengths is available from the empirical data. Nevertheless, such fine-grained representations of students’ conceptions contain more information than do traditional representations based on written descriptions only.

Figure 2. (see pdf file) Directed Graph Model of all essential C-, D- and M-constructs based on the empirical results, as reported in Table 2. C- and D-constructs are linked to attributes {a1, a2, ..., ak}; M-constructs are linked to sets of evidence {e1, e2, ..., ek}. Links can be congruent (solid line) or dissonant (dashed line). Construct C1 is current, and C2 is voltage.

Figure 3. (see pdf file) Some examples A-D of typical graphs representing students’ conceptions as projected on the DGM shown in Figure 2 (these graphs are sub-graphs of the DGM). Nodes are classified into three classes: Strong, s > 0.7 (large circle); average, 0.3 < s < 0.7 (medium circle); and weak s < 0.3 (bullet). Construct C1 is current, and C2 is voltage.
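The three node classes used in Figure 3 can be written as a small helper. This is only an illustrative sketch: the function name and the example strengths are hypothetical, and because the caption does not say how the boundary values s = 0.3 and s = 0.7 are treated, the sketch assumes they count as average.

```python
def classify_node(strength: float) -> str:
    """Map a node strength s in [0, 1] to the size class used in Figure 3."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("node strength must lie in [0, 1]")
    if strength > 0.7:
        return "strong"   # drawn as a large circle
    if strength < 0.3:
        return "weak"     # drawn as a bullet
    # Assumption: the boundary values 0.3 and 0.7 are treated as average.
    return "average"      # drawn as a medium circle


# Hypothetical node strengths, not taken from the empirical data.
example = {"C1": 0.9, "C2": 0.2, "M1": 0.75, "M4": 0.3}
classes = {node: classify_node(s) for node, s in example.items()}
```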

4.3. Effects of communication

The information about communication acts between the students (as it is available from the interviews) can serve to construct idealised communication patterns between students and to temporally locate the effects of communication on the students’ choices of models and attributions. Analyses of data on the individual students’ conceptions have been published previously (Koponen & Huttunen, 2013), but data on communication is unpublished. A summary of the changes in models and how communication takes place is given in Table 3. Here, the re-analysed data, ordered in temporal sequences to reveal the communication acts, allows the identification of changes in C- and M-constructs. Unfortunately, the original data offer no detailed information on changes in sets of attributes.

The results in Table 3 show that the communication patterns in groups G1 and G2 are reciprocal in that all students exchange information in all directions. Nevertheless, some one-person dominated patterns are evident, where a single student (s3 in G1 and s6 in G2) is more active than others. Formally, such communication patterns between students P, P’ and P’’ can be modelled as a triad (see Figure 4).

With group G3, communication takes place reciprocally between all students and the communication pattern is dense. Unfortunately, the empirical data here do not permit a more detailed analysis of the communication patterns. In what follows, the effect of communication is modelled as a triad; in one case as relatively sparse and one-person dominated, and in other case as dense and reciprocal.

Table 3. Evolution of nine students’ conceptions 1-9 in groups G1-G3 of three students (s1-s3, s4-s6 and s7-s9) in contexts I, II and III, as idealised in terms of the DGM. Communication events are shown as directed dyads i → j from student i to student j, or as reciprocal dyads i ↔ j.

	Group G1, with students 1-3			Group G2, with students 4-6			Group G3, with students 7-9
Context	s1	s2	s3	s4	s5	s6	s7	s8	s9
I	A	A	(D)	A	A	(D)	A	C	A,C
	1←3	2←3	3→1,2	4←5,6	5←6	6↔4	7←8,9	8↔9	9↔7,8
	A	C	D	C,B	B	D	A,C	C,A	A,C
II	1↔3,2	2↔3,1	3↔1,2	4←6	5↔6	6→4,5	7↔8,9	8↔7,9	9↔7,8
	C	(D)	D	C	B	D
	1←3		3→1	4←6	5←4	6↔4
	C	D	D	C	C	D	C	C,A	A,C
III	1↔3,2	2↔1,3	3↔1,2	4→5,6	5←4	6←4	7↔8,9	8↔7,9	9↔7,8
	D	D	D	D	D	D	C	C	C

Figure 4. (see pdf file) Patterns of students’ communication. The thickness of the arrows denotes the amount of communication. The communication pattern on the left is dominated by student P, while for students P’ and P’’ communication is sparse. The communication pattern on the right is reciprocal and dense between all students P, P’ and P’’.

5. Computational embedding of systemic view in terms of DGM

The Directed Graph Model (DGM) can serve as a computational template, that is, as a computational embedding of the systemic view that produces features generically similar to those found in empirical situations. In what follows, we briefly describe the computational features of such an embedding. The updating rules (see Appendix) are similar in type to those previously introduced and motivated elsewhere (Koponen, 2013). Computational embedding transforms the qualitative notions contained in the systemic view into computational rules and quantifies the roles of model utility and theoretical guidance. Concept learning and the degree of concept differentiation are monitored through two quantities: Theoricity T and Separability S. Theoricity T describes the theoretical complexity of the concept, while Separability S is connected to differentiation and, thus, to ontological shift. The pair of values (S,T) then specifies the learning path.

In the DGM, C-, D- and M-constructs are nodes connected by directed links. Each node has a dynamically evolving strength, which determines its effect on the other nodes to which it is connected and, thus, the dynamics of the system. M-constructs are also connected to sets of evidence (see Figure 2). Node strengths are updated after comparing M-constructs with the evidence, which means obtaining new evidence or reconsidering existing evidence. The DGM also has memory effects in that the new strengths of the links and nodes depend recursively on the previous values. Furthermore, the simulations also take into account the effects of communication between learners: in the computational embedding, information contained in one graph affects the strengths of the nodes, and thus the dynamics, of another graph. To describe the state of the system and to characterise the evolution of the concepts, we must define several quantities in terms of node strengths and links. Below is a short overview of these quantities. A complete description of the update rules and their definitions in terms of link and node strengths is given separately in the Appendix. The details given in the Appendix are not essential for understanding at a general level how the model works, but the mathematical details are needed to fully appreciate how the memory effects arise through connectivity from the global state of the network.

The most important of the quantities are the Theoricity T and Separability S of C-constructs, which serve to specify the learning paths. Theoricity T is a measure of the theoretical complexity of a C-construct that roughly describes the number of paths from C-constructs to M-constructs while taking into account the strengths of the links and nodes (see the Appendix for details). Separability S describes the degree of dissimilarity between C-constructs with regard to the different attributes associated with them (see Table 2). If two C-constructs are connected to completely different sets of attributes, S is at its maximum value. Both quantities are defined in a range from 0 to 1 so that T = 1 means full theoretical complexity (corresponding to the scientific use of a given concept) and S = 1 means complete differentiation.
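To make the idea of Separability concrete, the following sketch computes a dissimilarity between two weighted attribute sets. It is not the definition given in the Appendix: the weighted Jaccard distance is swapped in here purely as an assumption, chosen because it equals 0 for identical attribute sets and 1 for completely different ones, as the text requires. The function name and the attribute assignments in the examples are hypothetical.

```python
def separability(attrs_c1: dict[str, float], attrs_c2: dict[str, float]) -> float:
    """Weighted Jaccard distance between two attribute sets (illustrative only).

    Keys are attribute labels such as "a3"; values are link strengths in [0, 1].
    """
    keys = set(attrs_c1) | set(attrs_c2)
    shared = sum(min(attrs_c1.get(k, 0.0), attrs_c2.get(k, 0.0)) for k in keys)
    total = sum(max(attrs_c1.get(k, 0.0), attrs_c2.get(k, 0.0)) for k in keys)
    return 0.0 if total == 0 else 1.0 - shared / total


# Hypothetical attribute assignments, not taken from the empirical data.
undifferentiated = separability({"a1": 1.0, "a3": 1.0}, {"a1": 1.0, "a3": 1.0})
partial = separability({"a3": 1.0, "a4": 1.0}, {"a3": 1.0, "a7": 1.0})
differentiated = separability({"a4": 1.0, "a8": 1.0}, {"a7": 1.0, "a9": 1.0})
```

In this toy measure, two concepts sharing one of three attributes in total have a separability of about 0.67, between the undifferentiated and fully differentiated extremes.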

The dynamics of the DGM depend crucially on the Utility U of M-constructs, which is the basis for model selection (the strengths of M-constructs depend on their Utility; see Appendix). First and foremost, the Utility is proportional to the ratio of explained evidence to the theoretical complexity T of the C-construct while taking into account the relevant strengths of nodes and links. If the model explains most of the available evidence and its Theoricity is low, the Utility will be high and the model will be favoured in explanations. With more evidence to explain, the less complex models will generally explain a smaller share of it, and their Utility is thereby reduced. The extent to which the model takes into account conflicting evidence can be controlled with the parameter K, which also controls the effect of D-constructs on Utility. If K is set to a high value, the state of the system is heavily guided by evidence and theoretical knowledge (i.e. D-constructs). With low values for K, conflicting evidence and theoretical information will be more or less ignored; since D-constructs describing causal conservation laws are then less important, simpler models are favoured. The importance of evidence (whether conflicting or not) can also be adjusted by weakening the links between M-constructs and evidence. One can also alter the order in which one encounters the evidence. In practice, parameter K is related to the potential of an individual student to make use of theoretical knowledge to construct explanatory models.
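The qualitative behaviour of the Utility can be illustrated with a toy function. This is a sketch under stated assumptions, not the update rule of the Appendix: the functional form, the name toy_utility, the floor t_floor that avoids division by zero, and all numerical inputs are hypothetical. It only reproduces three statements made above: Utility grows with explained evidence, shrinks with Theoricity, and conflicting evidence is penalised in proportion to K.

```python
def toy_utility(explained: float, conflicting: float, theoricity: float,
                k: float, t_floor: float = 0.05) -> float:
    """Toy utility: net explained share of evidence divided by theoricity.

    explained, conflicting: total weights of evidence the model explains
    or contradicts; k in [0, 1] scales the penalty for conflicting evidence.
    """
    total = explained + conflicting
    if total == 0:
        return 0.0
    net = max(0.0, explained - k * conflicting)
    return (net / total) / max(theoricity, t_floor)


# Hypothetical models: a simple one (T = 0.2) and a complex one (T = 1.0).
# With three pieces of evidence that both explain, the simple model wins.
simple_small = toy_utility(3, 0, 0.2, k=1.0)
complex_small = toy_utility(3, 0, 1.0, k=1.0)

# With nine pieces of evidence, the simple model conflicts with six of them.
simple_guided = toy_utility(3, 6, 0.2, k=1.0)     # strong guidance
complex_guided = toy_utility(9, 0, 1.0, k=1.0)
simple_unguided = toy_utility(3, 6, 0.2, k=0.0)   # conflicts ignored
complex_unguided = toy_utility(9, 0, 1.0, k=0.0)
```

Under strong guidance the complex model overtakes the simple one once the evidence set grows, while with K = 0 the simple model remains preferred, which mirrors the role of K described in the text.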

Learning also depends on the state of an individual student’s initial knowledge. The state of initial knowledge is taken into account through the initial strengths of the different models M1-M4, usually so that simple models (such as M1 and M2) have high a priori strengths (they are then the preferred models), while complex models have low initial strengths. Also, the initial strength of D-constructs affects how strongly theoretical knowledge will guide the learning process, an effect taken into account through parameter D’. In addition, students differ in their attentiveness to evidence, so that some parts of the evidence receive more weight than others. This is taken into account by also giving weights to the evidence. Finally, the sequence of evidence and the order in which one encounters the evidence (i.e. the training sequence) affect the dynamics of the learning paths. The values of these parameters and their initial values serve to model the individual learners’ initial knowledge and their potential to make use of theoretical knowledge.

In addition to the individual characteristics described above, communication between individuals affects the dynamics of learning paths. In the DGM, communication can also affect the strengths of the M-constructs. Assuming pairwise communication between students, the stronger M-constructs affect the weaker ones so that the lower value is increased by the communication impact factor C. A value C = 1 means complete adoption of the highest utility models in communication, and C = 0 means ignoring completely the information provided through communication. The Appendix explains the details of the communication model as part of the DGM.
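The role of the impact factor C can be sketched as follows. The text fixes only the limiting cases (C = 0 leaves strengths unchanged, C = 1 means complete adoption of the stronger value); the linear interpolation between them, the function name and the example strengths are assumptions made for illustration, and the actual rule is the one given in the Appendix.

```python
def communicate(strengths_a: dict[str, float], strengths_b: dict[str, float],
                impact: float) -> tuple[dict[str, float], dict[str, float]]:
    """Pairwise exchange: for each M-construct the weaker strength moves
    toward the stronger one by the fraction `impact` (assumed linear)."""
    new_a, new_b = {}, {}
    for model in strengths_a:
        a, b = strengths_a[model], strengths_b[model]
        high = max(a, b)
        new_a[model] = a + impact * (high - a)  # unchanged if a is already the higher
        new_b[model] = b + impact * (high - b)
    return new_a, new_b


# Hypothetical strengths: P favours the scientific model M4, P' favours M1.
p = {"M1": 0.2, "M4": 0.9}
p_prime = {"M1": 0.8, "M4": 0.1}
p_after, p_prime_after = communicate(p, p_prime, impact=0.75)
```

With C = 0.75 in this example, the strength of M4 for P’ rises from 0.1 to about 0.7, while the strength of M4 for P is unaffected.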

The computational model is an idealisation and takes into account only the roughest features of concepts, models and communication. However, the model is constructed to include the most essential generic features and, as such, is capable of providing important insight into how different parts of a conceptual system and its internal connectivity affect concept differentiation.

The Directed Graph Model (DGM) and simulations based on it must make understandable the following generic features of concept learning and differentiation:

1) Context-dependent dynamics of concept learning and differentiation (learning paths). The students’ conceptual states (as shown in Figure 3) are context dependent in that they appear mostly in given contexts I-III, with a given set of evidence (or observations) to be explained, and the state changes with changes in the set of evidence.

2) The dynamics and persistence of ontological change in attributions. Changes in ontological attributions are indicative of concept differentiation. When differentiation takes place, it leads to a robust and persistent learning outcome.

3) The effect of communication on concept differentiation. In two of three cases, a given group has a student with a more sophisticated conception and more differentiated concepts than the two other students, who eventually partially adopt that sophisticated conception.

Of course, the learning process entails many other details, but these generic features 1-3 are the most important and interesting ones that any model of concept learning and differentiation should explain. In what follows, we concentrate on simulating just such a process of concept learning and differentiation by using the DGM and monitoring the learning process through the Theoricity T and Separability S of concepts.

6.1. Model parameters and initial conditions

The DGM allows parameterisation of many different initial stages. The initial stages are described through the initial strength of M-constructs and D-constructs, and through the strength of the evidence. For initial model strengths, we studied here cases where M1 and M2 are strong models (strength 1.0 - 0.75), and M3 is of moderate strength (strength 0.5 - 0.25), and other M constructs are weak (strength 0.25 - 0.05). These cases are interesting, because they provide information on how initial, rather unsophisticated models such as M1 and M2 evolve during the learning process towards sophisticated models such as M4, and how concept differentiation relates to this change. This is also the learning path of most practical interest, a path from intuitive to scientific concepts.

For initial D-construct strengths, we studied cases where D1, D2 and D3 are of equal strengths D’, varying from 1.0 to 0.4. In addition to initial values for D’, theoretical knowledge operates through congruent and dissonant connections, which can be tuned by parameter K (see the Appendix and Table 4) so that value K = 1 denotes the strongest guidance and, K = 0, no guidance at all. In addition to these parameterisations, the dynamics of the DGM and the learning paths depend on what we call here the training sequence, meaning evidence and the order in which one encounters it.

The training sequences are constructed to correspond to empirical contexts I-III (see section 4.1) so that each sequence I→II→III consists of evidence {e0, e1, e2, e0’, e1’, e2’, e0’’, e1’’, e2’’}, where each element e0, e1, … is associated with a strength e specifying how much attention one pays to the evidence. If e = 1, the evidence is taken fully into consideration, but for 0.0 < e < 1.0, only partially. In the simulations, each event is repeated N times (N = 3 or N = 4), and the sequence is then reversed in order to verify the permanence of learning (i.e. no reduction of values T and S for the reversed sequence, and no “hysteresis” effect). Thus, the computation consists of training sequences of the form {N×(e’,e’,e’); N×(e’’,e’’,e’’); N×(e’’’,e’’’,e’’’); N×(e’’’,e’’’,e’’’); N×(e’’,e’’,e’’); N×(e’,e’,e’)}, consisting of 54 events for N = 3 and of 72 events for N = 4. Training sequences of that form are completely specified by N and the set of values O = (e’,e’’,e’’’). In some cases, to test the hysteresis, an additional 38 events are added in random sequence, denoted by R. In summary, the parameters that specify the initial conditions are N and O, and the parameters affecting the dynamics are K and D’.
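The structure of such a training sequence, and the event counts quoted above, can be checked with a short sketch. The function name and the data layout are hypothetical, and one detail is an assumption: the text specifies that the order of the contexts is reversed in the second half, but not the order of events within a context, which the sketch leaves unchanged.

```python
def training_sequence(n: int, attention: tuple[float, float, float]) -> list[tuple[str, float]]:
    """Build the sequence I -> II -> III -> III -> II -> I of (evidence, strength) events.

    n: number of repetitions of each context; attention: O = (e', e'', e''').
    """
    contexts = [
        ["e0", "e1", "e2"],        # context I
        ["e0'", "e1'", "e2'"],     # context II
        ["e0''", "e1''", "e2''"],  # context III
    ]
    blocks = []
    for evidence, strength in zip(contexts, attention):
        # Each context contributes n repetitions of its three pieces of evidence.
        blocks.append([(ev, strength) for _ in range(n) for ev in evidence])
    # Assumption: only the order of contexts is reversed, not the events inside them.
    ordered = blocks + blocks[::-1]
    return [event for block in ordered for event in block]


seq_n3 = training_sequence(3, (1.0, 1.0, 1.0))   # parameterisation O1
seq_n4 = training_sequence(4, (1.0, 0.5, 0.1))   # parameterisation O3
```

Each sequence contains 18N events (three pieces of evidence, N repetitions, three contexts, forward and reversed), which gives the 54 and 72 events stated in the text.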

6.2. Simulations of personal learning paths

The personal (individual, without communication) learning paths are studied first in a case where initial conditions favour models M1 and M2 with initial strengths of 0.75, but where model M2’ also has a substantial strength of 0.5. The learning paths begin from unsophisticated models, which closely correspond to patterns such as A and B (see Figure 3), and then progress towards more sophisticated patterns of type D. The learning paths of concept differentiation are monitored through the evolution of Theoricity T and Separability S. For comparison, estimates of the values of T and S corresponding to the empirical cases idealised as graphs A-D (see Figure 3) are: T = 0.35 - 0.45, S = 0.10 - 0.20 for A; T = 0.35 - 0.45, S = 0.55 - 0.65 for B; T = 0.55 - 0.70, S = 0.60 - 0.70 for C; and T = 0.90 - 1.00, S = 0.95 - 1.00 for D.

The learning paths are shown in Figure 5 for training sequence parameterisations O1=(1.0,1.0,1.0), O2=(1.0,0.8,0.8) and O3=(1.0,0.5,0.1), with N = 3. The positions where the sequences corresponding to contexts I, II and III end are marked. Theoretical guidance is studied for strong guidance K = 1.0 and 0.8, and for weaker guidance K = 0.50, while parameter D’ (for D-constructs) ranges from 1.0 to 0.4. The evolution of M-construct strengths, which reflects the competition between models that must explain more evidence, is shown in Figure 6. The situation shown in Figures 5 and 6 is asymmetric with respect to C1 (current) and C2 (voltage), with C1 always having a higher Theoricity T than C2. This asymmetry stems from asymmetry in the initial strengths of M1 and M2, as shown in Figure 6. This corresponds to the most frequent empirical situation in which students initially favour current-based models over voltage-based ones. One can also interpret the results in the reverse way, with C2 having a higher Theoricity and the roles of M1 and M2 reversed. However, this situation, where a voltage-based model is initially preferred over a current-based model, seldom occurs in empirical cases.

For strong theoretical guidance (K = 1.0 or 0.8), together with close attention to observations (training sequences O1 and O2), learning and concept differentiation are successful. In such cases (Figure 5, in the upper row, two cases on the left) learning is complete for concept C1 (current), which is fully scientific (T = 1) and completely differentiated (S = 1) from concept C2 (voltage). Concept C2 is nearly scientific (T = 0.6), and with some extra training (shown in grey in Figure 5), it rapidly becomes a fully scientific (T = 1) concept. The learning paths are step-wise, with clearly distinguishable stable stages in Theoricity T with increasing Separability S. A sequence corresponding to context I is already enough for relatively advanced differentiation (i.e. ontological shift), although Theoricity T may remain low. There appears to be a threshold of S = 0.7-0.8, which one can reach even with moderate development in Theoricity. This threshold shows that one can achieve nearly complete separability and good differentiation (i.e. nearly complete ontological shift) even though the learning is otherwise still incomplete.

When theoretical guidance decreases, K = 0.5 and D’ = 0.60 (Figure 5, lower row, middle) or K = 0.8 and D’ = 0.4 (Figure 5, upper row, right), the Theoricity of concepts C1 and C2 remains low for the training sequence (black dots), but again, with extra training (grey dots), improvement is possible. This trend shows that, eventually, even moderate theoretical guidance is effective, but then more training is needed. However, the order in which one encounters the evidence in training is not crucial, provided that context III is involved. When theoretical guidance is low (K = 0.5, D’ = 0.4) and little attention is paid to the evidence in context III, very little learning takes place, irrespective of the amount of training. This situation is shown in Figure 5 in the lower right corner.

The evolution of M-constructs in Figure 6 corresponds to the learning paths in Figure 5. The initially dominant M-constructs M1 and M2 remain dominant until the end of the sequence corresponding to context I. During the sequence corresponding to context II, models M1’ and M2’ also become active and grow stronger. However, when sequence III begins, with strong theoretical guidance, the initial models cannot compete with M4 (the fully scientific model), which eventually dominates when the sequence corresponding to context III ends. This does not occur with low theoretical guidance (Figure 6, right column). However, with extra training, M4 is eventually enforced in cases of moderate theoretical guidance also (Figure 6, lowest row), but not for the least guidance (Figure 6, lower right corner). It is noteworthy that M3’ is never activated, which is in line with the empirical finding that such a voltage-based model is only seldom encountered.

Figure 5. The Theoricity T and Separability S of concepts C1 (bullets) and C2 (boxes) in the case of six different learning paths with given parameters K and D’ that control the strength of theoretical guidance. The upper row shows cases where K ≥ 0.8 is always relatively high but D’ varies from 1.0 to 0.4. In the lower row, K also varies from a high value of 0.8 to a lower value of 0.5. The initial values of the model strengths and strengths of the observations are different in the cases shown in the left, middle and right columns (corresponding model strengths are shown in Figure 6). Left column: Initial values of model strengths favour models M1 and M2’ with strengths of 0.5, while other models have a weaker but equal strength of 0.25. The observations of events I-III are strong (link strengths have a value of 1). Middle column: Model strengths as in the left column, but M3, M3’ and M4 are reduced to 0.15; observations I-II are strong (1), but III is only moderately strong (0.75). Right column: Otherwise similar to the middle column, but the observations in case III are weak (0.10). The training sequence from I to III (end points of each sequence are marked in the figure), with three repetitions for each event, appears as black dots. The training sequence testing the permanence of learning from I to III, then back from III to I, and one random sequence appears as grey dots. The values corresponding to the empirical results of configurations A-D (see Figure 3) are indicated (two symbols for each are located at the pairs of the lowest estimated and highest estimated values for T and S).

Figure 6. (see pdf file) Model evolution of the learning paths shown in Figure 5 with parameterisations for K and D and different training sequences O1, O2 and O3 as indicated. The colour represents the strength of a given model. The number of steps in the simulation appears on the vertical axis, thus indicating the ordering of the sequence.

Learning paths with slightly different initial conditions from the cases in Figure 5 are shown in Figure 7. In the cases shown in the upper row, the M-constructs M3 and M4 are slightly weaker than in the case shown in Figure 5. In the cases shown in the lower row, M-constructs M1 and M2 are of nearly equal strengths, which makes the initial stage more symmetric with respect to C1 and C2. In both cases, the attention to events corresponding to contexts II and III becomes weaker from left to right, represented as parameterisations O3 (as in Figure 5) and O4 = (1.0,0.8,0.5). In addition, the training sequence is now such that each event is reproduced three (N = 3, black dots) or four times (N = 4, grey dots). The corresponding evolution of M-constructs is shown in Figure 8.

Compared to the cases shown in Figure 5, one can observe some interesting differences. In the case shown in Figure 7, in the upper left corner, learning consists mostly of ontological shift through the end of the sequence corresponding to context II. After that, when the sequence corresponding to context III begins, Theoricity T rapidly increases because M4 rapidly gains strength (see Figure 8) due to strong theoretical guidance. Eventually, when the training sequence ends, learning is again complete. The training sequence with N = 4 leads to higher Theoricity T of concepts in I and II, but interestingly, to a slower increase in Theoricity in III than in cases with N = 3 because with N = 4, M3’ grows strong during I and II, which slows the adoption of M4. This “overlearning” effect is most pronounced in the case shown in Figure 7, in the upper right corner, where theoretical guidance is low and attention paid to events is also low. In this case, more frequent repetition of events with N = 4 leads to deterioration of the learning results, and learning stagnates at a low Theoricity T of C1 and C2. Ontological shift, however, advances and eventually, Separability S = 0.8 is reached. Such a situation corresponds to what occurs in real learning: too much focus on overly simple tasks, which reinforces unsophisticated models, may lead to the persistent and robust use of under-developed models and conceptions.

Figure 7. (see pdf file) Three deterministic cases of learning paths with a given K. The figures show Theoricity T and Separability S of C-constructs C1 (bullets) and C2 (boxes) from Figure 2 and indicate the values corresponding to empirical results A-D (see Figure 3). Construct C1 is current, and C2 is voltage. The initial conditions favour models M1 and M2’ (voltage based).

Figure 8. (see pdf file) Model evolution of the learning paths in Figure 7 with parameterisations for K and D and different training sequences O3 and O4 as indicated. The darkness represents the strength of a given model. The numbers of steps in the simulation appear on the vertical axis, thus indicating the ordering of the sequence.

The lower row in Figure 7 shows some interesting situations, where repetition temporarily leads to the deterioration of learning results when simple situations recur after the sequence I→II→III. We briefly refer to this as “hysteresis” in learning. Eventually, however (Figure 7, lower row, two cases on the left), complete learning with T = 1 and S = 1 occurs and learning becomes permanent, no longer affected by further repetitions. In the case of weak theoretical guidance and weak learning from events (Figure 7, lower right corner), incomplete learning occurs, with stable learning resulting at T = 0.4 and S = 0.6. This is again due to “overlearning” of incomplete models, which prevents the adoption of the more sophisticated model M4. Similar results can also be observed for other cases with moderate K and moderate attention to observations; repeating simple situations I and II many times before encountering the more complex situation III may reinforce the incomplete models M1-M3 or M1’-M3’ so much that further development can no longer take place. This shows that repetitions of training sequences can have detrimental consequences on learning if initial theoretical guidance is too low.

The examples discussed above are asymmetrical situations, where concept C1 (current) is the favoured concept, while C2 (voltage) is initially less developed, and remains largely as is during further evolution. This is the most common situation in learning, although the roles can sometimes be reversed. However, because the DGM is symmetrical with respect to C1 and C2 (see Figure 2), a reversed situation where C2 has stronger Theoricity than C1 is quite similar if C1 and C2 simply switch roles. Symmetrical situations can also occur and closely follow the results in Figures 5 and 7, with the learning paths for C1 and C2 then simply overlapping.

6.3. Simulations of communication effects

The effects of communication on learning paths are simulated by using the sparse and dense communication patterns between members P, P’ and P’’ in a group of three (see Figure 4), and two impact factors C = 0.2 and C = 0.75 for communication. The effect of communication is tested on cases where theoretical guidance is strong or moderate (Figures 5 and 7, upper row, in the right column). The resulting learning paths are shown in Figure 9, and the evolution of M-constructs in Figure 10. These figures show that even dense communication with a high impact C = 0.75 only moderately affects the learning paths. The most obvious effect is that if one member P in the group has a learning path which is strongly theoretically guided, thus reaching high values for T and S, the other members tend to learn from that member and improve their learning and, consequently, reach higher values for T and S than without communication. Eventually, members P’ and P’’ who are less successful (Figures 5 and 7, in the lower right corner) than member P also achieve complete learning owing to the communication. This happens equally well for sparse and low-impact communication as for dense and high-impact communication. Of course, this occurs only in cases with one successful learner in the group. These features appear to be in concordance with the empirical findings, although the empirical findings presently allow no more detailed comparisons.

In the learning model, which is simply biased toward adopting the strongest model, the good learning result may temporarily worsen (see Figure 9, upper right corner). However, this is a transient effect, and the learning path eventually evolves toward complete learning.

Figure 9. (see pdf file) Learning paths with the effect of communication taken into account. Different figures represent paths with different parameterisations for K and D and different training sequences O1, O2, O3 and O4.

Figure 10. (see pdf file) Model evolution of the learning paths in Figure 9 with different parameterisations for K and D and different training sequences O1, O2 and O3. The darkness represents the strength of a given model. The number of steps in the simulation appears on the vertical axis, thus indicating the ordering of the sequence.

6.4. Summary of simulation results

The results based on the DGM agree with the following central empirical findings of concept learning and differentiation:

1. Context-dependent dynamics. This is apparent in the strong dependence of paths on the learning sequences. Complete learning takes place only in sufficiently rich contexts (e.g. case III), whereas in narrow contexts (e.g. cases I and II), learning is moderate or incomplete. This is a consequence of model competition and the greater utility of complex models in complex contexts.

2. The persistence of ontological shift and concept differentiation (S ≈ 1). In the DGM, this is a direct consequence of the guidance of D-constructs and their “memory effect”, retaining the memory of successful applications of D-constructs. The persistence of the ontological shift agrees with the empirical findings. However, the ontological shift in attributions is not a driving force of concept learning, but an outcome of a learning process driven by theoretical knowledge.

3. Communication affects individual learning paths and enables less advanced members of the group to adopt more advanced M-constructs from the most advanced member of the group. Thus communication improves learning, although the effect in the cases studied here is not particularly strong.

In summary, the DGM model reproduces the generic features of interest in concept learning and differentiation, and demonstrates that these features are associated with the guidance of theoretical knowledge, model utility and the memory effects of success in using models.

The model presented here is based on the systemic view, where concepts are viewed as complex, dynamically evolving structures. The model is constructed to capture generic aspects of concept learning and differentiation as exemplified in the case of learning two closely related scientific concepts, here electric current and voltage. The generic features of most interest and in need of explanation are: 1) the robustness of certain simple ways of using concepts to provide explanations in simple situations, a phenomenon usually attributed to the robustness of misconceived ontological classes, 2) the context-dependent dynamics of change and the requirement to encounter complex enough situations to bring about the change, and 3) the robustness of ontological shift once it has occurred. We suggest here that in order to understand these features and the dynamics of the change, we must develop a rich and complex enough model of concepts. On the one hand, the model of concepts takes the form of simple and nearly self-explanatory concepts, but on the other hand, the form of complex structures dependent on other concepts.

The systemic view is embodied by the use of the well-known case of the differentiation of electric current and voltage as concepts describing the behaviour of simple DC circuits. For this, we use re-analysed empirical data. The re-analysed (and partly re-interpreted) empirical results are then represented by using different conceptual elements, or constructs: C-constructs, which stand for concepts, D-constructs for causal schemes and law-like theoretical schemes, and M-constructs, which are model-like structures that use C- and D-constructs as integrated parts. As a formal representational model for these constructs and their mutual relationships, we introduce a Directed Graph Model (DGM). In the DGM, concepts are nodes in the graph connected by directed links to other conceptual elements.

The DGM serves as a computational template to simulate concept learning and differentiation and their dynamics. The stability of certain properties of concepts, traditionally considered robust “misconceptions”, and their dependence on contexts is now seen as related to the complex interplay of different conceptual elements. Change is driven by competition between M-constructs (models) and by how available evidence governs it. However, how this is reflected in the theory- and attribute-relatedness of concepts depends on how those M-constructs employ concepts (i.e. how the concept projects onto the actual evidence). Thus, D-constructs (theoretical knowledge) are central. All these aspects are recognised in current cognitively oriented views of concept learning, but are usually discussed separately or as unrelated views. The present study strongly suggests unifying these views and treating concepts as complex, multifaceted and dynamic structures.

Finally, the present work suggests that the theoretical background developed for research on concept development has important implications for the ways in which researchers and instructors view the learning process and how, on this basis, they design teaching solutions. The results point to the crucial role of theoretical knowledge in guiding concept learning and, furthermore, show that ontological shift, while an important part of learning, is not the primary driving force of learning, but is rather a consequence of more fundamental changes in the conceptual system. This suggests that theoretical structures and model utility should receive more attention in designing instructional solutions. On the other hand, it is clear that initial conceptions need not be actively “unlearned”; they can instead serve as a natural and useful starting point for the transformation.

For teaching and instruction, one important message lies in the role of the training sequence in determining learning paths. The details of the training sequence, and their repetitions, do matter in the initial stages of learning if the guidance of theoretical knowledge is low. Then, too much repetition of overly simple situations may lead to “overlearning” of unsophisticated concepts and models, and effectively prevent the acquisition of more advanced models. The results of the simulations, interpreted within the theoretical framework of the systemic model put forward here, suggest that designing specific training sequences which help one to “unlearn” unsophisticated models is unnecessary; rather, what is needed is a training sequence which gradually and at suitable stages of learning introduces more challenging learning situations, where the utility of more advanced and scientific concepts and models becomes apparent.

The theoretical positions discussed and suggested here directly impact learning and instruction by clarifying the degree to which ontological shift drives the learning process and the degree to which it should be considered a consequence of a more fundamental, theory-driven learning process. They also clarify to what extent differentiation and concept learning take place through the evolution of existing structures, and to what extent the learner must receive these structures instead of constructing them. Briefly, if the systemic view is correct, it suggests that ontological shift takes place, but is a consequence of theory-driven learning. The learner must receive complex theoretical structures through instruction and see their utility in complex enough situations to warrant adopting them. Such results are practical in that they guide teaching and the development of teaching solutions; they provide support to some of the well-known teacher-centred solutions (the role of the teacher in providing models to organise new knowledge and in familiarising the students with complex theoretical models), while showing the indispensability of a rich context and context variation in the construction of explanatory models and the role of predictions and observations in learning.

These notions, even without detailed suggestions for training sequences and instructional solutions, demonstrate that the choices to employ a certain theoretical framework to understand concept learning and differentiation are not neutral. Rather, they have fundamental consequences for how learning and instruction are conceived, how their purposes and goals are viewed, and how our attention is guided towards crucial generic features of learning and its dynamics.

Concepts are considered complex structures, which are projected differently in different contexts.

Concept differentiation can be modelled when embedded within a systemic view on concepts.

Theoretical guidance and theoretical schemes are crucial for concept differentiation. Ontological shift is a consequence of a theory-guided learning process.

Attention must be paid to training sequences in learning. Too frequent use of overly simple situations in training will cause concept learning to stagnate in robust states corresponding to misconceptions.

Andersen, H., Barker, B., & Chen, X. (2006). The Cognitive Structure of Scientific Revolutions. Cambridge, MA: Cambridge University Press.

Andersen, H. and Nersessian, N. J. (2000). Nomic Concepts, Frames, and Conceptual Change. Philosophy of Science, 67, S224-S241.

Brown, D. E., & Hammer, D. (2008). Conceptual Change in Physics. In S. Vosniadou (Ed.), International Handbook of Research on Conceptual Change (pp. 127–154). New York: Routledge.

Carey, S. (2010). The Origin of Concepts. New York, NY: Oxford University Press.

Chi, M. T. H., & Slotta, J. D. (1993). The Ontological Coherence of Intuitive Physics. Cognition and Instruction, 10, 249-260.

Chi, M. T. H. (2005). Commonsense Conceptions of Emergent Processes: Why Some Misconceptions Are Robust. The Journal of the Learning Sciences, 14, 161-199. DOI: 10.1207/s15327809jls1402_1.

Chi, M. T. H. (2008). Three Types of Conceptual Change: Belief Revision, Mental Model Transformation, and Categorical Shift. In S. Vosniadou (Ed.), International Handbook of Research on Conceptual Change (pp. 35–60). New York, NY: Routledge.

Chi, M. T. H., & Brem, S. K. (2009). Contrasting Ohlsson's Resubsumption Theory With Chi's Categorical Shift Theory. Educational Psychologist, 44, 58-63. DOI: 10.1080/00461520802616283.

Cohen, R., Eylon, B., & Ganiel, U. (1983). Potential Difference and Current in Simple Electric Circuits: A Study of Students’ Concepts. American Journal of Physics, 51, 407-412.

Danks, D. (2010). Not different kinds, just special cases. Behavioral and Brain Sciences, 33, 208-209. DOI: 10.1017/S0140525X1000052X.

Engelhardt, P. V., & Beichner, R. J. (2004). Students’ Understanding of Direct Current Resistive Electrical Circuits. American Journal of Physics, 72, 98-115. DOI: 10.1119/1.1614813.

Gopnik, A., & Meltzoff, A. N. (1997). Words, Thoughts, and Theories. Cambridge, MA: MIT Press.

Gupta, A., Hammer, D., & Redish, E. F. (2010). The Case for Dynamic Models of Learners’ Ontologies in Physics. The Journal of the Learning Sciences, 19, 285-321. DOI: 10.1080/10508406.2011.537977.

Henderson, L., Goodman, N. D., Tenenbaum, J. B., & Woodward, J. F. (2010). The Structure and Dynamics of Scientific Theories: A Hierarchical Bayesian Perspective. Philosophy of Science, 77, 172–200.

Hoyningen-Huene, P. (1993). Reconstructing Scientific Revolutions: Thomas S. Kuhn’s Philosophy of Science. Chicago, IL: The University of Chicago Press.

Jeong, H., & Chi, M. T. H. (2007). Knowledge convergence and collaborative learning. Instructional Science, 35, 287–315. DOI: 10.1007/s11251-006-9008-z.

Keil, F. C. (1989). Concepts, Kinds and Conceptual Development. Cambridge, MA: MIT Press.

Koponen, I. T. (2013). Systemic View of Learning Scientific Concepts: A Description in Terms of Directed Graph Model. Complexity, 19, 27-37. DOI: 10.1002/cplx.21474.

Koponen I. T. and Huttunen L. (2013). Concept Development in Learning Physics: The Case of Electric Current and Voltage. Science & Education, 22, 2227-2254. DOI: 10.1007/s11191-012-9508-y.

Koumaras, P., Kariotoglou, P. & Psillos, D. (1997). Causal Structures and Counter-intuitive Experiments in Electricity. International Journal of Science Education, 19, 617–630.

Lee, Y., & Law, N. (2001). Explorations in Promoting Conceptual Change in Electrical Concepts via Ontological Category Shift. International Journal of Science Education, 23, 111-149.

Ohlsson, S. (2009). Resubsumption: A Possible Mechanism for Conceptual Change and Belief Revision. Educational Psychologist, 44, 20-40. DOI: 10.1080/00461520802616267.

Ohlsson, S. (2011). Deep Learning: How the Mind Overrides Experience. Cambridge, MA: Cambridge University Press.

Rehder, B. (2003). Categorization as Causal Reasoning. Cognitive Science, 27, 709–748.

Reiner, M., Slotta, J. D., Chi, M. T. H., & Resnick, L. B. (2000). Naive Physics Reasoning: A Commitment to Substance Based Reasoning. Cognition and Instruction, 18, 1-34.

Shipstone, D. M. (1984). A Study of Children’s Understanding of Electricity in Simple DC

Slotta, J. D., & Chi, M. T. H. (2006). Helping Students Understand Challenging Topics in Science Through Ontology Training. Cognition and Instruction, 24, 261-289.

Smith, C., Carey, S., & Wiser, M. (1985). On differentiation: A Case Study of the Development of the Concept of Size, Weight and Density. Cognition, 21, 177-237.

Smith, E. E., & Medin, D. L. (1981). Categories and Concepts. Cambridge MA: Harvard University Press.

Weinberger, A., Stegmann, K., & Fischer, F. (2007). Knowledge convergence in collaborative learning: Concepts and assessments. Learning and Instruction, 17, 416-426.

The dynamics of the DGM is determined by the update rules for the node strengths and for the weights of the links connecting the nodes. In the DGM, C-, D- and M-constructs are nodes, which are connected by directed links. Each node i has a dynamically evolving strength s_i, which determines its effect on the other nodes to which it is connected and, thus, the dynamics of the system. Node strengths s (the subscript is omitted if not essential) are updated after each “event” e in the set of all events E, which means obtaining new evidence or reconsidering the evidence (i.e. any kind of comparison with the evidence). Variable e is treated as a running index that keeps track of the encounters with evidence. The strength at the previous step, s(e-1), is then updated to a new value, s(e-1) → s(e), that corresponds to evidence e. A congruent link between nodes i and j is described by the value a_ij = 1, while a dissonant link has a_ij = -1. The following quantities are then defined entirely in terms of node and link weights.

1. Theoricity T is the theoretical complexity of the C-construct. The more D-constructs and M-constructs are connected to a C-construct, the greater its theoretical complexity (i.e. the greater its Theoricity). Within the DGM, Theoricity T is quantified as the number of directed paths from the C-construct to the M-constructs, weighted by the respective node strengths. Some of the paths also pass through D-constructs, which increases the Theoricity of the C-construct. Theoricity T_c of the C-construct at node c ∈ C (index c refers to C-constructs) is:

The first term represents one-step paths from the models (m ∈ M) to the C-construct. The second term represents two-step paths through the D-constructs (d ∈ D). Note here that s_c = 1. The Theoricity of a model, needed in what follows as part of the dynamic update rules for the DGM, is defined similarly, but now with s_c → s_m, and the inverted directed paths from the model to the C- and D-constructs are counted (see Table IV).
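The verbal description of the two terms can be illustrated with a short sketch. This is only one possible reading of the text: the function name, the argument layout and the use of 0/1 link indicators are assumptions made for illustration, and the expression is not the equation of the original paper.

```python
def theoricity_of_concept(a_mc, a_md, a_dc, s_m, s_d):
    """Strength-weighted count of paths between the models and one C-construct.

    a_mc[m]    : 1 if model m is linked directly to the concept, else 0
    a_md[m][d] : 1 if model m is linked to D-construct d, else 0
    a_dc[d]    : 1 if D-construct d is linked to the concept, else 0
    s_m, s_d   : node strengths of the M- and D-constructs
    The concept's own strength is taken as s_c = 1, as stated in the text.
    All names and the exact weighting are ASSUMPTIONS for illustration.
    """
    # One-step paths: model -> concept
    one_step = sum(a_mc[m] * s_m[m] for m in range(len(s_m)))
    # Two-step paths: model -> D-construct -> concept
    two_step = sum(
        a_md[m][d] * a_dc[d] * s_m[m] * s_d[d]
        for m in range(len(s_m))
        for d in range(len(s_d))
    )
    return one_step + two_step


# Two models, two D-constructs; only the first D-construct reaches the concept.
T_c = theoricity_of_concept(
    a_mc=[1, 1],
    a_md=[[1, 0], [0, 1]],
    a_dc=[1, 0],
    s_m=[0.5, 0.25],
    s_d=[0.8, 0.4],
)
```

In this reading, a concept gains Theoricity both from every model that uses it directly and from every model that reaches it through a D-construct, with stronger nodes contributing more.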

2. Separability S measures the degree of dissimilarity between the set of attributes associated with two C-constructs represented by nodes c and c’. It is defined in regard to attributions only, as in the prototype theories. Separability is operationalised as a suitably normalised number of unshared elements so that for fully differentiated concepts, S = 1, while for similar concepts, S = 0, defined as:

Here, the element A_ac represents attributions (i.e. the values of attributes a, to be defined later) linked to C-construct c, and N is a normalisation factor. In calculating Separability, N takes into account the total strength of the attributions so that S ≈ 1 represents strong and totally dissimilar attributions, while S << 1 can represent either totally similar or weak attributions.
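A minimal sketch of a measure with the stated properties is given below. It assumes attribution strengths in [0, 1] and takes the normalisation factor N to be the number of attributes; both choices, and the function name, are assumptions made for illustration rather than the definition used in the paper.

```python
def separability(attributions_c, attributions_c2):
    """Normalised amount of unshared attribution strength between two concepts.

    Both arguments list attribution strengths in [0, 1] for the same
    attributes, in the same order.
    """
    if len(attributions_c) != len(attributions_c2):
        raise ValueError("both concepts must be described over the same attributes")
    n = len(attributions_c)  # ASSUMED normalisation factor N: number of attributes
    if n == 0:
        return 0.0
    unshared = sum(abs(x - y) for x, y in zip(attributions_c, attributions_c2))
    return unshared / n


strong_dissimilar = separability([1, 1, 0, 0], [0, 0, 1, 1])  # fully differentiated
identical = separability([1, 0.5], [1, 0.5])                  # undifferentiated
weak_dissimilar = separability([0.1, 0.0], [0.0, 0.1])        # weak attributions
```

With this choice of N, weak attributions give a small S even when they are entirely unshared, which matches the remark that S << 1 can indicate either similar or weak attributions.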

3. Utility U of the models in providing explanations depends on the ratio of explained facts to the model’s Theoricity T. The utility of model m ∈ M is defined as

where e is an event in set E, and e’ is a node which groups events together (see Figure 2). Dissonant (negative) links are denoted by a’_ij. The first term in the sum represents direct congruent paths to the models (i.e. explanations), the second term direct dissonant paths, the third term two-step paths through e’, and the last term two-step paths through D-constructs. Parameter K ∈ [0,1] controls the effect of dissonant links and the D-constructs on utility. If the model explains most of the available evidence and its Theoricity T is low, its utility is high. However, with a growing set of evidence to explain, models with low Theoricity generally fail, thereby reducing their utility. Utility is the basis for model comparison and selection.

The updating rules for the node strengths determine the dynamics of the graph and, consequently, the evolution of quantities U, T, and S, which depend dynamically on the node strengths. The updating rules also contain a memory effect, and the new values s(e) with evidence e depend recursively on the past values s(e-1), or s in shorthand. The updating rules are defined as follows:

4. The update rule for strength s_m of M-constructs m is based on Bayesian-type selection criteria (Koponen, 2013; Henderson et al., 2010) and depends on the utilities of the models. Each M-construct has a certain expected plausibility or probability s_m, which is updated to s_m(e) when more evidence e in the form of observations becomes available. The new plausibility with evidence e is evaluated according to the Bayesian rule so that if U_m(e) is the utility of model m when e is known, the model strength is then updated to

where the sum in the denominator ensures its normalisation. Generally, the more complex C-constructs provide more alternatives for M-constructs, so the Bayesian rule favours “simple” M-constructs. However, this may change when observations accumulate and more complex M-constructs explain more. The initial conditions of the system are its prior strengths and utilities, which must be deduced from available empirical data (based e.g. on interviews).
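The normalised update described above can be sketched as follows, assuming that the utility U_m(e) plays the role of a likelihood and the previous strength s_m the role of a prior. The function name and the handling of the all-zero case are assumptions made for illustration.

```python
def bayes_update(strengths, utilities):
    """Update model strengths s_m with utilities U_m(e), Bayes-style.

    Each new strength is proportional to (previous strength) x (utility),
    and the sum over all models in the denominator normalises the result.
    """
    weighted = [s * u for s, u in zip(strengths, utilities)]
    total = sum(weighted)
    if total == 0:
        # ASSUMPTION: with no usable evidence, leave the strengths unchanged.
        return list(strengths)
    return [w / total for w in weighted]


# Two equally plausible models; the second explains the evidence better.
new_strengths = bayes_update(strengths=[0.5, 0.5], utilities=[0.2, 0.6])
```

Repeated application with the same utilities concentrates the strength on the model with the highest utility, which is one way to see how a simple model that is sufficient for simple events can become “overlearned”.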

5. Strength s_d with evidence e depends on the connections of the D-construct to other D-constructs and M-constructs. The update rule for it is

where the first and the last sums take into account the fact that dissonant connections reduce strength, while the second sum takes into account the fact that connections to successful M-constructs increase strength. The factor (1-s_d) takes into account the “memory effect”; the use of a given D-construct in successful M-constructs increases the value of s_d, which does not decrease again. This models the known effect that the successful application of theoretical knowledge increases confidence in that knowledge.
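The role of the factor (1-s_d) can be illustrated in isolation. The sketch below is not the full update rule: the single argument `gain` is a hypothetical stand-in for the net contribution of the sums, and it is clamped to [0, 1] as an assumption so that only the saturating, non-decreasing behaviour is shown.

```python
def reinforce(s_d, gain):
    """Saturating increase of a D-construct strength.

    `gain` is a HYPOTHETICAL stand-in for the net contribution of the sums
    in the update rule; it is clamped to [0, 1] here (an assumption).
    The factor (1 - s_d) shrinks the increment as s_d approaches 1.
    """
    gain = min(1.0, max(0.0, gain))
    return s_d + (1.0 - s_d) * gain


# Repeated successful use drives s_d towards 1 and never lowers it.
s = 0.2
history = [s]
for _ in range(5):
    s = reinforce(s, 0.4)
    history.append(s)
```

Under these assumptions the strength approaches 1 geometrically and stays there, which corresponds to the persistence attributed to the memory effect.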

6. Attribute strengths s_a are updated by taking into account one-step congruent (second sum) and dissonant (first sum) paths from the models,

where parameter K controls the weight of dissonant paths (compare to Utility). Attributions A_ac are then defined on the basis of attribute strengths,

where the first term represents two-step paths, and the last term, three-step paths from the attributes to the C-constructs through the D-constructs (see Figure 2). Attributions serve as a basis for calculating the Separability S of a given pair of concepts (see definition 2 above).

7. The effect of communication on the dynamics of the DGM operates through the strengths of the M-constructs. At each step, where pairwise communication between students P and P’ is assumed, the stronger M-constructs affect the weaker ones. This is done by setting new values for the s_m and s’_m of P and P’, respectively, so that the larger of the two values remains unchanged, but the lower value increases by the amount C·max{0, s_m - s’_m}, where C is the communication impact factor. The update takes place before applying the Bayesian rule in step 4. This means that the update rule can be interpreted as learning by adopting better explanatory models before comparing utilities.
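The pairwise communication step described in item 7 can be sketched as below. The function name and the list-based representation of model strengths are assumptions made for illustration; the increment C·max{0, s_m - s’_m} is taken from the text.

```python
def communicate(s_p, s_q, impact):
    """One pairwise communication step between students P and P'.

    s_p, s_q : strengths of the same M-constructs for the two students
    impact   : communication impact factor C
    For each model, the larger strength stays unchanged and the smaller
    one moves towards it by C * max(0, difference).
    """
    new_p, new_q = [], []
    for a, b in zip(s_p, s_q):
        new_p.append(a + impact * max(0.0, b - a))
        new_q.append(b + impact * max(0.0, a - b))
    return new_p, new_q


# P holds the first model more strongly, P' the second one.
p_after, q_after = communicate([0.8, 0.1], [0.2, 0.5], impact=0.5)
```

In a simulation, this step would be applied before the Bayesian update of step 4, so that a model adopted from a peer takes part in the subsequent utility comparison.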

The update rules for the strengths of the M- and D-constructs drive the dynamic evolution of the graph. Table IV summarises these strengths and other quantities defined in 1-6.

Definitions of quantities T, S and U, as well as update rules for node strengths s, are given for D- and M-constructs and their attributes. The attributions of the C-construct are given by A_ac, and the Separability of C and C’ by S_cc’. Subscripts c, m, d and a denote C-, M- and D-constructs and attributes, respectively. Congruent links between i and j are denoted by a_ij = 1, while for dissonant links, a’_ij = -1.

Name		Definition
Theoricity	T_k	if k=m then i=c ; if k=c then i=m
Strength of D	s_d
Utility	U_m
Strength of M	s_m
Strength of a	s_a
Attribution of C	A_ac
Separability	S_cc’	where