A systemic view of the learning and
differentiation of scientific concepts: The case of electric
current and voltage revisited
Ismo T. Koponen, Tommi
Kokkonen
Department of Physics,
University of Helsinki, Finland
Article received 12 February 2014 / revised 14
April 2014 / accepted 29 June 2014 / available online 3
July 2014
Abstract
In learning conceptual knowledge in physics, a
common problem is the incompleteness of a learning process,
where students’ personal, often undifferentiated concepts take
on a more scientific and differentiated form. With regard to
such concept learning and differentiation, this study proposes
a systemic view in which concepts are considered as complex,
dynamically evolving structures. The dynamics of concept
learning and differentiation are driven by the competition of
model utility in explaining the evidence. Based on the
systemic view, we introduce a computational model, which
represents the essential features of the conceptual system in
the form of a directed graph model (DGM), where concepts are nodes
connected to other conceptual elements (nodes) in the graph.
The results of a DGM are then compared to the empirical
findings to identify differentiation between concepts of
electric current and voltage based on a re-analysis of
previously published empirical findings on upper secondary
school students’ learning paths in the context of DC circuits.
The comparison shows that the model predicts and explains many
relevant, empirically observed features of the learning paths
of concept learning and differentiation, such as: 1)
context-dependent dynamics, 2) the persistence of ontological
shift and concept differentiation, and 3) the effects of
communication on individual learning paths. The systemic view
and the DGM model based on it make these generic features of
interest in concept learning and differentiation
understandable and show that these features are associated
with the guidance of theoretical knowledge. Finally, we
discuss briefly the implications of the results on teaching
and instruction.
Keywords: Concept learning; concept differentiation;
ontological shift; complex system; directed graph model
1. Introduction
Learning scientific
concepts is a demanding and lengthy process, in which the
learner’s initial and personal concepts and conceptions
gradually change towards more scientific concepts in that they
are part of an extensive and coherent knowledge system (theory),
which regulates and constrains their use. Previous research (Lee
& Law, 2001; Reiner, Slotta, Chi & Resnick, 2000; Smith,
Carey & Wiser, 1985) has raised the notion that learners
seldom use concepts in the same sense as they are used in
scientific knowledge. One particular but central question is
proper concept differentiation. When two closely related
concepts are linked to the same phenomenon, novice learners do
not always properly understand them as different concepts.
Rather, the concepts are confused and used in undifferentiated
ways (Lee & Law, 2001; Reiner et al., 2000; Smith et al.,
1985). The aim of the learning process then is to produce a
clearer and more scientific understanding not only of how such
concepts differ, but also of how they are related, a process
referred to here as concept differentiation.
Concept differentiation has
often been discussed from the viewpoint of “ontological shift”,
which holds that ontological attributions are at the centre
of concept learning, and changes in those attributions are the
main mechanisms behind differentiation (Chi & Slotta, 1993;
Chi, 2005, 2008). This position finds support in the notion that
ontological commitments in concept development are deeply rooted
in the psychological aspects of concepts (Murphy, 2004; Keil,
1989). However, the ontological shift view has been criticized
for overemphasising the role of static ontologies (Gupta, Hammer
& Redish, 2010) and failing to pay proper attention to the
role of theory in learning (Ohlsson, 2011).
In addition, when studying
concept differentiation, one should understand that concepts
must be shared and be communicable to other learners.
Communicating and sharing concepts is closely related to the
problem of knowledge convergence, which has been discussed
mostly in terms of the convergence of explanations and of seeking
consensus and a common way to understand concepts and terms
(Jeong & Chi, 2007; Weinberger, 2007). However, how
communication affects the learning of scientific concepts and
the differentiation process or, in general, which stages or
steps of the learning paths communication could possibly affect,
remains unclear.
Consequently, our
understanding of the learning path in concept differentiation
remains partially incomplete. One promising way to remedy this
lack of understanding views learners’ concepts as complex
structures and the learning process itself as a systemic process
consisting of different conceptual elements and where those
elements interact (Brown & Hammer, 2008; Koponen &
Huttunen, 2013). The present study proposes a new way of
synthesising different views by focusing explicit attention on
concept learning and concept differentiation, so that the
synthesis takes into account aspects of interest for personal
concepts, such as the role of ontological attributions, and
aspects relevant to scientific concepts, such as the
communicability of concepts and their constrained, law-like use. Such synthesis,
referred to here as the systemic view, sees concepts as complex
structures. Different stages of concept learning, with partially
differentiated concepts, are then seen as partial projections of
the structure in different real situations; the projections are
partial and incomplete mappings of more complete systems. On the
level of personal concepts, the systemic view stems from recent
views of the heterogeneity of concepts, which emphasise the
diversity of roles of concepts in different cognitive processes
(Machery, 2009). On the level of scientific concepts, the
systemic view borrows much from the “dynamic frames” view of
scientific concepts, where both ontological attributions and
theoretical, law-like (nomic) knowledge are considered central
to concept development (Andersen & Nersessian, 2000;
Andersen, Barker & Chen, 2006). In the systemic view, the
learning process also requires a driving force or mechanism;
this study suggests that the utility of models, through which
the concepts are used, and the competition of models based on
utility provide that mechanism (Ohlsson, 2009, 2011).
The systemic model is
applied here to discuss and simulate concept differentiation and
its generic features in one empirically well-studied case: the
concepts of electric current and voltage. The generic features
of interest are the robustness of certain simple forms of the
concepts (often called misconceptions or intuitive
conceptions), the strong context dependence of these
conceptions, the occurrence of ontological shift and its
persistence once achieved, and the role of theoretical knowledge
in concept differentiation and ontological shift. This study
focuses on developing the theoretical background of the systemic
model. To that end, we further develop the directed graph model
that we recently introduced (Koponen, 2013). We embody the
theoretical model by using a re-analysis of empirical data of nine
students’ learning processes, in groups of three (Koponen &
Huttunen, 2013). We introduce a simulation model, based on
directed graphs, to model the learning path and to reproduce the
most important generic features of the empirical findings of
concept differentiation.
Finally, we discuss some
interesting implications for teaching that the model raises. The
model presented here supports the view that ontological shift is
not the primary agent in learning scientific concepts; rather,
it stems from theoretical learning, driven by model utility.
This means that instead of focusing on ontological training and
on developing instructional methods based on it, attention
should focus on how theoretical knowledge is introduced and
applied in the learning process. Another important notion is the
role of context in learning and how students are gradually
introduced to more demanding tasks. The model results show that
overly complex tasks cannot promote learning if students lack
sufficiently advanced concepts; yet overly
simple tasks lead to stagnation, where a learner gets stuck on
simple models and unsophisticated concepts. What is needed is a
learning path that is progressive and which demands use of
complex models and concepts. According to the view presented
here, the learner needs to receive theoretical knowledge through
instruction and to see its utility in complex enough situations,
thus avoiding “overlearning” of simple cases. This emphasises
not only the teacher’s role, but also the importance of
variation in contexts in which the knowledge is applied. These
notions, based on the systemic view and on its computational
embedding, therefore have direct practical consequences for how
one should design learning paths and the role of the teacher in
them.
2. Concept learning and differentiation: Theoretical underpinnings
Pre-scientific concepts are
often idiosyncratic, context dependent and difficult to
communicate. Of course, scientific concepts, as used by advanced
learners and experts, not only share some aspects with
“personal” pre-scientific concepts, but also differ from them in
important ways. One of the most important differences is that
scientific concepts often refer to categories (entities or
objects) that − like models − are themselves purely conceptual
rather than categories within the reach of experience or
observation, and their use is law-like (nomological) and
constrained (Andersen & Nersessian, 2000; Andersen, Barker
& Chen, 2006; Hoyningen-Huene, 1993). Nevertheless, personal
and scientific concepts share features, in particular on the
level of how theory or theory-like knowledge structures the sets
of attributes that characterise the concepts.
Concept differentiation is
a process where the sets of attributes that characterise and
typify concepts become structured so that no other concept
shares the same set and values of its attributes. In the case of
scientific concepts, this requires that a given concept have
law-like (i.e. nomological) relationships to other concepts. It
is through these features that the concepts acquire the sharp
descriptive power they hold in scientific theories (Andersen
& Nersessian, 2000; Andersen et al., 2006; Hoyningen-Huene,
1993). At the core of learning
scientific concepts is a transformation process where
personal, individual concepts, which are meaningful to a learner
himself or herself but not easily communicable or meaningful to
other learners, are transformed
into concepts that are more communicable and for which
consensus exists on how they can be used in relation to other
concepts. This latter way of using concepts is already
scientific in that normative and law-like rules govern the use
of the concepts (the nomological use of concepts), and the
attributes that characterise them are sharply identified. These
brief notions suggest that a suitable theoretical underpinning
must link the individual learning and use of concepts to the
shared, public use of concepts; intra- and interpersonal levels
of concept use and learning must be coupled.
Discussions of the learning
process, where a student learns and acquires scientific concepts
and becomes a fluent user of such concepts, must focus on
differences in the ways in which concepts are understood when
seen from the viewpoint of an individual’s personal cognition
and learning, and when concepts are discussed as they are shared
and used in scientific communities. In the former case,
concepts are personal, often un-explicated, and seen as strongly
context dependent (Carey, 2010; Gopnik & Meltzoff, 1997),
whereas in the latter case, concepts are explicated elements of
scientific theories, and their proper use is constrained by the
knowledge system as a whole (Andersen, 2006; Andersen &
Nersessian, 2000; Hoyningen-Huene, 1993). To highlight this
difference between personal and shared scientific concepts, we
use the terms “intrapersonal concepts” and “interpersonal
concepts”. Of these, the interpersonal concepts can be shared on
the level of small and local groups, such as study groups in
learning, or on the level of extended and global groups, such as
scientific communities, in which case interpersonal concepts are
simply called scientific concepts. The learning process, where
an individual’s intrapersonal concepts acquire scientific
character and are transformed into interpersonal ones, involves
epistemic dimensions (the use of concepts in the context of
explanation) and communicative dimensions, where concepts are
used in communication and consensus finding about what is
explained and how. A description of the learning process from
intrapersonal to interpersonal concepts requires a
model of concepts which, at one end of the continuum, takes the
form of an intrapersonal concept and, at the other end, the form
of an interpersonal, scientific concept.
In psychology and cognitive
science, two important viewpoints of interest here regarding
intrapersonal concepts are concepts as prototypes and concepts
as theories (Machery, 2009; Murphy, 2004; Smith & Medin,
1981). In the concepts-as-prototypes view, the prototype
represents a certain class of entities or objects to which the
concept refers, and the prototype is understood as a body of
knowledge about the properties of the members in that class.
However, such properties are assumed to be only statistical or
probabilistic, and are not strictly necessary or sufficient by
themselves to determine membership (Machery, 2009; Murphy,
2004). The statistical or probabilistic knowledge contained in
the prototypes can be about either 1) the typicality of the
category or 2) its cue-like properties. In both cases, a set
of properties or attributes and their values indicate how likely
or significant a given property is in regard to the
identification of the concept (Smith & Medin, 1981). In
concept learning, where concepts develop and are transformed, as
in, for example, a concept combination process, new concepts emerging from the
combination process inherit
some − but not necessarily all − of the properties of the
ancestor prototypes (Murphy, 2004). A reverse process of concept
differentiation can be understood as a process where new
concepts inherit partial or split sets of the properties of the
original concepts.
From the viewpoint of
concepts as prototypes, an important part of concept learning is
to learn the concept’s ontological attributions, which determine
to which ontological categories the concept refers (Keil, 1989;
Murphy, 2004). The ontological shift theory of conceptual change
(Chi & Slotta, 1993; Reiner et al., 2000; Slotta & Chi,
2006) addresses the way in which learners associate substance-
and process-like attributes with the concepts they use.
According to the ontological shift theory, many students’
learning difficulties originate from a misconceived ontological
class (Chi & Slotta, 1993; Slotta & Chi, 2006).
The view of
concepts-as-theories is based on psychological research, which
views conceptual knowledge as theory-like (Carey, 2010; Gopnik
& Meltzoff, 1997). The concepts-as-theory view focuses on
the role of causal knowledge in the categorisation process and
in concept learning. Concepts, in this view, are first and
foremost carriers of causal knowledge about the properties of
the members of classes to which the concepts refer. Therefore,
causal knowledge is considered crucial for concept recognition
and differentiation. Quite often, the role of causal knowledge
is discriminative with regard to the attributes or properties
attached to a concept (Machery, 2009; Murphy, 2004; Rehder,
2003).
These two different views
of intrapersonal concepts can be thought of as two different
ways to use concepts (Machery, 2009), thus reflecting the
multifaceted aspects of concepts. Such multifacetedness can
be considered a sign of a real cognitive difference between the
various ways of using concepts (Machery, 2009) or as different
projections (or mappings) of a more integrated, complex and
generic system that projects differently in different real
situations (Danks, 2010). Here, we adopt and further develop the
latter viewpoint of concepts as systems projecting differently
in different situations. The aspects of greatest interest in
developing such a systemic view are: 1) attributes and sets of
attribute values (as in the prototype view) and 2) causal and
theoretical knowledge and its role in distinguishing the
attributes in concept combination (as in the
concepts-as-theories view).
2.1.2 Interpersonal concepts
In learning, concepts must
be shared with other learners, instructors and teachers;
concepts must be interpersonal. When concepts are shared, there
must be common agreement on the referents of the concepts, the ways
in which they refer and the ways to use concepts; there must
be certain norms of usage. In particular, when concepts are
scientific concepts, they are “public” in that members
(scientists) of institutional groups (scientific communities)
share these interpersonal concepts. Crucial to this is not
only to agree on the norms, but also to link the norms to
accepted verification methods, such as observations, experiments
and models (Andersen et al., 2006; Andersen & Nersessian,
2000; Hoyningen-Huene, 1993).
There are relatively few
attempts to discuss scientific concepts so that connection is
made to a psychological understanding of concepts. One notable
exception, however, is a view which sees scientific concepts as
dynamic frames embracing conceptual knowledge (Andersen et al.,
2006). The dynamic frame view assumes that advanced scientific
concepts are acquired by the same process of categorisation as
everyday concepts. The categorisation of interest here is how
different exemplar-type problems fall into the same classes
based on how different types of models serve in solving those
problems. Then, the characteristic (but not defining) features
of concepts emerge from the reference to classes of models, or
clusters of models. Scientific concepts, where the models form
the classes relevant for learning concepts, are also regulated
and constrained by certain rules for applying concepts in
construction of models; the norms guide how to use the concepts.
The dynamic frames incorporate the attributes of concepts, in
much the same way as in the prototype theory, and the
theoretical knowledge as in the theory-theory approach, but now
theoretical knowledge has a role of organising the attributes
and imposing constraints on their co-variation (Andersen &
Nersessian, 2000; Andersen et al., 2006).
The focal point of this
study is the individual learner’s process of learning scientific
concepts, where personal concepts are transformed into
scientific concepts. However, because learning takes place in a
community of students and teachers, we must also understand how
communication affects the learning process. Collaborative
learning and sharing ideas in small groups has been shown to
enhance student learning. Such learning is beneficial when the
members’ knowledge supplements others’, but differs only
slightly from it; members show some knowledge equivalence and
knowledge sharing within the group (Jeong & Chi, 2007;
Weinberger, 2007). Here, however, the re-analysed
empirical data contain little information about the
communication. Although its effect is evident, only very simple
models serve here to estimate the effects of communication on
concept differentiation.
The systemic view sees
concepts as part of a knowledge system, where the “concept” as a
part of the operation of the system may have a plurality of
appearances and project differently in different contexts, yet
the parts of the system remain unchanged. Recently, some have
suggested a different but related type of systemic view. It
employs the ideas of dynamic complex systems, where robust and
persistent conceptual patterns can arise in emergent fashion
from interactions of elemental pieces of the dynamic system
(Brown & Hammer, 2008).
These interactions can thus give rise to a full spectrum
of different projections of concepts, some of which are simple
and some, complex. The systemic view is also adopted here, so
concepts are considered functional parts of the system, affected
by the system and its evolution.
The systemic view requires
specific representations of concepts which can capture their
complex, multifaceted and dynamic nature. A suitable model of
such concepts should address at least the following features:
1) Attributes and sets of
attribute values as in prototype and dynamic frame views.
2) Theoretical knowledge in
the role of constraining and guiding the use of concepts.
3) Models as they connect
to the development of scientific concepts.
4) Model competition and
utility as mechanisms affecting the evolution of concepts.
Requirements 1-2 are
essential to retaining a connection to a psychological view of
concepts and concept learning, which addresses intrapersonal
concepts and the transition from intra- to interpersonal
concepts. Requirements
2-4 are essential to describing how intrapersonal concepts
develop or change into scientific concepts. In what follows, we
introduce just such a systemic model in section 3 and then, in
section 4, discuss how empirical results concerning the
differentiation of the concepts electric current and voltage can
be embedded within it. Finally, in section 5, we present a
computational embedding of the systemic view and use the
computational model to simulate the process of concept
differentiation.
3. Systemic view on concept differentiation
The systemic view of
concept learning and differentiation sees concepts as constructs
which, on the one hand, take the form of intrapersonal concepts,
and on the other hand, the form of interpersonal concepts. Such
constructs are embedded in a conceptual system which evolves and
affects the constructs as part of the system’s own evolution.
The evolution of the concept system consists of changes in the
connectedness of the concepts and in the strength of those
connections. In that change, models play a central role, because
through models, the concepts become projected onto actual, real
situations.
Concepts as complex
structures are called here C-constructs. C-constructs are first
and foremost connected to sets of attributes, where connecting
links carry information about attributes and the strength of
associations with those attributes. Other elements of the
system carry knowledge of regularities and relate concepts to
each other in different ways: causally, through constrained
determination (constrained co-variation in a law-like manner) or by
constraining the use of concepts (e.g. conservation laws). These
schemes are called determination constructs or, in shorthand,
D-constructs.
C- and D-constructs are the
most elemental conceptual constructs of the systemic model, and
as such, they offer no explanations or predictions by
themselves. The task of explaining or predicting falls on
models, which utilise the C- and D-constructs as their
constituents. Models project concepts (C-constructs) onto
phenomena to be explained, and through the success or failure of
this projection, C-constructs are altered. The models, called
here M-constructs, are also conceptual constructs, but unlike
C- and D-constructs, are context dependent.
The relationship of
C-constructs to characteristic attributes is familiar from the
prototype theory of concepts and is not only essential in
describing personal concepts, but also important in describing
scientific concepts. The basic level of attributes consists of
simple and unstructured sets of attributes {a1, a2, ..., ak},
but more structured combinations can fall under more general
schemes. These more general sets are subordinated under a more
general (e.g. constraining) scheme, which here is typically a
D-construct. In the course of concept development, these
attributes are inherited, although some of the inherited
attributes can be discarded as the concept evolves. These features
resemble the dynamic frame approach to concepts (Andersen
& Nersessian, 2000; Andersen et al., 2006). The attributes
are not strictly mutually exclusive, especially when
C-constructs are not used together. However, the more important
it is to use two C-constructs together, the more difficult it is
to maintain dissonant attributes; C-constructs are
differentiated with regard to their attributes, as they should
be if they represent scientific concepts.
D-constructs are general
schemes which relate C-constructs to each other, typically in
the form of causal connections or in the form of constrained
determination (i.e. constrained co-variance with no causal
dependence). In some cases, a D-construct can simply constrain how
a single C-construct can be applied. Therefore, these constructs
are essentially the carriers of theoretical knowledge (cf.
Machery, 2009; Rehder, 2003). The D-construct is the general
template of the form of determination and is largely independent
of the context, yet it prescribes how one can legitimately apply
C-constructs to a given context through models. D-constructs
play a crucial role in discriminating between attributes,
because D-constructs connect C-constructs. Through D-constructs
the dissonances between attribute associations are revealed.
M-constructs are designed
so that they serve as models which explain phenomena or their
selected properties. They use C-constructs, because concepts are
needed to build models (Nersessian, 2008). In some cases,
M-constructs are also related to D-constructs, which then
specify the relationship between C-constructs when more than one
C-construct is involved. M-constructs are the basic vehicles for
explaining or matching predictions with observable features of
phenomena or, if one so wishes, for selecting certain features of
phenomena which fall under the explanatory power of a given
M-construct. On the
most advanced level, M-constructs are full-fledged scientific
models. On the most basic level, M-constructs are simple and
even self-explanatory. However, in either case, M-constructs
are evaluated only against observational evidence {e1, e2, ...,
ek}, which may either lend them support or lead to their
rejection/inhibition. M-constructs compete against other
available M-constructs in providing the most likely explanation
of the evidence.
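The three kinds of constructs described above can be pictured as nodes of a small directed graph. The following Python sketch is only illustrative: the class `ConceptSystem`, the node names `D1` and `M1`, and the link strengths are hypothetical, and the direction of the links is an assumption made here rather than the definition used in the DGM.

```python
from dataclasses import dataclass, field

# Node kinds used in this sketch (labels chosen here for illustration).
KINDS = {"C", "D", "M", "attribute", "evidence"}


@dataclass
class ConceptSystem:
    """Toy directed graph of constructs, attributes and evidence."""
    kinds: dict = field(default_factory=dict)   # node name -> node kind
    links: dict = field(default_factory=dict)   # (source, target) -> strength

    def add_node(self, name, kind):
        if kind not in KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.kinds[name] = kind

    def add_link(self, source, target, strength=1.0):
        if source not in self.kinds or target not in self.kinds:
            raise KeyError("add both nodes before linking them")
        self.links[(source, target)] = strength

    def targets(self, source, kind=None):
        """Names of nodes that `source` points to, optionally filtered by kind."""
        return sorted(
            t for (s, t) in self.links
            if s == source and (kind is None or self.kinds[t] == kind)
        )


system = ConceptSystem()
for name, kind in [
    ("current", "C"), ("voltage", "C"),   # C-constructs
    ("substance-like", "attribute"),      # an ontological attribute
    ("D1", "D"),                          # hypothetical D-construct
    ("M1", "M"),                          # hypothetical M-construct
    ("e1", "evidence"),                   # one piece of evidence
]:
    system.add_node(name, kind)

system.add_link("current", "substance-like", strength=0.8)  # attribute association
system.add_link("D1", "current")   # D1 relates the two C-constructs
system.add_link("D1", "voltage")
system.add_link("M1", "current")   # M1 uses a C-construct ...
system.add_link("M1", "D1")        # ... and a D-construct
system.add_link("M1", "e1")        # M1 is evaluated against evidence e1

print(system.targets("M1", kind="C"))   # C-constructs used by M1
```

Keeping the node kind separate from the link table makes it easy to ask which C-constructs a given model relies on, which is the question that matters when a model succeeds or fails against evidence.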
The systemic view sees the
knowledge system as a connected network of C-, D- and
M-constructs, where connections between these constructs can
continuously change when the evidence changes. Of course, the
system can also reach stable states so that there are no changes
in connections when additional evidence is available. Changes in connections
are based on locally effective rules, but the total effect
depends on the global state of the system as a whole (due to
the connectedness of the system). Different concepts can then be
expressed as different relational structures of the pieces or as
different constellations of elements. Within the systemic view,
connections can be a type of positive constraint so that a
connection strengthens the role of a given element. The
connections can also be a type of negative constraint so that
the connections weaken the role of the element. Identifying
connections and determining whether they are negative or
positive must be based on empirical evidence.
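One simple way to picture positive and negative constraints is as signed link weights whose effects add up locally. The sketch below assumes this additive rule purely for illustration; the function `net_constraint`, the element names and the weights are hypothetical and are not taken from the empirical data.

```python
def net_constraint(element, links, active):
    """Sum of the signed link weights that active elements exert on `element`."""
    return sum(
        weight
        for (source, target), weight in links.items()
        if target == element and source in active
    )


# Hypothetical signed links: positive values strengthen, negative values weaken.
links = {
    ("e1", "M_simple"): 1.0,    # e1 supports the simple model
    ("e3", "M_simple"): -0.5,   # e3 speaks against the simple model
    ("e3", "M_complex"): 1.0,   # e3 supports the complex model
}

print(net_constraint("M_simple", links, active={"e1"}))
print(net_constraint("M_simple", links, active={"e1", "e3"}))
```

The rule is local, since each element only sums its own incoming links, yet the outcome depends on which other elements are active, which mirrors the dependence on the global state of the system.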
The evolution of the
concept system is driven by the utility of M-constructs in
explaining evidence (cf. Henderson, Goodman, Tenenbaum &
Woodward, 2010; Ohlsson, 2009). The explanatory power of the
M-construct changes with the changing amount of evidence to be
explained. Because different M-constructs can explain the same
evidence, M-constructs compete against each other. If the
context is simple and only little evidence needs to be explained,
one can achieve this by using simple and only partially correct
models that correspond to simple M-constructs. Then, the utility of
the simple M-construct is better than the utility of more complex
ones (e.g. scientific ones), and the simpler ones are therefore
more likely to be adopted. However, with the increasing
complexity of the context and greater amounts of evidence to be
explained, the complex models (M-constructs), which explain
more, gain utility and become adopted. Thus, it is important to
note that in learning, the adoption of a model is a question not
only of its correctness, but also of its utility (Ohlsson, 2009,
2011). It is assumed here that different models and evidence to
be explained are known in advance. Many of the models may be
inactive and much of the evidence unknown to the learner in the
initial stages of learning. From the point of view of modelling
learning, one can assume a finite collection of possible models,
some of them active and some inactive (cf. Henderson et al.,
2010).
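The competition described above can be illustrated with a toy calculation. The utility function below (share of evidence explained minus a cost proportional to model complexity) is an assumption chosen only to reproduce the qualitative behaviour; it is not the utility measure of the DGM, and the names `M_simple`, `M_complex`, `complexity` and `cost` are hypothetical.

```python
def utility(model, evidence, cost=0.1):
    """Assumed utility: share of evidence explained minus a complexity cost."""
    if not evidence:
        raise ValueError("evidence must not be empty")
    explained = len(model["explains"] & evidence) / len(evidence)
    return explained - cost * model["complexity"]


def adopted_model(models, evidence):
    """Name of the model with the highest utility for the given evidence."""
    return max(models, key=lambda name: utility(models[name], evidence))


# A finite collection of possible models, as assumed in the text.
models = {
    "M_simple": {"explains": {"e1", "e2"}, "complexity": 1},
    "M_complex": {
        "explains": {"e1", "e2", "e3", "e4", "e5", "e6"},
        "complexity": 3,
    },
}

simple_context = {"e1", "e2"}
rich_context = {"e1", "e2", "e3", "e4", "e5", "e6"}

print(adopted_model(models, simple_context))  # the simple model suffices
print(adopted_model(models, rich_context))    # the complex model gains utility
```

With only two pieces of evidence both models explain everything, so the cheaper one wins; once the evidence grows, the simple model's coverage drops and the complex model is adopted.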
In a learning situation
where concept learning and differentiation take place, learners
share concepts in discussions within small groups. In
learning, knowledge convergence is often considered crucial to
forming shared, public concepts (Weinberger et al., 2007; Jeong
& Chi, 2007). The
empirical data of concept differentiation indicate that
knowledge also converges during this process; the ways in which
one uses and understands the concepts become similar, at least
to a certain degree (Koponen & Huttunen, 2013). In the
systemic view, this kind of knowledge convergence means that the
ways in which the C-constructs link to other elements of
knowledge and their attributes among the learning group become
more similar during the learning process. Here, knowledge
convergence is discussed only to the extent that it concerns
concept differentiation via communication in small groups of
three. Therefore, in what follows, we concentrate on simple
triadic communication patterns (discussed in more detail in
section 4) and assume that the effect of convergence takes place
mainly through utility of models. The communication is described
simply as a consensus-based knowledge sharing where, through
communication, all group members always adopt the model with the
strongest utility. There is no threshold effect on adoption of
the model. Such a convergence model exaggerates the effect of
communication on learning, but it is an adequate model for
estimating the maximal expected effect of communication on
concept differentiation.
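The consensus rule stated above (every member adopts the model with the strongest utility, with no threshold) can be written in a few lines. This is a minimal sketch under the assumption that each member holds a numerical utility for each model he or she knows; the member labels, model names and utility values are hypothetical.

```python
def consensus_model(group_utilities):
    """Model with the highest utility held by any member of the group."""
    candidates = [
        (model, value)
        for member_utilities in group_utilities.values()
        for model, value in member_utilities.items()
    ]
    best_model, _ = max(candidates, key=lambda pair: pair[1])
    return best_model


def communicate(group_utilities):
    """After communication every member adopts the consensus model (no threshold)."""
    best = consensus_model(group_utilities)
    return {member: best for member in group_utilities}


# A hypothetical group of three learners and their current model utilities.
group = {
    "A": {"M_simple": 0.6, "M_complex": 0.2},
    "B": {"M_simple": 0.5, "M_complex": 0.9},
    "C": {"M_simple": 0.4},
}

print(communicate(group))
```

Because a single member with a high-utility model is enough to move the whole group, the rule gives an upper estimate of the effect of communication, as noted in the text.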
4. Empirical findings revisited: Electric current and voltage
Research on learning
scientific concepts and concept differentiation has been
conducted in several ways and on different topics, but perhaps
most extensively on the concepts electric current and voltage
(Cohen, Eylon & Ganiel, 1983; Shipstone, 1984; Engelhardt &
Beichner, 2004; Koumaras, Kariotoglou & Psillos, 1997; Lee
& Law, 2001; McDermott & Shaffer, 1992; Reiner et al.,
2000). Some of the studies have focused on students’
explanatory models (Cohen et al., 1983; Engelhardt &
Beichner, 2004; McDermott & Shaffer, 1992; Koumaras et al.,
1997; Shipstone, 1984), while some other studies have focused on
ontological attributions (Lee & Law, 2001; Reiner et al.,
2000). The general outcome of these studies is that the concepts
electric current and voltage are often mixed with personal,
intuitive concepts or conceptions (or intrapersonal concepts)
and, furthermore, are poorly differentiated.
The impact of numerous
empirical studies on deeper theoretical understanding of concept
learning and differentiation, however, has been relatively
modest for at least two reasons. First, although these studies
have identified a variety of different types of models,
intuitive conceptions and ontological attributions, they have
failed to abstract from the empirical details general and
generic features which could provide a broad enough theoretical
perspective to understand the relationship between different
views. Therefore, for lack of a sufficiently broad theoretical
perspective, discussions have often focused on the differences
of a preferred theoretical perspective over that of some other
perspective, rather than trying to provide a more integrated,
progressive and broader theoretical framework that makes the
partial accounts understandable (see e.g. Chi & Brem, 2009;
Gupta et al., 2010; Ohlsson, 2009). We believe that many
empirical findings can be captured within the systemic model
when suitably idealised to reveal the essential generic features
behind the multitude of details. Furthermore, the systemic view
can help us to understand how different aspects of concept
learning are related. In
what follows, we focus only on features of interest to advanced
learners, typically those on an upper-secondary school level or
first-year university level.
The purpose of the present
work is to provide a new theoretical framework to discuss
concept differentiation when learners’ concepts take on a
scientific character. Rather than report new empirical results,
this study re-uses and re-analyses already published
empirical data (Koponen & Huttunen, 2013). The empirical
data consist of interviews with nine upper secondary school
students about their conceptions of electric current and
voltage in DC circuits. The students built DC circuits, observed
their behaviour, and then proposed explanations for the observed
brightness of light bulbs. The interviews were transcribed and
analysed to identify the models students use to explain the
behaviour. The nine students discussed their explanations in
groups of three. The study consisted of three different contexts
I-III:
I: Light bulbs in series. The participants
compared two variants (a single light bulb and two light bulbs)
in terms of the brightness of the bulbs. This comparison
produces evidence e1 and e2.
II: Light bulbs in parallel. The first
variant again involves a single light bulb. The second
variant involves two light bulbs in parallel. Comparing the two
variants yields evidence e1’ and e2’.
III: Comparison of the
brightness of light bulbs in series (I) and in parallel (II). In
the first variant, participants compare the brightness of light
bulbs in series and in parallel circuits to the one-bulb case
only. In the second variant, participants compare series and
parallel cases to each other. This produces evidence e1’’ and
e2’’.
The six different types of
evidence, together with the reference observations, are collected
into an evidence set E = {e0, e1, e2,
e0’, e1’, e2’, e0’’, e1’’, e2’’},
with e0, e0’ and e0’’ representing observations of the
brightness of a single light bulb in each context (the brightest
light bulb). Further details about the empirical setup, design
and excerpts from the student interviews are reported by Koponen
and Huttunen (2013). These
empirical studies reveal some common features, which answer the
following questions:
1. What are the models students use to make
predictions and explanations?
2. What are the determination (constraining
or causal) schemes students employ as part of their models?
3. What attributes do students associate
with the concepts, models or determination schemes they use?
4. How does communication in a small group
(3 students) affect the relationships between concepts, models
and determination schemes?
The results of the
re-analysis serve here to construct idealised sets of the M- and
D-constructs in contexts I-III, with a summary of the results in
Table 1. A summary of the attributes revealed by the analysis is
in Table 2.
M-constructs M1 and M2 are
well-known electric current-based intuitive models found in many
empirical studies (see Koponen & Huttunen, 2013, and
references therein), while constructs M1’ and M2’ represent
corresponding models, but are based on voltage (undifferentiated
from current). These appear in relatively few cases, but are
taken into account here. Constructs M3 and M3’ are partially
correct explanations, which take into account the role of
components in determining the current. Construct M3’, however,
appears only once in the empirical data. Construct M4 is the
correct scientific model based on Ohm’s law (D3) and Kirchhoff’s
laws I (D1) and II (D2), which correctly differentiates between
electric current and voltage.
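As a worked illustration of what a model of type M4 predicts for the series and parallel circuits of contexts I-III, the following minimal sketch applies Ohm's law and Kirchhoff's laws to identical bulbs. It rests on assumptions that are not part of the empirical study: an ideal battery of voltage u, bulbs treated as identical ohmic resistors r, and brightness taken to be proportional to dissipated power. The function names and numerical values are hypothetical.

```python
def bulb_power_series(u: float, r: float, n: int) -> float:
    """Power dissipated in each of n identical bulbs connected in series."""
    # Kirchhoff II: the voltage drops over the n bulbs sum to u,
    # so Ohm's law for the whole loop gives the common current.
    current = u / (n * r)
    return current ** 2 * r


def bulb_power_parallel(u: float, r: float) -> float:
    """Power dissipated in each identical bulb connected in parallel."""
    # Kirchhoff II: every branch sees the full battery voltage u.
    # Kirchhoff I: the branch currents add up in the junction, so the
    # power per bulb does not depend on the number of branches.
    return u ** 2 / r


# Illustrative values only (assumed, not taken from the study).
single = bulb_power_series(6.0, 12.0, 1)
two_in_series = bulb_power_series(6.0, 12.0, 2)
two_in_parallel = bulb_power_parallel(6.0, 12.0)
print(two_in_series / single, two_in_parallel / single)
```

Under these assumptions each of two bulbs in series receives a quarter of the single-bulb power, while each of two bulbs in parallel receives the same power as a single bulb. A model that treats the battery as a source of constant current cannot account for both outcomes at once.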
Table 1
The M- and D-constructs inferred from the empirical study (Koponen & Huttunen, 2013)
Construct | Description | Construct | Description
M1 | The battery as a source of current. | M1’ | The battery as a source of voltage.
M2 | M1 + components consume current. | M2’ | M1’ + components consume voltage.
M3 | M1 + voltage over components creates current. | M3’ | M1’ + current over components creates voltage.
M4 | Model based on Ohm’s law + Kirchhoff’s laws KI and KII. | |
D0 | Constraining laws: Conservation (of “electricity” or current). | D1 | Constraint: Current is conserved in junctions/branches (Kirchhoff I).
D2 | Constraint: Voltages in a closed loop equal zero (Kirchhoff II). | D3 | Ohm’s law: U = RI or U/I = R.
Table 2
Attributes a1-a9 inferred from the empirical study, with key word(s) used to characterise and identify each attribute.
Attribute | Key word | Attribute | Key word
a1 | Stored | a2 | Contained
a3 | Consumed | a4 | Conserved
a5 | Degraded or diminished | a6 | Divided and diminished
a7 | Maintained | a8 | Partitioned and conserved
a9 | Generated, supported | |
Most of the relevant
elements found in the interviews can now be represented
according to the systemic view and by using a Directed Graph
Model (DGM) to relate different C-, D- and M-constructs to sets
of attributes and evidence. The DGM is a representation in which
different elements are connected through
directed links which are either congruent or dissonant. The
links and their direction provide information on how the
elements interact. This has the advantage that the DGM can serve as
a computational template (Koponen, 2013). An example of how the DGM
relates to different constructs and the most important links
connecting them appears in Figure 2. The links shown as solid
lines are mutually supporting, congruent links; dissonant links,
shown as dotted lines, point out contradictions. Congruent links
were recognised on the basis of how students combined these
elements in different situations. The recognition of dissonant
links was more problematic. Most of the dissonant links are the
interviewers’ interpretation of unavoidable logical
contradictions rather than notions expressed by the students
themselves (Koponen & Huttunen, 2013).
Different students’
conceptions can now be visualised as graphs with different node
strengths. Some
typical students’ conceptions A-D found in the interviews and
represented in this way appear in Figure 3. Cases A and B are the
most common in contexts I and II, while C and D usually occur
only in context III. D occurred in only two of the nine cases
studied, while C (or constellations close to it) occurred in
four cases (Koponen & Huttunen, 2013). An important aspect
of the DGM representations is that they represent the students’
understanding as a constellation of C-, D- and M-constructs and
associated attributes with various strengths. Of course, these
strengths are idealisations of the systemic model, which only
phenomenologically represents the apparent importance of a given
construct as it can be identified in interviews, and only
partial information about such strengths is available from the
empirical data. Nevertheless, such fine-grained representations
of students’ conceptions contain more information than do
traditional ways based on written descriptions only.
Figure 2. (see pdf file)
Directed Graph Model of all essential C-, D- and M-constructs based on the
empirical results, as reported in Table 2. C- and D-
constructs are linked to attributes {a1, a2, ..., ak},
M-constructs are linked to sets of evidence {e1, e2, ..., ek}.
Links can be congruent (solid line) or dissonant (dashed line).
Construct C1 is current, and C2 is voltage.
The information about
communication acts between the students (as it is available from
the interviews) can serve to construct idealised communication
patterns between students and to temporally locate the effects
of communication on the students’ choices of models and
attributions. Analyses of data on the individual students’
conceptions have been published previously (Koponen &
Huttunen, 2013), but data on communication is unpublished. A
summary of the changes in models and how communication takes
place is given in Table 3. Here, the re-analysed data, ordered
in temporal sequences to reveal the communication acts, allows
the identification of changes in C- and M-constructs.
Unfortunately, the original data offer no detailed information
on changes in sets of attributes.
The results in Table 3 show
that the communication patterns in groups G1 and G2 are
reciprocal in that all students exchange information in all
directions. Nevertheless,
some one-person dominated patterns are evident, where a single
student (s3 in G1 and s6 in G2) is more active than others.
Formally, such communication patterns between students P, P’ and
P’’ can be modelled as a triad (see Figure 4).
With group G3,
communication takes place reciprocally between all students and
the communication pattern is dense. Unfortunately, the empirical
data here do not permit a more detailed analysis of the
communication patterns. In
what follows, the effect of communication is modelled as a
triad; in one case as relatively sparse and one-person
dominated, and in the other case as dense and reciprocal.
Table 3
Evolution of nine students’ conceptions 1-9 in groups G1-G3 of three students (s1-s3, s4-s6 and s7-s9) in contexts I, II and III, as idealised in terms of the DGM. Communication events are shown as directed dyads i → j from student i to student j, or as reciprocal dyads i ↔ j.
 | Group G1, with students 1-3 | | | Group G2, with students 4-6 | | | Group G3, with students 7-9 | |
Context | s1 | s2 | s3 | s4 | s5 | s6 | s7 | s8 | s9
I | A | A | (D) | A | A | (D) | A | C | A,C
 | 1←3 | 2←3 | 3→1,2 | 4←5,6 | 5←6 | 6↔4 | 7←8,9 | 8↔9 | 9↔7,8
 | A | C | D | C,B | B | D | A,C | C,A | A,C
II | 1↔3,2 | 2↔3,1 | 3↔1,2 | 4←6 | 5↔6 | 6→4,5 | 7↔8,9 | 8↔7,9 | 9↔7,8
 | C | (D) | D | C | B | D | | |
 | 1←3 | | 3→1 | 4←6 | 5←4 | 6↔4 | | |
 | C | D | D | C | C | D | C | C,A | A,C
III | 1↔3,2 | 2↔2,3 | 3↔1,2 | 4→5,6 | 5←4 | 6←4 | 7↔8,9 | 8↔7,9 | 9↔7,8
 | D | D | D | D | D | D | C | C | C
The Directed Graph Model
(DGM) can serve as a computational template, that is, as a computational
embedding of the systemic view that reproduces the generic
features found in empirical situations. In what follows, we
briefly describe the computational features of such an embedding,
with updating rules (see Appendix) similar to those
previously introduced and motivated elsewhere (Koponen,
2013). Computational embedding transforms the qualitative
notions contained in the systemic view into computational rules and
quantifies the roles of model utility and theoretical guidance.
Concept learning and the degree of concept differentiation are
monitored through two quantities: Theoricity T and Separability
S. Theoricity T describes the theoretical complexity of the
concept, while Separability S is connected to differentiation
and, thus, to ontological shift. The pair of values (S,T) then
specifies the learning path.
In the DGM, C-, D- and
M-constructs are nodes connected by directed links. Each node
has a dynamically evolving strength, which determines its effect
on the other nodes to which it is connected and, thus, the
dynamics of the system. M-constructs are also connected to sets
of evidence (see Figure 2). Node strengths are updated after
comparing M-constructs with the evidence, which means obtaining
new evidence or reconsidering existing evidence. The DGM also
has memory effects in that the new strengths of the links and
nodes depend recursively on the previous values.
Furthermore, the simulations also take into account the effects of
communication between learners. In the computational embedding,
information contained in one graph affects the strengths of the
nodes and thus the dynamics of another graph. To describe the
state of the system and to characterise the evolution of the
concepts, we must define several quantities in terms of node
strengths and links. Below is a short overview of these
quantities. A complete description of the update rules and their
definitions in terms of link and node strengths is given
separately in the Appendix. The details given in the Appendix
are not essential for understanding at a general level how the
model works, but the mathematical details are needed to
fully appreciate how the memory effects arise through
connectivity from the global state of the network.
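The general idea of node strengths that evolve through signed directed links and depend recursively on their previous values can be sketched as follows. This is only an illustration under stated assumptions and not the update rule of the Appendix: the class name, the linear mixing with a `memory` parameter, the clipping to the range 0 to 1, and the example links and weights are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DirectedGraphModel:
    """Toy DGM: node strengths plus signed, weighted directed links."""

    strength: dict[str, float]
    # (source, target) -> weight; positive = congruent, negative = dissonant.
    links: dict[tuple[str, str], float] = field(default_factory=dict)

    def step(self, memory: float = 0.5) -> None:
        """One synchronous update of all node strengths (assumed rule)."""
        incoming = {node: 0.0 for node in self.strength}
        for (source, target), weight in self.links.items():
            incoming[target] += weight * self.strength[source]
        for node, old in list(self.strength.items()):
            # Support pushes the strength up, contradiction pushes it down.
            proposed = min(1.0, max(0.0, old + incoming[node]))
            # Memory effect: the new value depends recursively on the old one.
            self.strength[node] = memory * old + (1.0 - memory) * proposed


# Hypothetical example: D1 supports M4 and contradicts M2.
dgm = DirectedGraphModel(
    strength={"D1": 1.0, "M2": 0.75, "M4": 0.25},
    links={("D1", "M4"): 0.5, ("D1", "M2"): -0.5},
)
dgm.step()
print(dgm.strength)
```

With `memory` close to 1 the strengths change slowly and earlier states persist; with `memory` close to 0 each step is dominated by the current links.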
The most important of the
quantities are the Theoricity T and Separability S of
C-constructs, which serve to specify the learning paths.
Theoricity T is a measure of the theoretical complexity of a
C-construct that roughly describes the number of paths from
C-constructs to M-constructs while taking into account the
strengths of the links and nodes (see the Appendix for details).
Separability S describes the degree of dissimilarity between
C-constructs with regard to the different attributes associated
with them (see Table 2). If two C-constructs are connected to
completely different sets of attributes, S is at its maximum
value. Both quantities are defined in a range from 0 to 1 so
that T = 1 means full theoretic complexity (corresponding to the
scientific use of given concept) and S = 1 means complete
differentiation.
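The limiting behaviour of Separability can be illustrated with a simple set-based measure. The sketch below uses the Jaccard distance between the attribute sets of two C-constructs, which is an assumption made here for illustration; the definition in the Appendix also involves link and node strengths, which this sketch ignores. The attribute assignments in the examples are hypothetical.

```python
def separability(attrs_c1: set[str], attrs_c2: set[str]) -> float:
    """Jaccard distance between two attribute sets (assumed measure of S)."""
    union = attrs_c1 | attrs_c2
    if not union:
        # No attributes at all: nothing distinguishes the two concepts.
        return 0.0
    shared = attrs_c1 & attrs_c2
    return 1.0 - len(shared) / len(union)


# Hypothetical attribute assignments using the labels of Table 2.
undifferentiated = separability({"a3", "a5"}, {"a3", "a5"})
partly_separated = separability({"a3", "a4"}, {"a4", "a7"})
fully_separated = separability({"a4", "a8"}, {"a7", "a9"})
print(undifferentiated, partly_separated, fully_separated)
```

Identical attribute sets give S = 0, completely different sets give the maximum value S = 1, and partial overlap gives a value in between.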
The dynamics of the DGM
depend crucially on the Utility U of M-constructs, and the
Utility is the basis for model selection (the strengths of
M-constructs depend on their Utility, see Appendix). First and
foremost, the Utility is proportional to the ratio of explained
evidence to the theoretical complexity T of the C-construct
while taking into account the relevant strengths of nodes and
links. If the model explains most of the available evidence and
its Theoricity is low, the Utility will be high and the model
will be favoured in explanations. With more evidence to explain,
the less complex models will generally explain less, and
their Utility is thereby reduced. The extent to which the model takes
into account conflicting evidence can be controlled with the
parameter K, which also controls the effect of D-constructs on
Utility. If K is set to a high value, the state of the system is
heavily guided by evidence and theoretical knowledge (i.e.
D-constructs). With low values for K, conflicting evidence and
theoretical information will be more or less ignored; because
D-constructs describing causal conservation laws are then
less important, simpler models are favoured. The importance of
evidence (whether conflicting or not) can also be adjusted by
weakening the links between M-constructs and evidence. One can
also alter the order in which one encounters the evidence. In
practice, parameter K is related to the potential of an
individual student to make use of theoretical knowledge to
construct explanatory models.
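The qualitative role of K in model selection can be sketched with a toy utility function. The functional form below (explained evidence minus K times conflicting evidence, divided by Theoricity) is an assumption chosen only to reproduce the trends described above; it is not the definition given in the Appendix, and it leaves out the effect of D-constructs. All names and numbers are hypothetical.

```python
def utility(explained: float, conflicting: float,
            theoricity: float, k: float) -> float:
    """Toy utility: net explained evidence per unit of theoretical complexity."""
    if theoricity <= 0.0:
        raise ValueError("theoricity must be positive")
    # K controls how strongly conflicting evidence counts against the model.
    net_evidence = max(0.0, explained - k * conflicting)
    return net_evidence / theoricity


# Hypothetical case: a simple model explains 3 pieces of evidence and
# conflicts with 3; a complex model explains all 6 but has T = 1.
for k in (0.0, 1.0):
    simple = utility(explained=3, conflicting=3, theoricity=0.4, k=k)
    complex_model = utility(explained=6, conflicting=0, theoricity=1.0, k=k)
    print(k, "simple" if simple > complex_model else "complex")
```

In this sketch the simple model wins when K = 0, because conflicting evidence is ignored and low complexity is rewarded, whereas the complex model wins when K = 1.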
Learning also depends on
the state of an individual student’s initial knowledge. The
state of initial knowledge is taken into account through the initial
strengths of the different models M1-M4, usually so that simple
models (such as M1 and M2) have high a priori strengths (they
are then the preferred models), while complex models have low
initial strengths. Also, the initial strength of D-constructs
affects how strongly theoretical knowledge will guide the
learning process, an effect taken into account through parameter
D’. In addition, students differ in their attentiveness to
evidence, so that some parts of the evidence receive more weight than
others.
This is taken into account by assigning weights to the
evidence. Finally, the sequence of evidence and the order in which
one encounters the evidence (i.e. the training sequence) affect
the dynamics of the learning paths. The values of these
parameters and their initial values serve to model the
individual learners’ initial knowledge and their potential to
make use of theoretical knowledge.
In addition to the
individual characteristics described above, communication
between individuals affects the dynamics of learning paths. In
the DGM, communication can also affect the strengths of the
M-constructs. Assuming pairwise communication between students,
the stronger M-constructs affect the weaker ones so that the
lower value is increased by the communication impact factor C. A
value C = 1 means complete adoption of the highest utility
models in communication, and C = 0 means ignoring completely the
information provided through communication. The Appendix
explains the details of the communication model as part of the
DGM.
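A minimal sketch of such pairwise communication is given below. It assumes that the weaker strength moves towards the stronger one by the fraction C of their difference, which reproduces the two limiting cases C = 1 and C = 0 described above; the exact rule is given in the Appendix and may differ. The function name and the example strengths are hypothetical.

```python
def communicate(own: dict[str, float], peer: dict[str, float],
                c: float) -> dict[str, float]:
    """Return a listener's M-construct strengths after hearing a peer."""
    updated = {}
    for model, strength in own.items():
        peer_strength = peer.get(model, 0.0)
        if peer_strength > strength:
            # Only the weaker value is raised, by the fraction c of the gap.
            strength += c * (peer_strength - strength)
        updated[model] = strength
    return updated


# Hypothetical strengths: the listener favours M2, the peer favours M4.
listener = {"M2": 0.75, "M4": 0.25}
peer = {"M2": 0.25, "M4": 1.0}
print(communicate(listener, peer, c=0.75))
```

With c = 1 the listener fully adopts the peer's stronger models, and with c = 0 the listener's strengths are unchanged.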
The computational model is an
idealisation and takes into account only the roughest features
of concepts, models and communication. However, the model is
constructed to include the most essential generic features and,
as such, is capable of providing important insight into how
different parts of a conceptual system and its internal
connectivity affect concept differentiation.
6. Results: DGM simulations of concept differentiation
The Directed Graph Model
(DGM) and simulations based on it must make understandable the
following generic features of concept learning and
differentiation:
1)
Context-dependent dynamics of concept learning and
differentiation (learning paths). The students’ conceptual
states (as shown in Figure 3) are context dependent in that they
appear mostly in given contexts I-III, with a given set of
evidence (or observations) to be explained, and the state
changes when the set of evidence changes.
2)
The dynamics and persistence of ontological change in
attributions. Changes in ontological attributions are indicative
of concept differentiation. When it takes place, it leads to a robust
and persistent learning outcome.
3)
The effect of communication on concept differentiation. In two
of three cases, a given group has a student with a more
sophisticated conception and more differentiated concepts than
the two other students, who eventually partially adopt
that sophisticated conception.
Of course, the learning
process entails many other details, but these generic features
1-3 are the most important and interesting ones that any model
of concept learning and differentiation should explain. In what
follows, we concentrate on simulating just such a process of
concept learning and differentiation by using the DGM and
monitoring the learning process through the Theoricity T and
Separability S of concepts.
The DGM allows
parameterisation of many different initial stages. The initial
stages are described through the initial strength of
M-constructs and D-constructs, and through the strength of the
evidence. For
initial model strengths, we studied here cases where M1 and M2
are strong models (strength 1.0 - 0.75), and M3 is of moderate
strength (strength 0.5 - 0.25), and other M constructs are weak
(strength 0.25 - 0.05). These cases are interesting, because
they provide information on how initial, rather unsophisticated
models such as M1 and M2 evolve during the learning process
towards sophisticated models such as M4, and how concept
differentiation relates to this change. This is also the
learning path of most practical interest, a path from intuitive
to scientific concepts.
For initial D-construct
strengths, we studied cases where D1, D2 and D3 are of equal
strengths D’, varying from 1.0 to 0.4. In addition to initial
values for D’, theoretical knowledge operates through congruent
and dissonant connections, which can be tuned by parameter K
(see the Appendix and Table 4) so that value K = 1 denotes the
strongest guidance and, K = 0, no guidance at all. In addition
to these parameterisations, the dynamics of the DGM and the
learning paths depend on what we call here the training
sequence, meaning evidence and the order in which one encounters
it.
The training sequences are
constructed to correspond to empirical contexts I-III (see
section 4.1) so that each sequence I→II→III consists
of evidence {e0, e1, e2, e0’, e1’, e2’, e0’’, e1’’, e2’’} where
each element e0, e1, … is associated with strength e, specifying
how much attention one pays to the evidence. If e = 1, then
evidence is taken fully into consideration, but for 0.0 < e
< 1.0, only partially. In the simulations, each event is
repeated N times (N = 3 or N = 4), and the sequence is then
reversed in order to verify
the permanence of learning (i.e. no reduction of values T and S
for the reversed sequence, and no “hysteresis” effect). Thus,
the computation consists of training sequences of the form
{N×(e’,e’,e’); N×(e’’,e’’,e’’); N×(e’’’,e’’’,e’’’);
N×(e’’’,e’’’,e’’’); N×(e’’,e’’,e’’); N×(e’,e’,e’)}, which
consist of 54 events for N = 3 and of 72 events for N = 4.
The training sequences of that form are completely
specified by N and the set of values O = (e’,e’’,e’’’). In some
cases, to test the hysteresis, an additional 38 events are added
in random sequence, denoted by R. In summary, the parameters
that specify the initial conditions are N and O, and the
parameters affecting the dynamics are K and D’.
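The construction of such a training sequence, and the event counts of 54 and 72, can be checked with a short sketch. The representation of an event as a (label, weight) pair, the function name, and the way the additional random events are drawn are assumptions made for illustration only.

```python
import random

CONTEXTS = ("I", "II", "III")
EVIDENCE = {
    "I": ("e0", "e1", "e2"),
    "II": ("e0'", "e1'", "e2'"),
    "III": ("e0''", "e1''", "e2''"),
}


def training_sequence(n, weights, extra_random=0, seed=0):
    """Build the sequence I->II->III followed by its reversal III->II->I.

    n            -- number of repetitions N of each context block
    weights      -- O = (e', e'', e'''), attention paid to contexts I, II, III
    extra_random -- number of additional randomly drawn events (R)
    """
    forward = list(zip(CONTEXTS, weights))
    events = []
    for context, weight in forward + forward[::-1]:
        # One block: N repetitions of the three pieces of evidence.
        for _ in range(n):
            events.extend((label, weight) for label in EVIDENCE[context])
    # Assumed form of R: events drawn uniformly from all nine pieces of evidence.
    rng = random.Random(seed)
    pool = [(label, w) for context, w in forward for label in EVIDENCE[context]]
    events.extend(rng.choice(pool) for _ in range(extra_random))
    return events


o3 = (1.0, 0.5, 0.1)
print(len(training_sequence(3, o3)), len(training_sequence(4, o3)))
```

Each of the six blocks contains 3N events, so the deterministic part has 18N events in total; adding the 38 random events to the N = 3 sequence gives 92 events.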
The personal (individual,
without communication) learning paths are studied first in a
case, where initial conditions favour models M1 and M2 with
initial strengths of 0.75, but where model M2’ also has a
substantial strength of 0.5. The learning paths begin from
unsophisticated models, which closely correspond to patterns
such as A and B (see Figure 3), and then progress towards more
sophisticated patterns of type D. The learning paths of concept
differentiation are monitored through the evolution of
Theoricity T and Separability S. For comparison, estimates of
the values of T and S corresponding to the empirical cases idealised as graphs
A-D (see Figure 3) are: T = 0.35 - 0.45, S = 0.10 - 0.20 for A;
T = 0.35 - 0.45, S = 0.55 - 0.65 for B; T = 0.55 – 0.70, S =
0.60 – 0.70 for C; and
T = 0.90 – 1.00, S = 0.95 – 1.00 for D.
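For comparing simulated values with these empirical estimates, a simple lookup such as the following can be used. It is only a convenience sketch: the ranges are those quoted above, while the function name and the decision to return every matching pattern (or none) are assumptions.

```python
# (T range, S range) for the empirical configurations A-D quoted in the text.
EMPIRICAL_RANGES = {
    "A": ((0.35, 0.45), (0.10, 0.20)),
    "B": ((0.35, 0.45), (0.55, 0.65)),
    "C": ((0.55, 0.70), (0.60, 0.70)),
    "D": ((0.90, 1.00), (0.95, 1.00)),
}


def matching_patterns(t: float, s: float) -> list[str]:
    """Return the configurations whose estimated (T, S) ranges contain (t, s)."""
    return [
        name
        for name, ((t_lo, t_hi), (s_lo, s_hi)) in EMPIRICAL_RANGES.items()
        if t_lo <= t <= t_hi and s_lo <= s <= s_hi
    ]


print(matching_patterns(0.4, 0.15), matching_patterns(1.0, 1.0))
```

A point on a simulated learning path that falls between the estimated ranges returns an empty list, which indicates a transitional state rather than one of the idealised configurations.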
The learning paths are
shown in Figure 5 for training sequence parameterisations
O1=(1.0,1.0,1.0), O2=(1.0,0.8,0.8) and O3=(1.0,0.5,0.1), with N
= 3. The positions, where sequences corresponding to contexts I,
II and III end, are denoted. Theoretical guidance is studied for
strong guidance K = 1.0 and 0.8, and for weaker guidance K =
0.50, while parameter D’ (for D-constructs) ranges from 1.0 to
0.4. The evolution of M-construct strengths, which reflects the
competition between models that must explain more evidence, is
shown in Figure 6. The situation shown in Figures 5 and 6 is
asymmetric with respect to C1 (current) and C2 (voltage), with
C1 always having a higher Theoricity T than C2. This asymmetry
stems from asymmetry in the initial strengths of M1 and M2, as
shown in Figure 6. This corresponds to the most frequent
empirical situation in which students initially favour
current-based models over voltage-based ones. One can also
interpret the results in the reverse way, with C2 having a higher
Theoricity and the roles of M1 and M2 reversed. However, this
situation, where a voltage-based model is initially preferred over
a current-based model, seldom occurs in empirical cases.
For strong theoretical
guidance (K = 1.0 or 0.8), together with close attention to
observations (training sequences O1 and O2), learning and
concept differentiation are successful. In such cases (Figure 5,
in the upper row, two cases on the left) learning is complete
for concept C1 (current), which is fully scientific (T = 1) and
completely differentiated (S = 1) from concept C2 (voltage).
Concept C2 is nearly scientific (T = 0.6), and with some extra
training (shown in grey in Figure 5), it rapidly becomes a fully
scientific (T = 1) concept.
The learning paths are step-wise, with clearly
distinguishable stable stages in Theoricity T with increasing Separability S. A
sequence corresponding to context I is already enough for
relatively advanced differentiation (i.e. ontological shift),
although Theoricity T may remain low. There appears to be a
threshold of S = 0.7-0.8, which one can reach even with moderate
development in theoricity. This threshold shows that one can
achieve nearly complete separability and good differentiation
(i.e. nearly complete ontological shift) even though the
learning is otherwise still incomplete.
When theoretical guidance
decreases, K = 0.5 and D’ = 0.60 (Figure 5, lower row, middle)
or K = 0.8 and D’ = 0.4 (Figure 5, upper row, right), the
theoricity of concepts C1 and C2 remains low for the training
sequence (black dots), but again, with extra training (grey
dots), improvement is possible. This trend shows that,
eventually, even moderate theoretical guidance is effective, but
then more training is needed. However, the order in which one encounters
the evidence in training is not crucial if context III is
involved. When theoretical guidance is low (K = 0.5, D’ = 0.4) and little attention
focuses on evidence in case of context III, very little learning
takes place, irrespective of the amount of training. This
situation is shown in Figure 5 in the lower right corner.
The evolution of
M-constructs in Figure 6 corresponds to learning paths in Figure
5. The initially dominant M-constructs M1 and M2 remain dominant
until the end of the sequence corresponding to context I. During
the sequence corresponding to context II, models M1’ and M2’
also become active and grow stronger. However, when sequence III
begins, with strong theoretical guidance, the initial models
cannot compete with M4 (fully scientific model), which
eventually dominates when the sequence corresponding to context
III ends. This does not occur with low theoretical guidance
(Figure 6, right column). However, with extra training, M4 is
eventually enforced in cases of moderate theoretical guidance
also (Figure 6, lowest row), but not for the least guidance
(Figure 6, lower right corner). It is noteworthy that M3’ is
never activated, which is in line with the empirical finding that
such a voltage-based model is only seldom encountered.
Figure 5. The
Theoricity T and
Separability S of
concepts C1 (bullets) and C2 (boxes) in the case of six
different learning paths with given parameters K and D’ that control the
strength of theoretical guidance. The upper row shows cases
where K ≥ 0.8 is always
relatively high but D’
varies from 1.0 to 0.4. In the lower row, K also varies from a
high value of 0.8 to a lower value of 0.5. The initial values of
the model strengths and strengths of the observations are
different in cases shown in the left, middle and right columns
(corresponding model strengths are shown in Figure 6). Left
column: Initial values of model strengths favour models M1 and
M2’ with strengths of 0.5, while other models have a weaker but
equal strength of 0.25. The observations of events I-III are
strong (link strengths have a value of 1). Middle column: Model strengths as in
the left column, but M3, M3’ and M4 are reduced to 0.15,
observations I-II are strong (1), but III is only moderately
strong (0.75). Right column: Otherwise similar to the middle
column, but the observations in case III are weak (0.10). The training sequence
from I to III (end points of each sequence are marked in the
figure), with three repetitions for each event, appears as black
dots. The training sequence testing the permanence of learning,
from I to III, then back from III to I, followed by one random sequence,
appears as grey dots. The
values corresponding to the empirical results of configurations
A-D (see Figure 3) are indicated (two symbols for each are
located in the pairs of the lowest estimated and highest
estimated values for T
and S).
Figure 6. (see pdf file)
Model evolution of the learning paths shown
in Figure 5 with parametrisations for K and D and different
training sequences O1, O2 and O3 as indicated. The color
represents the strength of a given model. The number of steps in
the simulation appears on the vertical axis thus indicating the
ordering of the sequence.
Learning paths with
slightly different initial conditions from the cases in Figure 5
are shown in Figure 7. In the cases shown in the upper row, the
M-constructs M3 and M4 are slightly weaker than in the case
shown in Figure 5. In the cases shown in the lower row,
M-constructs M1 and M2 are of nearly equal strengths, which
makes the initial stage more symmetric with respect to C1 and
C2. In both cases, the attention to events corresponding to
contexts II and III becomes weaker from left to right,
represented as parameterisations O3 (as in Figure 5) and O4 =
(1.0,0.8,0.5). In addition, the training sequence is now such
that each event is reproduced three (N = 3, black dots) or four
times (N = 4, grey dots). The corresponding evolution of
M-constructs is shown in Figure 8.
Compared to the cases shown
in Figure 5, one can observe some interesting differences. In
the case shown in Figure 7, in the upper left corner, learning
consists mostly of ontological shift through the end of the sequence
corresponding to context II. After that, when the sequence
corresponding to context III begins, Theoricity T rapidly increases
because M4 rapidly gains strength (see Figure 8) due to strong
theoretical guidance. Eventually,
when the training sequence ends, learning is again complete. The
training sequence with N = 4 leads to higher Theoricity T of
concepts in I and II, but interestingly, to a slower increase in
theoricity in III than in cases with N = 3 because with N = 4,
M3’ grows strong during I and II, which slows the adoption of
M4. This “overlearning” effect is most pronounced in the case
shown in Figure 7, in the upper right corner, where theoretical
guidance is low and attention paid to events is also low. In
this case, more frequent repetition of events with N = 4 leads
to deterioration of the learning results, and learning stagnates
on the low Theoricity T of C1 and C2. Ontological shift,
however, advances and eventually, Separability S = 0.8 is
reached. Such a situation corresponds to what occurs in real
learning; too much focus on overly simple tasks, which
reinforces unsophisticated models, may lead to the persistent
and robust use of under-developed models and conceptions.
Figure 7. (see pdf file)
Three
deterministic cases of learning paths with a given K. The figures show
Theoricity T and Separability S of
C-constructs C1 (bullets) and C2 (boxes) from Figure 2 and
indicate the values corresponding to empirical results A-D (see
Figure 3). Construct
C1 is current, and C2 is voltage. The initial conditions favour
models M1, and M2’ (voltage based).
Figure 8. (see pdf file)
Model evolution
of the learning paths in Figure 7 with parameterisations for K and D and different
training sequences O3 and O4 as indicated. The darkness
represents the strength of a given model. The numbers of steps
in the simulation appear on the vertical axis, thus indicating
the ordering of the sequence.
The lower row in Figure 7
shows some interesting situations, where repetition temporarily
leads to the deterioration of learning results, when simple
situations recur after the sequence I→II→III. We
briefly refer to this as “hysteresis” in learning. Eventually,
however (Figure 7, lower row, two cases on the left),
complete learning with T = 1 and S
= 1 occurs and learning becomes permanent, no longer affected by
further repetitions. In the case of weak theoretical guidance
and weak learning from events (Figure 7, lower right corner),
incomplete learning occurs, with stable learning resulting at T
= 0.4 and S = 0.6. This is again due to “overlearning” of
incomplete models, which prevents the adoption of the more
sophisticated model M4. Similar results can also be observed for
other cases with moderate K and moderate attention to
observations; repeating simple situations I and II many times
before encountering more complex situation III, may reinforce
the incomplete models M1-M3 or M1’-M3’ so much that further
development can no longer take place. This shows that
repetitions of training sequences can have detrimental
consequences on learning if initial theoretical guidance is too
low.
The examples discussed
above are asymmetrical situations, where concept C1 (current) is
the favoured concept, while C2 (voltage) is initially less
developed, and remains largely as is during further evolution.
This is the most common situation in learning, although the
roles can sometimes be reversed. However, because the DGM is
symmetrical with respect to C1 and C2 (see Figure 2), a reversed
situation where C2 has stronger Theoricity than C1 is quite
similar, with C1 and C2 simply switching roles. Also,
symmetrical situations can occur; they closely follow the
results in Figures 5 and 7, with the learning paths for C1 and
C2 then simply overlapping.
The effects of communication on learning paths are simulated by
using sparse and dense communication patterns between members
P, P’ and P’’ in a group of three (see Figure 4), and two impact
factors, C = 0.2 and C = 0.75, for communication. The effect of
communication is tested on cases where theoretical guidance is
strong or moderate (Figures 5 and 7, upper row, in the right
column). The results of learning paths are shown in Figure 9,
and the evolution of M-constructs in Figure 10. These figures
show that even the effect of dense communication with a high
impact C = 0.75 only moderately affects the learning paths. The
most obvious effect is that if one member P in the group has a
learning path which is strongly theoretically guided, thus
reaching high values for T and S, the other members tend to
learn from that member and, consequently, reach higher values
for T and S than without
communication. Eventually, members P’ and P’’ who are less
successful (Figures 5 and 7, in the lower right corner) than
member P also achieve complete learning owing to the
communication. This happens equally well for sparse and
low-impact communication as for dense and high-impact
communication. Of course, this occurs only in cases with one
successful learner in the group. These features appear to be in
concordance with the empirical findings, although the empirical
findings presently allow no more detailed comparisons.
In the learning model,
which is simply biased toward adopting the strongest model, the
good learning result may temporarily worsen (see Figure 9, upper
right corner). However, this is a transient effect, and the
learning path eventually evolves toward complete learning.
Figure 9. (see pdf file) Learning paths with the effect of
communication taken into account. Different figures represent
paths with different parameterisations for K and D and
different training sequences O1, O2, O3 and O4.
Figure 10. (see pdf file) Model evolution of the learning paths
in Figure 9 with different parameterisations for K and D and
different training sequences O1, O2 and O3. The darkness
represents the strength of a given model. The number of steps
in the simulation appears on the vertical axis, thus indicating
the ordering of the sequence.
The results based on the
DGM agree with the following central empirical findings of
concept learning and differentiation:
1. Context-dependent dynamics. This is
apparent in the strong dependence of paths on the learning
sequences. Complete learning takes place only in sufficiently
rich contexts (e.g. case III), whereas in narrow contexts (e.g.
cases I and II), learning is moderate or incomplete. This is a
consequence of model competition and the greater utility of
complex models in complex contexts.
2. The persistence of ontological shift and concept
differentiation (S ≈ 1). In the DGM,
this is a direct consequence of the guidance of D-constructs and
their “memory effect”, retaining the memory of successful
applications of D-constructs. The persistence of the ontological
shift agrees with the empirical findings. However, the
ontological shift in attributions is not a driving force of
concept learning, but an outcome of a learning process driven by
theoretical knowledge.
3. Communication affects individual
learning paths and enables less advanced members of the group to
adopt more advanced M-constructs from the most advanced member
of the group. Thus communication improves learning, although the
effect in the cases studied here is not particularly strong.
In summary, the DGM model
reproduces the generic features of interest in concept learning
and differentiation, and demonstrates that these features are
associated with the guidance of theoretical knowledge, model
utility and the memory effects of success in using models.
7. Discussion and conclusions
The model presented here is based on the systemic view,
where concepts are viewed as complex, dynamically evolving
structures. The model is constructed to capture generic aspects
of concept learning and differentiation as exemplified in the
case of learning two closely related scientific concepts – here,
electric current and voltage. The generic features of most
interest in need of explanation are: 1) the robustness of
certain simple ways to use concepts to provide explanations in
simple situations, a phenomenon usually assigned to robustness
of misconceived ontological classes, 2) the context-dependent
dynamics of change and the requirement to encounter
sufficiently complex situations to effect the change, and 3)
the robustness of
ontological shift once it has occurred. We suggest here that in
order to understand these features and the dynamics of the
change, we must develop a rich and complex enough model of
concepts. On the one hand, such a model treats concepts as
simple and nearly self-explanatory; on the other hand, it
treats them as complex structures that depend on other concepts.
The systemic view is exemplified by the well-known case of the
differentiation of electric current and voltage as concepts
describing the behaviour of simple DC circuits. For this case,
we use re-analysed empirical data. The
re-analysed (and partly re-interpreted) empirical results are
then represented by using different conceptual elements, or
constructs: C-constructs, which stand for concepts, D-constructs
for causal schemes and law-like theoretical schemes, and
M-constructs, which are model-like structures that use C- and
D-constructs as integrated parts.
As a formal representational model for these constructs
and their mutual relationships, we introduce a Directed Graph
Model (DGM). In the DGM, concepts are nodes in the graph
connected by directed links to other conceptual elements.
The DGM serves as a computational template to simulate
concept learning and differentiation and their dynamics. The
stability of certain properties of concepts, traditionally
considered robust “misconceptions”, and their dependence on
contexts is now seen as related to the complex interplay of
different conceptual elements. Change is driven by competition
between M-constructs (models) and by how available evidence
governs it. However, how this is reflected in the theory- and
attribute-relatedness of concepts depends on how those
M-constructs employ concepts (i.e. how the concept projects onto
the actual evidence). Thus, D-constructs (theoretical knowledge)
are central. All these aspects are recognised in current
cognitively oriented views of concept learning, but are usually
discussed separately or as unrelated views. The present study
strongly suggests unifying these views and treating concepts as
complex, multifaceted and dynamic structures.
Finally, the present work suggests that the theoretical
background developed for research on concept development has
important implications for the ways in which researchers and
instructors view the learning process and how, on this basis,
they design teaching solutions. The results point to the crucial
role of theoretical knowledge in guiding concept learning and,
furthermore, show that ontological shift, while an important
part of learning, is not the primary driving force of learning,
but is rather a consequence of more fundamental changes in the
conceptual system. This suggests that theoretical structures and
model utility should receive more attention in designing
instructional solutions. On the other hand, it is clear that
initial conceptions need not be actively “unlearned”; they can
instead serve as a natural and useful starting point for the
transformation.
For teaching and instruction, one important message lies
in the role of the training sequence in determining learning
paths. The details of the training sequence, and their
repetitions, do matter in the initial stages of learning if the
guidance of theoretical knowledge is low. Then, too much
repetition of overly simple situations to explain may lead to
“overlearning” of unsophisticated concepts and models, and
effectively prevent the acquisition of more advanced models. The
results of the simulations, interpreted within the theoretical
framework of the systemic model put forward here, suggest that
designing specific training sequences which help one to
“unlearn” unsophisticated models is unnecessary; rather, what is
needed is a training sequence which gradually and at suitable
stages of learning introduces more challenging learning
situations, where the utility of more advanced and scientific
concepts and models becomes apparent.
The theoretical positions discussed and suggested here
directly impact learning and instruction by clarifying the
degree to which ontological shift drives the learning
process and the degree to which it should be considered a
consequence of a more fundamental, theory-driven learning
process.
Also, the question of to what extent differentiation and concept
learning take place through the evolution of existing
structures, and to what extent the learner must receive these
structures instead of constructing them receives clarification.
Briefly, if the systemic view is correct, it suggests that
ontological shift takes place, but is a consequence of
theory-driven learning. The learner must receive complex
theoretical structures through instruction and see their utility
in complex enough situations to warrant adopting them. Such
results are practical in that they guide teaching and the
development of teaching solutions; they provide support to some
of the well-known teacher-centred solutions (the role of the
teacher in providing models to organise new knowledge and in
familiarising the students with complex theoretical models),
while showing the indispensability of a rich context and context
variation in the construction of explanatory models and the role
of predictions and observations in learning.
These notions, even without detailed suggestions for
training sequences and instructional solutions, demonstrate that
the choices to employ a certain theoretical framework to
understand concept learning and differentiation are not neutral.
Rather, they have fundamental consequences for how learning and
instruction are conceived, how their purposes and goals are
viewed, and how our attention is guided towards crucial generic
features of learning and its dynamics.
Keypoints
Concepts are considered complex structures, which
are projected differently in different contexts.
Concept differentiation can be modelled when
embedded within a systemic view on concepts.
Theoretical guidance and theoretical schemes are
crucial for concept differentiation. Ontological shift is a
consequence of a theory-guided learning process.
Robust misconceptions are stable dynamic states of
the concept system.
Attention must be paid to training sequences in
learning. Too frequent use of overly simple situations in
training will cause concept learning to stagnate in robust
states corresponding to misconceptions.
References
Andersen, H., Barker, B., & Chen, X. (2006). The Cognitive Structure of Scientific Revolutions. Cambridge, MA: Cambridge University Press.
Andersen, H., & Nersessian, N. J. (2000). Nomic Concepts, Frames, and Conceptual Change. Philosophy of Science, 67, S224-S241.
Brown, D. E., & Hammer, D. (2008). Conceptual Change in Physics. In S. Vosniadou (Ed.), International Handbook of Research on Conceptual Change (pp. 127-154). New York, NY: Routledge.
Carey, S. (2010). The Origin of Concepts. New York, NY: Oxford University Press.
Chi, M. T. H., & Slotta, J. D. (1993). The Ontological Coherence of Intuitive Physics. Cognition and Instruction, 10, 249-260.
Chi, M. T. H. (2005). Commonsense Conceptions of Emergent Processes: Why Some Misconceptions Are Robust. The Journal of the Learning Sciences, 14, 161-199. DOI: 10.1207/s15327809jls1402_1.
Chi, M. T. H. (2008). Three Types of Conceptual Change: Belief Revision, Mental Model Transformation, and Categorical Shift. In S. Vosniadou (Ed.), International Handbook of Research on Conceptual Change (pp. 35-60). New York, NY: Routledge.
Chi, M. T. H., & Brem, S. K. (2009). Contrasting Ohlsson's Resubsumption Theory With Chi's Categorical Shift Theory. Educational Psychologist, 44, 58-63. DOI: 10.1080/00461520802616283.
Cohen, R., Eylon, B., & Ganiel, U. (1983). Potential Difference and Current in Simple Electric Circuits: A Study of Students’ Concepts. American Journal of Physics, 51, 407-412.
Danks, D. (2010). Not different kinds, just special cases. Behavioral and Brain Sciences, 33, 208-209. DOI: 10.1017/S0140525X1000052X.
Engelhardt, P. V., & Beichner, R. J. (2004). Students’ Understanding of Direct Current Resistive Electrical Circuits. American Journal of Physics, 72, 98-115. DOI: 10.1119/1.1614813.
Gopnik, A., & Meltzoff, A. N. (1997). Words, Thoughts, and Theories. Cambridge, MA: MIT Press.
Gupta, A., Hammer, D., & Redish, E. F. (2010). The Case for Dynamic Models of Learners’ Ontologies in Physics. The Journal of the Learning Sciences, 19, 285-321. DOI: 10.1080/10508406.2011.537977.
Henderson, L., Goodman, N. D., Tenenbaum, J. B., & Woodward, J. F. (2010). The Structure and Dynamics of Scientific Theories: A Hierarchical Bayesian Perspective. Philosophy of Science, 77, 172-200.
Hoyningen-Huene, P. (1993). Reconstructing Scientific Revolutions: Thomas S. Kuhn’s Philosophy of Science. Chicago, IL: The University of Chicago Press.
Jeong, H., & Chi, M. T. H. (2007). Knowledge convergence and collaborative learning. Instructional Science, 35, 287-315. DOI: 10.1007/s11251-006-9008-z.
Keil, F. C. (1989). Concepts, Kinds and Conceptual Development. Cambridge, MA: MIT Press.
Koponen, I. T. (2013). Systemic View of Learning Scientific Concepts: A Description in Terms of Directed Graph Model. Complexity, 19, 27-37. DOI: 10.1002/cplx.21474.
Koponen, I. T., & Huttunen, L. (2013). Concept Development in Learning Physics: The Case of Electric Current and Voltage. Science & Education, 22, 2227-2254. DOI: 10.1007/s11191-012-9508-y.
Koumaras, P., Kariotoglou, P., & Psillos, D. (1997). Causal Structures and Counter-intuitive Experiments in Electricity. International Journal of Science Education, 19, 617-630.
Lee, Y., & Law, N. (2001). Explorations in Promoting Conceptual Change in Electrical Concepts via Ontological Category Shift. International Journal of Science Education, 23, 111-149.
Machery, E. (2009). Doing without Concepts. Oxford: Oxford University Press.
Murphy, G. L. (2004). The Big Book of Concepts. Cambridge, MA: MIT Press.
Nersessian, N. (2008). Creating Scientific Concepts. Cambridge, MA: MIT Press.
Ohlsson, S. (2009). Resubsumption: A Possible Mechanism for Conceptual Change and Belief Revision. Educational Psychologist, 44, 20-40. DOI: 10.1080/00461520802616267.
Ohlsson, S. (2011). Deep Learning: How the Mind Overrides Experience. Cambridge, MA: Cambridge University Press.
Rehder, B. (2003). Categorization as Causal Reasoning. Cognitive Science, 27, 709-748.
Reiner, M., Slotta, J. D., Chi, M. T. H., & Resnick, L. B. (2000). Naive Physics Reasoning: A Commitment to Substance Based Reasoning. Cognition and Instruction, 18, 1-34.
Shipstone, D. M. (1984). A Study of Children’s Understanding of Electricity in Simple DC Circuits. European Journal of Science Education, 6, 185-198.
Slotta, J. D., & Chi, M. T. H. (2006). Helping Students Understand Challenging Topics in Science Through Ontology Training. Cognition and Instruction, 24, 261-289.
Smith, C., Carey, S., & Wiser, M. (1985). On differentiation: A Case Study of the Development of the Concept of Size, Weight and Density. Cognition, 21, 177-237.
Smith, E. E., & Medin, D. L. (1981). Categories and Concepts. Cambridge, MA: Harvard University Press.
Weinberger, A., Stegmann, K., & Fischer, F. (2007). Knowledge convergence in collaborative learning: Concepts and assessments. Learning and Instruction, 17, 416-426.
Appendix: DGM update rules
The dynamics of the DGM is determined by the update rules of
the node strengths and weights of connecting links between the
nodes. In the DGM, C-, D- and M-constructs are nodes, which are
connected by directed links. Each node i has a dynamically
evolving strength s_i, which determines its effect on the other
nodes to which it is connected and, thus, the dynamics of the
system. Node strengths s (the subscript is omitted if not
essential) are updated after each “event” e in the set of all
events E, which means obtaining new evidence or reconsidering
the evidence (i.e. any kind of comparison with the evidence).
Variable e is treated as a running index that keeps track of
the encounter with evidence. The strength of the previous step
s(e-1) is then updated to a new value s(e-1) → s(e) that
corresponds to evidence e. A congruent link between nodes i and
j is described by the value a_ij = 1, while a dissonant link has
a_ij = -1. The following quantities are then defined entirely in
terms of node and link weights.
1. Theoricity T is the theoretical complexity of the
C-construct. The more D-constructs and M-constructs are
connected to a C-construct, the greater is its theoretical
complexity (i.e. the greater is its Theoricity).
Quantitatively, within the DGM, Theoricity T can be quantified
in terms of the number of directed paths from the C-construct
to the M-constructs and their respective strengths. In some of
the paths, the D-constructs are also involved, which increases
their Theoricities. Theoricity T_c of the C-construct at node
c ∈ C (index c refers to C-constructs) is:
(equation: see pdf file)
The first term represents one-step paths from the models
(m ∈ M) to the C-construct. The second term represents two-step
paths through the D-construct (d ∈ D). Note here that s_c = 1.
The Theoricity of a model, needed in what follows as part of
the dynamic update rules for the DGM, is defined similarly, but
now s_c → s_m and inverted directed paths from the model to the
C- and D-constructs are counted (see Table 4).
2. Separability S measures the degree of dissimilarity between
the set of attributes associated with two C-constructs
represented by nodes c and c’. It is defined in regard to
attributions only, as in the prototype theories. Separability
is operationalised as a suitably normalised number of unshared
elements so that for fully differentiated concepts, S = 1,
while for similar concepts, S = 0, defined as:
(equation: see pdf file)
Here, the element A_ac represents attributions (i.e. the values
of attributes a, to be defined later) linked to C-construct c,
and N is a normalisation factor.
3. Utility U of the models in providing explanations depends on
the ratio of explained facts to the model’s Theoricity T. The
utility of model m ∈ M is defined as
(equation: see pdf file)
where e is an event in set E, and e’ is a node which groups
events together (see Figure 2). Dissonant (negative) links are
denoted by a’_ij. The first term in the sum represents direct
congruent paths to the models (i.e. explanations), the second
term direct dissonant paths, the third term two-step paths
through e’, and the last term two-step paths through
D-constructs. Parameter K ∈ [0,1] controls the effect of
dissonant links and the D-constructs on utility. If the model
explains most of the available evidence and its Theoricity T is
low, its utility is high. However, with a growing set of
evidence to explain, models with low Theoricity generally fail,
thereby reducing their utility. Utility is the basis for model
comparison and selection.
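As a rough illustration of definitions 1-3, the following Python sketch computes the three quantities for a toy graph. The defining equations are not reproduced in this text, so the functional forms below are assumptions chosen only to match the verbal descriptions: Theoricity as a strength-weighted count of one-step and two-step paths, Separability as the unshared share of attributions (with the normalisation N taken here as the total attribution of both concepts), and Utility as net explained evidence divided by Theoricity, leaving out the two-step paths through e’ and through D-constructs. The link direction and all function and variable names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Link = Tuple[str, str]  # directed link stored as (source, target); an assumption


def theoricity(c: str, models: Iterable[str], d_constructs: Iterable[str],
               strength: Dict[str, float], links: Set[Link]) -> float:
    """Strength-weighted count of paths m -> c and m -> d -> c (assumed form)."""
    models = list(models)
    d_constructs = list(d_constructs)
    # One-step paths: each model linked directly to the concept.
    one_step = sum(strength[m] for m in models if (m, c) in links)
    # Two-step paths: model -> D-construct -> concept.
    two_step = sum(strength[m] * strength[d]
                   for m in models for d in d_constructs
                   if (m, d) in links and (d, c) in links)
    return one_step + two_step


def separability(attr_c: Dict[str, float], attr_c2: Dict[str, float]) -> float:
    """Unshared share of non-negative attributions: 0 = identical, 1 = disjoint."""
    keys = set(attr_c) | set(attr_c2)
    unshared = sum(abs(attr_c.get(a, 0.0) - attr_c2.get(a, 0.0)) for a in keys)
    # Assumed normalisation N: total attribution carried by both concepts.
    norm = sum(attr_c.get(a, 0.0) + attr_c2.get(a, 0.0) for a in keys)
    return unshared / norm if norm > 0 else 0.0


def utility(explained: float, contradicted: float,
            theoricity_m: float, K: float) -> float:
    """Net explained evidence per unit of Theoricity (simplified, assumed form)."""
    if theoricity_m <= 0:
        raise ValueError("Theoricity of a model must be positive")
    # K in [0, 1] weights the dissonant (contradicting) evidence.
    net = max(0.0, explained - K * contradicted)
    return net / theoricity_m
```

Under this normalisation, two concepts that share one of three attributes, `separability({"a": 1, "b": 1}, {"b": 1, "c": 1})`, give S = 0.5. The sketch also shows the trade-off described in definition 3: a model with low Theoricity has high utility while it explains the evidence, and loses it as contradicted evidence accumulates.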
The updating rules for the node strengths determine the
dynamics of the graph and, consequently, the evolution of
quantities U, T, and S, which depend dynamically on the node
strengths. The updating rules also contain a memory effect, and
the new values s(e) with evidence e depend recursively on the
past values s(e-1), or s in shorthand. The updating rules are
defined as follows:
4. The update rule for strength s_m of M-constructs m is based
on Bayesian-type selection criteria (Koponen, 2013; Henderson
et al., 2010) and depends on the utilities of the models. Each
M-construct has a certain expected plausibility or probability
s_m, which is updated to s_m(e) when more evidence e in the form
of observations becomes available. The new plausibility with
evidence e is evaluated according to the Bayesian rule so that
if U_m(e) is the utility of model m when e is known, the model
strength is then updated to
(equation: see pdf file)
where the sum in the denominator ensures its normalisation.
Generally, the more complex C-constructs provide more
alternatives for M-constructs, so the Bayesian rule favours
“simple” M-constructs. However, this may change when
observations accumulate and more complex M-constructs explain
more. The initial conditions of the system are its prior
strengths and utilities, which must be deduced from available
empirical data (based e.g. on interviews).
5. Strength s_d with evidence e depends on the connections of
the D-construct to other D-constructs and M-constructs. The
update rule for it is
(equation: see pdf file)
where the first and the last sums take into account the fact
that dissonant connections reduce strength, while the second
sum takes into account the fact that connections to successful
M-constructs increase strength. The factor (1 - s_d) takes into
account the “memory effect”; the use of a given D-construct in
successful M-constructs increases the value of s_d, which does
not decrease again. This models the known effect that the
successful application of theoretical knowledge increases
confidence in that knowledge.
6. Attribute strengths s_a are updated by taking into account
one-step congruent (second sum) and dissonant (first sum) paths
from the models,
(equation: see pdf file)
where parameter K controls the weight of dissonant paths
(compare to Utility). Attributions A_ac are then defined on the
basis of attribute strengths,
(equation: see pdf file)
where the first term represents two-step paths, and the last
term, three-step paths from the attributes to the C-constructs
through the D-constructs (see Figure 2). Attributions serve as
a basis for calculating the Separability S of a given pair of
concepts (see definition 2 above).
7. The effect of communication on the dynamics of the DGM
operates through the strengths of the M-constructs. At each
step, where pairwise communication between students P and P’ is
assumed, the stronger M-constructs affect the weaker ones. This
is done by setting new values to the s_m and s’_m of P and P’,
respectively, so that the larger of the two values remains
unchanged, but the lower value increases by
C max{0, s_m - s’_m}, where C is the communication impact
factor. The update takes place before applying the Bayesian
rule in step 4. This means that the update rule can be
interpreted as learning by adopting better explanatory models
before comparing utilities.
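The update rules 4, 5 and 7 can be sketched in Python as follows. The update equations are not reproduced in this text, so the forms below are assumptions reconstructed from the verbal descriptions only: a normalised Bayesian-type update of model strengths weighted by utility, a D-construct update in which the factor (1 - s_d) makes growth saturate and never reverse, and a pairwise communication step in which the weaker of two model strengths moves towards the stronger one by C max{0, s_m - s’_m}. The function names, the dictionary representation, and the collapsing of the three sums of rule 5 into a single `support` and `dissonance` value are hypothetical simplifications.

```python
from typing import Dict, Tuple


def bayes_update(strength: Dict[str, float],
                 utility: Dict[str, float]) -> Dict[str, float]:
    """Assumed form: s_m(e) = U_m(e) s_m / sum over m' of U_m'(e) s_m'."""
    weighted = {m: utility[m] * strength[m] for m in strength}
    total = sum(weighted.values())
    if total == 0:
        return dict(strength)  # no model has any utility: keep the priors
    return {m: w / total for m, w in weighted.items()}


def update_d_strength(s_d: float, support: float, dissonance: float) -> float:
    """Memory effect: s_d grows by (1 - s_d) * net support and never decreases."""
    gain = min(1.0, max(0.0, support - dissonance))
    return s_d + (1.0 - s_d) * gain


def communicate(s_p: Dict[str, float], s_q: Dict[str, float],
                C: float) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Pairwise step: for each model, the lower strength rises by C * difference."""
    # Both updates use the strengths from before the exchange.
    new_p = {m: s_p[m] + C * max(0.0, s_q[m] - s_p[m]) for m in s_p}
    new_q = {m: s_q[m] + C * max(0.0, s_p[m] - s_q[m]) for m in s_q}
    return new_p, new_q


if __name__ == "__main__":
    # Hypothetical strengths for a simple model M1 and an advanced model M4.
    p = {"M1": 0.2, "M4": 0.8}   # theoretically well-guided learner P
    q = {"M1": 0.9, "M4": 0.1}   # less advanced learner P'
    p, q = communicate(p, q, C=0.75)
    # Communication comes first; the Bayesian step then renormalises.
    q = bayes_update(q, {"M1": 1.0, "M4": 2.0})
    print(p, q)
```

After the communication step the strengths of one learner need not sum to one; in this sketch the subsequent Bayesian step restores normalisation, which matches the stated order of the two updates.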
The update rules for the
strengths of the M- and D-constructs drive the dynamic evolution
of the graph. Table 4 summarises these strengths and other
quantities defined in 1-6.
Table 4
Definitions of quantities T, S and U, as well as update rules
for node strengths s, are given for D- and M-constructs and
their attributes. The attributions of the C-construct are given
by A_ac, and the separability of c and c’ by S_cc’. Subscripts
c, m, d and a denote C-, M- and D-constructs and attributes,
respectively. Congruent links between i and j are denoted by
a_ij = 1, while for dissonant links, a’_ij = -1.
Name | Symbol | Definition
Theoricity | T_k | (see pdf file)
Strength of D | s_d | (see pdf file)
Utility | U_m | (see pdf file)
Strength of M | s_m | (see pdf file)
Strength of a | s_a | (see pdf file)
Attribution of C | A_ac | (see pdf file)
Separability | S_cc’ | (see pdf file)