Tuesday, January 20, 2015

Misconceptions Never Die. They Just Fade Away.

In a post on my precision principle, I made a fairly humdrum observation about a typical elementary-level geometry question:

Why can we so easily figure out the logics that lead to the incorrect answers? It seems like a silly question, but I mean it to be a serious one. At some level, this should be a bizarre ability, shouldn't it? . . . . The answer is that we can easily switch back and forth between different "versions" of the truth.

What happened next of course is that researchers Potvin, Masson, Lafortune, and Cyr, having read my blog post, decided to go do actual serious academic work to test my observation. And they seem to agree--non-normative 'un-scientific' conceptions about the world do not go away. They share space in our minds with "different versions of the truth." (I may be misrepresenting the authors' inspirations and goals for their research somewhat.)

The Test

Participants in the study were 128 14- and 15-year-olds. They were given several trials involving deciding which of two objects "will have the strongest tendency to sink if it were put in a water tank." The choices for the objects were pictures of balls (on a computer), each made of one of 3 different materials: lead, wood, or "polystyrene (synthetic foam material)" and having one of 3 different sizes: small, medium, or large. The trials were categorized from "very intuitive" to "very counter-intuitive" as shown in the figure from the paper at the right.
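The paper's categorization figure isn't reproduced here, but the logic behind the categories can be sketched. What follows is my reconstruction, not the authors' exact scheme (and it collapses their five levels into three): a trial is intuitive when the size cue points to the same answer as density, and counter-intuitive when the two conflict.

```python
# My reconstruction of the trial categorization—not the authors' exact
# scheme, and collapsed from their five levels to three. Sinkability
# tracks density; the misconception says it tracks size.
DENSITY = {"lead": 3, "wood": 2, "polystyrene": 1}  # ordinal, densest first
SIZE = {"small": 1, "medium": 2, "large": 3}

def classify(mat_a, size_a, mat_b, size_b):
    """Label a two-ball trial by whether the size cue agrees with the
    correct, density-based answer (assumes different materials)."""
    denser_is_a = DENSITY[mat_a] > DENSITY[mat_b]
    larger_is_a = SIZE[size_a] > SIZE[size_b]
    if SIZE[size_a] == SIZE[size_b]:
        return "neutral"            # size offers no (mis)cue
    if denser_is_a == larger_is_a:
        return "intuitive"          # the denser ball is also the bigger one
    return "counter-intuitive"      # the denser ball is the smaller one

print(classify("lead", "large", "polystyrene", "small"))  # intuitive
print(classify("lead", "small", "polystyrene", "large"))  # counter-intuitive
```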

Instead of concerning themselves with whether answers were correct or incorrect, however (most of the students got above 90% correct), the authors were interested in the time it took students to complete trials in the different categories. The theory behind this is simple: if students took longer to complete the "counter-intuitive" trials than the "intuitive" ones, it may be because the greater-size-greater-sinkability misconception was still present.

Results

Not only did counterintuitive trials take longer, but trials that were more counterintuitive took longer than those that were less counterintuitive. The mean reaction times in milliseconds for trials in the 5 categories from "very intuitive" to "very counter-intuitive" were 716, 724, 756, 784, and 804. This spectrum of results is healthy evidence in favor of the continued presence of the misconception(s).
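The predicted trend is easy to check with a few lines. The numbers below are the means reported in the paper; the check itself is just arithmetic:

```python
# Mean reaction times (ms) reported in the paper, ordered from
# "very intuitive" to "very counter-intuitive" trial categories.
mean_rts = [716, 724, 756, 784, 804]

# The interference account predicts a monotonic increase: the more a trial
# conflicts with the bigger-sinks-more intuition, the longer the response.
is_monotonic = all(a < b for a, b in zip(mean_rts, mean_rts[1:]))
print(is_monotonic)                # True

# Total slowdown from the most intuitive to the most counter-intuitive:
print(mean_rts[-1] - mean_rts[0])  # 88 ms
```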

So why doesn't the sheer force of the counterintuitive idea overwhelm students into answering incorrectly? The answer might be inhibition—i.e., being able to suppress "intuitive interference" (their "gut reaction"):

[Lafortune, Masson, & Potvin (2012)] concluded that inhibition is most likely involved in the explanation of the improvement of answers as children grow older (ages 8–14). Other studies that considered accuracy, reaction times, or fMRI data . . . . concluded that inhibition could play an important role in the production of correct answers when anterior knowledge could potentially interfere. The idea that there is a role for the function of inhibition in the production of correct answers is, in our opinion, consistent with the idea of persistence of misconceptions because it necessarily raises the question of what it is that is inhibited.

Further analysis in this study, which cites literature on "negative priming," shows that inhibition is a good explanation for the increased cognitive effort that led to higher reaction times in the more counterintuitive trials.

So, What's the Takeaway?

In my post on the precision principle, my answer wasn't all that helpful: "accuracy within information environments should be maximized." The authors of this study are much better:

There are multiple perspectives within this research field. Among them, many could be associated with the idea that when conceptual change occurs, initial conceptions ". . . cannot be left intact."

Ohlsson (2009) might call this category "transformation-of-previous-knowledge" (p.20), and many of the models that belong to it can also be associated to the "classical tradition" of conceptual change, where cognitive conflict is seen as an inevitable and preliminary step. We believe that the main contribution of our study is that it challenges some aspects of these models. Indeed, if initial conceptions survive learning, then the idea of "change", as it is understood in these models, might have to be reconsidered. Since modifications in the quality of answers appear to be possible, and if initial conceptions persist and coexist with new ones, then learning might be better explained in terms of "reversal of prevalence" then [sic] in terms of change (Potvin, 2013).

This speaks strongly to the idea of exposing students' false intuitions so that their prevalence may be reversed (a "20%" idea, in my opinion). But it also carries the warning—which the researchers acknowledge—that we should be careful about what we establish as "prevalent" in the first place (an "80%" idea):

Knowing how difficult conceptual change can sometimes be, combined with knowing that conceptions often persist even after instruction, we believe our research informs educators of the crucial importance of good early instruction. The quote "Be very, very careful what you put in that head because you will never, ever get it out" by Thomas Woolsey (1471–1530) seems to be rather timely in this case, even though it was written long ago. Indeed, there is no need to go through the difficult process of "conceptual changes" if there is nothing to change.

This was closer to my meaning when I wrote about maximizing accuracy within information environments. There is no reason I can see to simply resign ourselves to the notion that students must have misconceptions about mathematics. What this study tells us is that once those nasty interfering intuitions are present, they can live somewhat peacefully alongside our "scientific" conceptions. It does not say that we must develop a pedagogy centered around an inevitability of false intuitions.

What's your takeaway?



Potvin, P., Masson, S., Lafortune, S., & Cyr, G. (2014). Persistence of the Intuitive Conception that Heavier Objects Sink More: A Reaction Time Study with Different Levels of Interference. International Journal of Science and Mathematics Education, 13(1), 21–43. DOI: 10.1007/s10763-014-9520-6

Sunday, January 11, 2015

Common Core Is Good Because of 'Common'

To the point, this video is still at the top of my 'Common Core' pile, because it highlights what I consider to be the most important argument for the standards: just being on the same page.

  1. Interview with Gates Foundation Education Director Vicki Phillips about the CCSS at the Aspen Ideas Festival. (See 2:38+.)

I'm seeing this firsthand online in conversations among teachers and product development professionals. For the first time, we're on the same page. That doesn't mean we agree--that's not what "being on the same page" has to mean. It just means in this case that we're literally looking at the same document. And that's a big deal.

(Speaking of agreement, to be honest, I'd like to see more 'moderate traditionalist' perspectives in education online and elsewhere speak in support of the Common Core. There's no rock-solid evidentiary reason why the 'No Telling' crowd should be completely owning the conversation around the CCSS. The 8 Practice Standards are no less methodologically agnostic than the content standards, unless one assumes (very much incorrectly, of course) that it's difficult for a teacher to open his mouth and directly share his awesome 'expert' knowledge of a content domain without simultaneously demanding cognitive compliance from students. And finally, politically, the national standards movement suffers when it becomes associated with more radical voices.)

Years ago, as I was formulating for myself what eventually became these principles of information design, I was originally somewhat firm on including what I called the "boundary principle" (I'm not good at naming things). This was motivated by my perception at the time (2007, I think) that in any argument about education, there was no agreed-upon way to tell who was right. And so the 'winner' was the idea that was said the loudest or the nicest or with the most charisma, or that squared best with common wisdom and common ignorance, or that had the most money or visibility behind it.

The boundary principle, then, was just my way of saying to myself that none of this should be the case--that even though we need to have arguments (maybe even silly ones from time to time), we need to at least agree that this or that is the right room for the arguments. I think the Common Core can give us that room.

The Revolutionary War Is Over

It is painful to read about people who think that the Common Core Standards are a set of edicts foisted on schools by Bill Gates and Barack Obama. But I get it. And, honestly, I see it as the exact same sentiment as the one that tells us that a teacher's knowledge and a student's creativity are mutually exclusive and opposing forces. That sentiment is this: we hate experts.

But that "hatred" is just a matter of perception, as we all know. We can choose to hear the expert's voice as just another voice at the table (one with a lot of valuable experience and knowledge behind it)--as a strong voice from a partner in dialogue--or we can choose to hear it as selfish and tyrannical. And in situations where we are the experts, we can make the same choice.

I want to choose to see strong and knowledgeable people and ideas as a part of the "common" in education.


Tuesday, December 30, 2014

Education-Ish Research

Veteran education researcher Deborah Ball and her co-author Francesca Forzani provide some measure of validation for many educators' frustrations, disappointments, and disaffections with education research. In a paper titled "What Makes Education Research 'Educational'?" published in December 2007, Ball and Forzani point to education research's tendency to focus on "phenomena related to education," rather than "inside educational transactions":

In recent years, debates about method and evidence have swamped the discourse on education research to the exclusion of the fundamental question of what constitutes education research and what distinguishes it from other domains of scholarship. The panorama of work represented at professional education meetings or in publications is vast and not highly defined. . . Research that is ostensibly "in education" frequently focuses not inside the dynamics of education but on phenomena related to education—racial identity, for example, young children's conceptions of fairness, or the history of the rise of secondary schools. These topics and others like them are important. Research that focuses on them, however, often does not probe inside the educational process.

Certainly many of us have read terrible "studies" that are, in fact, "inside education," as we might intuitively understand that term—they are situated in classrooms, they focus on students or teachers or content, etc. Nevertheless, Ball and Forzani make an important point, and the consequences of ignoring problems "inside education" may already be playing out:

Until education researchers turn their attention to problems that exist primarily inside education and until they develop systematically a body of specialized knowledge, other scholars who study questions that bear on educational problems will propose solutions. Because such solutions typically are not based on explanatory analyses of the dynamics of education, the education problems that confront society are likely to remain unsolved.

Us Laypeople

Here is a key point from the introduction to the paper. And although the authors do not explicitly link this point to their criticism of education research, I see no reason to consider the two to be unrelated:

One impediment is that solving educational problems is not thought to demand special expertise. Despite persistent problems of quality, equity, and scale, many Americans seem to believe that work in education requires common sense more than it does the sort of disciplined knowledge and skill that enable work in other fields. Few people would think they could treat a cancer patient, design a safer automobile, or repair a bridge, for these obviously require special skill and expertise. Whether the challenge is recruiting teachers, motivating students to read, or improving the math curriculum, however, many smart people think they know what it takes. Because schooling is a common experience, familiarity masks its complexity. Powell (1980), for example, referred to education as a "fundamentally uncertain profession" about which the perception exists that ingenuity and art matter more than professional knowledge. Yet the fact that educational problems endure despite repeated efforts to solve them suggests the fallacy of this reliance on common sense.

Ball and Forzani here accurately describe the environment in which many of our discussions of and debates about education take place. Instruction itself is shielded from our view by ideas—some of which may indeed be correct—that are too often based on common-sense notions about education. As a result, good questions and reasoned arguments that challenge fundamental assumptions about instruction are brushed aside without consideration.

Keith Devlin makes a point similar to that put forward by Ball and Forzani at the end of his September 2008 article:

While most of us would acknowledge that, while we may fly in airplanes, we are not qualified to pilot one, and while we occasionally seek medical treatment, we would not feel confident diagnosing and treating a sick patient, many people, from politicians to business leaders, and now to bloggers, feel they know best when it comes to providing education to our young, based on nothing more than their having themselves been the recipient of an education.

One may presume, given that Ball and Forzani and then Devlin ascribe this common-sense view of education to "many Americans" or to "politicians, business leaders, and bloggers," that these authors consider, or are justified in considering, education researchers or teachers or other education professionals to be immune from similar assumptions and common-sense notions. Of course, they don't and aren't.

Thus, if education researchers are as susceptible as the rest of us to a "common sense-y" view of instruction impervious to reasoned probing, this may explain, in part, Ball and Forzani's criticism of education research as dealing with questions "related to" education rather than questions "inside" education. Many researchers may simply avoid questions inside education because they believe that their common sense has already answered them.

Asking Students to Ask Tough Questions Is Comfortable. Now You Try It.

Here Ball and Forzani expand their criticism of education research, pointing to a lack of good research that not only looks at teachers, students, or content but also at the interactions among these three:

Education research frequently focuses not on the interactions among teachers, learners, and content—or among elements that can be viewed as such—but on a particular corner of this dynamic triangle. Researchers investigate teachers' perceptions of their job or their workplace, for example, or the culture in a particular school or classroom. Many excellent studies focus on students and their attitudes toward school or their beliefs about a particular subject area. Scholars analyze the relationships between school funding and student outcomes, investigate who enrolls in private schools, or conduct international comparisons of secondary school graduation requirements. Such studies can produce insights and information about factors that influence and contribute to education and its improvement, but they do not, on their own, produce knowledge about the dynamic transactions central to the process we call education.

And their critique of the now-famous Tennessee class-size study illustrates clearly this further refinement of the authors' concept of research "inside education":

Finn and Achilles (1990) investigated whether smaller classes positively affected student achievement in comparison with larger classes. . . . The results suggest that reducing class size affected the instructional dynamic in ways that were productive of improved student learning. The study did not, however, explain how this worked. Improvement might have occurred because teachers were able to pay more attention to individual students. Would the same have been true if the teachers had not known the material adequately? Would reduced class size work better for students at some ages than at others, or better in some subjects than in others?



Reference:
Ball, D., & Forzani, F. (2007). 2007 Wallace Foundation Distinguished Lecture—What Makes Education Research "Educational"? Educational Researcher, 36(9), 529–540. DOI: 10.3102/0013189X07312896

Image credit: ©dzingeek

Text Coherence and Self-Explanation

The authors of the paper (full text) I will discuss here, Ainsworth and Burcham, follow the lead of many researchers, including Danielle McNamara (2001) (full text), in conceiving of text coherence as "the extent to which the relationships between the ideas in a text are explicit." In addition to this conceptualization, the authors also adopt guidelines from McNamara, et al. (1996) to improve the coherence of the text used in their experiment—a text about the human circulatory system. These guidelines essentially operationalize the meaning of text coherence as understood by many of the researchers examining it:

(1) Replacing a pronoun with a noun when the referent was potentially ambiguous (e.g., replacing 'it' with 'the valves'). (2) Adding descriptive elaborations to link unfamiliar concepts with familiar ones and to provide links with previous information presented in the text (e.g., replacing 'the ventricles contract' with 'the ventricles (the lower chambers of the heart) contract'). (3) Adding connectives to specify the relation between sentences (e.g., therefore, this is because, however, etc.).

Maximal coherence at a global level was achieved by adding topic headers that summarised the content of the text that followed (e.g., 'The flow of the blood to the body: arteries, arterioles and capillaries') as well as by adding macropropositions which linked each paragraph to the overall topic (e.g., 'a similar process occurs from the ventricles to the vessels that carry blood away from the heart').
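The sentence-level guidelines are mechanical enough to sketch as string rewrites. The toy function below is mine, not the authors'; it hard-codes the two example repairs quoted above, and guideline (3), adding connectives, operates between sentences and isn't modeled:

```python
import re

def repair(sentence):
    """Toy coherence repair hard-coding the quoted examples:
    (1) replace the ambiguous pronoun 'it' with its referent, and
    (2) elaborate an unfamiliar concept with a familiar gloss."""
    # Rule (1): pronoun -> noun, using a word boundary so 'it' inside
    # other words is left alone.
    sentence = re.sub(r"\bit\b", "the valves", sentence)
    # Rule (2): add a descriptive elaboration for the unfamiliar concept.
    sentence = sentence.replace(
        "the ventricles contract",
        "the ventricles (the lower chambers of the heart) contract")
    return sentence

low_coherence = "Blood cannot flow back through it because the ventricles contract."
print(repair(low_coherence))
# Blood cannot flow back through the valves because the ventricles
# (the lower chambers of the heart) contract.
```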

Many studies have found that improving text coherence (i.e., improving the "extent to which the relationships between the ideas in the text are made explicit") can improve readers' memory for the text. Ainsworth and Burcham mention several in their paper, including studies by Kintsch and McKeown and even the study by Britton and Gülgöz that I wrote up here.

What Britton and Gülgöz find is that when "inference calls"—locations in text that demand some kind of inference from the reader—are "repaired," subjects' recall of a text improves significantly over that of a control group. These results sum up the advantage seen across the literature: although there are few, if any, simple, straightforward, unimpeachable results in the small collection of text-coherence studies, researchers consistently find that "making the learner's job easier"—making the text more coherent—provides for significant improvement in readers' learning from that text.

Self-Explanation

In some sense, the literature on self-explanation tells a different story from the one that emerges from the text-coherence research. Ainsworth and Burcham define self-explanation in this way:

A self-explanation (shorthand for self-explanation inference) is additional knowledge generated by learners that states something beyond the information they are given to study.

The authors then go on to describe some advantages offered to readers by the self-explanation strategy, according to research:

Self-explanation can help learners actively construct understanding in two ways; it can help learners generate appropriate inferences and it can support their knowledge revision (Chi, 2000). If a text is in someway [sic] incomplete . . . then learners generate inferences to compensate for the inadequacy of the text and to fill gaps in the mental models they are generating. Readers can fill gaps by integrating information across sentences, by relating new knowledge to prior knowledge or by focusing on the meaning of words. Self-explaining can also help in the process of knowledge revision by providing a mechanism by which learners can compare their imperfect mental models to those being presented in the text.

So, whereas text coherence advantages learners by "repairing" (i.e., removing) inferences, self-explanation often produces gains even when—and perhaps especially when—text remains minimally coherent.

Thus, on the one hand, a comprehensive—though shallow—read of the text coherence literature tells us that improved text comprehension can be achieved by "repairing" text incoherence—by closing informational gaps in text. On the other hand, research shows that significant improvements in learning from text can come from employing a strategy of self-explanation during reading—a method that practically feeds off textual incoherence.

What shall we make of this? Which is more important—text coherence or self-explanation? And how do they (or can they) interact, if at all? These are the questions Ainsworth and Burcham attempt to address in their experiment.

The Experiment

Is maximally or minimally coherent text more beneficial to learning when accompanied by self-explanations? Two alternative hypotheses are proposed:

  1. The minimal text condition when accompanied by self-explanation training will present the optimal conditions for learning. Minimal text is hypothesized to increase self-explaining, and self-explanation is known to improve learning. Consequently, low knowledge learners who self-explain will not only be able to overcome the limitations of less coherence but will actively benefit from it as they will have a greater chance to engage in an effective learning strategy.
  2. Maximally coherence [sic] text accompanied by self-explanation will present the optimal condition for learning. Although maximal text is hypothesized to result in less self-explanation than minimal text, when learners do self-explain they will achieve the benefits of both text coherence and self-explanation.

Forty-eight undergraduate students were randomly separated into four groups, each of which was assigned either a maximally coherent text (Max) or a minimally coherent text (Min) about the human circulatory system. Each group was also given either self-explanation training (T) or no training at all (NT).

All forty-eight students completed a pretest on the subject matter, read their assigned text using self-explanation or not, and then completed a posttest, which was identical to the pretest. The results for each of the four groups are shown below (the posttest results have been represented using bars, and the pretest results have been represented using line segments).

The pretest and matching posttest each had three sections, as shown at the left by the sections of each of the bars. Each of these sections comprised different kinds of questions, but all of the questions assessed knowledge of the textbase, which "contains explicit propositions in the text in a stripped-down form that captures the semantic meaning."

As you can see, each of the four groups improved dramatically from pretest to posttest, and those subjects who read maximally coherent text (Max) performed slightly better overall than those who read minimally coherent text (Min), no matter whether they used self-explanation during reading (T) or not (NT). However, the effect of text coherence was not statistically significant for any of the three sections of the tests. Self-explanation, on the other hand, did produce significant results, with self-explainers scoring significantly higher on two of the three sections than non–self-explainers.

In addition to the posttest, subjects also completed a test comprised of "implicit questions" and one comprised of "knowledge inference questions" at posttest only. The results for the four groups on these two tests are shown below.

Each of these two tests assessed students' situation models: "The situation model (sometimes called the mental model) is the referential mental world of what the text is about." The researchers found that self-explainers significantly outperformed non–self-explainers on both tests. Those who read maximally coherent text also outperformed their counterparts (readers given minimally coherent text) on both tests. However, this effect was significant for only one of the tests, and approached significance for the other test (p < 0.08).

Analysis

If we stopped here, we would be justified in concluding that hypothesis (1) was the winner. It would seem that self-explanation has a more robust positive effect on learning outcomes than does text coherence. And since the literature tells us that minimally coherent text produces a greater number of self-explanations than does maximally coherent text, minimizing text coherence would seem desirable for improving learning.

Luckily, Ainsworth and Burcham went further. They coded the types of self-explanations made by participants and analyzed each as it correlated with posttest scores. While they did find that students who read minimally coherent text produced significantly more self-explanations, they also noted this:

Whilst using a self-explanation strategy resulted in an increase in post-test scores for the self-explanation conditions compared to non self-explanation controls, there was no significant correlation within the self-explanation groups between overall amount of self-explanation and subsequent post-test performance. Rather, results suggest that it is specific types of self-explanations that better predict subsequent test scores.

In particular, for this study, "principle-based explanations" ("[making] reference to the underlying domain principles in an elaborated way"), positive monitoring ("statements indicating that a student . . . understood the material"), and paraphrasing ("reiterating the information presented in the text") were all significantly positively related to total posttest scores, though only the first of those was considered a real "self-explanation."

Now, each of those correlations seems pretty ridiculous. They all seem to point in one way or another to the completely unsurprising conclusion that understanding a text pretty well correlates highly with doing well on assessments about the text.

What is interesting, however, is the researchers' observation that the surplus of self-explanations in the "minimal" groups could be accounted for primarily by three other types of self-explanation, none of which, in and of themselves, showed a significant positive correlation with total posttest scores: (1) goal-driven explanations ("an explanation that inferred a goal to a particular structure or action"), (2) elaborative explanations ("inferr[ing] information from the sentence in an elaborated manner"), and (3) false self-explanations (self-explanations that were inaccurate).

To put this in perspective, there were only two other types of "self-explanation" coded that I did not mention here. Out of the remaining six, three showed no significant positive correlations with posttest scores (or, in the case of false self-explanations, a significant negative correlation), yet those were the self-explanations that primarily accounted for the significant difference between the minimal and maximal groups.

Or, to put it much more simply, the minimal groups had significantly more self-explanations, but those self-explanations were, in general, either ineffective at raising posttest scores or actually harmful to those scores. It is possible that the significant positive main effect for self-explanation in the study could, in fact, have been greatly helped along by the better self-explanations present in the maximal groups. All of this leads to this conclusion from the researchers:

This study suggests that rather than designing material, which, by its poverty of coherence, will drive novice learners to engage in sense-making activities in order to achieve understanding, we should design well-structured, coherent material and then encourage learners to actively engage with the material by using an effective learning strategy.



Reference:
Ainsworth, S., & Burcham, S. (2007). The impact of text coherence on learning by self-explanation. Learning and Instruction, 17(3), 286–303. DOI: 10.1016/j.learninstruc.2007.02.004


Monday, December 29, 2014

Inference Calls in Text


Britton and Gülgöz (1991) conducted a study to test whether removing "inference calls" from text would improve retention of the material. Inference calls are locations in text that demand inference from the reader. One simple example from the text used in the study is below:

Air War in the North, 1965
By the Fall of 1964, Americans in both Saigon and Washington had begun to focus on Hanoi as the source of the continuing problem in the South.

There are at least a few inferences that readers need to make here. Readers need to infer the temporal link between "the Fall of 1964" and "1965," they are asked to infer that "North" in the title refers to North Vietnam, and they need to infer that "Hanoi" refers to the capital of North Vietnam.

The authors of the study identified 40 such inference calls (using the "Kintsch" computer program) throughout the text and "repaired" them to create a new version called a "principled revision." Below is their rewrite of the text above, which appeared in the principled revision:

Air War in the North, 1965
By the beginning of 1965, Americans in both Saigon and Washington had begun to focus on Hanoi, capital of North Vietnam, as the source of the continuing problem in the South.

Two other versions (revisions), the details of which you can read about in the study, were also produced. These revisions acted as controls in one way or another for the original text and the principled revision.
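One crude way to see what the principled revision buys is to count the content words a stretch of text shares with its neighbors, since repaired inference calls make those links explicit on the page. The overlap metric below is my own illustration, not the study's "Kintsch"-based analysis:

```python
# Crude link-counting illustration (mine, not the study's analysis):
# content words shared with the title act as a proxy for the explicit
# links that "repairing" an inference call supplies.
STOP = {"the", "a", "an", "of", "in", "to", "and", "by", "on", "had"}

def content_words(text):
    return {w.strip(".,").lower() for w in text.split()} - STOP

def overlap(prev, nxt):
    """Content words two stretches of text share."""
    return content_words(prev) & content_words(nxt)

title = "Air War in the North, 1965"
original = "By the Fall of 1964, Americans had begun to focus on Hanoi."
revised = ("By the beginning of 1965, Americans had begun to focus on "
           "Hanoi, capital of North Vietnam.")

print(overlap(title, original))  # set(): the reader must supply the links
print(overlap(title, revised))   # shares '1965' and 'north': links explicit
```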

Method and Predictions

One hundred seventy college students were randomly assigned one of the four texts--the original or one of the three revisions. The students were asked to read the texts carefully and were informed that they would be tested on the material. Eighty subjects took a free recall test, in which they were asked to write down everything they could remember from the text. The other ninety subjects took a ten-question multiple-choice test on the information explicitly stated in each text.

It's not at all difficult, given this setup, to anticipate the researchers' predictions:

We predicted that the principled revision would be retrieved better than the original version on a free-recall test. This was because the different parts of the principled revision were more likely to be linked to each other, so the learner was more likely to have a retrieval route available to use. . . . Readers of the original version would have to make the inferences themselves for the links to be present, and because some readers will fail to make some inferences, we predicted that there would be more missing links among readers of this version.

This is, indeed, what researchers found. Subjects who read the principled revision recalled significantly more propositions from the text (adjusted mean = 58.6) than did those who read the original version (adjusted mean = 35.5). Researchers' predictions for the multiple-choice test were also accurate:

On the multiple-choice test of explicit factual information that was present in all versions, we predicted no advantage for the principled revision. Because we always provided the correct answer explicitly as one of the multiple choices, the learner did not have to retrieve this information by following along the links but only had to test for his or her recognition of the information by using the stem and the cue that was presented as one of the response alternatives. Therefore, the extra retrieval routes provided by the principled revision would not help, because according to our hypothesis, retrieval was not required.

Analysis and Principles

Neither of the two results mentioned above is surprising, but the latter is interesting. Although we might say that students "learned more" from the principled revision, subjects in the original and principled groups performed equally well on the multiple-choice test (which tests recognition, as opposed to free recall). As the researchers noted, this result was likely due to the fact that repairing the inference calls provided no advantage to the principled group in recognizing explicit facts, only in connecting ideas in the text.

But the result also suggests that students who were troubled by inference calls in the text just skipped over them. Indeed, subjects who read the original text read at about the same rate, and spent about the same total time, as subjects who read the principled revision. Yet students who read the original text recalled significantly less than those who read the principled revision.

In repairing the inference calls, the authors of the study identified three principles for better texts (all of which are practically the opposite of "less helpful"):

Principle 1: Make the learner's job easier by rewriting the sentence so that it repeats, from the previous sentence, the linking word to which it should be linked. Corollary of Principle 1: Whenever the same concept appears in the text, the same term should be used for it.

Principle 2: Make the learner's job easier by arranging the parts of each sentence so that (a) the learner first encounters the old part of the sentence, which specifies where that sentence is to be connected to the rest of his or her mental representation; and (b) the learner next encounters the new part of the sentence, which indicates what new information to add to the previously specified location in his or her mental representation.

Principle 3: Make the learner's job easier by making explicit any important implicit references; that is, when a concept that is needed later is referred to implicitly, refer to it explicitly if the reader may otherwise miss it.


Reference
Britton, B., & Gülgöz, S. (1991). Using Kintsch's computational model to improve instructional text: Effects of repairing inference calls on recall and cognitive structures. Journal of Educational Psychology, 83(3), 329–345. DOI: 10.1037//0022-0663.83.3.329


Do We Have a "Bullseye" in Education?

The following is a brief interview Bill Gates did on the Daily Show in 2010. There is an interesting exchange about education starting at about 3:25.

Interview from January 25, 2010. See 3:25 for education discussion.

I've highlighted some of the more interesting comments to me in the transcript below.

Stewart:
Why is it so difficult to get change in the educational system in our country? That seems to be one of the most intractable systems, either because of the boards that are there or the unions or the—what is it about our education system that makes it so difficult to reform?

Gates:
Well, until recently there was no room for experimentation. And charter schools came in—although they're only a few percent of the schools—and they tried out new models. And a lot of those have worked. Not all of them. But that format showed us some very good ideas, and among those ideas is that you measure teachers, you give them more feedback. And--but people are afraid you'd put in a system that will fire the wrong person or have high overhead, and that's a legitimate fear. So actually having some districts where it works and then getting the 90% of the teachers who liked it, who thrived, who did improve to share that might allow us to switch—not have capricious things but really help people get better.

Stewart:
But don't public things like schools and medical care need to have the power to fail, need to fire the wrong person every now and again? It's never going to be perfect. Aren't people's expectations of what it's supposed to be so precious that you never get change in the positive direction?

Gates:
That's right. But you have to have a measure. And it's very tough to agree on a measure. You know, right now the health system rewards the person who just does more treatment, so it's quantity of output, not the kind of preventative care and measuring and saying, "Okay, you do that well." Or, "You teach this kid really well." We haven't been able to agree on that. And without that it's a problem.

Jon's comment—or question, rather—about the education system's lacking the power to fail struck me as being similar to something I wrote here (apologies for being so gauche as to quote myself):

Education seems unable to help but vacillate between its skepticism, which holds every idea (or none of them) to be right, and its particularism, which holds all of its own ideas to be right. This inability, in the end, makes it nearly impossible for education to decide before the fact that something can be wrong.

Okay, So, Less Philosophical

To make the similarity less philosophical, I can think of an analogy involving these two dartboards. Using the dartboard on the right—a typical dartboard—we obviously do have the power to fail if our goal is to hit the bullseye. Using the dartboard on the left, we don't really have the power to fail—not because every throw will be considered a bullseye, but because we have not set out ahead of time what failure and success mean.

An important question we wrestle with in education, specifically with regard to instruction, is What kind of dartboard are we throwing at? Can we explain, before ever throwing a dart, what it means to hit the bullseye and how to get closer to it? If so, then we're throwing to the right; if not, then we're throwing to the left.

It seems right—er, correct—to say no, we can't really describe "bullseye" instruction before we deliver it (particularism) or at all (skepticism), because every student learns differently, there are multiple ways of delivering the same content, etc. For what seems like the same reasons, we can't really describe "bullseye" ice cream flavors or "bullseye" back massages. In other words, when it comes to instruction, the dartboard on the left seems to be the most appropriate.

Jon challenges this notion by asking, "But don't public things like schools and medical care need to have the power to fail, . . .? It's never going to be perfect. Aren't people's expectations of what it's supposed to be so precious that you never get change in the positive direction?" For education—specifically, for instruction—shouldn't we be using the dartboard on the right, not the one on the left? Shouldn't we have the courage to draw the bullseye somewhere, even if we know that we will sometimes unfairly exclude some good instruction and unfairly include some bad instruction? I would say yes.

Gates responds: "That's right. But you have to have a measure. And it's very tough to agree on a measure." Or, using the dartboard analogy, we must have a way to decide exactly where to draw the circles, including the bullseye.

I agree that it is difficult to agree on a measure and that quality of instruction is non-quantifiable, but I disagree that we should be looking for something so narrow as a measure (or even group of measures) or something necessarily quantifiable in the first place. What we should be looking for first are clear, specific, acceptable principles of instructional quality.


Friday, December 26, 2014

Can We Know the Good from the Bad in Education?

In a beautiful article written nearly 10 years ago and titled One Side Can Be Wrong, Richard Dawkins and Jerry Coyne rather tidily do away with what had then become the "teach the controversy" argument for intelligent design creationism--the notion that IDC should be taught in science classrooms because it offers an alternative to the theory of evolution by natural selection as an explanation for the origins of different species on Earth.

As the authors point out, their stance against "teach the controversy" seems counterintuitively closed-minded, but is demanded of them by the evidence--or, rather, lack of evidence:

So, why are we so sure that intelligent design is not a real scientific theory, worthy of "both sides" treatment? Isn't that just our personal opinion? It is an opinion shared by the vast majority of professional biologists, but of course science does not proceed by majority vote among scientists. Why isn't creationism (or its incarnation as intelligent design) just another scientific controversy . . .? Here's why.

If ID really were a scientific theory, positive evidence for it, gathered through research, would fill peer-reviewed scientific journals. This doesn't happen. It isn't that editors refuse to publish ID research. There simply isn't any ID research to publish. Its advocates bypass normal scientific due process by appealing directly to the non-scientific public and--with great shrewdness--to the government officials they elect.

Intelligent design creationists theorize that "certain features of the universe and of living things are best explained by an intelligent cause," yet they have never produced any positive evidence for this intelligent cause. It is this lack of evidence--not its character as an alternative explanation--which precludes IDC from acceptance in scientific circles and from "both sides" consideration. As Dawkins and Coyne note in the article linked above, alternative explanations based on actual evidence abound within evolutionary science and are thus far more worthy of debate than is IDC.

Methodists, Particularists, and Apples

Yet some may argue that while it may be true that IDC is unscientific, it does not follow from that observation alone that it is wrong. And, indeed, Dawkins and Coyne make no such claim explicitly in the article. Instead (again, one may argue), the authors simply hold up IDC to certain criteria of philosophical empiricism--that knowledge is derived from sense experience in and reasoning about the natural world--and then describe how the theory fares (not well).

Philosopher Roderick Chisholm categorized empiricism of this variety as a form of what he termed "methodism"--one of three possible solutions to the problem of distinguishing what is true from what is not {1}:

(A) What do we know? What is the extent of our knowledge? (B) How are we to decide whether we know? What are the criteria of our knowledge?

If you happen to know the answers to the first of these pairs of questions, you may have some hope of being able to answer the second. Thus, if you happen to know which are the good apples and which are the bad ones, then maybe you could explain to some other person how he could go about deciding whether or not he has a good apple or a bad one. But if you don't know the answer to the first of these pairs of questions--if you don't know what things you know or how far your knowledge extends--it is difficult to see how you could possibly figure out an answer to the second.

On the other hand, if, somehow, you already know the answers to the second of these pairs of questions, then you may have some hope of being able to answer the first. Thus, if you happen to have a good set of directions for telling whether apples are good or bad, then maybe you can go about finding a good one--assuming, of course, that there are some good apples to be found. But if you don't know the answer to the second of these pairs of questions--if you don't know how to go about deciding whether or not you know, if you don't know what the criteria of knowing are--it is difficult to see how you could possibly figure out an answer to the first.

Particularists and particularist philosophies (described in the second paragraph above; called epistemological particularisms) decide first which are the good and bad apples--or what is true and what is not, or what we know and what we don't--and then shop around for a sorting system that reliably turns out results consistent with those decisions. Empiricist, or "methodist," philosophies, in contrast, (described in the final paragraph above) find their answers to the first question (which are the good apples?) by first answering the second question (how are we to decide whether we have a good or bad apple?).

Thus, Dawkins and Coyne, as loyal empiricists, reject IDC as a bad apple--not, as the argument might go, because they believe it actually is a rotten apple (the authors subscribe to a philosophy which does not permit them to discern that directly) but because the method they have decided upon to sort the apples (quantity or quality of evidence, naturalism, the scientific method, etc.) leads them almost inevitably to this conclusion.

(Our third choice, according to Chisholm, by the way, is skepticism. The skeptic adroitly recognizes that in order to determine whether or not we possess in each case a good or bad apple we require a method to justify our choice and that in order to select a reliable method we need to know the difference, ab initio, between good and bad apples, and she therefore concludes that there is no way to decide.)

The Truth Is Out There

It is an admixture of the skeptic's and particularist's philosophies which most closely resembles the weak orthodoxy of American K-8 education--a system (if one could be so generous as to describe it as such) often characterized, certainly not thoroughly but perhaps most aptly, by its ability to not distinguish between good and bad apples.

One can see evidence for this strange orthodoxy not only in the way the "system" administers itself, but in more abstract ways as well. This, for example, is part of a "skeptico-particularist" argument that is, in one form or another, very popular among professional educators as a defense against the evils of generalization and standardization:

It is simply not possible to prove that an approach to teaching and learning will be effective before the fact.

Education as a scientific discipline is a young field with an active community focused on R&D--research on learning coupled with the development of new and better curriculum materials. In truth, however, much of the work is better described as D&R--informed and thoughtful development followed by careful analysis of results. It is in the nature of the enterprise that we cannot discover what works before we create the what.

Similarly, James and Dewey—two of educational psychology's founding philosophers—though not self-identified skeptics or "particularists" in any strict or relevant sense, were not exactly warm to a "methodist" approach to discerning truth. John J. McDermott said it this way {2}:

James has a name for . . . methodological anality. He calls it "vicious intellectualism" by which we define A as that which not only is what it is but cannot be other. Proceeding this way, answers abound and clarity holds sway. Missing is surprise, novelty, the wider relational fabric, often riven with rich meanings found on the edge, behind, around, under, over the designated, prearranged conceptual placeholders. Percepts are what count, and the attendant ambiguity in all matters important, presage more and deeper meaning not less. Following John Dewey, method is subsequent and consequent to experience, to inquiry. Method can help fund and warrant experience, but it does not grasp our doings and undergoings in their natural habitat. For that, we must begin with and experimentally trust our affections--dare I say it, trust our feelings. They may cause trouble, but they never lie.

The surest evidence, however, for the antagonism between Chisholm's "methodism" and American education can be found through experience and observation. A small helping only of each of these is enough, I think, to convince most rational people that at nearly every turn, education steers itself craftily away from the advisement of all but the vaguest and easiest criteria: How shall we teach? What shall we teach? Who shall we reward? punish? What shall we value and devalue? Education will provide answers to these questions or it won't, but it never has a way to decide, a methodology, a set of criteria it refers to.

To come, finally, full circle, education seems unable to help but vacillate between its skepticism, which holds every idea (or none of them) to be right, and its particularism, which holds all of its own ideas to be right. This inability, in the end, makes it nearly impossible for education to decide before the fact that something can be wrong.


References:

1. Chisholm, R.M. (1982). The problem of the criterion. In L. Pojman (Ed.), The theory of knowledge, second edition (pp. 26-35). Belmont, CA: Wadsworth.

2. McDermott, J. (2003). Hast Any Philosophy in Thee, Shepherd? Educational Psychologist, 38(3), 133–136. DOI: 10.1207/S15326985EP3803_2


Wednesday, December 24, 2014

Programming: A Gateway Skill

The date that I created the folder in my Google Drive shows up as 10/23/14, which was a Thursday, but the work probably went through the weekend.

I was working on some notes about Varignon's Theorem, which, until I started writing the notes, I didn't refer to by name because I didn't know it had a name. Anyway, the theorem basically says that if I connect the midpoints of the sides of any quadrilateral to form another quadrilateral, that second quadrilateral will be a parallelogram (opposite sides congruent and parallel).
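The theorem is easy to check numerically before even looking at the proof. The sketch below is only an illustration (the function names are made up and this is not the demo's actual source): it computes the side midpoints of a quadrilateral and verifies that opposite sides of the midpoint figure match as vectors, which is what "parallel and congruent" means coordinate-wise.

```javascript
// Midpoint of the segment from p to q (the Midpoint Formula).
function midpoint(p, q) {
  return { x: (p.x + q.x) / 2, y: (p.y + q.y) / 2 };
}

// Check Varignon's Theorem for one quadrilateral: the midpoints of
// the sides, taken in order around the figure, should form a
// parallelogram, i.e., opposite sides should be equal as vectors.
function isVarignonParallelogram(quad) {
  const m = quad.map((p, i) => midpoint(p, quad[(i + 1) % 4]));
  const close = (a, b) => Math.abs(a - b) < 1e-9;
  return close(m[1].x - m[0].x, m[2].x - m[3].x) &&
         close(m[1].y - m[0].y, m[2].y - m[3].y) &&
         close(m[2].x - m[1].x, m[3].x - m[0].x) &&
         close(m[2].y - m[1].y, m[3].y - m[0].y);
}
```

Any four vertices, however lopsided the quadrilateral, come back `true` here, which is exactly the surprise of the theorem.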

This is a neat result and surprising until you look at the proof, of course. So, I wanted to write something about it that included not only the proof but also a demonstration, allowing a user to create any quadrilateral and see that a parallelogram could be generated using the quadrilateral's midpoints as vertices.

This is what I came up with. Click (or tap) inside the canvas below to create four vertices. Press "C" to clear. (The full write-up with the proof is here if you're interested.)

[Interactive canvas demo]

Don't Listen to People About What "the World Needs"

I hear a lot about programming these days, and most of it is attended by a weird and unnecessary existential intensity and fussbudgety worry (and dudebro-ish insecurity) about things that are either out of people's control or none of their damn business. Or both. If you want to learn how to talk to computers a little or a lot, just go find a place to start and then start. Make something and stop worrying about how "good" you are at it. Who cares? It's a low-floor, high-ceiling task that is enjoyable and personally rewarding.

And if it doesn't float your boat or it does, it's not that big a deal. The economy is not going to explode on the one hand, and you're not going to be able to buy a private island on the other. Take a breath and settle the hell down.

If you are interested in programming, though (or even if you're not, I guess), allow me to highlight what I have found to be a wonderful benefit of trying to build stuff with code: it can force you to think more "mathematically." And rather than try to explain what I mean by that, I'm going to use the script I linked to above as an example.

"My freedom will be so much the greater and more meaningful . . ."

To start, I can take advantage of the fact that your browser knows where your mouse cursor is and can report its position as an (x, y) coordinate pair.

[Interactive canvas demo]

But that position is relative to your browser window. What the program needs is the cursor's location relative to the canvas itself.

It takes a tiny function to make that conversion between those two spaces. But we're still just dealing with coordinates (horizontal and vertical), although here the y-values are inverted: they increase as your cursor moves down from top to bottom.
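That tiny function might look something like this (a sketch; the name `toCanvasCoords` and passing the bounding box in as an argument are illustrative choices to keep it self-contained):

```javascript
// Convert a mouse event's client (window) coordinates into
// canvas-local coordinates by subtracting the canvas's offset.
function toCanvasCoords(clientX, clientY, rect) {
  return { x: clientX - rect.left, y: clientY - rect.top };
}
```

In the browser, `rect` would come from `canvas.getBoundingClientRect()`. Note that y still grows downward in canvas space; the subtraction only shifts the origin, it doesn't flip the axis.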

This is really most of the "school" mathematics required to do the work of the program at the top of this post. When you press your mouse down on the canvas, I record the coordinates of that point and draw a little purple circle for you. To draw the line segment sides, I just have to know the starting point and ending point of each. And the midpoints: I just use the Midpoint Formula. The interesting stuff comes next. I'll leave it as a question.

Well, Interesting to Me

In what order should I make the computer draw the line segments between the points? The user clicks on the canvas to draw four points, and the program records the locations (coordinates) of the points one at a time. I don't want the lines to cross (even though that's not a big problem for the theorem), so I can't just use the order in which the points were entered. How can this be solved?

Coming up with a solution to that problem (one I had myself created) was a thoroughly enjoyable and thoroughly mathematical experience (in no small part because I gave myself the further constraint that I wasn't going to use trig to do it). And having that major constraint--of having to explain what to do to a computer--really set the stage for that experience. I recommend that constraint highly to everyone.
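For the curious, here is one way to crack the ordering problem that honors the no-trig constraint mentioned above. It is a sketch of a standard technique, not necessarily the demo's actual solution, and the function name is made up: sort the clicked points around their centroid, using cross products in place of angle measurements.

```javascript
// Sort points counterclockwise (in math coordinates) around their
// centroid, without trig. Each point is first classified into an
// upper or lower half-plane relative to the centroid; within a half,
// a cross product decides which point comes first. In canvas
// coordinates (y grows downward) the order appears clockwise on
// screen, but the polygon is still simple (non-crossing), which is
// all the drawing needs.
function sortAroundCentroid(points) {
  const cx = points.reduce((s, p) => s + p.x, 0) / points.length;
  const cy = points.reduce((s, p) => s + p.y, 0) / points.length;
  // half 0: angle in [0, pi); half 1: angle in [pi, 2*pi)
  const half = (p) => (p.y - cy > 0 || (p.y === cy && p.x - cx > 0)) ? 0 : 1;
  return [...points].sort((a, b) => {
    const ha = half(a), hb = half(b);
    if (ha !== hb) return ha - hb;
    // cross > 0 means b is counterclockwise of a, so a sorts first
    const cross = (a.x - cx) * (b.y - cy) - (a.y - cy) * (b.x - cx);
    return -cross;
  });
}
```

Drawing segments between consecutive points of the sorted array (and closing the loop) then produces a non-self-intersecting quadrilateral no matter what order the user clicked in.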

Image credit: Zachary Veach

Wednesday, December 17, 2014

Shouldn't Sturgeon's Law Also Apply to Education?

I was complaining to a teacher colleague in a Starbucks about the management where we both used to work—at the time, he had already left and I was still there.

"Everything to them is a 'code red'," I said, borrowing a phrase from yet another teacher colleague at the same place about the same issue. "Everything's important. But this [whatever I was complaining about at the time] is just not important."

He said, "I can't imagine telling my students that something wasn't important."

I don't remember ever being "struck" by something that someone said before that. But I was truly struck by his response—not in a reflexive, emotional way, but a dozen little rememberings happened, a hundred tiny connections, when I heard him say that. It was a perfectly timed, "obvious" truth that managed to pull together a number of pieces of my knowledge and experience and give them their own singular gravity.

Should the General Architecture of Reality . . .

"Ninety percent of everything is crap." This is the phrase most commonly referenced as Sturgeon's Law, formulated by science fiction writer Theodore Sturgeon to emphasize that the volume of lower-quality writing in science fiction was not a characteristic unique to science fiction writing. Indeed, when you substitute "crap" with something a little less bitter, and allow for some variability around 90%, this law is robustly observable:

Sensory Gating We filter out most of the stimuli that pound daily on our senses, attending only to a small percent of what we perceive. This is known as sensory gating. A measure of sensory gating used to help identify subjects with schizophrenia places the reduction in sensory stimuli for those without the illness at around 80%-90%. Also, see Dr. Bob about the myth of multi-tasking.

Dark Matter and Dark Energy Today it is thought that dark matter and dark energy together make up about 96% of all the mass and energy in the universe, with about 73% going to dark energy and 23% to dark matter. It's far from "crap" (maybe). But this is all stuff that we currently cannot "see, detect, or even comprehend."

Junk DNA Given what we now know, most of the genome is junk, sorry. Even here we don't stray too far at all from Sturgeon's 90%: "at most 10% of the human genome exhibits detectable organism-level function and conversely that at least 90% of the genome consists of junk DNA."

The Pareto Principle Related to Sturgeon's Law is the Pareto Principle, which says that "for many events, roughly 80% of the effects come from 20% of the causes." It has been observed in many natural and social contexts, including wealth distribution and Internet traffic.

. . . Not Be Allowed to Stand Up in Schools?

So, is there a similar 90% in education, in teaching, in schools—however you want to contextualize that? My purpose in writing this post is not to answer yes to that question and then start listing things that are unimportant. Instead, I'd simply like for us to stand back and consider that (1) since education is a part of our natural and social landscape, Sturgeon's Law very likely applies to it in some way, but (2) we are forced to pretend that this is not the case.

(Update 1.19.2015: Consider, too, the cognitive benefits of avoiding 100%.)


Sunday, July 20, 2014

Measuring Misconceptions

When you encounter a multiple-choice problem, you find yourself surrounded by choices with varying levels of influence over you:

If you are knowledgeable about the topic, well rested, and so on, the answer choice with the most influence over you will be the correct one. On the other hand, if you are unprepared, tired, or emotionally distraught (or just don't know), you will probably do less well, not because these characteristics raise the levels of influence of the incorrect answer choices, but rather because they lower the level of influence of the correct answer to match the levels of the incorrect ones. The three incorrect answer choices above, for example, are not more salient when you don't know any German. They are simply as salient as the correct answer, and this--a more even distribution of influence among the choices--is what makes the question difficult.

However, no matter what the question is, one or two of the incorrect answer choices will almost certainly call to you a little more than the others (if the assessment writer has done his job right). The word klar, for example, which is Choice D above, sounds and looks a lot like the English word clear (which is its most immediate translation). And clear bears some semantic resemblance to honest. The word liebe, too, may influence you to choose it simply because it is a bit more well known to English speakers than the other three choices.

Thus, in general, what we expect (and get) among the incorrect answers on multiple-choice tests in a sample is a distribution that is not completely even (all the wrong answer choices have the exact same percentage of responses) and also not completely lopsided (100% of respondents choosing the wrong answer choose the exact same wrong answer).

Misconception Detection

I would argue, however, that lopsided distributions among incorrect answer choices can point to the presence of one or more misconceptions among respondents, especially when--in contrast to the situation outlined above--we can make the further assumption that the population of respondents has been exposed to instruction on the topic.

For example, if we find, in a hypothetical survey of 1,000 fourth graders, that responses to the question 2 + 3 = ? break down as follows--A 1 (1%), B 5 (80%), C 6 (18%), D 10 (1%)--it should raise eyebrows that almost all of the students who answered incorrectly did so in one direction (toward the answer given by multiplication instead of addition).
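One simple way to quantify that kind of lopsidedness is to compare the largest wrong-answer share against a perfectly even split among the wrong answers. To be clear, this is an illustrative measure of my own devising, not necessarily the "misconception strength" statistic used in the STAAR analysis below, and the function name is made up:

```javascript
// A hypothetical lopsidedness measure for the wrong answers on a
// multiple-choice item. Input: the response shares of the incorrect
// choices only (as fractions of all respondents). Output: 0 when the
// wrong-answer shares are perfectly even, 1 when every incorrect
// respondent picked the same wrong answer.
function lopsidedness(wrongShares) {
  const total = wrongShares.reduce((s, w) => s + w, 0);
  if (total === 0) return 0; // nobody answered incorrectly
  const maxShare = Math.max(...wrongShares) / total;
  const evenShare = 1 / wrongShares.length; // e.g. 1/3 for three wrong choices
  return (maxShare - evenShare) / (1 - evenShare);
}
```

For the hypothetical 2 + 3 item above, with wrong-answer shares of 1%, 18%, and 1%, `lopsidedness([0.01, 0.18, 0.01])` comes out to about 0.85, confirming the eyebrow-raising pull toward the multiplication answer.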

STAAR

This very imperfect measure is the one I brought to bear on an analysis of the 2013 results of the State of Texas Assessments of Academic Readiness (STAAR) to see if it could help me identify possible misconceptions, which would lead students to strongly favor one of the incorrect answers among the 3 given for each multiple choice item.

The site linked above has a remarkable amount of information and data related to the assessment. For my analysis, I used the item analysis reports (which give the strand and the percent choosing each answer choice for each question), the student expectations tested, and the actual released tests from 2013:

This graph summarizes answers across Grades 3-8 to 285 multiple choice items in 5 mathematics strands for over 2 million students across Texas. The "misconception strength" was determined for each of the 285 items and then these measures were collected together here in box plots.

The good news here is that the grade-level averages for "misconception strength" drop from Grade 3 to Grade 8. (This is shown in the small inset line graph in the grouped box plot above.) Surprising--to me at least--though, is that there is no strong trend at all among the strands across grade levels. There is a slight decline across the grade levels for each strand except for Measurement, which shows a very slight increase, but none of these changes is noteworthy.

The Items

We like to focus on means and medians when analyzing data, but here the extremes and outliers are probably more interesting. (Outliers in the plot above are points above or below the box plots.) So, let's take a look at some of the items which contained a strongly influential incorrect answer choice--enough to suck in a fair number of students who had ostensibly been prepared to avoid such influence.

Edges Are Not Points, Grade 3
(A: 66%*, B: 30%, C: 1%, D: 1%; MS = 0.42721)

The asterisk above shows the correct answer choice. The percent of students who chose each answer is given along with our "misconception strength" measure (MS = 0.42721). Students here seemed to be carrying the misconception that "edges" are the outer points of figures (like they are in the real world).

Primes > 2 Are Odd but Not Vice Versa, Grade 5
(F: 59%*, G: 7%, H: 3%, J: 30%; MS = 0.29744)

Answer Choice J is exactly what you'd get if you crossed out all the even numbers and counted up what was left. This may also point to an issue with basic multiplication facts. I can imagine many students thinking of 45 and 49 as prime simply because they can't remember their factors.

Forget About Mirrors, Grade 8
(A: 5%, B: 4%, C: 38%, D: 52%*; MS = 0.33611)

Thirty-eight percent of students could not live with the x-axis being the line of reflection, so they moved it up 2 units in their minds.

Not Everything Is Left-to-Right, Grade 4
(F: 6%, G: 5%, H: 48%*, J: 41%; MS = 0.32192)

It seems that what happened here was that 41% of students (that's over 143,000 students, by the way) read the table from left to right, got the correct expression, and then didn't notice that the answer choices all read out the table from right to left.

Symbols Are Important Too, Grade 7
(F: 3%, G: 46%, H: 45%*, J: 5%; MS = 0.36696)

They knew that the answer was 4, and they didn't care about the symbolic gobbledy-gook around it.

Stop Being Fooled by Position, Grade 7
(A: 2%, B: 71%*, C: 25%, D: 2%; MS = 0.37387)

Students used the positions of the triangles to determine correspondence rather than corresponding angle measures. A well known misconception.

Superficial Mapping, Grade 7
(A: 8%, B: 58%, C: 33%*, D: 2%; MS = 0.36918)

Here we have a relatively rare case of a significant majority of students choosing one wrong answer (compared with those who chose the correct answer). The information in the diagram in B maps superficially to the bullet points in the question, and that was enough for most students.

Division Is Unnatural, Grade 5
(F: 21%, G: 75%*, H: 3%, J: 1%; MS = 0.35975)

A large majority answered this correctly, but 21% still wanted to multiply instead of divide.

More Left-to-Right Fixedness, Grade 6
(F: 35%, G: 2%, H: 7%, J: 56%*; MS = 0.33005)

Order of operations difficulties appear frequently in the STAAR results. The misconception scores for this one and the other order of operations questions are moderated by the apparent difficulty students have with these questions (44% answered this one incorrectly, which is 155,000 sixth graders). Given some of the difficulties grownups have with the order of operations, it is no surprise to see the same problems reflected in the next generation.

Why Are Misconceptions Important?

We often look at misconceptions as the products of "doing things wrong." They are, after all, associated in our minds with wrong answers as opposed to right ones. But I prefer to see misconceptions as the complete opposite--they are the products of "doing things right" when it comes to teaching, even though those local "right"s are situated inside global "wrong"s.

It's true, for example, that the "answer" to 6 + 3 = ? is 9 and the answer to 2x + 3 = 9 is x = 3. If you repeat problems like this hundreds of times in classrooms over the course of a student's elementary and middle school education, you are doing something "right" hundreds of times, so long as your focus is on the discrete set of problems. Taken as a whole, however, you have developed the misconception that math problems are to be analyzed from left to right, always or almost always. Or perhaps more accurately, you have developed a notion of "normal" math and "weird" math, with left-to-right-isms belonging to the normal category.