Saturday, May 7, 2016

Making "Connections"

I'm nearing the end of my read of James Lang's terrific book Small Teaching, and I've wanted for the last 100 pages or so to recommend it highly here.

While I do that, however, I'd also like to mention a confusion that piqued my interest near the middle of the book—a very common false distinction, I think, between 'making connections' and knowing things. Lang sets it up this way:

When we are tackling a new author in my British literature survey course, I might begin class by pointing out some salient feature of the author's life or work and asking students to tell me the name of a previous author (whose work we have read) who shares that same feature. "This is a Scottish author," I will say. "And who was the last Scottish author we read?" Blank stares. Perhaps just a bit of gaping bewilderment. Instead of seeing the broad sweep of British literary history, with its many plots, subplots, and characters, my students see Author A and then Author B and then Author C and so on. They can analyze and remember the main works and features of each author, but they run into trouble when asked to forge connections among writers.

What immediately follows this paragraph is what one would expect from a writer who has done his homework on the research: Lang reminds himself that his students are novices and he an expert; his students' knowledge of British literature and history is "sparse and superficial."

But then, suddenly, the false distinction, where 'knowledge' takes on a different meaning, becoming synonymous with "sparse and superficial," and his students have it again:

In short, they have knowledge, in the sense that they can produce individual pieces of information in specific contexts; what they lack is understanding or comprehension.

And they lack comprehension, even more shortly, because they lack connections.

Nope, Still Knowledge

As we saw here, with the Wason Selection Task, reasoning ability itself is dependent on knowledge. Participants who were given abstract rules had tremendous difficulties with modus tollens reasoning in particular, yet when these rules were set in concrete contexts, the difficulties all but vanished.

One might say, indeed, that in concrete contexts, the connections are known, not inferred. Thus, if you want students to make connections among various authors, it might help to tell them that they are connected, and how.

Monday, May 2, 2016

The Wason Selection Task, Part II

[Image: a spruce bough]

Before we sink our teeth even deeper into the Wason Selection Task, we should look briefly at conditional reasoning arguments.

Conditional reasoning generally starts with the statement "if P is true, then Q is true" or "if P, then Q" (P → Q). For example, the statement "If this tree is a spruce (P), then it has needles (Q)" is a conditional statement.

There are four types of conditional reasoning arguments that apply to the Wason Selection Task—two of them valid and two of them logically invalid. Each of these introduces a different second statement, after the "if P, then Q" formulation. That is, each of the following four types of conditional reasoning arguments can be identified according to the statement that comes right after the "if P, then Q" statement.

  • Modus Ponens (P is true): This argument proceeds as follows: If P is true, then Q is true. P is indeed true. Therefore, Q is true. This is a valid form of reasoning. Example: If this tree is a spruce (P), then it has needles (Q). This tree is indeed a spruce (P). Therefore, this tree has needles (Q).
  • Denying the Antecedent (P is not true): This is a fallacy and proceeds as follows: If P is true, then Q is true. P is not true. Therefore, Q is not true. Example: If this tree is a spruce (P), then it has needles (Q). This tree is not a spruce (not P). Therefore, this tree does not have needles (not Q).
  • Affirming the Consequent (Q is true): This is also a fallacy and proceeds as follows: If P is true, then Q is true. Q is indeed true. Therefore, P is true. Example: If this tree is a spruce (P), then it has needles (Q). This tree indeed has needles (Q). Therefore, this tree is a spruce (P).
  • Modus Tollens (Q is not true): This argument proceeds as follows: If P is true, then Q is true. Q is not true. Therefore, P is not true. This is a valid form of reasoning. Example: If this tree is a spruce (P), then it has needles (Q). This tree does not have needles (not Q). Therefore, this tree is not a spruce (not P).
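One way to see why two of these forms are valid and two are not is to check every combination of truth values for P and Q. What follows is only a sketch: the names `implies`, `is_valid`, and `FORMS` are my own labels for illustration, and I'm assuming the usual material reading of "if P, then Q." A form counts as valid here when no combination makes both premises true and the conclusion false.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material conditional: 'if P, then Q' fails only when P is true and Q is false."""
    return (not p) or q


# Each form is (second premise, conclusion), both written as functions of (p, q).
FORMS = {
    "modus ponens": (lambda p, q: p, lambda p, q: q),
    "denying the antecedent": (lambda p, q: not p, lambda p, q: not q),
    "affirming the consequent": (lambda p, q: q, lambda p, q: p),
    "modus tollens": (lambda p, q: not q, lambda p, q: not p),
}


def is_valid(second_premise, conclusion) -> bool:
    """Valid: no assignment makes both premises true while the conclusion is false."""
    for p, q in product([True, False], repeat=2):
        premises_hold = implies(p, q) and second_premise(p, q)
        if premises_hold and not conclusion(p, q):
            return False
    return True


validity = {name: is_valid(*form) for name, form in FORMS.items()}
for name, ok in validity.items():
    print(f"{name}: {'valid' if ok else 'invalid'}")
```

Both fallacies are sunk by the same case: P false and Q true. In the tree example, that is a pine, which has needles but is not a spruce.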

It is important to note that the terms valid and invalid used to describe these arguments tell us nothing about the correctness of their conclusions. For example, each of these lines of reasoning is logically invalid . . .

  • Affirming the Consequent: If today is June 1, then tomorrow is June 2. Tomorrow is indeed June 2. Therefore, today is June 1.
  • Denying the Antecedent: If today is June 1, then tomorrow is June 2. Today is not June 1. Therefore, tomorrow is not June 2.

. . . even though they are undeniably "correct," as far as that goes. Formal logic does not concern itself necessarily with the contents of arguments, only their form.

Applying the Rules to the Selection Task

The rule given in any Wason Selection Task is considered to be a statement of the form "if P, then Q." So, the rule "every person that has an alcoholic drink is of legal age" that I included in my previous post might be recast as "if alcoholic drink (P), then legal age (Q)." Similarly, in the more formal version, the rule "every card that has a D on one side has a 3 on the other" might be recast as "if D (P), then 3 (Q)."

Accordingly, each of the four answer choices in a Wason Selection Task is seen as the second statement in a conditional reasoning argument—either a statement about P (i.e., P is true [P] or P is not true [∼P]) or a statement about Q (i.e., Q is true [Q] or Q is not true [∼Q]).

In the "drinking" version of the selection task, for example, statements about drink type are the Ps and statements about age are the Qs:

  • P: Alcoholic
  • ∼P: Not Alcoholic
  • ∼Q: Not Legal Age
  • Q: Legal Age

The Ps and Qs for the more formal task would be assigned this way:

  • P: D
  • ∼P: Not D
  • ∼Q: Not 3
  • Q: 3

In each case, the statements which form a valid argument (i.e., P and ∼Q) indicate those cards (or people) that must be checked to determine whether the rule has been violated.
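The same conclusion can be reached by brute force: for each visible face, ask whether any possible hidden face could break the rule. This is a rough sketch, not anything from the literature. The function names and the role strings (`"P"`, `"not P"`, `"Q"`, `"not Q"`) are my own, and a violation is assumed to mean exactly "P is true and Q is false."

```python
def can_reveal_violation(visible_role: str) -> bool:
    """True if some hidden value could make 'P and not Q' hold for this card."""
    if visible_role in ("P", "not P"):
        p = visible_role == "P"
        # The hidden side tells us about Q; try both possibilities.
        return any(p and not q for q in (True, False))
    q = visible_role == "Q"
    # The hidden side tells us about P; try both possibilities.
    return any(p and not q for p in (True, False))


def select(task: dict) -> list:
    """Labels of the cards (or people) that must be checked, in sorted order."""
    return sorted(label for label, role in task.items() if can_reveal_violation(role))


drinking_task = {
    "Alcoholic": "P",
    "Not Alcoholic": "not P",
    "Legal Age": "Q",
    "Not Legal Age": "not Q",
}
card_task = {"D": "P", "Not D": "not P", "3": "Q", "Not 3": "not Q"}

print(select(drinking_task))
print(select(card_task))
```

Only the P and ∼Q cases can ever expose a violation. Whatever is hidden behind ∼P or Q, the rule survives.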

The Wason Selection Task, Part I

The Pope, a nun, Kermit the Frog, and Bruce Lee are all sitting at a bar. Well, actually it's just four people, represented by the cards below.


Each person has an age and a drink type, but you can see only one of these for each person. Here is a rule: "every person that has an alcoholic drink is of legal age." Your task is to select all those people, but only those people, that you would have to check in order to discover whether or not the rule has been violated.

Most people have little trouble picking the correct answer above. But, "across a wide range of published literature only around 10% of the general population" finds the correct answer to the infamous Wason selection task shown below:


Each card has a letter on one side and a number on the other, but you can see only one of these for each card. Here is a rule: "every card that has a D on one side has a 3 on the other." Your task is to select all those cards, but only those cards, which you would have to turn over in order to discover whether or not the rule has been violated.

In fact, Matthew Inglis and Adrian Simpson (2004) found that mathematics undergraduates as well as mathematics academic staff, though performing significantly better than history undergraduates, performed unexpectedly poorly on the task, with only 29% of math undergrads and a shocking 43% of staff finding the correct answer.

In a chapter from The Cambridge Handbook of Expertise and Expert Performance, Paul Feltovich, Michael Prietula, and K. Anders Ericsson indicate the one factor that explains these differential results: knowledge.

Some studies showed reasoning itself to be dependent on knowledge. Wason and Johnson-Laird (1972) presented evidence that individuals perform poorly in testing the implications of logical inference rules (e.g., if p then q) when the rules are stated abstractly. Performance greatly improves for concrete instances of the same rules (e.g., 'every time I go to Manchester, I go by train'). Rumelhart (1979), in an extension of this work, found that nearly five times as many participants were able to test correctly the implications of a simple, single-conditional logical expression when it was stated in terms of a realistic setting (e.g., a work setting: 'every purchase over thirty dollars must be approved by the regional manager') versus when the expression was stated in an understandable but less meaningful form (e.g., 'every card with a vowel on the front must have an integer on the back').

Reference: Inglis, M., & Simpson, A. (2004). Mathematicians and the Selection Task. Proceedings of the 28th Conference of the International Group for the Psychology of Mathematics Education, 3, 89-96.

Sunday, May 1, 2016

Being Explicit About Symmetry

After reading this piece about mathematician Terence Tao, I was turned on to a book about mathematical problem-solving Tao wrote as a 15-year-old, so I decided to check it out. The book contains a nice little nugget about symmetry in the first chapter, the importance of which sails by the author (and thus the reader) a bit, I think. Here's the problem that occupies all of the discussion in Chapter 1:

A triangle has its lengths in an arithmetic progression, with difference \(\mathtt{d}\). The area of the triangle is \(\mathtt{t}\). Find the lengths and angles of the triangle.

And here's the nice move with regard to symmetry that Tao makes, almost in passing. It doesn't seem to be a "code-cracking" idea in the context of the problem, but I think it highlights a key process in mathematical thinking that we can work to make more explicit for students. It might be helpful to tinker with the problem a little to put this move in some context:

We can use the data to simplify the notation: we know that the sides are in arithmetic progression, so instead of \(\mathtt{a}\), \(\mathtt{b}\), and \(\mathtt{c}\), we can have \(\mathtt{a}\), \(\mathtt{a + d}\), and \(\mathtt{a + 2d}\) instead. But the notation can be even better if we make it more symmetrical, by making the side lengths \(\mathtt{b - d}\), \(\mathtt{b}\), and \(\mathtt{b + d}\).

I can easily imagine, given what I know about most current curricula, a student writing down the side lengths of the triangle as \(\mathtt{a}\), \(\mathtt{a + d}\), and \(\mathtt{a + 2d}\), because this is how we teach arithmetic progressions. But we don't often explicitly make the fairly simple observation that any term, with the exception of the first, in an arithmetic progression can be written as \(\mathtt{a_n}\), with the previous term written as \(\mathtt{a_n - d}\) and the next term as \(\mathtt{a_n + d}\). It seems that we can find this observation by thinking a bit more about symmetry when we craft our explanations and instruction.
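To see what the symmetric notation buys, here is a sketch of one standard route to the area, Heron's formula. The choice of Heron's formula and the symbol \(\mathtt{s}\) for the semiperimeter are my assumptions for illustration; they are not part of the problem statement quoted above.

```latex
\begin{aligned}
s   &= \tfrac{1}{2}\big[(b-d) + b + (b+d)\big] = \tfrac{3b}{2}, \\
t^2 &= s\,\big(s-(b-d)\big)\,(s-b)\,\big(s-(b+d)\big) \\
    &= \tfrac{3b}{2}\,\Big(\tfrac{b}{2}+d\Big)\,\tfrac{b}{2}\,\Big(\tfrac{b}{2}-d\Big) \\
    &= \tfrac{3b^2}{16}\,\big(b^2-4d^2\big).
\end{aligned}
```

With \(\mathtt{a}\), \(\mathtt{a + d}\), and \(\mathtt{a + 2d}\) instead, the semiperimeter is \(\mathtt{(3a + 3d)/2}\) and the three remaining factors are \(\mathtt{(a + 3d)/2}\), \(\mathtt{(a + d)/2}\), and \(\mathtt{(a - d)/2}\). That is the same information, but the difference of squares is no longer sitting in plain view.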

Another Observation

Working to orient oneself to the symmetries available in mathematical situations seems like one appropriate remedy to what I've called "left-to-rightism," or "cinemathematics": a syndrome that makes us teach concepts like the equals sign (unwittingly) in a left-to-right way, such that students take away (unwittingly) the misconception that the equals sign indicates that some answer is to follow, rather than that two expressions are equal. Some recent research points to the benefits of thinking about symmetry when teaching negative numbers as well.

Reference: Tsang, J., Blair, K., Bofferding, L., & Schwartz, D. (2015). Learning to “See” Less Than Nothing: Putting Perceptual Skills to Work for Learning Numerical Structure. Cognition and Instruction, 33 (2), 154-197. DOI: 10.1080/07370008.2015.1038539

Saturday, April 23, 2016

Evident Even to an Ass

Suppose there is a straight line segment \(\overline{\small\mathtt{AB}}\) between you, at point \(\small\mathtt{A}\), and some point \(\small\mathtt{B}\) you want to reach. Is there a shorter way to point \(\small\mathtt{B}\) that uses 2 line segment paths?

If you think the answer is yes, or if you think that the answer is no but it's not obviously no, the Epicureans have some pretty mean things to say about you:

It was the habit of the Epicureans, says Proclus, to ridicule this theorem as being evident even to an ass and requiring no proof, and their allegation that the theorem was "known" even to an ass was based on the fact that, if fodder is placed at one angular point and the ass at another, he does not, in order to get to his food, traverse the two sides of the triangle but only the one side separating them (an argument which makes Savile exclaim that its authors were "digni ipsi, qui cum Asino foenum essent"). Proclus replies truly that a mere perception of the truth of the theorem is a different thing from a scientific proof of it and a knowledge of the reason why it is true.

The theorem mentioned is the Triangle Inequality Theorem: the sum of the lengths of any two sides of a triangle is always greater than the length of the remaining side. And it is fascinating to me how seldom I have seen this theorem in textbooks phrased as a "shortest straight-line distance" statement. Most often, I see investigations about what side lengths can make up a triangle, which turns the ridiculously obvious into a complicated issue. But let's discuss that below. For now, a proof!

It's Okay To Be Both Obvious and Require Proof

We want to show that \(\small\mathtt{BA + AC}\) is greater than \(\small\mathtt{BC}\), that \(\small\mathtt{AB + BC}\) is greater than \(\small\mathtt{AC}\), and that \(\small\mathtt{BC + CA}\) is greater than \(\small\mathtt{AB}\).

Euclid starts by extending \(\overline{\small\mathtt{BA}}\) to a point \(\small\mathtt{D}\) such that \(\small\mathtt{DA = AC}\) as I've shown in the diagram. This means that \(\small\Delta\mathtt{ADC}\) is an isosceles triangle, and \(\small\measuredangle\mathtt{ACD}\) and \(\small\measuredangle\mathtt{ADC}\) are congruent. And since "the whole is greater than the part," m\(\small\measuredangle\mathtt{BCD}\) is greater than m\(\small\measuredangle\mathtt{ACD}\), which also makes m\(\small\measuredangle\mathtt{BCD}\) greater than m\(\small\measuredangle\mathtt{ADC}\).

Dizzy yet? Just remember the last rung of the ladder we got to: m\(\small\measuredangle\mathtt{BCD}\) is greater than m\(\small\measuredangle\mathtt{ADC}\). Therefore, we can say something about the line segments that those two angles "catch." Specifically, we can say that \(\overline{\small\mathtt{DB}}\) is longer than \(\overline{\small\mathtt{BC}}\), because Proposition 19. This is the same as saying that \(\small\mathtt{DA + AB > BC}\). And since \(\small\mathtt{DA = AC}\), we can make a substitution to get \(\small\mathtt{AC + AB > BC}\), which is the first of the three statements above that we wanted to prove: \(\small\mathtt{BA + AC}\) is greater than \(\small\mathtt{BC}\). Euclid tells us that we can prove the other two statements with a similar method, and I believe him.
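If you'd like to watch the construction work on actual numbers, here is a small sketch. The coordinates are arbitrary values I picked, and `euclid_construction` is a hypothetical helper name. A numeric check like this is of course not a proof; it just replays Euclid's steps for one triangle.

```python
import math


def euclid_construction(A, B, C):
    """Extend BA beyond A to a point D with DA = AC.

    Returns the lengths (DB, BA + AC, BC) so they can be compared.
    """
    ba = math.dist(B, A)
    ac = math.dist(A, C)
    # Unit vector pointing from B toward A; D lies farther along this direction.
    ux, uy = (A[0] - B[0]) / ba, (A[1] - B[1]) / ba
    D = (A[0] + ux * ac, A[1] + uy * ac)
    return math.dist(D, B), ba + ac, math.dist(B, C)


# An arbitrary (non-degenerate) triangle, chosen only for illustration.
A, B, C = (1.0, 2.0), (0.0, 0.0), (4.0, 0.0)
db, ba_plus_ac, bc = euclid_construction(A, B, C)

print(f"DB = {db:.3f}, BA + AC = {ba_plus_ac:.3f}, BC = {bc:.3f}")
```

Because D, A, and B lie on one line, DB comes out equal to BA + AC, and the comparison DB > BC is exactly the inequality we were after.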

Internalizing the Idea That Mathematics Is Complex and Intuition-Free

In the link is an example of what we often put students through to investigate the Triangle Inequality Theorem. This is consistent with what I have seen in a lot of lesson plans.

Were the authors of this document aware that there is a very intuitive way of looking at this theorem? It might be enough to suppose that they weren't and that they, like all of us, sometimes just repeat what we see or hear elsewhere.

But it's also worth entertaining the possibility that they did know and pressed on anyway. What good reasons might they have for doing so? I think the best answer is simply Proclus's reply above: "a mere perception of the truth of the theorem is a different thing from a scientific proof of it and a knowledge of the reason why it is true." To which I would respond, Indeed, a different thing. Not necessarily a better thing.

If we can give students the "mere perception of the truth" of a theorem (or of any mathematical idea), we should do so, even if it doesn't make us feel smart, leaves a lot of class time to fill, or runs counter to a set of standards. I would argue that students still have to prove those theorems and justify those ideas. But they can then do so correctly oriented to the reality of what they are doing: drawing on their own perceptions and knowledge to make their ideas plain to others.

Sunday, April 10, 2016

Spaced Practice

Spaced practice, also called distributed practice, refers to the practice—after initial learning—of leaving gaps in time between sessions devoted to reviewing previously learned material. This stands in contrast to its much less effective counterpart, massed practice, which "crams" all of the review together in time.

If, for example, you're learning about deriving the formula for the area of a circle today, and you know that in 100 days you will be tested on this learning, then spacing out your review of the material up to test day will be much better for your final score than cramming the night before.

Indeed, this recent article by Kang, along with providing a review of the evidence for the effectiveness of spaced practice, goes into some detail about optimal timings:

Retention of . . . the study material . . . was highest when the lag was about 10% to 20% of the tested retention interval (see also Cepeda et al., 2009). In other words, there is no fixed optimal lag—It depends on the targeted retention interval. If you want to maximize performance on a test about 1 week away, then a lag of about 1 day [14% of a week] would be optimal; but if you want to retain information for 1 year, then a lag of about 2 months [16% of a year] would be ideal.

So, in my area-of-a-circle example, the optimal lag time between initial study and review is between 10 and 20 days. And this is if there is only one review session. (And, needless to say, that review session has to be of high quality; I should likely work to retrieve what I can and then comprehensively review the material.)
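The arithmetic behind that estimate is simple enough to sketch. The function name `optimal_lag_range` and its parameters are hypothetical; the 10% and 20% bounds come from the Kang passage quoted above, and, as in my example, I'm loosely treating the full time until the test as the retention interval.

```python
from __future__ import annotations


def optimal_lag_range(
    retention_interval_days: float, low: float = 0.10, high: float = 0.20
) -> tuple[float, float]:
    """Rough lag window (in days) for a single review session.

    Uses the 10%-20% rule of thumb quoted above; it is a guideline, not a law.
    """
    if retention_interval_days <= 0:
        raise ValueError("retention interval must be positive")
    return (low * retention_interval_days, high * retention_interval_days)


for days in (7, 100, 365):
    shortest, longest = optimal_lag_range(days)
    print(f"test in {days} days: review after about {shortest:.1f} to {longest:.1f} days")
```

Kang's own examples, about 1 day for a week and about 2 months for a year, both fall inside the windows this produces.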

If there are multiple opportunities for reviewing material (as is almost always the case), it is not quite clear whether it is better for the review sessions to be spaced out equally or for the interval to expand over time.

Yes, Virginia, There Is a 'Higher Order'

There are a few ancient dualist sects still in existence, I hear, who believe that higher-order thinking is in competition with memory and who would dismiss evidence about spaced practice on the grounds that it ignores the former 'in favor of' the latter. While this is barely worth a response except to point to a modern library or university, Kang does take the time to type the obvious: "acquiring foundational knowledge and being able to quickly access relevant information from memory are often prerequisites for higher order learning."

In addition, the author cites some of the evidence that spaced practice has benefits not only for memory but also for generalization and transfer:

In one study, college students attended a 45-min lecture on meteorology and then reviewed the information (in a quiz with corrective feedback) either 1 or 8 days later (Kapler, Weston, and Wiseheart, 2015). On a final test 35 days after the review session, students in the 8-day condition performed better than those in the 1-day condition not just on the factual recall questions but also on the questions that required application of knowledge. Other studies support spaced practice of mathematics problems (Rohrer and Taylor, 2006) and ecology lessons (Gluckman, Vlach, and Sandhofer, 2014; see also Vlach and Sandhofer, 2012). In addition to improving mathematics problem solving and science concept learning, spaced practice benefits the long-term learning of English grammar in adult English-language learners (Bird, 2010). In all cases, the students were not just memorizing solutions but were instead applying their learning to solve new problems.

Some Classroom Evidence Too

Last but not least, we need to place spaced practice in an environment that is entirely unique and as chaotic as the inside of a quasar, with no control over any variables. That way we'll really know if it works:

Although most studies on spaced or interleaved practice have been conducted in laboratory settings (for better control over extraneous variables), students in actual classrooms can benefit from instructors using these learning strategies (e.g., Carpenter et al., 2009; Sobel, Cepeda, & Kapler, 2011). A few studies were conducted not only in real-world educational settings but also in the context of a regular curriculum (i.e., instructional manipulation on course content).

In one classroom-based study, the mathematics homework assignments for seventh-grade students were manipulated across 9 weeks (Rohrer, Dedrick, & Burgess, 2014). . . . On a surprise test containing novel problems (on the same topics), given 2 weeks after the final homework assignment, the students were substantially better at solving the types of problems that had been practiced in an interleaved manner than those under blocked practice.

Reference: Kang, S. (2016). Spaced Repetition Promotes Efficient and Effective Learning: Policy Implications for Instruction. Policy Insights from the Behavioral and Brain Sciences, 3 (1), 12-19. DOI: 10.1177/2372732215624708

Saturday, April 9, 2016


Inductive teaching or learning, although it has a special name, happens all the time without our having to pay any attention to technique. It is basically learning through examples. As the authors of the paper we're discussing here indicate, through inductive learning:

Children . . . learn concepts such as 'boat' or 'fruit' by being exposed to exemplars of those categories and inducing the commonalities that define the concepts. . . . Such inductive learning is critical in making sense of events, objects, and actions—and, more generally, in structuring and understanding our world.

The paper describes three experiments conducted to further test the benefit of interleaving on inductive learning ("further" because an interleaving effect has been demonstrated in previous studies). Interleaving is one of a handful of powerful learning and practicing strategies mentioned throughout the book Make It Stick: The Science of Successful Learning. In the book, the power of interleaving is highlighted by the following summary of another experiment involving determining volumes:

Two groups of college students were taught how to find the volumes of four obscure geometric solids (wedge, spheroid, spherical cone, and half cone). One group then worked a set of practice problems that were clustered by problem type . . . The other group worked the same practice problems, but the sequence was mixed (interleaved) rather than clustered by type of problem . . . During practice, the students who worked the problems in clusters (that is, massed) averaged 89 percent correct, compared to only 60 percent for those who worked the problems in a mixed sequence. But in the final test a week later, the students who had practiced solving problems clustered by type averaged only 20 percent correct, while the students whose practice was interleaved averaged 63 percent.

The research we look at in this post does not produce such stupendous results, but it is nevertheless an interesting validation of the interleaving effect. Although there are three experiments described, I'll summarize just the first one.

Discriminative-Contrast Hypothesis

But first, you can try out an experiment like the one reported in the paper. Click start to study pictures of different bird species below. There are 32 pictures, and each one is shown for 4 seconds. After this study period, you will be asked to try to identify 8 birds from pictures that were not shown during the study period, but which belong to one of the species you studied.

Species: chickadee, finch, nuthatch, sparrow, swallow, thrush, warbler, wren

Once the study phase is over, click test to start the test and match each picture to a species name. There is no time limit on the test. Simply click next once you have selected each of your answers.

Based on previous research, one would predict that, in general, you would do better in the interleaved condition, where the species are mixed together in the study phase, than you would in the 'massed,' or grouped condition, where the pictures are presented in species groups. The question the researchers wanted to home in on in their first experiment was about the mechanism that made interleaved study more effective.

So, their experiment was conducted much like the one above, except with three groups, which all received the interleaved presentation. However, two of the groups were interrupted in their study by trivia questions in different ways. One group—the alternating trivia group—received a trivia question after every picture; the other group—the grouped trivia group—received 8 trivia questions after every group of 8 interleaved pictures. The third group—the contiguous group—received no interruption in their study.
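To make the three conditions concrete, here is a sketch of how the study schedules could be laid out. It borrows the numbers from the demo described above (8 species, 32 pictures), which may not match the paper's actual stimuli. The function names, the `"TRIVIA"` placeholder, and the choice to build the interleaved order from shuffled blocks containing one picture of each species are all my assumptions.

```python
import random

SPECIES = ["chickadee", "finch", "nuthatch", "sparrow",
           "swallow", "thrush", "warbler", "wren"]
PER_SPECIES = 4  # 8 species x 4 pictures each = 32 study pictures


def massed(species, per_species):
    """All pictures of one species, then all pictures of the next, and so on."""
    return [(s, i) for s in species for i in range(per_species)]


def interleaved(species, per_species, rng):
    """Blocks of one picture per species, shuffled within each block."""
    order = []
    for i in range(per_species):
        block = [(s, i) for s in species]
        rng.shuffle(block)
        order.extend(block)
    return order


def with_trivia(order, group_size):
    """After every `group_size` pictures, insert the same number of trivia questions."""
    schedule = []
    for start in range(0, len(order), group_size):
        group = order[start:start + group_size]
        schedule.extend(group)
        schedule.extend(["TRIVIA"] * len(group))
    return schedule


rng = random.Random(0)  # fixed seed so the sketch is repeatable
study = interleaved(SPECIES, PER_SPECIES, rng)

contiguous = study                    # no interruptions
alternating = with_trivia(study, 1)   # picture, trivia, picture, trivia, ...
grouped = with_trivia(study, 8)       # 8 pictures, 8 trivia, 8 pictures, ...
```

Notice that the alternating and grouped schedules contain the same pictures and the same number of trivia questions. The only thing that differs is whether pictures of different species ever sit next to each other.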

What the researchers discovered is that while the contiguous group performed the best (of course), the grouped trivia group did not perform significantly worse, while the alternating trivia group did perform significantly worse than both the contiguous and grouped trivia groups. This was seen as providing some confirmation for the discriminative-contrast hypothesis:

Interleaved studying might facilitate noticing the differences that separate one category from another. In other words, perhaps interleaving is beneficial because it juxtaposes different categories, which then highlights differences across the categories and supports discrimination learning.

In the grouped trivia condition, participants were still able to take advantage of the interleaving effect because the disruptions (the trivia questions) had less of an effect when grouped in packs of 8. In the alternating trivia condition, however, a trivia question appeared after every picture, frustrating the discrimination mechanism that seems to help make the interleaving effect tick.

Takeaway Goodies (and Questions) for Instruction

The paper makes it clear that interleaving is not a slam dunk for instruction. Massed studying or practice might be more beneficial, for example, when the goal is to understand the similarities among the objects of study rather than the differences. Massed studying may also be preferred when the objects are 'highly discriminable' (easy to tell apart).

Yet, many of the misconceptions we deal with in mathematics education in particular can be seen as the result of dealing with objects of 'low discriminability' (objects that are hard to tell apart). In many cases, these objects really are hard to tell apart, and in others we simply make them hard through our sequencing. Consider some of the items listed in the NCTM's wonderful 13 Rules That Expire, which students often misapply:

  • When multiplying by ten, just add a zero to the end of the number.
  • You cannot take a bigger number from a smaller number.
  • Addition and multiplication make numbers bigger.
  • You always divide the larger number by the smaller number.

In some sense, these are problematic because they are like the sparrows and finches above when presented only in groups: they are harder to stop because we don't present them in situations that break the rules, or interleave them. Appending a zero to a number to multiply by 10 does work on counting numbers but not on decimals; addition and multiplication do make counting numbers bigger, but they don't always make fractions bigger; and you cannot take a bigger counting number from a smaller one and get a counting number. For that, you need integers.
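Here is a quick sketch of where three of those rules expire. The helper names are mine, and the particular numbers (25, 2.5, one half, 3 and 5) are just convenient examples.

```python
from decimal import Decimal
from fractions import Fraction


def append_zero(numeral: str) -> str:
    """The 'just add a zero' rule, applied literally to the written numeral."""
    return numeral + "0"


def times_ten_rule_holds(numeral: str) -> bool:
    """Does appending a zero give the same value as multiplying by 10?"""
    return Decimal(append_zero(numeral)) == Decimal(numeral) * 10


rule_holds_for_counting_number = times_ten_rule_holds("25")   # 250 vs 250
rule_holds_for_decimal = times_ten_rule_holds("2.5")          # 2.50 vs 25.0

half = Fraction(1, 2)
product_is_bigger = half * half > half                        # 1/4 vs 1/2

difference = 3 - 5                                            # smaller minus bigger
difference_is_counting_number = difference > 0

print(rule_holds_for_counting_number, rule_holds_for_decimal)
print(product_is_bigger, difference_is_counting_number)
```

Each rule passes in counting number land and fails the moment a decimal, a fraction, or a negative result shows up.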

Notice any similarities above? Can we please talk about how we keep kids trapped for too long in counting number land? I've got this marvelous study to show you which might provide some good reasons to interleave different number systems throughout students' educations. It's linked above, and below.

Reference: Birnbaum, M., Kornell, N., Bjork, E., & Bjork, R. (2012). Why interleaving enhances inductive learning: The roles of discrimination and retrieval. Memory & Cognition, 41 (3), 392-402. DOI: 10.3758/s13421-012-0272-7

Sunday, March 13, 2016

A Thought About the Chinese Room

In 1980, philosopher John Searle proposed a thought experiment called The Chinese Room. Here's a brief—and a bit fast-moving—summary of it:

Although the thought experiment was apparently meant to illustrate the impossibility of "strong AI," or artificial intelligence capable of consciousness and human-like "mind," it has clear relevance for thinking about education as well. For example, what differentiates "meaning-based" understanding from mimicry?

Three Little Words: "I Don't Know"

You are invited to notice that the room's occupant doesn't actually know Chinese, but simply matches characters from an input message to a page in a rule book in order to generate an output message. Presumably, the particular method for processing inputs is not important, so long as the person inside the room does not understand the characters he is receiving or sending out.

The problem here is that this observation is not interesting unless you smuggle in some assumption about how the human mind works. It is not interesting that the person in the room doesn't understand Chinese unless you make the assumption that something in a Chinese speaker's brain does understand Chinese. Needless to say, this is not the case. Neurons don't understand Chinese. They are precisely as clueless as the man in the room.

The correct comparisons—at the comparable scales—should be Chinese room : Chinese speaker and person in room : inner workings of the mind. The thought experiment misleads us into comparing the person in the room with the Chinese speaker. This would work if the speaker were identical with the inner workings of her mind. Which is exactly the assumption we make about people, even though we know it has to be wrong.

Exactly how the human mind differs from elaborate mechanical rule-following is beside the point. The point is that for the Chinese Room thought experiment to have its intended effect, it seems you must assume that the human mind has some kind of "meaning" mechanism which is not built out of dumber parts. But John Searle didn't know this in 1980. And you don't know it now. It is an unwarranted assumption.

And almost certainly false.

Saturday, March 5, 2016

Retrieval Practice Effective for Young Students

These results are about as straightforward as they come in the social sciences. In an article published in Frontiers in Psychology, researchers report the results of three experiments which show that the benefits of retrieval practice (practice with retrieving items from memory) extend to children as much as to adults.

To get some sense of what retrieval practice is like, go read the abstract of the paper linked above. Then, crucially, without re-accessing the article, write down everything you can remember about the abstract on a blank piece of paper. So long as you can remember something from the abstract, this act of retrieving from memory what you read (i.e., retrieval) is much more beneficial to your remembering the information in the long term than just re-reading the passage.

Research tells us that there is surprisingly little else to say to qualify this statement about the benefits of retrieval. It works even in the absence of feedback, and, as is reported in the current study, it works (better than repeated study) perhaps regardless of students' processing speed and reading comprehension ability.

Still in Phase 2

The idea that retrieval must be at least in part successful for it to work seems to suggest a bit of a chicken-and-egg issue:

Successful retrieval is essential for retrieval practice; if students cannot recall much, there will likely be little or no benefit of retrieval practice (Karpicke, Blunt, Smith, & Karpicke, 2014). For example, Karpicke and colleagues (2014, Experiment 1) examined free recall—a retrieval-based learning strategy that has been shown time and again to be extremely effective for adult learners (e.g., Roediger & Karpicke, 2006)—in 4th graders and found no benefit on learning measured 5 days later. Initial retrieval success in this experiment was very low (around 8%).

This may read as though successful retrieval is essential for retrieval to be successful. But this is mostly because of the two different measurements of success being used—successful retrieval refers to recalling some part (greater than 8%, presumably) of already learned material at any given moment or short window of time; retrieval practice is considered successful when you remember the material over longer stretches of time than with other strategies.

Still, this potential confusion may bring into relief the importance of recognizing the place for retrieval practice in learning: it strengthens memory; it doesn't create memory. As such, it is situated firmly in Phase 2—it seems to work best once students have progressed at least part way, if not all the way, through the acquisition phase. We have to be careful about this. "Input less, output more" is a great message for Phase 2 learning. In a 'culture' (I guess) that has a hard time dealing seriously with acquisition issues of learning, it's a potentially dangerous one. We still have to provide students with rich inputs in order to make retrieval any good.

Not the Only Interpretation

To be honest, what prompted me to write up this study was my wondering how the success of retrieval might be interpreted in a regular classroom, by a teacher without access to the hypotheses and previous results that informed this study. What might a "layperson" conclude was the explanation for the superior performance of the retrieval group, given no information about the different groups in the study except for their final scores and their overt behavior during the study?

There are all kinds of different behavioral interpretations possible when hypotheses about cognition are out of reach. One might conclude, for example, that the retrieval group was allowed to "take more ownership of their learning"—you see, because they weren't helped as much; they had to fill in the missing letters of the words to be remembered rather than just reading them off of a list. One might say that students in the retrieval group were "learning by doing," especially if participants in that group were writing when other participants weren't. And this writing is very 'kinaesthetic', don't you know.

It's not even all that hard to see how parts of that Pyramid of Learning nonsense could be a folk-psychological, behavioral misinterpretation of benefits more appropriately accorded to retrieval practice. Engaging in retrieval "in the wild" may often look like those activities closer to the bottom of the pyramid than to the top. But of course those activities don't necessarily cause better performance; a positive confound, such as retrieval practice, may be responsible for both the activity and the better performance. When science is absent and we must rely on observations alone, misunderstanding the underlying cause of better retrieval may then naturally result in trying to replicate superficial features of situations correlated to improved performance. Misunderstanding then slowly becomes common wisdom and philosophy.

Reference: Karpicke, J. D., Blunt, J. R., & Smith, M. A. (2016). Retrieval-Based Learning: Positive Effects of Retrieval Practice in Elementary School Children. Frontiers in Psychology. DOI: 10.3389/fpsyg.2016.00350