Andy Matuschak


Studying myself studying linear algebra

After the past few months digging into research on problem-solving practice and reading comprehension, I felt lost in a fog of abstraction. I needed to ground all those ideas in something real and concrete. Are these all just threshold effects? Do these problems basically just go away with moderate reading strategy skills and appropriately-leveled problem sets? If so, what issues remain?

So this month, rather than observing another student, I decided to observe myself. What important barriers remain in my learning experience—even when I use my memory system, apply “expert” reading comprehension strategies, and solve every exercise? I want to create an alien sense of ease and confidence when learning from an explanatory text. Alex, the student I worked with a few months ago, had troubles which seemed to be rooted in reading comprehension and a need for more problem-solving scaffolds. If I make it past those issues, where do my systems and strategies fall short?

As it happens, I’ve been looking for a good excuse to study linear algebra. My last attempt was about 17 years ago at Caltech. I’d attended a liberal arts-focused secondary school, which left me wildly unprepared for Caltech’s theory-laden math program. I absorbed very little. These days, when I dig into computer graphics and machine learning papers, I’m often frustrated: my fragmentary understanding of linear algebra spreads like contagion into fragmentary understanding of ideas in graphics and machine learning. I’d really like to understand these topics more solidly!

It’s also nice for my experiments that linear algebra involves a good mix of declarative, conceptual, and procedural knowledge. That is, the subject involves: a zoo of notation and terminology; interrelated mathematical objects and their myriad properties; and essential methods for manipulating those concepts. Problem-solving practice emphasizes procedural knowledge; memory systems traditionally emphasize declarative knowledge, but I’ve been trying to stretch them into conceptual (and, to a lesser extent, procedural) domains.

I chose Jim Hefferon’s Linear Algebra because it’s both well-regarded and licensed permissively: I knew I might want to build experiments adapting whatever book I used. For my initial study period, I read the first 35 pages in a careful, interrogative fashion (like in my video with Dwarkesh), writing and reviewing 65 memory prompts along the way. I solved all 57 problems included in those sections, checked my answers against the solution set, and wrote notes about any errors or interesting differences in my solutions. That whole process took about 20 hours.

I emerged with what feels like a strong understanding of the material. But my internal experience was far from “an alien sense of ease and confidence”. I felt unsupported in a number of important ways by the learning environment I’d created. Happily, many of these observations point to interesting paths for future prototyping and exploration.

Comprehension support

Before I can worry about building rich understanding or reinforcing my long-term memory, I need to ensure that I simply comprehend what the text is saying.

I’ve built habits around reading comprehension strategies like questioning and elaborating the text. These put me in a better position than many readers, but I know from past experience that they’re not enough: I still often discover that I’ve missed important points from the text.

For this book, though, I had extra help from my memory system and the problem sets. Let’s look at the impact of those supports on my reading comprehension.

Prompt-writing and comprehension support

My reading comprehension is much more reliable when I’m writing thorough memory prompts about a text. The process puts me into an active state of mind. I’m on the lookout for anything that seems important; I’m less likely to gloss over key details. And in order to transform those details into retrieval tasks, I usually need to comprehend them in at least some basic way.

That’s all great. But I do notice a few important limitations.

Sometimes I don’t know how to write prompts. Learning to write good prompts is like learning to write good prose: you build up an enormous library of micro-strategies for dealing with different situations. But I don’t (yet) have strategies for writing memory prompts about some kinds of material. For instance, in this book, an explanation of a preferred form for solution sets begins by presenting an abstract symbolic representation; then it delves into several contrasting examples to demonstrate how that form plays out in practice and to give the reader a feel for why it’s expressed as it is. It’s easy to write some basic prompts exercising the abstract symbolic form. It’s much harder to write prompts which involve the subtle points which the examples demonstrate, and the various reasons why this form is preferred. If I really focus and get creative, I can generally figure something out. But often that feels too burdensome, so I’ll just keep moving, sometimes without feeling I’ve explicitly made that choice. In these cases, prompt-writing hasn’t really checked my comprehension. (And, of course, my memory of those details won’t be reinforced—more on that later.)

Prompts create obligations. I want to make sure that I comprehend everything that the author says. But I don’t necessarily want to sign up to repeatedly practice everything the author says. When I lean heavily on memory prompts to reinforce my comprehension, I’ll often feel burdened later by the density of prompts. I’ll find I don’t care about many of the finer details, or that those details are adequately reinforced by later synthesis prompts. I can delete the excess prompts, of course, but there’s a small cost to each deletion decision. And it takes a lot of work to write all those “unnecessary” prompts—much more than, for instance, just explaining the text aloud to myself as I read. Some of that effort produces a more elaborated understanding, but much of it feels like wasted energy.

Shallow prompts, shallow comprehension. When a key definition is provided, it’s easy to write a prompt by simply paraphrasing the definition as given in the text. And that’s a problem because it’s easy to paraphrase without comprehension, as the self-explanation literature has found. Now, prompts which simply paraphrase the text usually aren’t very good. Prompts often need to distill and elaborate to be effective. So one could respond to my complaint by saying: “just write better prompts!” Sure. But I’ll point out that the medium doesn’t help me do the right thing here. It’s easy to inadvertently write shallow prompts, and to avoid that, I need to both apply constant monitoring—difficult when learning new material—and also spend more effort on prompt-writing—but it’s not always obvious when it’s “worth it”.

The mnemonic medium and comprehension support

If prompt-writing is an important comprehension strategy, that seems to pose a problem for the mnemonic medium, where prompts are written by others. Now, the embedded memory prompts do (inadvertently) act as a kind of basic comprehension check. If a prompt about a passage you just read seems baffling, that’s unlikely to be an issue with forgetting. You probably skimmed over the relevant material. The embedded prompts have indirect effects, too: many readers have reported that after discovering that their comprehension was so poor, they start reading more carefully (as the adjunct question literature predicts).

But we designed the embedded review interface with memory practice in mind, not comprehension checks. I’ve observed enough mnemonic medium readers to see that memory failures and comprehension failures feel very different.

Memory failures usually feel like uncovering something you once knew but have since forgotten: “Ah… where did the imaginary term go again? (Reveal answer) Oops, it’s in the upper right cell. OK, better review that again.” Or: “I didn’t remember that, but I don’t care about remembering that. (Delete)” Readers are more comfortable declaring that they don’t care about something when they know what that something is.

By contrast, comprehension failures often feel arbitrary and capricious: “I don’t really know what this is talking about. (Reveal answer) O…kay? I don’t get it.” The interface presents two buttons: “Remembered” and “Forgotten.” Both of these feel bad to this reader. They feel they’re “supposed” to click “Forgotten”, but they also know that this will just make the exact same question reappear at the end of the session. They don’t want to review this question again: they know it won’t be any less confusing next time, and because they don’t understand the answer, they don’t value knowing it. They know that if they answer the prompt “correctly” when it’s repeated, it’ll only be because they’re parroting without understanding, and parroting feels bad. They can click “Remembered” to “make the prompt go away”, but that doesn’t feel good either: maybe it’s important? And they don’t necessarily want to sign up to answer it again in the future.

If we’re going to use adjunct questions as comprehension checks, I think we’ll want to differentiate them more from retrieval practice prompts, probably in both content and interface design.

Problem sets and comprehension support

With my recent literature reviews fresh in my mind, I noticed that problem sets (in this and other university-level texts) seem to serve three distinct goals:

  1. Check comprehension: ensure that you actually read and understood the text.
  2. Facilitate skill acquisition: practice applying what you’ve learned to induce patterns, reinforce long-term memory, and build procedural automaticity.
  3. Stimulate elaboration: induce you to draw connections, notice additional properties, think creatively. This promotes both richer understanding and higher memory stability.

Different problems mix these goals in different ratios, of course, but all problems implicitly involve some kind of comprehension check. You can’t solve the problem if you didn’t understand the part of the text it’s depending on. The advantage here is that you get a comprehension check by doing—by putting the material to some use. This tends to be much more enjoyable than directly answering rote comprehension (and memory) questions. That’s particularly true when the problems are authentically interesting for their own sake, rather than just drill questions.

But largely because of their overloaded goals, the problem sets in this book don’t offer as much comprehension support as I want. And I think these issues are true of problem sets in other similar books.

Bad failure modes. At times I found myself stuck, but it wasn’t necessarily clear that I was stuck because I glossed over some important point in the text. When comprehension isn’t the issue, stubborn perseverance is usually appropriate, so I’d wonder if I just needed to try harder. So I’d flail at the problem, and the flailing wasn’t constructive, because I simply lacked some relevant information. The trouble is that I often couldn’t tell which situation I was in. The best remedy for a comprehension gap is usually to re-read some explanation in the text, but even with the solution manual in hand, it wasn’t necessarily clear where I should focus. In these cases, I’d end up flipping through the chapter again, looking for something that might be relevant. This seems like a more or less universal issue with problems as comprehension checks.

Biased coverage. In the course of a chapter, you’ll often learn that something is true, why it is true, and why that matters. Problems emphasize application, analysis, and synthesis, so they’ll mostly check if you comprehended that the thing is true, but not so much the other discussions. For example, the text tells me that every solution of a linear system can be expressed as the sum of a particular solution and a solution of the associated homogeneous system, where the homogeneous solutions are the linear combinations of vectors weighted by the free variables. Much of the chapter was spent discussing a proof of this property and some of its implications. But none of the problems checked my comprehension of the proof, and the interpretative remarks were only partially probed. I tested myself later to see if I could explain the proof, and I realized that I hadn’t understood a central move, though I had successfully completed the problem set.
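
For readers who want the symbolic shape of that claim, here is a rough sketch. The notation is illustrative and may not match the book's exactly, and the one-equation example is an illustration added here, not one taken from the text.

```latex
% General form: a particular solution p plus any solution of the
% associated homogeneous system, with one coefficient per free variable.
\[
  \{\, \vec{p} + c_1\vec{\beta}_1 + \cdots + c_k\vec{\beta}_k
     \;\mid\; c_1,\ldots,c_k \in \mathbb{R} \,\}
\]

% Illustrative example: the single equation x + 2y = 4, with y free.
% Particular solution (4, 0); homogeneous solutions y(-2, 1) satisfy x + 2y = 0.
\[
  \left\{ \begin{pmatrix} 4 \\ 0 \end{pmatrix}
        + y \begin{pmatrix} -2 \\ 1 \end{pmatrix}
  \;\middle|\; y \in \mathbb{R} \right\}
\]
```

A problem set can easily check that a reader can produce the second expression without ever checking whether they understand why the first one always holds.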

You could argue that this is just a flaw in the textbook—that a problem set should exercise your understanding of everything the corresponding book section says. But I think problems are naturally predisposed to check comprehension of certain kinds of material and not others. If you push against that grain, you’ll end up with a different kind of activity, something that doesn’t feel like a problem.

Which problems cover new ideas? Some problems look very similar, but each has been cleverly constructed to exercise some different facet of the underlying material. But sometimes similar problems are just about repetition, to promote fluency. Often I’d feel like I didn’t need to, say, solve yet another system of linear equations: I felt fluent enough! But some of those problems hid subtle differences, novel comprehension checks. So I did every subproblem, and many of them felt unnecessary. Or, to put it another way: many redundant subproblems should have been “smeared” across the subsequent weeks, to support long-term memory. But I wouldn’t want to delay the comprehension checks.

Got the answer, missed the point. In a few cases, a problem was constructed so that it could be straightforwardly solved by applying some property that was discussed in the text. I’d solve it laboriously, and get the same answer, but I’d missed the point. To notice this happening, I needed to not just check my answers but to retrace the steps of every solution, comparing them to my own and watching out for important differences.

Skill acquisition

When we learn to do something new, a very interesting transition occurs. At first, we think explicitly about the various objects and actions at play. There’s often explicit verbal rehearsal: “To perform Gaussian elimination, there are three operations I can use…” But with practice, we learn patterns. Then we “just know” what to do, without the feeling of active retrieval: “Alright, halve this equation, then use it to knock out the first term of the other one…”[1]
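
To make the "three operations" concrete, here is a minimal sketch of the row operations of Gaussian elimination applied to a small system. The function names and the example system are assumptions chosen for illustration; the steps mirror the "halve this equation, then knock out the first term" pattern rather than any particular exercise from the book.

```python
from fractions import Fraction

def swap(rows, i, j):
    """Row operation 1: exchange equations i and j."""
    rows[i], rows[j] = rows[j], rows[i]

def scale(rows, i, k):
    """Row operation 2: multiply equation i by a nonzero constant k."""
    if k == 0:
        raise ValueError("scaling by zero is not allowed")
    rows[i] = [k * a for a in rows[i]]

def combine(rows, src, dst, k):
    """Row operation 3: add k times equation src to equation dst."""
    rows[dst] = [d + k * s for s, d in zip(rows[src], rows[dst])]

# Augmented rows [a, b | c] for the system 2x + 4y = 10, 3x + y = 5.
system = [[Fraction(2), Fraction(4), Fraction(10)],
          [Fraction(3), Fraction(1), Fraction(5)]]

scale(system, 0, Fraction(1, 2))     # halve the first equation: x + 2y = 5
combine(system, 0, 1, Fraction(-3))  # knock out x in the second: -5y = -10
scale(system, 1, Fraction(-1, 5))    # y = 2
combine(system, 1, 0, Fraction(-2))  # back-substitute: x = 1

x, y = system[0][2], system[1][2]
print(x, y)  # 1 2
```

The explicit function calls correspond to the early, verbal stage of the skill. Fluency is the point at which a learner would pick these steps without consulting such a list.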

In my experience, retrieval practice doesn’t cause this transition by itself, though it seems to make the transition easier. Maybe that’s because reliable long-term memory for the relevant declarative knowledge reduces working memory load. Problem-solving practice is the traditional way to produce this transition. That mostly worked quite well for me, but I’ll mention a few challenges which seem like interesting opportunities for improvement:

Problem-solving practice should be smeared over time. I finished reading each section, then did all the problems associated with that section. Many of those problems were about repeatedly practicing the same skill: say, finding the solution set of a linear system with free variables. That’s great—it’s what I need to do to get fluency—but it’s awfully unpleasant to solve ten problems like this in a row. Courses traditionally solve this by assigning only a fraction of the book’s problems, but what if I need lots of practice to become fluent? Emotionally and practically (per the spacing effect), it would be better to spread problem-solving practice out over time. In reality, that’s tough to orchestrate. You might want, say, one problem a day for a few days, then every other day, then twice a week, etc? This requires a kind of programmable attention.
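
As a rough sketch of what such a schedule could look like, the function below assigns one problem per practice day: daily at first, then every other day, then roughly twice a week. The function name, parameters, and specific gaps are all assumptions for illustration, not tuned values.

```python
def practice_days(n_problems, daily=3, alternate=3, gap_cycle=(3, 4)):
    """Return day offsets (0 = today) on which to attempt one problem each.

    daily:      how many problems fall on consecutive days
    alternate:  how many gaps of two days follow
    gap_cycle:  repeating gaps afterwards; (3, 4) approximates twice a week
    All three defaults are illustrative guesses.
    """
    days, day = [], 0
    for i in range(n_problems):
        days.append(day)
        if i < daily - 1:
            day += 1
        elif i < daily - 1 + alternate:
            day += 2
        else:
            k = i - (daily - 1 + alternate)
            day += gap_cycle[k % len(gap_cycle)]
    return days

print(practice_days(10))  # [0, 1, 2, 4, 6, 8, 11, 15, 18, 22]
```

Ten similar problems that would otherwise be solved in one sitting end up spread across about three weeks. A real system would presumably adapt the gaps to performance instead of fixing them in advance.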

How much practice should I do? Maybe I only needed to do half of the book’s problems. I’m not sure. “Mastery-based” learning systems will have rules like “keep practicing until you can do five in a row.” The trouble here is that it’s possible for me to achieve this criterion without having learned the kinds of patterns which let me “recognize directly what [I] formerly had to think through.” Then I might struggle on some more difficult skill which depends on this prior skill, because the prior skill still requires so much cognitive load, and it wouldn’t necessarily be clear why. “Khan Academy says I mastered all the prerequisites!” The literature on procedural knowledge has probably identified some useful answers here, but I haven’t yet read deeply in that topic. As an example, I’d guess that time pressure (e.g. Number Munchers?) would reveal a sharper difference.
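
The gap between "five in a row" and real fluency can be illustrated with a small sketch. The data and the 60-second limit below are hypothetical; the point is only that the same attempt history can pass a streak criterion and fail a time-limited version of it.

```python
def longest_streak(attempts, time_limit=None):
    """Longest run of correct attempts.

    attempts is a list of (correct, seconds) pairs. If time_limit is given,
    an attempt only counts when it is correct and took at most that long.
    """
    best = run = 0
    for correct, seconds in attempts:
        ok = correct and (time_limit is None or seconds <= time_limit)
        run = run + 1 if ok else 0
        best = max(best, run)
    return best

def mastered(attempts, streak=5, time_limit=None):
    """A 'five in a row' style criterion, optionally under time pressure."""
    return longest_streak(attempts, time_limit) >= streak

# Hypothetical history: (correct?, seconds taken) for seven attempts.
attempts = [(False, 200), (True, 180), (True, 150), (True, 160),
            (True, 170), (True, 140), (True, 45)]

print(mastered(attempts))                 # True
print(mastered(attempts, time_limit=60))  # False
```

Whether time pressure is the right signal is an open question; it is used here only because it is easy to measure.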

How to practice proof problems? This book includes plenty of problems which require proofs. These are less rote, of course: you’re not learning to perform some consistent operation. Proofs are in part about pushing you to understand more deeply, which we’ll discuss in the next section. But these proof problems are also about learning to recognize certain patterns, about becoming fluent in exploiting specific properties of these mathematical objects. So, what should I do when I fail to recognize the pattern, or when I fail to see how to exploit the property? I can read the solution and write a note about the insight I missed—maybe even a memory prompt about it. But how to practice further? How to ensure I recognize that pattern? Proof problems aren’t as fungible as more routine problems; I can’t easily arrange to slip in another proof problem which exploits the same property.

Building deeper understanding

I don’t want to just know what the text says. I want to know what it means, and why it matters. I want to connect these ideas to my prior experiences, and to deploy them creatively in the future. All in all: I want to internalize a richly integrated and elaborated representation of the material.

Writing memory prompts helps with this a little: the process encourages me to generate examples, to infer details which the text omits, to consider implications, to clarify why I find a detail important. These are all relatively “expert” prompt-writing practices, not things most memory system users would experience. A mnemonic medium reader wouldn’t get this benefit through prompt-writing, but they could get this benefit if the author’s prompts were designed to cause them to think those thoughts.

This linear algebra book’s problem sets did that for me. Many of the problems were clearly designed to encourage deeper consideration of details in the main text. One simple example: “In the proof of lemma 3.6, what happens if there are no 0=0 equations?” I hadn’t thought about that case when I read the proof, despite my prompt-writing practice.

But the problem sets were mostly focused on problem-solving practice in service of skill acquisition. Perhaps more than straight memory support, I’d like to answer dozens more new questions which encourage elaboration and inference, perhaps one or two per review session, for weeks. Ideally, these questions would slowly increase the transfer distance involved.

Remembering what I’ve learned

Having gone to all this effort, I’d like to make sure that I remember what I’ve learned, for the long term. My memory practice will help with this. But there are a number of interesting issues, some of which we first discussed in “Fluid practice for fluid understanding.”

To what extent is explicit retrieval practice obviated by good problems? Many of my prompts aim to make me retrieve a single detail: e.g. what is the definition of singularity? But if I solve a problem which requires using that definition, then that problem also makes me retrieve it from long-term memory, which will make it more stable in a similar way—or better, via elaboration. The problem-solving task is likely to feel less rote, more authentically interesting. But it also takes more time and effort than a simple flashcard interaction. How should we think about this tradeoff? Which kinds of prompt are best practiced in focused isolation, rather than through integrative practice? Does the answer vary with familiarity?

Prompt clusters should be spread out. Given a concept like “singularity”, I should be able to give the definition. But I should also be able to: given the definition, provide the term; given an example, determine whether it’s singular, nonsingular, or neither; given that a matrix is nonsingular, conclude something about the solution set size of its associated linear system; answer some prompt which emphasizes the constraint that singular matrices must be square; etc. The trouble is that when I answer several of these questions consecutively, I don’t have to retrieve as much from long-term memory for the later prompts. Parts of the answer are still in my working memory from the earlier prompts. I think it would be better to spread highly related prompts like this out over multiple review sessions. And a right or wrong answer on one should probably influence the schedules of the others. This is one important way in which memory systems for conceptual knowledge seem to want to be organized around some different primitive from their declarative-focused forebears.
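
One simple way to spread a cluster out is to deal its prompts round-robin across upcoming sessions. The sketch below assumes each prompt is tagged with a cluster name; the function and the sample prompts are hypothetical, and it does not attempt the harder part (letting an answer on one prompt adjust the schedules of its siblings).

```python
def spread_clusters(prompts, n_sessions):
    """Assign (prompt_id, cluster) pairs to sessions, spreading each cluster apart.

    Each cluster gets its own rotating counter, staggered by cluster so that
    session 0 is not overloaded. If a cluster has more prompts than there are
    sessions, some session will necessarily hold two of them.
    """
    sessions = [[] for _ in range(n_sessions)]
    clusters = sorted({cluster for _, cluster in prompts})
    next_slot = {cluster: offset for offset, cluster in enumerate(clusters)}
    for prompt_id, cluster in prompts:
        sessions[next_slot[cluster] % n_sessions].append(prompt_id)
        next_slot[cluster] += 1
    return sessions

prompts = [
    ("singular: give the definition", "singularity"),
    ("singular: name the term", "singularity"),
    ("singular: classify an example", "singularity"),
    ("echelon: give the definition", "echelon form"),
    ("echelon: classify an example", "echelon form"),
]

for number, session in enumerate(spread_clusters(prompts, 3)):
    print(number, session)
```

With three sessions, no session contains two prompts about the same concept, so each retrieval has to come from long-term memory rather than from the previous card.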

Some prompts want variation. I like prompts which involve examples: “is this matrix singular?” But you wouldn’t want the prompt to use the same matrix every time—you’d just memorize the answer. The point is to reinforce memory for the procedure. This sort of question seems like a good opportunity for randomization, or at least for choosing from a set of pre-made possibilities, as in Quantum mechanics in a nutshell’s “application prompts”.
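
A randomized version of the "is this matrix singular?" prompt might look like the sketch below. It is restricted to 2x2 integer matrices and uses the determinant test, both of which are simplifying assumptions; the function names are hypothetical.

```python
import random

def det2(m):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

def singular_prompt(rng):
    """Return (matrix, question, answer) for a random 2x2 integer matrix."""
    a, b = rng.randint(-4, 4), rng.randint(-4, 4)
    if rng.random() < 0.5:
        # Bias toward singular cases: make the second row a multiple of the first.
        k = rng.randint(-2, 2)
        m = [[a, b], [k * a, k * b]]
    else:
        m = [[a, b], [rng.randint(-4, 4), rng.randint(-4, 4)]]
    # The answer is always computed from the matrix, never from the branch taken.
    answer = "singular" if det2(m) == 0 else "nonsingular"
    return m, f"Is the matrix {m} singular?", answer

rng = random.Random()  # pass a seed, e.g. random.Random(7), for repeatable prompts
matrix, question, answer = singular_prompt(rng)
print(question, "->", answer)
```

Without the bias toward singular cases, a purely random matrix would almost always be nonsingular, and the prompt would reward guessing instead of checking.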

Can problem-solving be integrated into review sessions? By the end of these problem sets, I felt quite able to apply what I’d learned in a range of situations. How can I ensure that this remains true? One approach would be to insert similar problems into my review sessions. But they do involve a different stance: many required a paper and pencil, and much more time than a typical prompt. I usually do my reviews on my phone, and I don’t necessarily have paper around. Maybe I need to set aside one or two sessions per week for more involved practice? Alternately, as an experiment, I was able to rewrite many of these problems into congruent forms I could solve “in my head”, without paper. How broadly can I apply that strategy? Are those “reduced” problems sufficient to maintain the skills I learned?

How to handle insights which arise during problem-solving? The most interesting problems in this book seem designed to produce specific insights as I solve them. Many of these insights are difficult to describe—a subtle emphasis or connection in the realm of symbols or abstractions. And yet I have the strong sense that I’m learning something important. How can I ensure that I retain that lesson? I often struggle to write prompts about these insights.

How to treat proofs? I’ll confess that I don’t have a strong theory of knowledge for proofs. What I want isn’t really just to know the proof, in the sense of being able to parrot it back—but rather, to develop enough intimacy with its key moves and insights that I could use them to prove some related claims. For example, the book presents a proof that reordering linear equations doesn’t change the system’s solution set. I was able to use analogous proof-strategies to demonstrate the validity of the scaling and combination Gaussian operations. But how to make sure that I retain that understanding? I feel unsure about which prompts to write, or how to write them[2]. Testing myself now, two weeks after my initial read, I find that I can reproduce the most complex proof discussed so far—but only with a fair amount of struggle.

Prompt-writing is a lot of work. I’d estimate that prompt-writing consumed around 2 of the 20 hours I spent on these sections. A 10% tax doesn’t seem so bad, given the comprehension and long-term memory benefits. But prompt-writing requires much more mental effort than mere reading, and more than much of the problem-solving. It felt like at least a third of the overall effort, and in some sections as much as half.

Some high-order bits

That was a lot of detail. Stepping back a moment, if I had a magic wand, these are my main wishes:

If I got these wishes, I think I’d feel something closer to an “alien sense of ease and confidence”. I expect I’ll embark on some prototypes in these directions in the coming weeks.

————————

Thanks to Gary Bernhardt, Elliott Jin, and Russel Simmons for helpful discussion of these topics.

————————

[1]: For a good summary of this process, see John R. Anderson’s “Learning and Memory”, chapter 9.

[2]: Michael Nielsen’s “Using spaced repetition systems to see through a piece of mathematics” is very stimulating here, but I’m not yet quite able to connect the dots myself.

Comments

Oh, and the playful title of your essay reminded me of one of my own from quite a few years ago: "What I’m learning from learning to juggle". Nowhere near as deep as your work, but a reflection of the process of learning the basic three ball juggle. Quick read and probably nothing new for you, but it may give the gratitude I have for your work some context. https://www.ahundredquirkylegs.com/2018/07/09/what-im-learning-from-learning-to-juggle/

Tobias Reber

First time poster, long time fan here. Listened to this older one the other day and just wanted to say thank you, Andy, for this one in particular. Feels like it was a core piece of self-reflection for you. Also just found and listened to the Dwarkesh interview and have a lot of notes and insights to process. In both pieces of media, and in all your work, I admired the rigor and intellectual humility and honesty. The reassurance and kinship that I find in this, while doing my own work alongside someone who's working like that, is of just as much value as the insights of the content of your work. Deep, heartfelt thanks for sharing this with us.

Tobias Reber

A thought about the obligations: I often have a lot of premature ideas pop into my head. Since I'm not sure when to deal with them and fear missing them, I feel I should write them down immediately. But if I'm using Incremental Reading, I have more chances to revisit the text, and I believe the loop is closed. I don't have to rush to write down every single thought and related card. It's better to wait until I really understand the material before I start writing cards.

Jarrett Ye

Good piece! I noticed that this process puts a lot of pressure on the quality of the book's practice questions. When studying something, especially new or esoteric, there isn't necessarily a set of nice practice problems.

O L

"Can problem-solving be integrated into review sessions?... many required a paper and pencil, and much more time than a typical prompt. I usually do my reviews on my phone, and I don’t necessarily have paper around." I really feel this remark, as I've been using Anki for math problems (and other "practice tasks") for a year and a half now. Practice problems are definitely harder to keep up with. I saw one person on Reddit who uses Anki in this way recommend that you learn no more than *one exercise a day* this way, else the review load becomes unbearable. To cast it in OR terms, it turns the programmable attention medium into a _resource-constrained_ scheduling problem: unlike flash cards, I can't do math problems (or piano practice, etc.) "in line at the (proverbial) grocery store." It takes certain conditions: pencil & paper, peace and quiet, a commitment to at least 5–15 uninterrupted minutes (instead of 30 seconds), etc. One strategy is to schedule it into dedicated work time: make reviews/practice part of my scheduled work day (just like keeping on top of my work email!). I try to do this with my math & programming exercise cards, though current project demands always trump skill maintenance, so it can be hard to find the time to avoid falling behind.

"Maybe I need to set aside one or two sessions per week for more involved practice?" This is exactly the conclusion I came to a couple of weeks ago! For hobby practice, I've been experimenting with *batching* my practice sessions. Even if each activity only requires 2–3 minutes of upkeep daily, it's just too much to try and keep up **every day** with exercises for piano (requires a piano and focus time), guitar (requires a guitar and focus time), languages (requires focus time and quiet for reading or listening), and martial arts (requires a dummy and focus time). Switching between all of those activities is a big barrier to habit formation (unlike with flash cards, which can all be tossed into the same central deck!). So I've relaxed the "do your reviews every day" principle, and am currently trying to schedule myself *once a week* for each different activity. This reduces the context-switching costs and helps me mentally prepare in advance each day for just one activity ("oh right, today is piano day: I'll think about when I can steal away to the keyboard later this evening after the baby goes to bed"). Aside: perhaps this is a good way to organize decks. Rather than a "math deck" and a "Chinese writing" deck, make it a "pencil and paper" deck so you can do everything that requires that resource while you have it! Consolidating decks usually helps simplify habit formation and keep flow.

"Alternately, as an experiment, I was able to rewrite many of these problems into congruent forms I could solve “in my head”, without paper. How broadly can I apply that strategy? Are those “reduced” problems sufficient to maintain the skills I learned?" I've done this quite a few times over the last 3–4 years: created flash cards for 1) simpler derivations that have just a few steps (ex. "how do you get from Newton's 3rd law to conservation of momentum?") or 2) an abstract basic idea of a proof that I can recite verbally without too much brain-stretching (ex. "to compute the maximum likelihood estimator of a distribution, start by rewriting it as a joint distribution, using the i.i.d. assumption to break it into a product, taking the log to make it a sum, then set the derivative to zero and solve for the maximum"). Conclusion: I don't really like these complex cards (as compared to more typical atomic mathematics-concept cards, which I find great value in). They *kind of* work? Certainly better than nothing! But the details remain hazy in my head, and the cards feel tedious to answer after the review intervals get large. Ever since I started scheduling practice cards instead, I find I like that much better, and I've even moved some of my old "do it in your head" cards *to* my practice deck because I would rather do them with pen and paper. The upside of the in-your-head cards is that you can do them in line at the grocery store *in principle.* In practice, though, when a not-so-easy-to-do-in-your-head math card comes up in my flash cards, I almost always set the phone down and say "this is a great time to take a break" (because I have an aversion to that card!).

Eric 'Siggy'

