Memory

Do Not Read (Again)

What books and resources should I not read again? That is, which ideas do I want to memorize? For example, I look at Gwern’s essays from time to time. But, to remember those ideas effectively, I must test my memory by trying to retrieve the concepts upon suitable cues. Straight rereading is rarely useful (except maybe to refresh exemplars).

Asking the Right Questions

Are you confident in the success of this plan? No, that is the wrong question, we are not limited to a single plan. Are you certain that this plan will be enough, that we need essay no others? Asked in such fashion, the question answers itself.

– Professor Quirrell, HPMOR Chapter 92

Questions

How do you explicitly store a memory upon a certain cue? How long will that memory last?

Ideas

To remember different aspects of an event or a memory, run through several cues. For example, when I was thinking about programming languages and then thought about my month-long stay at a programming company, I remembered that I thought I was too dumb to learn the new language (JScript, I think it was). I didn’t even really try. I assumed it would be beyond me and just stuck to C (“which is the best language anyway” – my 13-year-old self). I didn’t remember that aspect of my low childhood self-rating until I used the particular trigger “programming language”.

Perhaps keep a standard list of cues you use on everything. This is like the advice attributed to Feynman: always keep a list of about a dozen favourite problems at the back of your mind and test every new concept you learn for a match against them.

Skills

One cue I have in Emacs is that if I find myself doing the same task over and over, I automate it. So, the thought “I’ve done this so many times now” must trigger the memory “automate it”. One way to do that, I think, is to not allow yourself to do anything else when that thought comes. You’re not allowed to move forward. Even if you did go ahead, retrace your steps, do it the right way, and only then resume. That way, you strengthen that memory on this cue.

Similarly, when you’re writing a new function, add a new unit test. Don’t allow yourself to add a new function without a unit test.

Exemplars and Being Stuck

Hypothesis: stuck => lack of exemplars.

Why? What do exemplars do? Well, why do you get stuck?

My answer: It’s because you can’t abstract a useful cue from the situation, can’t trigger a useful high-level action, or can’t execute the action.

Example: Why was I stuck when I tried to come up with practice examples? It was because I didn’t have any memories stored on the cue “practice”. So, I kept asking my mind “what do I do for practice”, “what do I do for practice”… and it kept returning nothing.

How would an exemplar have helped? I guess I would look at different parts of the exemplar and come up with varied cues that actually have memories stored on them. For example, when I looked at Eliezer’s examples of the invisible dragon in the garage and “belief in belief”, I started thinking of other places where we have unshakable beliefs, like politics, romance, text editors, etc.

Corollary: If you want to find analogies to a phenomenon, abstract its instances in different ways, and hopefully you will have other exemplars stored on that abstraction. Like “invisible dragon in my garage” -> “unshakable belief” -> politics, etc.

Hypothesis: productivity is proportional to #exemplars used, because the fewer the exemplars, the more you get stuck.

So, how do you fix my problem with “practice”? Be clear about what you want. Do you want to abstract a concrete situation, trigger other concepts given a label, or execute an action? Here, I want to… get some… Man, I’m so confused about this. I don’t even know what I want.

I think that’s a problem with pulling a high-level concept out of nowhere. When you abstract a concept from a given situation, you’re likely to have encountered that concept before and thus have stored something on it. So, you can keep yourself on track and reach your destination.

Hypothesis: Key question - Where did you get this concept label from? Is it just some random floating label that you got or did you extract it from a concrete situation?

Why does it matter? I suspect it matters only when you don’t have any useful memory stored on that cue. The thing is, if you start from concrete situations, you’re bound to abstract only the cues that lead to useful memories, cues that you’ve used in the past. Whereas if you take a foreign concept label, you may not be able to do anything with it.

Yes, if you’ve stored something on a floating label, like “three causes of the civil war” -> “X, Y, and Z”, and strengthened that memory by repeated testing, then you will be able to regurgitate it on the test. But if you haven’t stored anything directly on that floating label and if you aren’t likely to come up with it starting from some concrete situation, then you’re not going to retrieve anything when you query your mind.

This is probably why I fail when I try to memorize a fancy new word like “opprobrium” or “salacious”. I squeeze my brain to retrieve the dictionary definition but mostly in vain. Those terms didn’t form in my mind by looking at some concrete usage that I cared about. I just saw them somewhere and thought I should remember them if I want to be a real intellectual. However, I don’t recall them in any particular situation, so I never get to strengthen and refine my memory on, say, the cue “opprobrium”. On the other hand, the word “cachet” is clear in my mind, even though I didn’t understand it a year ago, because I saw PG use it:

The thing to be when I was a kid was an executive. If you weren’t around then it’s hard to grasp the cachet that term had. The fancy version of everything was called the “executive” model.

– The Refragmentation

I took a moment to guess what it meant and then looked it up in the dictionary. This made me store its meaning and this example on the cue “cachet”. So, the next time I see it, I’ll remember this example and infer that it’s “an indication of superior or approved status” (though not those exact words - I had to look them up now).

This could be why people struggle at school - you’re given strange labels (“neoliberalism” or “Gibbs free energy”) and you’re supposed to do something useful with them. Why would you expect to be able to do that? You’ve got nothing stored on those labels! And why? Because you didn’t form those labels on concrete situations, you were just given them as passwords to memorize. Thus you never strengthened those associations.

Corollary: This is why you should always think using exemplars. Because you will abstract them into concepts that you’ve seen before and have stored something on. Or you’ll realize that you don’t have anything stored on that concept and then store something. That way you’ll keep making useful inferences and never get stuck.

Corollary: When do you get stuck? When you get an abstract cue for which you have no associated memories. Otherwise, your thoughts keep going on like a rock rolling down a mountain. There’s no stopping them.

And when do you get an abstract cue for which you have no associated memories? I guess when you haven’t come across that abstract cue before. Or rather when you haven’t tried to retrieve on that abstract cue before. And I suppose that happens when you don’t have a concrete situation that reminds you of that abstract cue. If you did, you would think of the abstract cue, and then realize that you aren’t able to come up with a useful solution and thus store one when you do find it.

Summary:

No exemplars; just a random concept label -> no memories stored on that cue -> stuck!

Exemplars -> concepts on which you’ve likely stored memories -> memories stored on that cue -> think with those memories
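One way to picture the two chains in the summary is as a lookup table from cues to memories. What follows is only a minimal sketch in Python, not a claim about how memory actually works. The table `memories` and the helpers `retrieve` and `think` are hypothetical names; the entries are borrowed from the examples in this essay.

```python
# Hypothetical model: memories are stored on cues, like values stored on keys.
memories = {
    "unshakable belief": ["politics", "romance", "text editors"],
    "cachet": ["PG on the 'executive' model"],
}


def retrieve(cue):
    """Return whatever is stored on this cue; an empty list means a blank."""
    return memories.get(cue, [])


def think(cues):
    """Try each cue in turn and return the first one that triggers memories."""
    for cue in cues:
        found = retrieve(cue)
        if found:
            return cue, found
    return None, []  # every cue came back blank: stuck


# A floating label with nothing stored on it returns nothing, however often you ask.
stuck = think(["practice"])

# An exemplar supplies extra cues abstracted from it, and one of them hits.
unstuck = think(["invisible dragon in my garage", "unshakable belief"])
```

Repeating the same blank cue never changes the result in this sketch; only a different cue list does, which is the point of looking at the concrete situation.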

Why would exemplars give rise to concepts on which you’ve likely stored memories? Because you only get to concepts from exemplars by using existing abstractions. Like, the guy saying “the dragon is impermeable to flour” and refusing to accept that it doesn’t exist makes me abstract the situation as “unshakable belief”. Now, if I’ve strengthened an abstraction like this, I must have used it earlier. That is, I must have some memories stored on it. Like “politics”, “romantic love is awesome”, etc. Exemplars make you more likely to encounter and retrieve on that concept and thus strengthen it.

In other words, you need homegrown concepts, not foreign ones.

Will I ever have an abstraction on which I have no memories at all? Now… I’m stuck here. I need exemplars. Wait. Why am I stuck? “abstraction + no memories” -> [blank]. Wow. I think this is the crux of problem-solving. Sometimes, you ask questions for which you don’t have any stored answers. How do we deal with them?

Here’s the thing: Why do you want that abstract memory? You don’t. You want to solve some concrete problem. If you didn’t, you could just make stuff up and present that as the answer. No. The constraint is that this abstract solution you’re after should actually solve some concrete problem you have.

So, look at the concrete problem. If you abstract it in different ways, you might come up with a solution you’ve stored earlier. If not, experiment with all the possible responses, see which one solves your concrete problem, and then store that as your answer. In other words, focus on the decision that matters, not the abstract question of “truth”.

Lesson: In short, never try to solve abstract problems. Take up concrete problems and then, in the process of solving them, come up with good abstractions. If you don’t have a concrete problem, why bother trying to solve the abstract one?

What about abstract logical inferences, like “man => mortal”, “socrates is a man”, and thus “socrates is mortal”? Especially in math, where you do this a lot in theorem-proving. Well, there you would hopefully have stored the inference on the abstract patterns that you perceive in each lemma. You would look at the concrete steps and instead see “A => B” + “~B” => “~A” (modus tollens, the inference that the contrapositive licenses). The key is that you’re still not dealing with abstract foreign concepts. Sure, you might have a bunch of “lifeless” symbols at each step of the proof, but you know how to abstract them into sensible categories (like “A => B” and “~B” above) and then you would have the correct inference (“~A”) stored on the cue of those categories.

The trouble only comes when you have no clue how to categorize a symbol or infer using it. Like “opprobrium” - what are you supposed to categorize it as? Is it a compliment or an insult? A noun or a verb? What do you infer using it? Is the speaker going to treat you well or not?

My trouble with “abstraction + no memories” above was that I couldn’t categorize this as something else or infer anything from it. I needed different cues, which I could only get by peering at the concrete problem.

So, an abstract problem is one for which you don’t have any solutions directly stored on that key. If you just keep straining your mind with that blank cue, you will remain stuck. Your mind is not magic; it can’t answer random questions. Look at the concrete problem instead.

Corollary: You can’t solve abstract problems.

Like “design deliberate practice” or “come up with a great causal thinking algorithm”. You’ve got nothing stored on those abstract cues. So you will get stuck. Better look at the concrete problems and take it from there.

This could be a big reason why you’re recommended to take up concrete projects and finish them. Like “summarize a PG essay” or “answer these 11th-standard biology textbook questions”. You get to abstract them into useful cues and make progress instead of getting stuck.

Note that “concrete” is relative. All that matters is that you be able to categorize or infer from the data given. It can be abstract nonsense but still trigger lots of memories (theoretical math problems) or some hands-on thing like folding your laundry.

We call a problem “abstract” when you’re just given high-level labels and you have no clue how to proceed. Like “how do we solve politics?”. Scott Alexander posed that question obliquely in his post about Manufacturing Consent. And I was stuck - I couldn’t see how to solve such an enormous problem. But that’s just because I’m working with the abstract label “politics + how to ‘solve’ it?” I haven’t got anything stored on that cue, especially because of the “solve” part. Instead, we should look at the concrete problems, like Democrats vs Republicans, atheism vs religious groups, etc. and specify our exact requirements.

So, when you’re making a plan to attack a problem (like “design deliberate practice”), make a list of concrete problems that you want to solve (like “come up with three essays where you need to describe the decision involved”).

How Do We Form Categories?

Going by the above idea of how exemplars lead to useful concepts whereas unfamiliar concept names just get us stuck, maybe this is how we build categories.

Hypothesis: Maybe we look at several exemplars that are said to be in the same category (iPhone 6s, Galaxy Note, Micromax) and then try to categorize each one. If we end up with different labels for them, we update our categorization till we label them all the same (“smartphones”).

Or wait. Apparently getting exemplars from different categories is more helpful for category formation (I remember reading about this in Make it Stick). So, take Moto Razr, Nokia 1600, and the Blackberry, along with the earlier three phones.

TODO: Not sure how to proceed here.

Maybe you refine your categorization till you categorize the former and the latter phones differently.

Nothing will happen if you categorize things “correctly”. (When will that be? How will you know you’ve categorized “correctly”?) But if you categorized them wrongly - calling the Nokia 1600 a “smartphone” - then you’ll notice and change your abstraction.

Why is Memory Important?

Because you’re not just losing information, you’re losing valuable information. That’s money going down the drain! For example, every time you forget to write scripts to automate tedium, you spend hours doing it by hand and time is money. (Better examples coming.)

Scattered Inconsistent Models Are Fine

Unique predictions from cue-memory theory: Scattered, unparsimonious, and even contradictory ideas are fine; unified models not needed.

Differing prediction between my old understanding of memory and thinking, and my cue-memory understanding: Humans can easily handle scattered, unorganized pieces of information. You don’t need a unified model to “make sense” of a few situations. Earlier, I thought you needed to find a simple description of the entire field before you could go ahead and solve problems. That was probably one reason why I took to reading textbooks instead of working with what I’d heard in class. It was my perfectionism - if it wasn’t a clean, mathematically simple model, I wasn’t interested in it. TODO examples?

Key insight: You can start getting marginal value from a theory, without understanding the whole of it. You just have to remember what to do for a particular class of situations. This, I guess, is how the people who learnt well from lectures did it. They accepted a few isolated answers for a few isolated questions and made the most of them.

Problem: You might miss the forest for the trees when you accept such scattered ideas.

This is how we compartmentalize! Scientists apply rigid empirical and inferential standards in their journal papers (“the correlation between a and b could be confounded by c”), but fail to run any such checks outside the lab (“spirituality is awesome”).

Corollary: This is how you can end up with inconsistent beliefs. There’s no consistency checker.

You could end up failing to go for a unified model if you just accept the individual pieces of information.

Mindfulness Is Also About Empiricism

It’s not just about stopping unwanted thoughts.

For example: “Domino’s” -> will be awesome, must have it;

I get cravings: don’t have it -> awful!

Your happiness was never part of the equation. Nor your health.

mindfulness -> were you really happy when you ate that pizza?

Why Courts Might Use Precedents

You need exemplars to make fine-grained inferences. You simply can’t specify the membership criteria for a category in sufficient detail.

For example, was defendant X guilty of “gross incompetence”? Well, you can argue all day in circles. Or you can cite old cases and show that this current case is very similar (or not) to them. The second method is far more tractable for the human mind. We have lots of details and thus a lot of variables on which we can compare the two cases - the profession, the amount of damage done, the reliability of witnesses, etc.

We can combine the old cases (along with the wording of the law itself) and come up with a decent way to categorize - like, all the successful cases had features A and B and all the unsuccessful cases had feature ~A or ~B (though more detailed, of course). We can use them to make future judgments. And if we see a truly unusual case, we think about it a lot and set a precedent, which will be used by judges in the future. Whereas if you just have an abstract rule saying that the person “should have screwed up a lot” (paraphrasing), you’re going to have a useless debate.

Also, why would we want to have different allegations mentioned separately? So that the defendant can argue, for each allegation, about how similar that particular aspect of his case is to that of previous cases. If you don’t mention the particular variable you’re concerned about, then he doesn’t get this chance and so is probably given short shrift.

Language Design Mode and Reductionism

This is one of my favourite Steve Yegge quotes:

After you write a compiler, … for weeks afterwards, you can’t look at your code without seeing right through it, with exactly the same sensation you get when you stare long enough at a random-dot stereogram: you see your code unfold into a beautiful parse tree, with scopes winding like vines through its branches, the leaves flowering into assembly language or bytecode.

– Steve Yegge

I’ve had that happen myself when I was writing some language tools (compilers, a type-inference program, etc.). Why would that happen?

I suspect it’s because you’ve been strengthening the categories of the parse trees and their effects. When you normally write a program, you just see the things you’ve put together to do something you want. But, when you’re in language-design mode, you categorize it differently from usual. You see how the design choices are helping or hindering your work (whether you can add new fields to an object at runtime, as in Python, or not, as in Java), and what the high-level instructions actually get turned into (like the do-notation in Haskell turning into a chain of lambdas joined by bind, >>=).

Hmmm… language design work makes you think reductionistically. You start getting a technical understanding of what a high-level language element actually means. And this is the purest form of reductionism because the high-level element gets turned into low-level elements (IR or assembly language) that you can easily examine (unlike atoms).

Corollary: When you use a language without knowing how the expressions actually get translated, you don’t fully understand it. You may not know its performance tradeoffs (for example, between a foldl and a foldr in Haskell).
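A cheap way to get a taste of this “seeing through” without writing a compiler is to ask a language what it turns your code into. The sketch below uses Python rather than Haskell, purely as a stand-in, and relies on the standard-library `dis` module. The function `total` is a made-up example. The exact opcode names depend on the Python version, so none are listed here.

```python
import dis


def total(prices):
    # A high-level one-liner: double each price and add them all up.
    return sum(p * 2 for p in prices)


# Collect the names of the bytecode instructions compiled for `total`.
opnames = [instr.opname for instr in dis.Bytecode(total)]
print(opnames)
```

Calling `dis.dis(total)` prints the same instructions as a readable listing, which is closer to the “unfolding” in the quote above.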

Learning How to Learn

Diffuse mode of thinking (not focussing on anything in particular) can help connect distant ideas.

Why are Numbers Hard to Memorize?

Because we don’t pay attention to the features that distinguish them. It’s hard for us to do that, I guess. When we see “68 -> Erbium; 69 -> Thulium”, we don’t really encode the unique features of 68 or 69. But when we think of a palm tree or a giant cat, we can easily encode the unique features. So, the problem with memorizing numbers is that we usually encode them to the same final cue and thus can’t distinguish them.

Mnemonics make “Hard” Problems a Cakewalk

I’d read Mankiw’s 10 principles of economics for years (like “Principle #3: Rational people think at the margin”), but never quite been able to recall them on demand. I could recognize but not retrieve them. Then, yesterday, I tried to memorize them using peg words, and they trotted to long-term memory obediently. The previously hard-to-remember lists were now trivial. The 7 “wonders” of the world went down just as easily.

So, make a list of all the things you thought were hard to memorize, and then memorize them.

Have a concrete object for every “concept”

Hypothesis: To remember an association between A and B, use a concrete object for each and then link them.

Verbs don’t make the cut because you can’t remember abstract ideas. You need vivid nouns doing things to each other. Like a cow going round and round the beacon of a lighthouse, leaving a Batman-like silhouette for ships to see; thus representing the seventh wonder of the world (cow = 7), the Lighthouse of Alexandria.
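The linking can be pictured as a two-step lookup: number -> peg object -> vivid scene -> the thing to remember. This is a minimal sketch under that assumption. Only the cow = 7 peg and the lighthouse scene come from the text above; the names `pegs`, `scenes`, and `recall` are hypothetical.

```python
# One concrete peg object per number. Only 7 -> "cow" is given in the text;
# the rest of the peg list is deliberately left out rather than guessed.
pegs = {7: "cow"}

# Each scene links a peg object to the item being memorized.
scenes = {
    "cow": (
        "a cow circling the beacon of a lighthouse, casting a silhouette",
        "Lighthouse of Alexandria",
    ),
}


def recall(number):
    """Go from number to peg object to scene, and return the stored item."""
    peg = pegs.get(number)
    if peg is None or peg not in scenes:
        return None  # no peg or no scene: nothing was ever stored on this cue
    _image, item = scenes[peg]
    return item


seventh_wonder = recall(7)
```

A number without a peg returns nothing in this sketch, which matches the earlier point that a bare number is a poor cue on its own.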

Boost your processing capacity by eliminating inane personal thoughts

There are two thresholds for adding significant value with your knowledge. Getting over yourself and having enough cash to pay your bills. These hurdles will free up your processing capacity. It will enable you to choose which problems to process.

If 95% of your mental chatter relates to your self identity and paying your bills, you have five percent of your capacity left to process novel problems. People who have come to terms with this can achieve in one day what would take you 20. …

[Lesson:] Free up processing capacity by getting over yourself and by having enough cash to pay your bills.

– How to Learn (on Medium)

This is an excellent point. Most of my thoughts (and thus my information-processing capacity) get wasted on the same old personal worries. If my aim truly is to learn at peak efficiency, then I must rid my mind of these self-obsessed thoughts and focus on building valuable concepts.

People who Memorize a Lot

Doctors - look at their textbooks

Lawyers - look at their bookshelves

Created: January 23, 2016

Last modified: March 20, 2017

Status: in-progress

Tags: memory