Narrow the Diff

Small Diffs help you find Relevant Factors

What would convince you that the accelerator pedal is indeed relevant to the speed of a moving car? First of all, we saw that we can’t call it a relevant factor or a cause unless we see it changing the output. So, we must observe the speed changing. But what if you turn the steering wheel, blast the horn, and crank up the music at the same time as you press the pedal? How would you know that the accelerator was the cause? Maybe the horn did it.

The ideal way to find out if something is truly a cause is to have one car where the accelerator pedal is pressed without changing anything else and another car where nothing is changed at all. If the car’s speed in the first case is greater than in the second case, you can be reasonably sure that the accelerator pedal was the cause. And, if it is the same in both, then you can say that, at least for that baseline car state, the pedal is not relevant to the car’s speed. This could happen if, say, the fuel tank is empty, the brakes are on, or the pedal has been disconnected. For those baselines, the accelerator pedal is not a relevant factor.

This is what we would call a controlled experiment. A simpler experiment would be to press the pedal in your car and see if the speed changes. This is less conclusive because there might be some unseen factor, such as the fuel running out, that also changed without your noticing it. If so, your simple experiment would make you think the accelerator pedal slows things down. That wouldn’t happen in a controlled experiment, because both cars would slow down, so you wouldn’t blame the accelerator pedal.

What happened there was that you got a change in the output variable - the speed of the car - and since there was only one thing that changed in the inputs, you could attribute the change in output to it. The diff, i.e., the set of things that differed between the inputs, was just one variable.

The smaller the diff, the fewer are the possible causes of any change in the output. For example, suppose you inflated your car’s tires and changed the type of fuel you used and found that your car went faster than before. Was the increase in speed due to the tires or the fuel? You can’t say for sure, but you know it’s one or both of them. You had a diff of only two variables and thus had only two suspects.

Contrast that with the case where you drive a friend’s car, which is of a different make, different engine, different brand of tires, running on diesel instead of gasoline, and five years old instead of just one year old like your car. If it goes much faster than your car, what is it due to? Is it because of the engine alone? Or the tires too? Perhaps the age of the car also has something to do with it. Suddenly you have a much larger diff and you can’t narrow down the possible causes of the changes. Maybe the tires don’t have anything to do with it. You can’t tell. It won’t help much even if you see a few other cars if they all have large diffs, i.e., changes in many inputs such as the engine, tires, and other kinds of stuff.

Imagine what it would have been like if you had been shown the changes one at a time, i.e., with a diff of one variable at a time. First, your friend takes your car and adds his engine. You see that your car goes much faster than before, even faster than his original car. So, the engine is an important relevant factor. Then, he adds his tires. No change in speed. Huh. The tires don’t quite matter, given this baseline of engine. Then he somehow adds the grime and dust and wear and tear that would have been added over four more years of usage. You find that it slows down the car to approximately that of his original car. Now, you have a much clearer idea of what’s going on. You know that the difference in engines was a big factor, that the tires were not a relevant factor given the other stuff, and that the wear and tear was a relevant factor. The small diffs helped a lot more than the large diffs in finding the relevant factors.

Hypothesis: If we get changes in the output caused by a small diff, we can find out the relevant factors more easily than with a large diff.

Note that we still don’t have a quantitative model of how exactly the engine affects the speed. All we care about right now is which factors affect the output, not how they affect the output. [1]
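The one-change-at-a-time procedure from the car story can be sketched in Python. This is only an illustration under made-up assumptions: `toy_speed`, its numbers, and the dictionary keys are hypothetical stand-ins, not a model of real cars. The code reports only which inputs changed the output at each step, not how they affect it.

```python
def diff(a, b):
    """Names of the inputs whose values differ between two examples."""
    return {k for k in a.keys() | b.keys() if a.get(k) != b.get(k)}


def toy_speed(car):
    """Hypothetical output function with made-up numbers (an assumption)."""
    base = 160 if car["engine"] == "friend" else 100
    # Tires are deliberately ignored: in this toy they are not relevant.
    return base - 10 * (car["age_years"] - 1)


def one_at_a_time(start, target, names, output):
    """Change the inputs in `names` one by one, recording the output
    before and after each single change."""
    current = dict(start)
    steps = []
    for name in names:
        before = output(current)
        current[name] = target[name]
        steps.append((name, before, output(current)))
    return steps


my_car = {"engine": "mine", "tires": "mine", "age_years": 1}
friend_car = {"engine": "friend", "tires": "friend", "age_years": 5}

# Large diff: the output differs, but every differing input is a suspect.
large_diff = diff(my_car, friend_car)

# Small diffs: each step changes one input, so a change in speed can be
# attributed to that input (for the baseline in place at that step).
steps = one_at_a_time(my_car, friend_car, ["engine", "tires", "age_years"], toy_speed)
relevant = [name for name, before, after in steps if before != after]

print(sorted(large_diff))
print(steps)
print(relevant)
```

With the large diff, all three inputs stay on the suspect list. With the one-variable steps, only the inputs whose step changed the speed remain, and that verdict holds only for the baseline each step started from.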

Close Negative Examples

We all know that scientists run controlled experiments to get narrow diffs along with a change in the output. We should do that whenever we can run experiments. But what happens when we can’t experiment and can only observe? Let’s focus on the less-often practiced technique of looking for an existing example that has a small diff. [2]

Example: Consider this fictional example from House M.D. Some babies in the maternity ward get sick with similar symptoms, and Dr. House is not able to diagnose them. There are so many possibilities; it could be a virus, bacterial infection, something called “MRSA”, contaminated food or water source, and so on. And there isn’t enough time to narrow it down by running the usual tests, because “[c]ultures will take 48 hours, might as well be post-mortem.”

The problem is that there is quite a large diff between the environment of the sick babies here and that of normal babies that you could find anywhere else. These babies were born in this maternity ward to these mothers at this time. The other babies were born in some other maternity ward to other mothers at other times. A bacterial infection, if any, could be in this particular ward, or it could have spread from one mother to the other babies. The same goes for the different food and water sources in the two hospitals. You would have to compare a lot of things between the two sets of babies.

How could Dr. House narrow down the relevant factors?

House: Wait a second. The kids on the floor who didn’t get sick. Are any of them still in the hospital?

Wilson: They got moved to the fifth floor. But they’re probably all checked out by now.

Cuddy: No, the Lindpert boy had a bit of jaundice. He should be checking out today.

House: I want to test his blood, too.

Cuddy: Why?

House: ’Cause we need all the information we can get. The healthy kid can be our control group.

S01E04, Maternity, House M.D. (emphasis mine)

Instead of focusing only on the babies who got sick, he asked for the close negative example of the babies born in the same ward who didn’t get sick. That way he could control for factors like maternity ward (it’s the same ward), food source (probably the same food source), and pediatrician (probably the same guys). Now, he’s got a change in the output (sick vs healthy) along with a smaller diff.

He finds out that the healthy kid tested positive for antibodies for three diseases: Echovirus, CMV, and parvovirus.

House: … The healthy kids survived because their mothers’ antibodies saved them.

Foreman: The mom had CMV in the past she’d have the antibodies for them, the kid would be immune from it. So we test the sick kids’ moms for Echovirus, CMV, and parvovirus.

House: And whichever they don’t have the antibodies for, that’s what’s killing their kids.

Narrowing the diff once again! Compare the antibodies between the moms and the babies. Whichever one the moms didn’t have must be the one the babies weren’t immune to and thus the one that is killing them.
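Picking a close negative example can be sketched as a small search: among the examples whose output differs from the one you care about, take the one whose inputs differ the least. The records and field names below (`ward`, `food_source`, `pediatrician`, `mother_has_antibodies`) are hypothetical, loosely modeled on the episode and not taken from it.

```python
def input_diff(a, b, outcome_key):
    """Names of the inputs (everything except the outcome) that differ."""
    keys = (a.keys() | b.keys()) - {outcome_key}
    return {k for k in keys if a.get(k) != b.get(k)}


def closest_negative(positive, candidates, outcome_key):
    """Return the candidate with a different outcome and the smallest
    input diff, or None if no candidate has a different outcome."""
    negatives = [c for c in candidates if c[outcome_key] != positive[outcome_key]]
    if not negatives:
        return None
    return min(negatives, key=lambda c: len(input_diff(positive, c, outcome_key)))


# Hypothetical records (assumptions made for illustration only).
sick_baby = {"ward": "this ward", "food_source": "this hospital",
             "pediatrician": "this team", "mother_has_antibodies": False,
             "sick": True}
healthy_elsewhere = {"ward": "other ward", "food_source": "other hospital",
                     "pediatrician": "other team", "mother_has_antibodies": True,
                     "sick": False}
healthy_same_ward = {"ward": "this ward", "food_source": "this hospital",
                     "pediatrician": "this team", "mother_has_antibodies": True,
                     "sick": False}

control = closest_negative(sick_baby, [healthy_elsewhere, healthy_same_ward], "sick")
remaining_suspects = input_diff(sick_baby, control, "sick")
print(remaining_suspects)
```

The healthy baby from another hospital leaves four suspects; the healthy baby from the same ward leaves one. In real observations you rarely know every input, so the diff you compute is only as good as the factors you thought to record.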

Example: Take “talent”. You see a soccer star who can effortlessly dodge three defenders and score a goal, whereas when you try, you get deprived of the ball before you know what’s happening. What causes his superior soccer skill? The diff is so large! His upbringing was different from yours, as were his genetics and soccer coach and total hours of practice and matches played and his position and so on. What are the relevant factors?

We want a close negative example, one where the other soccer player is pretty similar but slightly less skilled. But if you compare one person to another, you’ll have a large diff because so many unrelated things differ, from the brand of soccer boots to favorite movies.

The best bet for getting a small diff is to compare a person to himself from the past. Suddenly, genes and upbringing and height and so on are pretty much all the same. Even then, his technique or match experience or practice hours may change a lot over time. So, look at a change in his skill that occurred over a short period of time. Only so many things can change in a few weeks. Figure out what it was. For example, Ronaldinho wasn’t born knowing how to do the elastico. He learned it at some point in time during his training. So, pin down the period in his life during which he picked it up and see what his practice looked like.

In other words, the soccer star went from practically zero skill to world-class skill in increments. He may have learned a new move or increased his speed on an old move by a certain percentage. Find out the factors that changed as he got those increments, such as closely watching another player demonstrate the new move or practicing first with a smaller ball, and you’ll be able to narrow down the relevant factors for those improvements and eventually his overall skill.

Example: The same principle applies to economic policies. It can be hard to compare two countries because they may differ not just in their economic policies but also in their natural resources and current GDP and major industrial sectors and infrastructure and culture. What to do? How to narrow the diff?

Look at the same country before and after a new policy, such as opening up the economy to foreign investment, as India did in 1991. Not much would have changed about its natural resources or infrastructure. Even if you can’t compare a country against itself, you can compare very similar countries. Instead of comparing free-market policies with a command economy using America and North Korea, which differ in size and natural resources and so on, compare South Korea and North Korea.

Example: Suppose you want to understand how cultures differ in their productivity. You may compare a culture in Europe to one in South America, but the diff would be really large. They might differ in genetics, language, technology, natural resources, and so on. How would you figure out which factors were responsible for which changes in productivity? You would like a much smaller diff.

Scattered over the Pacific Ocean beyond New Guinea and Melanesia are thousands of islands differing greatly in area, isolation, elevation, climate, productivity, and geological and biological resources (Figure 2.1). For most of human history those islands lay far beyond the reach of watercraft. Around 1200 B.C. a group of farming, fishing, seafaring people from the Bismarck Archipelago north of New Guinea finally succeeded in reaching some of those islands. Over the following centuries their descendants colonized virtually every habitable scrap of land in the Pacific. The process was mostly complete by A.D. 500, with the last few islands settled around or soon after A.D. 1000.

Thus, within a modest time span, enormously diverse island environments were settled by colonists all of whom stemmed from the same founding population. The ultimate ancestors of all modern Polynesian populations shared essentially the same culture, language, technology, and set of domesticated plants and animals. Hence Polynesian history constitutes a natural experiment allowing us to study human adaptation, devoid of the usual complications of multiple waves of disparate colonists that often frustrate our attempts to understand adaptation elsewhere in the world.

– Chapter 2, Guns, Germs, and Steel, Jared Diamond (emphasis mine)

Notice how this natural experiment narrowed the diff by controlling for factors like “culture, language, technology, and set of domesticated plants and animals”. Now, the main differing factors left were the environmental variables: “island climate, geological type, marine resources, area, terrain fragmentation, and isolation”.

By comparing islands that differed in those factors and looking at the differences in their outputs, the author Jared Diamond could figure out some of the relevant factors for productivity. For example, one measure of productivity is the amount of food collected and he found how much it varied based on whether the island had “poor soil and limited fresh water” or “large permanent streams”.

He didn’t have to worry about genetic or cultural differences in ability to farm or anything like that because they shared a lot of their genes and all descended from people who could farm. Because he had narrowed the diff, he could attribute the changes to the few factors that varied.

Note, however, that we can’t just look at the factors that differ. We also have to consider the factors that are shared, such as genes or technologies. The environmental differences discussed here mattered for that shared baseline of genes. Those Polynesian people with those genes got varying results on varying islands. But if they had had a different baseline of genes, perhaps some genetic mutations that made them be born without limbs, then even if their islands varied in soil type or fresh water availability, they would probably all have got the same results.

Even after noting this caveat, this is still a fantastic set of changes that narrows the diff between cultures and allows us to see the effect of certain relevant factors on productivity. This is what we should try to do when analyzing a domain where we can’t run a controlled experiment: get a narrow diff.

Example: Debugging basically runs on the idea of small diffs. Suppose you change a program in seven different places and a previously-passing test now fails. Which change was the culprit? It can be hard to tell. The diff is pretty large. But imagine if you had made just one change and run the test. There would be only one possible culprit.

[When] most of the change is small and incremental, … [it] means you know what to test most carefully when you’re about to release software: the last thing you changed.

The Other Road Ahead, Paul Graham
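Here is a minimal sketch of that idea, assuming the seven changes can be replayed one at a time on top of a passing baseline. The configuration keys, the change names, and `config_test` are all hypothetical; a real project would replay commits and run its actual test suite.

```python
def first_breaking_change(baseline, changes, test):
    """Apply changes one at a time and return the name of the first change
    after which `test` fails, or None if the test never fails."""
    if not test(baseline):
        raise ValueError("the baseline must pass before any change is blamed")
    state = dict(baseline)
    for name, new_values in changes:
        state = {**state, **new_values}  # a diff of exactly one change
        if not test(state):
            return name
    return None


def config_test(cfg):
    """Hypothetical stand-in for the previously-passing test."""
    return cfg["timeout_s"] > 0 and cfg["retries"] >= 0


baseline = {"timeout_s": 30, "retries": 3, "log_level": "info"}

# Seven hypothetical changes; change_4 is the one that breaks the test.
changes = [
    ("change_1", {"log_level": "debug"}),
    ("change_2", {"retries": 5}),
    ("change_3", {"cache": True}),
    ("change_4", {"timeout_s": 0}),
    ("change_5", {"log_level": "warning"}),
    ("change_6", {"retries": 4}),
    ("change_7", {"cache": False}),
]

culprit = first_breaking_change(baseline, changes, config_test)
print(culprit)
```

Running the test after each single change keeps the diff at one, so the first failure points at exactly one suspect. When the history is long, tools such as `git bisect` search it by halving instead of stepping through every change.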

Example: This kind of small diff debugging applies outside programming. I was once utterly confused by a math proof in grad school. It was the end of a long class where the professor had covered many new concepts and he had gone several steps forward with this new proof. I couldn’t see how the steps followed. When he finished and asked if everyone got it, I just said that I was very confused. A lesser professor may have told me to go over the proof when I had time and come back to him if I still didn’t get it. But he didn’t.

He asked me what sounded like a strange question: which part is confusing? I was like, “The whole thing. I’m just not able to follow it.” But he wouldn’t let me get away with such a claim. He just brushed aside my despairing statement and pointed to the first step: Is this part clear? I nodded reluctantly, but felt like he was wasting time. The proof was just too confusing. What was the point of these questions? He then kept going down step by step till we found out the exact point at which I had gotten lost. Suddenly the diff had narrowed! Instead of the whole proof seeming confusing, it was now just one step. Once I learned about that one step, the rest was a piece of cake. I couldn’t believe I’d ever thought it was too hard to understand.

The lesson I drew from that incident was to be specific about your confusion. Get a narrow diff between the text that seems not confusing and the text that seems confusing. When you say that the whole thing is confusing, you leave it vague as to which factors are relevant in your being stuck. It can seem like you need an effort proportional to the size of the whole thing, which can make you give up. But when you say, “I was able to follow up to this point and then I wasn’t able to follow any more,” you know exactly which factor was relevant to your struggle and you can focus your attack on that one factor. It seems far more doable.

Notes

This is usually called controlling for the other variables. I focus on the size of the diff because it allows you to predict how many potentially relevant factors will remain. Saying that you must control for other variables is not as precise. In an experiment, you try to fix the rest of the inputs. In an observation, you try to compare things that have most of the other input factors as the same.

We’re ignoring prior knowledge for now. So, for the sake of our examples, nobody told you that the accelerator pedal accelerates the car. You have to figure it out yourself. This simulates situations where you have to explain by yourself how a system works.


  1. Yes, there is another way of narrowing down the relevant factors. You can come up with a quantitative model using a few factors, like in Newton’s second law a = F/m. If that model’s predictions precisely match your observed data, you can accept its factors as relevant and eliminate the rest of the factors, such as the color or shape of the object. However, the problem is that finding a model that fits a lot of data is hard. And, if you have only a little data, there can be many models that fit. For example, a model that stores all the input-output pairs seen so far will get 100% accuracy on the data so far, but will probably not do well in the future. This is where things like Bayes’ theorem and Occam’s Razor come in. They help you distinguish among different models based on their likelihood of producing the data observed and their prior probability. That kind of analysis can be hard, which is why it’s crucial to get examples with narrow diffs whenever possible. If you observe that more force makes a body accelerate more, you have found a relevant factor and eliminated all the models that say force is irrelevant.

  2. I call such an example a “close negative example”: “close” because the diff is small and “negative” because we want a change in the output. If we got a positive output (car moving fast), we want a negative output (car not moving fast).

Created: August 16, 2019
Last modified: August 21, 2019
Status: in-progress
Tags: narrow
