Let's Make Predictions

Status: In Progress


The Great Predictions Experiment

I need to make more and more predictions.

Why?

Cos your Prediction Score determines your success in life completely.

How?

How about this? I make at least 10 predictions every day, with probability estimates.

This is just Version 1.0. We’ll do this for a week and take it from there.

Wait a minute.

Aren’t Measurements supposed to Matter? Shouldn’t there be some Decision to which this Measurement matters? What’s the point of making predictions (a form of measurement) in the air?

Wait. Are Predictions a form of Measurement?

A measurement is the quantitative reduction of uncertainty about something.

A prediction is a probability distribution over the possible outcomes in some event, as given by some hypothesis.

Why do I want to record my predictions and then get feedback on them?

I want to test my hypotheses

A button affords pushing. A hypothesis affords testing.

One way or another, I will get to know something new about my hypotheses. I will be able to update them. Bad hypotheses get thrown out. Good ones get upgraded.

By default, we never make predictions. We never test our hypotheses. And so, we rarely ever get direct evidence about our hypotheses. We can rest safely knowing that our cherished (and most probably, false) hypothesis can never be falsified.

Also, costly and hard-to-test predictions are useless to us.

So: Make easily testable predictions.

I want to calibrate my estimates

The previous case was about making predictions based on your hypotheses and then updating the belief-levels of those hypotheses based on the results.

This is about getting the predictions out of your hypotheses in the first place.

Yo, won’t it be clear what predictions a hypothesis makes?

It would, ideally. But our head is hardly an ideal place. We keep hypotheses around implicitly. We don’t even know exactly which hypotheses we have, let alone what our probability estimates for them are.

Our overall prediction comes from the predictions of all our hypotheses, weighted by our beliefs in them.

But we neither know the quantitative predictions made by each hypothesis nor our belief-level for the hypothesis.

Then, how are we justified in pulling predictions out of our ass?

.

.

.

.

(Rhetorical)

Feeling quantitatively and Reasoning quantitatively

Ok. So shit’s bad right now. How do we fix things?

We need to calibrate our probability estimates. i.e., we need to know how to translate implicit belief-levels in our minds into actual probabilities.

i.e., intuitive feeling (“I’m pretty sure”) -> a quantitative probability (95%).

We need to know the hypotheses we have, know what quantitative predictions they make, know how much we believe in each hypothesis, and aggregate all that to come up with our actual predictions. That is the ideal reasoning process.

i.e., make our reasoning process more explicit and quantitative.

Alternatively, we could skip all that quantitative reasoning, just go with our final gut feeling - the implicit prediction in our head - and translate that into a quantitative prediction.

Wrong Translation vs Wrong Beliefs

I was finding it difficult to separate the two.

You could have the wrong belief-levels about some hypotheses - an honest mistaken belief about the world - and thus make a wrong prediction. You might believe Kohli is not playing in the match (i.e., you have low belief-level in the hypothesis that Kohli is playing) and predict that India won’t chase the total down, whereas he is playing and scores a century.

Or, you could have correct belief-levels about your hypotheses but go wrong in translating the belief-levels into probability - and thus make a wrong prediction. e.g., translating “not too sure whether it is A or B” into 30%-70% when it should be 50%-50%.

Whether it is a wrong translation or wrong belief, in both cases, the predictions you make will be wrong. How will you know where you went wrong? Was your hypothesis wrong or was your translation wrong?

Where do these two hypotheses (Wrong beliefs vs Wrong translation) differ in their predictions?

Will they lead to different prediction results?

Right belief + Right translation: Your explicit predictions will be correct.

Right belief + Wrong translation: Your explicit predictions will be wrong.

Wrong belief + Right translation: Your explicit predictions will be wrong.

Wrong belief + Wrong translation: Your explicit predictions may be right or wrong.

So, all that you can tell from wrong predictions is that you don’t have both the right belief and the right translation (else, the prediction would definitely have been right). It doesn’t tell you whether it was your belief that was wrong or your translation or both.

Note, even if you get the right predictions, it could still mean that your wrong belief and wrong translation have cancelled out luckily. But, that is highly unlikely to happen, right? So, it’s strong evidence that you have both the right belief and the right translation.

Will they lead to different levels of surprise?

I’m assuming that your mind makes you feel surprise only when things go against your implicit prediction. Else, you’re not surprised. Everything feels normal.

Even if something is against your prediction and should surprise you, you may still not notice it. That’s fine. Humans are dumb. But, when you do feel surprised, it is because it countered your implicit prediction.

Note: Your explicit predictions (the translation of your belief-levels to probabilities) have no effect on your surprise levels. The surprise is about the stuff in your brain. The numbers you wrote down have no effect there, in general.

Right belief: You won’t feel surprised at all, cos all your implicit predictions will come true. (say, when you are a technical expert in a specific domain)

Wrong belief: If you feel surprise at all, it must be because of wrong beliefs. Wrong belief means that your implicit prediction was different from the correct prediction.

It is possible to have Wrong beliefs and still not be surprised by anything you see that contradicts your beliefs (see: religious people). But, if you do get surprised, it has to come from Wrong beliefs.

I think that Noticing Surprises is a skill and can be trained. If you force yourself to look actively for surprises, if you consciously bring the outcome to your awareness and test it against your implicit prediction, I think you will know whether you are surprised or not. In most cases, we aren’t even aware that our model of the world has made a prediction and so we let surprises pass under our radar. But since this whole discussion is about explicit predictions, we can always do a conscious mental check and ask if we are surprised.

In this case, surprise becomes proper strong evidence. If you are surprised by the outcome, then your belief was wrong. End of story. If you aren’t surprised, then your belief was right.

(TODO Question: How do you judge the level of surprise (especially in cases where the outcomes have a range)? Example: what will the final score be? 250 runs? 270? 300? 280-290 runs? How surprised will you be if you say 280-290 and you get 270? What if you get 324?)

What can we say about our translation accuracy?

Cool! Right beliefs and Wrong beliefs lead to different levels of surprise. And, since surprise is independent of our translation accuracy (and thus of our explicit predictions), we can use it to help solve the mystery of whether a wrong prediction was caused by a Wrong belief or a Wrong translation of belief-levels.

Now, what does this tell us about explicit predictions (and thus our translation accuracy)?

Say we got a wrong explicit prediction. We don’t know if it was cos of a wrong belief or cos of wrong translation.

So, we check our surprise level. If we did get surprised, it means that our belief was wrong. If not, our belief was right.

Now, let’s try out the different possibilities:

Right belief (confirmed)

Right Translation: Right explicit prediction

Wrong Translation: Wrong explicit prediction

Wrong belief (confirmed)

Right Translation: Wrong explicit prediction

Wrong Translation: Most probably Wrong explicit prediction (cos it’s unlikely that things cancel out the right way)


When you have a right belief (as confirmed by the surprise test), the correctness of the explicit prediction tells you perfectly whether your translation of belief-levels was right or wrong. Correct explicit prediction means correct translation. Wrong explicit prediction means wrong translation.

When you have a wrong belief (as confirmed by the surprise test), the explicit prediction result tells you pretty much nothing. In fact, in the unlikely case that you are surprised but your explicit prediction was correct, it means your translation was wrong. Weird, huh!
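Here’s that decision logic as a tiny sketch (the function and flag names are mine, purely for illustration):

```python
def diagnose(surprised: bool, explicit_correct: bool) -> str:
    """Use the surprise test (belief check) plus the explicit prediction's result
    to figure out where the prediction pipeline went wrong."""
    if not surprised:  # surprise test says the belief was right
        return ("belief right, translation right" if explicit_correct
                else "belief right, translation wrong -> recalibrate")
    # surprise test says the belief was wrong
    return ("belief wrong AND translation wrong (they cancelled out by luck)"
            if explicit_correct
            else "belief wrong; this tells you nothing about the translation")
```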

Calibration

I think what I’ve been calling “translation accuracy” is known as calibration. Calibration is about correctly translating belief-levels into quantitative estimates. This includes correcting the common errors humans make in this process. The more calibrated you are, the better you will be at translating your belief-levels into quantitative estimates (explicit predictions).

If my above theory is true (about the surprise thing and the connection between implicit predictions and translation accuracy), then we now know some things about how to calibrate.

Most importantly, there is no point trying to calibrate using explicit predictions when you have Wrong beliefs about the thing. If the explicit prediction is wrong, then you can’t say anything about your translation, because your belief itself is wrong in the first place. If the explicit prediction is right, then, it means your translation was wrong - but this case will be pretty rare, IMHO.

So, you need to calibrate yourself (aka improve your translation accuracy) by making predictions only about things where you have correct beliefs.

And how would you know that? Well, you won’t be surprised by stuff that happens in that domain, even when you carefully take every event and consciously check whether you’re surprised or not. That is the only way to see if your implicit predictions are correct or not.

I’m not sure how this would work. Are there even any fields where I have correct beliefs?

Well, you don’t need perfectly correct beliefs. If you have reasonably correct beliefs, you will make mostly correct implicit predictions. So, you will have few surprises when you check consciously. That’s good enough. We can then go ahead and make explicit predictions and correct our translation processes.

Need to test out these hypotheses. Will do it soon.


Tentative solution: Make predictions about a bunch of events (could be trivia, college studies, whatever). Now, pick out those for which you were not surprised by the answer.

Those are predictions for which your implicit belief was right (or, at least, not wrong). Any error would be due to your translation. Fix it! Calibrate yourself.
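Here’s a rough sketch of that procedure (the record format is made up; the point is to bucket only the non-surprising predictions by stated confidence and compare against the observed hit rate, like the PB stats tables further down):

```python
from collections import defaultdict

# Each record: (stated probability, did the outcome happen?, was I surprised?)
records = [
    (0.9, True, False),
    (0.6, False, False),
    (0.7, True, True),   # surprised -> the belief was wrong, so skip it for calibration
    # ... more predictions ...
]

def calibration_table(records):
    """Keep only predictions where the belief passed the surprise test,
    bucket them by stated confidence, and compare with the observed frequency."""
    buckets = defaultdict(list)
    for prob, happened, surprised in records:
        if surprised:
            continue  # errors here could be belief errors, not translation errors
        buckets[round(prob, 1)].append(happened)
    return {b: (sum(hits) / len(hits), len(hits)) for b, hits in sorted(buckets.items())}

print(calibration_table(records))
# {0.6: (0.0, 1), 0.9: (1.0, 1)} -- with real data, hit rates should match the buckets
```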

Fixing your Prediction Process

We saw how we could calibrate our translation processes and get quantitative predictions that are close to our beliefs. Once calibrated, our explicit predictions will be wrong only as often as we are surprised - they will track our implicit predictions more and more closely.

So far, it’s all about translating the final implicit prediction that our mind gives us into an explicit prediction.

Cool. But what if our implicit predictions are wrong in the first place?

How do we correct the prediction process by which our mind comes up with its implicit prediction?

The default process our mind uses is evidently not even close to the ideal way to do it.

How should we come to a prediction (ideally)?

Assuming you have a bunch of hypotheses in your head, and a bunch of belief-levels about those hypotheses, and a bunch of predictions that each hypothesis makes, what you need to do is take the average of all those predictions weighted by the belief-levels of the hypotheses. That is the final prediction, using the full force of your knowledge.

Which means, you need to be able to enumerate your hypotheses correctly. You need to translate their belief-levels accurately into probabilities. You need to enumerate the outcomes of an event and translate each hypothesis’ likelihood for each outcome into a probability. And, then, you need to calculate the weighted-average.

(Of course, all this is assuming that your hypotheses’ belief-levels and predictions are correct in the first place. God help you if you haven’t been updating properly on evidence, as per Bayes’ Theorem.)
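As a sketch, here’s that weighted average with made-up numbers borrowed from the Kohli example above (none of these probabilities are real; they just show the mechanics):

```python
# Belief-levels over hypotheses (should sum to 1).
beliefs = {"Kohli plays": 0.7, "Kohli sits out": 0.3}

# Each hypothesis's prediction: a probability distribution over the event's outcomes.
hypothesis_predictions = {
    "Kohli plays":    {"India wins": 0.65, "India loses": 0.35},
    "Kohli sits out": {"India wins": 0.40, "India loses": 0.60},
}

def overall_prediction(beliefs, hypothesis_predictions):
    """P(outcome) = sum over hypotheses of P(hypothesis) * P(outcome | hypothesis)."""
    combined = {}
    for hypothesis, belief in beliefs.items():
        for outcome, prob in hypothesis_predictions[hypothesis].items():
            combined[outcome] = combined.get(outcome, 0.0) + belief * prob
    return combined

print(overall_prediction(beliefs, hypothesis_predictions))
# {'India wins': 0.575, 'India loses': 0.425}
```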

So, these are the skills you need to improve to fix your prediction process. Converting belief-levels into probabilities calls again upon our old friend, calibration.

There’s lots more work to be done here. Will do it soon.


Note: having Correct Beliefs doesn’t mean that you know what’s gonna happen with 100% certainty. It just means that when you say (implicitly) “X is gonna happen with 30% probability”, X will happen around 30 times out of 100 such events.

No Longer Afraid of the Outcome

When I make a prediction on PredictionBook, I feel kind of relieved. I no longer have the pressure of being right. If I’m right, my calibration score will increase. And if I’m wrong, it will decrease. But the damage is isolated to the calibration score. It doesn’t hurt me. When I think about the outcome of the event, I’m not worried anymore. In fact, I’m just curious to see what happens.

Somehow, making a prediction on PredictionBook distances me from the prediction. No longer is my self-worth tied to the correctness of the prediction. My self-worth is not at stake. If needed, I will pay a price in terms of my calibration score, but not my self-worth. In contrast, if I didn’t make a prediction on PredictionBook but just thought to myself what would happen, I would desperately want the prediction to come out my way, or risk feeling like a crappy thinker.

This might be the same force at work that turns off your feelings when you bring in another token of exchange. I remember hearing about a late pick-up fee at a child care centre leading to even later pick-ups by parents, because now parents didn’t have to pay in terms of their self-worth (“I’m such a bad parent”). They could just pay the fee and do what they really wanted to do anyway!

This makes those parents seem like bad people, but we can turn this to our advantage. Wherever we’re feeling like losers for no good reason, say when our predictions go wrong or when our project fails or somebody rejects us, we can avert that feeling by instead paying in some other token. Put a price on the failure and be willing to pay that price. Suddenly, you no longer have to pay an unspecified, below-the-counter sum in raw self-worth damage.

It’s like we expect to feel a certain amount of shame for something we did wrong. But if we pay for that failure in dollars or something else, then we no longer have to pay in shame. If your project failed, you’re already suffering the monetary cost. Why pay twice in shame? We can still walk with our heads held high, which is what we want. This means you don’t have prerequisites for worthiness - no matter how you’re doing, you feel worthy of love and belonging… even when you fail!

Corollary: If we avoid feedback because we fear being wrong, then using another token of payment can help lift the fear of failure. Maybe this is one reason why people don’t fear failure so much in games (according to Jane McGonigal) - you pay only in terms of your score, but not your self-worth.

So, make more predictions on PredictionBook. You’ll fear the outcome less and thus improve your calibration in more areas.

In other areas, look to pay in money instead of self-worth. For example, say you’re making a risky career move. If you fail, you will pay in terms of wages lost and reduced earning capacity. That’s cool. But you will not be a loser. You are willing to pay the price in money, so you can keep your head unbowed.

Lesson: Express your doubts in predictions. Don’t keep ruminating about them. For example, I was worrying about whether I would get money back from a certain friend. I kept thinking about how things might go wrong and I would not see a rupee of it. But instead of worrying, I can just make a prediction about it - “I will get money back from friend by the end of this month.”

Flinching away

I found myself flinching away from making a pessimistic prediction. I wanted to record that I thought I wouldn’t practice for more than 3 hours tomorrow. I haven’t actually done that for a while now, so it’s completely reasonable to expect that. But putting it down on paper like that seems… almost like a betrayal. “Whose side are you on? How dare you suggest that I/you won’t do this very important thing?”, my mind seems to be asking me. (I’m afraid to do that even after writing this paragraph. It really feels like betrayal.)

I was asking the wrong question. I was asking “Will I do this?” and pat came the answer from my mind, like any true believer, “Of course you will!” Doubting my future prospects seems like something only an enemy would do. Hence the flinching away.

Instead, let’s reframe the question: “What will an external observer predict about my performance tomorrow, given my history over the last month?” This question is straightforward to answer. It forces me to put aside my own prejudices and look at the system dispassionately, exactly the way Eliezer’s “Beyond the Reach of God” thought experiment forces you to put aside your beliefs about a just world and simply imagine the consequences of an action. I believe they call this the Outside View.

What Predictions Should I Make?

Why do I want to make predictions? One reason is to calibrate myself. I need to know what 5% probability feels like.

Another reason is that I want myself to notice how poorly-specified my beliefs are. Most of the time, I don’t demand clear probabilities from them (“35% chance of India winning this match”). I don’t expect my beliefs to pay rent in anticipated experiences.

This means I don’t realize when my beliefs are wrong. For example, I may plan to write an essay by the end of the day because I believe that I can write a good essay within two hours. However, I haven’t tested that belief against my past experience. When I sat down and actually looked at the numbers, I found that a representative 1400-word post I wrote recently took me 4.5 hours. And I consider that a small essay! So, my hopes are going to be dashed when I try to write the essay within two hours. I’ll probably ruin my other plans by eating into their time and, worse, feel like a complete loser because I couldn’t do such a “reasonable” task.

Hypothesis: Most of our awful feelings arise from poor calibration.

For example, a while back, I felt guilty for stacking up a huge pile of empty cartons and trash in the corner of my kitchen. I felt like I was the only guy who was lazy enough to do something like this. Then I visited a friend’s house and found a mountain of trash that would put mine to shame. I was not alone!

In related news, one of shame researcher Brene Brown’s books is called “I Thought It Was Just Me (but it isn’t)”. We seem to overestimate how alone we are in our shameful behaviour.

So, good calibration can help you set more realistic expectations. Instead of aiming to bang out a finished program within an hour, expect it to take a whole day. Realize that tasks take a finite, non-zero amount of time - brushing and flossing takes me around 10 minutes, washing the dishes takes around 10 minutes, and so on. When I make plans, I see only the big items (writing, studying, exercising) and fail to budget for the small fry.

Sadly, some inaccurate estimates cost me weeks and months and years instead of a few extra hours. This is nothing but the planning fallacy and it can be very costly indeed.

Lesson: Make predictions about all your planning estimates (hourly, daily, and weekly). If you plan to do three tasks, you’re implicitly estimating that you can fit them within your work hours. So, predict it on PB and see how you fare.

Making Better Predictions

Enumerate all the possibilities before you make your prediction. Right now, I feel like I’m focussing only on the outcomes I like, for example whether India loses one wicket or zero in the first ten overs. It doesn’t feel real to me that India could lose four wickets in the first ten overs. Or that it might rain! I don’t even push my mind to that extreme.

So, list all the outcomes and ask yourself what probabilities you put on each of them. Thinking in terms of binary predictions might be a problem. You tend to focus on one outcome (“India will lose one wicket”) and lump everything else under “other outcomes”. You thus fail to consider certain outcomes and their probabilities (a good spell by a bowler, great batting, or even just rain). Don’t make a prediction without listing all the outcomes (or at least three exclusive events).

Fallacy of Even Odds

A big related problem is that I look at the “two” outcomes (say, “biotech company X will create a vaccine” vs “it won’t”) and decide that I don’t know much about biotech vaccines and so give it fair odds - 50% probability.

But that’s a false model! Framing it as “company X succeeds vs fails” makes me try to distribute 100% probability among these two outcomes, and thinking that I don’t know anything makes me distribute it evenly. But I do know something, which is that there are several biotech companies in the world, many of which are trying to create that vaccine. So, to truly distribute my probability mass evenly, I must look at the number of biotech companies (say 8) and give them all equal probability (12.5%).
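A trivial sketch of the reframing (the eight companies are hypothetical, and I’m assuming for simplicity that exactly one of them gets there first):

```python
# Naive binary framing: "company X succeeds" vs "company X fails", split evenly.
p_x_naive = 0.5

# Honest "I know nothing" framing: a uniform prior over everyone in the race,
# assuming (for simplicity) that exactly one company gets there first.
companies = ["X", "A", "B", "C", "D", "E", "F", "G"]   # say there are 8 of them
p_x_uniform = 1 / len(companies)                        # 0.125
p_x_fails = 1 - p_x_uniform                             # the other 7/8 of the mass

print(p_x_naive, p_x_uniform, p_x_fails)                # 0.5 vs 0.125 vs 0.875
```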

This, I think, is one of the biggest problems in becoming calibrated: we start with the wrong probability estimates! We put too much confidence in some hypotheses. It’s hard to recover from that because you don’t expect to be proven wrong (you’re confident) and so you don’t go looking at the experiments that you would (ideally) expect to surprise you.

I’m reminded of the 2-4-6 task in HPMOR. Hermione (and I and many others) were doomed right from the beginning simply by focussing too much on the “even plus two” rule. We didn’t ask about weird triplets like (-2432, 34, 3.34) because we fully expected them to be invalid and didn’t want to waste a question. But we should have - we were just starting out and hadn’t got enough information to locate the right hypothesis. And so, we were doomed.

Prediction: If you get yourself accurate prior probabilities, then you can update your way to decent calibrated estimates.

Test: Does new information change your predictions?

Does your prediction change after you get more information? For example, I made my prediction about a cricket match without knowing who won the toss or what changes they made to the team. How can my prediction remain exactly the same after I find out? That would mean that India batting first had exactly the same odds of winning for me as India bowling first. Or that the opposition’s main bowler getting injured didn’t change their chances at all.

Features I Would Like

When I’m making a 60% prediction, it would be cool to see the last 20 predictions where I made 60% predictions. This way I can see what 60% felt like and what it should have felt like (perhaps more like 55% in some cases). Seeing the stats table for all my 60% predictions ever isn’t as helpful because many of them are from prehistoric times when I was prediction-illiterate. Also they don’t give me concrete instances to reflect on (“remember how uncertain you were about getting that refund?”) so I can’t make analogies using the past.

Current PB Statistics

As it stands, the current PB stats don’t distinguish at all between a 70% prediction and a 79% prediction. They both fall in the 70% band and add the same amount to your accuracy if they come out right.

This is not good, because apparently rounding the predictions of good forecasters to the nearest 5% or 10% removes a lot of their advantage.

Multiple Outcomes are Okay

I’m afraid to make a 20% prediction for one of five outcomes, even though that’s the uniform prior. This is because if I only register this one prediction, then I feel I will be penalized. Is that true?

Remember that you’re testing your calibration, not your discrimination. So, it’s ok if you just book one 20% prediction. If you really are well-calibrated, then for every ten such 20% predictions, eight will come out wrong and your stats will still be good. Have faith even if it goes wrong once in a while. This is the challenge - to be able to withstand temporary setbacks and trust the numbers.
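A quick simulation sketch of that (a made-up setup): book one 20% prediction per five-outcome event, and if the outcomes really are uniform, about a fifth of those predictions come true - exactly what a healthy 20% bucket looks like.

```python
import random

random.seed(0)
trials = 10_000
# For each event, you booked exactly one of the five equally likely outcomes.
hits = sum(random.randrange(5) == 0 for _ in range(trials))
print(hits / trials)   # ~0.2: the 20% bucket stays calibrated with one prediction per event
```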

Uncertainty - Take on a little, not a lot

It’s easier to predict the winners for a series of cricket matches than to predict the exact outcome of each ball of a match. Why? Because you have lower uncertainty in the former.

Take 10 matches. Say there can only be two outcomes - win or loss. Your maximum uncertainty about those matches is 1 bit per match x 10 matches = 10 bits of uncertainty. This is because each match can have only two outcomes and so a maximum of 1 bit of uncertainty (when you put 50-50 on win and loss).

However, take just 10 balls in a single match. Now, there are several possible outcomes for each ball - the batsman could get out, hit a four, take a single, leave it alone, etc. This is excluding the finer details like how fast the ball was, where it pitched, where he hit the ball, who fielded, etc. Even if we just restrict it to 16 possible outcomes per ball, that gives us a maximum of 4 bits of uncertainty, and so 40 bits of uncertainty total about those 10 balls.

So, you have to do strictly more work in the second challenge than in the first (40 bits of uncertainty vs 10 bits) to get to the correct answer. And that was for just 10 balls! A full T20 match has 40 overs (20 per side) x 6 balls = 240 balls. With even just 16 possible outcomes per ball, you get a maximum of 960 bits of uncertainty. That is orders of magnitude more than the 10-match scenario. It’s the difference between looking for a diamond in a set of 2^10 = ~10^3 closed boxes and looking for a diamond in a set of 2^960 = ~10^288 closed boxes. Good luck trying to locate the diamond within the latter.
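Here’s that arithmetic as a sketch: maximum uncertainty is log2(number of outcomes) bits per event when you spread the probability evenly, summed over independent events.

```python
import math

def max_uncertainty_bits(n_outcomes: int, n_events: int) -> float:
    """Maximum entropy: log2(outcomes) bits per event, summed over independent events."""
    return math.log2(n_outcomes) * n_events

matches = max_uncertainty_bits(2, 10)     # 10 win/loss matches          -> 10 bits
balls   = max_uncertainty_bits(16, 240)   # 240 balls, 16 outcomes each  -> 960 bits

print(matches, balls)
print(f"boxes to search: 2^{int(matches)} ~ 10^{int(matches * math.log10(2))} "
      f"vs 2^{int(balls)} ~ 10^{int(balls * math.log10(2))}")
# boxes to search: 2^10 ~ 10^3 vs 2^960 ~ 10^288
```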

For example, if India plays Bangladesh in a cricket match, it’s hard to predict the outcome from ball to ball (like Raina getting out in the 9th over), but quite easy to predict that India is going to win the match. Don’t beat yourself up if you get unexpected outcomes in between - it’s too hard.

Lesson: If you want to make correct predictions, stay away from high-uncertainty situations. The deck is stacked against you there.

If you want to remain calibrated in such situations, stick to the uniform prior predictions, otherwise you’ll be surprised very often.

What if you want to make correct predictions? Go for places where you have low maximum uncertainty, like who will win the match (just two possible outcomes) or who will be the man of the match (only a few stand-out players) or what the run rate will be in the last five overs (there are only a few options - 5-6, 6-7, 7-8, etc.). But not how many runs India will score to a precision of 5 runs - there are lots of options and a lot can go wrong.

In short, don’t trust yourself to reduce a lot of uncertainty. Stick to small, manageable problems.

I thought I could get away with thinking complicated thoughts myself, in the literary style of the complicated thoughts I read in science books, not realizing that correct complexity is only possible when every step is pinned down overwhelmingly. Today, one of the chief pieces of advice I give to aspiring young rationalists is “Do not attempt long chains of reasoning or complicated plans.”

– Eliezer Yudkowsky, My Wild and Reckless Youth

How to Judge a Confidence Interval

Let’s say you estimated that you would wake up tomorrow between 5am and 8am with probability 81%. When would you feel more confident in your prediction ability: if you woke up at 6:30am or 7:59am?

Somehow, you would feel that you barely got it right in the second case. Two more minutes and your whole 81% prediction would have gone wrong, whereas in the first case, you’d still be comfortably within range.

So, not all outcomes within a confidence interval are equal. Ideally, 81% probability mass distributed uniformly between 5am and 8am would give 81%/180 = 0.45% probability mass to each minute. Then, a wake-up time of 7:59am would count for just as much Bayesian evidence as 6:30am. But we treat them differently. It’s like we have an implicit bell curve within the confidence interval, with central points like 6:30am counting for more than peripheral points like 7:59am.
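Here’s a sketch of both models with the numbers from the example (the bell curve is just an illustrative choice: a normal centred at 6:30am whose middle 81% spans 5am-8am):

```python
import math

interval_mass, width, centre = 0.81, 180, 90     # minutes, measured from 5:00am

# Model 1: uniform -- every minute inside the interval carries the same mass.
uniform_per_minute = interval_mass / width       # ~0.45% per minute

# Model 2: bell curve -- normal centred at 6:30am whose central 81% spans the interval.
def std_normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

lo, hi = 0.0, 10.0                               # bisect for z with P(-z < Z < z) = 0.81
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if 2 * std_normal_cdf(mid) - 1 < interval_mass else (lo, mid)
sigma = (width / 2) / lo                         # half the interval width = z * sigma

def bell_per_minute(minute):
    z = (minute - centre) / sigma
    return math.exp(-z * z / 2) / (sigma * math.sqrt(2 * math.pi))

print(uniform_per_minute)                        # 0.0045 everywhere in the interval
print(bell_per_minute(90), bell_per_minute(179)) # ~0.0058 at 6:30am vs ~0.0025 at 7:59am
```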

For a strong prediction, an outcome that barely falls within your confidence interval suggests that you were overconfident. For a weak prediction, an outcome that falls near the centre of your confidence interval suggests that you were underconfident.

(I think this holds even if you didn’t create an explicit confidence interval. Not sure how.)

Resources for Making Predictions

Gwern’s advice on Making Predictions

He talks about 3 parts of making predictions: specifying the prediction, deciding the due-date, and assigning a probability to the prediction.

The prediction should be “a statement on an objective and easily checkable fact”, so that even an arch-enemy trying to poke holes would have to agree when you claim that your prediction was right (or wrong).

He goes into great detail on how to make calibrated probability estimates. Will delve into those after some time. I just want to get started now.

Douglas Hubbard - How to Measure Anything

He gives lots of brilliant techniques for making calibrated predictions. Will go into all of them soon.

Calibration

Philip Tetlock

Main way to become better at predicting: keep score.

PS: Ideas

My PB Calibration Scores

February 13, 2016:

Confidence    50%   60%   70%   80%   90%   100%   Total
Accuracy      30%   51%   67%   73%   85%     0%
Sample Size    23    43    55    37    60      0     218


Feb 23, 2016:

Confidence    50%   60%   70%   80%   90%   100%   Total
Accuracy      36%   60%   67%   77%   86%   100%
Sample Size    28    57    63    53    66      1     268

July 19, 2016:

Confidence    50%   60%   70%   80%   90%   100%   Total
Accuracy      48%   66%   66%   75%   86%   100%
Sample Size    44    79    77    61    79      1     341


January 28, 2017:

Confidence    50%   60%   70%   80%   90%   100%   Total
Accuracy      51%   64%   69%   77%   86%   100%
Sample Size    59    89    95    70    88      3     404


My Calibration Scores on the quiz from Software Estimation

The format is “[my low estimate - my high estimate] question - actual answer”.

[ 1000K - 50000K ] Surface temperature of the Sun - 6000 + 273
[ 0 - 200 ] Latitude of Shanghai - 31 degrees North
[ 20 - 40 million square miles ] Area of the Asian continent - 17 million square miles (NOT COOL!)
[ 1000 BC - 1200 ] The year of Alexander the Great’s birth - 356 BC
[ $100B - $7T ] Total value of U.S. currency in circulation in 2004 - $700B (too close)
[ 2 x 10^8 m^3 - 12 x 10^11 m^3 ] Total volume of the Great Lakes - 6 x 10^20 m^3 (not even in the ballpark)
[ $500M - $6B ] Worldwide box office receipts for the movie Titanic - $1.835B
[ 1000 miles - 20000 miles ] Total length of the coastline of the Pacific Ocean - 84000 miles (JFC)
[ 2500 - 500,000,000 ] Number of book titles published in the U.S. since 1776 - 22 million (huh, good call)
[ 1 ton - 10^8 kg ] Heaviest blue whale ever recorded - 170,000 kg (nice call)

7/10 - JFC

I came so close with the area of the Asian continent. I actually added up the areas I knew for India, China, and Russia. However, those were in square kilometers! Damn. Screwed by the unit. However, I just looked it up and even in square kilometers, the area is 44.58 million square kilometers. Which means my high estimate was just not high enough. So, I need to widen my confidence intervals. I shouldn’t have put so much faith in my uncertain beliefs about things outside my professional domain.

Similarly, I had too small an interval for the total coastline of the Pacific Ocean. I figured that the west coast of the Americas from north to south would be around 3000 miles (don’t know why I thought that). Turns out the Pacific coastline for the US is around 7000 miles. But the length of the US is only 1200 miles. Not sure how the coastline is calculated. But I anchored hard on the 3000 miles thing and doubled it for both sides of the Pacific and tripled it for safety. Turns out it was still too low. Not because I didn’t double and triple enough, but because I didn’t consider the range of values for my sole input number - the length of the west coast of the Americas. I should have gone for 20x my original estimate just because I was so unsure about how a “coastline” was calculated. After all, this is a 90% confidence interval.

Finally, I was off by 8 orders of magnitude on the volume of the Great Lakes. I thought their area would be no more than the area of India (which I thought was 3.5 million square miles, when it was actually 3.2 million square kilometers) and that their depth would be from 10m to 200m. Turns out their area is just 94000 square miles and their average depth is 150m. Their total volume is 22000 km^3 or 2 x 10^13 m^3. So, the book’s answer seems wrong. And mine is off by one order of magnitude. Fuck.

I was so close! Apparently less than 5% of the 600 people Steve McConnell surveyed got 70% or more (like me). Most people got 30% or less. I could have easily got to 90% by widening my intervals for the area and coastline questions and just taken the hit for the volume question.

When I find the rare person who gets 7 or 8 answers correct, I ask “How did you get that many correct?” The typical response? “I made my ranges too wide.”

My response is, “No, you didn’t! You didn’t make your ranges wide enough!” If you got only 7 or 8 correct, your ranges were still too narrow to include the correct answer as often as you should have.

– Chapter 2, Software Estimation

True. I could have made them wider.

Created: November 3, 2014
Last modified: September 28, 2019
Status: finished
Tags: prediction
