## Make a guess, any guess


There was an interesting article in The Economist a few weeks ago about how to get an accurate estimate of a not-necessarily-well-known quantity (e.g. How many people were on board the RMS Titanic on its maiden voyage?). One way, which was already examined by James Surowiecki, is to ask a bunch of people rather than just one. In general, the too-high guesses and too-low guesses start to cancel out, and the average number is a pretty good estimate, or at least more accurate in general than only asking one person.

But what if you don’t have a crowd? Then Edward Vul and Harold Pashler said that you can just answer twice. Indeed, whether you make two guesses one right after the other or two guesses several weeks apart, the average will (on average) be more accurate than the original guess. Better accuracy comes when the guesses are several weeks apart, but several commenters note that this may be because people looked up the answer in the meantime. Edited to add that they, like I myself, should have looked more closely at the original article since (as the eagle-eyed readers below note) the authors point out that this is unlikely because the second guesses were typically worse than the first guesses. They just averaged to something that was a bit better.

Let’s simulate this. I asked four random people to make two guesses as to the number of people who sailed on the Titanic. For comparison, Wikipedia says there were 2223 people aboard that night.

Person #1 First Guess: 900
Person #1 Second Guess: 2300
(Average: 1600, closer than the first guess!)

Person #2 First Guess: 2000
Person #2 Second Guess: 5000
(Average: 3500, a lot worse than the first guess)

Person #3 First Guess: 3200
Person #3 Second Guess: 1400
(Average: 2300, nearly exact!)

Person #4 First Guess: 4000
Person #4 Second Guess: 5000
(Average: 4500, worse than the first guess)

The average of the first guesses was 2525, which was more accurate than every first guess except the guess of 2000. This illustrates the idea that the average over crowds tends to be more accurate than an individual.

Now, for the purposes of illustrating this article, each person's averaged guess should be a little more accurate than their initial guess. This happened for two people, but not for the other two: the average of the four averages was 2975, which is less accurate than 2525. Dang, real data is so uncooperative. Of course, Vul and Pashler asked 438 people instead of 4, so maybe if I had asked more people I’d have gotten a result similar to V&P.
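For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the numbers above from the eight guesses. The names (`TRUE_COUNT`, `error`, `within_averages`, and so on) are my own, chosen for illustration, and "accuracy" is taken to mean absolute distance from the Wikipedia figure of 2223, which is an assumption about how to score a guess rather than anything from the study.

```python
TRUE_COUNT = 2223  # Wikipedia figure quoted in the post

# (first guess, second guess) for the four people surveyed
guesses = [(900, 2300), (2000, 5000), (3200, 1400), (4000, 5000)]


def error(estimate):
    """Absolute distance from the Wikipedia figure (assumed scoring rule)."""
    return abs(estimate - TRUE_COUNT)


def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Each person's own two-guess average
within_averages = [mean(pair) for pair in guesses]

# Count the people whose average beat their own first guess
improved = sum(
    error(avg) < error(first)
    for (first, _second), avg in zip(guesses, within_averages)
)

# Average over the crowd: first guesses only, then the two-guess averages
crowd_first = mean(first for first, _second in guesses)
crowd_within = mean(within_averages)

print("two-guess averages:", within_averages)
print("people who improved by averaging:", improved)
print("crowd average of first guesses:", crowd_first, "error:", error(crowd_first))
print("crowd average of averages:", crowd_within, "error:", error(crowd_within))
```

With these eight guesses, the crowd average of first guesses misses by 302 and the average of the averages misses by 752, which matches the comparison in the text.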

[You can see what seems to be the original article from Psychological Science here.]

### 4 Responses to “Make a guess, any guess”

1. Jon Ingram Says:

It’s an interesting statistical experiment, and one I’ll file away to possibly try with some of my sets next year. You raise a point in your post:

“Better accuracy comes when the guesses are several weeks apart, but several commenters note that this may be because people looked up the answer in the meantime.”

They address this in the paper:

“This benefit of averaging cannot be attributed to subjects’ finding more information between guesses, because second guesses were less accurate than first guesses in both the immediate condition, and the delayed condition.” (page 4).

They also attempt to quantify the benefit of asking someone twice compared to asking two people the same question:

“Simply put, you can gain about 1/10th as much from asking yourself the same question twice as you can from getting a second opinion.” (page 5).

Not entirely sure how valid their attempt to quantify the relative benefit is, but it’s at least not immediately implausible, which makes it better than average for the use of statistics in the social sciences :).

2. Ξ Says:

Thanks for your more-careful reading of the paper — I missed that last night. My informal survey also differed from the authors' in another important way. They said:

“It is important that neither group knew they would be required to furnish a second guess, as this precluded subjects from misinterpreting their task as being to specify the two endpoints of a range.” (p. 3)

but I actually asked for two guesses.

I wonder if knowing about this study would change the guesses at all. If my first guess for the number of people on the Titanic were 1000 and my second guess was going to be 2000, maybe I’d average those two and choose 1500 as my second guess. Which quickly leads to a kind of series problem, although one that (I suspect) rapidly loses any sort of scientific validity!

3. Hank Says:

Referring to the improvement over 3 weeks, you said “but several commenters note that this may be because people looked up the answer in the meantime.”

The authors of the original study responded to that issue here:

http://www.edvul.com/crowdwithin.php

Hank

4. Ξ Says:

Thanks Hank! I made a correction to my summary about this. The Ed Vul site is interesting — thanks!