September 16, 2004

Bad Statistics Give Me Chest Pains

On the radio last week I heard a local news reporter say of Bill Clinton that "90% of some of his arteries were blocked" (prior to his quadruple bypass). I grimaced and groaned, and I think I might have made some derogatory remark about the reporter's intelligence. Ninety percent of *some*?! Nothing like being exactly vague with your statistics. Note also that the reporter wasn't specific about which arteries: we're talking about coronary arteries here, not just any old arteries pushing blood around the body.

The facts:
Most reports I heard were more accurate and did a better job of communicating a technical statistic to the average "Joe" on the street. For example, this AP photo has a good caption that states the stat more precisely:
"Clinton was at high risk of a heart attack before his quadruple bypass surgery Monday, with several arteries well over 90 percent blocked"

Doing a bit of Googling turns up more inaccuracies in reports of the same story:

"His arteries were 90 percent blocked."
from KABC-TV Los Angeles: Doctors: Clinton Dodged a Major Bullet
...all of his arteries?

"Monday's surgery revealed that his arteries were 90 percent blocked."
from WCVB-TV Boston: After Clinton Scare, Docs Urge Heart Vigilance
...exactly 90%? And again, all of his arteries?

"Clinton remained in intensive care after cardiologists performed a four-hour operation Monday to bypass four clogged arteries. They were so severely blocked that less than 10 percent of the normal blood flow was getting to his heart..."
from New York Post Online: CLINTON'S RECOVERY GOES WELL, DOCS SAY
...Wow! We're not talking about a few arteries that are mostly blocked, we're talking about 90% less blood flow in total to the heart!!!???


When dealing with statistics, whether related to the health of a former President or a recent usability test, it's important to stay accurate. Don't try to quote an exact stat unless A) you can get the stat right, and B) your audience will be able to follow the stat's context and content.

It's one thing to say "in testing we found that most people didn't use the site map" and a totally different thing to say "our tests and third party research show that 73% of users will not find a site map or site index useful in locating detailed product information on consumer web sites." Stats can be hard to take in by ear, so putting them in print or on a slide can help people follow them. (Note: that stat example is entirely made up...)

Finally, if you find yourself trying to summarize a stat, be careful that you're not changing the meaning (as many of the news reporters on the Clinton story did).

1 comment:

Stuart Kruse said...

Hi Lyle,

I couldn't agree more. Statistics are difficult for most people to understand and interpret, so we should be especially careful when presenting this type of data.

A few thought-wonders that have just popped into my mind:

1) I remember a psychology study that examined how children interpreted statistics. It found that they used this type of information in a blunt, absolute fashion, such that they would interpret a statement such as 'There is an 80% chance that the operation will succeed' as meaning it DEFINITELY WOULD succeed. Equally, a probability such as a 10% chance of failure was seen as meaning zero chance of failure. I wonder how long this naive view persists into adulthood, and what the effects of learning, IQ and the like are?

2) In your post you cite an example of a better statistic,

"our tests and third party research show that 73% of users will not find a site map or site index useful in locating detailed product information on consumer web sites."

I think this statistic would be better if actual figures were quoted (73 out of 100 users will not find...). I think people have a tendency to get tricked by percentage figures. I believe they read 90% as ALWAYS meaning lots (even though, for example, 90% of 10 users would only be 9 people) and small percentages such as 1% as always meaning little (even though a 1% risk of dying in a large population may mean the death of lots of people).

3) I've always found weather statistics a bit odd. What does it mean when the weatherman (or lady) says, 'There is a 90% chance of rain'? Most of us probably have a naive slide rule in our head where 90% means almost certain (toward the certain end of the scale). But, in a statistical sense, 90% means something like: 9 times out of 10 an event will happen. This is problematic for predicting a single day's weather, as there is no such thing as the probability of a single event (or so I've read; apparently this is what some mathematicians say). Could it mean something like, 'when running weather simulations, in 9 out of the 10 runs, rain was predicted'? I just don't know! I'm flummoxed! When I studied Chaos Theory, they had a more interesting measure of the unpredictability of a system. Using this, the weather report would give a Predictability Index for a forecast, which told you to what degree you could trust the forecast. A low index would tell you that the weather system was in a stable state and prediction was possible and likely to be good. A high index would say the system was in a chaotic state, with predictions not able to be trusted.
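To make the arithmetic in point 2 concrete, here is a minimal Python sketch that prints a percentage next to the head count it implies. The helper name `describe_share` and the population of one million are illustrative assumptions; only the "90% of 10 users" figure comes from the text above.

```python
def describe_share(percent, population):
    """Return a percentage together with the head count it implies."""
    # Round to the nearest whole person; a share of people is a count.
    count = round(population * percent / 100)
    return f"{percent}% of {population:,} = {count:,} people"


# A large percentage of a small group is still a small number of people.
print(describe_share(90, 10))

# A small percentage of a large group can be a lot of people.
# (The population of one million is a made-up figure for illustration.)
print(describe_share(1, 1_000_000))
```

Quoting the count alongside the percentage lets the reader see the scale for themselves instead of guessing it from the percentage alone.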

Sorry for this ramble, obviously not busy enough at work today.

Mindful