Probably Overthinking It: How to Use Data to Answer Questions, Avoid Statistical Traps, and Make Better Decisions
Author(s): Downey, Allen B.
ISBN No.: 9780226822587
Pages: 256
Year: 2023-12
Format: Trade Cloth (Hard Cover)
Price: $ 33.12
Dispatch delay: Dispatched between 7 to 15 days
Status: Available

Let me start with a premise: we are better off when our decisions are guided by evidence and reason. By "evidence," I mean data that is relevant to a question. By "reason," I mean the thought processes we use to interpret evidence and make decisions. And by "better off," I mean we are more likely to accomplish what we set out to do, and more likely to avoid undesired outcomes. Sometimes interpreting data is easy. For example, one of the reasons we know that smoking causes lung cancer is that when only 20% of the population smoked, 80% of people with lung cancer were smokers. If you are a doctor who treats patients with lung cancer, it does not take long to notice numbers like that. But interpreting data is not always that easy.
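To see why the smoking figures above are so striking, it helps to turn them into a relative risk. The sketch below is only an illustration: the function name `relative_risk` and its parameters are assumptions introduced here, not anything from the book, and the calculation treats the two percentages as exact shares of the same population.

```python
def relative_risk(exposed_share_of_population, exposed_share_of_cases):
    """Risk of disease in the exposed group divided by risk in the unexposed group.

    Both arguments are fractions between 0 and 1 (exclusive). The overall
    disease rate cancels out of the ratio, so it is not needed.
    """
    p = exposed_share_of_population
    q = exposed_share_of_cases
    risk_exposed = q / p                # proportional to P(disease | exposed)
    risk_unexposed = (1 - q) / (1 - p)  # proportional to P(disease | unexposed)
    return risk_exposed / risk_unexposed


# Figures from the text: 20% of people smoked, 80% of lung cancer patients were smokers.
smoking_rr = relative_risk(0.20, 0.80)
print(f"Smokers' risk relative to nonsmokers: about {smoking_rr:.0f}x")
```

Under these assumptions the ratio comes out to roughly 16, which is the kind of gap that is hard to miss.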


For example, in 1971 a researcher at the University of California, Berkeley, published a paper about the relationship between smoking during pregnancy, the weight of babies at birth, and mortality in the first month of life. He found that babies of mothers who smoke are lighter at birth and more likely to be classified as "low birthweight." Also, low-birthweight babies are more likely to die within a month of birth, by a factor of 22. These results were not surprising. However, when he looked specifically at the low-birthweight babies, he found that the mortality rate for children of smokers is lower, by a factor of 2. That was surprising. He also found that among low-birthweight babies, children of smokers are less likely to have birth defects, also by a factor of 2. These results make maternal smoking seem beneficial for low-birthweight babies, somehow protecting them from birth defects and mortality.


The paper was influential. In a 2014 retrospective in the International Journal of Epidemiology, one commentator suggests it was responsible for "holding up anti-smoking measures among pregnant women for perhaps a decade" in the United States. Another suggests it "postponed by several years any campaign to change mothers' smoking habits" in the United Kingdom. But it was a mistake. In fact, maternal smoking is bad for babies, low birthweight or not. The reason for the apparent benefit is a statistical error I will explain in chapter 7. Among epidemiologists, this example is known as the low-birthweight paradox. A related phenomenon is called the obesity paradox.


Other examples in this book include Berkson's paradox and Simpson's paradox. As you might infer from the prevalence of "paradoxes," using data to answer questions can be tricky. But it is not hopeless. Once you have seen a few examples, you will start to recognize them, and you will be less likely to be fooled. And I have collected a lot of examples. So we can use data to answer questions and resolve debates. We can also use it to make better decisions, but it is not always easy. One of the challenges is that our intuition for probability is sometimes dangerously misleading.


For example, in October 2021, a guest on a well-known podcast reported with alarm that "in the [United Kingdom] 70-plus percent of the people who die now from COVID are fully vaccinated." He was correct; that number was from a report published by Public Health England, based on reliable national statistics. But his implication, that the vaccine is useless or actually harmful, is wrong.
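One way to see how a figure like this can coexist with an effective vaccine is to work through the arithmetic with made-up numbers. The sketch below is purely hypothetical: the coverage and effectiveness values are illustrative assumptions, not figures from the Public Health England report, and the function `vaccinated_share_of_deaths` is a name introduced here. It also assumes vaccinated and unvaccinated people would otherwise face the same risk, which ignores differences such as age.

```python
def vaccinated_share_of_deaths(coverage, effectiveness):
    """Fraction of all deaths that occur among vaccinated people.

    coverage:      fraction of the population that is vaccinated (assumed)
    effectiveness: fractional reduction in death risk from vaccination (assumed)
    Unvaccinated risk is set to 1.0; only the ratio of risks matters.
    """
    risk_vaccinated = 1.0 - effectiveness
    deaths_vaccinated = coverage * risk_vaccinated
    deaths_unvaccinated = (1.0 - coverage) * 1.0
    return deaths_vaccinated / (deaths_vaccinated + deaths_unvaccinated)


# Hypothetical scenario: a vaccine that cuts the risk of death by 80%.
for coverage in (0.50, 0.80, 0.95):
    share = vaccinated_share_of_deaths(coverage, effectiveness=0.80)
    print(f"coverage {coverage:.0%}: {share:.0%} of deaths are among the vaccinated")
```

In this toy model, the vaccinated share of deaths climbs as coverage rises even though the vaccine's effectiveness never changes. When nearly everyone is vaccinated, most deaths occur among the vaccinated simply because that is where most of the people are.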

