Paul Rogers - NYT
Recent events have made clear that improving science literacy within our communities is critical to helping individuals make better decisions and engage in honest and informed discussions. However, it is equally clear that complaining about poor literacy while doing nothing to help improve it is of little use. Thus, in this post I attempt to provide some clarity around parts of the scientific process that I think are often poorly understood, and which underlie the mistrust and misinformation that obscure conversations about complex but critically important parts of everyday life. This list is by no means comprehensive; rather, my hope is to provide those with an interest in understanding how science can and should inform policy/behaviour with a starting point from which a larger conversation may evolve!
1. Scientific experiments are designed to provide evidence, not conclusions.
Even people who do their best to ground opinions and decisions in the best available science can be frustrated by recommendations that evolve or change completely over time. However, this represents a strength of the scientific process, not a flaw. An impartial scientist’s role is to assess the strength of the evidence in support of possible theories or courses of action, and to make a recommendation based on what the evidence suggests will be most effective or appropriate. Evidence gathering is always ongoing, though: if at any time the balance of the available evidence suggests that a different approach is more effective or appropriate, then the recommendation based on that evidence should also change. While it may appear indecisive, changing recommendations to reflect the best available evidence is the responsible approach to policy-setting. Unfortunately, the balance of evidence is most likely to change (and change dramatically) when little is known about an issue and novel data are rapidly emerging.
2. Experiments typically describe average effects, but individual experiences will vary (experiments also typically try to capture this variability).
Many experiments are designed to measure the difference in one factor that arises due to variability in some other factor. Examples include how life expectancy differs by sex, or how the risk of stroke differs in the presence/absence of daily aspirin. Very often, these studies (and the media coverage that follows) will focus on comparing the average or ‘typical’ values measured in each group; however, these measures often obscure the variability observed in the underlying data. Consider that a recent study determined life expectancy in Canada to be ~80 and ~84 years for men and women, respectively (StatsCan, 2018). Yet a man may outlive his sister, die of natural causes at 55, or live to be 100. None of these experiences contradicts the study’s findings; rather, they reflect the variability of individual experience.
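To make the life expectancy example concrete, here is a minimal simulation sketch. It uses the ~80 and ~84 year averages quoted above, but everything else is an assumption made purely for illustration: the bell-shaped (normal) distribution, the 12-year spread, and the function names are mine, not StatsCan's, and real lifespans are not normally distributed.

```python
import random
import statistics


def simulate_lifespans(mean, sd, n, rng):
    """Draw n hypothetical lifespans from a normal distribution (an assumption)."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def fraction_men_outliving(men, women):
    """Pair each simulated man with a simulated woman and count how often he lives longer."""
    pairs = list(zip(men, women))
    return sum(m > w for m, w in pairs) / len(pairs)


rng = random.Random(42)   # fixed seed so the sketch is repeatable
ASSUMED_SD = 12.0         # hypothetical spread in years, NOT taken from the study

men = simulate_lifespans(80.0, ASSUMED_SD, 100_000, rng)
women = simulate_lifespans(84.0, ASSUMED_SD, 100_000, rng)

share = fraction_men_outliving(men, women)
print(f"Average simulated lifespan, men:   {statistics.mean(men):.1f}")
print(f"Average simulated lifespan, women: {statistics.mean(women):.1f}")
print(f"Share of random pairs where the man outlives the woman: {share:.0%}")
```

Under these assumptions the two averages stay about four years apart, yet in a large share of randomly paired individuals the man still outlives the woman. The average difference and the individual exceptions coexist without contradiction.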
3. Experiments are designed to understand something about a population by taking measurements from a smaller sample of individuals.
Here, the term sample refers to the group of participants who actually completed the experiment, while population refers to the larger group of individuals about which the experiment aims to draw a conclusion (e.g. all emergency care physicians, all women over 40, all of humankind). Nearly all experiments involve sampling, which is typically done out of necessity, since it is often impossible to take a measure from every member of the population of interest. Well-designed studies pay a great deal of attention to constructing a sample that reflects the larger population as closely as possible. Creating this ‘representative sample’ allows the scientist to be more confident that their findings will apply more broadly to the population of interest. Consideration is also given to how large a sample needs to be to accurately capture differences that are thought to exist in the larger population, and to minimize the likelihood that the differences measured arose by chance. Overall, if the sample studied in a given experiment is sufficiently large and representative of the larger population, we can be more confident that the results obtained will apply more broadly. However, we can be much more confident if multiple research groups have obtained similar results using different but comparable samples.
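The effect of sample size can be sketched with a short simulation. The population below is entirely made up (50,000 values drawn from a normal distribution with a mean of 80 and a spread of 12), and the function name is hypothetical. The sketch only illustrates sample size; it says nothing about whether a sample is representative, which depends on how participants are chosen.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, n_repeats, rng):
    """Repeat the 'experiment' n_repeats times and report how much the sample averages vary."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_repeats)
    ]
    return statistics.stdev(means)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: the values are invented for illustration only.
population = [rng.gauss(80.0, 12.0) for _ in range(50_000)]

spread_small = spread_of_sample_means(population, 10, 200, rng)
spread_large = spread_of_sample_means(population, 1000, 200, rng)

print(f"Spread of averages across repeated samples of 10:   {spread_small:.2f}")
print(f"Spread of averages across repeated samples of 1000: {spread_large:.2f}")
```

Each repeat stands in for a different research group drawing its own sample. With only 10 participants the groups' averages disagree noticeably with one another; with 1000 participants they cluster tightly around the population value, which is why larger samples and replicated results both increase confidence.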
4. Not all available evidence is created equal.
Before an experiment is published in what we might consider to be a ‘reputable journal’, the work is typically reviewed by between 2 and 5 other scientists with expertise in the area of study. This ‘peer-review’ process is designed to provide a consensus that the study was properly designed and well-executed, and that the conclusions discussed follow logically from the data collected. However, not all published work undergoes this process of review. For example, there are many seemingly legitimate publications (sometimes referred to as ‘predatory journals’) that often do not abide by the peer-review process described above. Moreover, there are a number of websites (e.g. bioRxiv) that have been designed to allow scientific works to be shared prior to the peer-review process (these works are often referred to as ‘pre-prints’). While many of these papers will go on to be peer-reviewed and published, unless you are an expert in the field of study, it can be extremely difficult to discern good science from bad at this stage*. Thus, while there may be legitimate experimental results that contradict popular opinion (see the relationship between the balance of evidence and policy setting in point 1), it is important to consider the source of information carefully.
*It is worth noting that pre-print servers are very valuable research tools that accelerate the transfer of important work between research groups and provide opportunities for expanded peer-review.