Confirmation Bias
The Original Error
Lesson 1:
Our biased brains
1.1 Defining confirmation bias
Confirmation bias is often referred to as the “original error” in human cognition. In simple terms, it is the mind’s tendency to seek out information that supports the views we already hold by selectively filtering data and distorting analyses (Casad and Luebering, 2025)—even when evidence to the contrary is available (Nickerson, 1998). This bias is not a minor quirk. Instead, it is a cognitive phenomenon that influences every phase of scientific work, from designing experiments and formulating hypotheses to interpreting data.
This innate, unconscious proclivity of the human mind persists even when stakes are trivial and tasks are simple, and its impact is felt well outside of science: from its role in popular performances like mind reading to the idea of a “lucky” sweater, confirmation bias pervades our day-to-day lives.
In this unit, the term “confirmation bias” also covers two related cognitive biases: expectation bias and observer bias. Expectation bias (also called experimenter's bias) is the tendency for researcher expectations to influence outcomes by altering the behavior of subjects. Similarly, observer bias is the tendency for researcher expectations to influence outcomes by altering how researchers perceive and score those outcomes.
Let’s look at both of these related biases more closely!
"Confirmation Bias . . . is the mind’s tendency to seek out information that supports the views we already hold by selectively filtering data and distorting analyses."
Expectations can produce changes. In the 1960s, Robert Rosenthal and Kermit Fode asked students to train two groups of rats (labeled "maze-bright" and "maze-dull") to solve mazes.
The rats were actually genetically identical. Student expectations, however, affected training and created performance differences by the end of the study. In other words, the expectations of the student experimenters produced an actual, measurable difference in performance between the two groups of rats.
This is an example of expectation bias affecting outcomes by altering subject performance. Note that the students had no real active interest in one group outperforming another and no personal stakes in a particular outcome – they had simply been primed to expect such an outcome.
Expectations can also affect data collection. In a similar study, Tuyttens et al. (2014) asked students to rate the sociability of two breeds of pigs ("normal" and "high social breeding value" [SBV+]) from video recordings. The videos were actually identical, yet students rated the SBV+ pigs as more social.
Unlike the Rosenthal and Fode study, the students did not interact with the pigs, so differences in ratings could only result from different human perceptions of the same video data. In other words, the expectations of the data collectors caused them to perceive differences that were not actually there: a clear instance of observer bias, again in a situation where bias presented itself even though the students had no personal stake in the outcome.
Tip:
It’s worth noting that although expectation and observer bias can be considered sub-types of confirmation bias because they involve expectation-driven distortions, the same is not true of all biases. Fundamental attribution error, conformity bias, and the availability heuristic are all examples of cognitive biases that are distinct from confirmation bias.
Activity: Deduce the Number Rule
Before delving further into definitions, try out a revamped version of a task first developed by Peter Cathcart Wason in 1960. Your goal is to deduce a secret rule that matches certain sequences of three numbers.
But there is a catch—you can’t guess the rule directly!
Form a hypothesis by interpreting results from your guesses, then submit your final hypothesis to see if you were correct!
Post-activity questions:
- How did that go? Did you falsify your hypotheses?
- When testing sequences, did you match the hypothesis you had in mind or try to falsify that hypothesis? Why?
- Were there moments when you realized you needed a new strategy?
- How might collaboration with others help avoid some of the pitfalls demonstrated in the task?
1.2 Real-world studies of confirmation bias
As evidenced by this task, humans are inclined to confirm rather than falsify hypotheses. Many quickly latch onto an initial pattern (say, “increasing by 2”). Once an early hypothesis is formed, subsequent searches for evidence tend to focus on confirming that rule, with little effort to seek out counterexamples (Wason, 1960). This results in unconsciously filtering out disconfirming evidence.
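Wason's original hidden rule was, in fact, simply "any ascending sequence of numbers." A minimal Python sketch (with the rule hard-coded for illustration) shows why a confirm-only strategy cannot distinguish the narrow "increasing by 2" hypothesis from the broader true rule:

```python
def secret_rule(triple):
    """Wason's actual hidden rule: any strictly ascending sequence."""
    a, b, c = triple
    return a < b < c

# A confirm-only strategy: test triples that MATCH "increases by 2".
confirming = [(2, 4, 6), (10, 12, 14), (1, 3, 5)]
print([secret_rule(t) for t in confirming])   # [True, True, True] -- looks like support

# A falsifying strategy: test triples the "increases by 2" hypothesis
# predicts should fail. If they still match, the hypothesis is wrong.
falsifying = [(1, 2, 3), (5, 10, 20)]
print([secret_rule(t) for t in falsifying])   # [True, True] -- hypothesis refuted
```

Every confirming test returns `True` under both the narrow and the true rule, so no amount of confirmation separates them; only the sequences predicted to fail carry diagnostic information.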
It’s not difficult to imagine that if such bias emerges even under the simple constraints of the Wason task and the low stakes of the student experiments mentioned earlier, it may well present all the more prominently in higher stakes scientific settings.
By distorting thinking, confirmation bias skews how scientific hypotheses are conceived, tested, and evaluated
Much like throws aimed at a bullseye in a game of darts, scientific investigation can be either precise or imprecise, accurate or inaccurate, in its pursuit of truth—precision being a measure of consistency, and accuracy being a measure of correctness. In that sense, bias is a failure of accuracy—it refers to a distortion of our thinking which skews how scientific hypotheses are conceived, tested, and evaluated (Kahneman & Tversky, 1996).
Crucially, this kind of error will mislead us even if we make the most precise measurements using the most state-of-the-art equipment (Born, 2024), because it appears at every stage of scientific investigation:
Early on, bias can predispose you to make certain observations and favor certain hypotheses. Later in the process, it can influence your data collection and your data analysis alike, motivating—consciously or unconsciously—questionable practices like p-hacking and circular analysis (Simmons et al., 2011; Kriegeskorte et al., 2009; Button, 2019). It thwarts the central goal of scientific research: building objective experiments and reaching an impartial interpretation of their results.
Tip:
In some ways, one might well argue that science itself is meant to be a means of combating the biases that govern non-scientific insight—after all, the systematic rigor of the scientific method helps mitigate many of the deficiencies of unscientific observation and reasoning. In that sense, confirmation bias is one such deficiency that has proven particularly pervasive and requires additional caution.
This inclination toward confirmation bias can be managed, however. In the activity inspired by the Wason task, for instance, deliberately looking for counterexamples that might refute a guess is a conscientious counter to the implicit pull toward confirming beliefs.
In other scientific research, confirmation bias can be mitigated by:
- Building habits to recognize bias in thinking.
- Placing checks to pause and reflect on choices.
- Reducing error via principles of rigorous experimental design.
- Making clear distinctions between exploratory and confirmatory research.
Numerous studies have illustrated how confirmation bias can subtly yet powerfully affect behavior and decision-making. For example, a study by Dror and colleagues (2006) showed how contextual cues can undermine forensic expertise. In a within-subject design, five fingerprint examiners (averaging 17 years' experience) reviewed prints they had previously matched.
When given misleading context that the FBI had misidentified the prints in a high-profile case, four changed their opinions: three ruled the prints non-matches and one deemed the evidence inconclusive, despite clear instructions to ignore the extra information. Only one examiner stuck with the original match. The study highlights the impact of cognitive bias on professional judgment and supports safeguards like blind examinations in forensic science.
In another published example of this tendency to discount information that undermines prior judgments, participants were placed in a betting scenario in which the size of their wagers reflected their confidence in their predictions. When participants had betting partners who agreed with their predictions, they greatly increased their bets. When their partners disagreed, they only slightly decreased their wagers and held to their initial predictions nonetheless. This imbalance—where confirmation has a stronger effect than refutation—demonstrates that preexisting beliefs can exert a powerful influence on behavior (Kappes et al., 2020).
Confirmation bias affects real-world decision-making, from the way we interpret experimental results to how we form our opinions on scientific theories. Even in high-stakes scenarios, such as clinical trials or policy-making, the tendency to overvalue confirmatory evidence can lead to inflated expectations and even the misinterpretation of data. For instance, early decisions in research might lead to a selective search in the literature or a tendency to report only positive findings. This, in turn, can skew the overall picture of a scientific field.
This is not just theoretical. In fact, later in this unit we will discuss reviews that have found that studies that fail to implement countermeasures to address confirmation bias consistently report larger effect sizes than studies that do. This is another example of the real-world impact unaddressed bias is having right now.
1.3 Why this matters to neuroscientists
For neuroscientists, the stakes of confirmation bias are particularly high. Neuroscience research often grapples with complex systems and subtle signals. When an initial hypothesis is formed, there is a risk of overlooking alternative mechanisms.
Consider a scenario where a hypothesis posits that a particular neurotransmitter is central for a specific behavior. If a researcher unconsciously prioritizes data that supports this hypothesis, it increases the likelihood of overlooking critical findings that suggest a role for other neurotransmitters or neural circuits. This oversight can lead to incomplete conclusions and develop an overconfidence in a flawed model of brain function.
The design and interpretation of experiments have far-reaching implications for understanding the brain. To counteract these pitfalls, it is essential to cultivate a mindset that actively challenges initial assumptions. Developing habits like deliberately searching for disconfirming evidence, discussing alternative explanations with colleagues, and employing robust statistical methods can help guard against the trap of confirmation bias. By integrating these practices, neuroscientists can design experiments that are more objective and reliable.
Developing habits like deliberately searching for disconfirming evidence, discussing alternative explanations with colleagues, and employing robust statistical methods can help guard against the trap of confirmation bias.
Takeaways:
- Confirmation bias is one of many naturally occurring cognitive biases that steer thinking and observation toward evidence that confirms existing beliefs.
- Neural reward systems reinforce this bias, making it difficult to recognize or challenge preconceptions.
- Being aware of confirmation bias is the first step toward designing experiments that minimize its impact.
- To counteract this tendency, individuals can intentionally seek out examples that violate hypotheses or other expectations.
Reflection:
- Can you recall a time in your lab work where an unexpected result made you question your assumptions?
- When was the last time you formed a hypothesis and then found yourself ignoring an alternative explanation?
- Being mindful of that feeling is an important first step toward taming confirmation bias in the lab.
Lesson 2:
Formulating
Rigorous Hypotheses
2.1 The Pitfalls of a Single, “Favored” Hypothesis
Confirmation bias is often revealed when scientists begin with an initial idea about the link between a cause and an outcome, like "when A happens, then that leads to B," and then design experiments to confirm that hypothesis. This kind of inquiry is an essential part of scientific discovery and experimentation, but it can lead to a project that focuses on demonstrating that a link between A and B exists, rather than conducting a more rigorous exploration of how A and B interact within a biological system.
Consider the Deduce the Number Rule activity: you may have come up with an idea for the number rule and then tested a number sequence that matched that rule. Confirmation bias creeps in when you interpret a matching result as confirmation of your rule. In actuality, a matching number sequence is merely evidence consistent with your rule—other rules could have produced the same set of results!
Let's look more closely at the kinds of stumbling blocks encountered in scientific studies.
Case 1
If you start with a vague hypothesis, it is easy to interpret many kinds of results as supporting your hypothesis. For instance, if you suspect that "when A happens, then that leads to B," you may end up designing an experiment where the results show that A and B happen at the same time. However, just as in the number rule activity, this set of results is also consistent with other explanations:
- The cause and effect are actually reversed, and B is causing A.
- The experiment did not implement proper controls to identify what happens when A is NOT present, or to identify the conditions where B is NOT present (for more, see our forthcoming unit on controls).
- Some other effect results in both A and B, and so they are always observed together.
To put this in context, suppose our hypothesis is “exercise improves mood.” We test it by randomly assigning people to an exercise group (30 minutes on a treadmill) and a control group (30 minutes of reading), measuring each person's happiness on a 1-10 scale before and after the session.
The results might show any of the following: on average, happiness
- increased more in the exercise group, relative to the control group
- change was about equal in both groups
- increased less in the exercise group, relative to the control group
One could also imagine any number of explanations for these results (with the possibility of multiple explanations happening at the same time):
- Some forms of exercise improve mood, but not running.
- Exercise improves mood for some people but not others.
- Running for 30 min decreases mood for people who dislike running.
- Exercise causes endorphin release, which improves mood, but on a longer timescale than is measured by the study.
- Exercise is more effective at mitigating sadness, but does not directly improve happiness.
Since our hypothesis never specified how much improvement we expect, in which populations we expect to observe the improvement, or under what specific conditions we expect to observe it, many different potential outcomes can be interpreted as supporting our idea. Thus, confirmation bias can lead us towards an interpretation favorable to our hypothesis, and not much actually ends up being learned.
A specific hypothesis (one that delineates which variables, which populations, and what type of change) constrains what evidence can count as support, thereby making it harder for confirmation bias to have an influence.
Case 2
If you design a study to test a hypothesis, but the study design does not perform a strict or systematic test, confirmation bias may lead to erroneous interpretations of the results as supporting the hypothesis.
This can be seen in the common framework of Null Hypothesis Significance Testing (NHST), where we would start by designating the hypotheses:
- Null hypothesis (H0)—there is no relationship between A and B.
- Alternative hypothesis (Ha)—there is a relationship between A and B.
We might approach the study by collecting data about A and B and analyzing it in the hopes that the p-value falls below our pre-defined significance level (usually 0.05). If it does, the results are deemed statistically significant, meaning that data at least as extreme as ours would be unlikely if the null hypothesis were true. We've successfully rejected the null hypothesis!
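To make the mechanics concrete, here is a minimal sketch of one way a p-value can be computed: a one-sided permutation test on the earlier exercise-and-mood example. The happiness-change scores are invented for illustration only.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical happiness-change scores (invented numbers, illustration only)
exercise = [2, 3, 1, 4, 2, 3, 5, 2]
control  = [1, 2, 0, 1, 3, 1, 2, 0]

observed = statistics.mean(exercise) - statistics.mean(control)

# Permutation test: if the null hypothesis (no group difference) is true,
# group labels are arbitrary. How often does reshuffling the labels
# produce a difference at least as large as the one observed?
pooled = exercise + control
n = len(exercise)
n_perm = 10_000
count = 0
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[:n]) - statistics.mean(pooled[n:])
    if diff >= observed:
        count += 1

p_value = count / n_perm
print(round(observed, 2), p_value)  # observed difference and estimated p-value
```

Note what the p-value does and does not say: it estimates how often data this extreme would arise *if the null were true*; it says nothing about which specific alternative explanation is correct.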
But what does rejecting the null hypothesis actually tell us? The null hypothesis states that there is no relationship between A and B—but was this a meaningful hypothesis to begin with? In most situations, an experiment is motivated by the expectation of a specific kind of relationship between A and B, which we hoped to elucidate through research.
Tip:
This highlights a common misunderstanding of Null Hypothesis Significance Testing as equivalent to, or indicative of, confirmatory research. NHST does not automatically render your hypothesis well-defined or rigorously tested; rather, it is a method of statistical inference which—properly applied—estimates how likely a result at least as extreme as the observed one would be if the null hypothesis were true.
What we should have done is first determine what aspects of the hypothesized relationship between A and B are of scientific interest. For example, this could be a specific quantitative relationship (e.g. exponential growth, linear and decreasing, saturating response) or that the relationship only holds under certain conditions or for certain members of the study population. From this, we can design a suitable test for the relationship or the boundary conditions of the phenomenon.
Returning to the example of the Deduce the Number Rule activity, if our hypothesis is that the rule is "successive numbers increase by 2", we would expect certain number sequences to fail to match:
- number sequences increasing successively by 1
- number sequences increasing successively by 4
- number sequences where the first increase is by 2, and the second increase is by a different amount (or not at all)
- number sequences where the first increase is by a different amount (or not at all), and the second increase is by 2
Testing these alternative sequences and finding that they all fail to match the rule is thus stronger evidence that the rule is "successive numbers increase by 2" (and only by 2). By designing the study to demonstrate falsifiability of the hypothesis, we prevent over-interpretation of the evidence, and thus reduce the influence of confirmation bias.
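This falsification strategy can be sketched directly. The snippet below encodes the hypothesized rule and checks that each sequence the hypothesis predicts should fail actually does fail:

```python
def increases_by_two(seq):
    """Hypothesized rule: each successive number is exactly 2 larger."""
    return all(b - a == 2 for a, b in zip(seq, seq[1:]))

# Sequences the hypothesis predicts should NOT match:
should_fail = [
    (1, 2, 3),   # increases by 1
    (1, 5, 9),   # increases by 4
    (1, 3, 8),   # first step is 2, second is not
    (1, 4, 6),   # first step is not 2, second is 2
]
print([increases_by_two(s) for s in should_fail])   # [False, False, False, False]
print(increases_by_two((2, 4, 6)))                  # True
```

Each predicted failure that actually fails rules out a family of rival rules (increments other than 2, or rules that only constrain one step), which is evidence a confirming triple alone can never provide.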
Case 3
If you design an experiment around two hypotheses (H1 and H2) and they are NOT mutually exclusive, your experiment may not have a clear objective or interpretable set of results. While it is great to start with two hypotheses, if they are not mutually exclusive, remember that they could both be true at the same time (or alternatively, neither could be true)! Without providing definitive evidence on what the actual relationship is between A and B, future studies may not have much to build on.
Let’s imagine that we want to study the impact of caffeine on test performance and devise two hypotheses:
- H1: Caffeine improves alertness.
- H2: Caffeine improves motivation.
These hypotheses are not mutually exclusive – caffeine might improve both alertness and motivation – so if we find that the caffeinated subjects outperform their counterparts, we have no way of knowing which mechanism is responsible. We could, however, revise these:
- H1: Caffeine improves performance even when motivation remains constant.
- H2: Caffeine improves performance only by increasing motivation.
Now we are able to design an experiment that sets up both contexts (by explicitly measuring and controlling for motivation), and in so doing discern motivation's role in the underlying mechanism.
There is one case where it could be acceptable to have two hypotheses that can be true at the same time. This applies when the two hypotheses are known to occur independently (e.g. the relationship between A and B occurs through multiple pathways), and the goal of a study is to quantify the frequency or strength of the two hypothesized pathways. In such a study, it might still be valuable to implement a mutually exclusive control to establish the validity of the study design.
The Solution
The common thread through these three cases is that vague hypotheses and open-ended study design enable confirmation bias to take root. Conversely, we can support rigorous science by planning ahead with our hypothesis and study design. We can do this by ensuring three key qualities for rigorous hypotheses:
- Specific: What are the independent and dependent variables? How will they be measured as part of the study? Is the goal to test a general explanation of a phenomenon or a specific instance?
- Falsifiable: Under what conditions will the hypothesized effect occur? When will it fail to occur?
- Contextual: What are the populations, experimental factors, and pathways that influence the hypothesized effect? If a study is investigating these moderators, are there proper controls?
We would be remiss not to mention the strategy of "Strong Inference" (Platt, 1964). In that paper, Platt recommends that experiments should distinguish between mutually exclusive hypotheses:
It seems to me that the method of most rapid progress in such complex areas, the most effective way of using our brains, is going to be to set down explicitly at each step just what the question is, and what all the alternatives are, and then to set up crucial experiments to try to disprove some.
We agree with Platt's general sentiment that each study or experiment should provide useful and interpretable evidence. However, we do not insist that every study pit two (or more) competing hypotheses against each other. In many cases, rigorous, important work involves articulating the boundaries of hypotheses and the contexts in which they operate. Hence our focus on the properties of a hypothesis being falsifiable and contextual.
2.2 Activity: Strategies for Hypothesis Generation
In the next activity, you will have the opportunity to practice developing specific and falsifiable hypotheses. Aim to move beyond broad explanations: identify the underlying mechanism and specify clear, testable predictions. A strong hypothesis also makes explicit what evidence would falsify it.
Post-activity questions:
- Did you come up with a plausible competing hypothesis?
- Which prompts helped you generate a competing hypothesis? Did any of the prompts change your view of the initial hypothesis?
- Were there additional ways of altering the initial or competing hypothesis?
- What strategies helped you think beyond your initial explanation?
- Did you have a preferred competing hypothesis? How did it complement or directly challenge the original idea?
Developing a strong hypothesis is challenging, and part of strengthening a hypothesis is putting it up against contradicting ideas. Remember: it is necessary to be thoughtful about initial hypotheses and consider opposing ideas that could equally explain underlying mechanisms or observed effects.
2.3 Rigorous experiments start with good hypotheses
To recap, a rigorous experiment requires meaningfully challenging your hypotheses by:
- Developing a hypothesis that is specific, falsifiable, and contextual.
- Designing a study or experiment that challenges the hypothesis, and which can provide strong evidence regardless of how the results turn out.
- Iterating on the hypothesis and study design to make sure they are rigorous and well-aligned.
What are some specific actions that can be taken to support these steps?
- Examine your hypotheses: Look for a set of hypotheses that make predictions that are incompatible with each other. Ensure that these hypotheses are scientifically meaningful and biologically plausible.
- Exploratory pre-studies: Conduct pilot experiments or look at existing data to see if a hypothesis is indeed a plausible explanation for the observations.
- Seek opposing views: Talk to colleagues who are skeptical. They may quickly offer other competing ideas, helping you to refine how you approach your study design.
- Stay updated on literature: Carefully seek out literature about contradictory or inconclusive findings to ensure that you don't get tunnel vision about the publications that support your favored explanation.
- Be explicit about the circumstances under which the hypotheses should be rejected.
Takeaways:
- Confirmation bias can lead you to focus on a favored hypothesis, causing you to interpret supportive evidence as definitive proof, even when other plausible explanations exist.
- If you design experiments around competing hypotheses that make incompatible predictions, you are more likely to achieve a clear differentiation between potential explanations, leading to more conclusive and informative results.
- Designing experiments that attempt to disprove hypotheses can often be both more efficient and more elucidative than designing experiments that attempt to prove them.
- Writing specific, falsifiable hypotheses ensures that research design explicitly tests the causes and underlying mechanisms of a given phenomenon rather than just confirming what you already believe.
Reflection:
- Think back to a recent project: where did you feel a strong urge to prove your favorite explanation instead of testing whether it could be wrong?
- When you design an experiment, how do you make room for a rival idea that could oust your front-runner hypothesis?
- Recall a time your results “fit” a vague prediction; what alternative stories about the data did you overlook?
