Causality

From 'what' to 'why'

Lesson 1:

Correlations point towards causal explanations

Summary

This lesson explores how correlations serve as starting points for causal inquiry. Learners examine an example of a correlation and use it to brainstorm causal explanations. The lesson then introduces the form of causal questions and contrasts two neurophysiological studies: one that tests a causal claim indirectly, and another that uses direct manipulation.

Goal

Identify how correlations lead to causal explanations that can be tested, and analyze study designs to understand when evidence is correlational and when it comes from direct manipulation.

1.1 Why causality matters

What makes it possible to say one thing has caused another?

This question is at the heart of scientific inquiry across disciplines. As humans, we instinctively seek out explanations for the phenomena that we observe. And in neuroscience, we care about causal questions because the answers can give us insight into how the brain works, or even lead to treatments for neurological conditions.

Example causal questions:

  • Does dopamine reinforce learning?
  • Does Parkinson's disease involve dopamine dysfunction?

Discovering accurate and actionable answers to these kinds of questions can be quite challenging! This unit will take you through the key methods that scientific research uses to draw causal inferences and avoid some of the traps.

We begin, as inquiry often does, with a simple observation of two phenomena that appear to be related.

Activity: Explain the correlation

Consider the two phenomena presented, and suggest an explanation for why they may appear to be related.

Would you say that one causes the other?

Post-activity questions:

  • Does your explanation contain a hypothesis or link that could be tested scientifically?
  • What factors, if any, give the appearance of a causal link between chocolate consumption and Nobel laureates per capita?
  • What kinds of information would you need to be able to make a causal claim?

Since Messerli first documented this correlation between chocolate consumption and Nobel laureates across countries in 2012, others have reported similar correlations involving variables such as milk consumption or GDP per capita. In truth, eating chocolate is almost certainly not going to make you a Nobel laureate. This nugget does, however, reveal several important insights into how to think about causation.

First, the presence of a correlation is not sufficient evidence to suggest causation (the old adage that "correlation is not causation").

Second, by thinking about potential explanations for a correlation, we can land on causal questions to investigate.
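To make both insights concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the "wealth" variable, the effect sizes, and the number of simulated countries are assumptions, not real data about chocolate or Nobel prizes. It shows a strong correlation arising from a shared cause, and that correlation disappearing once chocolate consumption is set directly rather than merely observed.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def simulate(n_countries=2000, intervene=False, seed=0):
    """Return the chocolate/laureate correlation across simulated countries."""
    rng = random.Random(seed)
    chocolate, laureates = [], []
    for _ in range(n_countries):
        wealth = rng.gauss(0, 1)  # hypothetical hidden common cause
        if intervene:
            # Direct manipulation: chocolate is set independently of wealth.
            choc = rng.gauss(0, 1)
        else:
            # Observation: wealthier countries happen to eat more chocolate.
            choc = wealth + rng.gauss(0, 0.5)
        # Laureates depend on wealth only, never on chocolate.
        nobel = wealth + rng.gauss(0, 0.5)
        chocolate.append(choc)
        laureates.append(nobel)
    return pearson(chocolate, laureates)


observed_r = simulate(intervene=False)
manipulated_r = simulate(intervene=True)
print(f"observed r = {observed_r:.2f}, manipulated r = {manipulated_r:.2f}")
```

In this toy world chocolate never influences laureates, yet the observed correlation is strong. Only the manipulated version reveals the absence of a causal link.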

1.2 The form of causal questions

Causal questions have the form: "If I make a change to an independent variable, does that cause a change in the dependent variable?"

Independent Variable | Dependent Variable | Causal Question
chocolate consumption | Nobel laureates | Does eating chocolate make you a Nobel laureate?
dopamine | learning | Does dopamine reinforce learning?
dopamine dysfunction | Parkinson's disease | Does Parkinson's disease involve dopamine dysfunction?

Each causal question introduced so far is listed above with its independent and dependent variables. Note that the exact wording of the question is not always the same! The verb that links the independent and dependent variables may encapsulate several details, such as:

  • to what extent the independent variable might be the sole cause of, or just one influence on, the dependent variable
  • whether the independent and/or dependent variables are binary (yes/no), quantitative (on a numeric scale), or of another type
  • hints as to the hypothesized causal pathway by which the independent variable exerts influence on the dependent variable

Some of these details will be covered in later lessons:

  • Lesson 2
  • Lesson 4
  • Lesson 5

Takeaways:

  • Research lacking proper controls might seem obviously flawed in hindsight, but critically evaluating research methods, thinking carefully about concerns that might arise from those methods, and implementing controls is the best defense against a dangerous false positive result. 
  • Experiments can be laid out in process diagrams to improve clarity and allow teams to work to identify potential concerns.
  • Controls work to address concerns. Concerns are any factor that can impact the strength of a scientific claim.

Reflection:

  • Can you recall a time you read research findings that were later walked back or retracted?
  • When was the last time you performed an experiment and discovered an unforeseen variable impacted your results?
  • What steps do you take to map out your research? Where do you incorporate and plan controls in that work?

Lesson 2:

Control, Controls, Controlled

Summary

This lesson disambiguates the word “control” in science. Is a “control group” the same as “controlling for” a variable? Not always. In this lesson, we’ll unpack these meanings and introduce a simple framework to help identify potential concerns and explore opportunities to manage variables that could mislead.

Goal

  • Identify potential sources of concern and differentiate among strategies available to apply experimental control — including but not limited to control groups. 
  • Explore prior knowledge and consider the reasoning YOU use when selecting an approach to “control.”

2.1 The many faces of “control”


In everyday language, “control” means to take charge or hold something steady. In scientific research, though, the word wears many hats. Sometimes it refers to a comparison group without an intervention. Sometimes it means managing variability. Sometimes it’s a strategy for protecting results from bias.

We think we know what control means, but it can be surprisingly hard to nail down in an experimental setting.

How many of these uses of the word “control” look familiar to you?

Example 1. “We used untreated fruit flies, fed the same food and raised in the same environment, as a control group.” This is an example of control as a comparison group. In this example, control refers to another study population subjected to the same conditions as the intervention group for the duration of the experiment, but not given the intervention.

Example 2. “We controlled for the influence of circadian rhythms by testing all participants 2 hours after waking.” This is an example of control meaning removing the influence of a variable. In this example, controlling for a variable refers to keeping that variable constant within the study in order to insulate an experiment from the effects of that variable.

Example 3. “We used stratified randomization to distribute mice across treatment arms, controlling for sex and age.” This is an example of control meaning distribution to create comparable groups. This example describes a treatment allocation practice to achieve the same insulating effect while still allowing the full range of a variable to be included in the study. It does so by taking steps to avoid the potential for uneven distribution of variability across the study groups.

Example 4. “Controlling for levels of daily exercise, drug ABC reduced the severity of neuropathic pain by an average of 62%.” This is an example of control meaning a statistical correction for the effect of a variable. Control in this example refers to the calculations performed to identify the impact of a variable on the outcome of an experiment in quantitative terms, then adjust all results in a way that removes that impact prior to hypothesis testing.
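The statistical correction in Example 4 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis behind the quoted claim: the pain scores, effect sizes, and exercise hours are all invented, and the helper names `fit_line` and `residualize` are hypothetical. The adjustment removes what exercise explains from both the treatment indicator and the outcome, then relates the leftovers, which gives the same drug coefficient as a linear regression that includes exercise as a covariate.

```python
import random


def fit_line(x, y):
    """Least-squares slope and intercept for y ~ x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


def residualize(y, x):
    """Return the part of y that a straight-line fit on x does not explain."""
    slope, intercept = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# All numbers below are invented for illustration.
TRUE_DRUG_EFFECT = -2.0   # change in pain score caused by the drug
EXERCISE_EFFECT = -1.0    # change in pain score per hour of daily exercise

rng = random.Random(1)
drug, exercise, pain = [], [], []
for i in range(4000):
    treated = i % 2
    # Treated subjects happen to exercise about one hour more per day.
    hours = rng.gauss(3.0 + treated, 0.5)
    score = (10 + TRUE_DRUG_EFFECT * treated
             + EXERCISE_EFFECT * hours + rng.gauss(0, 1))
    drug.append(treated)
    exercise.append(hours)
    pain.append(score)

# Naive estimate: difference in mean pain, ignoring exercise.
treated_pain = [p for p, d in zip(pain, drug) if d == 1]
untreated_pain = [p for p, d in zip(pain, drug) if d == 0]
naive_effect = (sum(treated_pain) / len(treated_pain)
                - sum(untreated_pain) / len(untreated_pain))

# Adjusted estimate: strip the exercise-explained part from both variables.
adjusted_effect, _ = fit_line(residualize(drug, exercise),
                              residualize(pain, exercise))
print(f"naive = {naive_effect:.2f}, adjusted = {adjusted_effect:.2f}")
```

Because the treated group also exercises more, the naive comparison credits the drug with part of the benefit of exercise. The adjusted estimate lands close to the drug effect that was built into the simulation.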

What’s missing from these examples? They are all common examples of “scientific control”, yet none of them describes the main way we were using “control” in Lesson 1! Our missing lobotomy control was a separate group of subjects that could be used for comparison to test and eliminate alternative causes (like spontaneous remission) of an outcome from variables other than the independent variable.

Clearly, there is more to “control” than just control groups.

2.2 A new superfood?

Let’s imagine an example experiment. A research team is exploring the effects of a new superfood called the Rareberry. The Rareberry is reported to cause exceptional motor skill development in juvenile mammals. Does it? Let’s design an experiment to find out. 

Our question: Do Rareberries affect coordination development in mice? The intervention, or independent variable, is the consumption of Rareberries. There are a lot of ways to demonstrate a juvenile mammal has superior motor skills. The dependent variable selected for this study is coordination, which serves as a proxy for their overall motor skills. We hypothesize that juvenile mice fed Rareberries will develop greater coordination than the control group. 

Our plan: supplement one group of lab mice with Rareberries, then test their motor coordination on a rotarod—a standard assay where mice must maintain balance on a rotating rod.

If the Rareberry mice perform well, will that provide convincing evidence that the berries are effective? What if they don’t perform well? Will we be prepared to walk away confident that we have busted this myth? Rigorous controls ensure that research is able to make meaningful scientific claims about the world. 

2.3 Concerns about validity


In any experiment, we will have an independent variable - our intervention - and a dependent variable - our outcome. Inevitably, though, the world is chock full of other variables. Many of these are sources of concern for our experiment.

Our independent variable: Rareberries

Our dependent variable: time on a rotarod

As a general rule, variables of concern will arise from one of four sources.

  • Subject: age, sex, genetic strain, history
  • Environment: lighting, noise, time of testing
  • Measurement: who does the scoring, what tools are used, how data are processed prior to analysis
  • Intervention: vehicle solution, delivery method, handling, time

Variables of concern may vary from study to study. A study exploring effects of an intervention on participants’ smoking habits may be concerned with participants’ existing smoking habits. A study exploring the effects of an intervention on social dynamics within a group could be concerned about potential sources of environmental agitation or aggression. A study exploring the effects of an intervention with qualitatively assessed measures might be concerned about raters’ biases in favor of a specific result. A study exploring the effects of serine protease inhibitor neuroserpin on neuronal growth would now know that they need to be concerned about the delivery and sterilization methods used to introduce neuroserpin into the experimental environment.

A Closer Look: The Rareberry and cage height

Consider this scenario for our Rareberry study: mice housed on higher shelves in the rack have a better view of the room. They can watch other mice cavorting below them. They receive more light exposure and tend to be active for longer periods of time. Over time, these environmental differences enhance their motor coordination.

If our Rareberry-fed mice happen to be more often found living on the upper shelves, how might this affect our results?

Pause to consider: 

How might each of these categories (subject, environment, measurement, intervention) apply in our Rareberries experiment? Can you think of one variable that is likely to influence our outcome in each category?

Variables are concerning when we either know or suspect that they might affect our outcome. We have to be vigilant because, as we saw in the neuroserpin study, failing to suspect a variable doesn’t mean we shouldn’t be concerned about it! Depending on how they act and how they are distributed, impactful variables might exaggerate, diminish, change, or obscure our outcome of interest. If this happens, it reduces the accuracy of our experiment: we will be unable to trust that any result, or lack thereof, is specifically attributable to our intervention.

This kind of accuracy has a name: internal validity. Uncontrolled variables threaten the internal validity of our experiment because they have the power to make our conclusions about the subjects in our study wrong. 
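The cage-height scenario can be turned into a small simulation of a threat to internal validity. This is only a sketch with invented numbers: the baseline rotarod time, the size of the shelf effect, and the housing proportions are assumptions, and the Rareberry is given no true effect at all.

```python
import random
from statistics import mean

# Invented effect sizes, in seconds on the rotarod.
TRUE_BERRY_EFFECT = 0.0   # the berries do nothing in this toy world
SHELF_EFFECT = 20.0       # assumed benefit of living on an upper shelf


def run_study(p_upper_berry, p_upper_control, n_per_group=1000, seed=0):
    """Return mean rotarod time of the berry group minus the control group."""
    rng = random.Random(seed)

    def rotarod_time(eats_berries, on_upper_shelf):
        return (60.0
                + TRUE_BERRY_EFFECT * eats_berries
                + SHELF_EFFECT * on_upper_shelf
                + rng.gauss(0, 10))

    berry = [rotarod_time(True, rng.random() < p_upper_berry)
             for _ in range(n_per_group)]
    control = [rotarod_time(False, rng.random() < p_upper_control)
               for _ in range(n_per_group)]
    return mean(berry) - mean(control)


# Berry-fed mice mostly housed on upper shelves, controls mostly below.
confounded_difference = run_study(p_upper_berry=0.8, p_upper_control=0.2)
# Both groups equally likely to be housed on upper shelves.
balanced_difference = run_study(p_upper_berry=0.5, p_upper_control=0.5)
print(f"confounded: {confounded_difference:.1f} s, "
      f"balanced: {balanced_difference:.1f} s")
```

With uneven housing, the berry group appears to outperform the controls even though the berries have no effect. When shelf position is balanced across groups, the apparent benefit shrinks toward zero.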

Internal validity may be contrasted with external validity, which describes the extent to which our conclusions also apply to subjects or circumstances beyond our study. External validity is sometimes known as the “generalizability” of results. We’ll return to this later, when we discuss the frustrating trade-offs that come with some types of controls. 

At its heart, “control” in experimental design means managing variables that could distort our understanding of cause and effect. Apart from statistical errors, failed controls of one kind or another may be the single most common source of wrong answers in science.

2.4 Strategies for control


It can be helpful to organize our options for control into categories. Consider how you might apply each of these options for managing a variable:

1. Constrain

Sometimes we manage variables of concern by keeping them constant or narrowing the range of variability allowed in our study. For example, using only male mice of a specific age limits the impact that effects of sex or age could have on our outcomes.

2. Distribute

When a variable isn’t constrained in our study design, we often take steps to prevent it from affecting any one group disproportionately. Depending on the study design, randomization methods can ensure that key variables are evenly distributed. They may still add to the variability of our dataset overall, but we can at least be assured that their influence across groups should be similar.

3. Test

Sometimes, we want or need to know what kind of an impact a variable has on our outcome. Testing for specific impacts or outcomes helps us contextualize any effect of our independent variable by comparing it to minimum, maximum, or alternative outcomes to understand a difference. Sometimes this may mean creating a new “control group”; alternatively, we may record and test for the effect of a variable like sex to help us understand how and whether our findings apply to the general population.
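As one illustration of the Distribute strategy, the sketch below performs a simple stratified randomization: mice are grouped by sex and age, shuffled within each group, and dealt alternately into study arms. The function name `stratified_assign`, the arm labels, and the example mice are all hypothetical. Real studies often use dedicated allocation tools and more elaborate schemes.

```python
import random
from collections import Counter, defaultdict


def stratified_assign(subjects, stratum_of, arms=("rareberry", "control"), seed=0):
    """Randomly assign subjects to arms, balancing arms within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)

    assignment = {}
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)  # random order within the stratum
        # Random starting arm, so leftovers in odd-sized strata are not
        # always given to the same arm.
        offset = rng.randrange(len(arms))
        for position, subject in enumerate(members):
            assignment[subject["id"]] = arms[(position + offset) % len(arms)]
    return assignment


# Hypothetical cohort: 24 mice, both sexes, ages 4 to 6 weeks.
mice = [
    {"id": i, "sex": "F" if i % 2 else "M", "age_weeks": 4 + (i // 2) % 3}
    for i in range(24)
]
assignment = stratified_assign(mice, lambda m: (m["sex"], m["age_weeks"]))

# Count mice per (sex, age, arm) combination to check the balance.
balance = Counter((m["sex"], m["age_weeks"], assignment[m["id"]]) for m in mice)
for key in sorted(balance):
    print(key, balance[key])
```

Each sex and age combination ends up split evenly between the two arms, so neither variable can pile up in one group by chance, while the full range of both variables stays in the study.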

Activity: How do you think about control?

When we design an experiment, we apply control strategies to create the best experiment we can. In many cases, we may not be fully aware of why we are choosing one strategy over another, or have even considered that alternatives exist. 

In our next activity, you will be given a description of a familiar experiment with several “concerns.” Use your best intuition to sort these concerns into the strategy bin that you think best fits the approach you would take. While in some cases there may be common answers, there isn’t necessarily a best answer for any given variable. 

After sorting your concerns and explaining a bit about how you envision a strategy being applied to it, you will see how others chose to address each concern.

Post-activity questions:

In sorting concerns into different categories, you had a chance to think about the differences and relationships among our options for experimental control. Now, consider the following questions:

  • Why did you make the choices that you made? 
  • Were there some variables you always chose to distribute? To constrain?
  • Other variables that didn’t make sense to handle with anything but a control group?

The term “control” can mean many different things. This is not bad, but cultivating an awareness of the different uses of the term and understanding why each of these roles matters is key to doing more rigorous science.

Takeaways:

  • Control, used colloquially, refers to things as divergent as setting up a formal control group and applying a statistical correction. Control groups, controlling for variables, controlling group allocations, and controlling variable influence on results are all different things and it’s important to know which you mean when you use the word control. 
  • Controls are important because they are key determinants of the internal and external validity of an experiment: how the experimentalist knows that their result is accurate, and whether others can generalize the experiment’s results.
  • Broadly speaking, strategies for controlling concerning components of an experimental design follow three patterns: Constraining variables by keeping them constant or within a permitted range of low variability, distributing variables across subject allocations, and testing for the effects of variables, sometimes by creating control groups.

Reflection:

  • What kinds of factors would be important to you in making decisions about how to handle concerning variables in your own research?
  • Think of a choice you have had to make regarding control for your own research. How does your choice fit into our category bins? How would your research be different if you had chosen any of the other categories, as well or instead?