Instrumental Variables in Observational Studies

Research updated on November 2, 2025
Author: Santhosh Ramaraj

Instrumental variables (IVs) are powerful tools used to reduce the bias caused by unmeasured confounding in observational research. Originally developed for economic models, they are now common in health research and policy studies where randomized trials are not always possible. If you have ever wondered how we can estimate the real effect of a treatment when confounders are hidden, IV methods are worth learning.

What is an Instrumental Variable?

An instrumental variable is a factor that influences the treatment and affects the outcome only through the treatment itself. Imagine you want to study the effect of a new drug on patient recovery. If both treatment and recovery are influenced by patient health (an unobserved factor), a direct comparison may be biased. Here, an IV serves as a sort of “pseudo-randomizer” that can help isolate the treatment effect.

To be valid, an IV must satisfy three critical conditions:

Relevance: The IV must be strongly correlated with the treatment.
Mathematically, $\text{Cov}(Z, T) \neq 0$ ,
where $Z$ is the instrumental variable and $T$ is the treatment.
Exclusion Restriction: The IV must influence the outcome $Y$ only through $T$. There should be no direct path from $Z$ to $Y$, so $Z$ is uncorrelated with the error term $\epsilon_Y$ of the outcome model:
$\text{Cov}(Z, \epsilon_Y) = 0$
Independence: The IV should be unrelated to both observed and unobserved confounders that affect $Y$ .
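
The three conditions can be illustrated on simulated data, where the unmeasured confounder is known by construction. The sketch below is only an illustration under stated assumptions: the confounder `U`, all coefficients, and the sample size are hypothetical values chosen for the demonstration. With real data only relevance can be checked this way, because `U` and $\epsilon_Y$ are never observed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process (all coefficients are illustrative).
U = rng.normal(size=n)                       # unmeasured confounder
Z = rng.normal(size=n)                       # instrument, drawn independently of U
T = 0.8 * Z + U + rng.normal(size=n)         # treatment depends on Z and U
Y = 2.0 * T + 1.5 * U + rng.normal(size=n)   # Z affects Y only through T

# Outcome error term; known here only because we simulated Y ourselves.
eps_Y = Y - 2.0 * T

cov_ZT = np.cov(Z, T)[0, 1]       # relevance: clearly non-zero
cov_Ze = np.cov(Z, eps_Y)[0, 1]   # exclusion restriction: close to zero
cov_ZU = np.cov(Z, U)[0, 1]       # independence: close to zero

print(f"Cov(Z, T)     = {cov_ZT:.3f}")
print(f"Cov(Z, eps_Y) = {cov_Ze:.3f}")
print(f"Cov(Z, U)     = {cov_ZU:.3f}")
```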

Examples of Instrumental Variables

In healthcare, suitable IVs often arise from naturally occurring events or system-level factors that act like random assignments. For instance:

Geographic variation: Patients in different regions may have varying access to specialists. Distance to a specialty hospital can act as an IV for treatment assignment.
Physician preferences: Some doctors prefer a particular treatment approach, regardless of patient characteristics. This preference can serve as an instrument.
Policy changes: A regulatory update or innovation can alter prescribing behavior, creating an opportunity for quasi-random variation in treatment.

A real example is found in research by Wright et al. (2016), where geographic differences in the performance of lymphadenectomy (a surgical procedure) were used as an IV to study patient survival rates.

Two-Stage Least Squares (2SLS): The Standard Approach

The most widely used estimation technique for IV analysis is two-stage least squares (2SLS). It works in two main steps:

Stage 1: Predict the treatment $T$ using the IV and other covariates $X$ :
$T = \alpha_0 + \alpha_1 X + \phi Z + \epsilon_T$

Stage 2: Use the predicted treatment $\hat{T}$ from stage 1 in the outcome model:
$Y = \beta_0 + \beta_1 X + \gamma \hat{T} + \epsilon_Y$

Here:

$T$ is the treatment, and $\hat{T}$ is its predicted value from stage 1,
$Y$ is the outcome variable (e.g., patient recovery score),
$X$ is a vector of observed covariates like age or diagnosis,
$Z$ is the instrumental variable,
$\epsilon_T$ and $\epsilon_Y$ are error terms.

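In vector notation, with $Z$ the instrument and $X$ a vector of covariates, the underlying structural model can be written as:
$T = \alpha_0 + \alpha^t X + \phi Z + \epsilon_T$
$Y = \beta_0 + \beta^t X + \gamma T + \epsilon_Y$
where $^t$ denotes the transpose, so $\alpha^t X$ and $\beta^t X$ are inner products of a coefficient vector with the covariate vector. 2SLS estimates $\gamma$ by replacing $T$ in the second equation with $\hat{T}$ from stage 1.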
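
The two stages can be sketched with ordinary least squares in NumPy. This is a minimal illustration rather than a production implementation: the helper name `two_stage_least_squares`, the simulated data, and every coefficient are assumptions made for the example. It returns only the point estimate of $\gamma$. Standard errors taken from a hand-run second stage are not valid, so a dedicated IV routine is preferable for inference.

```python
import numpy as np

def two_stage_least_squares(Y, T, Z, X):
    """Return the 2SLS point estimate of the treatment coefficient gamma."""
    ones = np.ones(len(Y))

    # Stage 1: regress T on an intercept, the covariates X and the instrument Z.
    W1 = np.column_stack([ones, X, Z])
    alpha, *_ = np.linalg.lstsq(W1, T, rcond=None)
    T_hat = W1 @ alpha

    # Stage 2: regress Y on an intercept, X and the predicted treatment T_hat.
    W2 = np.column_stack([ones, X, T_hat])
    beta, *_ = np.linalg.lstsq(W2, Y, rcond=None)
    return beta[-1]   # coefficient on T_hat

# Simulated data with a known effect gamma = 2.0 (illustrative values only).
rng = np.random.default_rng(2)
n = 20_000
X = rng.normal(size=(n, 2))                  # observed covariates
U = rng.normal(size=n)                       # unmeasured confounder
Z = rng.normal(size=n)                       # instrument
T = 0.5 + X @ np.array([0.3, -0.2]) + 0.8 * Z + U + rng.normal(size=n)
Y = 1.0 + X @ np.array([0.7, 0.4]) + 2.0 * T + 1.5 * U + rng.normal(size=n)

gamma_hat = two_stage_least_squares(Y, T, Z, X)
print(f"2SLS estimate of gamma: {gamma_hat:.3f}")
```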

Why Use IVs Instead of Regular Regression?

A standard regression gives an unbiased treatment effect only if there is no unmeasured confounding, that is, every factor that affects both treatment and outcome is included in the model. In many real-world studies, this is unrealistic. For example, if patients with severe illness are more likely to receive aggressive treatment and severity is not fully measured, simple regression might overestimate or underestimate treatment effects. By using an IV that is independent of illness severity but linked to treatment, you can estimate a causal effect.
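
A small simulation makes the contrast concrete. The sketch below assumes a hypothetical setting in which unmeasured severity raises treatment intensity and worsens the outcome. The true effect, the coefficients, and the variable names are invented for illustration. It compares the ordinary regression slope with the simple ratio IV estimator $\text{Cov}(Z, Y) / \text{Cov}(Z, T)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
true_effect = 2.0   # assumed for the simulation

severity = rng.normal(size=n)   # unmeasured illness severity
Z = rng.normal(size=n)          # instrument, independent of severity

# Sicker patients receive more treatment; severity also worsens the outcome.
T = 0.8 * Z + severity + rng.normal(size=n)
Y = true_effect * T - 1.5 * severity + rng.normal(size=n)

def cov(a, b):
    """Sample covariance between two 1-D arrays."""
    return np.cov(a, b)[0, 1]

ols_estimate = cov(T, Y) / np.var(T, ddof=1)   # slope of Y on T, severity omitted
iv_estimate = cov(Z, Y) / cov(Z, T)            # ratio IV estimator

print(f"True effect:  {true_effect:.2f}")
print(f"OLS estimate: {ols_estimate:.2f}")   # biased by the omitted severity
print(f"IV estimate:  {iv_estimate:.2f}")    # close to the true effect
```

In this particular setup the regression underestimates the effect. Flipping the sign of the severity term in the outcome would make it overestimate instead.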

How to Identify a Good IV?

Finding a valid IV is often the hardest part. There is no universal test to confirm whether a variable is a perfect instrument. Researchers typically rely on:

Domain expertise: Knowing the healthcare system, physician behaviors, or policy environment can suggest potential instruments.
Statistical checks: You can test if the IV is correlated with treatment (relevance). Weak instruments, where this correlation is close to zero, can lead to misleading results.
Falsification tests: Assess whether the IV still predicts the outcome after controlling for treatment. If it does, that is evidence against the exclusion restriction. Such checks can cast doubt on an instrument but cannot prove it valid.

Testing for Relevance

The strength of the IV can be assessed with an F-test on the instrument's coefficient in the first-stage regression. As a rule of thumb, a first-stage F-statistic below 10 indicates a weak instrument, which can bias the 2SLS estimate.

For example, suppose you regress treatment $T$ on the IV $Z$ and covariates $X$:
$T = \alpha_0 + \alpha_1 X + \phi Z + \epsilon_T$
If $\phi$ is close to zero, the relationship between $Z$ and $T$ is weak and the 2SLS estimates become unreliable.
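
The first-stage F-statistic for a single instrument can be computed by comparing the first-stage regression with and without $Z$. The sketch below assumes one instrument and homoskedastic errors. The helper `first_stage_f` and the simulated strong and weak instruments are hypothetical and exist only for illustration.

```python
import numpy as np

def first_stage_f(T, Z, X):
    """F-statistic for H0: the coefficient on the single instrument Z is zero."""
    n = len(T)
    ones = np.ones(n)
    restricted = np.column_stack([ones, X])   # first stage without Z
    full = np.column_stack([ones, X, Z])      # first stage with Z

    def rss(design):
        # Residual sum of squares from an OLS fit of T on the design matrix.
        coef, *_ = np.linalg.lstsq(design, T, rcond=None)
        resid = T - design @ coef
        return resid @ resid

    rss_r, rss_u = rss(restricted), rss(full)
    df_resid = n - full.shape[1]
    # One restriction is tested, so the numerator degrees of freedom equal 1.
    return (rss_r - rss_u) / (rss_u / df_resid)

rng = np.random.default_rng(3)
n = 2_000
X = rng.normal(size=n)
Z = rng.normal(size=n)
noise = rng.normal(scale=np.sqrt(2), size=n)

T_strong = 0.3 * X + 0.80 * Z + noise   # instrument strongly moves treatment
T_weak = 0.3 * X + 0.02 * Z + noise     # instrument barely moves treatment

f_strong = first_stage_f(T_strong, Z, X)
f_weak = first_stage_f(T_weak, Z, X)
print(f"F (strong instrument): {f_strong:.1f}")
print(f"F (weak instrument):   {f_weak:.1f}")
```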

Practical Healthcare Example

Imagine we want to study whether a new diabetes drug improves patient HbA1c levels (a measure of blood sugar control). Patients who receive the drug are often sicker or have tried other treatments, so simple comparisons are biased. We might use distance to a clinic that frequently prescribes this drug as an IV. This distance influences whether a patient receives the drug and is assumed to affect HbA1c only through treatment.

Stage 1: Use distance (IV) to predict treatment assignment.
Stage 2: Use predicted treatment values to estimate the drug’s effect on HbA1c.
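
The two stages of this example can be sketched on simulated patients. Everything numeric here is an assumption made for illustration: the 0 to 50 km distance range, the effect of $-1.0$ HbA1c points, the severity model, and the variable names. No real patient data is involved, and the simulation builds in the assumption that distance affects HbA1c only through treatment.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
true_effect = -1.0   # assumed change in HbA1c caused by the drug

distance_km = rng.uniform(0, 50, size=n)   # instrument: distance to the clinic
severity = rng.normal(size=n)              # unmeasured disease severity

# Patients who live closer, and sicker patients, are more likely to get the drug.
latent = 1.0 - 0.05 * distance_km + severity + rng.normal(size=n)
treated = (latent > 0).astype(float)

# Severity raises HbA1c; the drug lowers it by true_effect.
hba1c = 8.0 + 0.6 * severity + true_effect * treated + rng.normal(scale=0.5, size=n)

# Naive comparison of treated and untreated patients (confounded by severity).
naive = hba1c[treated == 1].mean() - hba1c[treated == 0].mean()

# Stage 1: use distance (IV) to predict treatment assignment.
stage1 = np.column_stack([np.ones(n), distance_km])
a, *_ = np.linalg.lstsq(stage1, treated, rcond=None)
treated_hat = stage1 @ a

# Stage 2: use predicted treatment values to estimate the drug's effect on HbA1c.
stage2 = np.column_stack([np.ones(n), treated_hat])
b, *_ = np.linalg.lstsq(stage2, hba1c, rcond=None)
iv_estimate = b[1]

print(f"Naive difference in means: {naive:.2f}")
print(f"IV (2SLS) estimate:        {iv_estimate:.2f}")
```

Because treated patients are sicker in this simulation, the naive comparison understates the drug's benefit, while the IV estimate lands near the assumed effect.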

Challenges and Limitations

While IV analysis is powerful, it comes with challenges:

Assumptions are not directly testable. We cannot always verify that the IV has no direct effect on the outcome.
Weak instruments produce large errors. If $\text{Cov}(Z, T)$ is very small, the 2SLS estimate becomes unstable.
Interpretation can be tricky. The estimated effect is often a local average treatment effect (LATE), meaning it applies only to the subgroup affected by the instrument.
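
Why a small $\text{Cov}(Z, T)$ destabilizes the estimate can be seen in the simplest case of one instrument and no covariates, a simplification of the 2SLS setup described earlier. In that case the IV estimator reduces to a ratio of sample covariances, and substituting the outcome model $Y = \beta_0 + \gamma T + \epsilon_Y$ gives:

$$\hat{\gamma}_{IV} = \frac{\widehat{\text{Cov}}(Z, Y)}{\widehat{\text{Cov}}(Z, T)} = \gamma + \frac{\widehat{\text{Cov}}(Z, \epsilon_Y)}{\widehat{\text{Cov}}(Z, T)}$$

The second term is sampling noise, or any small violation of the exclusion restriction, divided by the strength of the instrument. A denominator near zero magnifies that term, which is why weak instruments yield unstable estimates.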

Instrumental variables are like hidden keys that unlock causal effects in non-randomized studies. When chosen carefully and validated properly, they provide a reliable way to address unmeasured confounding. For students and beginners, remember that the magic lies not in complex formulas but in finding a clever variable $Z$ that mimics the role of randomization.
