We often compare an outcome between two groups – for instance, disease incidence in an exposed group vs. an unexposed group. The effect can be summarized by measures like the risk ratio, rate ratio, or odds ratio. In the previous article we looked at the log-likelihood. Let’s focus on a rate ratio example to see how likelihood helps estimate these effects and their uncertainty.
Scenario
A cohort study in Guatemala investigates acute lower respiratory infections in young children. The researchers suspect that housing conditions affect infection rates. They categorize children under 5 into two groups:
- Exposed group: Children living in poor housing conditions (e.g. overcrowded, poor ventilation).
- Unexposed group: Children living in good housing conditions.
Over one year, they follow both groups and count how many lower respiratory infections occur.
The results:
- Poor housing group: 33 infections observed over 355 child-years of follow-up.
- Good housing group: 24 infections observed over 518 child-years of follow-up.
From these data, the incidence rate in each group can be calculated as cases divided by person-time:
$$\hat\lambda_1 = \frac{33}{355} \approx 0.093 \quad \text{(infections per child-year for the exposed)}$$

$$\hat\lambda_0 = \frac{24}{518} \approx 0.046 \quad \text{(infections per child-year for the unexposed)}$$

The rate ratio (RR) comparing poor to good housing is:

$$\hat\theta = \frac{33/355}{24/518} = \frac{0.0930}{0.0463} \approx 2.01$$
So children in poor housing had about double the rate of respiratory infection as those in good housing.
This 2.01 is our point estimate (MLE) for the rate ratio.
Now, we will confirm that using a likelihood approach and see how to get a confidence interval.
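The arithmetic above is easy to check directly. A minimal sketch in Python (variable names are my own):

```python
# Observed data: infections and child-years of follow-up in each group
d_exposed, t_exposed = 33, 355      # poor housing
d_unexposed, t_unexposed = 24, 518  # good housing

rate_exposed = d_exposed / t_exposed        # ~0.093 infections per child-year
rate_unexposed = d_unexposed / t_unexposed  # ~0.046 infections per child-year
rate_ratio = rate_exposed / rate_unexposed

print(f"Rate ratio: {rate_ratio:.2f}")  # → Rate ratio: 2.01
```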
Likelihood for Rate Ratio
To approach this with likelihoods, we need a probability model for the data.
Counts of infections over time often follow a Poisson distribution (especially when the counts are relatively low and the events are independent).
For two groups, we can think in terms of a Poisson regression or simply two Poisson likelihoods.
We have two parameters of interest:
- The baseline rate $\lambda_0$ for the unexposed (good housing) group.
- The rate ratio $\theta$. The rate in the exposed group will be $\lambda_1 = \theta \lambda_0$.
Using properties of the Poisson distribution, the likelihood of the observed data (33 events in group 1 over $T_1 = 355$ child-years, and 24 events in group 0 over $T_0 = 518$ child-years) can be written as:

$$L(\lambda_0, \theta) \propto (\theta \lambda_0 T_1)^{33}\, e^{-\theta \lambda_0 T_1} \times (\lambda_0 T_0)^{24}\, e^{-\lambda_0 T_0}$$
The log-likelihood $\ell(\lambda_0, \theta) = \log L(\lambda_0, \theta)$ is:

$$\ell(\lambda_0, \theta) = 33 \log(\theta \lambda_0 T_1) - \theta \lambda_0 T_1 + 24 \log(\lambda_0 T_0) - \lambda_0 T_0 + \text{const}$$

We can separate parts involving $\lambda_0$ and $\theta$:

$$\ell(\lambda_0, \theta) = 57 \log \lambda_0 + 33 \log \theta - \lambda_0 (\theta T_1 + T_0) + \text{const}$$
Maximizing with respect to $\lambda_0$, the solution is:

$$\hat\lambda_0(\theta) = \frac{57}{\theta T_1 + T_0}$$

Plugging $\hat\lambda_0(\theta)$ back gives the profile log-likelihood for $\theta$:

$$\ell_p(\theta) = 33 \log \theta - 57 \log(\theta T_1 + T_0) + \text{const}$$

The maximum occurs at $\hat\theta = \frac{33/355}{24/518} \approx 2.01$, confirming the MLE for the rate ratio.
So the MLEs match the intuitive estimates: $\hat\lambda_0 = 24/518 \approx 0.046$ and $\hat\theta \approx 2.01$.
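The profile log-likelihood can be checked numerically. A short sketch under the Poisson model above, using a crude grid search rather than any optimization library:

```python
import math

d1, t1 = 33, 355  # exposed: events, child-years
d0, t0 = 24, 518  # unexposed: events, child-years

def profile_loglik(theta):
    """Profile log-likelihood for the rate ratio theta,
    with the baseline rate profiled out (additive constants dropped)."""
    return d1 * math.log(theta) - (d1 + d0) * math.log(theta * t1 + t0)

# Crude grid search for the maximizer over theta in [1.0, 4.0]
grid = [i / 1000 for i in range(1000, 4001)]
theta_hat = max(grid, key=profile_loglik)
print(f"MLE of the rate ratio: {theta_hat:.2f}")  # → 2.01
```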
Log-Scale and Confidence Interval for the Rate Ratio
For ratios, it’s often convenient to switch to the log scale.
Let $\beta = \log \theta$, so $\theta = e^{\beta}$. Near the peak, the log-likelihood in $\beta$ is approximately quadratic. The standard error for $\hat\beta$ can be obtained from:

$$\mathrm{SE}(\hat\beta) = \sqrt{\frac{1}{D_1} + \frac{1}{D_0}}$$

where $D_1 = 33$, $D_0 = 24$. So:

$$\mathrm{SE}(\hat\beta) = \sqrt{\frac{1}{33} + \frac{1}{24}} \approx 0.268$$
Then:

$$\hat\beta = \log(2.01) \approx 0.70$$

- 95% CI for $\beta$: $0.70 \pm 1.96 \times 0.268 = (0.17, 1.22)$
- Back-transform: $e^{0.17} \approx 1.19$, $e^{1.22} \approx 3.39$
Thus, the 95% confidence interval for the rate ratio is approximately 1.19 to 3.39.
Since this interval does not include 1, it suggests a statistically significant difference.
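The whole Wald calculation above fits in a few lines. A sketch:

```python
import math

d1, d0 = 33, 24                # event counts in each group
rr = (33 / 355) / (24 / 518)   # point estimate of the rate ratio

beta = math.log(rr)                  # log rate ratio, ~0.70
se = math.sqrt(1 / d1 + 1 / d0)      # standard error on the log scale, ~0.268
lo, hi = beta - 1.96 * se, beta + 1.96 * se  # 95% CI on the log scale

print(f"95% CI for the rate ratio: {math.exp(lo):.2f} to {math.exp(hi):.2f}")
# → 95% CI for the rate ratio: 1.19 to 3.39
```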
Why Likelihood is Useful Here
We used the likelihood approach conceptually to derive these results. In practice, one might plug the data into Poisson regression software, but under the hood it’s doing the same thing – finding the $\lambda_0$ and $\theta$ that maximize the likelihood, and then using the curvature of the log-likelihood at the peak to get confidence intervals or p-values.
This example shows that:
- The MLEs for multi-parameter models are often just the intuitive estimates (observed values).
- Likelihood provides a unified way to get both the estimate and its uncertainty.
- Using the log scale for ratios gives symmetrical confidence intervals on that log scale, which correspond to asymmetric intervals on the original scale.
In epidemiological papers, you’ll often see something like “Rate ratio = 2.01, 95% CI [1.19–3.39], p = 0.009”. All of that information can be traced back to a likelihood-based calculation (or an equivalent large-sample approximation).
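The p-value in such a report comes from the same quantities as the confidence interval. A sketch using only the standard library, with the two-sided normal tail computed via `math.erfc`:

```python
import math

d1, d0 = 33, 24
beta = math.log((33 / 355) / (24 / 518))  # log rate ratio
se = math.sqrt(1 / d1 + 1 / d0)           # SE on the log scale

z = beta / se                    # Wald z-statistic, ~2.60
p = math.erfc(z / math.sqrt(2))  # two-sided p-value under the normal approximation
print(f"z = {z:.2f}, p = {p:.3f}")  # → z = 2.60, p = 0.009
```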
So, likelihood methods seamlessly extend from one-sample problems (like estimating one probability) to comparing two groups (estimating ratios, differences, etc.), and further to many groups or more complex regression models. The core idea remains: find the parameter values that best explain the data, and then determine which other values are reasonably compatible with the data by seeing how much the likelihood drops off when you move away from the best fit.
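That last idea can be made concrete: trace where the profile log-likelihood drops 1.92 units (half the 3.84 chi-squared cutoff for 95%) below its maximum. A sketch under the same Poisson model, using bisection; the resulting likelihood-based interval is close to, but not identical to, the Wald interval:

```python
import math

d1, t1, d0, t0 = 33, 355, 24, 518

def profile_loglik(theta):
    # Profile log-likelihood for the rate ratio (additive constants dropped)
    return d1 * math.log(theta) - (d1 + d0) * math.log(theta * t1 + t0)

theta_hat = (d1 / t1) / (d0 / t0)
cutoff = profile_loglik(theta_hat) - 1.92  # 95% likelihood-ratio cutoff

def crossing(lo, hi):
    # Bisect for the theta where the profile log-likelihood crosses the cutoff;
    # the endpoints must straddle the crossing.
    for _ in range(100):
        mid = (lo + hi) / 2
        if (profile_loglik(mid) - cutoff) * (profile_loglik(lo) - cutoff) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

lower = crossing(0.5, theta_hat)   # below the MLE
upper = crossing(theta_hat, 10.0)  # above the MLE
print(f"Likelihood interval: {lower:.2f} to {upper:.2f}")
# compare with the Wald interval 1.19 to 3.39
```

The lower limits agree closely; the upper limits differ slightly because the likelihood interval does not force symmetry on the log scale.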