Sensitivity of logistic regression prediction on coefficients

The output of a logistic regression model is a function that predicts the probability of an event as a function of the input parameter. This post will only look at a simple logistic regression model with one predictor, but similar analysis applies to multiple regression with several predictors.

p(x) = \frac{1}{1 + \exp(-a - bx)}

Here’s a plot of such a curve when a = 3 and b = 4.
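For readers who want numbers rather than a picture, here is a minimal Python sketch that tabulates the curve for a = 3 and b = 4. It assumes the convention p(x) = 1/(1 + exp(-(a + bx))), which is the one the derivative formulas in this post are written in. The function name `logistic_p` is made up for illustration.

```python
import math

def logistic_p(x, a=3.0, b=4.0):
    """Predicted probability p(x) = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

# Tabulate the curve at a few points around x = -a/b = -0.75,
# where the probability crosses 1/2.
for x in [-2.0, -1.0, -0.75, -0.5, 0.0, 1.0]:
    print(f"x = {x:6.2f}   p(x) = {logistic_p(x):.4f}")
```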

Flattest part

The curvature of the logistic curve is small at both extremes. As x comes in from negative infinity, the curvature increases, then decreases to zero, then increases again, then decreases as x goes to positive infinity. We quantified this statement in another post where we calculated the curvature. The curvature is zero at the point where the second derivative of p

p''(x) = -\frac{b^2 \exp(a + bx)\left(\exp(a + bx) - 1\right)}{(1 + \exp(a + bx))^3}

is zero, which occurs when x = –a/b. At that point p = 1/2, so the curve is flattest, in the sense of having zero curvature, where the probability crosses 1/2. In the graph above, this happens at x = -0.75.

A little calculation shows that the slope at the flattest part of the logistic curve is simply b/4, since p'(x) = b exp(a + bx)/(1 + exp(a + bx))² and exp(a + bx) = 1 at that point.
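As a sanity check on the slope at x = –a/b, the sketch below estimates p'(x) with a central difference and compares it to b/4, the value you get by substituting exp(a + bx) = 1 into p'(x) = b exp(a + bx)/(1 + exp(a + bx))². The helper names and the step size h are arbitrary choices for illustration, not anything prescribed by the analysis.

```python
import math

def logistic_p(x, a, b):
    """p(x) = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def slope(x, a, b, h=1e-5):
    """Central-difference estimate of dp/dx at x."""
    return (logistic_p(x + h, a, b) - logistic_p(x - h, a, b)) / (2 * h)

a, b = 3.0, 4.0
x_mid = -a / b                      # where p crosses 1/2
slope_mid = slope(x_mid, a, b)

print("numerical slope at x = -a/b:", slope_mid)
print("b / 4:                      ", b / 4)

# The slope a little to either side of x_mid is smaller.
print("slope at x_mid - 0.2:", slope(x_mid - 0.2, a, b))
print("slope at x_mid + 0.2:", slope(x_mid + 0.2, a, b))
```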

Sensitivity to parameters

Now how much does the probability prediction p(x) change as the parameter a changes? We now need to consider p as a function of three variables, i.e. we need to consider a and b as additional variables. The marginal change in p in response to a change in a is the partial derivative of p with respect to a:

\frac{\partial p}{\partial a} = \frac{\exp(a + bx)}{(1 + \exp(a + bx))^2}

To know where this is maximized with respect to x, we take the partial derivative of the above expression with respect to x

\frac{\partial^2 p}{\partial x\, \partial a} = -\frac{b(\exp(a + bx) - 1) \exp(a + bx)}{(1 + \exp(a + bx))^3}

which is zero when x = –a/b, the same place where the logistic curve is flattest. And the partial of p with respect to a at that point is simply 1/4, independent of b. So a small change Δa results in a change of approximately Δa/4 at the flattest part of the logistic curve and results in less change elsewhere.
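The 1/4 figure is easy to confirm numerically. The sketch below evaluates the closed form ∂p/∂a = exp(a + bx)/(1 + exp(a + bx))² at x = –a/b for a few arbitrarily chosen values of b, and then compares an actual change in p against Δa/4. The function names and the test values are illustrative assumptions.

```python
import math

def logistic_p(x, a, b):
    """p(x) = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def dp_da(x, a, b):
    """Closed form of the partial derivative of p with respect to a."""
    e = math.exp(a + b * x)
    return e / (1.0 + e) ** 2

a = 3.0
for b in [0.5, 4.0, -2.0]:
    x_mid = -a / b
    # The value at x_mid is 1/4 whatever b is; nearby points give less.
    print(b, dp_da(x_mid, a, b), dp_da(x_mid + 0.3, a, b))

# A small change in a moves p at x_mid by roughly delta_a / 4.
b = 4.0
x_mid = -a / b
delta_a = 0.01
change = logistic_p(x_mid, a + delta_a, b) - logistic_p(x_mid, a, b)
print("actual change:", change, "  delta_a / 4:", delta_a / 4)
```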

What about the dependence on b? That’s more complicated. The rate of change of p with respect to b is

\frac{\partial p}{\partial b} = \frac{\exp(a + bx) x }{(1 + \exp(a + bx))^2}

and this is maximized where

\frac{\partial^2 p}{\partial x \partial b} = 0

which in turn requires solving a nonlinear equation. This is easy to do numerically in a specific case, but not easy to work with analytically in general.
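To illustrate the numerical route for the specific case a = 3 and b = 4, here is a sketch using plain bisection. Differentiating ∂p/∂b with respect to x gives p(1 – p)(1 + bx(1 – 2p)), and the code looks for zeros of that expression. The brackets (0, 1) and (–2, –0.75) were chosen by inspecting the sign of the function for these particular parameters, so they are assumptions that would need to be revisited for other values of a and b. The helper names are made up for this example.

```python
import math

def dp_db(x, a, b):
    """Partial derivative of p with respect to b."""
    s = 1.0 / (1.0 + math.exp(-(a + b * x)))   # s = p(x)
    return x * s * (1.0 - s)

def d2p_dxdb(x, a, b):
    """Mixed partial: d/dx of dp_db, i.e. p(1-p)(1 + b*x*(1 - 2p))."""
    s = 1.0 / (1.0 + math.exp(-(a + b * x)))
    return s * (1.0 - s) * (1.0 + b * x * (1.0 - 2.0 * s))

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

a, b = 3.0, 4.0
f = lambda x: d2p_dxdb(x, a, b)

x_pos = bisect(f, 0.0, 1.0)      # stationary point with x > 0
x_neg = bisect(f, -2.0, -0.75)   # stationary point with x < -a/b

print("x > 0 stationary point:   ", x_pos, " dp/db =", dp_db(x_pos, a, b))
print("x < -a/b stationary point:", x_neg, " dp/db =", dp_db(x_neg, a, b))
```

For these parameters ∂p/∂b is positive only for x > 0, so the first root is where it attains its maximum. The second root is where ∂p/∂b is most negative, and its magnitude there is considerably larger, which matters if what you care about is the size of the change rather than its sign.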

However, we can easily say how p changes with b near the point x = –a/b. This is not where the partial of p with respect to b is maximized, but it’s a place of interest because it has come up twice above. At that point the derivative of p with respect to b is x/4 = –a/(4b). So if a and b have the same sign, then a small increase in b will result in a small decrease in p and vice versa.
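A finite-difference check of that value, again for a = 3 and b = 4: the sketch holds x fixed at –a/b, nudges b up and down by a small step h (an arbitrary choice), and compares the resulting rate of change with –a/(4b).

```python
import math

def logistic_p(x, a, b):
    """p(x) = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

a, b = 3.0, 4.0
x_mid = -a / b          # held fixed while b is perturbed
h = 1e-6

# Central difference in b at the fixed point x_mid.
fd = (logistic_p(x_mid, a, b + h) - logistic_p(x_mid, a, b - h)) / (2 * h)
exact = -a / (4 * b)

print("finite difference:", fd)
print("-a / (4b):        ", exact)
```

Here a and b are both positive, so the derivative is negative: increasing b slightly lowers the predicted probability at this x.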
