The logistic distribution looks very much like a normal distribution. Here’s a plot of the density for a logistic distribution.
This suggests we could approximate a logistic distribution by a normal distribution. (Or the other way around: sometimes it would be handy to approximate a normal distribution by a logistic. Always invert.)
But which normal distribution approximates a logistic distribution? That is, how should we pick the variance of the normal distribution?
The logistic distribution is most easily described by its distribution function (CDF):
F(x) = exp(x) / (1 + exp(x)).
To find the density (PDF), just differentiate the expression above. You can add a location and scale parameter, but we’ll keep it simple and assume the location (mean) is 0 and the scale is 1. Since the logistic distribution is symmetric about 0, we set the mean of the normal to 0 so it is also symmetric about 0. But how should we pick the scale (standard deviation) σ for the normal?
The most natural thing to do, or at least the easiest thing to do, is to match moments: pick the variance of the normal so that both distributions have the same variance. This says we should use a normal with variance σ2 = π2/3 or σ = π/√3 . How well does that work? The following graph gives the answer. The logistic density is given by the solid blue line and the normal density is given by the dashed orange line.
Not bad, but we could do better. We could search for the value of σ that minimizes the difference between the two densities. The minimum occurs around σ = 1.6. Here is the revised graph using that value.
The maximum difference is almost three times smaller when we use σ = 1.6 rather than σ = π/√ 3 ≈ 1.8.
What if we want to minimize the difference between the distribution (CDF) functions rather than the density (PDF) functions? It turns out we end up at about the same spot: set σ to approximately 1.6. The two optimization problems don’t have exactly the same solution, but the two solutions are close.
The maximum difference between the distribution function of a logistic and the distribution of a normal with σ = 1.6 is about 0.017. If we used moment matching and set σ = π/√3, the maximum difference would be about 0.022. So moment matching does a better job of approximating the CDFs than approximating the PDFs. But we don’t need to decide between the two criteria since setting σ = 1.6 approximately minimizes both measures of the approximation.