Suppose you have a system with n possible states. The entropy of the system is maximized when all states are equally likely to occur. The entropy is minimized when one outcome is certain to occur.
You can say more. Starting from any set of probabilities, as you move in the direction of more uniformity, you increase entropy. And as you move in the direction of less uniformity, you decrease entropy.
These statements can be quantified and stated more precisely. That’s what the rest of this post will do.
***
Let p_i be the probability of the ith state and let p be the vector of the p_i.
Then the entropy of p is defined as

H(p) = -\sum_{i=1}^n p_i \log_2 p_i
If one of the p's is 1 and the rest are zero, then H(p) = 0. (In the definition of entropy, 0 log2 0 is taken to be 0. You could justify this as the limit of x log2 x as x goes to zero.)
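To make the definition concrete, here is a minimal Python sketch. The function name entropy and the example distributions are my own choices; the only subtlety is honoring the convention that 0 log2 0 = 0.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, with the convention 0 * log2(0) = 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0                      # drop zero-probability states
    return -np.sum(p[nonzero] * np.log2(p[nonzero]))

print(entropy([1, 0, 0, 0]))   # 0.0 -- one certain outcome, minimum entropy
print(entropy([0.25] * 4))     # 2.0 -- uniform over 4 states, log2(4)
```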
If all of the p_i are equal, p_i = 1/n, then H(p) = log2 n. The fact that this is the maximum entropy, and that compromises between the two extremes always decrease entropy, comes from the fact that the entropy function H is concave (proof). That is, if p_1 is one list of probabilities and p_2 another, then

H((1 - \lambda) p_1 + \lambda p_2) \geq (1 - \lambda) H(p_1) + \lambda H(p_2)

for λ between 0 and 1.
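Here is a small numerical check of the concavity inequality, using the same kind of entropy function as above. The two distributions p1 and p2 are arbitrary examples I chose for illustration.

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    return -np.sum(p[p > 0] * np.log2(p[p > 0]))  # 0 log2 0 treated as 0

p1 = np.array([0.7, 0.2, 0.1])   # arbitrary example distribution
p2 = np.array([0.1, 0.3, 0.6])   # another arbitrary example

for lam in [0.0, 0.25, 0.5, 0.75, 1.0]:
    mix = (1 - lam) * p1 + lam * p2                     # point between p1 and p2
    lhs = entropy(mix)                                  # H((1-λ)p1 + λp2)
    rhs = (1 - lam) * entropy(p1) + lam * entropy(p2)   # chord value
    print(f"λ = {lam:.2f}: H(mix) = {lhs:.4f} >= chord = {rhs:.4f}")
```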
When we speak informally of moving from p_1 in the direction of p_2, we mean we increase the parameter λ from 0 to some positive amount no more than 1.
Because entropy is concave, there are no local maxima other than the global maximum. As you approach the location of global maximum entropy, i.e. equal state probabilities, from any direction, entropy increases monotonically.
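To illustrate the monotone increase, the sketch below (again with arbitrary example numbers) starts at a distribution p and steps toward the uniform distribution, printing the entropy at each step.

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    return -np.sum(p[p > 0] * np.log2(p[p > 0]))  # 0 log2 0 treated as 0

p = np.array([0.7, 0.2, 0.1])   # arbitrary starting distribution
u = np.full(3, 1/3)             # uniform distribution, the entropy maximizer

for lam in np.linspace(0, 1, 6):
    q = (1 - lam) * p + lam * u          # move from p toward uniformity
    print(f"λ = {lam:.1f}: H = {entropy(q):.4f}")

# Entropy climbs monotonically from H(p) ≈ 1.157 to log2(3) ≈ 1.585.
```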