Modal and temporal logic for computer security


In the previous post, I mentioned that modal logic has a lot of interpretations and a lot of axiom systems. It can also have a lot of operators. This post will look at Security Logic, a modal logic for security applications based on a seminal paper by Glasgow et al [1]. It illustrates how modal and temporal logic can be applied to computer security, and it illustrates how a logic system can have a large number of operators and axioms.

Knowledge axioms

Security Logic starts with operators Ki that extend the box operator □. For a proposition p, Ki p is interpreted as saying that the ith agent knows p to be true. Glasgow et al chose the following axioms for inference regarding Ki.

Axiom K1:   Ki φ → φ
Axiom K2:   Ki(φ → ψ) → (Ki φ → Ki ψ)
Axiom K3:   Ki φ → Ki Ki φ
Axiom K4:   ¬ Ki φ → Ki( ¬ Ki φ)

These can be interpreted as follows.

If you know something to be true, it’s true.
If you know one thing implies another, then knowing the first lets you know the other.
If you know something, you know that you know it.
If you don’t know something, you know that you don’t know it.

This is not to claim that these propositions are universal metaphysical truths, only modeling assumptions. It may be possible, for example, to know something without knowing that you know it, but that’s a subtle matter excluded from this model.
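One way to see that these axioms hang together is to exhibit a model satisfying them. The following sketch is my own illustration, not part of the paper's presentation: it uses standard Kripke semantics for S5, where an agent's knowledge is modeled by an equivalence relation on possible worlds, and Ki p holds at a world exactly when p holds at every world the agent cannot distinguish from it. On a hypothetical three-world model, all four axioms check out for every proposition.

```python
from itertools import product

worlds = {0, 1, 2}
# Equivalence classes {0, 1} and {2}: the agent can't tell worlds 0 and 1 apart.
indist = {0: {0, 1}, 1: {0, 1}, 2: {2}}

def K(p):
    """Knowledge operator: K p holds at w iff p holds at every world
    indistinguishable from w."""
    return lambda w: all(p(v) for v in indist[w])

def Not(p):
    return lambda w: not p(w)

def Implies(p, q):
    return lambda w: (not p(w)) or q(w)

def valid(p):
    """A formula is valid in the model if it holds at every world."""
    return all(p(w) for w in worlds)

# With three worlds there are 2**3 = 8 propositions (one per subset of worlds).
props = [lambda w, tv=tv: tv[w] for tv in product([False, True], repeat=3)]

for p in props:
    for q in props:
        assert valid(Implies(K(p), p))                      # Axiom K1
        assert valid(Implies(K(Implies(p, q)),
                             Implies(K(p), K(q))))          # Axiom K2
        assert valid(Implies(K(p), K(K(p))))                # Axiom K3
        assert valid(Implies(Not(K(p)), K(Not(K(p)))))      # Axiom K4
print("All four knowledge axioms hold in this model.")
```

The choice of an equivalence relation is what makes K3 and K4 come out true; drop symmetry, for instance, and K4 can fail.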

Temporal logic axioms

Temporal logic is modal logic with operators interpreted as applying to propositions over time. Glasgow et al introduce three temporal operators: ∀□, ∀◇, and  ∃□. Note that each of these is a single operator; there is no box or diamond operator in Security Logic.

∀□p means that p always holds, ∀◇p means that p eventually holds, and ∃□p means that p sometimes holds.

The ∃□ and ∀□ operators are dual in the same sense that the basic modal operators □ and ◇ are dual:

∃□p ↔︎ ¬ ∀□ ¬p
∀□p ↔︎ ¬ ∃□ ¬p

Security Logic does not give axioms for ∃□ because its behavior is determined by the axioms for ∀□.

The three temporal logic axioms for Security Logic are as follows.

Axiom T1:  φ → ∀◇φ
Axiom T2:  ∀□(φ → ψ) → (∀□φ → ∀□ψ)
Axiom T3:  ∀□φ → ∀□ ∀□φ

These can be interpreted as follows.

If a proposition holds, it eventually holds.
If it's always the case that one proposition implies another, then if the former always holds, the latter always holds.
If something always holds, then it always always holds.
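The intended readings can be made concrete with a small sketch of my own, evaluating formulas over a single finite trace of states. This linear-time toy model is simpler than the paper's semantics, and on a single trace the ∀◇ and ∃□ readings coincide, but the duality and the axioms T1 through T3 can still be checked mechanically.

```python
trace = ["a", "b", "b", "a", "c"]   # a hypothetical finite sequence of states
T = range(len(trace))

def always(p):       # ∀□p at time t: p holds at every time u >= t
    return lambda t: all(p(u) for u in T if u >= t)

def eventually(p):   # ∀◇p at time t: p holds at some time u >= t
    return lambda t: any(p(u) for u in T if u >= t)

def sometimes(p):    # ∃□p, defined through the duality ∃□p ↔ ¬∀□¬p
    return lambda t: not always(lambda u: not p(u))(t)

p = lambda t: trace[t] == "b"
q = lambda t: trace[t] != "a"

for t in T:
    # Duality: ∀□p ↔ ¬∃□¬p
    assert always(p)(t) == (not sometimes(lambda u: not p(u))(t))
    # Axiom T1: φ → ∀◇φ
    assert (not p(t)) or eventually(p)(t)
    # Axiom T2: ∀□(φ → ψ) → (∀□φ → ∀□ψ)
    impl = lambda u: (not p(u)) or q(u)
    assert (not always(impl)(t)) or (not always(p)(t)) or always(q)(t)
    # Axiom T3: ∀□φ → ∀□∀□φ
    assert (not always(p)(t)) or always(always(p))(t)
print("Duality and axioms T1-T3 hold on this trace.")
```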

Obligation and Permission

Finally, Security Logic introduces modal operators O and P for obligation and permission.

Obligation and permission are dual, just as necessity and possibility are dual and the always and sometimes operators are dual.

Op ↔︎ ¬ P ¬p
Pp ↔︎ ¬ O ¬p

When obligation and permission are combined with knowledge, Security Logic introduces the following abbreviations.

Oi = OKi
Pi = PKi

Axioms for Security Logic

The complete axioms for Security Logic are the four knowledge axioms above, the three temporal axioms above, and the five additional axioms below.

Axiom SL1:   Pi φ for all propositional tautologies φ.
Axiom SL2:  Pi φ → φ
Axiom SL3:  (Pi φ ∧ Pi ψ) ↔︎ Pi (φ ∧ ψ)
Axiom SL4:  Pi φ → Pi (φ ∨ ψ)
Axiom SL5:  Oi φ → Pi φ

These can be interpreted as follows.

You are permitted to know all tautologies.
Whatever is permitted must be true.
Permission holds through conjunction and disjunction.
Whatever is obligatory must be permissible.


[1] Janice Glasgow, Glenn MacEwen, and Prakash Panangaden. A Logic for Reasoning About Security. ACM Transactions on Computer Systems, Vol 10 No 3, August 1992, pp 226–264.

Passwords and power laws

According to this paper [1], the empirical distribution of real passwords follows a power law [2]. In the authors’ terms, a Zipf-like distribution. The frequency of the rth most common password is proportional to something like 1/r. More precisely,

f_r = C r^{-s}

where s is on the order of 1. The value of s that best fit the data depended on the set of passwords, but their estimates of s varied from 0.46 to 0.91.

This means that the most common passwords are very common and easy to guess.

Size of password spaces

If passwords come from an alphabet of size A and have length n, then there are A^n possibilities. For example, if a password has length 10 and consists of uppercase and lowercase English letters and digits, there are

62^{10} = 839,299,365,868,340,224

possible such passwords. If users chose passwords randomly from this set, brute force password attacks would be impractical. But brute force attacks are practical because passwords are not chosen anywhere close to uniformly from this large space of possibilities.
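The arithmetic is easy to check, since Python integers have arbitrary precision:

```python
import string

alphabet = string.ascii_letters + string.digits  # 52 letters + 10 digits
A = len(alphabet)   # 62
n = 10              # password length

print(A ** n)       # 839299365868340224
```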

Attackers do not randomly try passwords. They start with the most common passwords and work their way down the list. In other words, attackers use Pareto’s rule.

Rules requiring, say, one upper case letter, don’t help much because most users will respond by using exactly one upper case letter, probably the first letter. If passwords must have one special character, most people will use exactly one special character, most likely at the end of the word. Expanding the alphabet size A exponentially increases the possible passwords, but does little to increase the actual number of passwords.

What’s interesting about the power law distribution is that there’s not a dichotomy between naive and sophisticated users. If there were, there would be a lot of common passwords, and all the rest uniformly distributed. Instead, there’s a continuum between the most naive and most sophisticated. That means a lot of people are not exactly naive, but not as secure as they think they are.

Some math

Under the Zipf model [3], the number of times we’d expect to see the most common password is NC where N is the size of the data set. The constant C is what it has to be for the frequencies to sum to 1. That is, C depends on the number of data points N and the exponent s and is given by

C_{N, s} = \frac{1}{\sum_{r=1}^N r^{-s}}

We can bound the sum in the denominator from above and below with integrals, as in the integral test for series convergence. This gives us a way to see more easily how C depends on its parameters.

1 + \int_1^N x^{-s} \, dx > \frac{1}{C} = \sum_{r=1}^N r^{-s} > 1 + \int_2^{N+1} x^{-s}\, dx

When s = 1 this reduces to

1 + \log(N) > \frac{1}{C} > 1 + \log(N+1) - \log(2)

and otherwise to

1 + \frac{N^{1-s} - 1}{1-s} > \frac{1}{C} > 1 + \frac{(N+1)^{1-s} - 2^{1-s}}{1-s}

Suppose you have N = 1,000,000 passwords. The range of s values found by Wang et al varied from roughly 0.5 to 0.9. Let's first set s = 0.5. Then C is roughly 0.0005. This means the most common password appears about 500 times.

If we keep N the same and set s to 0.9, then C is roughly 0.033, and so the most common password would appear about 33,000 times.
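These estimates can be checked by summing the series for 1/C directly instead of bounding it with integrals. A quick sketch; the counts are only as good as the pure Zipf assumption:

```python
def C(N, s):
    """Zipf normalizing constant: 1/C is the sum of r**-s for r = 1..N."""
    return 1.0 / sum(r ** -s for r in range(1, N + 1))

N = 1_000_000
print(N * C(N, 0.5))   # about 500: expected count of the most common password
print(N * C(N, 0.9))   # about 33,000
```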

How to choose passwords

If you need to come up with a password, randomly generated passwords are best, but hard to remember. So people either use weak but memorable passwords, or use strong passwords and don’t try to remember them. The latter varies in sophistication from password management software down to Post-it notes stuck on a monitor.

One compromise is to concatenate a few randomly chosen words. Something like “thebestoftimes” would be weak because it consists of consecutive words from a famous novel. Something like “orangemarbleplungersoap” would be better.
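Here's a sketch of generating such a passphrase with Python's secrets module, which is designed for cryptographic randomness. The eight-word list below is a stand-in for a real dictionary of thousands of words, so the security here is illustrative only.

```python
import secrets

# Stand-in word list; a real implementation would load thousands of words,
# e.g. from a dictionary file.
words = ["orange", "marble", "plunger", "soap", "river",
         "cloud", "pencil", "anchor"]

passphrase = "".join(secrets.choice(words) for _ in range(4))
print(passphrase)

# With W words and k independent draws there are W**k possibilities:
# 8000 words and 4 draws give 8000**4, about 4 * 10**15 passphrases.
```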

Another compromise, one that takes more effort than most people are willing to expend, is to use Manuel Blum’s mental hash function.


[1] In case the link rots in the future, the paper is “Zipf’s Law in Passwords” by Ding Wang, Haibo Cheng, Ping Wang, Xinyi Huang, and Gaopeng Jian. IEEE Transactions on Information Forensics and Security, vol. 12, no. 11, November 2017.

[2] Nothing follows a power law distribution exactly and everywhere. But that’s OK: nothing exactly follows any other distribution everywhere either: not Gaussian, not exponential, etc. But a lot of things, like user passwords, approximately follow a power law, especially over the middle of their range. Power laws like Zipf’s law tend not to fit as well at the beginning and the end.

[3] Here I’m using a pure Zipf model for simplicity. The paper [1] used a Zipf-like model that I’m not using here. Related to the footnote [2] above, it’s often necessary to use a minor variation on the pure Zipf model to get a better fit.

The Tangled Web

The Tangled Web (ISBN 1593273886) is a security book that you may find interesting even if you’re not interested in security. The first half of the book is an excellent explanation of how Web technologies work in theory and especially in practice. This material is included in order to discuss security implications, but it’s interesting on its own. The second half of the book is directly devoted to Web security.

The author, Michal Zalewski, has a colorful writing style. His book is serious and loaded with technical detail, but that doesn’t stop him from turning a nice phrase here and there.

Here’s an excerpt from The Tangled Web that I particularly liked, one that explains why security concerns on the Web differ from previous security concerns.

In the traditional model followed by virtually all personal computers … there are very clear boundaries between high-level data objects (documents), user-level code (applications), and the operating system kernel … These boundaries are well studied and useful for building practical security schemes. A file opened in your text editor is unlikely to be able to steal your email …

In the browser world, this separation is practically nonexistent: Documents and code live as parts of the same intermingled blobs of HTML, isolation between completely unrelated applications is partial at best …

In the end, the seemingly unlikely scenario of a text file stealing your email is, in fact, a frustratingly common pattern on the Web.