Curry-Howard-Lambek correspondence

Curry-Howard-Lambek is a set of correspondences between logic, programming, and category theory. You may have heard of the slogan proofs-as-programs or propositions-as-types. These refer to the Curry-Howard correspondence between mathematical proofs and programs. Lambek’s name is appended to the Curry-Howard correspondence to represent connections to category theory.
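As a minimal sketch of the correspondence (my example, not from any canonical source): read a type as a proposition and a program of that type as a proof of it. An implication A → B becomes a function type, and modus ponens becomes function application.

    from typing import Callable, TypeVar

    A = TypeVar("A")
    B = TypeVar("B")

    # Read Callable[[A], B] as the proposition "A implies B".
    # A value of that type is a proof; applying it is modus ponens.
    def modus_ponens(proof_of_a: A, a_implies_b: Callable[[A], B]) -> B:
        return a_implies_b(proof_of_a)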

The term Curry-Howard isomorphism is often used but is an overstatement. Logic and programming are not isomorphic, nor is either isomorphic to category theory.

You’ll also hear the term computational trinitarianism. That may be appropriate because logic, programming, and category theory share common attributes without being identical.

The relations between logic, programming, and categories are more than analogies, but less than isomorphisms.

There are formal correspondences between parts of each theory. And aside from these rigorous connections, there is also a heuristic that something interesting in one area is likely to have an interesting counterpart in the other areas. In that sense Curry-Howard-Lambek is similar to the Langlands program relating number theory and geometry, a set of theorems and conjectures as well as a sort of philosophy.

Logic and applications Twitter account

I stopped posting to the @FormalFact Twitter account last July, but I didn’t deactivate the account. Now I’m going to restart it.

Unlike my other Twitter accounts, I don’t plan to have a regular posting schedule. I may not post often. We’ll see how it goes.


I’ve changed the account name from @FormalFact to @LogicPractice. The “formal” part of the original name referred to formal theorem proving, the initial focus of the account. The new name reflects a focus on logic more generally, and practical applications of logic that are less laborious than formal theorem proving.

Modal and temporal logic for computer security


In the previous post, I mentioned that modal logic has a lot of interpretations and a lot of axiom systems. It can also have a lot of operators. This post will look at Security Logic, a modal logic for security applications based on a seminal paper by Glasgow et al [1]. It illustrates how modal and temporal logic can be applied to computer security, and it illustrates how a logic system can have a large number of operators and axioms.

Knowledge axioms

Security Logic starts with operators Ki that extend the box operator □. For a proposition p, Ki p is interpreted as saying that the ith agent knows p to be true. Glasgow et al chose the following axioms for inference regarding Ki.

Axiom K1:   Ki φ → φ
Axiom K2:   Ki(φ → ψ) → (Ki φ → Ki ψ)
Axiom K3:   Ki φ → Ki Ki φ
Axiom K4:   ¬ Ki φ → Ki( ¬ Ki φ)

These can be interpreted as follows.

If you know something to be true, it’s true.
If you know one thing implies another, then knowing the first lets you know the other.
If you know something, you know that you know it.
If you don’t know something, you know that you don’t know it.

This is not to claim that these propositions are universal metaphysical truths; they are only modeling assumptions. It may be possible, for example, to know something without knowing that you know it, but that is a subtlety excluded from this model.
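These axioms can be spot-checked mechanically in the standard Kripke semantics for S5, where an agent's knowledge is modeled by an equivalence relation on possible worlds. (That semantics is my assumption here; the sketch below is not taken from Glasgow et al.)

    from itertools import product

    worlds = [0, 1, 2]
    # the agent cannot distinguish worlds 0 and 1; world 2 stands alone
    classes = [{0, 1}, {2}]

    def related(w, v):
        return any(w in c and v in c for c in classes)

    def K(p):
        """K p holds at w iff p holds at every world indistinguishable from w."""
        return lambda w: all(p(v) for v in worlds if related(w, v))

    # spot-check axioms K1, K3, K4 for every valuation of one proposition
    for bits in product([False, True], repeat=len(worlds)):
        p = lambda w, bits=bits: bits[w]
        for w in worlds:
            assert (not K(p)(w)) or p(w)                   # K1: K p -> p
            assert (not K(p)(w)) or K(K(p))(w)             # K3: K p -> K K p
            assert K(p)(w) or K(lambda v: not K(p)(v))(w)  # K4: not K p -> K not K p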

Temporal logic axioms

Temporal logic is modal logic with operators interpreted as applying to propositions over time. Glasgow et al introduce three temporal operators: ∀□, ∀◇, and  ∃□. Note that each of these is a single operator; there is no box or diamond operator in Security Logic.

∀□p means that p always holds, ∀◇p means that p eventually holds, and ∃□p means that p sometimes holds.

The ∃□ and ∀□ operators are dual in the same sense that the basic modal operators □ and ◇ are dual:

∃□p ↔︎ ¬ ∀□ ¬p
∀□p ↔︎ ¬ ∃□ ¬p
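The duality is easy to see concretely. As a minimal sketch, collapse the path quantifiers and read the operators over a single finite trace of states (my simplification; the paper's semantics is richer):

    def always(p, trace):      # ∀□p: p holds at every point of the trace
        return all(p(s) for s in trace)

    def sometimes(p, trace):   # ∃□p: p holds at some point of the trace
        return any(p(s) for s in trace)

    trace = [0, 1, 2, 3]
    p = lambda s: s % 2 == 0
    assert sometimes(p, trace) == (not always(lambda s: not p(s), trace))
    assert always(p, trace) == (not sometimes(lambda s: not p(s), trace))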

Security Logic does not give axioms for ∃□ because its behavior is determined by the axioms for ∀□.

The three temporal logic axioms for Security Logic are as follows.

Axiom T1:  φ → ∀◇φ
Axiom T2:  ∀□(φ → ψ) → (∀□φ → ∀□ψ)
Axiom T3:  ∀□φ → ∀□ ∀□φ

These can be interpreted as follows.

If a proposition holds, it eventually holds.
If it's always the case that one proposition implies another, then if the former always holds, the latter always holds.
If something always holds, then it always always holds.

Obligation and Permission

Finally, Security Logic introduces modal operators O and P for obligation and permission.

Obligation and permission are dual, just as necessity and possibility are dual and the always and sometimes operators are dual.

Op ↔︎ ¬ P ¬p
Pp ↔︎ ¬ O ¬p

When obligation and permission are combined with knowledge, Security Logic introduces the following abbreviations.

Oi = OKi
Pi = PKi

Axioms for Security Logic

The complete axioms for Security Logic are the four knowledge axioms above, the three temporal axioms above, and the five additional axioms below.

Axiom SL1:   Pi φ for all propositional tautologies φ.
Axiom SL2:  Pi φ → φ
Axiom SL3:  (Pi φ ∧ Pi ψ) ↔︎ Pi (φ ∧ ψ)
Axiom SL4:  Pi φ → Pi (φ ∨ ψ)
Axiom SL5:  Oi φ → Pi φ

These can be interpreted as follows.

You are permitted to know all tautologies.
Whatever is permitted must be true.
Permission holds through conjunction and disjunction.
Whatever is obligatory must be permissible.


[1] Janice Glasgow, Glenn MacEwen, and Prakash Panangaden. A Logic for Reasoning About Security. ACM Transactions on Computer Systems, Vol 10 No 3, August 1992, pp 226–264.

Typesetting modal logic

Modal logic extends propositional logic with two new operators, □ (“box”) and ◇ (“diamond”). There are many interpretations of these two symbols, the most common being necessity and possibility respectively. That is, □p means the proposition p is necessary, and ◇p means that p is possible. Another interpretation is using the symbols to represent things a person knows to be true and things that may be true as far as that person knows.

There are also many axiom systems for inference concerning these operators. For example, some axiom systems include the rule

\Box p \rightarrow \Box \Box p

and some do not. If you interpret □ as saying a proposition is provable, this axiom says whatever is provable is provably provable, which makes sense. But if you take □ to be a statement about what an agent knows, you may not want to say that if an agent knows something, it knows that it knows it.

See the next post for an example of applying logic to security, a logic with lots of modal operators and axioms. But for now, we’ll focus on how to typeset the box and diamond operators.

LaTeX

In LaTeX, the most obvious commands would be \box and \diamond, but that doesn’t work. There is no \box command, though there is a \square command. And although there is a \diamond command, it produces a symbol much smaller than \square and so the two look odd together. The two operators are dual in the sense that

\begin{align*}
\Box p &= \neg \Diamond \neg p \\
\Diamond p &= \neg \Box \neg p
\end{align*}

and so they should have symbols of similar size. A better approach is to use \Box and \Diamond. Those were used in the displayed equations above. These symbols are in the amsfonts package.
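Here is a minimal compilable sketch (my example, not from the original discussion) showing the two commands in use:

    \documentclass{article}
    \usepackage{amsfonts}  % provides \Box and \Diamond
    \begin{document}
    Necessity and possibility are dual:
    \[ \Box p = \neg \Diamond \neg p \]
    \end{document}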

Unicode

There are many box-like and diamond-like symbols in Unicode. It seems reasonable to use U+25A1 for box and U+25C7 for diamond. I don’t know of any more semantically appropriate characters. There are no Unicode characters with “modal” in their name, for example.
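If it helps to see the suggested characters produced by codepoint, here is a quick Python check (my aside, not part of the original discussion):

    # Print the suggested characters by their Unicode escapes
    print("\u25A1")   # U+25A1 WHITE SQUARE
    print("\u25C7")   # U+25C7 WHITE DIAMOND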

HTML

You can always insert Unicode characters into HTML by using &#x, followed by the hexadecimal value of the codepoint, followed by a semicolon. For example, I typed &#x25A1; and &#x25C7; to enter the box and diamond symbols above.

If you want to stick to HTML entities because they're easier to remember, you're mostly out of luck. There is no HTML entity for the box operator. There is an entity &loz; for "lozenge," the typographical term for a diamond. This entity corresponds to U+25CA, which is smaller than the U+25C7 recommended above. As discussed in the context of LaTeX, you want the box and diamond operators to have a similar size.


Automated theorem proving

When I first heard of automated theorem proving, I imagined computers being programmed to search for mathematical theorems interesting to a wide audience. Maybe that’s what a few of the pioneers in the area had in mind too, but that’s not how things developed.

The biggest uses for automated theorem proving have been highly specialized applications, not mathematically interesting theorems. Computer chip manufacturers use formal methods to verify that given certain inputs their chips produce certain outputs. Compiler writers use formal methods to verify that their software does the right thing. A theorem saying your product behaves correctly is very valuable to you and your customers, but to nobody else. These aren't the kinds of theorems that anyone would cite the way they might cite the Pythagorean theorem. Nobody would ever say "And therefore, by the theorem showing that this particular pacemaker will not fall into certain error modes, I now prove this result unrelated to pacemakers."

Automated theorem provers are important in these highly specialized applications in part because the results are of such limited interest. For every theorem of wide mathematical interest, there are a large number of mathematicians who are searching for a proof or who are willing to scrutinize a proposed proof. A theorem saying that a piece of electronics performs correctly appeals to only the tiniest audience, and yet is probably much easier (for a computer) to prove.

The term “automated theorem proving” is overloaded to mean a couple things. It’s used broadly to include any use of computing in proving theorems, and it’s used more narrowly to mean software that searches for proofs or even new theorems. Most theorem provers in the broad sense are not automated theorem provers in the more narrow sense but rather proof assistants. They verify proofs rather than discover them. (There’s some gray zone. They may search on a small scale, looking for a way to prove a minor narrow result, but not search for the entire proof to a big theorem.) There have been computer-verified proofs of important mathematical theorems, such as the Feit-Thompson theorem from group theory, but I’m not aware of any generally interesting discoveries that have come out of a theorem prover.

Related post: Formal methods let you explore the corners

Beta reduction: The difference typing makes

Beta reduction is essentially function application. If you have a function described by what it does to x and apply it to an argument t, you rewrite the xs as ts. The formal definition of β-reduction is more complicated than this in order to account for free versus bound variables, but this informal description is sufficient for this blog post. We will first show that β-reduction holds some surprises, then explain how these surprises go away when you add typing.

Suppose you have an expression (λx.x + 2)y, which means apply to y the function that takes its argument and adds 2. Then β-reduction rewrites this expression as y + 2. In a more complicated expression, you might be able to apply β-reduction several times. When you do apply β-reduction several times, does the process always stop? And if you apply β-reduction to parts of an expression in a different order, can you get different results?

Failure to normalize

You might reasonably expect that if you apply β-reduction enough times you eventually get an expression you can’t reduce any further. Au contraire!

Consider the expression (λx.xx) (λx.xx). Beta reduction says to apply the first term to the second, replacing each x in the body xx with the argument (λx.xx). But when we do that, we get the original expression (λx.xx) (λx.xx) back. Beta reduction gets stuck in an infinite loop.

Next consider the expression L = (λx.xxy) (λx.xxy). Applying β-reduction the first time gives (λx.xxy) (λx.xxy) y, or Ly. Applying β-reduction again yields Lyy. Beta "reduction" doesn't reduce the expression at all but makes it bigger.
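To make this concrete, here is a minimal Python sketch of β-reduction (my implementation; substitution is naive, with no capture avoidance, which is enough for the closed examples in this post but not for arbitrary terms):

    # Terms are tuples: ("var", name), ("lam", name, body), ("app", f, a).

    def subst(term, name, value):
        """Replace occurrences of `name` in `term` with `value`."""
        kind = term[0]
        if kind == "var":
            return value if term[1] == name else term
        if kind == "lam":
            # an inner binder of the same name shadows `name`; stop there
            return term if term[1] == name else ("lam", term[1], subst(term[2], name, value))
        return ("app", subst(term[1], name, value), subst(term[2], name, value))

    def step(term):
        """Do one leftmost-outermost beta step; return None at normal form."""
        if term[0] == "app":
            f, a = term[1], term[2]
            if f[0] == "lam":                      # a redex: (λx.body) a
                return subst(f[2], f[1], a)
            for i in (1, 2):                       # otherwise reduce a subterm
                s = step(term[i])
                if s is not None:
                    return ("app", s, a) if i == 1 else ("app", f, s)
        elif term[0] == "lam":
            s = step(term[2])
            if s is not None:
                return ("lam", term[1], s)
        return None

    # (λx.xx)(λx.xx) steps to itself: reduction never terminates.
    w = ("lam", "x", ("app", ("var", "x"), ("var", "x")))
    assert step(("app", w, w)) == ("app", w, w)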

The technical term for what we’ve seen is that β-reduction is not normalizing. A rewrite system is strongly normalizing if applying the rules in any order eventually terminates. It’s weakly normalizing if there’s at least some sequence of applying the rules that terminates. Beta reduction is neither strongly nor weakly normalizing in the context of (untyped) lambda calculus.

Types to the rescue

In simply typed lambda calculus, we assign types to every variable, and functions have to take the right type of argument. This additional structure prevents examples such as those above that fail to normalize. If x is a function that takes an argument of type A and returns a result of type B, then you can't apply x to itself, because x takes something of type A, not something of type function from A to B. Not only does this rule out specific examples like those above, one can prove it rules out all possible examples that would prevent β-reduction from terminating.

To summarize, β-reduction is not normalizing, not even weakly, in the context of untyped lambda calculus, but it is strongly normalizing in the context of simply typed lambda calculus.

Confluence

Although β-reduction is not normalizing for untyped lambda calculus, the Church-Rosser theorem says it is confluent. That is, if an expression P can be transformed by β-reduction two different ways into expressions M and N, then there is an expression T such that both M and N can be reduced to T. This implies that if β-reduction does terminate, then it terminates in a unique expression (up to α-conversion, i.e. renaming bound variables). Simply typed lambda calculus is confluent as well, and so in that context we can say β-reduction always terminates in a unique expression (again up to α-conversion).
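Using the subst function from the sketch above, we can watch confluence on a small example: contracting either redex of (λx.x)((λy.y) z) first leads to the same normal form.

    Ix = ("lam", "x", ("var", "x"))   # λx.x
    Iy = ("lam", "y", ("var", "y"))   # λy.y
    z = ("var", "z")

    # outer redex first: (λx.x)((λy.y) z) -> (λy.y) z -> z
    m = subst(Ix[2], "x", ("app", Iy, z))
    m = subst(Iy[2], "y", z)

    # inner redex first: (λx.x)((λy.y) z) -> (λx.x) z -> z
    n = ("app", Ix, subst(Iy[2], "y", z))
    n = subst(Ix[2], "x", n[2])

    assert m == n == z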

Proofs and programs

Here’s an interesting quote comparing writing proofs and writing programs:

Building proofs and programs are very similar activities, but there is one important difference: when looking for a proof it is often enough to find one, however complex it is. On the other hand, not all programs satisfying a specification are alike: even if the eventual result is the same, efficient programs must be preferred. The idea that the details of proofs do not matter—usually called proof irrelevance—clearly justifies letting the computer search for proofs …

The quote is saying “Any proof will do, but among programs that do the same job, more efficient programs are better.” And this is certainly true, depending on what you want from a proof.

Proofs serve two main purposes: to establish that a proposition is true, and to show why it is true. If you're only interested in the existence of a proof, then yes, any proof will do. But proofs have a sort of pedagogical efficiency just as programs have an execution efficiency. Given two proofs of the same proposition, the more enlightening proof is better by that criterion.

I assume the authors of the above quote would not disagree because they say it is often enough to find one, implying that for some purposes simpler proofs are better. And existence does come first: a complex proof that exists is better than an elegant proof that does not exist! Once you have a proof you may want to look for a simpler proof. Or not, depending on your purposes. Maybe you have an elegant proof that you’re convinced must be essentially true, even if you doubt some details. In that case, you might want to complement it with a machine-verifiable proof, no matter how complex.

Source: Interactive Theorem Proving and Program Development. Coq'Art: The Calculus of Inductive Constructions


Big Logic

As systems get larger and more complex, we need new tools to test whether these systems are correctly specified and implemented. These tools may not be new per se, but they may be applied with new urgency.

Dimensional analysis is a well-established method of error detection. Simply checking that you’re not doing something like adding things that aren’t meaningful to add is surprisingly useful for detecting errors. For example, if you’re computing an area, does your result have units of area? This seems like a ridiculously basic question to ask, and yet it is more effective than it would seem.
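As a toy illustration (mine, not a particular library), one can track dimensions alongside values and refuse meaningless additions:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Quantity:
        value: float
        length_dim: int  # exponent on length: 1 = length, 2 = area

        def __add__(self, other):
            if self.length_dim != other.length_dim:
                raise ValueError("adding quantities of different dimensions")
            return Quantity(self.value + other.value, self.length_dim)

        def __mul__(self, other):
            return Quantity(self.value * other.value, self.length_dim + other.length_dim)

    width, height = Quantity(3.0, 1), Quantity(4.0, 1)
    area = width * height   # length_dim == 2, as an area should have
    # area + width          # would raise: area plus length is meaningless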

Type theory and category theory are extensions of the idea of dimensional analysis. Simply asking of a function “Exactly what kind of thing does it take in? And what sort of thing does it put out?” is a powerful way to find errors. And in practice it may be hard to get answers to these questions. When you’re working with a system that wasn’t designed to make such things explicit and clear, it may be that nobody knows for certain, or people believe they know but have incompatible ideas. Type theory and category theory can do much more than nudge you to define functions well, but even that much is a good start.

Specifying types is more powerful than it seems. With dependent types, you can incorporate so much information in the type system that showing that an instance of a type exists is equivalent to proving a theorem. This is a consequence of the Curry-Howard correspondence (“Propositions as types”) and is the basis for proof assistants like Coq.
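For a tiny illustration (in Lean rather than Coq, and my example): the type below encodes the proposition that conjunction commutes, and exhibiting a term of that type proves it.

    -- An inhabitant of this type is a proof that conjunction commutes.
    example : ∀ (p q : Prop), p ∧ q → q ∧ p :=
      fun p q h => ⟨h.2, h.1⟩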

I’d like to suggest the term Big Logic for the application of logic on a large scale, using logic to prove properties of systems that are too complex for a typical human mind to thoroughly understand. It’s well known that systems have gotten larger, more connected, and more complex. It’s not as well known how much the tools of Big Logic have improved, making formal verification practical for more projects than it used to be.

Primitive recursive functions and enumerable sets

The set of primitive recursive (PR) functions is the smallest set of functions of several integer arguments satisfying five axioms:

  1. Constant functions are PR.
  2. The function that picks the ith element of a list of n arguments is PR.
  3. The successor function S(n) = n+1 is PR.
  4. PR functions are closed under composition.
  5. PR functions are closed under primitive recursion.

The last axiom obviously gives PR functions their name. So what is primitive recursion? Given a PR function f that takes k arguments, and another PR function g that takes k+2 arguments, the primitive recursion of f and g is a function h of k+1 arguments satisfying two properties:

  1. h(0, x1, …, xk) = f(x1, …, xk)
  2. h(S(y), x1, …, xk) = g(y, h(y, x1, …, xk), x1, …, xk)
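Here is the scheme transcribed directly into Python (a sketch; the function names are mine):

    # h = primitive_recursion(f, g) satisfies the two equations above.
    def primitive_recursion(f, g):
        def h(y, *xs):
            if y == 0:
                return f(*xs)
            return g(y - 1, h(y - 1, *xs), *xs)
        return h

    # Addition via primitive recursion: add(0, x) = x, add(S(y), x) = S(add(y, x)).
    add = primitive_recursion(lambda x: x, lambda y, prev, x: prev + 1)
    assert add(3, 4) == 7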

Not every computable function is primitive recursive. For example, Ackermann’s function is a general recursive function, but not a primitive recursive function. General recursive functions are Turing complete. Turing machines, lambda calculus, and general recursive functions are all equivalent models of computation, but the first two are better known.
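For reference, here is Ackermann's function in Python (the standard three-case definition, easy to compute for small arguments even though it escapes the primitive recursion scheme):

    def ack(m, n):
        """Ackermann's function: total and computable, but not primitive recursive."""
        if m == 0:
            return n + 1
        if n == 0:
            return ack(m - 1, 1)
        return ack(m - 1, ack(m, n - 1))

    assert ack(2, 3) == 9   # A(2, n) = 2n + 3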

For this post, the main thing about general recursive functions is that, as the name implies, they are more general than primitive recursive functions.

Now we switch from functions to sets. The characteristic function of a set A is the function that is 1 for elements of A and zero everywhere else. In other areas of math, there is a sort of duality between functions and sets via characteristic functions. For example, the indicator function of a measurable set is a measurable function, and the indicator function of a convex set is a convex function. But with recursive functions, there's an unexpected wrinkle in this analogy.

A set of integers is recursively enumerable if it is either empty or the image of a general recursive function. But there’s a theorem, due to Alonzo Church, that a non-empty recursively enumerable set is actually the image of a primitive recursive function. So although general recursive functions are more general, you can’t tell that from looking at function images. For example, although the Ackermann function is not primitive recursive, there is a primitive recursive function with the same image.