Evidence producing type-checker for simply typed λ-calculus


This was an experiment in building an interpreter with a few goals:

  1. Investigate the relationship between untyped and bi-directional well-typed syntax.
  2. Use well-typed syntax for evaluation.
  3. Use the Partiality monad to define a big-step interpreter.
  4. Define a notion of partial type-soundness.


A further goal was an attempt to illustrate that—through dependent types—the task of defining an interpreter for a language and proving the languages meta-theoretic properties are part of the same activity, exhibiting this duality through what is known as the Curry-Howard correspondence.

This approach is broken down into four phases/files as described below.

Static Semantics Specification

The bi-directional typing judgement is defined as an inductive family indexed by directionality and untyped terms. This judgement serves the rôle of well-typed term syntax.

Static Semantics Proof

Once well-typed syntax is defined, we can relate untyped syntax to well-typed syntax by defining the familiar mutually recursive infer and check functions for bi-directional systems:

infer :  {n} (Γ : Ctx n) t
           ( λ τ  Γ [ inf ] t  τ)
         ¬ ( λ τ  Γ [ inf ] t  τ)

check :  {n} (Γ : Ctx n) t τ
            Γ [ chk ] t  τ
         ¬ (Γ [ chk ] t  τ)

Traditionally, check takes a term and type and produces a boolean while infer takes a term and produces a type. Here these functions are modified to produce evidence in the form of a typing derivation or proof that such a derivation does not exist.

Given the disjunction property of intuitionistic logics, these functions serve as proofs of the decidability of type-inference and type-checking—two important and desirable meta-theoretic results for type-systems designed for programming languages.

Through the Curry-Howard correspondence, recall that typing derivations for us are the same thing as well-typed terms. Therefore check and infer can also be understood as taking typable untyped terms to well-typed terms.

Thus the Curry-Howard duality gives us two things at once:

  1. Type-checking and type-inference (the first phase of an interpreter)
  2. A proof that the type-system we are using is decidable.

We gain something further corollary to (1) and (2):

  1. A way to lift typable untyped terms to well-typed terms

Interestingly, because of (3) we can define our interpreter to work only on well-typed terms. This obviates the need for us to even define behavior for impossible scenarios where we would otherwise be interpreting untypable terms that never passed the type-checking phase.

Dynamic Semantics Specification

Next the dynamic semantics of the well-typed fragment of the language is defined in terms of big-step operational semantics. Although small-step semantics is more popular in general, big-step semantics is closer to the actual implementation of interpreters in practice.

Dynamic Semantics Proof

Finally we define an eval function that relates well-typed terms to values. In our language, values are by definition only well-typed.

eval :  {n} {Γ : Ctx n} {μ : Mode} {t : term μ n} {τ}
      (μt : Γ [ μ ] t  τ)
      (ρ  : Env Γ)
      (Σ (Value τ) (λ v  ρ [ μ ] μt  v)) 

Our eval function is more sophisticated than the traditional version in that it outputs—in addition to values—a proof that the output is related to the input according to the big-step operational semantics we specified, and furthermore than the resultant value retains the same type as the input term.

Again, the Curry-Howard correspondence gives us a few things:

  1. Evaluation of well-typed terms to values (the second phase of an interpreter)
  2. A proof that the evaluator respects the big-step operational semantics
  3. A proof that the evaluator preserves typing from terms to values

Typically one proves type-soundness for a language through the following:

  • progress – well-typed terms are either values or can transition by a small-step
  • preservation – well-typed terms transitioning by a small-step retain their type

Through (1) and (2) we have something, not quite the same as, but similar to progress. With a caveat discussed below, our property is stronger in that all well-typed terms transition in one big-step all the way to values. The evaluator retains the type of the input term in the value that it outputs, providing us an analogue of preservation.

However, there is a caveat. Our evaluator returns a values and proofs of big-step derivations in the Partiality monad. We use this monad because, as written, the evaluator is not obviously terminating and would otherwise be rejected by the Agda termination checker. Although we know from the bigger picture that well-typed terms in an STLC will terminate under correct dynamic semantics, the Agda termination checker does not have enough information to deduce this fact.

Logically, monads behave like modalities–they modify the meaning of types taken as propositions. This means that our version of type-soundness encoded by the type of the evaluator is only specified up to termination of eval for given input. I call this partial type-soundness.

If we prove that eval will always terminate, as we know it should, we would be able to use this proof to extract our type-soundness proof from the Partiality monad, letting us derive total type-soundness. Such an extraction method would work by generating for each well-typed term a proof expressing the finite length of time the evaluator would have to run to output the value of the term and then actually running the evaluator only for that finite period. There are many possible ways one could go about implementing this functionality but it is left for the future.