A bit more

This commit is contained in:
Michael Zhang 2022-02-02 04:15:05 -06:00 committed by Michael Zhang
parent 34ea75029b
commit 44ba9624f7
3 changed files with 108 additions and 21 deletions

View File

@ -80,6 +80,7 @@ blockquote {
color: $small-text-color;
border-left: 4px solid $small-text-color;
padding-left: 12px;
font-size: 0.9rem;
}
.postlisting-row td {
@ -114,7 +115,7 @@ code {
font-family: $monofont;
font-size: 0.9em;
box-sizing: border-box;
padding: 3px;
padding: 1px 5px;
background-color: $faded-background-color;
color: $code-color;
}

View File

@ -18,8 +18,8 @@ $monofont: "PragmataPro Mono Liga", "Roboto Mono", "Roboto Mono for Powerline",
$faded-background-color: darken($background-color, 10%);
$heading-color: #15202B;
$text-color: #15202B;
$small-text-color: lighten($text-color, 10%);
$smaller-text-color: lighten($text-color, 20%);
$small-text-color: #6e707f;
$smaller-text-color: lighten($text-color, 30%);
$faded: lightgray;
$link-color: royalblue;
$code-color: firebrick;

View File

@ -1,5 +1,5 @@
+++
title = "a formal cek machine in agda"
title = "building a formal cek machine in agda"
draft = true
date = 2022-02-02
tags = ["computer-science", "programming-languages", "formal-verification", "lambda-calculus"]
@ -17,23 +17,39 @@ proofs of certain properties.
My lambda calculus implemented `call/cc` on top of a CEK machine.
## Foreword
<details>
<summary><b>Why is this interesting?</b></summary>
**Why is this interesting?** Reasoning about languages is one way of ensuring
whole-program correctness. Building up these languages from foundations grounded
in logic helps us achieve our goal with more rigor.
Reasoning about languages is one way of ensuring whole-program correctness.
Building up these languages from foundations grounded in logic helps us
achieve our goal with more rigor.
As an example, suppose I wrote a function that takes a list of numbers and
returns the maximum value. Mathematically speaking, this function would be
_non-total_; an input consisting of an empty set would not produce reasonable
output! If this were a library function I'd like to tell people who write code
that uses this function "don't give me an empty list!"
As an example, suppose I wrote a function that takes a list of numbers and
returns the maximum value. Mathematically speaking, this function would be
_non-total_; an input consisting of an empty set would not produce reasonable
output. If this were a library function I'd like to tell people who write code
that uses this function "don't give me an empty list!"
Unfortunately, just writing this in documentation isn't enough. What we'd really
like is for a tool (like a compiler) to tell any developer who is trying to pass
an empty list into our maximum function "You can't do that." Unfortunately, most
of the popular languages being used today have no way of describing "a list
that's not empty."
Unfortunately, just writing this in documentation isn't enough. What we'd
really like is for a tool (like a compiler) to tell any developer who is
trying to pass an empty list into our maximum function "You can't do that."
Unfortunately, most of the popular languages being used today have no way of
describing "a list that's not empty."
We still have a way to prevent people from running into this problem, though
it involves pushing the problem to runtime rather than compile time. The
maximum function could return an "optional" maximum. Some languages'
implementations of optional values force programmers to handle the "nothing"
case, while others ignore it silently. But in the more optimistic case, even
if the list was empty, the caller would have handled it and treated it
accordingly.
This isn't a pretty way to solve this problem. _Dependent types_ gives us
tools to solve this problem in an elegant way, by giving the type system the
ability to contain values. This also opens its own can of worms, but for
questions about program correctness, it is more valuable than depending on
catching problems at runtime.
</details>
## Lambda calculus
@ -87,10 +103,80 @@ the language means we don't need the `s` and `z` dance to refer to them).
As I noted above, the lambda calculus is _Turing-complete_. One feature of
Turing complete systems is that they have a (provably!) unsolvable "halting"
problem.
problem. Most of the simple term shown above terminate predictably. But as an
example of a term that doesn't halt, consider the _Y combinator_, an example of
a fixed-point combinator:
### Simply-typed lambda calculus (STLC)
Y = λf.(λx.f(x(x)))(λx.f(x(x)))
If you tried calling Y on some term, you will find that evaluation will quickly
expand infinitely. That makes sense given its purpose: to find a _fixed point_
of whatever function you pass in.
> As an example, the fixed-point of the function f(x) = sqrt(x) is 1. That's
> because f(1) = 1. The Y combinator attempts to find the fixed point by simply
> applying the function multiple times. In the untyped lambda calculus, this can
> be used to implement simple (but possibly unbounded) recursion.
Because there are terms that may not terminate, the untyped lambda calculus is
not very useful for logical reasoning. Instead, we add some constraints on it
that makes evaluation total, at the cost of losing Turing-completeness.
### Simply-typed lambda calculus
The simply-typed lambda calculus (STLC) adds types to every term. Types are
crucial to any kind of static program analysis. Suppose I was trying to apply
the term `5` to `6`. As humans we can look at that and instantly recognize that
the evaluation would be invalid, yet under the untyped lambda calculus, it would
be completely representable.
To solve this in STLC, we make this term completely unrepresentable at all. To
say you want to apply 5 to 6 would not be a legal STLC term. That's because all
STLC terms are untyped lambda calculus terms accompanied by a _type_.
This gives us more information about what's allowed before we run the
evaluation. For example, numbers may have their own type `Nat` (for "natural
number"), while functions have a special "arrow" type `_ -> _`, where the
underscores represent other types. A function that takes a number and returns a
boolean (like isEven) would have the type `Nat -> Bool`, while a function that
takes a boolean and returns another boolean would be `Bool -> Bool`.
With this, we have a framework for rejecting terms that would otherwise be legal
in untyped lambda calculus, but would break when we tried to evaluate them. A
function application would be able to require that the argument is the same type
as what the function is expecting.
A semi-formal definition for STLC terms would look something like this:
- **Var.** Same as before, it's a variable that can be looked up in the
environment.
- **Abstraction, or lambda (λ).** This is a function that carries three pieces
of information: (1) the name of the variable that its input will be
substituted for, (2) the _type_ of the input, and (3) the body in which the
substitution will happen.
- **Application.** Same as before.
It doesn't seem like much has changed. But all of a sudden, _every_ term has a
type.
- `5 :: Nat`
- `λ(x:Nat).2x :: Nat -> Nat`
- `isEven(3) :: (Nat -> Bool) · Nat = Bool`
Notation: (`x :: T` means `x` has type `T`, and `f · x` means `f` applied to
`x`)
This also means that some values are now unrepresentable:
- `isEven(λx.2x) :: (Nat -> Bool) · (Nat -> Nat)` doesn't work because the type
of `λx.2x :: Nat -> Nat` can't be used as an input for `isEven`, which is
expecting a `Nat`.
We have a good foundation for writing programs now, but this by itself can't
qualify as a system for computation.
## CEK machine
A CEK machine
A CEK machine is responsible for evaluating a lambda calculus term.