oplss2024/pfenning/notes.typ
2024-06-07 10:22:38 -04:00

#import "@preview/fletcher:0.4.5" as fletcher: diagram, node, edge
#import "../common.typ": *
#import "@preview/prooftrees:0.1.0": *
#import "@preview/algo:0.3.3": algo, i, d, comment, code
#show: doc => conf("Adjoint Functional Programming", doc)
#let n(t) = text(fill: rgb("#aa3333"))[#t]
#let evalto = $arrow.r.hook$
#let with = $op(\&)$
== Lecture 1
- Linearity originates in linear logic
- from 1987 or before
#let isValue(x) = $#x "value"$
Language to be studied is called "snax"
- Features
- Substructural programming
- Inference
- Overloading? writing the same function that works both linearly and non-linearly
- "Proof-theoretic compiler"
- Everything up until the target C is expressed as natural deduction, sequent calculus, etc.
- Types
- "if there's something you don't need in the language, don't put it in the language"
- no empty type
- unit type
- 1
- #tree(axi[], uni[$() : 1$])
- #tree(axi[], uni[$isValue(())$])
- +
- #tree(axi[$e : A$], uni[$"inl" e : A + B$])
- #tree(axi[$e : B$], uni[$"inr" e : A + B$])
- #tree(axi[$isValue(v)$], uni[$isValue("inl" v)$])
- #tree(axi[$isValue(v)$], uni[$isValue("inr" v)$])
- $A + B :equiv +\{"inl":A, "inr": B\}$
- label set $L eq.not emptyset$ , finite
- finiteness is important
- nat = $+\{ "zero" : 1 , "succ" : "nat" \}$
- recursion is required to define an infinite amount of things
- "equirecursive" types
- nat is "equal" to the recursive definition in some sense
- k
- #tree(axi[$e : A_k quad (k in L)$], uni[$k(e) : +{l : A_l}_(l in L)$])
- $ceil(0) = "zero" ()$
- Computational rules
Define negation of booleans
$"not" (x : "bool") : "bool" \
"not" x = "match" x "with" \
| "true" u arrow.r "false" u\
| "false" u arrow.r "true" u \
$
Function definitions like these operate at the meta level.
We avoid introducing function types today.
Linear functional programming means *every variable must be used exactly once.*
*NOTE:* The first branch above cannot be written as $"true" u arrow.r "false" ()$, since $u$ would not be used.
A linear functional programming language does not need a garbage collector.
#tree(
axi[$e : +{l : A_l}_(l in L)$],
axi[$x : A_l tack.r e_l : C quad (forall l in L)$],
bin[$"match" e "with" (l(x) arrow.r.double e_l)_(l in L) : C$],
)
Cannot write a function $"duplicate"(x) = (x , x)$, because $x$ would be used twice, which violates linearity.
#tree(
axi[$Delta tack.r e_1: A$],
axi[$Gamma tack.r e_2 : B$],
bin[$Delta,Gamma tack.r (e_1, e_2) : A times B$],
)
To track which variables each subexpression may use, we introduce contexts
$Gamma ::= (dot) | Gamma , x : A$
where each variable $x$ must be unique.
#tree(axi[], uni[$x : A tack.r x : A$])
*NOTE:* The left side is $x : A$, not $Gamma , x : A$, because the variables in $Gamma$ would be unused, violating the exactly-once rule.
#tree(
axi[$Delta tack.r e : A times B$],
axi[$Gamma , x : A, y : B tack.r e' : C$],
nary(2)[$Delta , Gamma tack.r "match" e "with" (x , y) => e' : C$],
)
Read literally, the rule has to split your variables between $e$ and $e'$. This is not the way to implement it.
One implementation computes $"FV"(e)$ and $"FV"(e')$, which correspond to $Delta$ and $Gamma$ respectively. This is expensive.
- An easier way is to have each judgement give back another context containing the variables not used. This is called the "subtractive" approach.
- The "additive" approach instead returns the variables used, and then checks that no variable is returned by both subexpressions.
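
A minimal Python sketch of the additive approach, to make the bookkeeping concrete. The tuple encoding of expressions and the names `used`, `disjoint_union` and `LinearityError` are made up for this illustration; they are not from the lecture or from snax. Variable names are assumed to be unique, as required of contexts.

```python
class LinearityError(Exception):
    """Raised when a variable is unused or used more than once."""


def disjoint_union(a, b):
    """Union of two sets of variable names; an error if they overlap."""
    shared = a & b
    if shared:
        raise LinearityError(f"used more than once: {sorted(shared)}")
    return a | b


def used(e):
    """Additive check: return the set of free variables that e uses.

    Expressions (hypothetical encoding):
      ("var", x) | ("unit",) | ("pair", e1, e2) | ("letpair", e1, x, y, body)
    where letpair stands for: match e1 with (x, y) => body
    """
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "unit":
        return set()
    if tag == "pair":
        return disjoint_union(used(e[1]), used(e[2]))
    if tag == "letpair":
        _, e1, x, y, body = e
        in_body = used(body)
        for bound in (x, y):
            if bound not in in_body:
                raise LinearityError(f"never used: {bound}")
        # x and y are bound here, so they are not free in the whole match.
        return disjoint_union(used(e1), in_body - {x, y})
    raise ValueError(f"unknown expression tag: {tag}")


# match p with (x, y) => (y, x)
swap = ("letpair", ("var", "p"), "x", "y", ("pair", ("var", "y"), ("var", "x")))
# (x, x), the body of "duplicate"
duplicate = ("pair", ("var", "x"), ("var", "x"))
```

Here `used(swap)` returns the set containing only `p`, while `used(duplicate)` raises `LinearityError` because `x` shows up on both sides of the pair.
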
Going back to the unit rule, now with an explicit (empty) context:
#tree(
axi[],
uni[$dot tack.r () : 1$],
)
// #tree(
// axi[$Delta tack.r e : +{l : A_l}_(l in L)$],
// axi[$Gamma $]
// )
Defining plus:
-
$"plus" (x : "nat") (y : "nat") : "nat" \
"plus" x " " y = "match" x "with" \
| "zero" () arrow.r.double #text(fill: red)[$"zero" ()$] \
| "succ" x' arrow.r.double "succ" ("plus" x' y)
$
This is problematic because it doesn't use up $y$ in the first branch.
-
$"plus" (x : "nat") (y : "nat") : "nat" \
"plus" x " " y = "match" x "with" \
| "zero" () arrow.r.double y \
| "succ" x' arrow.r.double "succ" ("plus" x' y)
$
This uses up all inputs exactly.
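
The corrected definition can be mirrored in Python as a rough sketch. Naturals are encoded as nested tagged tuples following $"nat" = +{"zero" : 1, "succ" : "nat"}$; the helper names `zero`, `succ`, `encode` and `decode` are assumptions for this example. Python does not check linearity, so the exactly-once discipline is by convention only.

```python
def zero():
    """The value zero (): label "zero" carrying the unit value."""
    return ("zero", ())


def succ(n):
    """The value succ n."""
    return ("succ", n)


def encode(k):
    """Build the unary encoding of a non-negative Python int."""
    n = zero()
    for _ in range(k):
        n = succ(n)
    return n


def decode(n):
    """Count the succ labels to recover a Python int."""
    count = 0
    while n[0] == "succ":
        count += 1
        n = n[1]
    return count


def plus(x, y):
    """Each of x and y is mentioned exactly once on every path."""
    label, payload = x          # x is consumed by the match
    if label == "zero":
        return y                # y is used here, not dropped
    return succ(plus(payload, y))
```

With this encoding, `decode(plus(encode(2), encode(3)))` gives `5`.
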
#tree(
axi[$Gamma tack.r e : A_k (k in L)$],
uni[$Gamma tack.r k(e) : +{l : A_l}_(l in L)$],
)
#pagebreak()
== Lecture 2
Type system
- Proving something different than type soundness; we want to demonstrate that no garbage is created.
- Set up a close correspondence between the static rules for inference and dynamic rules for evaluation.
- This should be _very_ close ideally.
At runtime, if you have $Gamma tack.r e : A$
- $e$ is what you're executing
- you don't want to carry $A$ around during runtime
- $Gamma$ corresponds to a set of variable bindings, which you do need at runtime: the environment $eta : Gamma$
#rect[$eta : Gamma$]
#tree(axi[], uni[$(dot) : (dot)$])
#tree(axi[$eta : Gamma$], axi[$dot tack.r v : A$], bin[$eta , x mapsto v : Gamma , x : A$])
("environment" refers to $eta$, while context refers to $Gamma$)
Runtime values are _always_ closed.
*Theorem.*
If $Gamma tack.r e : A$ and $eta : Gamma$, then $eta tack.r e arrow.r.hook v$ and $dot tack.r v : A$.
However, this rule is problematic:
#tree(
axi[$Delta tack.r e_1 : A$],
axi[$Gamma tack.r e_2 : B$],
bin[$Delta,Gamma tack.r (e_1, e_2) : A times B$]
)
This evaluates to:
#tree(
axi[$e_1 arrow.r.hook v_1$],
axi[$e_2 arrow.r.hook v_2$],
bin[$eta tack.r (e_1, e_2) arrow.r.hook (v_1, v_2)$],
)
How to split the environment in this case?
It's not feasible to split the free variables.
One idea is we can just pass the whole environment into both sides:
#tree(
axi[$eta tack.r e_1 arrow.r.hook v_1$],
axi[$eta tack.r e_2 arrow.r.hook v_2$],
bin[$eta tack.r (e_1, e_2) arrow.r.hook (v_1, v_2)$],
)
The typechecker tells us that everything is used up correctly, so at runtime we can use this assumption.
Using subtractive approach, this looks like:
#tree(
axi[$eta tack.r e_1 arrow.r.hook v_1 #n[$tack.l eta_1$]$],
axi[$#n[$eta_1$] tack.r e_2 arrow.r.hook v_2 #n[$tack.l eta_2$]$],
bin[$eta tack.r (e_1, e_2) arrow.r.hook (v_1, v_2) #n[$tack.l eta_2$]$],
)
The subtractive approach has a problem:
- When you execute your program, you are forced to go left-to-right, because $e_2$ needs the environment left over by $e_1$. You can't run the two sides in parallel.
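
A small Python sketch of the subtractive dynamics for variables, unit and pairs, showing where the left-to-right dependency comes from. The tuple encoding and the name `evaluate_subtractive` are assumptions made for this example.

```python
def evaluate_subtractive(eta, e):
    """Return (value, leftover environment).

    Expressions (hypothetical encoding):
      ("var", x) | ("unit",) | ("pair", e1, e2)
    """
    tag = e[0]
    if tag == "var":
        x = e[1]
        value = eta[x]  # KeyError if x was already consumed
        leftover = {k: v for k, v in eta.items() if k != x}
        return value, leftover
    if tag == "unit":
        return (), eta
    if tag == "pair":
        # The second component can only start once the first has
        # reported which bindings it left over.
        v1, eta1 = evaluate_subtractive(eta, e[1])
        v2, eta2 = evaluate_subtractive(eta1, e[2])
        return (v1, v2), eta2
    raise ValueError(f"unknown expression tag: {tag}")
```

Evaluating the pair of `y` and `x` in an environment binding both leaves an empty environment; evaluating the pair of `x` and `x` fails on the second lookup because the binding is already gone.
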
Additive approach:
#rect[$Gamma tack.r e: A tack.l Omega$]
where $Omega$ are the variables that are _actually_ used
For example, rules for pairs:
#tree(
axi[$Gamma tack.r e_1 : A tack.l Omega_1$],
axi[$Gamma tack.r e_2 : B tack.l Omega_2$],
bin[$Gamma tack.r (e_1,e_2) : A times B tack.l Omega_1, Omega_2$],
)
Here $Omega_1 , Omega_2$ denotes the disjoint union, which is undefined if the two share a variable.
The runtime rule corresponds to:
#tree(
axi[$eta tack.r e_1 arrow.r.hook v_1 tack.l omega_1$],
axi[$eta tack.r e_2 arrow.r.hook v_2 tack.l omega_2$],
bin[$eta tack.r (e_1, e_2) arrow.r.hook (v_1, v_2) tack.l omega_1, omega_2$]
)
If we typechecked correctly, $omega_1$ and $omega_2$ would be disjoint as well.
*Soundness.* If $Gamma tack.r e : A tack.l Omega$ then $Omega tack.r e : A$ ($Omega subset Gamma$)
*Completeness.* If $Omega tack.r e : A$ and $Gamma supset Omega$ then $Gamma tack.r e : A tack.l Omega$
The $Omega tack.r e : A$ is a _precise_ judgement: $Omega$ _only_ contains the variables used in $e$.
You would need two different sets of typing rules, one that has the $Omega$ rules and one that doesn't.
Prove this by rule induction.
Change the original theorem:
*New Theorem.*
Under the conditions $Gamma tack.r e : A tack.l Omega$ and $eta : Gamma$ and $omega : Omega$,
then $eta tack.r e arrow.r.hook v tack.l omega$ and $dot tack.r v : A$.
This can now be proven with the new dynamics.
How to prove that there's no garbage?
For all top level computations, $eta tack.r e arrow.r.hook v tack.l eta$.
This means that everything is used.
This can be proven using rule induction.
The theorem is updated once more: rule induction does not prove termination, so $e arrow.r.hook v$ cannot be a conclusion and becomes a hypothesis instead.
*New Theorem.*
Under the conditions $Gamma tack.r e : A tack.l Omega$ and $eta : Gamma$ and $omega : Omega$,
#n[and $eta tack.r e arrow.r.hook v tack.l omega$] then $dot tack.r v : A$.
(under affine logic, the top-level judgment $Gamma tack.r e : A tack.l Omega$ requires only that $Omega subset Gamma$, not that $Omega = Gamma$)
Example of evaluation rule that matches the typing rule:
#tree(
axi[$eta tack.r e arrow.r.hook (v_1, v_2) tack.l omega$],
axi[$eta, x mapsto v_1 , y mapsto v_2 tack.r e' arrow.r.hook v' tack.l (omega', x mapsto v_1, y mapsto v_2)$],
bin[$eta tack.r "match" e "with" (x, y) arrow.r.double e' arrow.r.hook v' tack.l (omega, omega')$]
)
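
The additive dynamics, including the match rule just shown, can be sketched in Python as follows. The tuple encoding and the names `evaluate` and `merge_disjoint` are assumptions for this illustration. Each call returns the value together with $omega$, the bindings it actually used, and the top-level no-garbage condition is then the check that the returned $omega$ equals $eta$.

```python
def merge_disjoint(a, b):
    """omega_1, omega_2: union of used bindings, undefined on overlap."""
    shared = a.keys() & b.keys()
    if shared:
        raise RuntimeError(f"binding used twice: {sorted(shared)}")
    return {**a, **b}


def evaluate(eta, e):
    """Return (value, omega) where omega holds the bindings of eta used by e.

    Expressions (hypothetical encoding):
      ("var", x) | ("unit",) | ("pair", e1, e2) | ("letpair", e1, x, y, body)
    """
    tag = e[0]
    if tag == "var":
        x = e[1]
        return eta[x], {x: eta[x]}
    if tag == "unit":
        return (), {}
    if tag == "pair":
        # Both sides see the whole environment, so they are independent.
        v1, w1 = evaluate(eta, e[1])
        v2, w2 = evaluate(eta, e[2])
        return (v1, v2), merge_disjoint(w1, w2)
    if tag == "letpair":
        _, e1, x, y, body = e
        (v1, v2), w = evaluate(eta, e1)
        v, w_body = evaluate({**eta, x: v1, y: v2}, body)
        if x not in w_body or y not in w_body:
            raise RuntimeError("bound variable was never used")
        # Remove x and y: they are not part of the outer environment.
        w_rest = {k: val for k, val in w_body.items() if k not in (x, y)}
        return v, merge_disjoint(w, w_rest)
    raise ValueError(f"unknown expression tag: {tag}")


eta = {"a": "A", "b": ("B1", "B2")}
# (a, match b with (x, y) => (y, x))
program = ("pair", ("var", "a"),
           ("letpair", ("var", "b"), "x", "y",
            ("pair", ("var", "y"), ("var", "x"))))
value, omega = evaluate(eta, program)
no_garbage = omega == eta
```

The strings standing in for runtime values are placeholders. For this program every binding of `eta` is used, so `no_garbage` holds.
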
==== Looking at typing rules as logic
#rect[Natural Deduction]
$Gamma ::= (dot) | Gamma , A$
#tree(
axi[],
uni[$A tack.r A$]
)
#tree(
axi[$Delta tack.r A$],
axi[$Gamma tack.r B$],
bin[$Delta, Gamma tack.r A times B$]
)
#tree(
axi[$Delta tack.r A times B$],
axi[$Gamma, A, B tack.r C$],
bin[$Delta, Gamma tack.r C$]
)
With the terms erased, these read as rules of logic. For plus:
#tree(
axi[$Delta tack.r A$],
uni[$Delta tack.r A + B$],
)
#tree(
axi[$Delta tack.r B$],
uni[$Delta tack.r A + B$],
)
Proof by cases:
#tree(
axi[$Delta tack.r A + B$],
axi[$Gamma , A tack.r C$],
axi[$Gamma , B tack.r C$],
tri[$Delta , Gamma tack.r C$],
)
Every assumption has to be used _exactly_ once. This is called *linear logic*.
(notation uses $times.circle$ and $plus.circle$ instead of $times$ and $plus$)
Linear logic is weak by itself, just as linear type system is weak without global definitions.
Summary of operators:
#image("lec1.jpg")
$!A$ is read "of course A" and lets you re-use assumptions. We don't have this except through top-level definitions.
In the judgement $Sigma ; Gamma tack.r e : A$, $Sigma$ contains definitions that you can use however many times you want, and $Gamma$ contains the variables that must be used linearly.
Distinction between positive and negative types:
- Lambdas cannot be pattern-matched against; you have to apply them (negative).
- However, for $times.circle$ and $plus.circle$ you can directly observe their structure (positive).
In this case, $A with B$, read "A with B":
#tree(
axi[$Gamma tack.r A$],
axi[$Gamma tack.r B$],
bin[$Gamma tack.r A with B$],
)
This is sound because only one of them can be extracted:
#tree(
axi[$Gamma tack.r e_1:A$],
axi[$Gamma tack.r e_2:B$],
bin[$Gamma tack.r angle.l e_1,e_2 angle.r : A with B$],
)
#tree(
axi[$Gamma tack.r e : A with B$],
uni[$Gamma tack.r e.pi_1 : A$]
)
#tree(
axi[$Gamma tack.r e : A with B$],
uni[$Gamma tack.r e.pi_2 : B$]
)
A "lazy pair" only lets you extract one side at a time. There are also "lazy records":
$with { l : A_l}_(l in L)$
#tree(
axi[$Gamma tack.r e_l : A_l (forall l in L)$],
uni[$Gamma tack.r {l = e_l}_(l in L) : with {l : A_l}_(l in L)$]
)
#tree(
axi[$Gamma tack.r e : with { l : A_l}_(l in L)$],
uni[$Gamma tack.r e.k : A_k (k in L)$]
)
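
As a rough illustration of why sharing $Gamma$ between the fields is fine, here is a Python sketch of a lazy record. The class `LazyRecord` and its method `project` are hypothetical names for this example. Every field is a suspended computation, and only the one that is projected ever runs, so the shared resources are consumed at most once.

```python
class LazyRecord:
    """A record of suspended computations that can be projected only once."""

    def __init__(self, fields):
        # fields maps each label to a zero-argument function (a thunk)
        self._fields = dict(fields)
        self._spent = False

    def project(self, label):
        """Run the computation stored under label and return its result."""
        if self._spent:
            raise RuntimeError("lazy record was already projected")
        thunk = self._fields[label]
        self._spent = True
        return thunk()


log = []
record = LazyRecord({
    # list.append returns None, so "or" yields the number after logging
    "pi1": lambda: log.append("left") or 1,
    "pi2": lambda: log.append("right") or 2,
})
first = record.project("pi1")
```

After the projection, `log` contains only the entry for the left field; the right field was never evaluated, and a second call to `project` raises.
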
#pagebreak()
== Lecture 3
- negation
- Mixing linear & non-linear programs
- Mode checking & inference
#diagram((
node((0, -0.8), "Unrestricted"),
edge("-|>"),
node((0.8, 0), "Strict (at least once)"),
node((-0.8, 0), "Affine (at most once)"),
edge("-|>"),
node((0, 0.8), "Linear"),
edge((0, -0.8), (-0.8, 0), "-|>"),
edge((0.8, 0), (0, 0.8), "-|>"),
))
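
The diagram can be read through the structural rules each mode admits: weakening (dropping a variable) and contraction (duplicating a variable). A small Python sketch under that reading; the table `STRUCTURAL` and the functions `permits` and `at_least` are names invented for this example.

```python
# Structural rules admitted by each mode.
STRUCTURAL = {
    "unrestricted": {"weakening", "contraction"},
    "affine": {"weakening"},        # at most once
    "strict": {"contraction"},      # at least once
    "linear": set(),                # exactly once
}


def permits(mode, uses):
    """May a variable of this mode be used the given number of times?"""
    rules = STRUCTURAL[mode]
    if uses == 0:
        return "weakening" in rules
    if uses == 1:
        return True
    return "contraction" in rules


def at_least(m, k):
    """m >= k in the diagram: m admits every structural rule that k admits."""
    return STRUCTURAL[k] <= STRUCTURAL[m]
```

Under this encoding `at_least("unrestricted", "linear")` holds, while affine and strict are incomparable.
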
Cannot write map:
```
fail decl map (f : nat -> nat) (xs : list) : list
defn map f xs = match xs with
| 'nil () => 'nil ()
| 'cons (x, xs) => 'cons (f x, map f xs)
```
This is because $f$ isn't used in the first branch but is used twice in the second branch.
We could write some iterator:
```
type iterator = &{'next : nat -> nat * iterator, 'done : 1}
decl iterate (iter : iterator) (xs : list) : list
defn iterate iter xs = match xs with
| 'nil () => (match iter.'done with | () => 'nil ())
| 'cons (x, xs) => (match iter.'next x with | (y, iter) => 'cons ('succ y, iterate iter xs))
```
This is a linear implementation of a function that adds 1 to everything in the list.
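
The data flow of `iterate` can be mimicked in Python as a sketch. Python ints and lists stand in for `nat` and `list`, a dict of two functions stands in for the lazy record, and `identity_iterator` is a hypothetical iterator made up for this example. Python does not enforce that each iterator is used exactly once; the code simply follows that discipline.

```python
def identity_iterator():
    """A lazy record with a next field and a done field.

    next takes a number and returns it together with a fresh iterator;
    done finishes the iteration and returns unit.
    """
    return {
        "next": lambda x: (x, identity_iterator()),
        "done": lambda: (),
    }


def iterate(it, xs):
    """Thread the iterator through the list, using each iterator once."""
    if not xs:
        it["done"]()                # the iterator is consumed, not dropped
        return []
    x, rest = xs[0], xs[1:]
    y, it_next = it["next"](x)      # it is consumed, it_next replaces it
    return [y + 1] + iterate(it_next, rest)
```

With the identity iterator, `iterate(identity_iterator(), [0, 1, 2])` gives `[1, 2, 3]`.
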
=== Modes
Take the entire language and parameterize by modes:
$A_m, B_m ::= 1_m &| A_m times B_m | +{l : A^l_m}_(l in L) | arrow.b^k_m A_k (k >= m) \
&| A_m arrow B_m | \&{l:A^l_m}_(l in L) | arrow.t^i_m A_i (i <= m)
$
These are implemented in the code:
```
type nat[k] = +{'zero : 1, 'succ : nat[k]}
```
Modes need to be _guarded_. For example:
```
type nat[k] = +{'zero : 1, 'succ : down[k] nat[k]}
type list[m k] = +{
'nil : 1,
'cons : down[k] nat[k] * down[m] list[m k],
}
```
The type itself is at mode `m` because `m` is the first mode parameter.
$ (!A_L)_L eq.delta arrow.b^U_L arrow.t^U_L A_L$
This needs a partial ordering on the modes $U$ and $L$. A value has to be copied while it is at $U$, then moved back down to $L$ when you're done.
```
decl map (f : [mf] up[k] (nat[k] -> nat[k])) (xs : list[m k]) : list[r k]
defn map f xs = match xs with
| 'nil () => 'nil ()
| 'cons (<x>, <xs>) => 'cons(<f.force x>, <map f xs>)
```
$Gamma$ can be multi-modal. This is how top-level declarations can be re-used.
https://www.cs.cmu.edu/~fp/papers/tocl07.pdf
https://arxiv.org/pdf/2402.01428v1
We need to enforce independence: $Gamma tack.r e : A_m$ requires $Gamma >= m$, i.e. every variable in $Gamma$ has a mode of at least $m$.
Pointer:
#let wrap(e) = $angle.l #e angle.r $
#tree(
axi[$Gamma >= k$],
axi[$Gamma tack.r e : A_k$],
bin[$Gamma tack.r wrap(e) : arrow.b^k_m A_k$]
)
*NOTE:* The bottom would not be valid if $Gamma cancel(>=) k$
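
The side condition $Gamma >= k$ is a pointwise check on the modes of the variables in the context. A minimal Python sketch with two modes and $U > L$; the names `GEQ`, `geq` and `independent` are assumptions for this example.

```python
# The pairs (m, k) with m >= k, for two modes where U > L.
GEQ = {("U", "U"), ("U", "L"), ("L", "L")}


def geq(m, k):
    """Is mode m at least mode k?"""
    return (m, k) in GEQ


def independent(gamma, m):
    """Gamma >= m: every variable in the context has mode at least m.

    gamma maps variable names to their modes.
    """
    return all(geq(mode, m) for mode in gamma.values())
```

For instance, a context with a single linear variable can be used to prove a conclusion at mode `L` but not one at mode `U`, so `independent({"x": "L"}, "U")` is false.
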
#tree(
axi[$Delta >= m >= r$],
axi[$Delta tack.r e : arrow.b^k_m A_k$],
axi[$Gamma , x : A_k tack.r e' : C_r$],
nary(3)[$Delta, Gamma tack.r "match" e "with" wrap(x) => e' : C_r$],
)
#pagebreak()
== Lecture 4
SAX : Semi-Axiomatic Sequent Calculus
$ f : A_L -> B_L , x : A_L tack.r C_U $
So we would be using linear resources to create an unrestricted resource.
This cannot actually be built: independence requires $Gamma >= m$, but here the context is at mode $L$ and the conclusion at mode $U$, with $U > L$.
So in this case we would want to build:
$ f : A_L -> B_L , x : A_L tack.r arrow.b^U_L C_U $
=== Linear logic
Two modes: $U > L$, $!A = op(arrow.b)^U_L op(arrow.t)^U_L A$
Linear + non-linear logic (LNL) (Benton '95)
#let downshift(a, b) = $op(arrow.b)^#a_#b$
#let upshift(a, b) = $op(arrow.t)^#b_#a$
JS$._4$ $V > T : square A = op(arrow.b)^V_T op(arrow.t)^V_T A$ (comonad) \
Lax logic: $T > L : circle A = op(arrow.t)^T_L op(arrow.b)^T_L A$ (monad)
If you wanted proof irrelevance, you would have proof relevance and proof irrelevance as modes.
#pagebreak()
== Lecture 5
#let cell(a,V) = $"cell" #a " " #V$
#let proc = "proc"
#let cut = "cut"
#let write = "write"
#let read = "read"
#let call = "call"
Recall:
Types:
$A times B, & A -> B \
1, \
+{l: A_l}, & \&{l : A_l} \
arrow.b A, & arrow.t A $
Continuations: $K ::= (x, y) => P | () => P | (l(x) => P_l)_(l in L) | wrap(x) => P$
Values: $V ::= (a, b) | () | k(a) | wrap(a)$
Programs: $P ::= write c S | read c S | cut x P ; Q | id a " " b | call f " " a$
Storable: $S ::= V | K$
=== Dynamics
Format: $cell(a_1, V_1), cell(a_2, V_2), ..., proc P_1, proc P_2$
- Cells are not ordered, but the processes are ordered
- Writes only go to empty cells; mutating a cell is a different process
Rules:
- $proc (cut x P(x); Q(x)) mapsto cell(a, square), proc (P(a)), proc (Q(a))$
- $cell(a, square), proc (write a S) mapsto cell(a, S)$
- $cell(a, square), cell(b, S) , proc (id a b) mapsto cell(a, S)$
- $cell(c, S), proc (read c S') mapsto proc (S triangle.r S')$
- where
- $(a, b) &triangle.r ((x, y) => P(x, y)) = P(a, b) \
() &triangle.r (() => P) = P \
k(a) &triangle.r (l(x) => P_l(x))_(l in L) = P_k(a) \
wrap(a) &triangle.r (wrap(x) => P(x)) = P(a)
$
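
The transition rules can be tried out with a toy Python interpreter. This is only a sketch under several assumptions: programs and storables are encoded as tagged tuples, continuations are Python functions, `cut` takes a function from the fresh address to the two sub-programs, and the names `run`, `pass_to`, `is_value` and `EMPTY` are invented here. Reads and `id` consume the cell they read from, matching the rules above.

```python
import itertools

EMPTY = object()  # an allocated cell that has not been written yet


def is_value(s):
    """Values are pairs, unit, injections k(a) and shifts <a>."""
    return s[0] in ("pair", "unit", "inj", "shift")


def pass_to(v, k):
    """V |> K: hand a value to the matching continuation, giving a program."""
    if v[0] == "pair" and k[0] == "kpair":
        return k[1](v[1], v[2])
    if v[0] == "unit" and k[0] == "kunit":
        return k[1]
    if v[0] == "inj" and k[0] == "kcase":
        return k[1][v[1]](v[2])
    if v[0] == "shift" and k[0] == "kshift":
        return k[1](v[1])
    raise TypeError("value and continuation do not match")


def run(program, dest="d0"):
    """Run a program that writes its result to dest; return the final cells."""
    cells = {dest: EMPTY}
    procs = [program]
    fresh = (f"a{i}" for i in itertools.count())
    stalled = 0
    while procs:
        p = procs.pop(0)
        tag = p[0]
        if tag == "cut":
            a = next(fresh)
            cells[a] = EMPTY
            left, right = p[1](a)
            procs[:0] = [left, right]
        elif tag == "write" and cells.get(p[1]) is EMPTY:
            cells[p[1]] = p[2]
        elif (tag == "id" and cells.get(p[1]) is EMPTY
              and cells.get(p[2], EMPTY) is not EMPTY):
            cells[p[1]] = cells.pop(p[2])
        elif tag == "read" and cells.get(p[1], EMPTY) is not EMPTY:
            stored = cells.pop(p[1])
            v, k = (stored, p[2]) if is_value(stored) else (p[2], stored)
            procs.insert(0, pass_to(v, k))
        else:
            # Not ready yet (or ill-formed): requeue and try another process.
            procs.append(p)
            stalled += 1
            if stalled >= len(procs):
                raise RuntimeError("no process can take a step")
            continue
        stalled = 0
    return cells


# Allocate two unit cells, pair them, then swap the pair into d0.
swap = ("cut", lambda u: (
    ("write", u, ("unit",)),
    ("cut", lambda w: (
        ("write", w, ("unit",)),
        ("cut", lambda p: (
            ("write", p, ("pair", u, w)),
            ("read", p, ("kpair",
                         lambda x, y: ("write", "d0", ("pair", y, x)))),
        )),
    )),
))
final_cells = run(swap)
```

In the final configuration the cell `d0` holds the swapped pair of addresses, and the intermediate pair cell has been consumed by the read.
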
Remark: these rules are substructural. A transition consumes everything on its left side, so the arrow $mapsto$ is a linear arrow.
#let linarrow = $multimap$
#tree(
axi[$Gamma,A tack.r B$],
uni[$Gamma tack.r A linarrow B$]
)
#tree(
axi[$Gamma, x : A tack.r P :: (y : B)$],
uni[$Gamma tack.r write c ((x, y) => P) :: (c : A linarrow B)$],
)
#tree(
axi[],
uni[$A, A linarrow B tack.r B$]
)
#tree(
axi[],
uni[$a : A, c : A linarrow B tack.r read c (a,b) :: (b : B)$]
)
#tree(
axi[$Gamma tack.r A$],
axi[$Gamma tack.r B$],
bin[$Gamma tack.r A with B$],
)
#tree(
axi[],
uni[$A with B tack.r A$],
)
#tree(
axi[],
uni[$A with B tack.r B$],
)
#tree(
axi[$(forall l in L) (Gamma tack.r P_l :: (x : A_l))$],
uni[$Gamma tack.r write c (l(x) => P_l)_(l in L) :: (c : with {l : A_l}_(l in L))$],
)
#tree(
axi[$k in L$],
uni[$c : with {l:A_l}_(l in L) tack.r read c (k(a)) :: (a : A_k)$]
)
In the final configuration of a program, all processes have finished and all cells have been used.
$ dot tack.r P :: (d_0 : A) $
Cut elimination does not hold as-is for the semi-axiomatic sequent calculus.
=== Recovering cut elimination
https://www.cs.cmu.edu/~fp/papers/fscd20a.pdf \
https://www.cs.cmu.edu/~fp/papers/mfps22.pdf
Not all cuts can be eliminated: the cuts that remain are on subformulas of the thing you're trying to prove. These are called snips.
Snip rule:
#tree(axi[], uni[$underline(A), underline(B) tack.r A times.circle B$])
#tree(axi[], uni[$underline(A) tack.r A plus.circle B$])
#tree(axi[], uni[$underline(B) tack.r A plus.circle B$])
#tree(
axi[$Delta tack.r A$],
axi[$Gamma , underline(A) tack.r C$],
bin[$Delta, Gamma tack.r C$]
)