oplss2024/pfenning/notes.typ

#import "@preview/fletcher:0.4.5" as fletcher: diagram, node, edge
#import "../common.typ": *
#import "@preview/prooftrees:0.1.0": *
#import "@preview/algo:0.3.3": algo, i, d, comment, code
#show: doc => conf("Adjoint Functional Programming", doc)

#let n(t) = text(fill: rgb("#aa3333"))[#t]
#let evalto = $arrow.r.hook$
#let with = $op(\&)$

== Lecture 1

- Origins of linearity is from linear logic
- from 1987 or before

#let isValue(x) = $#x "value"$

Language to be studied is called "snax"
	- Features
		- Substructural programming
		- Inference
			- Overloading? writing the same function that works both linearly and non-linearly
		- "Proof-theoretic compiler"
			- Everything up til the target C is either natural deduction, sequent calculus, etc.
	- Types
		- "if there's something you don't need in the language, don't put it in the language"
		- no empty type
		- unit type
			- 1
				- #tree(axi[], uni[$() : 1$])
				- #tree(axi[], uni[$isValue(())$])
			- +
        - #tree(axi[$e : A$], uni[$"inl" e : A + B$])
        - #tree(axi[$e : B$], uni[$"inr" e : A + B$])
				- #tree(axi[$isValue("v")$], uni[$isValue("inl" v)$])
				- #tree(axi[$isValue("v")$], uni[$isValue("inr" v)$])
			- $A + B :equiv +\{"inl":A, "inr": B\}$
			- label set $L eq.not emptyset$ , finite
				- finiteness is important
			- nat = $+\{ "zero" : 1 , "succ" : "nat" \}$
				- recursion is required to define an infinite amount of things
					- "equirecursive" types
						- nat is "equal" to the recursive definition in some sense
			- k
				- $\{(k \in l) e : A_e}{k(e) : +\{l : A_l\}_{l : L}}$
			- $ceil(0) = "zero" ()$

  - Computational rules

    Define negation of booleans

    $"not" (x : "bool") : "bool" \
    "not" x = "match" x "with" \
    | "true" u arrow.r "false" u\
    | "false" u arrow.r "true" u \
    $

    Function definitiosn like these operate at the meta level.
    Avoid introducing function types today.

    Linear functional programming means *every variable must be used exactly once.*

    *NOTE:* Above cannot be defined $"true" u arrow.r "false" ()$, since $u$ is not used.

    DOn't need a garbage collector for linear functional programming language

    #tree(
      axi[$e : +{l : A_l}_(l in L)$],
      axi[$x : A_l$],
      axi[$tack.r e_l : C(l in L)$],
      nary(3)[$"match" e "with" (l(x) arrow.r.double e_l)_(l in L) : C$],
    )

    Cannot write a function $"duplicate"(x) = (x , x)$ because of linear programming

    #tree(
      axi[$Delta tack.r e_1: A$],
      axi[$Gamma tack.r e_2 : B$],
      bin[$Delta,Gamma tack.r (e_1, e_2) : A times B$],
    )

    so introducing contexts

    $Gamma :equiv (dot) | Gamma , x : A$

    where the variable $x$ must be unique.

    #tree(axi[], uni[$x : A tack.r x : A$])

    *NOTE:* Left side is $x : A$, not $Gamma , x : A$ because those others would be unused, violating the single use rule.

    #tree(
      axi[$Delta , e : A times B$],
      axi[$Gamma , x : A, y : B tack.r e' : C$],
      nary(2)[$Delta , Gamma tack.r "match" e "with" (x , y) => e' : C$],
    )

    Have to split your variables to check $e$ and $e'$. This is not the way to implement it.

    To implement it, check $"FV"(e)$ and $"FV"(e')$. These correspond to $Delta$ and $Gamma$ respectively. This is expensive.

    - An easier way is to have each judgement give back another context containing the variables not use. Called a "subtractive" approach.

    - Additive approach returns the variables used, and then check to make sure variables don't exist on the output.

    GOing backk

    #tree(
      axi[],
      uni[$dot tack.r () : 1$],
    )

    // #tree(
    //   axi[$Delta tack.r e : +{l : A_l}_(l in L)$],
    //   axi[$Gamma $]
    // )

  Defining plus:

  -
    $"plus" (x : "nat") (y : "nat") : "nat" \
    "plus" x " " y = "match" x "with" \
    | "zero" () arrow.r.double #text(fill: red)[$"zero" ()$] \
    | "succ" x' arrow.r.double "succ" ("plus" x' y)
    $

    This is problematic because it doesn't use up $y$ in the first branch.

  -
    $"plus" (x : "nat") (y : "nat") : "nat" \
    "plus" x " " y = "match" x "with" \
    | "zero" () arrow.r.double y \
    | "succ" x' arrow.r.double "succ" ("plus" x' y)
    $

    This uses up all inputs exactly.

  #tree(
    axi[$Gamma tack.r e : A_k (k in L)$],
    uni[$Gamma tack.r k(e) : +{l : A_l}_(l in L)$],
  )

#pagebreak()

=== Lecture 2

Type system

- Proving something different than type soundness; we want to demonstrate that no garbage is created.
- Set up a close correspondence between the static rules for inference and dynamic rules for evaluation.
  - This should be _very_ close ideally.

At runtime, if you have $Gamma tack.r e : A$
- $e$ is what you're executing
- you don't want to carry $A$ around during runtime
- $Gamma$ corresponds to a set of variables (you need this) $eta : Gamma$

#rect[$eta : Gamma$]

#tree(axi[], uni[$(dot) : (dot)$])
#tree(axi[$eta : Gamma$], axi[$dot tack.r v : A$], bin[$eta , x mapsto v : Gamma , x : A$])

("environment" refers to $eta$, while context refers to $Gamma$)

Runtime values are _always_ closed.

*Theorem.*
Under the conditions $Gamma tack.r e : A$ and $eta : Gamma$, then $eta tack.r e arrow.r.hook v$ and $dot tack.r v : A$.

However, this rule is problematic:

#tree(
  axi[$Delta tack.r e_1 : A$],
  axi[$Gamma tack.r e_2 : B$],
  bin[$Delta,Gamma tack.r (e_1, e_2) : A times B$]
)

This evaluates to:

#tree(
  axi[$e_1 arrow.r.hook v_1$],
  axi[$e_2 arrow.r.hook v_2$],
  bin[$eta tack.r (e_1, e_2) arrow.r.hook (v_1, v_2)$],
)

How to split the environment in this case?
It's not feasible to split the free variables.

One idea is we can just pass the whole environment into both sides:

#tree(
  axi[$eta tack.r e_1 arrow.r.hook v_1$],
  axi[$eta tack.r e_2 arrow.r.hook v_2$],
  bin[$eta tack.r (e_1, e_2) arrow.r.hook (v_1, v_2)$],
)

The typechecker tells us that everything is used up correctly, so at runtime we can use this assumption.
Using subtractive approach, this looks like:

#tree(
  axi[$eta tack.r e_1 arrow.r.hook v_1 #n[$tack.l eta_1$]$],
  axi[$#n[$eta_1$] tack.r e_2 arrow.r.hook v_2 #n[$tack.l eta_2$]$],
  bin[$#n[$eta_2$] tack.r (e_1, e_2) arrow.r.hook (v_1, v_2)$],
)

The subtractive approach has problems:

- When you execute your program, you are forced to go left-to-right. You can't run in parallel.

Additive approach:

#rect[$Gamma tack.r e: A tack.l Omega$]

where $Omega$ are the variables that are _actually_ used

For example, rules for pairs:

#tree(
  axi[$Gamma tack.r e_1 : A tack.l Omega_1$],
  axi[$Gamma tack.r e_2 : B tack.l Omega_2$],
  bin[$Gamma tack.r (e_1,e_2) : A times B tack.l Omega_1, Omega_2$],
)

The $Omega_1 , Omega_2$ disjoint union, and is undefined if they share a variable.

The runtime rule corresponds to:

#tree(
  axi[$eta tack.r e_1 arrow.r.hook v_1 tack.l omega_1$],
  axi[$eta tack.r e_2 arrow.r.hook v_2 tack.l omega_2$],
  bin[$eta tack.r (e_1, e_2) arrow.r.hook (v_1, v_2) tack.l omega_1, omega_2$]
)

If we typechecked correctly, $omega_1$ and $omega_2$ would be disjoint as well.

*Soundness.* If $Gamma tack.r e : A tack.l Omega$ then $Omega tack.r e : A$ ($Omega subset Gamma$)

*Completeness.* If $Omega tack.r e : A$ and $Gamma supset Omega$ then $Gamma tack.r e : A tack.l Omega$

The $Omega tack.r e : A$ is a _precise_ judgement: $Omega$ _only_ contains the variables used in $e$.

You would need two different sets of typing rules, one that has the $Omega$ rules and one that doesn't.
Prove this by rule induction.

Change the original theorem:

*New Theorem.*
Under the conditions $Gamma tack.r e : A tack.l Omega$ and $eta : Gamma$ and $omega : Omega$,
then $eta tack.r e arrow.r.hook v tack.l omega$ and $dot tack.r v : A$.

This can now be proven with the new dynamics.

How to prove that there's no garbage?
For all top level computations, $eta tack.r e arrow.r.hook v tack.l eta$.
This means that everything is used.
This can be proven using rule induction.

Updated theorem, to point out that $e arrow.r.hook v$ may not necessarily be true because termination is not proven by rule induction.

*New Theorem.*
Under the conditions $Gamma tack.r e : A tack.l Omega$ and $eta : Gamma$ and $omega : Omega$,
#n[and $eta tack.r e arrow.r.hook v tack.l omega$] then $dot tack.r v : A$.

(under affine logic, the top-level judgment $Gamma tack.r e : A tack.l Omega$ requires only that $Omega subset Gamma$, not that $Omega = Gamma$)

Example of evaluation rule that matches the typing rule:

#tree(
  axi[$eta tack.r e arrow.r.hook (v_1, v_2) tack.l omega$],
  axi[$eta, x mapsto v_1 , y mapsto v_2 tack.r e' arrow.r.hook v' tack.l (omega', x mapsto v_1, y mapsto v_2)$],
  bin[$eta tack.r "match" e "with" (x, y) arrow.r.double e' arrow.r.hook v' tack.l (omega, omega')$]
)


==== Looking at typing rules as logic

#rect[Natural Deduction]

$Gamma ::= (dot) | Gamma , A$

#tree(
  axi[],
  uni[$A tack.r A$]
)

#tree(
  axi[$Delta tack.r A$],
  axi[$Gamma tack.r B$],
  bin[$Delta, Gamma, tack.r A times B$]
)

#tree(
  axi[$Delta tack.r A times B$],
  axi[$Gamma, A, B tack.r C$],
  bin[$Delta, Gamma tack.r C$]
)

These are like a rule of logic. For plus:

#tree(
  axi[$Delta tack.r A$],
  uni[$Delta tack.r A + B$],
)

#tree(
  axi[$Delta tack.r B$],
  uni[$Delta tack.r A + B$],
)

Proof by cases:

#tree(
  axi[$Delta tack.r A + B$],
  axi[$Gamma , A tack.r C$],
  axi[$Gamma , B tack.r C$],
  tri[$Delta , Gamma tack.r C$],
)

Every assumption has to be used _exactly_ once. This is called *linear logic*.

(notation uses $times.circle$ and $plus.circle$ instead of $times$ and $plus$)

Linear logic is weak by itself, just as linear type system is weak without global definitions.

Summary of operators:

#image("lec1.jpg")

$!A$ is read "of course A", lets you re-use assumptions. We don't have this except by top-level definitions.

In the judgement $Sigma ; Gamma tack.r e : A$, $Sigma$ contains definitions that you can use however many times you want, and $Gamma$ contains the definitions that are created linearly.

Distinction between positive and negative types:

- Lambdas cannot be pattern-matched against, you have to apply it.
- However, for $times.circle$ and $plus.circle$ you can directly observe their structure.

In this case, $A with B$, read "A with B":

#tree(
  axi[$Gamma tack.r A$],
  axi[$Gamma tack.r B$],
  bin[$Gamma tack.r A with B$],
)

This is sound because only one of them can be extracted:

#tree(
  axi[$Gamma tack.r e_1:A$],
  axi[$Gamma tack.r e_2:B$],
  bin[$Gamma tack.r angle.l e_1,e_2 angle.r : A with B$],
)

#tree(
  axi[$Gamma tack.r e : A with B$],
  uni[$Gamma tack.r e.pi_1 : A$]
)

#tree(
  axi[$Gamma tack.r e : A with B$],
  uni[$Gamma tack.r e.pi_2 : B$]
)

"Lazy pair" you can only extract one side at a time. There are also "lazy records":

$with { l : A_l}_(l in L)$

#tree(
  axi[$Gamma tack.r e_l : A_l (forall l in L)$],
  uni[$Gamma tack.r {l = e_l}_(l in L) : with {l : A_l}_(l in L)$]
)

#tree(
  axi[$Gamma tack.r e : with { l : A_l}_(l in L) (forall l in L)$],
  uni[$Gamma tack.r e.k : A_k (k in L)$]
)

#pagebreak()

=== Lecture 3

- negation
- Mixing linear & non-linear programs
- Mode checking & inference

#diagram((
  node((0, -0.8), "Unrestricted"),
  edge("-|>"),
  node((0.8, 0), "Strict (at least once)"),
  node((-0.8, 0), "Affine (at most once)"),
  edge("-|>"),
  node((0, 0.8), "Linear"),
  edge((0, -0.8), (-0.8, 0), "-|>"),
  edge((0.8, 0), (0, 0.8), "-|>"),
))

Cannot write map:

```
fail decl map (f : nat -> nat) (xs : list) : list
defn map f xs = match xs with
  | 'nil () => 'nil ()
  | 'cons (x, xs) => 'cons (f x, map f xs)
```

This is because $f$ isn't used in the first line but used twice in the second line.

We could write some iterator:

```
type iterator = &{'next : nat -> nat * iterator, 'done : 1}

decl iterate (iter : iterator) (xs : list) : list
defn iterate iter xs = match xs with
  | 'nil () => (match iter.'done with | () => 'nil ())
  | 'cons (x, xs) => (match iter.'next x with | (y, iter) => 'cons ('succ y, iterate iter xs))
```

This is a linear implementation of a function that adds 1 to everything in the list.

=== Modes

Take the entire language and parameterize by modes:

$A, B_m ::= 1_m &| A_m times B_m | +{l : A^l_m}_(l in L) | arrow.b^k_m A_k (k >= m) \
&| A_m arrow B_m | \&{l:A^l_m}_(l in L) | arrow.t^i_m A_i (i <= m)
$

These are implemented in the code:

```
type nat[k] = +{'zero : 1, 'succ : nat[k]}
```

Modes need to be _guarded_. For example:

```
type nat[k] = +{'zero : 1, 'succ : down[k] nat[k]}

type list[m k] = +{
  'nil : 1,
  'cons : down[k] nat[k] * down[m] list[m k],
}
```

The `m` is used because it comes first.

$ (!A_L)_L eq.delta arrow.b^U_L arrow.t^U_L A_L$

needs a partial ordering on U and L. Need to copy when it's in the U and then move it back into the L when you're done.

```
decl map (f : [mf] up[k] (nat[k] -> nat[k])) (xs : list[m k]) : list[r k]
defn map f xs = match xs with
  | 'nil () => 'nil ()
  | 'cons (<x>, <xs>) => 'cons(<f.force x>, <map f xs>)
```

$Gamma$ can be multi-modal. This is how top-level declarations can be re-used.

https://www.cs.cmu.edu/~fp/papers/tocl07.pdf

https://arxiv.org/pdf/2402.01428v1

Need to enforce independence, that $Gamma tack.r e : A_m$ means $Gamma >= m$

Pointer:

#let wrap(e) = $angle.l #e angle.r $

#tree(
  axi[$Gamma >= k$],
  axi[$Gamma tack.r e : A_k$],
  bin[$Gamma tack.r wrap(e) : arrow.b^k_m A_k$]
)

*NOTE:* The bottom would not be valid if $Gamma cancel(>=) k$

#tree(
  axi[$Delta >= m >= r$],
  axi[$Delta tack.r e : arrow.b^k_m A_k$],
  axi[$Gamma , x : A_k tack.r e' : C_r$],
  nary(3)[$Delta Gamma tack.r "match" e with wrap(x) => e' : C_r$],
)

#pagebreak()

== Lecture 4

SAX : Semi-Axiomatic Sequence Calculus

$ f : A_L -> B_L , x : A_L tack.r C_U $

So we are using linear resources to create an unrestricted resource.
Can't actually build this because $Gamma >= m$ and $U > L$.

So in this case we woudl want to build:

$ f : A_L -> B_L , x : A_L tack.r arrow.b^U_L C_U $

=== Linear logic

Two modes: $U > L$, $!A = op(arrow.b)^U_L op(arrow.t)^U_L A$

Linear + non-linear logic (LNL) (Benton '95)

#let downshift(a, b) = $op(arrow.b)^#a_#b$
#let upshift(a, b) = $op(arrow.t)^#b_#a$

JS$._4$ $V > T : square A = op(arrow.b)^V_T op(arrow.t)^V_T A$ (comonad) \
Lax logic: $T > L : circle A = op(arrow.t)^T_L op(arrow.b)^T_L A$ (monad)

If you wanted proof irrelevance, you would have proof relevance and proof irrelevance as modes.

#pagebreak()

== Lecture 5

#let cell(a,V) = $"cell" #a " " #V$
#let proc = "proc"
#let cut = "cut"
#let write = "write"
#let read = "read"
#let call = "call"

Recall:

Types:

$A times B, & A -> B \
1, \
+{l: A_l}, & \&{l : A_l} \
arrow.b A, & arrow.t A $

Continuations: $K ::= (x, y) => P | () => P | (l(x) => P_l)_(l in L) | wrap(x) => P$

Values: $V ::= (a, b) | () | k(a) | wrap(a)$

Programs: $P ::= write c S | read c S | cut x P ; Q | id a " " b | call f " " a$

Storable: $S ::= V | K$

=== Dynamics

Format: $cell a_1 V_1, cell a_2 V_2, ..., proc P_1, proc P_2$

- Cells are not ordered, but the procedures are ordered
- Only writing to empty cells, mutating a cell is a different process

Rules:

- $proc (cut x P(x); Q(x)) mapsto cell(a, square), proc (P(a)), proc (Q(a))$
- $cell(a, square), proc (write a S) mapsto cell(a, S)$
- $cell(a, square), cell(b, S) , proc (id a b) mapsto cell(a, V)$
- $cell(c, S), proc (read c k) mapsto proc (S triangle.r S)$
  - where
    - $(a, b) &triangle.r ((x, y) => P(x, y)) = P(a, b) \
    () &triangle.r (() => P) = P \
    k(a) &triangle.r (l(x) => P_l(x))_(l in L) \
    wrap(a) &triangle.r (wrap(x) => P(x)) = P(a)
    $

Remarks: this is substructural because if you have the left side, then the arrow is a linear arrow.

#let linarrow = $multimap$

#tree(
  axi[$Gamma,A tack.r B$],
  uni[$Gamma tack.r A linarrow B$]
)

#tree(
  axi[$Gamma, x tack.r P :: (y : B)$],
  uni[$Gamma tack.r write c ((x, y) => P) :: (c : A linarrow B)$],
)

#tree(
  axi[],
  uni[$A, A linarrow B tack.r B$]
)

#tree(
  axi[],
  uni[$a : A, c : A linarrow B tack.r read c (a,b) :: (b : B)$]
)

#tree(
  axi[$Gamma tack.r A$],
  axi[$Gamma tack.r B$],
  bin[$Gamma tack.r A & B$],
)

#tree(
  axi[],
  uni[$A & B tack.r A$],
)

#tree(
  axi[],
  uni[$A & B tack.r B$],
)

#tree(
  axi[$(forall l in L) (Gamma tack.r P_l : (x : A_l))$],
  uni[$Gamma tack.r write c (l(x) => P) :: (c : &{l : A_l}_(l in L))$],
)

#tree(
  axi[$k in L$],
  uni[$c : &{l:A_l}_(l in L) tack.r read c (k(a)) :: (a : A_k)$]
)

The final configuration of a program is all programs are used, all cells are used

$ dot tack.r P :: (d_0 : A) $

Cut elimination is not compatible with some semi-axiomatic sequent calculus.

=== Recover from cut elimination

https://www.cs.cmu.edu/~fp/papers/fscd20a.pdf \
https://www.cs.cmu.edu/~fp/papers/mfps22.pdf

Can't eliminate all cuts, cannot eliminate cuts that are subformulas (SNIPS) of the things you're trying to prove.

snip rule

#tree(axi[], uni[$underline(A), underline(B) tack.r A times.circle B$])

#tree(axi[], uni[$underline(A) tack.r A plus.circle B$])
#tree(axi[], uni[$underline(B) tack.r A plus.circle B$])

#tree(
  axi[$Delta tack.r A$],
  axi[$Gamma , underline(A) tack.r C$],
  bin[$Delta, Gamma tack.r C$]
)