oplss2024/silva/notes.typ
2024-06-04 18:56:00 -04:00

332 lines
9.9 KiB
Plaintext

#import "../common.typ": *
#import "@preview/finite:0.3.0" as finite: automaton
#import "@preview/fletcher:0.4.5" as fletcher: diagram, node, edge
#import "@preview/prooftrees:0.1.0": *
#show: doc => conf("Program Analysis with Kleene Algebra with Tests", doc)
#let tru = $"true"$
#let fls = $"false"$
== Lecture 1
#quote(block: true)[
Kleene algebra with tests is an algebraic framework that can be used to reason about imperative programs. It has been applied across a wide variety of areas including program transformations, concurrency control, compiler optimizations, cache control, networking, and more. In these lectures, we will provide an overview of Kleene Algebra with Tests, including the syntax, semantics, and the coalgebraic theory underlying decision procedures for program equivalence. We will illustrate how it can be used as a core framework for program verification, including successful extensions and fundamental limitations.
]
#table(
columns: (1fr, auto, 1fr),
align: (auto, center, auto),
```
while a & b do
p;
while a do
q;
while a & b do
p;
```,
$eq.quest$,
```
while a do
if b then
p;
else
q;
```
)
Study equivalence of uninterpreted simple imperative programs.
==== Regular expressions
Basic elements
$e ::= 0 | 1 | a in A | ...$
where $A$ is finite.
$e ::= ... | e ; e | e + e | e *$
Set of words:
#rect[$db(e) : 2^(A^*) = cal(P)(A^*)$]
- $db(0) = emptyset$
- $db(1) = {epsilon }$
- $db(a) = {a }$
- For every symbol there's also an expression that's the same thing.
This is an abuse of notation
- $db(e + f) = db(e) union db(f)$
- $db(#[$e ; f$]) = db(e) circle.filled db(f)$
- $forall (U, V : 2^(A^*)) U circle.filled V = { u v | u in U , v in V }$
- this is word concatenation
- $db(e^*) = union.big_(n in bb(N)) db(e)^n$
These are the denotational semantics of regular expressions.
Examples:
- $(1 + a) ; (1 + b) mapsto {epsilon , a , b, a b }$
- $(a + b)^* mapsto {a , b}^*$
- $(a^* b)^* a^* mapsto {a , b}^*$
These last two are equal $(a + b)^* equiv (a^* b)^* a^*$ because they have the same denotational semantics.
This is _denesting_.
These denesting rules can also be applied to programs such as the example at the start.
For regular expressions and extensions of regular expressions, you can come up with some finite number of 3-bar equations to prove any equivalence between regular expressions.
#rect[
*Definition (Kleene's Theorem).*
Let $L$ be a regular language.
Then the following are equivalent:
1. $L = db(e)$ for some regexp $e$
2. $L$ is accepted by a DFA
]
Regular languages do not include things like $A^n B^n$.
See: #link("https://en.wikipedia.org/wiki/Chomsky_hierarchy")[Chomsky Hierarchy of languages].
- Regular sets are constructed with a similar construction as regular expressions.
Proof:
- #[
Direction 1 -> 2 first.
https://en.wikipedia.org/wiki/Brzozowski_derivative
($L = db(e)$ for some regexp $e$) $arrow.r$ ($L$ is accepted by a DFA)
Build an automaton directly from the expression. Use _derivatives_ as if it was a function.
DFA :
- S : finite set of states
- $F : S arrow.r bb(2)$
- $t : S arrow.r S^A$
- this must be deterministic
To construct a DFA out of an expression, construct 2 functions:
1. #[
$E : "RE" arrow.r bb(2)$
E stands for "empty word", this asks "is there an empty word"? Because this will indicate where you can stop
- $E(0) = 0$
- $E(1) = 1$
- $E(a) = 0$
- $E(e ; f) = E(e) times E(f)$
- $E(e + f) = E(e) or E(f)$
- $E(e^*) = 1$
]
2. #[
$L : cal(P)(A^*)$
- $L_a = { u | a u in L }$
$D_a : "RE" arrow.r "RE"^A$
- Note: $X arrow.r X^A equiv (X arrow.r X)_(a in A)$
Definition:
- $D_a (0) = 0$
- $D_a (1) = 0$
- $D_a (a) = 1$
- $D_a (b) = 0$
- $D_a (e ; f) = (D_a (e) ; f) + (E(e) ; D_a (f))$
- If $E(e)$ did not have the empty word, then $E(e) = 0$, then $0 + A = A$.
This is the case as the case split:
$cases(#[$D_a (e) ; f "if" E(e) = 0$], #[$D_a (e) ; f + D_a (f) "if" E(e) = 1$])$
- $D_a (e + f) = D_a (e) + D_a (f)$
- $D_a (e^*) = D_a (e) ; e^*$
]
This is not a DFA yet, because it needs to be shown that it's finite.
Pick a *start state*.
You can pick $e$ as your start state.
From there, collect states you can reach.
Then, argue that the states you can reach are finite.
It's not actually true that this is finite!
- If $e = (a^*)^*$, then $D_a (e) = (1 ; a^*) ; (a^*)^*$
- Then $D_a (D_a (e)) = (0;a^* + 1;a^*) ; (a^*)^* + (a^*) ; (a^*)^*$
- TODO: This isn't completely right but the point is that it expands infinitely and there is a common part
Need to look at the equivalent classes of derivatives, since $D$ only understands syntax and keeps around a lot of unnecessary baggage.
ACI (associativity, commutativity, I? (TODO))
This isn't completely necessary to show finiteness, since the repeating term is the same.
Delete unnecessary $(+ f)$ terms in the $+$ case.
This will give you finiteness.
]
Merge states of a non-deterministic finite automaton, using the peanut approach.
Alternative #link("https://en.wikipedia.org/wiki/Thompson%27s_construction")[*Thompson's construction*]
#pagebreak()
== Lecture 2
Talking about the other direction from Kleene's theorem.
$ A_e mapsto e $
mapping an automaton $A_e$ to a regular expression $e$.
Let DFA have states $S$ and transition function $S mapsto S^A$. This can be represented by a matrix.
The matrix is indexed on rows and columns by the states. Then in the cell for each row $i$ and column $j$, put all of the letters of the alphabet that can be used to transition between the states. For example, if you can use $a$ and $b$, put $a+b$ in the matrix.
This allows you to do matrix operations in order to do operations on regular expressions. This shows that the transition function actually has more structure than just an arbitrary function.
Repeatedly delete states until 2 states left, replacing the transitions with regular expressions.
What does it mean to delete states?
*State elimination method.* Need at least 3 states.
#automaton(
layout: finite.layout.snake.with(columns: 2),
(
q0: (q1:("a"), q2: "b"),
q1: (q1: "a", q0: "b"),
q2: (q1: "a", q2: "b"),
)
)
Delete q2, by merging its transition $b b^* a$.
#automaton(
final: "q0",
(
q0: (q1: "a + bb*a"),
q1: (q1: "a", q0: "b")
)
)
Matrix method is more robust than the state elimination method.
*Question.* Is it possible to write a finite number of equations to answer the question $e_1 eq.quest e_2$
$(K, 0, 1, +, op(\;), (..)*)$ satisfies the following:
- K is some set
- Semi-ring
- Joint semi-lattice
- + is idempotent ($e + e equiv e$)
- + is commutative ($e + f equiv f + e$)
- + is associative ($(e + f) + g equiv e + (f + g)$)
- + has a 0 element ($e + 0 equiv e$)
- Monoid
- ; is associative ($(e ; g) ; g equiv e ; (f ; g)$)
- ; has a 1 element ($e ; 1 equiv e equiv 1 ; e$)
- ; has an absorbent element ($e ; 0 equiv 0 equiv 0 ; e$)
- ; distributes over +, both from the right and the left
- $e ; (f + g) equiv e ; f + e ; g$ AND $(e + f) ; g equiv e ; g + f ; g$
- \* is a fix point ($e^* equiv 1 + e ; e^*$)
- \* can be unfolded on the left or the right ($e^* equiv 1 + e^* ; e$)
$e^*$ is a _least_ fix point
#rect[$e <= f$ iff $e + f equiv f$]
#tree(
axi[$e;x+f <= x$],
uni[$e^*;f <= x$]
)
This forms an #link("https://en.wikipedia.org/wiki/Axiom_schema")[axiom schema].
#rect[
Exercises:
- $x^* x^* equiv x^*$
- $x^* equiv (x^*)^*$
- $x y equiv y z arrow.r.long.double x^* y equiv y z^*$
- $(a + b)^* equiv (a^* b)^* a^*$
]
Need to do it in 2 steps ($<=, >=$).
This structure is useful because there are many structures like this. For example ${2^(A^*), emptyset, {epsilon}, union, circle.filled, (..)^*}$ the set of all languages is a Kleene algebra.
Other examples:
- #[
Binary relations: $("BRel", emptyset, Delta, union, circle, (..)^*)$
- $Delta$ is the diagonal relation
- The $(..)^*$ is transitive closure
Binary relations satisfies all of the rules of a Kleene algebra.
]
- #[
Square matrices: $("Mat"(k), 0, I, +, times, (..)^*)$
- $times$ is the multiplcation associated with $k$
- #[
The star operation is defined:
$mat(a,b;c,d)^* = mat((a+b d^* c)^*, (a+b d^* c)^* b d^*;
(d+c a^* b)^* c a^*, (d+c a^* b)^*)$
This describes _every_ path from a state to another state, as many times:
#automaton(
(
q0: (q0: "a", q1: "b"),
q1: (q0: "c", q1: "d"),
)
)
]
]
*Kozen '93 result.* If two expressions $e$ and $f$ have the same language, then
$ db(e) equiv db(f) arrow.long.double.l.r e equiv f$
Switching between syntax and semantics.
_Proof method._
For every expression $e$, build an NFA $A_e$. So $e equiv f$ gets $A_e$ and $A_f$.
(Modern proofs will directly go to the DFA).
Determinize the NFAs to get $A'_e$ and $A'_f$.
Then minimize the automata to get $M_e$ and $M_f$.
#image("lec1.jpg")
Every DFA has a _unique_ minimal automaton, modulo renaming the state names.
If $e$ and $f$ represent the same language, then the minimal automata must be isomorphic.
Every $A'_e$ and $M_e$ is a matrix of expressions, and can be proven sound using the axioms of Kleene algebra.
=== Imperative programs
What is missing from $(K, 0, 1, +, op(\;), (..)*)$ for imperative programs?
- State
- Conditionals
Conditionals needs to satisfy a boolean algebra. So we need to combine a Kleene
algebra and a Boolean algebra.
Subset of $K$: $(B, 0, 1, +, ;, overline((..)))$ is a boolean algebra. _Sub-algebra_
- For B, + and ; are thought of as $or$ and $and$
- It needs to additionally satisfy demorgan laws for not
- $overline(a + b) equiv overline(a) ; overline(b)$
- $overline(a \; b) equiv overline(a) + overline(a)$
- $overline(0) equiv 1$
- $overline(overline(a)) equiv a$
- $a ; b equiv b ; a$ (not true normally!!)
This is only true of language without effects or concurrency.
- Threads can be reasoned about in a partially distributive commutative lattice