Kleene algebra with tests is an algebraic framework that can be used to reason about imperative programs. It has been applied across a wide variety of areas including program transformations, concurrency control, compiler optimizations, cache control, networking, and more. In these lectures, we will provide an overview of Kleene Algebra with Tests, including the syntax, semantics, and the coalgebraic theory underlying decision procedures for program equivalence. We will illustrate how it can be used as a core framework for program verification, including successful extensions and fundamental limitations.
]
#table(
columns: (1fr, auto, 1fr),
align: (auto, center, auto),
```
while a & b do
  p;
while a do
  q;
  while a & b do
    p;
```,
$eq.quest$,
```
while a do
  if b then
    p;
  else
    q;
```
)
Study equivalence of uninterpreted simple imperative programs.
- Every symbol $a in A$ is also an expression, denoting the language that contains just the one-letter word $a$.
Writing $a$ for both the letter and the expression is an abuse of notation.
- $db(e + f) = db(e) union db(f)$
- $db(#[$e ; f$]) = db(e) circle.filled db(f)$
- $forall (U, V : 2^(A^*)) U circle.filled V = { u v | u in U , v in V }$
- this is word concatenation
- $db(e^*) = union.big_(n in bb(N)) db(e)^n$
These are the denotational semantics of regular expressions.
Examples:
- $(1 + a) ; (1 + b) mapsto {epsilon , a , b, a b }$
- $(a + b)^* mapsto {a , b}^*$
- $(a^* b)^* a^* mapsto {a , b}^*$
These last two are equal $(a + b)^* equiv (a^* b)^* a^*$ because they have the same denotational semantics.
This is _denesting_.
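
The semantic clauses above can be sketched in code. This is a minimal sketch, not part of the lecture: the class names and the `max_len` cut-off are assumptions made here so that every language stays finite. It only compares the two sides of the denesting law on words up to a fixed length, which is a sanity check and not a proof.

```python
from dataclasses import dataclass

# Regular expressions as a small AST (illustrative names, single-letter symbols).
@dataclass(frozen=True)
class Zero: pass

@dataclass(frozen=True)
class One: pass

@dataclass(frozen=True)
class Sym:
    name: str

@dataclass(frozen=True)
class Plus:
    left: object
    right: object

@dataclass(frozen=True)
class Seq:
    left: object
    right: object

@dataclass(frozen=True)
class Star:
    body: object

def concat(U, V, max_len):
    """Language concatenation, keeping only words of length <= max_len."""
    return {u + v for u in U for v in V if len(u) + len(v) <= max_len}

def denote(e, max_len):
    """The words of length <= max_len in the language of e."""
    if isinstance(e, Zero):
        return set()
    if isinstance(e, One):
        return {""}
    if isinstance(e, Sym):
        return {e.name} if len(e.name) <= max_len else set()
    if isinstance(e, Plus):
        return denote(e.left, max_len) | denote(e.right, max_len)
    if isinstance(e, Seq):
        return concat(denote(e.left, max_len), denote(e.right, max_len), max_len)
    if isinstance(e, Star):
        body = denote(e.body, max_len)
        result, frontier = {""}, {""}
        while frontier:  # add body^1, body^2, ... until nothing new fits
            frontier = concat(frontier, body, max_len) - result
            result = result | frontier
        return result
    raise TypeError(f"unknown expression: {e!r}")

a, b = Sym("a"), Sym("b")
lhs = Star(Plus(a, b))                     # (a + b)^*
rhs = Seq(Star(Seq(Star(a), b)), Star(a))  # (a^* b)^* a^*
print(denote(lhs, 4) == denote(rhs, 4))
```

Raising `max_len` makes the check more thorough but the sets grow exponentially with it.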
These denesting rules can also be applied to programs such as the example at the start.
For regular expressions and extensions of regular expressions, you can come up with a finite set of axioms stated with $equiv$ (equations, together with implications between equations) that proves any valid equivalence between regular expressions.
#rect[
*Theorem (Kleene's Theorem).*
Let $L subset.eq A^*$ be a language.
Then the following are equivalent:
1. $L = db(e)$ for some regexp $e$
2. $L$ is accepted by a DFA
]
Regular languages do not include things like ${a^n b^n | n in bb(N)}$.
See: #link("https://en.wikipedia.org/wiki/Chomsky_hierarchy")[Chomsky Hierarchy of languages].
- Regular sets are built with the same operations as regular expressions (union, concatenation, and star).
The other direction of Kleene's theorem goes from automata to expressions,
$ A_e mapsto e $
mapping an automaton $A_e$ to a regular expression $e$.
Let the DFA have states $S$ and transition function $delta : S -> S^A$. This can be represented by a matrix.
The matrix is indexed on rows and columns by the states. In the cell for row $i$ and column $j$, put the sum of all letters of the alphabet that transition from state $i$ to state $j$. For example, if both $a$ and $b$ do, put $a+b$ in the cell; if no letter does, put $0$.
This allows you to do matrix operations in order to do operations on regular expressions. This shows that the transition function actually has more structure than just an arbitrary function.
Repeatedly delete states until only 2 states are left, replacing the transitions through each deleted state with regular expressions.
What does it mean to delete states?
*State elimination method.* This needs at least 3 states.
#automaton(
layout: finite.layout.snake.with(columns: 2),
(
q0: (q1:("a"), q2: "b"),
q1: (q1: "a", q0: "b"),
q2: (q1: "a", q2: "b"),
)
)
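Delete q2 by replacing every path through it ($b$ from q0 into q2, $b^*$ looping on q2, $a$ out to q1) with the single label $b b^* a$ from q0 to q1, added to the existing $a$ transition.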
#automaton(
final: "q0",
(
q0: (q1: "a + bb*a"),
q1: (q1: "a", q0: "b")
)
)
Matrix method is more robust than the state elimination method.
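
The matrix view and the elimination step can be checked against each other. The following is a rough sketch under assumptions made here, not the lecture's construction: matrix entries are finite sets of words, words longer than `MAX_LEN` are dropped, and q0 is taken as both start and final state. It compares the three-state automaton with the two-state one obtained after deleting q2.

```python
def mat_mul(X, Y, max_len):
    """Matrix product over languages: sum is union, product is concatenation.
    Words longer than max_len are dropped so every entry stays finite."""
    n = len(X)
    return [[{u + v
              for k in range(n) for u in X[i][k] for v in Y[k][j]
              if len(u) + len(v) <= max_len}
             for j in range(n)] for i in range(n)]

def accepted(M, start, final, max_len):
    """Words of length <= max_len labelling some path from start to final.
    Assumes no entry of M contains the empty word."""
    n = len(M)
    power = [[{""} if i == j else set() for j in range(n)] for i in range(n)]
    words = set()
    for _ in range(max_len + 1):  # M^0, M^1, ..., M^max_len
        words |= power[start][final]
        power = mat_mul(power, M, max_len)
    return words

MAX_LEN = 6

# Rows and columns are q0, q1, q2, as in the first automaton above.
three_states = [
    [set(), {"a"}, {"b"}],
    [{"b"}, {"a"}, set()],
    [set(), {"a"}, {"b"}],
]

# After deleting q2: q0 -> q1 is labelled a + b b^* a (truncated to MAX_LEN).
b_bstar_a = {"b" * k + "a" for k in range(1, MAX_LEN)}
two_states = [
    [set(), {"a"} | b_bstar_a],
    [{"b"}, {"a"}],
]

same = accepted(three_states, 0, 0, MAX_LEN) == accepted(two_states, 0, 0, MAX_LEN)
print(same)
```

Agreement up to a bounded length does not prove the two automata equivalent; it only gives evidence that the eliminated label was computed correctly.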
*Question.* Is it possible to write a finite number of equations to answer the question $e_1 eq.quest e_2$?
A Kleene algebra $(K, 0, 1, +, op(\;), (..)^*)$ satisfies the following:
- K is some set
- Semi-ring
- Join semi-lattice
- + is idempotent ($e + e equiv e$)
- + is commutative ($e + f equiv f + e$)
- + is associative ($(e + f) + g equiv e + (f + g)$)
- + has a 0 element ($e + 0 equiv e$)
- Monoid
- ; is associative ($(e ; f) ; g equiv e ; (f ; g)$)
- ; has a 1 element ($e ; 1 equiv e equiv 1 ; e$)
- ; has an absorbing element ($e ; 0 equiv 0 equiv 0 ; e$)
- ; distributes over +, both from the right and the left
- $e ; (f + g) equiv e ; f + e ; g$ AND $(e + f) ; g equiv e ; g + f ; g$
- \* is a fixpoint ($e^* equiv 1 + e ; e^*$)
- \* can be unfolded on the left or the right ($e^* equiv 1 + e^* ; e$)
$e^*$ is a _least_ fixpoint, stated using the order $<=$:
#rect[$e <= f$ iff $e + f equiv f$]
#tree(
axi[$e;x+f <= x$],
uni[$e^*;f <= x$]
)
This forms an #link("https://en.wikipedia.org/wiki/Axiom_schema")[axiom schema].
#rect[
Exercises:
- $x^* x^* equiv x^*$
- $x^* equiv (x^*)^*$
- $x y equiv y z arrow.r.long.double x^* y equiv y z^*$
- $(a + b)^* equiv (a^* b)^* a^*$
]
To prove an equivalence, do it in 2 steps ($<=$ and $>=$).
This structure is useful because there are many structures like this. For example, the set of all languages $(2^(A^*), emptyset, {epsilon}, union, circle.filled, (..)^*)$ is a Kleene algebra.
- Threads can be reasoned about in a partially distributive commutative lattice
#image("lec2.jpg")
#pagebreak()
== Lecture 3
- _what is the semantics of $K A(T)$_?
- Examples (such as how to tell if while loops are the same)
- NetKAT (syntax and examples)
Two types of semantics for Kleene algebra are the denotational semantics and the operational semantics.
Having a conversion between both representations lets you pick the best of both worlds.
- Denotational better for describing the gist of the language
- Operational better for implementation
=== KAT syntax
- $e ::= 0 | 1 | p in P | e + e | e ; e | star(e) | b in B$
- b is read "assert b", a kind of test
- $b ::= t in T | 0 | 1 | b or b | b and b | overline(b)$
- split alphabet $A$ into two parts, $T union.plus P$ where $T$ describes the basic tests
- this is embedded in the programs part
After an assertion, everything that follows can assume the asserted test holds.
A conjunction $alpha :equiv t_1 overline(t_2) t_3 ...$ that contains, for every basic test $t$, either $t$ or its complement $overline(t)$ is considered a _full test_ (an atom).
For example, if $T = {t_1, t_2}$, then there are 4 options for $alpha$:
- $alpha_0 :equiv t_1 t_2$
- $alpha_1 :equiv t_1 overline(t_2)$
- $alpha_2 :equiv overline(t_1) t_2$
- $alpha_3 :equiv overline(t_1) overline(t_2)$
Every atom corresponds to a subset of $T$ (the basic tests it makes true), so the set of atoms $A t$ can be identified with $2^T$. Any boolean expression denotes a set of atoms, so the semantics of booleans has type $db(b) in 2^(2^T)$.
Traces now interleave programs between atoms: where the semantics used words in $star(A)$, it now uses guarded strings in $star((A t ; P)) ; A t$. There is an option of where to refine the $P$. So now we have:
#rect[$db(e) in 2^(star((A t ; P)) ; A t)$]
Also, $db(b) = {alpha in A t | alpha <= b}$ (boolean satisfiability). For example, $t_1 t_2 <= t_1$ and $t_1 t_2 <= t_2$, but it is not the case that $t_1 t_2 <= overline(t_2)$.
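
The semantics of tests can be sketched as follows. The encoding is an assumption made for illustration: an atom is the `frozenset` of basic tests it makes true, and boolean expressions are nested tuples such as `("and", b, c)`.

```python
from itertools import combinations

def atoms(tests):
    """Every subset of the basic tests is one atom:
    the tests in the subset hold, all others are complemented."""
    return {frozenset(c)
            for r in range(len(tests) + 1)
            for c in combinations(tests, r)}

def holds(b, atom):
    """Does the atom satisfy b, i.e. is atom <= b?"""
    if b is True or b is False:  # the constants 1 and 0
        return b
    op = b[0]
    if op == "test":
        return b[1] in atom
    if op == "not":
        return not holds(b[1], atom)
    if op == "and":
        return holds(b[1], atom) and holds(b[2], atom)
    if op == "or":
        return holds(b[1], atom) or holds(b[2], atom)
    raise ValueError(f"unknown boolean expression: {b!r}")

def denote_bool(b, tests):
    """The set of atoms below b."""
    return {alpha for alpha in atoms(tests) if holds(b, alpha)}

T = ("t1", "t2")
t1, t2 = ("test", "t1"), ("test", "t2")
t1_t2 = frozenset({"t1", "t2"})             # the atom t1 t2
print(t1_t2 in denote_bool(t1, T))          # t1 t2 <= t1
print(t1_t2 in denote_bool(("not", t2), T)) # t1 t2 <= not t2 ?
```

Enumerating all atoms is exponential in the number of basic tests, so this is only practical for small $T$.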
Also $db(p) = {alpha p beta | alpha, beta in A t}$. With $p$ uninterpreted, we only know that $p$ can transform any $alpha$ into any $beta$.
- $db(e + f) equiv db(e) union db(f)$
- $db(e \; f) equiv db(e) diamond.small db(f)$
- $alpha_0 p_0 ... alpha_k diamond.small beta_0 q_0 ... beta_m$ is defined only when the atoms $alpha_k$ and $beta_0$ match.
Otherwise the program cannot continue, and the operation is undefined for that pair.
When they match, the repeated atom is kept only once: $alpha_0 p_0 ... alpha_k q_0 ... beta_m$.
(Because $alpha$ and $beta$ are atoms, they are not expressions that can contain other atoms, so the match is exact equality.)
- Star is the same as before, but with powers taken using the diamond: $db(star(e)) = union.big_(n in bb(N)) db(e)^n$
=== KAT Example
For $ifthenelse(b,p,q)$ which is $b;p+overline(b);q$:
$&db(ifthenelse(b,p,q)) \
equiv &db(b\;p) union db(overline(b)\;q) \
equiv & db(b) diamond.small db(p) union ... \
equiv & {alpha | alpha <= b } diamond.small {alpha p beta | alpha, beta in A t} union ... \
equiv & {alpha p beta | alpha <= b} union {alpha q beta | alpha <= overline(b)}
$
The semantics of this expression has 2 types of traces.
One starts in an atom satisfying $b$, and $p$ is executed.
The other starts in an atom where the condition $b$ is false, so the $q$ branch is executed.
For $"while" b "do" p$, which is $star((b ; p)) ; overline(b)$:
$&db("while" b "do" p) \
equiv & { beta , alpha_0 p beta , alpha_0 p alpha_1 p beta , ... | alpha_i <= b, beta <= overline(b)}$
Programs that never terminate have empty semantics: $"while true do skip" equiv star((1 ; 1)) ; overline(1) equiv star(1) ; 0 equiv 0$. (There are alternate semantics where you record infinite traces.) With non-termination, your observation power is 0.
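
The guarded-string semantics of the conditional and the loop can be sketched as below. The representation is an assumption made here: a guarded string is a tuple alternating atoms and action names, atoms are `frozenset`s of the tests that hold, and star is cut off at `max_actions` actions to keep the sets finite.

```python
from itertools import combinations

def atoms(tests):
    """Atoms as frozensets: the tests in the set hold, the others fail."""
    return {frozenset(c)
            for r in range(len(tests) + 1)
            for c in combinations(tests, r)}

def fuse(U, V):
    """Fusion product: x.alpha fused with alpha.y gives x.alpha.y;
    pairs whose atoms do not match contribute nothing."""
    return {u + v[1:] for u in U for v in V if u[-1] == v[0]}

def sem_test(pred, At):
    """Semantics of a test, given as a predicate on atoms."""
    return {(alpha,) for alpha in At if pred(alpha)}

def sem_action(p, At):
    """An uninterpreted action can go from any atom to any atom."""
    return {(alpha, p, beta) for alpha in At for beta in At}

def sem_star(U, At, max_actions):
    """Iterated fusion, truncated to strings with <= max_actions actions."""
    result = {(alpha,) for alpha in At}  # the semantics of 1
    frontier = result
    while frontier:
        step = {g for g in fuse(frontier, U) if len(g) // 2 <= max_actions}
        frontier = step - result
        result = result | frontier
    return result

At = atoms(("b",))
b = sem_test(lambda alpha: "b" in alpha, At)
not_b = sem_test(lambda alpha: "b" not in alpha, At)
one = sem_test(lambda alpha: True, At)
zero = set()
p, q = sem_action("p", At), sem_action("q", At)

if_then_else = fuse(b, p) | fuse(not_b, q)              # b;p + not b;q
while_loop = fuse(sem_star(fuse(b, p), At, 3), not_b)   # (b;p)^* ; not b
diverge = fuse(sem_star(fuse(one, one), At, 3), zero)   # while true do skip
print(len(if_then_else), len(while_loop), len(diverge))
```

The loop traces come out as one string per number of iterations, each ending in the atom where $b$ fails, and the diverging loop has no traces at all.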
=== Connection to Hoare triples
The validity of the Hoare triple ${b} C {c}$ is the same thing as the KAT equation $b C overline(c) equiv 0$: no execution of $C$ that starts in a state satisfying $b$ ends in a state where $c$ is false.
An alternative encoding is $b C <= C c$, which is equivalent to the previous one. #TODO Show this
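
A sketch of why the two encodings agree (this derivation is not from the lecture). It uses only the Boolean facts $c + overline(c) equiv 1$, $c overline(c) equiv 0$ and $b <= 1$, together with distributivity and monotonicity of $;$.

- ($arrow.r.double$) Assume $b C overline(c) equiv 0$. Then
  $ b C equiv b C (c + overline(c)) equiv b C c + b C overline(c) equiv b C c <= C c $
- ($arrow.l.double$) Assume $b C <= C c$. Then
  $ b C overline(c) <= C c overline(c) equiv C 0 equiv 0 $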
=== Exercise solutions
$star(x + y) equiv star((star(x) y)) star(x)$
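Prove it in two steps, $<=$ and $>=$:
- For $<=$, use the fixpoint rule with $e := x + y$ and $f := 1$: writing $z := star((star(x) y)) star(x)$, it suffices to show $(x + y) ; z + 1 <= z$. The summand $1 <= z$ holds because $1 <= star((star(x) y))$ and $1 <= star(x)$.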
- Soundness + Completeness (KA + BA) are all you need to prove that 2 programs are equivalent
- Automaton model is where transitions are (atoms ; programs) and the final state is what atoms to validate. This is known as a KAT automaton.
- Equivalence of these automata is decidable, using BDDs (Pous) or matrices (Kozen), and these procedures are all in PSPACE.
- Some of these algorithms are very efficient in practice despite the PSPACE bound, and they use the union-find data structure (Hopcroft, Tarjan)
- $O(n alpha(n))$ where $alpha$ is the inverse of the Ackermann function, so the $alpha(n)$ factor is _very_ small (in fact the original conjecture was that it was linear)
- Improvement: "coinduction up-to" on NFA, DFA, and Brzozowski derivatives can improve the theoretical limit (?)
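
The union-find idea can be sketched on plain DFAs. This is a simplified illustration, not the KAT-automaton algorithm from the lecture: the `(delta, final, start)` tuple encoding and the two example automata are assumptions, and the union-find uses path halving only, without union by rank.

```python
def equivalent(dfa1, dfa2, alphabet):
    """Hopcroft-Karp style language equivalence of two complete DFAs.
    Each DFA is (delta, final, start), delta mapping (state, letter) -> state."""
    automata = {1: dfa1, 2: dfa2}
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def is_final(x):
        tag, state = x
        return state in automata[tag][1]

    def step(x, letter):
        tag, state = x
        return (tag, automata[tag][0][(state, letter)])

    # States are tagged with their automaton so names cannot clash.
    todo = [((1, dfa1[2]), (2, dfa2[2]))]
    while todo:
        x, y = todo.pop()
        rx, ry = find(x), find(y)
        if rx == ry:
            continue  # already assumed equivalent
        if is_final(x) != is_final(y):
            return False  # some word separates the start states
        parent[rx] = ry  # merge the classes, then check the successors
        todo.extend((step(x, a), step(y, a)) for a in alphabet)
    return True

# Two DFAs for "all words over {a, b}": one state, and two states
# that remember the last letter read (both accepting).
all_words = ({(0, "a"): 0, (0, "b"): 0}, {0}, 0)
last_letter = ({(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}, {0, 1}, 0)
print(equivalent(all_words, last_letter, "ab"))
```

Each merge reduces the number of classes by one, so at most as many pairs as there are states get expanded; the rest of the cost is the union-find lookups.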
=== While loop example
#table(
columns: (1fr, 1fr),
```
while b do
  p;
  while c do
    q;
```,
```
if b then
  p;
  while b or c do
    if c then
      q;
    else
      p;
```,
$star((b p star((c q)) overline(c))) overline(b)$,
$b ; p ; star(((b + c) (c q + overline(c) p))) overline(b) overline(c) + overline(b)$
)
Show these are the same by applying denesting and sliding ($star((x y)) x equiv x star((y x))$).
NetKAT is a special KAT. Think of networks as boxes: the only thing they can do is get packets in, change something about the packet, and then send the packet on.
Packets are records of fields $f_1, ..., f_k$ mapping to values $v_1, ..., v_k$. Program actions ($p = f arrow.l n$) and tests ($f = n$) come in pairs.
Reachability: either A and B are directly connected by the topology, or a packet takes several switch and topology steps to get from A to B.
$ "sw" = A ; "top" ; ("switch" ; "top")^* ; "sw" = B $
If this expression is $equiv 0$, then I know that A is not connected to B.
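
The reachability query can be sketched with a toy relational model. Everything here is an assumption made for illustration, not NetKAT tooling: packets are `(switch, port)` pairs, policies are relations on packets, the network is a hypothetical line A, C, B with an isolated switch D, and every switch floods to all of its ports.

```python
def compose(R, S):
    """Sequential composition R ; S of two relations on packets."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def star(R, packets):
    """Reflexive-transitive closure of R: zero or more steps."""
    result = {(x, x) for x in packets}
    frontier = result
    while frontier:
        frontier = compose(frontier, R) - result
        result = result | frontier
    return result

SWITCHES = "ABCD"
PORTS = (1, 2)
packets = {(sw, pt) for sw in SWITCHES for pt in PORTS}

# Hypothetical topology: links A:2 <-> C:1 and C:2 <-> B:1. D has no links.
top = {(("A", 2), ("C", 1)), (("C", 1), ("A", 2)),
       (("C", 2), ("B", 1)), (("B", 1), ("C", 2))}

# Hypothetical switch policy: rewrite the port field to any port (flooding).
switch = {((sw, pt), (sw, out)) for sw in SWITCHES for pt in PORTS for out in PORTS}

def at(sw):
    """The test sw = X, as a relation that only keeps packets located at X."""
    return {(x, x) for x in packets if x[0] == sw}

def reach(src, dst):
    """sw = src ; top ; (switch ; top)^* ; sw = dst, as a relation."""
    hops = star(compose(switch, top), packets)
    return compose(compose(compose(at(src), top), hops), at(dst))

print(bool(reach("A", "B")), bool(reach("A", "D")))
```

An empty relation plays the role of $0$ here, so `reach("A", "D")` being empty corresponds to the expression being $equiv 0$.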
==== Forwarding Loop
This is when a packet goes in a cycle between nodes. Normally this is stopped with a TTL (time to live) field, which is decremented at every hop; when it reaches 0 the packet is dropped.
Instead, with NetKAT you can check if you have a forwarding loop.