Proofreading pass through Chapter 2

This commit is contained in:
Adam Chlipala 2016-01-31 22:19:34 -05:00
parent f39f2ab0d3
commit 48c8906d10

View file

@ -118,6 +118,8 @@ When we encounter a new challenge, to prove a new kind of property about a new k
We can even alternate between strategies, breaking a system into parts, abstracting one as a simpler part, further decomposing that part into pieces, and so on. We can even alternate between strategies, breaking a system into parts, abstracting one as a simpler part, further decomposing that part into pieces, and so on.
\end{itemize} \end{itemize}
\newcommand{\encoding}[0]{\marginpar{\fbox{\textbf{Encoding}}}}
In the course of the book, we will never quite define any of these meta-techniques in complete formality. In the course of the book, we will never quite define any of these meta-techniques in complete formality.
Instead, we'll meet many examples of each, called out by eye-catching margin notes. Instead, we'll meet many examples of each, called out by eye-catching margin notes.
Generalizing from the examples should help the reader start developing an intuition for when to use each element and for the common design patterns that apply. Generalizing from the examples should help the reader start developing an intuition for when to use each element and for the common design patterns that apply.
@ -166,6 +168,7 @@ $$\begin{array}{l}
Rather than appeal to our intuition about grade-school arithmetic, we prefer to formalize concrete syntax with a \emph{grammar}\index{grammar}, following a style known as \emph{Backus-Naur Form (BNF)}\index{Backus-Naur Form}\index{BNF}. Rather than appeal to our intuition about grade-school arithmetic, we prefer to formalize concrete syntax with a \emph{grammar}\index{grammar}, following a style known as \emph{Backus-Naur Form (BNF)}\index{Backus-Naur Form}\index{BNF}.
We have a set of \emph{nonterminals}\index{nonterminal} (e.g., $e$ below), standing for sets of allowable strings. We have a set of \emph{nonterminals}\index{nonterminal} (e.g., $e$ below), standing for sets of allowable strings.
Some are defined by appeal to existing sets, as below, when we define constants $n$ in terms of the well-known set $\mathbb N$\index{N@$\mathbb N$} of natural numbers\index{natural numbers} (nonnegative integers). Some are defined by appeal to existing sets, as below, when we define constants $n$ in terms of the well-known set $\mathbb N$\index{N@$\mathbb N$} of natural numbers\index{natural numbers} (nonnegative integers).
\encoding
$$\begin{array}{rrcl} $$\begin{array}{rrcl}
\textrm{Constants} & n &\in& \mathbb N \\ \textrm{Constants} & n &\in& \mathbb N \\
\textrm{Variables} & x &\in& \mathsf{Strings} \\ \textrm{Variables} & x &\in& \mathsf{Strings} \\
@ -185,6 +188,7 @@ A more general notation for inductive definitions provides a series of \emph{inf
Formally, the set is defined to be \emph{the smallest one that satisfies all the rules}. Formally, the set is defined to be \emph{the smallest one that satisfies all the rules}.
Each rule has \emph{premises}\index{premise} and a \emph{conclusion}\index{conclusion}. Each rule has \emph{premises}\index{premise} and a \emph{conclusion}\index{conclusion}.
We illustrate with four rules that together are equivalent to the BNF grammar above, for defining a set $\mathsf{Exp}$ of expressions. We illustrate with four rules that together are equivalent to the BNF grammar above, for defining a set $\mathsf{Exp}$ of expressions.
\encoding
$$\infer{n \in \mathsf{Exp}}{ $$\infer{n \in \mathsf{Exp}}{
n \in \mathbb N n \in \mathbb N
} }
@ -211,6 +215,7 @@ After that brief interlude with concrete syntax, we now drop all formal treatmen
Instead, we concern ourselves with \emph{abstract syntax}\index{abstract syntax}, the real heart of language definitions. Instead, we concern ourselves with \emph{abstract syntax}\index{abstract syntax}, the real heart of language definitions.
Now programs are \emph{abstract syntax trees}\index{abstract syntax tree} (\emph{ASTs}\index{AST}), corresponding to inductive type definitions in Coq or algebraic datatype\index{algebraic datatype} definitions in Haskell\index{Haskell}. Now programs are \emph{abstract syntax trees}\index{abstract syntax tree} (\emph{ASTs}\index{AST}), corresponding to inductive type definitions in Coq or algebraic datatype\index{algebraic datatype} definitions in Haskell\index{Haskell}.
Such types can be defined by enumerating their \emph{constructor}\index{constructor} functions with types. Such types can be defined by enumerating their \emph{constructor}\index{constructor} functions with types.
\encoding
\begin{eqnarray*} \begin{eqnarray*}
\mathsf{Const} &:& \mathbb{N} \to \mathsf{Exp} \\ \mathsf{Const} &:& \mathbb{N} \to \mathsf{Exp} \\
\mathsf{Var} &:& \mathsf{Strings} \to \mathsf{Exp} \\ \mathsf{Var} &:& \mathsf{Strings} \to \mathsf{Exp} \\
@ -220,6 +225,7 @@ Such types can be defined by enumerating their \emph{constructor}\index{construc
Such a list of constructors defines the set $\mathsf{Exp}$ to contain exactly those terms that can be built up with the constructors. Such a list of constructors defines the set $\mathsf{Exp}$ to contain exactly those terms that can be built up with the constructors.
In inference-rule notation: In inference-rule notation:
\encoding
$$\infer{\mathsf{Const}(n) \in \mathsf{Exp}}{ $$\infer{\mathsf{Const}(n) \in \mathsf{Exp}}{
n \in \mathbb N n \in \mathbb N
} }
@ -245,8 +251,8 @@ Here is one in the clausal\index{clausal function definition} style of Haskell\i
\begin{eqnarray*} \begin{eqnarray*}
\mathsf{size}(\mathsf{Const}(n)) &=& 1 \\ \mathsf{size}(\mathsf{Const}(n)) &=& 1 \\
\mathsf{size}(\mathsf{Var}(x)) &=& 1 \\ \mathsf{size}(\mathsf{Var}(x)) &=& 1 \\
\mathsf{size}(\mathsf{Plus}(e_1, e_2)) &=& \mathsf{size}(e_1) + \mathsf{size}(e_2) \\ \mathsf{size}(\mathsf{Plus}(e_1, e_2)) &=& 1 + \mathsf{size}(e_1) + \mathsf{size}(e_2) \\
\mathsf{size}(\mathsf{Times}(e_1, e_2)) &=& \mathsf{size}(e_1) + \mathsf{size}(e_2) \mathsf{size}(\mathsf{Times}(e_1, e_2)) &=& 1 + \mathsf{size}(e_1) + \mathsf{size}(e_2)
\end{eqnarray*} \end{eqnarray*}
It is important that we include \emph{one clause per constructor of the inductive type}. It is important that we include \emph{one clause per constructor of the inductive type}.
@ -334,7 +340,7 @@ For that reason, we stick to machine-checked proofs here, using the book chapter
We do, however, need to get all the proof details filled in somehow. We do, however, need to get all the proof details filled in somehow.
One of the most convenient cases is when a proof goal fits into some \emph{decidable theory}\index{decidable theory}. One of the most convenient cases is when a proof goal fits into some \emph{decidable theory}\index{decidable theory}.
We follow the sense from computability theory\index{computability theory}, where we consider some \emph{decision problem}\index{decision problem}, as a (usually infinite) set $F$ of formulas and some subset $T \subseteq F$ of \emph{true} formulas, possibly considering only those provable using some limited set of inference rules. We follow the sense from computability theory\index{computability theory}, where we consider some \emph{decision problem}\index{decision problem}, as a (usually infinite) set $F$ of formulas and some subset $T \subseteq F$ of \emph{true} formulas, possibly considering only those provable using some limited set of inference rules.
The decision problem is \emph{decidable} if and only if there exists some always terminating program that, when passed some $f \in F$ as input, returns ``true'' if and only if $f \in T$. The decision problem is \emph{decidable} if and only if there exists some always-terminating program that, when passed some $f \in F$ as input, returns ``true'' if and only if $f \in T$.
Decidability of theories is handy because, whenever our goal belongs to the $F$ set of a decidable theory, we can discharge the goal automatically by running the deciding program that must exist. Decidability of theories is handy because, whenever our goal belongs to the $F$ set of a decidable theory, we can discharge the goal automatically by running the deciding program that must exist.
One common decidable theory is \emph{linear arithmetic}\index{linear arithmetic}, whose $F$ set is generated by the following grammar as $\phi$. One common decidable theory is \emph{linear arithmetic}\index{linear arithmetic}, whose $F$ set is generated by the following grammar as $\phi$.
@ -348,9 +354,9 @@ $$\begin{array}{rrcl}
The arithmetic terms used here are \emph{linear} in the same sense as \emph{linear algebra}\index{linear algebra}: we never multiply together two terms containing variables. The arithmetic terms used here are \emph{linear} in the same sense as \emph{linear algebra}\index{linear algebra}: we never multiply together two terms containing variables.
Actually, multiplication is prohibited outright, but we allow multiplication by a constant as an abbreviation (logically speaking) for repeated addition. Actually, multiplication is prohibited outright, but we allow multiplication by a constant as an abbreviation (logically speaking) for repeated addition.
Propositions are formed out of equality and less-than tests on terms, and we also have the Boolean negation (``not'') operator $\neg$ and conjunction (``and'') operator $\land$. Propositions are formed out of equality and less-than tests on terms, and we also have the Boolean negation (``not'') operator $\neg$ and conjunction (``and'') operator $\land$.
This set of propositional operators is enough to encode the other usual inequality and propositional operators, so we allow them, too, as convenient shorthands. This set of propositional\index{propositional logic} operators is enough to encode the other usual inequality and propositional operators, so we allow them, too, as convenient shorthands.
Applying decidable theories in a proof assistant like Coq, it is important to understand how a theory may apply to formulas that don't actually satisfy its grammar literally. Using decidable theories in a proof assistant like Coq, it is important to understand how a theory may apply to formulas that don't actually satisfy its grammar literally.
For instance, we may want to prove $f(x) - f(x) = 0$, for some fancy function $f$ well outside the grammar above. For instance, we may want to prove $f(x) - f(x) = 0$, for some fancy function $f$ well outside the grammar above.
However, we only need to introduce a new variable $y$, defined with the equation $y = f(x)$, to arrive at a new goal $y - y = 0$. However, we only need to introduce a new variable $y$, defined with the equation $y = f(x)$, to arrive at a new goal $y - y = 0$.
A linear-arithmetic procedure makes short work of this goal, and we may then derive the original goal by substituting back in for $y$. A linear-arithmetic procedure makes short work of this goal, and we may then derive the original goal by substituting back in for $y$.
@ -376,14 +382,16 @@ $$\infer[\mathsf{Reflexivity}]{e = e}{}
e_1 = e_3 e_1 = e_3
& e_3 = e_2 & e_3 = e_2
}$$ }$$
$$\infer[\mathsf{Congruence}]{f_1(e_1) = f_2(e_2)}{ $$\infer[\mathsf{Congruence}]{f(e_1, \ldots, e_n) = f'(e'_1, \ldots, e'_n)}{
f_1 = f_2 f = f'
& e_1 = e_2 & e_1 = e'_1
& \ldots
& e_n = e'_n
}$$ }$$
\medskip \medskip
As one more example of a decidable theory, consider the algebraic structure of \emph{semirings}\index{semirings}, which may profitably be remembered as ``things that act like natural numbers.'' As one more example of a decidable theory, consider the algebraic structure of \emph{semirings}\index{semirings}, which may profitably be remembered as ``types that act like natural numbers.''
A semiring is any set containing two elements notated 0 and 1, closed under two binary operators notated $+$ and $\times$. A semiring is any set containing two elements notated 0 and 1, closed under two binary operators notated $+$ and $\times$.
The notations are suggestive, but in fact we have free reign in choosing the set, elements, and operators, so long as the following axioms\footnote{The equations are taken almost literally from \url{https://en.wikipedia.org/wiki/Semiring}.} are satisfied: The notations are suggestive, but in fact we have free reign in choosing the set, elements, and operators, so long as the following axioms\footnote{The equations are taken almost literally from \url{https://en.wikipedia.org/wiki/Semiring}.} are satisfied:
\begin{eqnarray*} \begin{eqnarray*}
@ -400,10 +408,9 @@ The notations are suggestive, but in fact we have free reign in choosing the set
a \times 0 &=& 0 a \times 0 &=& 0
\end{eqnarray*} \end{eqnarray*}
The formal theory is then as follows. The formal theory is then as follows, where we consider as ``true'' only those equalities that follow from the axioms.
$$\begin{array}{rrcl} $$\begin{array}{rrcl}
\textrm{Variables} & x &\in& \mathsf{Strings} \\ \textrm{Variables} & x &\in& \mathsf{Strings} \\
\textrm{Functions} & f &\in& \mathsf{Strings} \\
\textrm{Terms} & e &::=& x \mid e + e \mid e \times e \\ \textrm{Terms} & e &::=& x \mid e + e \mid e \times e \\
\textrm{Propositions} & \phi &::=& e = e \textrm{Propositions} & \phi &::=& e = e
\end{array}$$ \end{array}$$
@ -414,7 +421,7 @@ For instance, by the semiring theory, we can prove $x \times y = y \times x$, wh
\section{Simplification and Rewriting} \section{Simplification and Rewriting}
While we leave most proof details to the accompanying Coq code, it does seem important to introduce two important principles that are often implicit in proofs on paper. While we leave most proof details to the accompanying Coq code, it does seem important to introduce two key principles that are often implicit in proofs on paper.
The first is \emph{algebraic simplification}\index{algebraic simplification}, where we apply the defining equations of a recursive definition to simplify a goal. The first is \emph{algebraic simplification}\index{algebraic simplification}, where we apply the defining equations of a recursive definition to simplify a goal.
For example, recall that our definition of expression size included this clause. For example, recall that our definition of expression size included this clause.
@ -424,7 +431,7 @@ For example, recall that our definition of expression size included this clause.
Now imagine that we are trying to prove this formula. Now imagine that we are trying to prove this formula.
$$\size{\mathsf{Plus}(e, \mathsf{Const}(7))} = 8 + \size{e}$$ $$\size{\mathsf{Plus}(e, \mathsf{Const}(7))} = 8 + \size{e}$$
We may apply the defining equation to rewrite into a different formula, where we have essential pushed the definition of $\size{\cdot}$ through the $\mathsf{Plus}$. We may apply the defining equation to rewrite into a different formula, where we have essential pushed the definition of $\size{\cdot}$ through the $\mathsf{Plus}$.
$$1 + \size{e} + \size{\mathsf{Const}(7)} = 7 + \size{e}$$ $$1 + \size{e} + \size{\mathsf{Const}(7)} = 8 + \size{e}$$
Another application of a different defining equation, this time for $\mathsf{Const}$, takes us to here. Another application of a different defining equation, this time for $\mathsf{Const}$, takes us to here.
$$1 + \size{e} + 7 = 8 + \size{e}$$ $$1 + \size{e} + 7 = 8 + \size{e}$$
From here, the goal follows by linear arithmetic. From here, the goal follows by linear arithmetic.
@ -443,6 +450,7 @@ Here, unification found the assignment $e = \mathsf{Var}(x)$.
\medskip \medskip
\encoding
We close the chapter with an important note on terminology. We close the chapter with an important note on terminology.
A formula like $P(\size{\mathsf{Plus}(\mathsf{Var}(x), \mathsf{Const}(7))})$ combines several levels of notation. A formula like $P(\size{\mathsf{Plus}(\mathsf{Var}(x), \mathsf{Const}(7))})$ combines several levels of notation.
We consider that we are doing our mathematical reasoning in some \emph{metalanguage}\index{metalanguage}, which is often applicable to a wide variety of proof tasks. We consider that we are doing our mathematical reasoning in some \emph{metalanguage}\index{metalanguage}, which is often applicable to a wide variety of proof tasks.