Library FingerTree.DependentFingerTreeModule
We shall now define the Finger Trees over some monoid and measure.
We will see in section how particular instantiations of these
will give different structures.
Ltac program_simpl ::= program_simplify ; auto with ×.
Module DependentFingerTree (M : Monoid).
Import M.
Module Digits := DigitMeasure M.
Import Digits.
Hint Unfold digit_measure option_digit_measure option_measure.
Nodes
Section Nodes.
Variables (A : Type) (measure : A → v).
Notation " 'lparr' x 'rparr' " := (measure x).
Intuitively, the measure represents an
abstraction of the sequence of objects in the node (later, in the tree) considered from left to right.
Taking the list monoid previously defined and the singleton measure gives
the most concrete abstraction of this sequence of elements: the sequence itself.
The measure function is denoted by in the following, when possible
(notations local to a function are not supported). We now define 2-3 nodes with cached measures:
Inductive node : Type :=
| Node2 : ∀ x y, { s : v | s = lparr x rparr cdot lparr y rparr } → node
| Node3 : ∀ x y z, { s : v | s = lparr x rparr cdot lparr y rparr cdot lparr z rparr } → node.
We use a subset type here to specify the invariant on the cached value. This invariant
could not be checked using simple types, because it is dependent
on the values carried by the node. Moreover, it uses the measure
as a so it would require an additional abstraction mechanism like type
classes or functors to achieve it in a regular ML-based functional language.
A regular dependent product is used here instead.
We have smart constructors node2 and node3 which compute the measure, e.g:
The generated obligations are trivially true as they are of the form .
Correspondingly, node_measure pulls the cached measure.
Program Definition node_measure (n : node) : v :=
match n with Node2 _ _ s ⇒ s | Node3 _ _ _ s ⇒ s end.
End Nodes.
Although it may seem that the node_measure function is independent of the measure function,
it is not. Witness its type after we closed the section:
The node datatype itself is parameterized by the measure function, hence all operations on
nodes take it as an implicit parameter.
Had we not added the invariant, we would not need this parameter,
but we could not prove much about node measures either, because they might be
any element of type v. We could define an inductive invariant on nodes asserting
that their measures were built using the measure function coherently, but we would need to show
that this invariant is preserved in each of the functions, while in our case we get it
for free by typing.
The fingertree datatype
- Empty builds the empty tree of measure ;
- Single x builds the tree with sole element x of measure ;
- Deep pr ms m sf builds the tree of prefix pr, subtree m of measure ms and suffix sf.
This inductive family is indexed by values of type v.
Why we need a dependent type follows from a simple observation. If we want
the cached measure on the Deep node and the invariant that the measure is
really the one of the middle tree, we must have a way to refer
to the measure of the middle tree, but we are actually defining trees,
so we cannot recurse on them at this time.
Note also that we need recursion to go into the middle subtrees of Deep nodes.
Adding elements
We can add an element a to the left of a tree t of measure s to get a tree with measure . Due to polymorphic recursion, our functions will now always have A and measure arguments as they are arguments which will change during recursive calls. If t is empty, a singleton or a tree with a left prefix which is not full, then we simply put a at the leftmost position of the tree. Otherwise, we need to reorganize the tree to make space at the left for a, and this requires recursing polymorphically to add a node a to the left of the middle subtree in the Deep node.Program Fixpoint add_left (A : Type) (measure : A → v)
(a : A) (s : v) (t : fingertree measure s) {struct t} :
fingertree measure (measure a cdot s) :=
match t in fingertree _ s
return @fingertree A measure (measure a cdot s) with
| Empty ⇒ Single a
| Single b ⇒ Deep (One a) Empty (One b)
| Deep pr st' t' sf ⇒
match pr with
| Four b c d e ⇒
let sub := add_left (node3 measure c d e) t' in
Deep (Two a b) sub sf
| x ⇒ Deep (add_digit_left a pr) t' sf
end
end.
The first match expression uses dependent elimination. Its meaning is that
from a particular fingertree of measure s each branch will build a fingertree of measure measure a cdot s
where s has been substituted by the measure of the matched pattern.
For example in the first branch we must build an object of type
fingertree measure (measure a cdot epsilon).
However, the right hand side Single a has type
fingertree measure (measure a), hence we must use the inductive equivalence rule ,
which generates an obligation ,
easily solved using the right identity of the monoid. The in and return clauses
are mandatory in Coq due to undecidability of higher-order unification, but they can be
dropped in Russell, in which case substitution is replaced by equational reasoning.
Had we dropped the clauses, we would have the equation s = epsilon in the context of the first branch hence
the generated obligation .
This one would be solved by first substituting s by in the goal then
applying the right identity ; we just postponed the substitution.
The nested match expression uses an alias x for capturing prefixes which are not Four and applies the partial function
add_digit_left defined earlier. There is an application of the subset equivalence rule here, which
generates an obligation to show that pr is not full. It can be solved because we have
in the context.
It is an essential property of this function that the measure is preserved. To see that, let us instantiate
the Finger Tree by the list monoid and measure defined earlier.
You can check here that adding to the left is prepending the measure of the element to
the measure of the Finger Tree, hence consing to the sequence of elements of the tree with
the list monoid interpretation.
For each of the following operations, this correspondence with
the list instanciation of the measure will hold.
The view from the left... of a fingertree
We will now construct views on the Finger Trees which decompose a tree into its leftmost element and a remaining Finger Tree (View_L) (or dually, the rightmost one and the "preceding" Finger Tree). It serves to abstract from the implementation and give a list-like view of fingertree, which is used to write functions which recurse on fingertrees without having to deal with the intricacies of the structure (c.f. figure ). The View_L inductive is a little less polymorphic, as it does not need carry the measure function which views ignore. However View_L still stores the measure s of the rest of the tree in the cons_L constructor (s will be implicit), hence we need to abstract by the sequence's type seq which is indexed by v. It will be instanciated by fingertree measure.Inductive View_L (A : Type) (seq : v → Type) : Type :=
| nil_L : View_L A seq
| cons_L : A → ∀ s, seq s → View_L A seq.
Such a view will be constructed by the view_L function, by structural (polymorphic)
recursion on the fingertree. We can seamlessly use the digit_tail function which is partial (it accepts
only digits which are not One) and need not add any type annotations when calling view_L recursively on t'.
Note that we use a partial type application (fingertree measure) in the return type, which is perfectly legal.
Program Fixpoint view_L A (measure : A → v)
(s : v) (t : fingertree measure s) {struct t} :
View_L A (fingertree measure) :=
match t with
| Empty ⇒ nil_L
| Single x ⇒ cons_L x Empty
| Deep pr st' t' sf ⇒
match pr with
| One x ⇒
match view_L t' with
| nil_L ⇒ cons_L x (digit_to_tree _ sf)
| cons_L a st' t' ⇒
cons_L x (Deep (node_to_digit a) t' sf)
end
| y ⇒ cons_L (digit_head y) (Deep (digit_tail y) t' sf)
end
end.
We can show that view_L preserves the measure of the input tree;
had we indexed View_L by the measure of the input tree,
these generation lemmas would have appeared as obligations.
Lemma view_L_nil : ∀ A (measure : A → v) s
(t : fingertree measure s), view_L t = nil_L → s = epsilon.
Lemma view_L_cons : ∀ A (measure : A → v) s
(t : fingertree measure s) x st' t',
view_L t = cons_L x (s:=st') t' → s = measure x cdot st'.
Dependence hell
Our views are useful to give an high-level interfaces to Finger Trees, but in their current state they are very limited as we can write only non-recursive definitions on views. That is, we will not be able to convince Coq that functions defined by recursion on the view of a tree are just as valid as those defined by recursion on the tree itself. To do that, we must have a measure on our Finger Trees, e.g. their number of elements (named tree_size), from which we can trivially build a measure on the views View_L_size. Then it is only a matter of showing that for all t, view_L t returns a view of size tree_size t to prove that a recursive call on the tail of a view is correct (c.f. ).Lemma view_L_nil_JMeq_Empty : ∀ A (measure : A → v) s
(t : fingertree measure s), view_L t = nil_L →
JMeq t (Empty (measure:=measure)).
Now that we have shown the equality for a general s index, we can instantiate it
with a particular one, i.e. . Given that is now of measure ,
we can use the regular equality with Empty.
Lemma view_L_nil_Empty : ∀ A (measure : A → v)
(t : fingertree measure epsilon), view_L t = nil_L → t = Empty.
Proof.
intros ; apply JMeq_eq.
apply (view_L_nil_JMeq_Empty _ H).
Qed.
These technical lemmas about nil_L and Empty will help us
build our measure. Indeed, they are needed to prove the following:
Lemma view_L_size : ∀ A (measure : A → v) (s : v)
(t : fingertree measure s), View_L_size (view_L t) = tree_size t.
Proof.
intros.
unfold View_L_size, tree_size ; apply view_L_size_gen.
Qed.
This gives us a decreasing measure on view_L results. We will use it when building
the instances.
Lemma view_L_size_measure : ∀ A (measure : A → v) (s : v)
(t : fingertree measure s) x st' (t' : fingertree measure st'),
view_L t = cons_L x t' → tree_size t' < tree_size t.
We can also define deep_L, the smart constructor
which rearranges a tree given a potentially empty digit
for the prefix. It is a dependent version of the internal function for the Deep case of view_L,
which is used when splitting Finger Trees.
The code has not changed at all from the non-dependent version hence we do not show it.
However, the specification would greatly benefit from an overloading mechanism to factor the different measure specializations.
Back to normal
We are now ready to implement regular deque viewing operations on our Finger Trees. We do not need recursion on the trees here, so we can define the operations for a fixed element type, measure function and measure index.
We define an isEmpty predicate for some operations will be partial,
i.e. defined only for non-empty fingertree objects. We do not match directly on the
object to maintain the abstraction from the implementation.
Emptiness is decidable.
The obvious operations are definable, we show the liat operation dual to tail.
Here we return the index along with the fingertree in a dependent pair of type
{s : v & fingertree measure s}, which corresponds
more closely to the view that the cached measure is really data, not only an index used to refine a datatype.
The pairing of a tree t with its measure m is denoted m :| t and reads
"m measures t".
Program Definition liat (t : fingertree measure s |
¬ isEmpty t) : { s' : v & fingertree measure s' } :=
match view_R t with
| nil_R ⇒ ! | cons_R st' t' last ⇒ st' :| t'
end.
End View.
Catenation & splitting, dependently
We can also define catenation using a dependent type which clearly states the function's specification. The implementation is the same as the one of Hinze & Paterson, except we proved the 100 obligations generated by concerning measures. The five mutually recursive functions hidden here which define app have the particularity of being quite big because they implement a specialization of concatenation for each combination of the digits at the right of the left tree and the left of the right tree. Hinze & Paterson even recommend to produce the 200 lines long definition automatically to prevent mistakes. Here the definition is checked directly: the type itself expresses that we did not mess up the obvious property of app on measures.Definition app : ∀ (A : Type) (measure : A → v)
(xs : v) (x : fingertree measure xs)
(ys : v) (y : fingertree measure ys),
fingertree measure (xs cdot ys).
Notation " x +:+ y " := (app x y) (at level 20).
The last and most interesting operation from the specifier's point of view, is
splitting a tree by a predicate on its measure. We begin by splitting a node.
We split a node n by finding where some predicate p on measures turns to true,
starting with an initial measure i and accumulating the measures of subtrees from left to right.
We simply copy the invariant given by here, there is nothing to change,
we only add the corresponding property on the measure which clearly generalizes
the to_list equation. The code is also a direct translation from Haskell to Russell except
we use a regular tuple instead of a custom split datatype.
We have also defined a split_digit function with the same specification.
The interesting case is for trees, as usual. Instead of returning a tuple,
we use a dependent tree_split inductive which directly captures
the invariant that a split is a decomposition of a fingertree preserving measures. We also put
the invariants on the left and right trees, which can be used by clients to prove properties about
their code.
This is one of the most distinctive parts of this work: we derive not only reusable code but
also reusable proofs using rich types. Indeed, split_tree can be seen as a proof combinator,
turning a property on a measure and monoid to a property on words on this monoid.
Section Trees.
Variable p : v → bool.
Inductive tree_split (A : Type) (measure : A → v)
(i : v) : v → Type :=
mkTreeSplit : ∀ (xsm : v) (xs : fingertree measure xsm |
isEmpty xs ∨ p (i cdot xsm) = false)
(x : A) (ysm : v) (ys : fingertree measure ysm |
isEmpty ys ∨ p (i cdot xsm cdot measure x) = true),
tree_split measure i (xsm cdot measure x cdot ysm).
This inductive combines both subsets and dependent inductive types, but we can still
write our code as usual.
Approximately 100 lines of proof are necessary to discharge the generated obligations.
This concludes our implementation of (dependent) Finger Trees with .
To sum up, we have proven that:
- All functions are total, once properly annotated with their preconditions.
- All functions respect the measure semantics and use them coherently.
- All functions respect the invariants given in the paper by Hinze & Paterson.
