1
\achapter{Extended pattern-matching}\defaultheaders
2
\aauthor{Cristina Cornes and Hugo Herbelin}
4
\label{Mult-match-full}
6
\index{ML-like patterns}
8
This section describes the full form of pattern-matching in {\Coq} terms.
10
\asection{Patterns}\label{implementation} The full syntax of {\tt
11
match} is presented in figures~\ref{term-syntax}
12
and~\ref{term-syntax-aux}. Identifiers in patterns are either
13
constructor names or variables. Any identifier that is not the
14
constructor of an inductive or coinductive type is considered to be a
15
variable. A variable name cannot occur more than once in a given
16
pattern. It is recommended to start variable names by a lowercase
19
If a pattern has the form $(c~\vec{x})$ where $c$ is a constructor
20
symbol and $\vec{x}$ is a linear vector of (distinct) variables, it is
21
called {\em simple}: it is the kind of pattern recognized by the basic
22
version of {\tt match}. On the opposite, if it is a variable $x$ or
23
has the form $(c~\vec{p})$ with $p$ not only made of variables, the
24
pattern is called {\em nested}.
26
A variable pattern matches any value, and the identifier is bound to
27
that value. The pattern ``\texttt{\_}'' (called ``don't care'' or
28
``wildcard'' symbol) also matches any value, but does not bind
29
anything. It may occur an arbitrary number of times in a
30
pattern. Alias patterns written \texttt{(}{\sl pattern} \texttt{as}
31
{\sl identifier}\texttt{)} are also accepted. This pattern matches the
32
same values as {\sl pattern} does and {\sl identifier} is bound to the
34
A pattern of the form {\pattern}{\tt |}{\pattern} is called
35
disjunctive. A list of patterns separated with commas is also
36
considered as a pattern and is called {\em multiple pattern}. However
37
multiple patterns can only occur at the root of pattern-matching
38
equations. Disjunctions of {\em multiple pattern} are allowed though.
40
Since extended {\tt match} expressions are compiled into the primitive
41
ones, the expressiveness of the theory remains the same. Once the
42
stage of parsing has finished only simple patterns remain. Re-nesting
43
of pattern is performed at printing time. An easy way to see the
44
result of the expansion is to toggle off the nesting performed at
45
printing (use here {\tt Set Printing Matching}), then by printing the term
46
with \texttt{Print} if the term is a constant, or using the command
49
The extended \texttt{match} still accepts an optional {\em elimination
50
predicate} given after the keyword \texttt{return}. Given a pattern
51
matching expression, if all the right-hand-sides of \texttt{=>} ({\em
52
rhs} in short) have the same type, then this type can be sometimes
53
synthesized, and so we can omit the \texttt{return} part. Otherwise
54
the predicate after \texttt{return} has to be provided, like for the basic
57
Let us illustrate through examples the different aspects of extended
58
pattern matching. Consider for example the function that computes the
59
maximum of two natural numbers. We can write it in primitive syntax
63
Fixpoint max (n m:nat) {struct m} : nat :=
66
| S n' => match m with
68
| S m' => S (max n' m')
73
\paragraph{Multiple patterns}
75
Using multiple patterns in the definition of {\tt max} allows to write:
79
Fixpoint max (n m:nat) {struct m} : nat :=
83
| S n', S m' => S (max n' m')
87
which will be compiled into the previous form.
89
The pattern-matching compilation strategy examines patterns from left
90
to right. A \texttt{match} expression is generated {\bf only} when
91
there is at least one constructor in the column of patterns. E.g. the
92
following example does not build a \texttt{match} expression.
95
Check (fun x:nat => match x return nat with
100
\paragraph{Aliasing subpatterns}
102
We can also use ``\texttt{as} {\ident}'' to associate a name to a
107
Fixpoint max (n m:nat) {struct n} : nat :=
111
| S n', S m' => S (max n' m')
115
\paragraph{Nested patterns}
117
Here is now an example of nested patterns:
120
Fixpoint even (n:nat) : bool :=
124
| S (S n') => even n'
128
This is compiled into:
134
In the previous examples patterns do not conflict with, but
135
sometimes it is comfortable to write patterns that admit a non
136
trivial superposition. Consider
137
the boolean function \texttt{lef} that given two natural numbers
138
yields \texttt{true} if the first one is less or equal than the second
139
one and \texttt{false} otherwise. We can write it as follows:
142
Fixpoint lef (n m:nat) {struct m} : bool :=
146
| S n, S m => lef n m
150
Note that the first and the second multiple pattern superpose because
151
the couple of values \texttt{O O} matches both. Thus, what is the result
152
of the function on those values? To eliminate ambiguity we use the
153
{\em textual priority rule}: we consider patterns ordered from top to
154
bottom, then a value is matched by the pattern at the $ith$ row if and
155
only if it is not matched by some pattern of a previous row. Thus in the
157
\texttt{O O} is matched by the first pattern, and so \texttt{(lef O O)}
158
yields \texttt{true}.
160
Another way to write this function is:
164
Fixpoint lef (n m:nat) {struct m} : bool :=
167
| S n, S m => lef n m
172
Here the last pattern superposes with the first two. Because
173
of the priority rule, the last pattern
174
will be used only for values that do not match neither the first nor
177
Terms with useless patterns are not accepted by the
178
system. Here is an example:
181
Set Printing Depth 50.
182
(********** The following is not correct and should produce **********)
183
(**************** Error: This clause is redundant ********************)
194
\paragraph{Disjunctive patterns}
196
Multiple patterns that share the same right-hand-side can be
197
factorized using the notation \nelist{\multpattern}{\tt |}. For instance,
198
{\tt max} can be rewritten as follows:
204
Fixpoint max (n m:nat) {struct m} : nat :=
206
| S n', S m' => S (max n' m')
211
Similarly, factorization of (non necessary multiple) patterns
212
that share the same variables is possible by using the notation
213
\nelist{\pattern}{\tt |}. Here is an example:
216
Definition filter_2_4 (n:nat) : nat :=
218
| 2 as m | 4 as m => m
223
Here is another example using disjunctive subpatterns.
226
Definition filter_some_square_corners (p:nat*nat) : nat*nat :=
228
| ((2 as m | 4 as m), (3 as n | 5 as n)) => (m,n)
233
\asection{About patterns of parametric types}
234
When matching objects of a parametric type, constructors in patterns
235
{\em do not expect} the parameter arguments. Their value is deduced
237
Consider for example the type of polymorphic lists:
240
Inductive List (A:Set) : Set :=
242
| cons : A -> List A -> List A.
245
We can check the function {\em tail}:
257
When we use parameters in patterns there is an error message:
260
Set Printing Depth 50.
261
(********** The following is not correct and should produce **********)
262
(******** Error: The constructor cons expects 2 arguments ************)
275
\asection{Matching objects of dependent types}
276
The previous examples illustrate pattern matching on objects of
277
non-dependent types, but we can also
278
use the expansion strategy to destructure objects of dependent type.
279
Consider the type \texttt{listn} of lists of a certain length:
283
Inductive listn : nat -> Set :=
285
| consn : forall n:nat, nat -> listn n -> listn (S n).
288
\asubsection{Understanding dependencies in patterns}
289
We can define the function \texttt{length} over \texttt{listn} by:
292
Definition length (n:nat) (l:listn n) := n.
295
Just for illustrating pattern matching,
296
we can define it by case analysis:
300
Definition length (n:nat) (l:listn n) :=
307
We can understand the meaning of this definition using the
308
same notions of usual pattern matching.
311
% Constraining of dependencies is not longer valid in V7
314
Now suppose we split the second pattern of \texttt{length} into two
316
alternative definition using nested patterns:
318
Definition length1 (n:nat) (l:listn n) :=
321
| consn n _ niln => S n
322
| consn n _ (consn _ _ _) => S n
326
It is obvious that \texttt{length1} is another version of
327
\texttt{length}. We can also give the following definition:
329
Definition length2 (n:nat) (l:listn n) :=
332
| consn n _ niln => 1
333
| consn n _ (consn m _ _) => S (S m)
337
If we forget that \texttt{listn} is a dependent type and we read these
338
definitions using the usual semantics of pattern matching, we can conclude
339
that \texttt{length1}
340
and \texttt{length2} are different functions.
341
In fact, they are equivalent
342
because the pattern \texttt{niln} implies that \texttt{n} can only match
343
the value $0$ and analogously the pattern \texttt{consn} determines that \texttt{n} can
344
only match values of the form $(S~v)$ where $v$ is the value matched by
347
The converse is also true. If
348
we destructure the length value with the pattern \texttt{O} then the list
349
value should be $niln$.
350
Thus, the following term \texttt{length3} corresponds to the function
351
\texttt{length} but this time defined by case analysis on the dependencies instead of on the list:
354
Definition length3 (n:nat) (l:listn n) :=
358
| consn (S n) _ _ => S (S n)
362
When we have nested patterns of dependent types, the semantics of
363
pattern matching becomes a little more difficult because
364
the set of values that are matched by a sub-pattern may be conditioned by the
365
values matched by another sub-pattern. Dependent nested patterns are
366
somehow constrained patterns.
367
In the examples, the expansion of
368
\texttt{length1} and \texttt{length2} yields exactly the same term
370
expansion of \texttt{length3} is completely different. \texttt{length1} and
371
\texttt{length2} are expanded into two nested case analysis on
372
\texttt{listn} while \texttt{length3} is expanded into a case analysis on
373
\texttt{listn} containing a case analysis on natural numbers inside.
376
In practice the user can think about the patterns as independent and
377
it is the expansion algorithm that cares to relate them. \\
383
\asubsection{When the elimination predicate must be provided}
384
The examples given so far do not need an explicit elimination predicate
385
because all the rhs have the same type and the
386
strategy succeeds to synthesize it.
387
Unfortunately when dealing with dependent patterns it often happens
388
that we need to write cases where the type of the rhs are
389
different instances of the elimination predicate.
390
The function \texttt{concat} for \texttt{listn}
391
is an example where the branches have different type
392
and we need to provide the elimination predicate:
395
Fixpoint concat (n:nat) (l:listn n) (m:nat) (l':listn m) {struct l} :
397
match l in listn n return listn (n + m) with
399
| consn n' a y => consn (n' + m) a (concat n' y m l')
402
The elimination predicate is {\tt fun (n:nat) (l:listn n) => listn~(n+m)}.
403
In general if $m$ has type $(I~q_1\ldots q_r~t_1\ldots t_s)$ where
404
$q_1\ldots q_r$ are parameters, the elimination predicate should be of
406
{\tt fun $y_1$\ldots $y_s$ $x$:($I$~$q_1$\ldots $q_r$~$y_1$\ldots
409
In the concrete syntax, it should be written~:
410
\[ \kw{match}~m~\kw{as}~x~\kw{in}~(I~\_\ldots \_~y_1\ldots y_s)~\kw{return}~Q~\kw{with}~\ldots~\kw{end}\]
412
The variables which appear in the \kw{in} and \kw{as} clause are new
413
and bounded in the property $Q$ in the \kw{return} clause. The
414
parameters of the inductive definitions should not be mentioned and
415
are replaced by \kw{\_}.
417
Recall that a list of patterns is also a pattern. So, when
418
we destructure several terms at the same time and the branches have
419
different type we need to provide
420
the elimination predicate for this multiple pattern.
421
It is done using the same scheme, each term may be associated to an
422
\kw{as} and \kw{in} clause in order to introduce a dependent product.
424
For example, an equivalent definition for \texttt{concat} (even though the matching on the second term is trivial) would have
429
Fixpoint concat (n:nat) (l:listn n) (m:nat) (l':listn m) {struct l} :
431
match l in listn n, l' return listn (n + m) with
433
| consn n' a y, x => consn (n' + m) a (concat n' y m x)
437
% Notice that this time, the predicate \texttt{[n,\_:nat](listn (plus n
438
% m))} is binary because we
439
% destructure both \texttt{l} and \texttt{l'} whose types have arity one.
440
% In general, if we destructure the terms $e_1\ldots e_n$
441
% the predicate will be of arity $m$ where $m$ is the sum of the
442
% number of dependencies of the type of $e_1, e_2,\ldots e_n$
443
% (the $\lambda$-abstractions
444
% should correspond from left to right to each dependent argument of the
445
% type of $e_1\ldots e_n$).
446
When the arity of the predicate (i.e. number of abstractions) is not
447
correct Coq raises an error message. For example:
452
Set Printing Depth 50.
453
(********** The following is not correct and should produce ***********)
454
(** Error: the term l' has type listn m while it is expected to have **)
455
(** type listn (?31 + ?32) **)
459
(n:nat) (l:listn n) (m:nat)
460
(l':listn m) {struct l} : listn (n + m) :=
463
| consn n' a y, x => consn (n' + m) a (concat n' y m x)
467
\asection{Using pattern matching to write proofs}
468
In all the previous examples the elimination predicate does not depend
469
on the object(s) matched. But it may depend and the typical case
470
is when we write a proof by induction or a function that yields an
471
object of dependent type. An example of proof using \texttt{match} in
472
given in section \ref{refine-example}
474
For example, we can write
475
the function \texttt{buildlist} that given a natural number
476
$n$ builds a list of length $n$ containing zeros as follows:
479
Fixpoint buildlist (n:nat) : listn n :=
480
match n return listn n with
482
| S n => consn n 0 (buildlist n)
486
We can also use multiple patterns.
487
Consider the following definition of the predicate less-equal
491
Inductive LE : nat -> nat -> Prop :=
492
| LEO : forall n:nat, LE 0 n
493
| LES : forall n m:nat, LE n m -> LE (S n) (S m).
496
We can use multiple patterns to write the proof of the lemma
497
\texttt{forall (n m:nat), (LE n m)}\verb=\/=\texttt{(LE m n)}:
500
Fixpoint dec (n m:nat) {struct n} : LE n m \/ LE m n :=
501
match n, m return LE n m \/ LE m n with
502
| O, x => or_introl (LE x 0) (LEO x)
503
| x, O => or_intror (LE x 0) (LEO x)
504
| S n as n', S m as m' =>
506
| or_introl h => or_introl (LE m' n') (LES n m h)
507
| or_intror h => or_intror (LE n' m') (LES m n h)
511
In the example of \texttt{dec},
512
the first \texttt{match} is dependent while
515
% In general, consider the terms $e_1\ldots e_n$,
516
% where the type of $e_i$ is an instance of a family type
517
% $\lb (\vec{d_i}:\vec{D_i}) \mto T_i$ ($1\leq i
518
% \leq n$). Then, in expression \texttt{match} $e_1,\ldots,
519
% e_n$ \texttt{of} \ldots \texttt{end}, the
520
% elimination predicate ${\cal P}$ should be of the form:
521
% $[\vec{d_1}:\vec{D_1}][x_1:T_1]\ldots [\vec{d_n}:\vec{D_n}][x_n:T_n]Q.$
523
The user can also use \texttt{match} in combination with the tactic
524
\texttt{refine} (see section \ref{refine}) to build incomplete proofs
525
beginning with a \texttt{match} construction.
527
\asection{Pattern-matching on inductive objects involving local
530
If local definitions occur in the type of a constructor, then there
531
are two ways to match on this constructor. Either the local
532
definitions are skipped and matching is done only on the true arguments
533
of the constructors, or the bindings for local definitions can also
534
be caught in the matching.
540
Require Import Arith.
544
Inductive list : nat -> Set :=
546
| cons : forall n:nat, let m := (2 * n) in list m -> list (S (S m)).
549
In the next example, the local definition is not caught.
552
Fixpoint length n (l:list n) {struct l} : nat :=
555
| cons n l0 => S (length (2 * n) l0)
559
But in this example, it is.
562
Fixpoint length' n (l:list n) {struct l} : nat :=
565
| cons _ m l0 => S (length' m l0)
569
\Rem for a given matching clause, either none of the local
570
definitions or all of them can be caught.
572
\asection{Pattern-matching and coercions}
574
If a mismatch occurs between the expected type of a pattern and its
575
actual type, a coercion made from constructors is sought. If such a
576
coercion can be found, it is automatically inserted around the
585
Coercion C1 : nat >-> I.
586
Check (fun x => match x with
593
\asection{When does the expansion strategy fail ?}\label{limitations}
594
The strategy works very like in ML languages when treating
595
patterns of non-dependent type.
596
But there are new cases of failure that are due to the presence of
599
The error messages of the current implementation may be sometimes
600
confusing. When the tactic fails because patterns are somehow
601
incorrect then error messages refer to the initial expression. But the
602
strategy may succeed to build an expression whose sub-expressions are
603
well typed when the whole expression is not. In this situation the
604
message makes reference to the expanded expression. We encourage
605
users, when they have patterns with the same outer constructor in
606
different equations, to name the variable patterns in the same
607
positions with the same name.
608
E.g. to write {\small\texttt{(cons n O x) => e1}}
609
and {\small\texttt{(cons n \_ x) => e2}} instead of
610
{\small\texttt{(cons n O x) => e1}} and
611
{\small\texttt{(cons n' \_ x') => e2}}.
612
This helps to maintain certain name correspondence between the
613
generated expression and the original.
615
Here is a summary of the error messages corresponding to each situation:
618
\item \sverb{The constructor } {\sl
619
ident} \sverb{expects } {\sl num} \sverb{arguments}
621
\sverb{The variable } {\sl ident} \sverb{is bound several times
622
in pattern } {\sl term}
624
\sverb{Found a constructor of inductive type} {\term}
625
\sverb{while a constructor of} {\term} \sverb{is expected}
627
Patterns are incorrect (because constructors are not applied to
628
the correct number of the arguments, because they are not linear or
629
they are wrongly typed)
631
\item \errindex{Non exhaustive pattern-matching}
633
the pattern matching is not exhaustive
635
\item \sverb{The elimination predicate } {\sl term} \sverb{should be
636
of arity } {\sl num} \sverb{(for non dependent case) or } {\sl
637
num} \sverb{(for dependent case)}
639
The elimination predicate provided to \texttt{match} has not the
643
%\item the whole expression is wrongly typed
646
% , or the synthesis of
647
% implicit arguments fails (for example to find the elimination
648
% predicate or to resolve implicit arguments in the rhs).
650
% There are {\em nested patterns of dependent type}, the elimination
651
% predicate corresponds to non-dependent case and has the form
652
% $[x_1:T_1]...[x_n:T_n]T$ and {\bf some} $x_i$ occurs {\bf free} in
653
% $T$. Then, the strategy may fail to find out a correct elimination
654
% predicate during some step of compilation. In this situation we
655
% recommend the user to rewrite the nested dependent patterns into
656
% several \texttt{match} with {\em simple patterns}.
658
\item {\tt Unable to infer a match predicate\\
659
Either there is a type incompatiblity or the problem involves\\
662
There is a type mismatch between the different branches
664
Then the user should provide an elimination predicate.
667
% \item because of nested patterns, it may happen that even though all
668
% the rhs have the same type, the strategy needs dependent elimination
669
% and so an elimination predicate must be provided. The system warns
670
% about this situation, trying to compile anyway with the
671
% non-dependent strategy. The risen message is:
674
% \item {\tt Warning: This pattern matching may need dependent
675
% elimination to be compiled. I will try, but if fails try again
676
% giving dependent elimination predicate.}
680
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
681
% % LA PROPAGATION DES CONTRAINTES ARRIERE N'EST PAS FAITE DANS LA V7
683
% \item there are {\em nested patterns of dependent type} and the
684
% strategy builds a term that is well typed but recursive calls in fix
685
% point are reported as illegal:
687
% \item {\tt Error: Recursive call applied to an illegal term ...}
690
% This is because the strategy generates a term that is correct w.r.t.
691
% the initial term but which does not pass the guard condition. In
692
% this situation we recommend the user to transform the nested dependent
693
% patterns into {\em several \texttt{match} of simple patterns}. Let us
694
% explain this with an example. Consider the following definition of a
695
% function that yields the last element of a list and \texttt{O} if it is
698
% \begin{coq_example}
699
% Fixpoint last [n:nat; l:(listn n)] : nat :=
701
% (consn _ a niln) => a
702
% | (consn m _ x) => (last m x) | niln => O
706
% It fails because of the priority between patterns, we know that this
707
% definition is equivalent to the following more explicit one (which
710
% \begin{coq_example*}
711
% Fixpoint last [n:nat; l:(listn n)] : nat :=
713
% (consn _ a niln) => a
714
% | (consn n _ (consn m b x)) => (last n (consn m b x))
719
% Note that the recursive call {\tt (last n (consn m b x))} is not
720
% guarded. When treating with patterns of dependent types the strategy
721
% interprets the first definition of \texttt{last} as the second
722
% one\footnote{In languages of the ML family the first definition would
723
% be translated into a term where the variable \texttt{x} is shared in
724
% the expression. When patterns are of non-dependent types, Coq
725
% compiles as in ML languages using sharing. When patterns are of
726
% dependent types the compilation reconstructs the term as in the
727
% second definition of \texttt{last} so to ensure the result of
728
% expansion is well typed.}. Thus it generates a term where the
729
% recursive call is rejected by the guard condition.
731
% You can get rid of this problem by writing the definition with
732
% \emph{simple patterns}:
734
% \begin{coq_example}
735
% Fixpoint last [n:nat; l:(listn n)] : nat :=
736
% <[_:nat]nat>match l of
737
% (consn m a x) => Cases x of niln => a | _ => (last m x) end
747
%%% TeX-master: "Reference-Manual"