From 653f571826b43e1f9b20f88d932aa13c00974d63 Mon Sep 17 00:00:00 2001 From: aarne Date: Fri, 16 Jun 2006 08:35:24 +0000 Subject: dep types in new tutorial --- doc/tutorial/gf-tutorial2.html | 1032 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 978 insertions(+), 54 deletions(-) (limited to 'doc/tutorial/gf-tutorial2.html') diff --git a/doc/tutorial/gf-tutorial2.html b/doc/tutorial/gf-tutorial2.html index d657f7cc8..de2d63dcb 100644 --- a/doc/tutorial/gf-tutorial2.html +++ b/doc/tutorial/gf-tutorial2.html @@ -7,7 +7,7 @@

Grammatical Framework Tutorial

Author: Aarne Ranta <aarne (at) cs.chalmers.se>
-Last update: Fri Jun 16 01:02:28 2006 +Last update: Fri Jun 16 10:32:52 2006

@@ -100,37 +100,40 @@ Last update: Fri Jun 16 01:02:28 2006 -
  • More features of the module system +
  • More features of the module system -
  • Using the standard resource library +
  • Using the standard resource library -
  • Transfer modules -
  • Practical issues +
  • Transfer modules +
  • Practical issues -
  • Case studies +
  • Case studies @@ -2213,29 +2216,950 @@ is a set of type and function declarations. The following is an example of such a theory, represented as an abstract module in GF.

    -    abstract Geometry = {
    -      cat 
    -        Line ; Point ; Circle ;            -- basic types of figures
    -        Prop ;                             -- proposition
    -      fun
    -        Parallel : Line -> Line -> Prop ;  -- x is parallel to y
    -        Centre : Circle -> Point ;         -- the centre of c
    +  abstract Arithm = {
    +    cat
    +      Prop ;    -- proposition
    +      Nat ;     -- natural number
    +    fun
    +      Zero : Nat ;                  -- 0
    +      Succ : Nat -> Nat ;           -- successor of x
    +      Even : Nat -> Prop ;          -- x is even
    +      And  : Prop -> Prop -> Prop ; -- A and B
           } 
     
    -

    +

    +A concrete syntax is given below, as an example of using the resource grammar +library. +

    Dependent types

    +

    +Dependent types are a characteristic feature of GF, +inherited from the +constructive type theory of Martin-Löf and +distinguishing GF from most other grammar formalisms and +functional programming languages. +The initial main motivation for developing GF was, indeed, +to have a grammar formalism with dependent types. +As can be inferred from the fact that we introduce them only now, +after having written lots of grammars without them, +dependent types are no longer the only motivation for GF. +But they are still important and interesting. +

    +

    +Dependent types can be used for stating stronger +conditions of well-formedness than non-dependent types. +A simple example is postal addresses. Ignoring the other details, +let us take a look at addresses consisting of +a street, a city, and a country. +

    +
    +  abstract Address = {
    +    cat 
    +      Address ; Country ; City ; Street ;
    +  
    +    fun
    +      mkAddress : Country -> City -> Street -> Address ;
    +  
    +      UK, France : Country ;
    +      Paris, London, Grenoble : City ;
    +      OxfordSt, ShaftesburyAve, BdRaspail, RueBlondel, AvAlsaceLorraine : Street ;
    +    }
    +
    +

    +The linearization rules +are straightforward, +

    +
    +    lin
    +  
    +      mkAddress country city street = ss (street ++ "," ++ city ++ "," ++ country) ;
    +  
    +      UK = ss ("U.K.") ;
    +      France = ss ("France") ;
    +      Paris = ss ("Paris") ;
    +      London = ss ("London") ;
    +      Grenoble = ss ("Grenoble") ;
    +      OxfordSt = ss ("Oxford" ++ "Street") ;
    +      ShaftesburyAve = ss ("Shaftesbury" ++ "Avenue") ;
    +      BdRaspail = ss ("boulevard" ++ "Raspail") ;
    +      RueBlondel = ss ("rue" ++ "Blondel") ;
    +      AvAlsaceLorraine = ss ("avenue" ++ "Alsace-Lorraine") ;
    +
    +

    +with the exception of mkAddress, where we have +reversed the order of the constituents. The type of mkAddress +in the abstract syntax takes its arguments in a "logical" order, +with increasing precision. (This order is sometimes even used in the concrete +syntax of addresses, e.g. in Russia). +

    +

    +Both existing and non-existing addresses are recognized by this +grammar. The non-existing ones in the following randomly generated +list have afterwards been marked by *: +

    +
    +    > gr -cat=Address -number=7 | l
    +  
    +    * Oxford Street , Paris , France
    +    * Shaftesbury Avenue , Grenoble , U.K.
    +    boulevard Raspail , Paris , France
    +    * rue Blondel , Grenoble , U.K.
    +    * Shaftesbury Avenue , Grenoble , France
    +    * Oxford Street , London , France
    +    * Shaftesbury Avenue , Grenoble , France
    +
    +

    +Dependent types provide a way to guarantee that addresses are +well-formed. What we do is to include contexts in +cat judgements: +

    +
    +    cat Address ; 
    +    cat Country ; 
    +    cat City Country ; 
    +    cat Street (x : Country)(y : City x) ;
    +
    +

    +The first two judgements are as before, but the third one makes +City dependent on Country: there are no longer just cities, +but cities of the U.K. and cities of France. The fourth judgement +makes Street dependent on City; but since +City is itself dependent on Country, we must +include them both in the context, moreover guaranteeing that +the city is one of the given country. Since the context itself +is built by using a dependent type, we have to use variables +to indicate the dependencies. The judgement we used for City +is actually shorthand for +

    +
    +    cat City (x : Country)
    +
    +

    +which is only possible if the subsequent context does not depend on x. +

    +

    +The fun judgements of the grammar are modified accordingly: +

    +
    +    fun
    +    
    +    mkAddress : (x : Country) -> (y : City x) -> Street x y -> Address ;
    +    
    +    UK : Country ;
    +    France : Country ;
    +    Paris : City France ; 
    +    London : City UK ; 
    +    Grenoble : City France ;
    +    OxfordSt : Street UK London ; 
    +    ShaftesburyAve : Street UK London ;
    +    BdRaspail : Street France Paris ; 
    +    RueBlondel : Street France Paris ; 
    +    AvAlsaceLorraine : Street France Grenoble ;
    +
    +

    +Since the type of mkAddress now has dependencies among +its argument types, we have to use variables just like we used in +the context of Street above. What we claimed to be the +"logical" order of the arguments is now forced by the type system +of GF: a variable must be declared (=bound) before it can be +referenced (=used). +

    +

    +The effect of dependent types is that the *-marked addresses above are +no longer well-formed. What the GF parser actually does is that it +initially accepts them (by using a context-free parsing algorithm) +and then rejects them (by type checking). The random generator does not produce +illegal addresses (this could be useful in bulk mailing!). +The linearization algorithm does not care about type dependencies; +actually, since the categories (ignoring their arguments) +are the same in both abstract syntaxes, +we use the same concrete syntax +for both of them. +

    +

    +Remark. Function types without +variables are actually a shorthand notation: writing +

    +
    +    fun PredV1 : NP -> V1 -> S
    +
    +

    +is shorthand for +

    +
    +    fun PredV1 : (x : NP) -> (y : V1) -> S
    +
    +

    +or any other naming of the variables. Actually the use of variables +sometimes shortens the code, since we can write e.g. +

    +
    +    fun  ConjNP : Conj -> (x,y : NP) -> NP ;
    +    oper triple : (x,y,z : Str) -> Str = \x,y,z -> x ++ y ++ z ;
    +
    +

    -

    Higher-order abstract syntax

    +

    Dependent types in syntax editing

    +

    +An extra advantage of dependent types is seen in +syntax editing: +when menus with possible refinements are created, +only those functions are shown that are type-correct. +For instance, if the editor state is +

    +
    +    mkAddress : Address  
    +       UK : Country  
    +    *  ?2 : City UK  
    +       ?3 : Street UK ?2
    +
    +

    +only the cities of the U.K. are shown in the city menu. +

    +

    +What is more, editing in the state +

    +
    +    mkAddress : Address  
    +       ?1 : Country  
    +       ?2 : City (?1)  
    +    *  ?3 : Street (?1) (?2)  
    +
    +

    +starts from the Street argument, +which enables GF automatically to infer the city and the country. +Thus, in addition to guaranteeing the meaningfulness of the results, +dependent types can shorten editing sessions considerably. +

    -

    Semantic definitions

    +

    Dependent types in concrete syntax

    +

    +The functional fragment of GF +terms and types comprises function types, applications, lambda +abstracts, constants, and variables. This fragment is similar in +abstract and concrete syntax. In particular, +dependent types are also available in concrete syntax. +We have not made use of them yet, +but we will now look at one example of how they +can be used. +

    +

    +Those readers who are familiar with functional programming languages +like ML and Haskell, may already have missed polymorphic +functions. For instance, Haskell programmers have access to +the functions +

    +
    +    const :: a -> b -> a
    +    const c x = c
    +  
    +    flip :: (a -> b ->c) -> b -> a -> c
    +    flip f y x = f x y
    +
    +

    +which can be used for any given types a,b, and c. +

    +

    +The GF counterpart of polymorphic functions are monomorphic +functions with explicit type variables. Thus the above +definitions can be written +

    +
    +    oper const :(a,b : Type) -> a -> b -> a =
    +      \_,_,c,x -> c ;
    +  
    +    oper flip : (a,b,c : Type) -> (a -> b ->c) -> b -> a -> c =
    +      \_,_,_,f,x,y -> f y x ;
    +
    +

    +When the operations are used, the type checker requires +them to be equipped with all their arguments; this may be a nuisance +for the Haskell or ML programmer. +

    -

    List categories

    +

    Expressing selectional restrictions

    +

    +This section introduces a way of using dependent types to +formalize a notion known as selectional restrictions +in linguistics. We first present a mathematical model +of the notion, and then integrate it in a linguistically +motivated syntax. +

    +

    +In linguistics, a +grammar is usually thought of as being about syntactic well-formedness +in a rather liberal sense: an expression can be well-formed without +being meaningful, in other words, without being +semantically well-formed. +For instance, the sentence +

    +
    +    the number 2 is equilateral
    +
    +

    +is syntactically well-formed but semantically ill-formed. +It is well-formed because it combines a well-formed +noun phrase ("the number 2") with a well-formed +verb phrase ("is equilateral") in accordance with the +rule that the verb phrase is inflected in the +number of the noun phrase: +

    +
    +    fun PredV1 : NP -> V1 -> S ;
    +    lin PredV1 np v1 = {s = np.s ++ v1.s ! np.n} ;
    +
    +

    +It is ill-formed because the predicate "is equilateral" +is only defined for triangles, not for numbers. +

    +

    +In a straightforward type-theoretical formalization of +mathematics, domains of mathematical objects +are defined as types. In GF, we could write +

    +
    +    cat Nat ;
    +    cat Triangle ;
    +    cat Prop ;
    +
    +

    +for the types of natural numbers, triangles, and propositions, +respectively. +Noun phrases are typed as objects of basic types other than +Prop, whereas verb phrases are functions from basic types +to Prop. For instance, +

    +
    +    fun two : Nat ;
    +    fun Even : Nat -> Prop ;
    +    fun Equilateral : Triangle -> Prop ;
    +
    +

    +With these judgements, and the linearization rules +

    +
    +    lin two = ss ["the number 2"] ;
    +    lin Even x = ss (x.s ++ ["is even"]) ;
    +    lin Equilateral x = ss (x.s ++ ["is equilateral"]) ;
    +
    +

    +we can form the proposition Even two +

    +
    +    the number 2 is even
    +
    +

    +but no proposition linearized to +

    +
    +    the number 2 is equilateral
    +
    +

    +since Equilateral two is not a well-formed type-theoretical object. +

    +

    +When formalizing mathematics, e.g. in the purpose of +computer-assisted theorem proving, we are certainly interested +in semantic well-formedness: we want to be sure that a proposition makes +sense before we make the effort of proving it. The straightforward typing +of nouns and predicates shown above is the way in which this +is guaranteed in various proof systems based on type theory. +(Notice that it is still possible to form false propositions, +e.g. "the number 3 is even". +False and meaningless are different things.) +

    +

    +As shown by the linearization rules for two, Even, +etc, it is possible to use straightforward mathematical typings +as the abstract syntax of a grammar. However, this syntax is not very +expressive linguistically: for instance, there is no distinction between +adjectives and verbs. It is hard to give rules for structures like +adjectival modification ("even number") and conjunction of predicates +("even or odd"). +

    +

    +By using dependent types, it is simple to combine a linguistically +motivated system of categories with mathematically motivated +type restrictions. What we need is a category of domains of +individual objects, +

    +
    +    cat Dom
    +
    +

    +and dependencies of other categories on this: +

    +
    +    cat 
    +      S ;            -- sentence
    +      V1 Dom ;       -- one-place verb
    +      V2 Dom Dom ;   -- two-place verb
    +      A1 Dom ;       -- one-place adjective
    +      A2 Dom Dom ;   -- two-place adjective
    +      PN Dom ;       -- proper name
    +      NP Dom ;       -- noun phrase
    +      Conj ;         -- conjunction
    +      Det ;          -- determiner
    +
    +

    +The number of Dom arguments depends on the semantic type +corresponding to the category: one-place verbs and adjectives +correspond to types of the form +

    +
    +    A -> Prop
    +
    +

    +whereas two-place verbs and adjectives correspond to types of the form +

    +
    +    A -> B -> Prop
    +
    +

    +where the domains A and B can be distinct. +Proper names correspond to types of the form +

    +
    +    A
    +
    +

    +that is, individual objects of the domain A. Noun phrases +correspond to +

    +
    +    (A -> Prop) -> Prop
    +
    +

    +that is, quantifiers over the domain A. +Sentences, conjunctions, and determiners correspond to +

    +
    +    Prop
    +    Prop -> Prop -> Prop
    +    (A : Dom) -> (A -> Prop) -> Prop
    +
    +

    +respectively, +and are thus independent of domain. As for common nouns CN, +the simplest semantics is that they correspond to +

    +
    +    Dom
    +
    +

    +In this section, we will, in fact, write Dom instead of CN. +

    +

    +Having thus parametrized categories on domains, we have to reformulate +the rules of predication, etc, accordingly. This is straightforward: +

    +
    +    fun
    +      PredV1  : (A   : Dom) -> NP A -> V1 A -> S ;
    +      ComplV2 : (A,B : Dom) -> V2 A B -> NP B -> V1 A ;
    +      UsePN   : (A   : Dom) -> PN A -> NP A ;
    +      DetCN   :                Det -> (A : Dom) -> NP A ;
    +      ModA1   :                (A : Dom) -> A1 A -> Dom ;
    +      ConjS   :                Conj -> S -> S -> S ;
    +      ConjV1  : (A   : Dom) -> Conj -> V1 A -> V1 A -> V1 A ;
    +
    +

    +In linearization rules, +we typically use wildcards for the domain arguments, +to get arities right: +

    +
    +    lin
    +      PredV1 _ np vp = ss (np.s ++ vp.s) ;
    +      ComplV2 _ _ v2 np = ss (v2.s ++ np.s) ;
    +      UsePN _ pn = pn ;
    +      DetCN det cn = ss (det.s ++ cn.s) ;
    +      ModA1 cn a1 = ss (a1.s ++ cn.s) ;
    +      ConjS conj s1 s2 = ss (s1.s ++ conj.s ++ s2.s) ;
    +      ConjV1 _ conj v1 v2 = ss (v1.s ++ conj.s ++ v2.s) ;
    +
    +

    +The domain arguments thus get suppressed in linearization. +Parsing initially returns metavariables for them, +but type checking can usually restore them +by inference from those arguments that are not suppressed. +

    +

    +One traditional linguistic example of domain restrictions +(= selectional restrictions) is the contrast between the two sentences +

    +
    +    John plays golf
    +    golf plays John
    +
    +

    +To explain the contrast, we introduce the functions +

    +
    +    human : Dom ; 
    +    game : Dom ;
    +    play : V2 human game ;
    +    John : PN human ;
    +    Golf : PN game ;
    +
    +

    +Both sentences still pass the context-free parser, +returning trees with lots of metavariables of type Dom: +

    +
    +    PredV1 ?0 (UsePN ?1 John) (ComplV2 ?2 ?3 play (UsePN ?4 Golf))
    +  
    +    PredV1 ?0 (UsePN ?1 Golf) (ComplV2 ?2 ?3 play (UsePN ?4 John))
    +
    +

    +But only the former sentence passes the type checker, which moreover +infers the domain arguments: +

    +
    +    PredV1 human (UsePN human John) (ComplV2 human game play (UsePN game Golf))
    +
    +

    +

    +A known problem with selectional restrictions is that they can be more +or less liberal. For instance, +

    +
    +    John loves Mary
    +    John loves golf
    +
    +

    +both make sense, even though Mary and golf +are of different types. A natural solution in this case is to +formalize love as a polymorphic verb, which takes +a human as its first argument but an object of any type as its second +argument: +

    +
    +    fun love : (A : Dom) -> V2 human A ;
    +    lin love _ = ss "loves" ;
    +
    +

    +Problems remain, such as subtyping (e.g. what +is meaningful for a human is also meaningful for +a man and a woman, but not the other way round) +and the extended use of expressions (e.g. a metaphoric use that +makes sense of "golf plays John"). +

    -

    More features of the module system

    +

    Proof objects

    +

    +Perhaps the most well-known feature of constructive type theory is +the Curry-Howard isomorphism, also known as the +propositions as types principle. Its earliest formulations +were attempts to give semantics to the logical systems of +propositional and predicate calculus. In this section, we will consider +a more elementary example, showing how the notion of proof is useful +outside mathematics, as well. +

    +

    +We first define the category of unary (also known as Peano-style) +natural numbers: +

    +
    +    cat Nat ; 
    +    fun Zero : Nat ;
    +    fun Succ : Nat -> Nat ;
    +
    +

    +The successor function Succ generates an infinite +sequence of natural numbers, beginning from Zero. +

    +

    +We then define what it means for a number x to be less than +a number y. Our definition is based on two axioms: +

    + + +

    +The most straightforward way of expressing these axioms in type theory +is as typing judgements that introduce objects of a type Less x y: +

    +
    +    cat Less Nat Nat ; 
    +    fun lessZ : (y : Nat) -> Less Zero (Succ y) ;
    +    fun lessS : (x,y : Nat) -> Less x y -> Less (Succ x) (Succ y) ;
    +
    +

    +Objects formed by lessZ and lessS are +called proof objects: they establish the truth of certain +mathematical propositions. +For instance, the fact that 2 is less that +4 has the proof object +

    +
    +    lessS (Succ Zero) (Succ (Succ (Succ Zero)))
    +          (lessS Zero (Succ (Succ Zero)) (lessZ (Succ Zero)))
    +
    +

    +whose type is +

    +
    +    Less (Succ (Succ Zero)) (Succ (Succ (Succ (Succ Zero))))
    +
    +

    +which is the same thing as the proposition that 2 is less than 4. +

    +

    +GF grammars can be used to provide a semantic control of +well-formedness of expressions. We have already seen examples of this: +the grammar of well-formed addresses and the grammar with +selectional restrictions above. By introducing proof objects +we have now added a very powerful +technique of expressing semantic conditions. +

    +

    +A simple example of the use of proof objects is the definition of +well-formed time spans: a time span is expected to be from an earlier to +a later time: +

    +
    +    from 3 to 8
    +
    +

    +is thus well-formed, whereas +

    +
    +    from 8 to 3
    +
    +

    +is not. The following rules for spans impose this condition +by using the Less predicate: +

    +
    +    cat Span ;
    +    fun span : (m,n : Nat) -> Less m n -> Span ;
    +
    +

    -

    Interfaces, instances, and functors

    +

    Variable bindings

    +

    +Mathematical notation and programming languages have lots of +expressions that bind variables. For instance, +a universally quantifier proposition +

    +
    +    (All x)B(x)
    +
    +

    +consists of the binding (All x) of the variable x, +and the body B(x), where the variable x is +said to occur bound. +

    +

    +Variable bindings appear in informal mathematical language as well, for +instance, +

    +
    +    for all x, x is equal to x
    +  
    +    the function that for any numbers x and y returns the maximum of x+y
    +    and x*y
    +
    +

    +

    +In type theory, variable-binding expression forms can be formalized +as functions that take functions as arguments. The universal +quantifier is defined +

    +
    +    fun All : (Ind -> Prop) -> Prop
    +
    +

    +where Ind is the type of individuals and Prop, +the type of propositions. If we have, for instance, the equality predicate +

    +
    +    fun Eq : Ind -> Ind -> Prop
    +
    +

    +we may form the tree +

    +
    +    All (\x -> Eq x x)
    +
    +

    +which corresponds to the ordinary notation +

    +
    +    (All x)(x = x).
    +
    +

    +

    +An abstract syntax where trees have functions as arguments, as in +the two examples above, has turned out to be precisely the right +thing for the semantics and computer implementation of +variable-binding expressions. The advantage lies in the fact that +only one variable-binding expression form is needed, the lambda abstract +\x -> b, and all other bindings can be reduced to it. +This makes it easier to implement mathematical theories and reason +about them, since variable binding is tricky to implement and +to reason about. The idea of using functions as arguments of +syntactic constructors is known as higher-order abstract syntax. +

    +

    +The question now arises: how to define linearization rules +for variable-binding expressions? +Let us first consider universal quantification, +

    +
    +    fun All : (Ind -> Prop) -> Prop.
    +
    +

    +We write +

    +
    +    lin All B = {s = "(" ++ "All" ++ B.$0 ++ ")" ++ B.s}
    +
    +

    +to obtain the form shown above. +This linearization rule brings in a new GF concept - the v +field of B containing a bound variable symbol. +The general rule is that, if an argument type of a function is +itself a function type A -> C, the linearization type of +this argument is the linearization type of C +together with a new field $0 : Str. In the linearization rule +for All, the argument B thus has the linearization +type +

    +
    +    {$0 : Str ; s : Str},
    +
    +

    +since the linearization type of Prop is +

    +
    +    {s : Str}
    +
    +

    +(we remind that the order of fields in a record does not matter). +In other words, the linearization of a function +consists of a linearization of the body together with a +field for a linearization of the bound variable. +Those familiar with type theory or lambda calculus +should notice that GF requires trees to be in +eta-expanded form in order to be linearizable: +any function of type +

    +
    +    A -> C
    +
    +

    +always has a syntax tree of the form +

    +
    +    \x -> c
    +
    +

    +where c : C under the assumption x : A. +It is in this form that an expression can be analysed +as having a bound variable and a body. +

    +

    +Given the linearization rule +

    +
    +    lin Eq a b = {s = "(" ++ a.s ++ "=" ++ b.s ++ ")"}
    +
    +

    +the linearization of +

    +
    +    \x -> Eq x x
    +
    +

    +is the record +

    +
    +    {$0 = "x", s = ["( x = x )"]}
    +
    +

    +Thus we can compute the linearization of the formula, +

    +
    +    All (\x -> Eq x x)  --> {s = "[( All x ) ( x = x )]"}.
    +
    +

    +

    +How did we get the linearization of the variable x +into the string "x"? GF grammars have no rules for +this: it is just hard-wired in GF that variable symbols are +linearized into the same strings that represent them in +the print-out of the abstract syntax. +

    +

    +To be able to +parse variable symbols, however, GF needs to know what +to look for (instead of e.g. trying to parse any +string as a variable). What strings are parsed as variable symbols +is defined in the lexical analysis part of GF parsing (see below). +

    +

    +When editing with grammars that have +bound variables, the names of bound variables are +selected automatically, but can be changed at any time by +using an Alpha Conversion command. +

    +

    +If several variables are bound in the same argument, the +labels are $0, $1, $2, etc. +

    +

    Semantic definitions

    +

    +We have seen that, +just like functional programming languages, GF has declarations +of functions, telling what the type of a function is. +But we have not yet shown how to compute +these functions: all we can do is provide them with arguments +and linearize the resulting terms. +Since our main interest is the well-formedness of expressions, +this has not yet bothered +us very much. As we will see, however, computation does play a role +even in the well-formedness of expressions when dependent types are +present. +

    +

    +GF has a form of judgement for semantic definitions, +recognized by the key word def. At its simplest, it is just +the definition of one constant, e.g. +

    +
    +    def one = succ zero ;
    +
    +

    +We can also define a function with arguments, +

    +
    +    def Neg A = Impl A Abs ;
    +
    +

    +which is still a special case of the most general notion of +definition, that of a group of pattern equations: +

    +
    +    def sum x zero = x ;
    +    def sum x (succ y) = succ (sum x y) ;
    +
    +

    +To compute a term is, as in functional programming languages, +simply to follow a chain of reductions until no definition +can be applied. For instance, we compute +

    +
    +    sum one one -->
    +    sum (succ zero) (succ zero) -->
    +    succ (sum (succ zero) zero) -->
    +    succ (succ zero)
    +
    +

    +

    +The def definitions of a grammar induce a notion of +definitional equality among trees: two trees are +definitionally equal if they compute into the same tree. +Thus, trivially, all trees in a chain of computation +(such as the one above) +are definitionally equal to each other. So are the trees +

    +
    +    sum zero (succ one)
    +    succ one
    +    sum (sum zero zero) (sum (succ zero) one)
    +
    +

    +and infinitely many other trees. +

    +

    +A fact that has to be emphasized about def definitions is that +they are not performed as a first step of linearization. +We say that linearization is intensional, which means that +the definitional equality of two trees does not imply that +they have the same linearizations. For instance, the seven terms +above all have different linearizations in arithmetic notation: +

    +
    +    1 + 1
    +    s(0) + s(0)
    +    s(s(0) + 0)
    +    s(s(0))
    +    0 + s(0)
    +    s(1)
    +    0 + 0 + s(0) + 1
    +
    +

    +This notion of intensionality is +no more exotic than the intensionality of any pretty-printing +function of a programming language (function that shows +the expressions of the language as strings). It is vital for +pretty-printing to be intensional in this sense - if we want, +for instance, to trace a chain of computation by pretty-printing each +intermediate step, what we want to see is a sequence of different +expression, which are definitionally equal. +

    +

    +What is more exotic is that GF has two ways of referring to the +abstract syntax objects. In the concrete syntax, the reference is intentional. +In the abstract syntax itself, the reference is always extensional, since +type checking is extensional. The reason is that, +in the type theory with dependent types, types may depend on terms. +Two types depending on terms that are definitionally equal are +equal types. For instance, +

    +
    +    Proof (Od one)
    +    Proof (Od (succ zero))
    +
    +

    +are equal types. Hence, any tree that type checks as a proof that +1 is odd also type checks as a proof that the successor of 0 is odd. +(Recall, in this connection, that the +arguments a category depends on never play any role +in the linearization of trees of that category, +nor in the definition of the linearization type.) +

    +

    +In addition to computation, definitions impose a +paraphrase relation on expressions: +two strings are paraphrases if they +are linearizations of trees that are +definitionally equal. +Paraphrases are sometimes interesting for +translation: the direct translation +of a string, which is the linearization of the same tree +in the targer language, may be inadequate because it is e.g. +unidiomatic or ambiguous. In such a case, +the translation algorithm may be made to consider +translation by a paraphrase. +

    +

    +To stress express the distinction between +constructors (=canonical functions) +and other functions, GF has a judgement form +data to tell that certain functions are canonical, e.g. +

    +
    +    data Nat = succ | zero ;
    +
    +

    +Unlike in Haskell, but similarly to ALF (where constructor functions +are marked with a flag C), +new constructors can be added to +a type with new data judgements. The type signatures of constructors +are given separately, in ordinary fun judgements. +

    + +

    More features of the module system

    + +

    Interfaces, instances, and functors

    +

    Resource grammars and their reuse

    A resource grammar is a grammar built on linguistic grounds, @@ -2308,15 +3232,15 @@ The rest of the modules (black) come from the resource.

    - +

    Restricted inheritance and qualified opening

    - +

    Using the standard resource library

    The example files of this chapter can be found in the directory arithm.

    - +

    The simplest way

    The simplest way is to open a top-level Lang module @@ -2383,7 +3307,7 @@ Here is an example. }

    - +

    How to find resource functions

    The definitions in this example were found by parsing: @@ -2404,7 +3328,7 @@ The definitions in this example were found by parsing: The use of parsing can be systematized by example-based grammar writing, to which we will return later.

    - +

    A functor implementation

    The interesting thing now is that the @@ -2482,7 +3406,7 @@ Here, again, a complete example (abstract Arithm is as above): }

    - +

    Transfer modules

    Transfer means noncompositional tree-transforming operations. @@ -2501,9 +3425,9 @@ See the transfer language documentation for more information.

    - +

    Practical issues

    - +

    Lexers and unlexers

    Lexers and unlexers can be chosen from @@ -2539,7 +3463,7 @@ Given by help -lexer, help -unlexer:

    - +

    Efficiency of grammars

    Issues: @@ -2550,7 +3474,7 @@ Issues:

  • parsing efficiency: -mcfg vs. others - +

    Speech input and output

    Thespeak_aloud = sa command sends a string to the speech @@ -2580,7 +3504,7 @@ The method words only for grammars of English. Both Flite and ATK are freely available through the links above, but they are not distributed together with GF.

    - +

    Multilingual syntax editor

    The @@ -2597,12 +3521,12 @@ Here is a snapshot of the editor: The grammars of the snapshot are from the Letter grammar package.

    - +

    Interactive Development Environment (IDE)

    Forthcoming.

    - +

    Communicating with GF

    Other processes can communicate with the GF command interpreter, @@ -2619,7 +3543,7 @@ Thus the most silent way to invoke GF is - +

    Embedded grammars in Haskell, Java, and Prolog

    GF grammars can be used as parts of programs written in the @@ -2631,15 +3555,15 @@ following languages. The links give more documentation.

  • Prolog - +

    Alternative input and output grammar formats

    A summary is given in the following chart of GF grammar compiler phases:

    - +

    Case studies

    - +

    Interfacing formal and natural languages

    Formal and Informal Software Specifications, -- cgit v1.2.3