tutorial; mkMorpho bug fix

author: aarne <aarne@cs.chalmers.se> 2005-12-17 20:44:20 +0000
committer: aarne <aarne@cs.chalmers.se> 2005-12-17 20:44:20 +0000
commit: 14defedc653f50d11a52cecba13632688d1ec811 (patch)
tree: 23749f9d5f4c6d33402e9f837e105f70b2f714e5 /doc/tutorial/gf-tutorial2.txt
parent: d3157ad7e7a85a78e60a5bc406ec6cc805037e06 (diff)
1 files changed, 147 insertions, 47 deletions
diff --git a/doc/tutorial/gf-tutorial2.txt b/doc/tutorial/gf-tutorial2.txt
index c2b8b853d..72f3cce3a 100644
--- a/doc/tutorial/gf-tutorial2.txt
+++ b/doc/tutorial/gf-tutorial2.txt
@@ -464,18 +464,11 @@ type used for linearization in GF is
 ```
 which has one field, with **label** ``s`` and type ``Str``.
 
-
-
 Examples of records of this type are
 ```
   {s = "foo"}
   {s = "hello" ++ "world"}
 ```
-The type ``Str`` is really the type of **token lists**, but
-most of the time one can conveniently think of it as the type of strings,
-denoted by string literals in double quotes.
-
-
 
 Whenever a record ``r`` of type ``{s : Str}`` is given,
 ``r.s`` is an object of type ``Str``. This is
@@ -485,6 +478,23 @@ of fields from a record:
 - if //r// : ``{`` ... //p// : //T// ... ``}`` then //r.p// : //T//
 
 
+The type ``Str`` is really the type of **token lists**, but
+most of the time one can conveniently think of it as the type of strings,
+denoted by string literals in double quotes. 
+
+Notice that
+```  "hello world"
+is not recommended as an expression of type ``Str``. It denotes
+a token with a space in it, and will usually 
+not work with the lexical analysis that precedes parsing. A shorthand
+exemplified by
+```  ["hello world and people"]  === "hello" ++ "world" ++ "and" ++ "people"
+can be used for lists of tokens. The expression
+```  []
+denotes the empty token list.
+
+
+
 %--!
 ===An abstract syntax example===
 
@@ -1274,8 +1284,6 @@ different linearization types of noun phrases and verb phrases:
 We say that the number of ``NP`` is an **inherent feature**,
 whereas the number of  ``VP`` is **parametric**.
 
-
-
 The agreement rule itself is expressed in the linearization rule of
 the predication structure:
 ```
@@ -1295,28 +1303,33 @@ the formation of noun phrases and verb phrases.
 ===English concrete syntax with parameters===
 
 ```
-concrete PaleolithicEng of Paleolithic = open MorphoEng in {
+concrete PaleolithicEng of Paleolithic = open Prelude, MorphoEng in {
 lincat 
-  S, A          = {s : Str} ; 
+  S, A          = SS ; 
   VP, CN, V, TV = {s : Number => Str} ; 
   NP            = {s : Str ; n : Number} ; 
 lin
-  PredVP np vp  = {s = np.s ++ vp.s ! np.n} ;
+  PredVP np vp  = ss (np.s ++ vp.s ! np.n) ;
   UseV   v      = v ;
   ComplTV tv np = {s = \\n => tv.s ! n ++ np.s} ;
-  UseA   a   = {s = \\n => case n of {Sg => "is" ; Pl => "are"} ++ a.s} ;
-  This  cn   = {s = "this" ++ cn.s ! Sg } ; 
-  Indef cn   = {s = "a" ++ cn.s ! Sg} ; 
-  All   cn   = {s = "all" ++ cn.s ! Pl} ; 
-  Two   cn   = {s = "two" ++ cn.s ! Pl} ; 
+  UseA a   = {s = \\n => case n of {Sg => "is" ; Pl => "are"} ++ a.s} ;
+  This     = det Sg "this" ;
+  Indef    = det Sg "a" ;
+  All      = det Pl "all" ;
+  Two      = det Pl "two" ;
   ModA  a cn = {s = \\n => a.s ++ cn.s ! n} ;
   Louse  = mkNoun "louse" "lice" ;
   Snake  = regNoun "snake" ;
-  Green  = {s = "green"} ;
-  Warm   = {s = "warm"} ;
+  Green  = ss "green" ;
+  Warm   = ss "warm" ;
   Laugh  = regVerb "laugh" ;
   Sleep  = regVerb "sleep" ;
   Kill   = regVerb "kill" ;
+oper
+  det : Number -> Str -> Noun -> {s : Str ; n : Number} = \n,d,cn -> {
+    s = d ++ cn.s ! n ;
+    n = n
+    } ;
 }
 ```
 
@@ -1326,22 +1339,18 @@ lin
 ===Hierarchic parameter types===
 
 The reader familiar with a functional programming language such as
-<a href="http://www.haskell.org">Haskell<a> must have noticed the similarity
-between parameter types in GF and algebraic datatypes (``data`` definitions
+[Haskell http://www.haskell.org] must have noticed the similarity
+between parameter types in GF and **algebraic datatypes** (``data`` definitions
 in Haskell). The GF parameter types are actually a special case of algebraic
 datatypes: the main restriction is that in GF, these types must be finite.
-(This restriction makes it possible to invert linearization rules into
+(It is this restriction that makes it possible to invert linearization rules into
 parsing methods.)
 
-
-
 However, finite is not the same thing as enumerated. Even in GF, parameter
 constructors can take arguments, provided these arguments are from other
-parameter types (recursion is forbidden). Such parameter types impose a
-hierarchic order among parameters. They are often useful to define
-linguistically accurate parameter systems.
-
-
+parameter types - only recursion is forbidden. Such parameter types impose a
+hierarchic order among parameters. They are often needed to define
+the linguistically most accurate parameter systems.
 
 To give an example, Swedish adjectives
 are inflected in number (singular or plural) and
@@ -1396,7 +1405,7 @@ file for later use, by the command ``morpho_list = ml``
 ```
   > morpho_list -number=25 -cat=V
 ```
-The number flag gives the number of exercises generated.
+The ``number`` flag gives the number of exercises generated.
 
 
 
@@ -1409,9 +1418,7 @@ verbs, such as //switch off//. The linearization of
 a sentence may place the object between the verb and the particle:
 //he switched it off//.
 
-
-
-The first of the following judgements defines transitive verbs as a
+The first of the following judgements defines transitive verbs as
 **discontinuous constituents**, i.e. as having a linearization
 type with two strings and not just one. The second judgement
 shows how the constituents are separated by the object in complementization.
@@ -1419,38 +1426,106 @@ shows how the constituents are separated by the object in complementization.
   lincat TV = {s : Number => Str ; s2 : Str} ;
   lin ComplTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.s2} ;
 ```
+There is no restriction on the number of discontinuous constituents
+(or other fields) a ``lincat`` may contain. The only condition is that
+the fields must be of finite types, i.e. built from records, tables,
+parameters, and ``Str``, and not functions. A mathematical result
+about parsing in GF says that the worst-case complexity of parsing
+increases with the number of discontinuous constituents. Moreover,
+the parsing and linearization commands only give reliable results
+for categories whose linearization type has a unique ``Str`` valued
+field labelled ``s``.
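+
+As a minimal sketch of a lexical entry of this type, assuming a
+hypothetical function ``TurnOff : TV`` that is not part of the
+grammar above, the particle goes in the field ``s2``:
+```
+  -- TurnOff is a hypothetical TV, used for illustration only
+  lin TurnOff = {s = table {Sg => "turns" ; Pl => "turn"} ; s2 = "off"} ;
+```
+With an object whose ``s`` field is ``"it"``, ``ComplTV`` then gives
+``"turns" ++ "it" ++ "off"`` for ``Sg``.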
 
 
+%--!
+==More constructs for concrete syntax==
 
-GF currently requires that all fields in linearization records that
-have a table with value type ``Str`` have as labels
-either ``s`` or ``s`` with an integer index.
 
+%--!
+===Free variation===
 
+Sometimes there are many alternative ways to define a concrete syntax.
+For instance, the verb negation in English can be expressed both by
+//does not// and //doesn't//. In linguistic terms, these expressions
+are in **free variation**. The ``variants`` construct of GF can
+be used to give a list of strings in free variation. For example,
+```
+  NegVerb verb = {s = variants {["does not"] ; "doesn't"} ++ verb.s} ;
+```
+An empty variant list
+```
+  variants {}
+```
+can be used e.g. if a word lacks a certain form.
 
+In general, ``variants`` should be used cautiously. It is not
+recommended for modules intended as libraries, because the
+user of the library has no way to choose among the variants.
+Moreover, even though ``variants`` admits lists of any type,
+its semantics for complex types can cause surprises.
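+
+As a sketch of the empty variant list, assuming a hypothetical noun
+``Scissors`` and assuming that ``mkNoun`` takes the singular form
+first and the plural form second, a missing singular can be written as:
+```
+  -- Scissors is hypothetical; its singular form has no variants
+  lin Scissors = mkNoun (variants {}) "scissors" ;
+```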
 
-%--!
-==Topics still to be written==
 
 
-===Free variation===
 
+===Record extension and subtyping===
 
+Record types and records can be **extended** with new fields. For instance,
+in German it is natural to see transitive verbs as verbs with a case.
+The symbol ``**`` is used for both constructs.
+```
+  lincat TV = Verb ** {c : Case} ;
 
-===Record extension, tuples===
+  lin Follow = regVerb "folgen" ** {c = Dative} ; 
+```
+To extend a record type or a record with a field whose label it
+already has is a type error.
 
+A record type //T// is a **subtype** of another one //R//, if //T// has
+all the fields of //R// and possibly other fields. For instance,
+an extension of a record type is always a subtype of it.
 
+If //T// is a subtype of //R//, an object of //T// can be used whenever
+an object of //R// is required. For instance, a transitive verb can
+be used whenever a verb is required.
 
-===Predefined types and operations===
+**Contravariance** means that a function taking an //R// as argument
+can also be applied to any object of a subtype //T//.
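+
+The following sketch illustrates this in a resource module. All
+names in it (``sgForm``, ``folgen``, ``folgtStr``, and the two-valued
+``Case``) are assumptions made for the example only:
+```
+  param Number = Sg | Pl ;
+  param Case   = Accusative | Dative ;
+  oper
+    Verb : Type = {s : Number => Str} ;
+    TV   : Type = Verb ** {c : Case} ;     -- TV is a subtype of Verb
+    sgForm : Verb -> Str = \v -> v.s ! Sg ;
+    folgen : TV = {s = table {Sg => "folgt" ; Pl => "folgen"} ; c = Dative} ;
+    -- accepted: sgForm expects a Verb, and a TV has all fields of Verb
+    folgtStr : Str = sgForm folgen ;
+```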
 
 
 
-===Lexers and unlexers===
+===Tuples and product types===
 
+Product types and tuples are syntactic sugar for record types and records:
+```
+  T1 * ... * Tn   ===   {p1 : T1 ; ... ; pn : Tn}
+  <t1, ...,  tn>  ===   {p1 = t1 ; ... ; pn = tn}
+```
+Thus the labels ``p1, p2,...`` are hard-coded.
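+
+For instance, in a sketch with hypothetical operation names, the
+components of a tuple are projected with the ordinary record labels:
+```
+  oper pair  : Str * Number = <"snake", Sg> ;   -- same as {p1 = "snake" ; p2 = Sg}
+  oper first : Str = pair.p1 ;                  -- the string "snake"
+```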
 
 
-===Grammars of formal languages===
 
+===Predefined types and operations===
+
+GF has the following predefined categories in abstract syntax:
+```
+  cat Int ;     -- integers, e.g. 0, 5, 743145151019
+  cat Float ;   -- floats, e.g.   0.0, 3.1415926
+  cat String ;  -- strings, e.g.  "", "foo", "123"
+```
+The objects of each of these categories are **literals**
+as indicated in the comments above. No ``fun`` definition
+can have a predefined category as its value type, but
+they can be used as arguments. For example:
+```
+  fun StreetAddress : Int -> String -> Address ;
+  lin StreetAddress number street = {s = number.s ++ street.s} ;
+
+  -- e.g. (StreetAddress 10 "Downing Street") : Address
+```
+
+
+%--!
+==More features of the module system==
 
 
 ===Resource grammars and their reuse===
@@ -1459,19 +1534,44 @@ either ``s`` or ``s`` with an integer index.
 ===Interfaces, instances, and functors===
 
 
-===Speech input and output===
+===Restricted inheritance and qualified opening===
 
 
+==More concepts of abstract syntax==
 
-===Embedded grammars in Haskell, Java, and Prolog===
 
+===Dependent types===
+
+===Higher-order abstract syntax===
+
+===Semantic definitions===
 
+===Case study: grammars of formal languages===
 
-===Dependent types, variable bindings, semantic definitions===
 
 
 
-===Transfer modules===
+
+==Transfer modules==
+
+
+
+==Practical issues==
+
+
+===Lexers and unlexers===
+
+
+===Efficiency of grammars===
+
+
+===Speech input and output===
+
+
+===Communicating with GF===
+
+
+===Embedded grammars in Haskell, Java, and Prolog===
 
 
 ===Alternative input and output grammar formats===