author aarne <aarne@cs.chalmers.se> 2005-12-17 20:44:20 +0000
committer aarne <aarne@cs.chalmers.se> 2005-12-17 20:44:20 +0000
commit 14defedc653f50d11a52cecba13632688d1ec811 (patch)
tree 23749f9d5f4c6d33402e9f837e105f70b2f714e5 /doc/tutorial/gf-tutorial2.txt
parent d3157ad7e7a85a78e60a5bc406ec6cc805037e06 (diff)
tutorial; mkMorpho bug fix
Diffstat (limited to 'doc/tutorial/gf-tutorial2.txt')
-rw-r--r-- doc/tutorial/gf-tutorial2.txt 194
1 files changed, 147 insertions, 47 deletions
diff --git a/doc/tutorial/gf-tutorial2.txt b/doc/tutorial/gf-tutorial2.txt
index c2b8b853d..72f3cce3a 100644
--- a/doc/tutorial/gf-tutorial2.txt
+++ b/doc/tutorial/gf-tutorial2.txt
@@ -464,18 +464,11 @@ type used for linearization in GF is
```
which has one field, with **label** ``s`` and type ``Str``.
-
-
Examples of records of this type are
```
{s = "foo"}
{s = "hello" ++ "world"}
```
-The type ``Str`` is really the type of **token lists**, but
-most of the time one can conveniently think of it as the type of strings,
-denoted by string literals in double quotes.
-
-
Whenever a record ``r`` of type ``{s : Str}`` is given,
``r.s`` is an object of type ``Str``. This is
@@ -485,6 +478,23 @@ of fields from a record:
- if //r// : ``{`` ... //p// : //T// ... ``}`` then //r.p// : //T//
+The type ``Str`` is really the type of **token lists**, but
+most of the time one can conveniently think of it as the type of strings,
+denoted by string literals in double quotes.
+
+Notice that
+``` "hello world"
+is not recommended as an expression of type ``Str``. It denotes
+a token with a space in it, and will usually
+not work with the lexical analysis that precedes parsing. A shorthand
+exemplified by
+``` ["hello world and people"] === "hello" ++ "world" ++ "and" ++ "people"
+can be used for lists of tokens. The expression
+``` []
+denotes the empty token list.
+
+
+
%--!
===An abstract syntax example===
@@ -1274,8 +1284,6 @@ different linearization types of noun phrases and verb phrases:
We say that the number of ``NP`` is an **inherent feature**,
whereas the number of ``VP`` is **parametric**.
-
-
The agreement rule itself is expressed in the linearization rule of
the predication structure:
```
@@ -1295,28 +1303,33 @@ the formation of noun phrases and verb phrases.
===English concrete syntax with parameters===
```
-concrete PaleolithicEng of Paleolithic = open MorphoEng in {
+concrete PaleolithicEng of Paleolithic = open Prelude, MorphoEng in {
lincat
- S, A = {s : Str} ;
+ S, A = SS ;
VP, CN, V, TV = {s : Number => Str} ;
NP = {s : Str ; n : Number} ;
lin
- PredVP np vp = {s = np.s ++ vp.s ! np.n} ;
+ PredVP np vp = ss (np.s ++ vp.s ! np.n) ;
UseV v = v ;
ComplTV tv np = {s = \\n => tv.s ! n ++ np.s} ;
- UseA a = {s = \\n => case n of {Sg => "is" ; Pl => "are"} ++ a.s} ;
- This cn = {s = "this" ++ cn.s ! Sg } ;
- Indef cn = {s = "a" ++ cn.s ! Sg} ;
- All cn = {s = "all" ++ cn.s ! Pl} ;
- Two cn = {s = "two" ++ cn.s ! Pl} ;
+ UseA a = {s = \\n => case n of {Sg => "is" ; Pl => "are"} ++ a.s} ;
+ This = det Sg "this" ;
+ Indef = det Sg "a" ;
+ All = det Pl "all" ;
+ Two = det Pl "two" ;
ModA a cn = {s = \\n => a.s ++ cn.s ! n} ;
Louse = mkNoun "louse" "lice" ;
Snake = regNoun "snake" ;
- Green = {s = "green"} ;
- Warm = {s = "warm"} ;
+ Green = ss "green" ;
+ Warm = ss "warm" ;
Laugh = regVerb "laugh" ;
Sleep = regVerb "sleep" ;
Kill = regVerb "kill" ;
+oper
+ det : Number -> Str -> Noun -> {s : Str ; n : Number} = \n,d,cn -> {
+ s = d ++ cn.s ! n ;
+ n = n
+ } ;
}
```
@@ -1326,22 +1339,18 @@ lin
===Hierarchic parameter types===
The reader familiar with a functional programming language such as
-<a href="http://www.haskell.org">Haskell<a> must have noticed the similarity
-between parameter types in GF and algebraic datatypes (``data`` definitions
+[Haskell http://www.haskell.org] must have noticed the similarity
+between parameter types in GF and **algebraic datatypes** (``data`` definitions
in Haskell). The GF parameter types are actually a special case of algebraic
datatypes: the main restriction is that in GF, these types must be finite.
-(This restriction makes it possible to invert linearization rules into
+(It is this restriction that makes it possible to invert linearization rules into
parsing methods.)
-
-
However, finite is not the same thing as enumerated. Even in GF, parameter
constructors can take arguments, provided these arguments are from other
-parameter types (recursion is forbidden). Such parameter types impose a
-hierarchic order among parameters. They are often useful to define
-linguistically accurate parameter systems.
-
-
+parameter types - only recursion is forbidden. Such parameter types impose a
+hierarchic order among parameters. They are often needed to define
+the linguistically most accurate parameter systems.
To give an example, Swedish adjectives
are inflected in number (singular or plural) and
@@ -1396,7 +1405,7 @@ file for later use, by the command ``morpho_list = ml``
```
> morpho_list -number=25 -cat=V
```
-The number flag gives the number of exercises generated.
+The ``number`` flag gives the number of exercises generated.
@@ -1409,9 +1418,7 @@ verbs, such as //switch off//. The linearization of
a sentence may place the object between the verb and the particle:
//he switched it off//.
-
-
-The first of the following judgements defines transitive verbs as a
+The first of the following judgements defines transitive verbs as
**discontinuous constituents**, i.e. as having a linearization
type with two strings and not just one. The second judgement
shows how the constituents are separated by the object in complementization.
@@ -1419,38 +1426,106 @@ shows how the constituents are separated by the object in complementization.
lincat TV = {s : Number => Str ; s2 : Str} ;
lin ComplTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.s2} ;
```
+There is no restriction on the number of discontinuous constituents
+(or other fields) a ``lincat`` may contain. The only condition is that
+the fields must be of finite types, i.e. built from records, tables,
+parameters, and ``Str``, and not functions. A mathematical result
+about parsing in GF says that the worst-case complexity of parsing
+increases with the number of discontinuous constituents. Moreover,
+the parsing and linearization commands only give reliable results
+for categories whose linearization type has a unique ``Str``-valued
+field labelled ``s``.
+%--!
+==More constructs for concrete syntax==
-GF currently requires that all fields in linearization records that
-have a table with value type ``Str`` have as labels
-either ``s`` or ``s`` with an integer index.
+%--!
+===Free variation===
+Sometimes there are many alternative ways to define a concrete syntax.
+For instance, the verb negation in English can be expressed both by
+//does not// and //doesn't//. In linguistic terms, these expressions
+are in **free variation**. The ``variants`` construct of GF can
+be used to give a list of strings in free variation. For example,
+```
+ NegVerb verb = {s = variants {["does not"] ; "doesn't"} ++ verb.s} ;
+```
+An empty variant list
+```
+ variants {}
+```
+can be used e.g. if a word lacks a certain form.
+In general, ``variants`` should be used cautiously. It is not
+recommended for modules intended as libraries, because the
+user of the library has no way to choose among the variants.
+Moreover, even though ``variants`` admits lists of any type,
+its semantics for complex types can cause surprises.
-%--!
-==Topics still to be written==
-===Free variation===
+===Record extension and subtyping===
+Record types and records can be **extended** with new fields. For instance,
+in German it is natural to see transitive verbs as verbs with a case.
+The symbol ``**`` is used for both constructs.
+```
+ lincat TV = Verb ** {c : Case} ;
-===Record extension, tuples===
+ lin Follow = regVerb "folgen" ** {c = Dative} ;
+```
+To extend a record type or a record with a field whose label it
+already has is a type error.
+A record type //T// is a **subtype** of another one //R//, if //T// has
+all the fields of //R// and possibly other fields. For instance,
+an extension of a record type is always a subtype of it.
+If //T// is a subtype of //R//, an object of //T// can be used whenever
+an object of //R// is required. For instance, a transitive verb can
+be used whenever a verb is required.
-===Predefined types and operations===
+**Contravariance** means that a function taking an //R// as argument
+can also be applied to any object of a subtype //T//.
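
The subtyping rule can be illustrated with a minimal sketch. The module
name ``SubtypeDemo``, the simplified ``Verb`` type, and the operations
``infinitive``, ``folgen`` and ``folgenInf`` are hypothetical names
introduced only for this illustration; they are not part of the grammars
in this tutorial.
```
resource SubtypeDemo = {
  param Case = Nominative | Accusative | Dative ;
  oper
    -- simplified for illustration: a verb with just one string
    Verb : Type = {s : Str} ;

    -- record type extension: TV has the field of Verb plus a case
    TV : Type = Verb ** {c : Case} ;

    -- a function that only needs the fields of Verb
    infinitive : Verb -> Str = \v -> v.s ;

    -- record extension on the object level
    folgen : TV = {s = "folgen"} ** {c = Dative} ;

    -- accepted, because TV is a subtype of Verb
    folgenInf : Str = infinitive folgen ;
}
```
The last definition should type check even though ``infinitive`` is
declared for ``Verb``: the extra field ``c`` is simply ignored.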
-===Lexers and unlexers===
+===Tuples and product types===
+Product types and tuples are syntactic sugar for record types and records:
+```
+ T1 * ... * Tn === {p1 : T1 ; ... ; pn : Tn}
+ <t1, ..., tn> === {p1 = t1 ; ... ; pn = tn}
+```
+Thus the labels ``p1, p2,...`` are hard-coded.
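
As a small sketch of how the hard-coded labels are used, the following
operations project the components of a pair by ``p1`` and ``p2``. The
module name ``TupleDemo`` and the operations ``Pair``, ``swap`` and
``flat`` are hypothetical and serve only as an illustration.
```
resource TupleDemo = {
  oper
    -- a product type; the same as {p1 : Str ; p2 : Str}
    Pair : Type = Str * Str ;

    -- build a new tuple from the projections of the old one
    swap : Pair -> Pair = \p -> <p.p2, p.p1> ;

    -- concatenate the two components into one token list
    flat : Pair -> Str = \p -> p.p1 ++ p.p2 ;
}
```
Since a tuple is just a record, ``swap`` could equally well be written
with the record notation ``{p1 = p.p2 ; p2 = p.p1}``.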
-===Grammars of formal languages===
+===Predefined types and operations===
+
+GF has the following predefined categories in abstract syntax:
+```
+ cat Int ; -- integers, e.g. 0, 5, 743145151019
+ cat Float ; -- floats, e.g. 0.0, 3.1415926
+ cat String ; -- strings, e.g. "", "foo", "123"
+```
+The objects of each of these categories are **literals**
+as indicated in the comments above. No ``fun`` definition
+can have a predefined category as its value type, but
+they can be used as arguments. For example:
+```
+ fun StreetAddress : Int -> String -> Address ;
+ lin StreetAddress number street = {s = number.s ++ street.s} ;
+
+ -- e.g. (StreetAddress 10 "Downing Street") : Address
+```
+
+
+%--!
+==More features of the module system==
===Resource grammars and their reuse===
@@ -1459,19 +1534,44 @@ either ``s`` or ``s`` with an integer index.
===Interfaces, instances, and functors===
-===Speech input and output===
+===Restricted inheritance and qualified opening===
+==More concepts of abstract syntax==
-===Embedded grammars in Haskell, Java, and Prolog===
+===Dependent types===
+
+===Higher-order abstract syntax===
+
+===Semantic definitions===
+===Case study: grammars of formal languages===
-===Dependent types, variable bindings, semantic definitions===
-===Transfer modules===
+
+==Transfer modules==
+
+
+
+==Practical issues==
+
+
+===Lexers and unlexers===
+
+
+===Efficiency of grammars===
+
+
+===Speech input and output===
+
+
+===Communicating with GF===
+
+
+===Embedded grammars in Haskell, Java, and Prolog===
===Alternative input and output grammar formats===