updated tutorial and resource howto

author: aarne <aarne@cs.chalmers.se> 2006-06-15 23:05:42 +0000
committer: aarne <aarne@cs.chalmers.se> 2006-06-15 23:05:42 +0000
commit: cb3dfbd9bf54f9b3cf403ba5e1629bf7fff132f4 (patch)
tree: b3987feacd3b9ca8db55bd906de0698ac6582f77 /doc/tutorial/gf-tutorial2.html
parent: a25c73cb1ae11c5a249ccd1466bf91bc2965f145 (diff)
1 files changed, 378 insertions, 118 deletions
diff --git a/doc/tutorial/gf-tutorial2.html b/doc/tutorial/gf-tutorial2.html
index 00caa1d58..d657f7cc8 100644
--- a/doc/tutorial/gf-tutorial2.html
+++ b/doc/tutorial/gf-tutorial2.html
@@ -7,7 +7,7 @@
 <P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1>
 <FONT SIZE="4">
 <I>Author: Aarne Ranta &lt;aarne (at) cs.chalmers.se&gt;</I><BR>
-Last update: Wed Jan 25 16:03:03 2006
+Last update: Fri Jun 16 01:02:28 2006
 </FONT></CENTER>
 
 <P></P>
@@ -34,7 +34,7 @@ Last update: Wed Jan 25 16:03:03 2006
       <LI><A HREF="#toc15">Labelled context-free grammars</A>
       <LI><A HREF="#toc16">The labelled context-free format</A>
       </UL>
-    <LI><A HREF="#toc17">The ``.gf`` grammar format</A>
+    <LI><A HREF="#toc17">The .gf grammar format</A>
       <UL>
       <LI><A HREF="#toc18">Abstract and concrete syntax</A>
       <LI><A HREF="#toc19">Judgement forms</A>
@@ -70,8 +70,8 @@ Last update: Wed Jan 25 16:03:03 2006
       <UL>
       <LI><A HREF="#toc42">Parameters and tables</A>
       <LI><A HREF="#toc43">Inflection tables, paradigms, and ``oper`` definitions</A>
-      <LI><A HREF="#toc44">Worst-case macros and data abstraction</A>
-      <LI><A HREF="#toc45">A system of paradigms using ``Prelude`` operations</A>
+      <LI><A HREF="#toc44">Worst-case functions and data abstraction</A>
+      <LI><A HREF="#toc45">A system of paradigms using Prelude operations</A>
       <LI><A HREF="#toc46">An intelligent noun paradigm using ``case`` expressions</A>
       <LI><A HREF="#toc47">Pattern matching</A>
       <LI><A HREF="#toc48">Morphological ``resource`` modules</A>
@@ -96,34 +96,41 @@ Last update: Wed Jan 25 16:03:03 2006
       <LI><A HREF="#toc63">Prefix-dependent choices</A>
       <LI><A HREF="#toc64">Predefined types and operations</A>
       </UL>
-    <LI><A HREF="#toc65">More features of the module system</A>
+    <LI><A HREF="#toc65">More concepts of abstract syntax</A>
       <UL>
-      <LI><A HREF="#toc66">Interfaces, instances, and functors</A>
-      <LI><A HREF="#toc67">Resource grammars and their reuse</A>
-      <LI><A HREF="#toc68">Restricted inheritance and qualified opening</A>
+      <LI><A HREF="#toc66">GF as a logical framework</A>
+      <LI><A HREF="#toc67">Dependent types</A>
+      <LI><A HREF="#toc68">Higher-order abstract syntax</A>
+      <LI><A HREF="#toc69">Semantic definitions</A>
+      <LI><A HREF="#toc70">List categories</A>
       </UL>
-    <LI><A HREF="#toc69">More concepts of abstract syntax</A>
+    <LI><A HREF="#toc71">More features of the module system</A>
       <UL>
-      <LI><A HREF="#toc70">Dependent types</A>
-      <LI><A HREF="#toc71">Higher-order abstract syntax</A>
-      <LI><A HREF="#toc72">Semantic definitions</A>
-      <LI><A HREF="#toc73">List categories</A>
+      <LI><A HREF="#toc72">Interfaces, instances, and functors</A>
+      <LI><A HREF="#toc73">Resource grammars and their reuse</A>
+      <LI><A HREF="#toc74">Restricted inheritance and qualified opening</A>
       </UL>
-    <LI><A HREF="#toc74">Transfer modules</A>
-    <LI><A HREF="#toc75">Practical issues</A>
+    <LI><A HREF="#toc75">Using the standard resource library</A>
       <UL>
-      <LI><A HREF="#toc76">Lexers and unlexers</A>
-      <LI><A HREF="#toc77">Efficiency of grammars</A>
-      <LI><A HREF="#toc78">Speech input and output</A>
-      <LI><A HREF="#toc79">Multilingual syntax editor</A>
-      <LI><A HREF="#toc80">Interactive Development Environment (IDE)</A>
-      <LI><A HREF="#toc81">Communicating with GF</A>
-      <LI><A HREF="#toc82">Embedded grammars in Haskell, Java, and Prolog</A>
-      <LI><A HREF="#toc83">Alternative input and output grammar formats</A>
+      <LI><A HREF="#toc76">The simplest way</A>
+      <LI><A HREF="#toc77">How to find resource functions</A>
+      <LI><A HREF="#toc78">A functor implementation</A>
       </UL>
-    <LI><A HREF="#toc84">Case studies</A>
+    <LI><A HREF="#toc79">Transfer modules</A>
+    <LI><A HREF="#toc80">Practical issues</A>
       <UL>
-      <LI><A HREF="#toc85">Interfacing formal and natural languages</A>
+      <LI><A HREF="#toc81">Lexers and unlexers</A>
+      <LI><A HREF="#toc82">Efficiency of grammars</A>
+      <LI><A HREF="#toc83">Speech input and output</A>
+      <LI><A HREF="#toc84">Multilingual syntax editor</A>
+      <LI><A HREF="#toc85">Interactive Development Environment (IDE)</A>
+      <LI><A HREF="#toc86">Communicating with GF</A>
+      <LI><A HREF="#toc87">Embedded grammars in Haskell, Java, and Prolog</A>
+      <LI><A HREF="#toc88">Alternative input and output grammar formats</A>
+      </UL>
+    <LI><A HREF="#toc89">Case studies</A>
+      <UL>
+      <LI><A HREF="#toc90">Interfacing formal and natural languages</A>
       </UL>
     </UL>
 
@@ -222,7 +229,8 @@ These grammars can be used as <B>libraries</B> to define application grammars.
 In this way, it is possible to write a high-quality grammar without
 knowing about linguistics: in general, to write an application grammar
 by using the resource library just requires practical knowledge of
-the target language.
+the target language, and all theoretical knowledge about its grammar
+is given by the libraries.
 </P>
 <A NAME="toc4"></A>
 <H3>Who is this tutorial for</H3>
@@ -258,9 +266,10 @@ notation (also known as BNF). The BNF format is often a good
 starting point for GF grammar development, because it is
 simple and widely used. However, the BNF format is not
 good for multilingual grammars. While it is possible to
-translate the words contained in a BNF grammar to another
-language, proper translation usually involves more, e.g.
-changing the word order in
+"translate" by just changing the words contained in a 
+BNF grammar to words of some other
+language, proper translation usually involves more.
+For instance, the order of words may have to be changed:
 </P>
 <PRE>
   Italian cheese ===&gt; formaggio italiano
@@ -279,14 +288,14 @@ Italian adjectives usually have four forms where English
 has just one:
 </P>
 <PRE>
-    delicious (wine | wines | pizza | pizzas)
+    delicious (wine, wines, pizza, pizzas)
     vino delizioso, vini deliziosi, pizza deliziosa, pizze deliziose
 </PRE>
 <P>
 The <B>morphology</B> of a language describes the
 forms of its words. While the complete description of morphology
-belongs to resource grammars, the tutorial will explain the
-main programming concepts involved. This will moreover
+belongs to resource grammars, this tutorial will explain the
+programming concepts involved in morphology. This will moreover
 make it possible to grow the fragment covered by the food example.
 The tutorial will in fact build a toy resource grammar in order
 to illustrate the module structure of library-based application
@@ -584,7 +593,7 @@ a sentence but a sequence of ten sentences.
 <H3>Labelled context-free grammars</H3>
 <P>
 The syntax trees returned by GF's parser in the previous examples
-are not so nice to look at. The identifiers of form <CODE>Mks</CODE>
+are not so nice to look at. The identifiers that form the tree
 are <B>labels</B> of the BNF rules. To see which label corresponds to
 which rule, you can use the <CODE>print_grammar = pg</CODE> command
 with the <CODE>printer</CODE> flag set to <CODE>cf</CODE> (which means context-free):
@@ -631,7 +640,7 @@ labels to each rule.
 In files with the suffix <CODE>.cf</CODE>, you can prefix rules with
 labels that you provide yourself - these may be more useful
 than the automatically generated ones. The following is a possible
-labelling of <CODE>paleolithic.cf</CODE> with nicer-looking labels.
+labelling of <CODE>food.cf</CODE> with nicer-looking labels.
 </P>
 <PRE>
     Is.        S       ::= Item "is" Quality ;
@@ -661,7 +670,7 @@ With this grammar, the trees look as follows:
 <IMG ALIGN="middle" SRC="Tree2.png" BORDER="0" ALT="">
 </P>
 <A NAME="toc17"></A>
-<H2>The ``.gf`` grammar format</H2>
+<H2>The .gf grammar format</H2>
 <P>
 To see what there is in GF's shell state when a grammar
 has been imported, you can give the plain command
@@ -696,7 +705,7 @@ A GF grammar consists of two main parts:
 </UL>
 
 <P>
-The EBNF and CF formats fuse these two things together, but it is possible
+The CF format fuses these two things together, but it is possible
 to take them apart. For instance, the sentence formation rule
 </P>
 <PRE>
@@ -773,7 +782,7 @@ judgement forms:
 <P>
 We return to the precise meanings of these judgement forms later.
 First we will look at how judgements are grouped into modules, and
-show how the paleolithic grammar is
+show how the food grammar is
 expressed by using modules and judgements.
 </P>
 <A NAME="toc20"></A>
@@ -950,7 +959,7 @@ A system with this property is called a <B>multilingual grammar</B>.
 </P>
 <P>
 Multilingual grammars can be used for applications such as
-translation. Let us buid an Italian concrete syntax for
+translation. Let us build an Italian concrete syntax for
 <CODE>Food</CODE> and then test the resulting 
 multilingual grammar.
 </P>
@@ -1179,10 +1188,11 @@ The graph uses
 <LI>square boxes  for concrete modules
 <LI>black-headed arrows for inheritance
 <LI>white-headed arrows for the concrete-of-abstract relation
-<P></P>
-<IMG ALIGN="middle" SRC="Foodmarket.png" BORDER="0" ALT="">
 </UL>
 
+<P>
+<IMG ALIGN="middle" SRC="Foodmarket.png" BORDER="0" ALT="">
+</P>
 <A NAME="toc34"></A>
 <H2>System commands</H2>
 <P>
@@ -1203,7 +1213,7 @@ shell escape symbol <CODE>!</CODE>. The resulting graph was shown in the previou
 <P>
 The command <CODE>print_multi = pm</CODE> is used for printing the current multilingual
 grammar in various formats, of which the format <CODE>-printer=graph</CODE> just
-shows the module dependencies. Use the <CODE>help</CODE> to see what other formats
+shows the module dependencies. Use <CODE>help</CODE> to see what other formats
 are available:
 </P>
 <PRE>
@@ -1216,9 +1226,9 @@ are available:
 <A NAME="toc36"></A>
 <H3>The golden rule of functional programming</H3>
 <P>
-In comparison to the <CODE>.cf</CODE> format, the <CODE>.gf</CODE> format still looks rather
+In comparison to the <CODE>.cf</CODE> format, the <CODE>.gf</CODE> format looks rather
 verbose, and demands lots more characters to be written. You have probably
-done this by the copy-paste-modify method, which is a standard way to
+done this by the copy-paste-modify method, which is a common way to
 avoid repeating work.
 </P>
 <P>
@@ -1232,8 +1242,8 @@ method. The <B>golden rule of functional programming</B> says that
 <P>
 A function separates the shared parts of different computations from the
 changing parts, parameters. In functional programming languages, such as
-<A HREF="http://www.haskell.org">Haskell</A>, it is possible to share muc more than in
-the languages such as C and Java.
+<A HREF="http://www.haskell.org">Haskell</A>, it is possible to share much more than in
+languages such as C and Java.
 </P>
 <A NAME="toc37"></A>
 <H3>Operation definitions</H3>
@@ -1283,11 +1293,8 @@ strings and records.
     resource StringOper = {
       oper
         SS : Type = {s : Str} ;
-  
         ss : Str -&gt; SS = \x -&gt; {s = x} ;
-  
         cc : SS -&gt; SS -&gt; SS = \x,y -&gt; ss (x.s ++ y.s) ;
-  
         prefix : Str -&gt; SS -&gt; SS = \p,x -&gt; ss (p ++ x.s) ;
     }
 </PRE>
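Note on the StringOper hunk above: the four operations can be mimicked in Python to see how they compose. This is only a sketch under stated assumptions, not GF itself. The record type `SS` is modelled as a frozen dataclass, and GF's token concatenation `++` is approximated by joining non-empty strings with a space; the helper name `join_tokens` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SS:
    """Stand-in for the GF record type {s : Str}."""
    s: str


def join_tokens(*parts: str) -> str:
    # Assumption: GF's ++ is rendered as space-separated tokens,
    # and empty strings contribute no token.
    return " ".join(p for p in parts if p)


def ss(x: str) -> SS:
    # ss : Str -> SS = \x -> {s = x}
    return SS(x)


def cc(x: SS, y: SS) -> SS:
    # cc : SS -> SS -> SS = \x,y -> ss (x.s ++ y.s)
    return ss(join_tokens(x.s, y.s))


def prefix(p: str, x: SS) -> SS:
    # prefix : Str -> SS -> SS = \p,x -> ss (p ++ x.s)
    return ss(join_tokens(p, x.s))
```

For instance, `prefix("this", cc(ss("Italian"), ss("cheese")))` builds one record from three strings without the caller ever touching the field `s` directly.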
@@ -1433,7 +1440,7 @@ forms of a word are formed.
 </P>
 <P>
 From GF point of view, a paradigm is a function that takes a <B>lemma</B> -
-a string also known as a <B>dictionary form</B> - and returns an inflection
+also known as a <B>dictionary form</B> - and returns an inflection
 table of desired type. Paradigms are not functions in the sense of the
 <CODE>fun</CODE> judgements of abstract syntax (which operate on trees and not
 on strings), but operations defined in <CODE>oper</CODE> judgements.
@@ -1457,13 +1464,13 @@ are written together to form one <B>token</B>. Thus, for instance,
 </PRE>
 <P></P>
 <A NAME="toc44"></A>
-<H3>Worst-case macros and data abstraction</H3>
+<H3>Worst-case functions and data abstraction</H3>
 <P>
 Some English nouns, such as <CODE>mouse</CODE>, are so irregular that
 it makes no sense to see them as instances of a paradigm. Even
 then, it is useful to perform <B>data abstraction</B> from the
 definition of the type <CODE>Noun</CODE>, and introduce a constructor
-operation, a <B>worst-case macro</B> for nouns:
+operation, a <B>worst-case function</B> for nouns:
 </P>
 <PRE>
     oper mkNoun : Str -&gt; Str -&gt; Noun = \x,y -&gt; {
@@ -1490,7 +1497,7 @@ and
 instead of writing the inflection table explicitly.
 </P>
 <P>
-The grammar engineering advantage of worst-case macros is that
+The grammar engineering advantage of worst-case functions is that
 the author of the resource module may change the definitions of
 <CODE>Noun</CODE> and <CODE>mkNoun</CODE>, and still retain the
 interface (i.e. the system of type signatures) that makes it
@@ -1498,7 +1505,7 @@ correct to use these functions in concrete modules. In programming
 terms, <CODE>Noun</CODE> is then treated as an <B>abstract datatype</B>.
 </P>
 <A NAME="toc45"></A>
-<H3>A system of paradigms using ``Prelude`` operations</H3>
+<H3>A system of paradigms using Prelude operations</H3>
 <P>
 In addition to the completely regular noun paradigm <CODE>regNoun</CODE>, 
 some other frequent noun paradigms deserve to be
@@ -1707,7 +1714,7 @@ The rule of subject-verb agreement in English says that the verb
 phrase must be inflected in the number of the subject. This
 means that a noun phrase (functioning as a subject), inherently
 <I>has</I> a number, which it passes to the verb. The verb does not
-<I>have</I> a number, but must be able to receive whatever number the
+<I>have</I> a number, but must be able to <I>receive</I> whatever number the
 subject has. This distinction is nicely represented by the
 different linearization types of <B>noun phrases</B> and <B>verb phrases</B>:
 </P>
@@ -1717,7 +1724,8 @@ different linearization types of <B>noun phrases</B> and <B>verb phrases</B>:
 </PRE>
 <P>
 We say that the number of <CODE>NP</CODE> is an <B>inherent feature</B>,
-whereas the number of  <CODE>NP</CODE> is <B>parametric</B>.
+whereas the number of <CODE>VP</CODE> is a <B>variable feature</B> (or a
+<B>parametric feature</B>).
 </P>
 <P>
 The agreement rule itself is expressed in the linearization rule of
@@ -1823,7 +1831,7 @@ Here is an example of pattern matching, the paradigm of regular adjectives.
       }
 </PRE>
 <P>
-A constructor can have patterns as arguments. For instance,
+A constructor can be used as a pattern that has patterns as arguments. For instance,
 the adjectival paradigm in which the two singular forms are the same, 
 can be defined
 </P>
@@ -1837,9 +1845,9 @@ can be defined
 <A NAME="toc54"></A>
 <H3>Morphological analysis and morphology quiz</H3>
 <P>
-Even though in GF morphology
-is mostly seen as an auxiliary of syntax, a morphology once defined
-can be used on its own right. The command <CODE>morpho_analyse = ma</CODE>
+Even though morphology is in GF
+mostly used as an auxiliary for syntax, it
+can also be useful in its own right. The command <CODE>morpho_analyse = ma</CODE>
 can be used to read a text and return for each word the analyses that
 it has in the current concrete syntax.
 </P>
@@ -1865,11 +1873,12 @@ the category is set to be something else than <CODE>S</CODE>. For instance,
     Score 0/1
 </PRE>
 <P>
-Finally, a list of morphological exercises and save it in a
+Finally, a list of morphological exercises can be generated
+off-line and saved in a
 file for later use, by the command <CODE>morpho_list = ml</CODE>
 </P>
 <PRE>
-    &gt; morpho_list -number=25 -cat=V
+    &gt; morpho_list -number=25 -cat=V | wf exx.txt
 </PRE>
 <P>
 The <CODE>number</CODE> flag gives the number of exercises generated.
@@ -1884,25 +1893,36 @@ a sentence may place the object between the verb and the particle:
 <I>he switched it off</I>.
 </P>
 <P>
-The first of the following judgements defines transitive verbs as
+The following judgement defines transitive verbs as
 <B>discontinuous constituents</B>, i.e. as having a linearization
-type with two strings and not just one. The second judgement
+type with two strings and not just one. 
+</P>
+<PRE>
+    lincat TV = {s : Number =&gt; Str ; part : Str} ;
+</PRE>
+<P>
+This linearization rule
 shows how the constituents are separated by the object in complementization.
 </P>
 <PRE>
-    lincat TV         = {s : Number =&gt; Str ; part : Str} ;
     lin PredTV tv obj = {s = \\n =&gt; tv.s ! n ++ obj.s ++ tv.part} ;
 </PRE>
 <P>
 There is no restriction in the number of discontinuous constituents
 (or other fields) a  <CODE>lincat</CODE> may contain. The only condition is that
 the fields must be of finite types, i.e. built from records, tables,
-parameters, and <CODE>Str</CODE>, and not functions. A mathematical result
+parameters, and <CODE>Str</CODE>, and not functions. 
+</P>
+<P>
+A mathematical result
 about parsing in GF says that the worst-case complexity of parsing
-increases with the number of discontinuous constituents. Moreover,
-the parsing and linearization commands only give reliable results
-for categories whose linearization type has a unique <CODE>Str</CODE> valued
-field labelled <CODE>s</CODE>.
+increases with the number of discontinuous constituents. This is
+potentially a reason to avoid discontinuous constituents.
+Moreover, the parsing and linearization commands only give accurate
+results for categories whose linearization type has a unique <CODE>Str</CODE> 
+valued field labelled <CODE>s</CODE>. Therefore, discontinuous constituents
+are not a good idea in top-level categories accessed by the users
+of a grammar application.
 </P>
 <A NAME="toc56"></A>
 <H2>More constructs for concrete syntax</H2>
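Note on the discontinuous-constituent hunk above: the `PredTV` rule can be traced with a small Python model. This is a sketch, assuming that the `Number` parameter has the two values `Sg` and `Pl`, that records and tables are modelled as dictionaries, and that `++` is rendered as a space. The lexical entries for "switch off" and "it" are made up for illustration.

```python
NUMBERS = ("Sg", "Pl")  # assumed values of the Number parameter


def pred_tv(tv: dict, obj: dict) -> dict:
    # lin PredTV tv obj = {s = \\n => tv.s ! n ++ obj.s ++ tv.part}
    # The object is placed between the verb form and the particle.
    return {
        "s": {n: " ".join([tv["s"][n], obj["s"], tv["part"]]) for n in NUMBERS}
    }


# Hypothetical entries: a TV has a table of verb forms plus a particle.
switch_off = {"s": {"Sg": "switches", "Pl": "switch"}, "part": "off"}
it = {"s": "it"}

vp = pred_tv(switch_off, it)
```

The result `vp` is again a table over `Number`, so the verb phrase can still receive its number from the subject; the two strings of the verb only meet when the object is supplied.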
@@ -1953,8 +1973,25 @@ can be used e.g. if a word lacks a certain form.
 In general, <CODE>variants</CODE> should be used cautiously. It is not
 recommended for modules aimed to be libraries, because the
 user of the library has no way to choose among the variants.
-Moreover, even though <CODE>variants</CODE> admits lists of any type,
-its semantics for complex types can cause surprises.
+Moreover, <CODE>variants</CODE> is only defined for basic types (<CODE>Str</CODE>
+and parameter types). The grammar compiler will admit
+<CODE>variants</CODE> for any types, but it will push it to the
+level of basic types in a way that may be unwanted.
+For instance, German has two words meaning "car", 
+<I>Wagen</I>, which is Masculine, and <I>Auto</I>, which is Neuter.
+However, if one writes
+</P>
+<PRE>
+    variants {{s = "Wagen" ; g = Masc} ; {s = "Auto" ; g = Neutr}}
+</PRE>
+<P>
+this will compute to
+</P>
+<PRE>
+    {s = variants {"Wagen" ; "Auto"} ; g = variants {Masc ; Neutr}}
+</PRE>
+<P>
+which will also accept erroneous combinations of strings and genders.
 </P>
 <A NAME="toc59"></A>
 <H3>Record extension and subtyping</H3>
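Note on the `variants` hunk above: the effect of pushing `variants` down to the basic types can be reproduced by collecting the alternatives field by field and then enumerating every combination. The Python below is a sketch of that reading only; the function names are hypothetical and the grammar compiler's real algorithm is not shown in the tutorial.

```python
from itertools import product

# The two records the grammarian intended as free variants.
intended = [
    {"s": "Wagen", "g": "Masc"},
    {"s": "Auto", "g": "Neutr"},
]


def push_variants_inward(records: list) -> dict:
    """Collect the alternatives separately for each field, keeping order."""
    fields = {}
    for record in records:
        for label, value in record.items():
            alternatives = fields.setdefault(label, [])
            if value not in alternatives:
                alternatives.append(value)
    return fields


def readings(fieldwise: dict) -> list:
    """Enumerate every record the field-wise variants would accept."""
    labels = list(fieldwise)
    return [
        dict(zip(labels, combo))
        for combo in product(*(fieldwise[label] for label in labels))
    ]


accepted = readings(push_variants_inward(intended))
```

Two intended records become four accepted ones, among them the erroneous pairings of "Wagen" with Neutr and "Auto" with Masc.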
@@ -2039,9 +2076,6 @@ possible to write, slightly surprisingly,
 <A NAME="toc62"></A>
 <H3>Regular expression patterns</H3>
 <P>
-(New since 7 January 2006.)
-</P>
-<P>
 To define string operations computed at compile time, such
 as in morphology, it is handy to use regular expression patterns:
 </P>
@@ -2076,7 +2110,6 @@ Another example: English noun plural formation.
       x + "y"                           =&gt; x + "ies" ;
       _                                 =&gt; w + "s"
       } ;
-  
 </PRE>
 <P>
 Semantics: variables are always bound to the <B>first match</B>, which is the first
@@ -2085,8 +2118,10 @@ in the sequence of binding lists <CODE>Match p v</CODE> defined as follows. In t
 </P>
 <PRE>
     Match (p1|p2) v = Match p1 v ++ Match p2 v
-    Match (p1+p2) s = [Match p1 s1 ++ Match p2 s2 | i &lt;- [0..length s], (s1,s2) = splitAt i s]
-    Match p*      s = Match "" s ++ Match p s ++ Match (p + p) s ++ ...
+    Match (p1+p2) s = [Match p1 s1 ++ Match p2 s2 | 
+                         i &lt;- [0..length s], (s1,s2) = splitAt i s]
+    Match p*      s = [[]] if Match "" s ++ Match p s ++ Match (p+p) s ++... /= []
+    Match -p      v = [[]] if Match p v = []
     Match c       v = [[]] if c == v  -- for constant and literal patterns c
     Match x       v = [[(x,v)]]       -- for variable patterns x
     Match x@p     v = [[(x,v)]] + M   if M = Match p v /= []
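Note on the `Match` equations above: a subset of them (alternation, concatenation, literals, and variables) can be run directly in Python to check the "first match" rule. This is a sketch under assumptions. Patterns are encoded as tuples, which is a made-up representation, and the concatenation equation is read as combining every binding list of the left part with every binding list of the right part, with split points tried from left to right.

```python
def match(pattern: tuple, s: str) -> list:
    """Return the sequence of binding lists for pattern against string s."""
    kind = pattern[0]
    if kind == "lit":            # Match c v = [[]] if c == v
        return [[]] if pattern[1] == s else []
    if kind == "var":            # Match x v = [[(x,v)]]
        return [[(pattern[1], s)]]
    if kind == "alt":            # Match (p1|p2) v = Match p1 v ++ Match p2 v
        return match(pattern[1], s) + match(pattern[2], s)
    if kind == "seq":            # Match (p1+p2) s: try every split of s in order
        results = []
        for i in range(len(s) + 1):
            for left in match(pattern[1], s[:i]):
                for right in match(pattern[2], s[i:]):
                    results.append(left + right)
        return results
    raise ValueError(f"unknown pattern kind: {kind}")


def first_match(pattern: tuple, s: str):
    """Variables are bound according to the first binding list, if any."""
    all_matches = match(pattern, s)
    return dict(all_matches[0]) if all_matches else None


# x + "e" + y
x_e_y = ("seq", ("var", "x"), ("seq", ("lit", "e"), ("var", "y")))
```

Because split points are tried from left to right, the first variable takes the shortest prefix that lets the rest of the pattern succeed; the second binding list for "peter" (with `x = "pet"`) exists but is never used.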
@@ -2097,14 +2132,18 @@ Examples:
 </P>
 <UL>
 <LI><CODE>x + "e" + y</CODE> matches <CODE>"peter"</CODE> with <CODE>x = "p", y = "ter"</CODE>
-<LI><CODE>x@("foo"*)</CODE> matches any token with <CODE>x = ""</CODE>
-<LI><CODE>x + y@("er"*)</CODE> matches <CODE>"burgerer"</CODE> with <CODE>x = "burg", y = "erer"</CODE>
+<LI><CODE>x + "er"*</CODE> matches <CODE>"burgerer"</CODE> with <CODE>x = "burg"</CODE>
 </UL>
 
 <A NAME="toc63"></A>
 <H3>Prefix-dependent choices</H3>
 <P>
-The construct exemplified in
+Sometimes a token has different forms depending on the token
+that follows. An example is the English indefinite article,
+which is <I>an</I> if a vowel follows, <I>a</I> otherwise.
+Which form is chosen can only be decided at run time, i.e.
+when a string is actually built. GF has a special construct for
+such tokens, the <CODE>pre</CODE> construct exemplified in
 </P>
 <PRE>
     oper artIndef : Str = 
@@ -2152,22 +2191,61 @@ they can be used as arguments. For example:
   
     -- e.g. (StreetAddress 10 "Downing Street") : Address
 </PRE>
-<P></P>
+<P>
+The linearization type is <CODE>{s : Str}</CODE> for all these categories.
+</P>
 <A NAME="toc65"></A>
-<H2>More features of the module system</H2>
+<H2>More concepts of abstract syntax</H2>
 <A NAME="toc66"></A>
-<H3>Interfaces, instances, and functors</H3>
+<H3>GF as a logical framework</H3>
+<P>
+In this section, we will show how 
+to encode advanced semantic concepts in an abstract syntax.
+We use concepts inherited from <B>type theory</B>. Type theory
+is the basis of many systems known as <B>logical frameworks</B>, which are
+used for representing mathematical theorems and their proofs on a computer.
+In fact, GF has a logical framework as its proper part:
+this part is the abstract syntax.
+</P>
+<P>
+In a logical framework, the formalization of a mathematical theory
+is a set of type and function declarations. The following is an example
+of such a theory, represented as an <CODE>abstract</CODE> module in GF.
+</P>
+<PRE>
+    abstract Geometry = {
+      cat 
+        Line ; Point ; Circle ;            -- basic types of figures
+        Prop ;                             -- proposition
+      fun
+        Parallel : Line -&gt; Line -&gt; Prop ;  -- x is parallel to y
+        Centre : Circle -&gt; Point ;         -- the centre of c
+      } 
+</PRE>
+<P></P>
 <A NAME="toc67"></A>
+<H3>Dependent types</H3>
+<A NAME="toc68"></A>
+<H3>Higher-order abstract syntax</H3>
+<A NAME="toc69"></A>
+<H3>Semantic definitions</H3>
+<A NAME="toc70"></A>
+<H3>List categories</H3>
+<A NAME="toc71"></A>
+<H2>More features of the module system</H2>
+<A NAME="toc72"></A>
+<H3>Interfaces, instances, and functors</H3>
+<A NAME="toc73"></A>
 <H3>Resource grammars and their reuse</H3>
 <P>
 A resource grammar is a grammar built on linguistic grounds,
 to describe a language rather than a domain.
-The GF resource grammar library contains resource grammars for
+The GF resource grammar library, which contains resource grammars for
 10 languages, is described more closely in the following
 documents:
 </P>
 <UL>
-<LI><A HREF="../../lib/resource/doc/gf-resource.html">Resource library API documentation</A>:
+<LI><A HREF="../../lib/resource-1.0/doc/">Resource library API documentation</A>:
   for application grammarians using the resource.
 <LI><A HREF="../../lib/resource-1.0/doc/Resource-HOWTO.html">Resource writing HOWTO</A>:
   for resource grammarians developing the resource.
@@ -2177,21 +2255,41 @@ documents:
 However, to give a flavour of both using and writing resource grammars,
 we have created a miniature resource, which resides in the
 subdirectory <A HREF="resource"><CODE>resource</CODE></A>. Its API consists of the following
-modules:
+three modules:
 </P>
-<UL>
-<LI><A HREF="resource/Syntax.gf">Syntax</A>: syntactic structures, language-independent
-<LI><A HREF="resource/LexEng.gf">LexEng</A>: lexical paradigms, English
-<LI><A HREF="resource/LexIta.gf">LexIta</A>: lexical paradigms, Italian
-</UL>
-
+<P>
+<A HREF="resource/Syntax.gf">Syntax</A> - syntactic structures, language-independent:
+</P>
+<PRE>
+  
+</PRE>
+<P>
+<A HREF="resource/LexEng.gf">LexEng</A> - lexical paradigms, English:
+</P>
+<PRE>
+  
+</PRE>
+<P>
+<A HREF="resource/LexIta.gf">LexIta</A> - lexical paradigms, Italian:
+</P>
+<PRE>
+  
+</PRE>
+<P></P>
 <P>
 Only these three modules should be <CODE>open</CODE>ed in applications.
 The implementations of the resource are given in the following four modules:
 </P>
+<P>
+<A HREF="resource/MorphoEng.gf">MorphoEng</A>,
+</P>
+<PRE>
+  
+</PRE>
+<P>
+<A HREF="resource/MorphoIta.gf">MorphoIta</A>: low-level morphology
+</P>
 <UL>
-<LI><A HREF="resource/MorphoEng.gf">MorphoEng</A>,
-  <A HREF="resource/MorphoIta.gf">MorphoIta</A>: low-level morphology
 <LI><A HREF="resource/SyntaxEng.gf">SyntaxEng</A>.
   <A HREF="resource/SyntaxIta.gf">SyntaxIta</A>: definitions of syntactic structures
 </UL>
@@ -2210,19 +2308,181 @@ The rest of the modules (black) come from the resource.
 <P>
 <IMG ALIGN="middle" SRC="Multi.png" BORDER="0" ALT="">
 </P>
-<A NAME="toc68"></A>
-<H3>Restricted inheritance and qualified opening</H3>
-<A NAME="toc69"></A>
-<H2>More concepts of abstract syntax</H2>
-<A NAME="toc70"></A>
-<H3>Dependent types</H3>
-<A NAME="toc71"></A>
-<H3>Higher-order abstract syntax</H3>
-<A NAME="toc72"></A>
-<H3>Semantic definitions</H3>
-<A NAME="toc73"></A>
-<H3>List categories</H3>
 <A NAME="toc74"></A>
+<H3>Restricted inheritance and qualified opening</H3>
+<A NAME="toc75"></A>
+<H2>Using the standard resource library</H2>
+<P>
+The example files of this chapter can be found in
+the directory <A HREF="./arithm"><CODE>arithm</CODE></A>.
+</P>
+<A NAME="toc76"></A>
+<H3>The simplest way</H3>
+<P>
+The simplest way is to <CODE>open</CODE> a top-level <CODE>Lang</CODE> module
+and a <CODE>Paradigms</CODE> module: 
+</P>
+<PRE>
+    abstract Foo = ...
+  
+    concrete FooEng = open LangEng, ParadigmsEng in ...
+    concrete FooSwe = open LangSwe, ParadigmsSwe in ...
+</PRE>
+<P>
+Here is an example.
+</P>
+<PRE>
+  abstract Arithm = {
+    cat
+      Prop ;
+      Nat ;
+    fun
+      Zero : Nat ;
+      Succ : Nat -&gt; Nat ;
+      Even : Nat -&gt; Prop ;
+      And  : Prop -&gt; Prop -&gt; Prop ;
+  }
+  
+  --# -path=.:alltenses:prelude
+  
+  concrete ArithmEng of Arithm = open LangEng, ParadigmsEng in {
+    lincat
+      Prop = S ;
+      Nat  = NP ;
+    lin
+      Zero = 
+        UsePN (regPN "zero" nonhuman) ;
+      Succ n = 
+        DetCN (DetSg (SgQuant DefArt) NoOrd) (ComplN2 (regN2 "successor") n) ;
+      Even n = 
+        UseCl TPres ASimul PPos 
+          (PredVP n (UseComp (CompAP (PositA (regA "even"))))) ;
+      And x y = 
+        ConjS and_Conj (BaseS x y) ;
+  
+  }
+  
+  --# -path=.:alltenses:prelude
+  
+  concrete ArithmSwe of Arithm = open LangSwe, ParadigmsSwe in {
+    lincat
+      Prop = S ;
+      Nat  = NP ;
+    lin
+      Zero = 
+        UsePN (regPN "noll" neutrum) ;
+      Succ n = 
+        DetCN (DetSg (SgQuant DefArt) NoOrd) 
+          (ComplN2 (mkN2 (mk2N "efterföljare" "efterföljare") 
+             (mkPreposition "till")) n) ;
+      Even n = 
+        UseCl TPres ASimul PPos 
+          (PredVP n (UseComp (CompAP (PositA (regA "jämn"))))) ;
+      And x y = 
+        ConjS and_Conj (BaseS x y) ;
+  }
+</PRE>
+<P></P>
+<A NAME="toc77"></A>
+<H3>How to find resource functions</H3>
+<P>
+The definitions in this example were found by parsing:
+</P>
+<PRE>
+    &gt; i LangEng.gf
+  
+    -- for Successor:
+    &gt; p -cat=NP -mcfg -parser=topdown "the mother of Paris"
+  
+    -- for Even:
+    &gt; p -cat=S -mcfg -parser=topdown "Paris is old"
+  
+    -- for And:
+    &gt; p -cat=S -mcfg -parser=topdown "Paris is old and I am old"
+</PRE>
+<P>
+The use of parsing can be systematized by <B>example-based grammar writing</B>,
+to which we will return later. 
+</P>
+<A NAME="toc78"></A>
+<H3>A functor implementation</H3>
+<P>
+The interesting thing now is that the
+code in <CODE>ArithmSwe</CODE> is similar to the code in <CODE>ArithmEng</CODE>, except for
+some lexical items ("noll" vs. "zero", "efterföljare" vs. "successor",
+"jämn" vs. "even").  How can we exploit the similarities and
+actually share code between the languages?
+</P>
+<P>
+The solution is to use a functor: an <CODE>incomplete</CODE> module that opens
+an <CODE>abstract</CODE> as an <CODE>interface</CODE>, and then instantiate it to different
+languages that implement the interface. The structure is as follows:
+</P>
+<PRE>
+    abstract Foo ...
+  
+    incomplete concrete FooI = open Lang, Lex in ...
+  
+    concrete FooEng of Foo = FooI with (Lang=LangEng), (Lex=LexEng) ;
+    concrete FooSwe of Foo = FooI with (Lang=LangSwe), (Lex=LexSwe) ;
+</PRE>
+<P>
+where <CODE>Lex</CODE> is an abstract lexicon that includes the vocabulary
+specific to this application:
+</P>
+<PRE>
+    abstract Lex = Cat ** ...
+  
+    concrete LexEng of Lex = CatEng ** open ParadigmsEng in ...
+    concrete LexSwe of Lex = CatSwe ** open ParadigmsSwe in ...  
+</PRE>
+<P>
+Here, again, a complete example (<CODE>abstract Arithm</CODE> is as above):
+</P>
+<PRE>
+  incomplete concrete ArithmI of Arithm = open Lang, Lex in {
+    lincat
+      Prop = S ;
+      Nat  = NP ;
+    lin
+      Zero = 
+        UsePN zero_PN ;
+      Succ n = 
+        DetCN (DetSg (SgQuant DefArt) NoOrd) (ComplN2 successor_N2 n) ;
+      Even n = 
+        UseCl TPres ASimul PPos 
+          (PredVP n (UseComp (CompAP (PositA even_A)))) ;
+      And x y = 
+        ConjS and_Conj (BaseS x y) ;
+  }
+  
+  --# -path=.:alltenses:prelude
+  concrete ArithmEng of Arithm = ArithmI with
+    (Lang = LangEng),
+    (Lex = LexEng) ;
+  
+  --# -path=.:alltenses:prelude
+  concrete ArithmSwe of Arithm = ArithmI with
+    (Lang = LangSwe),
+    (Lex = LexSwe) ;
+  
+  abstract Lex = Cat ** {
+    fun
+      zero_PN : PN ;
+      successor_N2 : N2 ;  
+      even_A : A ;
+  }
+  
+  concrete LexSwe of Lex = CatSwe ** open ParadigmsSwe in {
+    lin 
+      zero_PN = regPN "noll" neutrum ;
+      successor_N2 = 
+        mkN2 (mk2N "efterföljare" "efterföljare") (mkPreposition "till") ;
+      even_A = regA "jämn" ;
+  }
+</PRE>
+<P></P>
+<A NAME="toc79"></A>
 <H2>Transfer modules</H2>
 <P>
 Transfer means noncompositional tree-transforming operations.
@@ -2241,9 +2501,9 @@ See the
 <A HREF="../transfer.html">transfer language documentation</A>
 for more information.
 </P>
-<A NAME="toc75"></A>
+<A NAME="toc80"></A>
 <H2>Practical issues</H2>
-<A NAME="toc76"></A>
+<A NAME="toc81"></A>
 <H3>Lexers and unlexers</H3>
 <P>
 Lexers and unlexers can be chosen from
@@ -2279,7 +2539,7 @@ Given by <CODE>help -lexer</CODE>, <CODE>help -unlexer</CODE>:
   
 </PRE>
 <P></P>
-<A NAME="toc77"></A>
+<A NAME="toc82"></A>
 <H3>Efficiency of grammars</H3>
 <P>
 Issues:
@@ -2290,7 +2550,7 @@ Issues:
 <LI>parsing efficiency: <CODE>-mcfg</CODE> vs. others
 </UL>
 
-<A NAME="toc78"></A>
+<A NAME="toc83"></A>
 <H3>Speech input and output</H3>
 <P>
 The<CODE>speak_aloud = sa</CODE> command sends a string to the speech
@@ -2320,7 +2580,7 @@ The method words only for grammars of English.
 Both Flite and ATK are freely available through the links
 above, but they are not distributed together with GF.
 </P>
-<A NAME="toc79"></A>
+<A NAME="toc84"></A>
 <H3>Multilingual syntax editor</H3>
 <P>
 The 
@@ -2337,12 +2597,12 @@ Here is a snapshot of the editor:
 The grammars of the snapshot are from the
 <A HREF="http://www.cs.chalmers.se/~aarne/GF/examples/letter">Letter grammar package</A>.
 </P>
-<A NAME="toc80"></A>
+<A NAME="toc85"></A>
 <H3>Interactive Development Environment (IDE)</H3>
 <P>
 Forthcoming.
 </P>
-<A NAME="toc81"></A>
+<A NAME="toc86"></A>
 <H3>Communicating with GF</H3>
 <P>
 Other processes can communicate with the GF command interpreter,
@@ -2359,7 +2619,7 @@ Thus the most silent way to invoke GF is
 </PRE>
 </UL>
 
-<A NAME="toc82"></A>
+<A NAME="toc87"></A>
 <H3>Embedded grammars in Haskell, Java, and Prolog</H3>
 <P>
 GF grammars can be used as parts of programs written in the
@@ -2371,15 +2631,15 @@ following languages. The links give more documentation.
 <LI><A HREF="http://www.cs.chalmers.se/~peb/software.html">Prolog</A>
 </UL>
 
-<A NAME="toc83"></A>
+<A NAME="toc88"></A>
 <H3>Alternative input and output grammar formats</H3>
 <P>
 A summary is given in the following chart of GF grammar compiler phases:
 <IMG ALIGN="middle" SRC="../gf-compiler.png" BORDER="0" ALT="">
 </P>
-<A NAME="toc84"></A>
+<A NAME="toc89"></A>
 <H2>Case studies</H2>
-<A NAME="toc85"></A>
+<A NAME="toc90"></A>
 <H3>Interfacing formal and natural languages</H3>
 <P>
 <A HREF="http://www.cs.chalmers.se/~krijo/thesis/thesisA4.pdf">Formal and Informal Software Specifications</A>,
@@ -2392,6 +2652,6 @@ English and German.
 A simpler example will be explained here.
 </P>
 
-<!-- html code generated by txt2tags 2.0 (http://txt2tags.sf.net) -->
+<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
 <!-- cmdline: txt2tags -\-toc gf-tutorial2.txt -->
 </BODY></HTML>