author aarne <aarne@cs.chalmers.se> 2007-12-21 15:10:38 +0000
committer aarne <aarne@cs.chalmers.se> 2007-12-21 15:10:38 +0000
commit 5ee1714fd23e974d1cf2511fa398b6ce310a9807 (patch)
tree 7a82f85d4f4681086430fdefd7903e4a26015c3f
parent c5017f28aad7702838b9861aa3f6cbf7b3bacca5 (diff)
new tutorial and reference manual
-rw-r--r-- doc/10lang-small.png bin 0 -> 66840 bytes
-rw-r--r-- doc/German.png bin 0 -> 21000 bytes
-rw-r--r-- doc/Syntax.png bin 0 -> 14804 bytes
-rw-r--r-- doc/categories.png bin 0 -> 4241 bytes
-rw-r--r-- doc/food-translet.png bin 0 -> 22916 bytes
-rw-r--r-- doc/food1.png bin 0 -> 22805 bytes
-rw-r--r-- doc/food2.png bin 0 -> 31506 bytes
-rw-r--r-- doc/foodmarket.png bin 0 -> 2099 bytes
-rw-r--r-- doc/gf-refman.html 4545
-rw-r--r-- doc/gf-tutorial.html 7952
-rw-r--r-- doc/mytree.png bin 0 -> 2230 bytes
11 files changed, 12497 insertions, 0 deletions
diff --git a/doc/10lang-small.png b/doc/10lang-small.png
new file mode 100644
index 000000000..49a3d0a98
--- /dev/null
+++ b/doc/10lang-small.png
Binary files differ
diff --git a/doc/German.png b/doc/German.png
new file mode 100644
index 000000000..7c6303897
--- /dev/null
+++ b/doc/German.png
Binary files differ
diff --git a/doc/Syntax.png b/doc/Syntax.png
new file mode 100644
index 000000000..1cc8161b1
--- /dev/null
+++ b/doc/Syntax.png
Binary files differ
diff --git a/doc/categories.png b/doc/categories.png
new file mode 100644
index 000000000..afc5873c5
--- /dev/null
+++ b/doc/categories.png
Binary files differ
diff --git a/doc/food-translet.png b/doc/food-translet.png
new file mode 100644
index 000000000..dd622a4bf
--- /dev/null
+++ b/doc/food-translet.png
Binary files differ
diff --git a/doc/food1.png b/doc/food1.png
new file mode 100644
index 000000000..767069dab
--- /dev/null
+++ b/doc/food1.png
Binary files differ
diff --git a/doc/food2.png b/doc/food2.png
new file mode 100644
index 000000000..b36a01b22
--- /dev/null
+++ b/doc/food2.png
Binary files differ
diff --git a/doc/foodmarket.png b/doc/foodmarket.png
new file mode 100644
index 000000000..6b0e3fbd7
--- /dev/null
+++ b/doc/foodmarket.png
Binary files differ
diff --git a/doc/gf-refman.html b/doc/gf-refman.html
new file mode 100644
index 000000000..b84079ecf
--- /dev/null
+++ b/doc/gf-refman.html
@@ -0,0 +1,4545 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
+<HTML>
+<HEAD>
+<META NAME="generator" CONTENT="http://txt2tags.sf.net">
+<TITLE>GF Language Reference Manual</TITLE>
+</HEAD><BODY BGCOLOR="white" TEXT="black">
+<P ALIGN="center"><CENTER><H1>GF Language Reference Manual</H1>
+<FONT SIZE="4">
+<I>Aarne Ranta</I><BR>
+</FONT></CENTER>
+
+<P></P>
+<HR NOSHADE SIZE=1>
+<P></P>
+ <UL>
+ <LI><A HREF="#toc1">Overview of GF</A>
+ <LI><A HREF="#toc2">The module system</A>
+ <UL>
+ <LI><A HREF="#toc3">Top-level and supplementary module structure</A>
+ <LI><A HREF="#toc4">Compilation units</A>
+ <LI><A HREF="#toc5">Names</A>
+ <LI><A HREF="#toc6">The structure of a module</A>
+ <LI><A HREF="#toc7">Module types, headers, and bodies</A>
+ <LI><A HREF="#toc8">Digression: the logic of module types</A>
+ <LI><A HREF="#toc9">Inheritance</A>
+ <LI><A HREF="#toc10">Opening</A>
+ <LI><A HREF="#toc11">Name resolution</A>
+ <LI><A HREF="#toc12">Functor instantiations</A>
+ <LI><A HREF="#toc13">Completeness</A>
+ </UL>
+ <LI><A HREF="#toc14">Judgements</A>
+ <UL>
+ <LI><A HREF="#toc15">Overview of the forms of judgement</A>
+ <LI><A HREF="#toc16">Category declarations, cat</A>
+ <LI><A HREF="#toc17">Hypotheses and contexts</A>
+ <LI><A HREF="#toc18">Function declarations, fun</A>
+ <LI><A HREF="#toc19">Function definitions, def</A>
+ <LI><A HREF="#toc20">Data constructor definitions, data</A>
+ <LI><A HREF="#toc21">The semantic status of an abstract syntax function</A>
+ <LI><A HREF="#toc22">Linearization type definitions, lincat</A>
+ <LI><A HREF="#toc23">Linearization definitions, lin</A>
+ <LI><A HREF="#toc24">Linearization default definitions, lindef</A>
+ <LI><A HREF="#toc25">Printname definitions, printname cat and printname fun</A>
+ <LI><A HREF="#toc26">Parameter type definitions, param</A>
+ <LI><A HREF="#toc27">Parameter values</A>
+ <LI><A HREF="#toc28">Operation definitions, oper</A>
+ <LI><A HREF="#toc29">Operation overloading</A>
+ <LI><A HREF="#toc30">Flag definitions, flags</A>
+ </UL>
+ <LI><A HREF="#toc31">Types and expressions</A>
+ <UL>
+ <LI><A HREF="#toc32">Overview of expression forms</A>
+ <LI><A HREF="#toc33">The functional fragment: expressions in abstract syntax</A>
+ <LI><A HREF="#toc34">Conversions</A>
+ <LI><A HREF="#toc35">Syntax trees</A>
+ <LI><A HREF="#toc36">Predefined types in abstract syntax</A>
+ <LI><A HREF="#toc37">Overview of expressions in concrete syntax</A>
+ <LI><A HREF="#toc38">Values, canonical forms, and run-time variables</A>
+ <LI><A HREF="#toc39">Token lists, tokens, and strings</A>
+ <LI><A HREF="#toc40">Records and record types</A>
+ <LI><A HREF="#toc41">Subtyping</A>
+ <LI><A HREF="#toc42">Tables and table types</A>
+ <LI><A HREF="#toc43">Pattern matching</A>
+ <LI><A HREF="#toc44">Free variation</A>
+ <LI><A HREF="#toc45">Local definitions</A>
+ <LI><A HREF="#toc46">Function applications in concrete syntax</A>
+ <LI><A HREF="#toc47">Reusing top-level grammars as resources</A>
+ <LI><A HREF="#toc48">Predefined concrete syntax types</A>
+ <LI><A HREF="#toc49">Predefined concrete syntax operations</A>
+ </UL>
+ <LI><A HREF="#toc50">Flags and pragmas</A>
+ <UL>
+ <LI><A HREF="#toc51">Some flags and their values</A>
+ <LI><A HREF="#toc52">Compiler pragmas</A>
+ </UL>
+ <LI><A HREF="#toc53">Alternative grammar input formats</A>
+ <UL>
+ <LI><A HREF="#toc54">Old GF without modules</A>
+ <LI><A HREF="#toc55">Context-free grammars</A>
+ <LI><A HREF="#toc56">Extended BNF grammars</A>
+ <LI><A HREF="#toc57">Example-based grammars</A>
+ </UL>
+ <LI><A HREF="#toc58">The grammar of GF</A>
+ <LI><A HREF="#toc59">The lexical structure of GF</A>
+ <UL>
+ <LI><A HREF="#toc60">Identifiers</A>
+ <LI><A HREF="#toc61">Literals</A>
+ <LI><A HREF="#toc62">Reserved words and symbols</A>
+ <LI><A HREF="#toc63">Comments</A>
+ </UL>
+ <LI><A HREF="#toc64">The syntactic structure of GF</A>
+ </UL>
+
+<P></P>
+<HR NOSHADE SIZE=1>
+<P></P>
+<P>
+
+</P>
+<P>
+This document is a reference manual to the GF programming language.
+GF, Grammatical Framework, is a special-purpose programming language,
+designed to support definitions of grammars.
+</P>
+<P>
+This document is not an introduction to GF; such an introduction can be
+found in the GF tutorial, available online on the GF web page,
+</P>
+<P>
+<A HREF="http://digitalgrammars.com/gf"><CODE>digitalgrammars.com/gf</CODE></A>
+</P>
+<P>
+This manual covers only the language, not the GF compiler or
+interactive system. We will however make some references to different
+compiler versions, if they involve changes of behaviour having to
+do with the language specification.
+</P>
+<P>
+This manual is meant to be fully compatible with GF version 3.0
+(forthcoming). Main discrepancies with version 2.8 are indicated,
+as well as with the reference article on GF,
+</P>
+<P>
+A. Ranta, "Grammatical Framework. A Type Theoretical Grammar Formalism",
+<I>The Journal of Functional Programming</I> 14(2), 2004, pp. 145-189.
+</P>
+<P>
+This article will be referred to as "the JFP article".
+</P>
+<P>
+As metalinguistic notation, we will use the symbols
+</P>
+<UL>
+<LI><I>a</I> === <I>b</I> to say that <I>a</I> is syntactic sugar for <I>b</I>
+<LI><I>a</I> ==> <I>b</I> to say that <I>a</I> is computed (or compiled) to <I>b</I>
+</UL>
+
+<A NAME="toc1"></A>
+<H2>Overview of GF</H2>
+<P>
+GF is a typed functional language,
+borrowing many of its constructs from ML and Haskell: algebraic datatypes,
+higher-order functions, pattern matching. The module system bears resemblance
+to ML (functors) but also to object-oriented languages (inheritance).
+The type theory used in the abstract syntax part of GF is inherited from
+logical frameworks, in particular ALF ("Another Logical Framework"; in a
+sense, GF is Yet Another ALF). From ALF comes also the use of dependent
+types, including the use of explicit type variables instead of
+Hindley-Milner polymorphism.
+</P>
+<P>
+The look and feel of GF is close to Java and
+C, due to the use of curly brackets and semicolons in structuring the code;
+the expression syntax, however, follows Haskell in using juxtaposition for
+function application and parentheses only for grouping.
+</P>
+<P>
+To understand the constructs of GF, and especially their limitations in comparison
+to general-purpose programming languages, it is essential to keep in mind that
+GF is a special-purpose and non-Turing-complete language. Every GF program is
+ultimately compiled to a <B>multilingual grammar</B>, which consists of an
+<B>abstract syntax</B> and a set of <B>concrete syntaxes</B>. The abstract syntax
+defines a system of <B>syntax trees</B>, and each concrete syntax defines a
+mapping from those syntax trees to <B>nested tuples</B> of strings and integers.
+This mapping is <B>compositional</B>, i.e. <B>homomorphic</B>, and moreover
+<B>reversible</B>: given a nested tuple, there exists an effective way of finding
+the set of syntax trees that map to this tuple. The procedure of applying the
+mapping to a tree to produce a tuple is called <B>linearization</B>, and the
+reverse search procedure is called <B>parsing</B>. It is ultimately the requirement
+of reversibility that restricts GF to be less than Turing-complete. This is
+reflected in restrictions to recursion in concrete syntax. Tree formation in
+abstract syntax, however, is fully recursive.
+</P>
+<P>
+Even though run-time GF grammars manipulate just nested tuples, at compile
+time these are represented by the more fine-grained labelled records
+and finite functions over algebraic datatypes. This enables the programmer
+to write on a higher abstraction level, and also adds type distinctions
+and hence raises the level of checking of programs.
+</P>
+<A NAME="toc2"></A>
+<H2>The module system</H2>
+<A NAME="toc3"></A>
+<H3>Top-level and supplementary module structure</H3>
+<P>
+The big picture of GF as a programming language for multilingual grammars
+explains its principal module structure. Any GF grammar must have an
+abstract syntax module; it can in addition have any number of concrete
+syntax modules matching that abstract syntax. Before going to details,
+we give a simple example: a module defining the <B>category</B> <CODE>A</CODE>
+of adjectives and one adjective-forming <B>function</B>, the zero-place function
+<CODE>Even</CODE>. We give the module the name <CODE>Adj</CODE>. The GF code for the
+module looks as follows:
+</P>
+<PRE>
+ abstract Adj = {
+ cat A ;
+ fun Even : A ;
+ }
+</PRE>
+<P>
+Here are two concrete syntax modules, one intended for mapping the trees
+to English, the other to Swedish. The mapping is defined by
+<CODE>lincat</CODE> definitions assigning a <B>linearization type</B> to each category,
+and <CODE>lin</CODE> definitions assigning a <B>linearization</B> to each function.
+</P>
+<PRE>
+ concrete AdjEng of Adj = {
+ lincat A = {s : Str} ;
+ lin Even = {s = "even"} ;
+ }
+
+ concrete AdjSwe of Adj = {
+ lincat A = {s : AForm =&gt; Str} ;
+ lin Even = {s = table {
+ ASg Utr =&gt; "jämn" ;
+ ASg Neutr =&gt; "jämnt" ;
+ APl =&gt; "jämna"
+ }
+ } ;
+ param AForm = ASg Gender | APl ;
+ param Gender = Utr | Neutr ;
+ }
+</PRE>
+<P>
+These examples illustrate the main ideas of multilingual grammars:
+</P>
+<UL>
+<LI>the concrete syntax must match the abstract syntax:
+ <UL>
+ <LI>every <CODE>cat</CODE> is given a <CODE>lincat</CODE>
+ <LI>every <CODE>fun</CODE> is given a <CODE>lin</CODE>
+ </UL>
+</UL>
+
+<UL>
+<LI>the concrete syntax is internally coherent:
+ <UL>
+ <LI>the <CODE>lin</CODE> rules respect the types defined by <CODE>lincat</CODE> rules
+ </UL>
+</UL>
+
+<UL>
+<LI>concrete syntaxes are independent of each other
+ <UL>
+ <LI>they can use different <CODE>lincat</CODE> and <CODE>lin</CODE> definitions
+ <LI>they can define their own <B>parameter types</B> (<CODE>param</CODE>)
+ </UL>
+</UL>
+
+<P>
+The first two ideas form the core of the <B>static checking</B> of GF
+grammars, eliminating the possibility of run-time errors in
+linearization and parsing. The third idea gives GF the expressive
+power needed to map abstract syntax to vastly different languages.
+</P>
+<P>
+Abstract and concrete modules are called <B>top-level grammar modules</B>,
+since they are the ones that remain in grammar systems at run time.
+However, in order to support <B>modular grammar engineering</B>, GF provides
+much more module structure than strictly required in top-level grammars.
+</P>
+<P>
+<B>Inheritance</B>, also known as <B>extension</B>, means that a module can inherit the
+contents of one or more other modules to which new judgements are added,
+e.g.
+</P>
+<PRE>
+ abstract MoreAdj = Adj ** {
+ fun Odd : A ;
+ }
+</PRE>
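+<P>
+As a sketch of how inheritance carries over to concrete syntax, an English
+concrete syntax of <CODE>MoreAdj</CODE> could inherit <CODE>AdjEng</CODE> and add only the
+missing linearization. The module name <CODE>MoreAdjEng</CODE> and the string
+<CODE>"odd"</CODE> are assumptions made for illustration:
+</P>
+<PRE>
+    concrete MoreAdjEng of MoreAdj = AdjEng ** {
+      lin Odd = {s = "odd"} ;
+    }
+</PRE>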
+<P>
+<B>Resource modules</B> define parameter types and <B>operations</B> usable
+in several concrete syntaxes,
+</P>
+<PRE>
+ resource MorphoFre = {
+ param Number = Sg | Pl ;
+ param Gender = Masc | Fem ;
+ oper regA : Str -&gt; {s : Gender =&gt; Number =&gt; Str} =
+ \fin -&gt; {
+ s = table {
+ Masc =&gt; table {Sg =&gt; fin ; Pl =&gt; fin + "s"} ;
+ Fem =&gt; table {Sg =&gt; fin + "e" ; Pl =&gt; fin + "es"}
+ }
+ } ;
+ }
+</PRE>
+<P>
+By <B>opening</B>, a module can use the contents of a resource module
+without inheriting them, e.g.
+</P>
+<PRE>
+ concrete AdjFre of Adj = open MorphoFre in {
+ lincat A = {s : Gender =&gt; Number =&gt; Str} ;
+ lin Even = regA "pair" ;
+ }
+</PRE>
+<P>
+<B>Interfaces</B> and <B>instances</B> separate the contents of a resource module
+to type signatures and definitions, in a way analogous to abstract vs. concrete
+modules, e.g.
+</P>
+<PRE>
+ interface Lexicon = {
+ oper Adjective : Type ;
+ oper even_A : Adjective ;
+ }
+
+ instance LexiconEng of Lexicon = {
+ oper Adjective = {s : Str} ;
+ oper even_A = {s = "even"} ;
+ }
+</PRE>
+<P>
+<B>Functors</B>, also known as <B>parametrized modules</B> or <B>incomplete modules</B>, define
+a concrete syntax in terms of an interface, e.g.
+</P>
+<PRE>
+ incomplete concrete AdjI of Adj = open Lexicon in {
+ lincat A = Adjective ;
+ lin Even = even_A ;
+ }
+</PRE>
+<P>
+A functor can be <B>instantiated</B> by providing instances of its open interfaces.
+</P>
+<PRE>
+ concrete AdjEng of Adj = AdjI with (Lexicon = LexiconEng) ;
+</PRE>
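+<P>
+The same functor could presumably be reused for another language by supplying
+a different instance. The following sketch assumes the <CODE>MorphoFre</CODE> resource
+shown above; the instance name <CODE>LexiconFre</CODE> is hypothetical, and the resulting
+<CODE>AdjFre</CODE> is an alternative to the directly written <CODE>AdjFre</CODE> module above:
+</P>
+<PRE>
+    instance LexiconFre of Lexicon = open MorphoFre in {
+      oper Adjective = {s : Gender =&gt; Number =&gt; Str} ;
+      oper even_A = regA "pair" ;
+    }
+
+    concrete AdjFre of Adj = AdjI with (Lexicon = LexiconFre) ;
+</PRE>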
+<P></P>
+<A NAME="toc4"></A>
+<H3>Compilation units</H3>
+<P>
+The compilation unit of GF source code is a file that contains a module.
+Judgements outside modules are supported only for backward compatibility,
+as explained <a href="#oldgf">here</a>.
+Every source file, suffixed <CODE>.gf</CODE>, is compiled to a "GF object file",
+suffixed <CODE>.gfo</CODE> (as of GF Version 3.0 and later). For runtime grammar objects
+used for parsing and linearization, a set of <CODE>.gfo</CODE> files is linked to
+a single file suffixed <CODE>.gfcc</CODE>. While <CODE>.gf</CODE> and <CODE>.gfo</CODE> files may contain
+modules of any kind, a <CODE>.gfcc</CODE> file always contains a multilingual grammar
+with one abstract and a set of concrete syntaxes.
+</P>
+<P>
+The following diagram summarizes the files involved in the compilation process.
+<center>
+<CODE>module1.gf module2.gf ... modulen.gf</CODE>
+</P>
+<P>
+==>
+</P>
+<P>
+<CODE>module1.gfo module2.gfo ... modulen.gfo</CODE>
+</P>
+<P>
+==>
+</P>
+<P>
+grammar.gfcc
+</center>
+Both <CODE>.gf</CODE> and <CODE>.gfo</CODE> files are written in the GF source language;
+<CODE>.gfcc</CODE> files are written in a lower-level format. The process of translating
+<CODE>.gf</CODE> to <CODE>.gfo</CODE> consists of <B>name resolution</B>, <B>type annotation</B>,
+<B>partial evaluation</B>, and <B>optimization</B>.
+There is a great advantage in the possibility to do this
+separately for GF modules and saving the result in <CODE>.gfo</CODE> files. The partial
+evaluation phase, in particular, is time and memory consuming, and GF libraries
+are therefore distributed in <CODE>.gfo</CODE> to make their use less arduous.
+</P>
+<P>
+<I>In GF before version 3.0, the object files are in a format called <CODE>.gfc</CODE>,</I>
+<I>and the multilingual runtime grammar is in a format called <CODE>.gfcm</CODE>.</I>
+</P>
+<P>
+The standard compiler has a built-in <B>make facility</B>, which finds out what
+other modules are needed when compiling an explicitly given module.
+This facility builds a dependency graph and decides which of the involved
+modules need recompilation (from <CODE>.gf</CODE> to <CODE>.gfo</CODE>), and for which the
+GF object can be used directly.
+</P>
+<A NAME="toc5"></A>
+<H3>Names</H3>
+<P>
+Each module <I>M</I> defines a set of <B>names</B>, which are visible in <I>M</I>
+itself, in all modules extending <I>M</I> (unless excluded, as explained
+<a href="#restrictedinheritance">here</a>), and
+all modules opening <I>M</I>. These names can stand for abstract syntax
+categories and functions, parameter types and parameter constructors,
+and operations. All these names live in the same <B>name space</B>, which
+means that a name entering a module more than once due to inheritance or
+opening can lead to a <B>conflict</B>. It is specified
+<a href="#renaming">here</a> how these
+conflicts are resolved.
+</P>
+<P>
+The names of modules live in a name space separate from the other names.
+Even here, all names must be distinct in a set of files compiled to a
+multilingual grammar. In particular, even files residing in different directories
+must have different names, since GF has no notion of hierarchic
+module names.
+</P>
+<P>
+Lexically, names belong to the class of <B>identifiers</B>. An identifier is
+a letter followed by any number of letters, digits, underscores (<CODE>_</CODE>) and
+primes (<CODE>'</CODE>). Upper- and lower-case letters are treated as distinct.
+Nothing dictates the choice of upper or lower-case initials, but
+the standard libraries follow conventions similar to Haskell:
+</P>
+<UL>
+<LI>upper case is used for modules, abstract syntax categories and functions,
+ parameter types and constructors, and type synonyms
+<LI>lower case is used for non-type-valued operations and for variables
+</UL>
+
+<P>
+<a name="identifiers"></a>
+</P>
+<P>
+"Letters" as mentioned in the identifier syntax include all 7-bit ASCII
+letters. Iso-latin-1 and Unicode letters are supported in varying degrees
+by different tools and platforms, and are hence not recommended in identifiers.
+</P>
+<A NAME="toc6"></A>
+<H3>The structure of a module</H3>
+<P>
+Modules of all types have the following structure:
+<center>
+<I>moduletype</I> <I>name</I> <CODE>=</CODE> <I>extends</I> <I>opens</I> <I>body</I>
+</center>
+The part of the module preceding the body is its <B>header</B>. The header
+defines the type of the module and tells what other modules it inherits
+and opens. The body consists of the judgements that introduce all the new
+names defined by the module.
+</P>
+<P>
+Any of the parts <I>extends</I>, <I>opens</I>, and <I>body</I> may be empty.
+If they are all filled, delimiters and keywords separate the parts in the
+following way:
+<center>
+<I>moduletype</I> <I>name</I> <CODE>=</CODE>
+ <I>extends</I> <CODE>**</CODE> <CODE>open</CODE> <I>opens</I> <CODE>in</CODE> <CODE>{</CODE> <I>body</I> <CODE>}</CODE>
+</center>
+The part <I>moduletype</I> <I>name</I> looks slightly different if the
+type is <CODE>concrete</CODE> or <CODE>instance</CODE>: the <I>name</I> intrudes between
+the type keyword and the name of the module being implemented and which
+really belongs to the type of the module:
+<center>
+ <CODE>concrete</CODE> <I>name</I> <CODE>of</CODE> <I>abstractname</I>
+</center>
+The only exception to this schema of module syntax
+is functor instantiations: the instantiation
+list is given in a special way between <I>extends</I> and <I>opens</I>:
+<center>
+<CODE>incomplete concrete</CODE> <I>name</I> <CODE>of</CODE> <I>abstractname</I> <CODE>=</CODE>
+ <I>extends</I> <CODE>**</CODE> <I>functorname</I> <CODE>with</CODE> <I>instantiations</I> <CODE>**</CODE>
+ <CODE>open</CODE> <I>opens</I> <CODE>in</CODE> <CODE>{</CODE> <I>body</I> <CODE>}</CODE>
+</center>
+Logically, the part "<I>functorname</I> <CODE>with</CODE> <I>instantiations</I>" should
+really be one of the <I>extends</I>. This is also shown by the fact that
+it can have restricted inheritance (concept defined <a href="#restrictedinheritance">here</a>).
+</P>
+<A NAME="toc7"></A>
+<H3>Module types, headers, and bodies</H3>
+<P>
+The <I>extends</I> and <I>opens</I> parts of a module header are lists of
+module names (with possible qualifications, as defined below <a href="#qualifiednames">here</a>).
+The first step of type checking a module consists of verifying that
+these names stand for modules of appropriate module types. As a rule
+of thumb,
+</P>
+<UL>
+<LI>the <I>extends</I> of a module must have the same <I>moduletype</I>
+<LI>the <I>opens</I> of a module must be of type <CODE>resource</CODE>
+</UL>
+
+<P>
+However, the precise rules are a little more fine-grained, because
+of the presence of interfaces and their instances, and the possibility
+to reuse abstract and concrete modules as resources. The following table
+gives, for all module types, the possible module types of their <I>extends</I>
+and <I>opens</I>, as well as the forms of judgement legal in that module type.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>module type</TH>
+<TH>extends</TH>
+<TH>opens</TH>
+<TH COLSPAN="2">body</TH>
+</TR>
+<TR>
+<TD><CODE>abstract</CODE></TD>
+<TD>abstract</TD>
+<TD>-</TD>
+<TD><CODE>cat, fun, def, data</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>concrete of</CODE> <I>abstract</I></TD>
+<TD>concrete</TD>
+<TD>resource*</TD>
+<TD><CODE>lincat, lin, oper, param</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>resource</CODE></TD>
+<TD>resource*</TD>
+<TD>resource*</TD>
+<TD><CODE>oper, param</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>interface</CODE></TD>
+<TD>resource+</TD>
+<TD>resource*</TD>
+<TD><CODE>oper, param</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>instance of</CODE> <I>interface</I></TD>
+<TD>resource*</TD>
+<TD>resource*</TD>
+<TD><CODE>oper, param</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>incomplete</CODE> concrete</TD>
+<TD>concrete+</TD>
+<TD>resource+</TD>
+<TD><CODE>lincat, lin, oper, param</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+The table uses the following shorthands for lists of module types:
+</P>
+<UL>
+<LI>resource*: resource, instance, concrete
+<LI>resource+: resource*, interface, abstract
+<LI>concrete+: concrete, incomplete concrete
+</UL>
+
+<P>
+The legality of judgements in the body is checked before the judgements
+themselves are checked.
+</P>
+<P>
+The forms of judgement are explained <a href="#judgementforms">here</a>.
+</P>
+<A NAME="toc8"></A>
+<H3>Digression: the logic of module types</H3>
+<P>
+Why are the legality conditions of opens and extends so complicated? The best way
+to grasp them is probably to consider a simplified logical model of the module
+system, replacing modules by types and functions. This model could actually
+be developed towards treating modules in GF as first-class objects; so far,
+however, this step has not been motivated by any practical needs.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>module</TH>
+<TH COLSPAN="2">object and type</TH>
+</TR>
+<TR>
+<TD>abstract A = B</TD>
+<TD>A = B : type</TD>
+</TR>
+<TR>
+<TD>concrete C of A = B</TD>
+<TD>C = B : A -&gt; S</TD>
+</TR>
+<TR>
+<TD>interface I = B</TD>
+<TD>I = B : type</TD>
+</TR>
+<TR>
+<TD>instance J of I = B</TD>
+<TD>J = B : I</TD>
+</TR>
+<TR>
+<TD>incomplete concrete C of A = open I in B</TD>
+<TD>C = B : I -&gt; A -&gt; S</TD>
+</TR>
+<TR>
+<TD>concrete K of A = C with (I=J)</TD>
+<TD>K = B(J) : A -&gt; S</TD>
+</TR>
+<TR>
+<TD>resource R = B</TD>
+<TD>R = B : I</TD>
+</TR>
+<TR>
+<TD>concrete C of A = open R in B</TD>
+<TD>C = B(R) : A -&gt; S</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+A further step of defining modules as first-class objects would use
+GADTs and record types:
+</P>
+<UL>
+<LI>an abstract syntax is a Generalized Algebraic Datatype (GADT)
+<LI>the target type <CODE>S</CODE> of concrete syntax is the type of nested
+ tuples over strings and integers
+<LI>an interface is a labelled record type
+<LI>an instance is a record of the type defined by the interface
+<LI>a functor, with a module body opening an interface, is a function
+ on its instances
+<LI>the instantiation of a functor is an application of the function to
+ some instance
+<LI>a resource is a typed labelled record, putting together an interface and
+ an instance of it
+<LI>the body of a module opening a resource is a function on the interface
+ implicit in the resource; this function is immediately applied to the instance
+ defined in the resource
+</UL>
+
+<P>
+Slightly unexpectedly, interfaces and instances are easier to understand
+in this way than resources - a resource is, indeed, more complex, since
+it fuses together an interface and an instance.
+</P>
+<P>
+<a name="openabstract"></a>
+</P>
+<P>
+When an abstract is used as an interface and a concrete as its instance, they
+are actually reinterpreted so that they match the model. Then the abstract is
+no longer a GADT, but a system of <I>abstract</I> datatypes, with a record field
+of type <CODE>Type</CODE> for each category, and a function among these types for each
+abstract syntax function. A concrete syntax instantiates this record with
+linearization types and linearizations.
+</P>
+<A NAME="toc9"></A>
+<H3>Inheritance</H3>
+<P>
+After checking that the <I>extends</I> of a module are of appropriate
+module types, the compiler adds the inherited judgements to the
+judgements included in the body. The inherited judgements are
+not copied in full: only their names are added, with links to the module they are inherited from.
+Conflicts may arise in this process: a name can have two definitions in the combined
+pool of inherited and added judgements. Such a conflict is always an
+error: GF provides no way to redefine an inherited constant.
+</P>
+<P>
+Simple as the definition of a conflict may sound, it has to take care of the
+inheritance hierarchy. A very common pattern of inheritance is the
+<B>diamond</B>: inheritance from two modules which themselves inherit a common
+base module. Assume that the base module defines a name <CODE>f</CODE>:
+</P>
+<PRE>
+ N
+ / \
+ M1 M2
+ \ /
+ Base {f}
+</PRE>
+<P>
+Now, <CODE>N</CODE> inherits <CODE>f</CODE> from both <CODE>M1</CODE> and <CODE>M2</CODE>, so is there a
+conflict? The answer in GF is <I>no</I>, because the "two" <CODE>f</CODE>'s are in the
+end the same: the one defined in <CODE>Base</CODE>. The situation is thus simpler
+than in <B>multiple inheritance</B> in languages like C++, because definitions in
+GF are <B>immutable</B>: neither <CODE>M1</CODE> nor <CODE>M2</CODE> can possibly have changed
+the definition of <CODE>f</CODE> given in <CODE>Base</CODE>. In practice, the compiler manages
+inheritance through hierarchy in a very simple way, by just always creating
+a link not to the immediate parent, but the original ancestor; this ancestor
+can be read from the link provided by the immediate parent. Here is how
+links are created from source modules by the compiler:
+</P>
+<PRE>
+ Base {f}
+ M1 {m1} ===&gt; M1 {Base.f, m1}
+ M2 {m2} ===&gt; M2 {Base.f, m2}
+ N {n} ===&gt; N {Base.f, M1.m1, M2.m2, n}
+</PRE>
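+<P>
+Written out as GF source, the diamond could look as follows. This is only a
+sketch: the category <CODE>C</CODE> and the types of the functions are assumptions added
+to make the modules well-formed, and each module would reside in its own file.
+</P>
+<PRE>
+    abstract Base = {cat C ; fun f : C ;}
+    abstract M1 = Base ** {fun m1 : C ;}
+    abstract M2 = Base ** {fun m2 : C ;}
+    abstract N = M1, M2 ** {fun n : C ;}
+</PRE>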
+<P></P>
+<P>
+<a name="restrictedinheritance"></a>
+</P>
+<P>
+Inheritance can be <B>restricted</B>. This means that a module can be specified
+as inheriting <I>only</I> explicitly listed constants, or all constants
+<I>except</I> ones explicitly listed. The syntax uses constant names in brackets,
+prefixed by a minus sign in the case of an exclusion list. In the following
+configuration, <CODE>N</CODE> inherits <CODE>a,b,c</CODE> from <CODE>M1</CODE>, and all names but <CODE>d</CODE>
+from <CODE>M2</CODE>.
+</P>
+<PRE>
+ N = M1 [a,b,c], M2-[d]
+</PRE>
+<P>
+Restrictions are performed as a part of inheritance linking, module by module:
+the link is created for a constant if and only if it is both
+included in the module and compatible with the restriction. Thus,
+for instance, an inadvertent usage can exclude a constant from one module
+but inherit it from another one. In the following
+configuration, <CODE>f</CODE> is excluded from <CODE>M2</CODE> but still inherited via <CODE>M1</CODE>, if <CODE>M1</CODE> contains it.
+</P>
+<PRE>
+ N = M1, M2-[f]
+</PRE>
+<P>
+Unintended inheritance may cause problems later in compilation, in the
+judgement-level dependency analysis phase. For instance, suppose a function
+<CODE>f</CODE> has category <CODE>C</CODE> as its type in <CODE>M</CODE>, and we only include <CODE>f</CODE>. The
+restriction leaves out <CODE>C</CODE> and thus has the effect of creating an ill-formed module:
+</P>
+<PRE>
+ abstract M = {cat C ; fun f : C ;}
+ M [f] ===&gt; {fun f : C ;}
+</PRE>
+<P>
+One might expect inheritance restriction to be transitive: if an included
+constant <I>b</I> depends on some other constant <I>a</I>, then <I>a</I> should be
+included automatically. However, this rule would lead to hard-to-detect
+inheritances. And it could only be applied later in the compilation phase,
+when the compiler has not only collected the names defined, but also
+resolved the names used in definitions.
+</P>
+<P>
+Yet another pitfall with restricted inheritance is that it must be stated
+for each module separately. For instance, a concrete syntax of an abstract
+must exclude all those names that the abstract does, and a functor instantiation
+must replicate all restrictions of the functor.
+</P>
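+<P>
+A minimal sketch of this pitfall, assuming an abstract module <CODE>MoreAdj</CODE> that
+defines <CODE>Odd</CODE> and a concrete syntax <CODE>MoreAdjEng</CODE> of it (the names
+<CODE>SomeAdj</CODE>, <CODE>SomeAdjEng</CODE>, and <CODE>Prime</CODE> are hypothetical): the exclusion
+of <CODE>Odd</CODE> has to be repeated in the concrete syntax.
+</P>
+<PRE>
+    abstract SomeAdj = MoreAdj - [Odd] ** {
+      fun Prime : A ;
+    }
+
+    concrete SomeAdjEng of SomeAdj = MoreAdjEng - [Odd] ** {
+      lin Prime = {s = "prime"} ;
+    }
+</PRE>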
+<A NAME="toc10"></A>
+<H3>Opening</H3>
+<P>
+Opening makes constants from other modules usable in judgements, without
+inheriting them. This means that, unlike inheritance, opening is not
+transitive.
+</P>
+<P>
+<a name="qualifiednames"></a>
+</P>
+<P>
+Opening cannot be restricted as inheritance can, but it can be <B>qualified</B>.
+This means that the names from the opened modules cannot be used as such, but
+only as prefixed by a qualifier and a dot (<CODE>.</CODE>). The qualifier can be any
+identifier, including the name of the module. Here is an example of
+an <I>opens</I> list:
+</P>
+<PRE>
+ open A, (X = XSLTS), (Y = XSLTS), B
+</PRE>
+<P>
+If <CODE>A</CODE> defines the constant <CODE>a</CODE>, it can be accessed by the names
+</P>
+<PRE>
+ a A.a
+</PRE>
+<P>
+If <CODE>XSLTS</CODE> defines the constant <CODE>x</CODE>, it can be accessed by the names
+</P>
+<PRE>
+ X.x Y.x XSLTS.x
+</PRE>
+<P>
+Thus qualification by real module name is always possible, and one and the same
+module can be qualified in different ways at the same time (the latter can
+be useful if you want to be able to change the implementations of some
+constants to a different resource later). Since the qualification with real
+module name is always possible, it is not possible to "swap" the names of
+modules locally:
+</P>
+<PRE>
+ open (A=B), (B=A) -- NOT POSSIBLE!
+</PRE>
+<P>
+The list of qualifier names and module names in a module header may
+thus not contain any duplicates.
+</P>
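+<P>
+As an illustration, the <CODE>AdjFre</CODE> module shown earlier could be written with a
+qualified opening of <CODE>MorphoFre</CODE>. The qualifier <CODE>M</CODE> is an arbitrary choice
+made for this sketch:
+</P>
+<PRE>
+    concrete AdjFre of Adj = open (M = MorphoFre) in {
+      lincat A = {s : M.Gender =&gt; M.Number =&gt; Str} ;
+      lin Even = M.regA "pair" ;
+    }
+</PRE>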
+<A NAME="toc11"></A>
+<H3>Name resolution</H3>
+<P>
+<a name="renaming"></a>
+</P>
+<P>
+<B>Name resolution</B> is the compiler phase taking place after inheritance
+linking. It qualifies all names occurring in the definition parts of judgements
+(that is, just excluding the defined names themselves) with the names of
+the modules they come from. If a name can come from different modules (that is,
+not from their common ancestor), a conflict is reported; this decision is
+hence not dependent on e.g. types, which are known only at a later phase.
+</P>
+<P>
+Qualification of names is the main device for avoiding conflicts in
+name resolution. No other information is used, such as priorities between
+modules. However, if a name is defined in different opened modules
+but never used in the module body,
+a conflict does not arise: conflicts arise only
+when names are used. Also in this respect, opening is thus different from
+inheritance, where conflicts are checked independently of use.
+</P>
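+<P>
+The rule that conflicts arise only on use can be sketched as follows. The
+three modules are hypothetical and would normally live in separate files.
+</P>
+<PRE>
+  resource R1 = { oper x : Str = "one" ; }
+  resource R2 = { oper x : Str = "two" ; }
+
+  resource Use = open R1, R2 in {
+    oper y : Str = R1.x ;    -- qualified: resolved to R1
+    -- oper z : Str = x ;    -- unqualified: would be reported as a conflict
+  }
+</PRE>
+<P>
+As long as the commented-out line is absent, opening both <CODE>R1</CODE> and <CODE>R2</CODE>
+is unproblematic, even though both define <CODE>x</CODE>.
+</P>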
+<P>
+As usual, inner scope has priority in name resolution. This means that
+if an identifier is in scope as a bound variable, it will not be
+interpreted as a constant, unless qualified by a module name
+(variable bindings are explained <a href="#variablebinding">here</a>).
+</P>
+<A NAME="toc12"></A>
+<H3>Functor instantiations</H3>
+<P>
+We have dealt with the principles of module headers, inheritance, and
+names in a general way that applies to all module types. The exception
+is functor instantiations, which have an extra part: the instantiating
+equations, which assign an instance to every interface. Here is a typical
+example, displaying the full generality:
+</P>
+<PRE>
+ concrete FoodsEng of Foods = PhrasesEng **
+ FoodsI-[Pizza] with
+ (Syntax = SyntaxEng),
+ (LexFoods = LexFoodsEng) **
+ open SyntaxEng, ParadigmsEng in {
+ lin Pizza = mkCN (mkA "Italian") (mkN "pie") ;
+ }
+</PRE>
+<P>
+(The example is modified from Section 5.9 in the GF Tutorial.)
+</P>
+<P>
+The instantiation syntax is similar to qualified <I>opens</I>. The left-hand-side
+names must be interfaces, the right-hand-side names their instances. (Recall
+that <CODE>abstract</CODE> can be used as <CODE>interface</CODE> and <CODE>concrete</CODE> as its
+<CODE>instance</CODE>.) Inheritance from the functor can be restricted, typically
+for the purpose of defining some excluded functions in language-specific
+ways in the module body.
+</P>
+<A NAME="toc13"></A>
+<H3>Completeness</H3>
+<P>
+<a name="completeness"></a>
+</P>
+<P>
+(This section refers to the forms of judgement introduced <a href="#judgementforms">here</a>.)
+</P>
+<P>
+A <CODE>concrete</CODE> is complete with respect to an <CODE>abstract</CODE>, if it
+contains a <CODE>lincat</CODE> definition for every <CODE>cat</CODE> declaration, and
+a <CODE>lin</CODE> definition for every <CODE>fun</CODE> declaration.
+</P>
+<P>
+The same completeness criterion applies to functor instantiations.
+It is not possible to use a partial functor instantiation, leading
+to another functor.
+</P>
+<P>
+Functors do not need to be complete in the sense concrete modules need.
+The missing definitions can then be provided in the body of each
+functor instantiation.
+</P>
+<P>
+A <CODE>resource</CODE> is complete, if all its <CODE>oper</CODE> and <CODE>param</CODE> judgements
+have a definition part. While a <CODE>resource</CODE> must be complete, an
+<CODE>interface</CODE> need not. In an <CODE>interface</CODE>, the definition
+parts of judgements are optional.
+</P>
+<P>
+An <CODE>instance</CODE> is complete with respect to an <CODE>interface</CODE>, if it
+gives the definition parts of all <CODE>oper</CODE> and <CODE>param</CODE> judgements
+that are omitted in the <CODE>interface</CODE>. Giving definitions to judgements
+that have already been defined in the <CODE>interface</CODE> is illegal.
+Type signatures, on the other hand, can be repeated if the same types
+are used.
+</P>
+<P>
+In addition to completing the definitions in an <CODE>interface</CODE>,
+its instance may contain other judgements, but these must all
+be complete with definitions.
+</P>
+<P>
+Here is an example of an instance and its interface showing the
+above variations:
+</P>
+<PRE>
+ interface Pos = {
+ param Case ; -- no definition
+ param Number = Sg | Pl ; -- definition given
+ oper Noun : Type = { -- relative definition given
+ s : Number =&gt; Case =&gt; Str
+ } ;
+ oper regNoun : Str -&gt; Noun ; -- no definition
+ }
+
+ instance PosEng of Pos = {
+ param Case = Nom | Gen ; -- definition of Case
+ -- Number and Noun inherited
+ oper regNoun = \dog -&gt; { -- type of regNoun inherited
+ s = table { -- definition of regNoun
+ Sg =&gt; table {
+ Nom =&gt; dog
+ -- etc
+ }
+ }
+ } ;
+ oper house_N : Noun = -- new definition
+ regNoun "house" ;
+ }
+</PRE>
+<P></P>
+<A NAME="toc14"></A>
+<H2>Judgements</H2>
+<A NAME="toc15"></A>
+<H3>Overview of the forms of judgement</H3>
+<P>
+<a name="judgementforms"></a>
+</P>
+<P>
+A module body in GF is a set of <B>judgements</B>. Judgements are
+definitions or declarations, sometimes combinations of the two; the
+common feature is that every judgement introduces a name, which is
+available in the module and whenever the module is extended or opened.
+</P>
+<P>
+There are several different <B>forms of judgement</B>, identified by different
+<B>judgement keywords</B>. Here is a list of all these forms, together
+with syntax descriptions and the types of modules in which each form can occur.
+The table moreover indicates whether the judgement has a default value, and
+whether it contributes to the <B>name base</B>, i.e. introduces a new
+name to the scope.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>judgement</TH>
+<TH>where</TH>
+<TH>module</TH>
+<TH>default</TH>
+<TH COLSPAN="2">base</TH>
+</TR>
+<TR>
+<TD><CODE>cat</CODE> C G</TD>
+<TD>G context</TD>
+<TD>abstract</TD>
+<TD>N/A</TD>
+<TD>yes</TD>
+</TR>
+<TR>
+<TD><CODE>fun</CODE> f : A</TD>
+<TD>A type</TD>
+<TD>abstract</TD>
+<TD>N/A</TD>
+<TD>yes</TD>
+</TR>
+<TR>
+<TD><CODE>def</CODE> f ps = t</TD>
+<TD>f fun, ps patterns, t term</TD>
+<TD>abstract</TD>
+<TD>yes</TD>
+<TD>no</TD>
+</TR>
+<TR>
+<TD><CODE>data</CODE> C = f <CODE>|</CODE> ... <CODE>|</CODE> g</TD>
+<TD>C cat, f...g fun</TD>
+<TD>abstract</TD>
+<TD>yes</TD>
+<TD>no</TD>
+</TR>
+<TR>
+<TD><CODE>lincat</CODE> C = T</TD>
+<TD>C cat, T type</TD>
+<TD>concrete*</TD>
+<TD>yes</TD>
+<TD>yes</TD>
+</TR>
+<TR>
+<TD><CODE>lin</CODE> f = t</TD>
+<TD>f fun, t term</TD>
+<TD>concrete*</TD>
+<TD>no</TD>
+<TD>yes</TD>
+</TR>
+<TR>
+<TD><CODE>lindef</CODE> C = t</TD>
+<TD>C cat, t term</TD>
+<TD>concrete*</TD>
+<TD>yes</TD>
+<TD>no</TD>
+</TR>
+<TR>
+<TD><CODE>printname cat</CODE> C = t</TD>
+<TD>C cat, t term</TD>
+<TD>concrete*</TD>
+<TD>yes</TD>
+<TD>no</TD>
+</TR>
+<TR>
+<TD><CODE>printname fun</CODE> f = t</TD>
+<TD>f fun, t term</TD>
+<TD>concrete*</TD>
+<TD>yes</TD>
+<TD>no</TD>
+</TR>
+<TR>
+<TD><CODE>param</CODE> P = C<CODE>|</CODE> ... <CODE>|</CODE> D</TD>
+<TD>C...D constructors</TD>
+<TD>resource*</TD>
+<TD>N/A</TD>
+<TD>yes</TD>
+</TR>
+<TR>
+<TD><CODE>oper</CODE> f : T = t</TD>
+<TD>T type, t term</TD>
+<TD>resource*</TD>
+<TD>N/A</TD>
+<TD>yes</TD>
+</TR>
+<TR>
+<TD><CODE>flags</CODE> o = v</TD>
+<TD>o flag, v value</TD>
+<TD>all</TD>
+<TD>yes</TD>
+<TD>N/A</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+Judgements that have default values are rarely used, except <CODE>lincat</CODE> and
+<CODE>flags</CODE>, which often need values different from the defaults.
+</P>
+<P>
+Introducing a name twice in the same module is an error. In other words,
+all judgements that have a "yes" in the name base column must
+have distinct identifiers on their left-hand sides.
+</P>
+<P>
+All judgements end with semicolons (<CODE>;</CODE>).
+</P>
+<P>
+In addition to the syntax given in the table, many of the forms have
+syntactic sugar. This sugar will be explained below in connection to
+each form. There are moreover two kinds of syntactic sugar common to all forms:
+</P>
+<UL>
+<LI>the judgement keyword is shared between consecutive judgements
+ until a new keyword appears:
+<center>
+<CODE>keyw J ; K ;</CODE> === <CODE>keyw J ; keyw K ;</CODE>
+</center>
+<LI>the right-hand sides of colon (<CODE>:</CODE>) and equality (<CODE>=</CODE>)
+ can be shared, by using comma (<CODE>,</CODE>) as separator of left-hand sides, which
+ must consist of identifiers
+<center>
+<CODE>c,d : T</CODE> === <CODE>c : T ; d : T ;</CODE>
+<P></P>
+<CODE>c,d = t</CODE> === <CODE>c = t ; d = t ;</CODE>
+</center>
+</UL>
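+<P>
+A small illustration of the two kinds of sugar, using made-up category and
+function names:
+</P>
+<PRE>
+  cat S ; NP ; VP ;
+  fun John, Mary : NP ;
+
+  -- is expanded to
+
+  cat S ; cat NP ; cat VP ;
+  fun John : NP ; fun Mary : NP ;
+</PRE>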
+
+<P>
+These conventions, like all syntactic sugar, are expanded at an
+early compilation phase, directly after parsing. This means that e.g.
+</P>
+<PRE>
+ lin f,g = \x -&gt; x ;
+</PRE>
+<P>
+can be correct even though <CODE>f</CODE> and <CODE>g</CODE> require different
+function types.
+</P>
+<P>
+Within a module, judgements can occur in any order. In particular,
+a name can be used before it is introduced.
+</P>
+<P>
+The explanations of judgement forms refer to the notions
+of <B>type</B> and <B>term</B> (the latter also called <B>expression</B>).
+These notions will be explained in detail <a href="#expressions">here</a>.
+</P>
+<A NAME="toc16"></A>
+<H3>Category declarations, cat</H3>
+<P>
+<a name="catjudgements"></a>
+</P>
+<P>
+Category declarations
+<center>
+<CODE>cat</CODE> <I>C</I> <I>G</I>
+</center>
+define the <B>basic types</B> of abstract syntax.
+A basic type is formed from a category by giving values to all variables
+in the <B>context</B> <I>G</I>. If the context is empty, the
+basic type looks the same as the category itself. Otherwise, application
+syntax is used:
+<center>
+<I>C</I> <i>a</i><sub>1</sub>...<i>a</i><sub>n</sub>
+</center>
+</P>
+<A NAME="toc17"></A>
+<H3>Hypotheses and contexts</H3>
+<P>
+<a name="contexts"></a>
+</P>
+<P>
+A context is a sequence of <B>hypotheses</B>, i.e. variable-type pairs.
+A hypothesis is written
+<center>
+<CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE>
+</center>
+and a sequence does not have any separator symbols. As syntactic sugar,
+</P>
+<UL>
+<LI>variables can share a type,
+<center>
+<CODE>(</CODE> <I>x,y</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> === <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> <CODE>(</CODE> <I>y</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE>
+</center>
+<LI>a <B>wildcard</B> can be used for a variable not occurring in types
+ later in the context,
+<center>
+<CODE>(</CODE> <CODE>_</CODE> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> === <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE>
+</center>
+<LI>if the variable does not occur later, it can be omitted altogether, and
+ parentheses are not used,
+<center>
+ <I>T</I> === <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE>
+</center>
+ But if <I>T</I> is more complex than an identifier, it needs parentheses to
+ be separated from the rest of the context.
+</UL>
+
+<P>
+An abstract syntax has <B>dependent types</B>, if any of its categories has
+a non-empty context.
+</P>
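+<P>
+The following sketch shows contexts in category declarations. All names are
+hypothetical; the last line uses the sugar in which unused variables are omitted.
+</P>
+<PRE>
+  cat Set ;                          -- empty context
+  cat El (A : Set) ;                 -- one hypothesis
+  cat Eq (A : Set) (a,b : El A) ;    -- a and b share the type El A
+  cat Pair Set Set ;                 -- same as (_ : Set) (_ : Set)
+</PRE>
+<P>
+Basic types formed from these categories are, for instance, <CODE>Set</CODE>,
+<CODE>El A</CODE>, and <CODE>Eq A a b</CODE>. Since <CODE>El</CODE>, <CODE>Eq</CODE>, and <CODE>Pair</CODE> have
+non-empty contexts, this abstract syntax has dependent types.
+</P>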
+<A NAME="toc18"></A>
+<H3>Function declarations, fun</H3>
+<P>
+Function declarations,
+<center>
+ <CODE>fun</CODE> <I>f</I> <CODE>:</CODE> <I>T</I>
+</center>
+define the <B>syntactic constructors</B> of abstract
+syntax. The type <I>T</I> of <I>f</I>
+is built from basic types (formed from categories) by using
+the function type constructor <CODE>-&gt;</CODE>. Thus its form is
+<center>
+ (<i>x</i><sub>1</sub> <CODE>:</CODE> <i>A</i><sub>1</sub>) <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> (<i>x</i><sub>n</sub> <CODE>:</CODE> <i>A</i><sub>n</sub>) <CODE>-&gt;</CODE> <I>B</I>
+</center>
+where <i>A</i><sub>i</sub> are types, called the <B>argument types</B>, and <I>B</I> is a
+basic type, called the <B>value type</B> of <I>f</I>. The <B>value category</B> of
+<I>f</I> is the category that forms the type <I>B</I>.
+</P>
+<P>
+A <B>syntax tree</B> is formed from <I>f</I> by applying it to a full list of
+arguments, so that the result is of a basic type.
+</P>
+<P>
+A <B>higher-order function</B> is one that has a function type as an
+argument. The concrete syntax of GF does not support displaying the
+bound variables of functions of higher than second order, but they are
+legal in abstract syntax.
+</P>
+<P>
+An abstract syntax is <B>context-free</B>, if it has neither dependent
+types nor higher-order functions. Grammars with context-free abstract
+syntax are an important subclass of GF, with more limited complexity
+than full GF. Whether the <I>concrete</I> syntax is context-free in the sense
+of the Chomsky hierarchy is independent of the context-freeness of
+the abstract syntax.
+</P>
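+<P>
+As an illustration with made-up names, the first three functions below are
+first-order, whereas <CODE>All</CODE> is a higher-order function, because its
+argument type is itself a function type:
+</P>
+<PRE>
+  cat Prop ; Ind ;
+  fun Zero : Ind ;
+  fun Even : Ind -&gt; Prop ;              -- value category Prop
+  fun And  : Prop -&gt; Prop -&gt; Prop ;
+  fun All  : (Ind -&gt; Prop) -&gt; Prop ;    -- higher-order
+</PRE>
+<P>
+Without <CODE>All</CODE>, this abstract syntax would be context-free. An example of
+a syntax tree is <CODE>And (Even Zero) (Even Zero)</CODE>, which has the basic type <CODE>Prop</CODE>.
+</P>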
+<A NAME="toc19"></A>
+<H3>Function definitions, def</H3>
+<P>
+Function definitions,
+<center>
+ <CODE>def</CODE> <I>f</I> <i>p</i><sub>1</sub> ... <i>p</i><sub>n</sub> <CODE>=</CODE> <I>t</I>
+</center>
+where <I>f</I> is a <CODE>fun</CODE> function and <i>p</i><sub>i</sub> are patterns,
+impose a relation of <B>definitional equality</B> on abstract syntax
+trees. They form the basis of <B>computation</B>, which is used
+when comparing whether two types are equal; this notion is relevant
+only if the types are dependent. Computation can also be used for
+the <B>normalization</B> of syntax trees, which applies even in
+context-free abstract syntax.
+</P>
+<P>
+The set of <CODE>def</CODE> definitions for <I>f</I> can be scattered around
+the module in which <I>f</I> is introduced as a function. The compiler
+builds the set of pattern equations in the order in which the
+equations appear; this order is significant in the case of
+overlapping patterns. All equations must appear in the same module in
+which <I>f</I> itself is declared.
+</P>
+<P>
+The syntax of patterns will be specified <a href="#patternmatching">here</a>, commonly for
+abstract and concrete syntax. In abstract
+syntax, <B>constructor patterns</B> are those of the form
+<center>
+ <I>C</I> <i>p</i><sub>1</sub> ... <i>p</i><sub>n</sub>
+</center>
+where <I>C</I> is declared as <CODE>data</CODE> for some abstract syntax category
+(see next section). A <B>variable pattern</B> is either an identifier or
+a wildcard.
+</P>
+<P>
+A common pitfall is to forget to declare a constructor as data, which
+causes it to be interpreted as a variable pattern in definitions.
+</P>
+<P>
+Computation is performed by applying definitions and beta conversions,
+and in general by using <B>pattern matching</B>. Computation and pattern matching
+are explained commonly for abstract and concrete syntax <a href="#patternmatching">here</a>.
+</P>
+<P>
+In contrast to concrete syntax, abstract syntax computation is
+completely <B>symbolic</B>: it does not produce a value, but just another
+term. Hence it is not an error to have incomplete systems of
+pattern equations for a function. In addition, the definitions
+can be <B>recursive</B>, which means that computation can fail to terminate;
+this can never happen in concrete syntax.
+</P>
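+<P>
+A sketch of a recursive definition by pattern equations, with hypothetical
+names, is addition on unary natural numbers. The <CODE>data</CODE> judgement is
+needed for <CODE>Zero</CODE> and <CODE>Succ</CODE> to be treated as constructor patterns
+rather than variables:
+</P>
+<PRE>
+  cat Nat ;
+  fun Zero : Nat ;
+  fun Succ : Nat -&gt; Nat ;
+  data Nat = Zero | Succ ;
+
+  fun plus : Nat -&gt; Nat -&gt; Nat ;
+  def plus x Zero = x ;
+  def plus x (Succ y) = Succ (plus x y) ;
+</PRE>
+<P>
+With these equations, <CODE>plus (Succ Zero) (Succ Zero)</CODE> computes to
+<CODE>Succ (plus (Succ Zero) Zero)</CODE> and further to <CODE>Succ (Succ Zero)</CODE>.
+</P>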
+<A NAME="toc20"></A>
+<H3>Data constructor definitions, data</H3>
+<P>
+A data constructor definition,
+<center>
+ <CODE>data</CODE> <I>C</I> <CODE>=</CODE> <i>f</i><sub>1</sub> <CODE>|</CODE> ... <CODE>|</CODE> <i>f</i><sub>n</sub>
+</center>
+defines the functions <i>f</i><sub>1</sub>...<i>f</i><sub>n</sub> to be <B>constructors</B>
+of the category <I>C</I>. This means that they are recognized as constructor
+patterns when used in function definitions.
+</P>
+<P>
+In order for the data constructor definition to be correct,
+<i>f</i><sub>1</sub>...<i>f</i><sub>n</sub> must be functions with <I>C</I> as their value category.
+</P>
+<P>
+The complete set of constructors for a category <I>C</I> is the union of
+all its data constructor definitions. Thus a category can be "extended"
+by new constructors afterwards. However, all these constructor definitions
+must appear in the same module in which the category is itself defined.
+</P>
+<P>
+There is syntactic sugar for declaring a function as a constructor at
+the same time as introducing it:
+<center>
+<CODE>data</CODE> <I>f</I> : <i>A</i><sub>1</sub> <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> <i>A</i><sub>n</sub> <CODE>-&gt;</CODE> <I>C</I> <i>t</i><sub>1</sub> ... <i>t</i><sub>m</sub>
+<P></P>
+ ===
+<P></P>
+<CODE>fun</CODE> <I>f</I> : <i>A</i><sub>1</sub> <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> <i>A</i><sub>n</sub> <CODE>-&gt;</CODE> <I>C</I> <i>t</i><sub>1</sub> ... <i>t</i><sub>m</sub> ;
+ <CODE>data</CODE> <I>C</I> = <I>f</I>
+</center>
+</P>
+<A NAME="toc21"></A>
+<H3>The semantic status of an abstract syntax function</H3>
+<P>
+There are three possible statuses for a function declared in a <CODE>fun</CODE> judgement:
+</P>
+<UL>
+<LI>primitive notion: the default status
+<LI>constructor: the function appears on the right-hand side of a <CODE>data</CODE> judgement
+<LI>defined: the function has a <CODE>def</CODE> definition
+</UL>
+
+<P>
+The "constructor" and "defined" statuses are in contradiction with each other,
+whereas the primitive notion status is overridden by either of the other two.
+</P>
+<P>
+This distinction is relevant for the semantics of abstract syntax, not
+for concrete syntax. It shows in the way patterns are treated in
+equations in <CODE>def</CODE> definitions: a constructor
+in a pattern matches only itself, whereas
+any other name is treated as a variable pattern, which matches
+anything.
+</P>
+<A NAME="toc22"></A>
+<H3>Linearization type definitions, lincat</H3>
+<P>
+A linearization type definition,
+<center>
+ <CODE>lincat</CODE> <I>C</I> <CODE>=</CODE> <I>T</I>
+</center>
+defines the type of linearizations of trees whose type has category <I>C</I>.
+Type dependences have no effect on the linearization type.
+</P>
+<P>
+The type <I>T</I> must be a <B>legal linearization type</B>, which means that it
+is a <I>record type</I> whose fields have either parameter types, the type Str
+of strings, or table or record types of these. In particular, function types
+may not appear in <I>T</I>. A detailed explanation of types in concrete syntax
+will be given <a href="#cnctypes">here</a>.
+</P>
+<P>
+If <I>K</I> is the concrete syntax of an abstract syntax <I>A</I>, then <I>K</I> must
+define the linearization type of all categories declared in <I>A</I>. However,
+the definition can be omitted from the source code, in which case the default
+type <CODE>{s : Str}</CODE> is used.
+</P>
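+<P>
+For illustration, assume an abstract syntax with the categories <CODE>CN</CODE>,
+<CODE>NP</CODE>, and <CODE>S</CODE> (hypothetical names). A concrete syntax could then
+define their linearization types as follows:
+</P>
+<PRE>
+  param Number = Sg | Pl ;
+  lincat CN = {s : Number =&gt; Str} ;      -- a table field
+  lincat NP = {s : Str ; n : Number} ;   -- a string field and a parameter field
+  -- no lincat for S: the default {s : Str} is used
+</PRE>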
+<A NAME="toc23"></A>
+<H3>Linearization definitions, lin</H3>
+<P>
+A linearization definition,
+<center>
+ <CODE>lin</CODE> <I>f</I> <CODE>=</CODE> <I>t</I>
+</center>
+defines the linearization function of function <I>f</I>, i.e. the function
+used for linearizing trees formed by <I>f</I>.
+</P>
+<P>
+The type of <I>t</I> must be the homomorphic image of the type of <I>f</I>.
+In other words, if
+<center>
+ <CODE>fun</CODE> <I>f</I> <CODE>:</CODE> <i>A</i><sub>1</sub> <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> <i>A</i><sub>n</sub> <CODE>-&gt;</CODE> <I>A</I>
+</center>
+then
+<center>
+ <CODE>lin</CODE> <I>f</I> <CODE>:</CODE> <i>A</i><sub>1</sub>* <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> <i>A</i><sub>n</sub>* <CODE>-&gt;</CODE> <I>A</I>*
+</center>
+where the type <I>T</I>* is defined as follows depending on <I>T</I>:
+</P>
+<UL>
+<LI>(<I>C</I> <i>t</i><sub>1</sub> ... <i>t</i><sub>n</sub>)* = <I>T</I>, if <CODE>lincat</CODE> <I>C</I> <CODE>=</CODE> <I>T</I>
+<LI>(<i>B</i><sub>1</sub> <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> <i>B</i><sub>m</sub> <CODE>-&gt;</CODE> <I>B</I>)* = <I>B</I>* <CODE>** {$0,...,$m : Str}</CODE>
+</UL>
+
+<P>
+The second case is relevant for higher-order functions only. It says that
+the linearization type of the value type is extended by adding a string field
+for each argument type; these fields store the variable symbol used for
+the binding of each variable.
+</P>
+<P>
+<a name="HOAS"></a>
+</P>
+<P>
+Since the arguments of a function argument are treated as bare strings,
+orders higher than the second are irrelevant for concrete syntax.
+</P>
+<P>
+There is syntactic sugar for binding the variables of the linearization
+of a function on the left-hand side:
+<center>
+ <CODE>lin</CODE> <I>f</I> <I>p</I> <CODE>=</CODE> <I>t</I> === <CODE>lin</CODE> <I>f</I> <CODE>= \</CODE><I>p</I> <CODE>-&gt;</CODE> <I>t</I>
+</center>
+The pattern <I>p</I> must be either a variable or a wildcard (<CODE>_</CODE>); this is
+what the syntax of lambda abstracts (<CODE>\p -&gt; t</CODE>) requires.
+</P>
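+<P>
+As a sketch, assume the abstract syntax declares
+<CODE>fun Pred : NP -&gt; VP -&gt; S</CODE>, and that the linearization types are
+<CODE>{s : Str ; n : Number}</CODE> for <CODE>NP</CODE>, <CODE>{s : Number =&gt; Str}</CODE> for <CODE>VP</CODE>,
+and <CODE>{s : Str}</CODE> for <CODE>S</CODE>. All of these names are assumptions made for
+the example. The linearization of <CODE>Pred</CODE> can then be written with the
+variables bound on the left-hand side:
+</P>
+<PRE>
+  lin Pred np vp = {s = np.s ++ vp.s ! np.n} ;
+
+  -- which is sugar for
+  -- lin Pred = \np,vp -&gt; {s = np.s ++ vp.s ! np.n} ;
+</PRE>
+<P>
+The type of the right-hand side is <CODE>NP</CODE>* <CODE>-&gt;</CODE> <CODE>VP</CODE>* <CODE>-&gt;</CODE> <CODE>S</CODE>*,
+the homomorphic image of the type of <CODE>Pred</CODE>.
+</P>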
+<A NAME="toc24"></A>
+<H3>Linearization default definitions, lindef</H3>
+<P>
+<a name="lindefjudgements"></a>
+</P>
+<P>
+A linearization default definition,
+<center>
+ <CODE>lindef</CODE> <I>C</I> <CODE>=</CODE> <I>t</I>
+</center>
+defines the default linearization of category <I>C</I>, i.e. the function
+applicable to a string to make it into an object of the linearization
+type of <I>C</I>.
+</P>
+<P>
+Linearization defaults are invoked when linearizing variable bindings
+in higher-order abstract syntax. A variable symbol is then presented
+as a string, which must be converted to the correct type in order for
+linearization not to fail with an error.
+</P>
+<P>
+The defaults can also be used for linearizing metavariables
+in an interactive syntax editor.
+</P>
+<P>
+Usually, linearization defaults are generated by using the default
+rule that "uses the symbol itself for every string, and the
+first value of the parameter type for every parameter". The precise
+definition is by structural recursion on the type:
+</P>
+<UL>
+<LI>default(Str,s) = s
+<LI>default(P,s) = #1(P)
+<LI>default(P =&gt; T,s) = <CODE>\\_ =&gt;</CODE> default(T,s)
+<LI>default(<CODE>{</CODE>... ; r : R ; ...<CODE>}</CODE>,s) = <CODE>{</CODE>... ; r = default(R,s) ; ...<CODE>}</CODE>
+</UL>
+
+<P>
+The notion of the first value of a parameter type (#1(P)) is defined
+<a href="#paramvalues">here</a>.
+</P>
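+<P>
+As a worked example of the default rule, take a hypothetical category <CODE>CN</CODE>
+with a table field and a parameter field, and assume that <CODE>Masc</CODE> is the
+first value of <CODE>Gender</CODE>. The default rule then corresponds to the following
+explicit <CODE>lindef</CODE>:
+</P>
+<PRE>
+  param Number = Sg | Pl ;
+  param Gender = Masc | Fem ;
+  lincat CN = {s : Number =&gt; Str ; g : Gender} ;
+
+  lindef CN = \str -&gt; {
+    s = \\_ =&gt; str ;   -- the symbol itself in every table branch
+    g = Masc           -- assumed first value of Gender
+  } ;
+</PRE>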
+<A NAME="toc25"></A>
+<H3>Printname definitions, printname cat and printname fun</H3>
+<P>
+A category printname definition,
+<center>
+ <CODE>printname cat</CODE> <I>C</I> <CODE>=</CODE> <I>s</I>
+</center>
+defines the printname of category <I>C</I>, i.e. the name used
+in some abstract syntax information shown to the user.
+</P>
+<P>
+Likewise, a function printname definition,
+<center>
+ <CODE>printname fun</CODE> <I>f</I> <CODE>=</CODE> <I>s</I>
+</center>
+defines the printname of function <I>f</I>, i.e. the name used
+in some abstract syntax information shown to the user.
+</P>
+<P>
+The most common use of printnames is in the interactive syntax
+editor, where printnames are displayed in menus. It is possible
+e.g. to adapt them to each language, or to embed HTML tooltips
+in them (as is used in some HTML-based editor GUIs).
+</P>
+<P>
+Usually, printnames are generated automatically from the symbol
+and/or concrete syntax information.
+</P>
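+<P>
+A minimal sketch of explicit printnames, assuming an abstract syntax with a
+category <CODE>NP</CODE> and a function <CODE>Pred</CODE> (both hypothetical):
+</P>
+<PRE>
+  printname cat NP   = "noun phrase" ;
+  printname fun Pred = "predication" ;
+</PRE>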
+<A NAME="toc26"></A>
+<H3>Parameter type definitions, param</H3>
+<P>
+<a name="paramjudgements"></a>
+</P>
+<P>
+A parameter type definition,
+<center>
+ <CODE>param</CODE> <I>P</I> <CODE>=</CODE> <i>C</i><sub>1</sub> <i>G</i><sub>1</sub> <CODE>|</CODE> ... <CODE>|</CODE> <i>C</i><sub>n</sub> <i>G</i><sub>n</sub>
+</center>
+defines a parameter type <I>P</I> with the <B>parameter constructors</B>
+<i>C</i><sub>1</sub>...<i>C</i><sub>n</sub>, with their respective contexts <i>G</i><sub>1</sub>...<i>G</i><sub>n</sub>.
+</P>
+<P>
+<a name="paramtypes"></a>
+</P>
+<P>
+Contexts have the same syntax as in <CODE>cat</CODE> judgements, explained
+<a href="#catjudgements">here</a>. Since dependent types are not available in
+parameter type definitions, the use of variables is never
+necessary. The types in the context must themselves be <B>parameter types</B>,
+which are defined as follows:
+</P>
+<UL>
+<LI>Given the judgement <CODE>param</CODE> <I>P</I> ..., <I>P</I> is a parameter type.
+<LI>A record type of parameter types is a parameter type.
+<LI><CODE>Ints</CODE> <I>n</I> (an initial segment of integers) is a parameter type.
+</UL>
+
+<P>
+The names defined by a parameter type definition include both the
+type name <I>P</I> and the constructor names <i>C</i><sub>i</sub>. Therefore all these
+names must be distinct in a module.
+</P>
+<P>
+A parameter type may not be recursive, i.e. <I>P</I> itself may not occur in
+the contexts of its constructors. This restriction extends to mutual
+recursion: we say that <I>P</I> <B>depends</B> on the types that occur
+in the contexts of its constructors and on all types that those types
+depend on, and state that <I>P</I> may not depend on itself.
+</P>
+<P>
+In an <CODE>interface</CODE> module, it is possible to declare a parameter type
+without defining it,
+<center>
+ <CODE>param</CODE> <I>P</I> <CODE>;</CODE>
+</center>
+</P>
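+<P>
+The following sketch, with made-up type and constructor names, shows
+constructors with and without contexts:
+</P>
+<PRE>
+  param Number = Sg | Pl ;
+  param Person = P1 | P2 | P3 ;
+  param Agr    = Ag Number Person ;      -- context of two parameter types
+  param VForm  = Inf | Pres Agr | Past ; -- VForm depends on Agr
+</PRE>
+<P>
+Here <CODE>Agr</CODE> has 2 * 3 = 6 values and <CODE>VForm</CODE> has 1 + 6 + 1 = 8 values.
+A definition such as <CODE>param List = Nil | Cons Number List</CODE> would be
+rejected, because <CODE>List</CODE> would depend on itself.
+</P>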
+<A NAME="toc27"></A>
+<H3>Parameter values</H3>
+<P>
+<a name="paramvalues"></a>
+</P>
+<P>
+All parameter types are finite, and the GF compiler will internally
+compute them to <B>lists of parameter values</B>. These lists are formed by
+traversing the <CODE>param</CODE> definitions, usually respecting the
+order of constructors in the source code. For records, lexicographic
+sorting is applied. However, both the order of traversal of <CODE>param</CODE>
+definitions and the order of fields in a record are specified
+in a compiler-internal way, which means that the programmer should not
+rely on any particular order.
+</P>
+<P>
+The order of the list of parameter values can affect the program in two
+cases:
+</P>
+<UL>
+<LI>in the default <CODE>lindef</CODE> definition (<a href="#lindefjudgements">here</a>),
+ the first value is chosen
+<LI>in course-of-values tables (<a href="#tables">here</a>), the compiler-internal order is
+ followed
+</UL>
+
+<P>
+The first usage implies that, if <CODE>lindef</CODE> definitions are essential for
+the application, they should be given manually. The second usage implies that
+course-of-values tables should be avoided in hand-written GF code.
+</P>
+<P>
+In run-time grammar generation, all parameter values are translated to
+integers denoting positions in these parameter lists.
+</P>
+<A NAME="toc28"></A>
+<H3>Operation definitions, oper</H3>
+<P>
+An operation definition,
+<center>
+ <CODE>oper</CODE> <I>h</I> <CODE>:</CODE> <I>T</I> <CODE>=</CODE> <I>t</I>
+</center>
+defines an <B>operation</B> <I>h</I> of type <I>T</I>, with the computation rule
+<center>
+ <I>h</I> ==&gt; <I>t</I>
+</center>
+The type <I>T</I> can be any concrete syntax type, including function
+types of any order. The term <I>t</I> must have the type <I>T</I>, as
+defined <a href="#expressions">here</a>.
+</P>
+<P>
+As syntactic sugar, the type can be omitted,
+<center>
+ <CODE>oper</CODE> <I>h</I> <CODE>=</CODE> <I>t</I>
+</center>
+which works in two cases:
+</P>
+<UL>
+<LI>the type can be inferred from <I>t</I> (compiler-dependent)
+<LI>the definition occurs in an <CODE>instance</CODE> and the type is given in
+ the <CODE>interface</CODE>
+</UL>
+
+<P>
+It is also possible to give the type and the definition separately:
+<center>
+<CODE>oper</CODE> <I>h</I> <CODE>:</CODE> <I>T</I> ; <CODE>oper</CODE> <I>h</I> <CODE>=</CODE> <I>t</I> ===
+ <CODE>oper</CODE> <I>h</I> <CODE>:</CODE> <I>T</I> <CODE>=</CODE> <I>t</I>
+</center>
+The order of the type part and the definition part is free, and there
+can be other judgements in between. However, they must occur in the
+same <CODE>resource</CODE> module for it to be complete (as defined <a href="#completeness">here</a>).
+In an <CODE>interface</CODE> module, it is enough to give the type.
+</P>
+<P>
+When only the definition is given, it is possible to use a shorthand
+similar to <CODE>lin</CODE> judgements:
+<center>
+<CODE>oper</CODE> <I>h</I> <I>p</I> <CODE>=</CODE> <I>t</I> === <CODE>oper</CODE> <I>h</I> <CODE>=</CODE> <CODE>\</CODE><I>p</I> <CODE>-&gt;</CODE> <I>t</I>
+</center>
+The pattern <I>p</I> is either a variable or a wildcard (<CODE>_</CODE>).
+</P>
+<P>
+Operation definitions may not be recursive, not even mutually recursive.
+This condition ensures that functions can in the end be eliminated from
+concrete syntax code (as explained <a href="#functionelimination">here</a>).
+</P>
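+<P>
+The forms described above can be sketched in a small resource. The names
+<CODE>Noun</CODE>, <CODE>mkNoun</CODE>, and <CODE>regNoun</CODE> are chosen for illustration only.
+</P>
+<PRE>
+  param Number = Sg | Pl ;
+
+  oper Noun : Type = {s : Number =&gt; Str} ;   -- an operation defining a type
+
+  oper mkNoun : Str -&gt; Str -&gt; Noun ;         -- type part
+  oper mkNoun = \sg,pl -&gt; {                  -- definition part, given separately
+    s = table {Sg =&gt; sg ; Pl =&gt; pl}
+  } ;
+
+  oper regNoun : Str -&gt; Noun =               -- type and definition together
+    \dog -&gt; mkNoun dog (dog + "s") ;
+</PRE>
+<P>
+None of these operations refers to itself, directly or indirectly, so the
+non-recursion condition is satisfied.
+</P>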
+<A NAME="toc29"></A>
+<H3>Operation overloading</H3>
+<P>
+<a name="overloading"></a>
+</P>
+<P>
+One and the same operation name <I>h</I> can be used for different operations,
+which have to have different types. For each call of <I>h</I>, the type checker
+selects one of these operations depending on what type is expected in the
+context of the call. The syntax of overloaded operation definitions is
+<center>
+<CODE>oper</CODE> <I>h</I>
+ <CODE>= overload {</CODE><I>h</I> : <i>T</i><sub>1</sub> = <i>t</i><sub>1</sub> ; ... ; <I>h</I> : <i>T</i><sub>n</sub> = <i>t</i><sub>n</sub><CODE>}</CODE>
+</center>
+Notice that <I>h</I> must be the same in all cases.
+This format can be used to give the complete implementation; to give just
+the types, e.g. in an interface, one can use the form
+<center>
+<CODE>oper</CODE> <I>h</I>
+ <CODE>: overload {</CODE><I>h</I> : <i>T</i><sub>1</sub> ; ... ; <I>h</I> : <i>T</i><sub>n</sub><CODE>}</CODE>
+</center>
+The implementation of this operation typing is given by a judgement of
+the first form. The order of branches need not be the same.
+</P>
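+<P>
+As a sketch, one overloaded operation can cover both regular and irregular
+nouns. The names <CODE>Noun</CODE>, <CODE>mkN</CODE>, <CODE>house_N</CODE>, and <CODE>man_N</CODE> are
+hypothetical.
+</P>
+<PRE>
+  param Number = Sg | Pl ;
+  oper Noun : Type = {s : Number =&gt; Str} ;
+
+  oper mkN = overload {
+    mkN : Str -&gt; Noun =
+      \dog -&gt; {s = table {Sg =&gt; dog ; Pl =&gt; dog + "s"}} ;
+    mkN : Str -&gt; Str -&gt; Noun =
+      \sg,pl -&gt; {s = table {Sg =&gt; sg ; Pl =&gt; pl}}
+  } ;
+
+  oper house_N : Noun = mkN "house" ;       -- selects the first branch
+  oper man_N   : Noun = mkN "man" "men" ;   -- selects the second branch
+</PRE>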
+<A NAME="toc30"></A>
+<H3>Flag definitions, flags</H3>
+<P>
+A flag definition,
+<center>
+ <CODE>flags</CODE> <I>o</I> <CODE>=</CODE> <I>v</I>
+</center>
+sets the value of the flag <I>o</I>, to be used when compiling or using
+the module.
+</P>
+<P>
+The flag <I>o</I> is an identifier, and the value <I>v</I> is either an identifier
+or a quoted string.
+</P>
+<P>
+Flags are a kind of metadata, which do not strictly belong to the GF
+language. For instance, compilers do not necessarily check the
+consistency of flags, or the meaningfulness of their values.
+The inheritance of flags is not well-defined; the only certain rule
+is that flags set in the module body override the settings from
+inherited modules.
+</P>
+<P>
+Here are some flags commonly included in grammars.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>flag</TH>
+<TH>value</TH>
+<TH>description</TH>
+<TH COLSPAN="2">module</TH>
+</TR>
+<TR>
+<TD><CODE>coding</CODE></TD>
+<TD>character encoding</TD>
+<TD>encoding used in string literals</TD>
+<TD>concrete</TD>
+</TR>
+<TR>
+<TD><CODE>lexer</CODE></TD>
+<TD>predefined lexer</TD>
+<TD>lexer before parsing</TD>
+<TD>concrete</TD>
+</TR>
+<TR>
+<TD><CODE>startcat</CODE></TD>
+<TD>category</TD>
+<TD>default target of parsing</TD>
+<TD>abstract</TD>
+</TR>
+<TR>
+<TD><CODE>unlexer</CODE></TD>
+<TD>predefined unlexer</TD>
+<TD>unlexer after linearization</TD>
+<TD>concrete</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+The possible values of these flags are specified <a href="#flagvalues">here</a>.
+</P>
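+<P>
+A sketch of typical flag settings follows. The category name <CODE>Phrase</CODE> is
+hypothetical, and <CODE>utf8</CODE> is assumed here to be among the supported
+encodings.
+</P>
+<PRE>
+  flags startcat = Phrase ;   -- in an abstract module
+  flags coding = utf8 ;       -- in a concrete module
+</PRE>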
+<A NAME="toc31"></A>
+<H2>Types and expressions</H2>
+<A NAME="toc32"></A>
+<H3>Overview of expression forms</H3>
+<P>
+<a name="expressions"></a>
+</P>
+<P>
+Like many dependently typed languages, GF makes no syntactic distinction
+between expressions and types. An illegal use of a type as an expression or
+vice versa comes out as a type error. Whether a variable, for instance,
+stands for a type or an expression value, can only be resolved from its
+context of use.
+</P>
+<P>
+One practical consequence of the common syntax is that global and local definitions
+(<CODE>oper</CODE> judgements and <CODE>let</CODE> expressions, respectively) work in the same way
+for types and expressions. Thus it is possible to abbreviate a type
+occurring in a type expression:
+</P>
+<PRE>
+ let A = {s : Str ; b : Bool} in A -&gt; A -&gt; A
+</PRE>
+<P>
+Type and other expressions have a system of <B>precedences</B>. The following table
+summarizes all expression forms, from the highest to the lowest precedence.
+Some expressions are moreover left- or right-associative.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>prec</TH>
+<TH>expression example</TH>
+<TH COLSPAN="2">explanation</TH>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>c</CODE></TD>
+<TD>constant or variable</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>Type</CODE></TD>
+<TD>the type of types</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>PType</CODE></TD>
+<TD>the type of parameter types</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>Str</CODE></TD>
+<TD>the type of strings/token lists</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>"foo"</CODE></TD>
+<TD>string literal</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>123</CODE></TD>
+<TD>integer literal</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>0.123</CODE></TD>
+<TD>floating point literal</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>?</CODE></TD>
+<TD>metavariable</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>[]</CODE></TD>
+<TD>empty token list</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>[C a b]</CODE></TD>
+<TD>list category</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>["foo bar"]</CODE></TD>
+<TD>token list</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>{s : Str ; n : Num}</CODE></TD>
+<TD>record type</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>{s = "foo" ; n = Sg}</CODE></TD>
+<TD>record</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>&lt;Sg,Fem,Gen&gt;</CODE></TD>
+<TD>tuple</TD>
+</TR>
+<TR>
+<TD>7</TD>
+<TD><CODE>&lt;n : Num&gt;</CODE></TD>
+<TD>type-annotated expression</TD>
+</TR>
+<TR>
+<TD>6 left</TD>
+<TD><CODE>t.r</CODE></TD>
+<TD>projection or qualification</TD>
+</TR>
+<TR>
+<TD>5 left</TD>
+<TD><CODE>f a</CODE></TD>
+<TD>function application</TD>
+</TR>
+<TR>
+<TD>5</TD>
+<TD><CODE>table {Sg =&gt; [] ; _ =&gt; "xs"}</CODE></TD>
+<TD>table</TD>
+</TR>
+<TR>
+<TD>5</TD>
+<TD><CODE>table P [a ; b ; c]</CODE></TD>
+<TD>course-of-values table</TD>
+</TR>
+<TR>
+<TD>5</TD>
+<TD><CODE>case n of {Sg =&gt; [] ; _ =&gt; "xs"}</CODE></TD>
+<TD>case expression</TD>
+</TR>
+<TR>
+<TD>5</TD>
+<TD><CODE>variants {"color" ; "colour"}</CODE></TD>
+<TD>free variation</TD>
+</TR>
+<TR>
+<TD>5</TD>
+<TD><CODE>pre {"a" ; "an"/vowel}</CODE></TD>
+<TD>prefix-dependent choice</TD>
+</TR>
+<TR>
+<TD>4 left</TD>
+<TD><CODE>t ! v</CODE></TD>
+<TD>table selection</TD>
+</TR>
+<TR>
+<TD>4 left</TD>
+<TD><CODE>A * B</CODE></TD>
+<TD>tuple type</TD>
+</TR>
+<TR>
+<TD>4 left</TD>
+<TD><CODE>R ** {b : Bool}</CODE></TD>
+<TD>record (type) extension</TD>
+</TR>
+<TR>
+<TD>3 left</TD>
+<TD><CODE>t + s</CODE></TD>
+<TD>token gluing</TD>
+</TR>
+<TR>
+<TD>2 left</TD>
+<TD><CODE>t ++ s</CODE></TD>
+<TD>token list concatenation</TD>
+</TR>
+<TR>
+<TD>1 right</TD>
+<TD><CODE>\x,y -&gt; t</CODE></TD>
+<TD>function abstraction ("lambda")</TD>
+</TR>
+<TR>
+<TD>1 right</TD>
+<TD><CODE>\\x,y =&gt; t</CODE></TD>
+<TD>table abstraction</TD>
+</TR>
+<TR>
+<TD>1 right</TD>
+<TD><CODE>(x : A) -&gt; B</CODE></TD>
+<TD>dependent function type</TD>
+</TR>
+<TR>
+<TD>1 right</TD>
+<TD><CODE>A -&gt; B</CODE></TD>
+<TD>function type</TD>
+</TR>
+<TR>
+<TD>1 right</TD>
+<TD><CODE>P =&gt; T</CODE></TD>
+<TD>table type</TD>
+</TR>
+<TR>
+<TD>1 right</TD>
+<TD><CODE>let x = v in t</CODE></TD>
+<TD>local definition</TD>
+</TR>
+<TR>
+<TD>1</TD>
+<TD><CODE>t where {x = v}</CODE></TD>
+<TD>local definition</TD>
+</TR>
+<TR>
+<TD>1</TD>
+<TD><CODE>in M.C "foo"</CODE></TD>
+<TD>rule by example</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+Any expression in parentheses (<CODE>(</CODE><I>exp</I><CODE>)</CODE>) is in the highest
+precedence class.
+</P>
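+<P>
+As a reading aid, the following sketch shows how the precedences in the table
+resolve an expression written without parentheses. The names <CODE>f</CODE>,
+<CODE>x</CODE>, and <CODE>n</CODE> are hypothetical, and the second line is just the
+bracketing implied by the table, not compiler output:
+</P>
+<PRE>
+  f x.s ! n + "s" ++ "foo"
+  -- reads as: (((f (x.s)) ! n) + "s") ++ "foo"
+  -- projection (6) binds tightest, then application (5), selection (4),
+  -- gluing (3), and finally concatenation (2)
+</PRE>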
+<A NAME="toc33"></A>
+<H3>The functional fragment: expressions in abstract syntax</H3>
+<P>
+<a name="functiontype"></a>
+</P>
+<P>
+The expression syntax is the same in abstract and concrete syntax, although
+only a part of the syntax is actually usable in well-typed expressions in
+abstract syntax. An abstract syntax is essentially used for defining a set
+of types and a set of functions between those types. Therefore it needs
+essentially the <B>functional fragment</B>
+of the syntax. This fragment comprises two kinds of types:
+</P>
+<UL>
+<LI><B>basic types</B>, of form <I>C a1...an</I> where
+ <UL>
+ <LI><CODE>cat</CODE> <I>C</I> (<i>x</i><sub>1</sub> : <i>A</i><sub>1</sub>)...(<i>x</i><sub>n</sub> : <i>A</i><sub>n</sub>), including the predefined
+ categories <CODE>Int</CODE>, <CODE>Float</CODE>, and <CODE>String</CODE> explained <a href="#predefabs">here</a>
+ <LI><i>a</i><sub>1</sub> : <i>A</i><sub>1</sub>,...,<i>a</i><sub>n</sub> : <i>A</i><sub>n</sub>{<i>x</i><sub>1</sub> = <i>a</i><sub>1</sub>,...,<i>x</i><sub>n-1</sub>=<i>a</i><sub>n-1</sub>}
+ </UL>
+</UL>
+
+<UL>
+<LI><B>function types</B>, of form (<I>x</I> : <I>A</I>) <CODE>-&gt;</CODE> <I>B</I>, where
+ <UL>
+ <LI><I>A</I> is a type
+ <LI><I>B</I> is a type possibly depending on <I>x</I> : <I>A</I>
+ </UL>
+</UL>
+
+<P>
+When defining basic types, we used the notation
+<I>t</I>{<i>x</i><sub>1</sub> = <i>t</i><sub>1</sub>,...,<i>x</i><sub>n</sub>=<i>t</i><sub>n</sub>}
+for the <B>substitution</B> of values to variables. This is a metalevel notation,
+which denotes a term that is formed by replacing the free occurrences of
+each variable <i>x</i><sub>i</sub> by <i>t</i><sub>i</sub>.
+</P>
+<P>
+These types have six kinds of expressions:
+</P>
+<UL>
+<LI><B>constants</B>, <I>f</I> : <I>A</I> where
+ <UL>
+ <LI><CODE>fun</CODE> <I>f</I> : <I>A</I>
+ </UL>
+</UL>
+
+<UL>
+<LI><B>literals</B> for integers, floats, and strings (defined <a href="#predefabs">here</a>)
+</UL>
+
+<UL>
+<LI><B>variables</B>, <I>x</I> : <I>A</I> where
+ <UL>
+ <LI><I>x</I> has been introduced by a binding
+ </UL>
+</UL>
+
+<UL>
+<LI><B>applications</B>, <I>f a</I> : <I>B</I>{<I>x</I>=<I>a</I>}, where
+ <UL>
+ <LI><I>f</I> : (<I>x</I> : <I>A</I>) <CODE>-&gt;</CODE> <I>B</I>
+ <LI><I>a</I> : <I>A</I>
+ </UL>
+</UL>
+
+<UL>
+<LI><B>abstractions</B>, <CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>b</I> : (<I>x</I> : <I>A</I>) <CODE>-&gt;</CODE> <I>B</I>, where
+ <UL>
+ <LI><I>b</I> : <I>B</I> possibly depending on <I>x</I> : <I>A</I>
+ </UL>
+</UL>
+
+<UL>
+<LI><B>metavariables</B>, <CODE>?</CODE>, as introduced in intermediate phases of
+ incremental type checking; metavariables are not permitted
+ in GF source code
+</UL>
+
+<P>
+<a name="variablebinding"></a>
+</P>
+<P>
+The notion of <B>binding</B> is defined for occurrences of variables in
+subexpressions as follows:
+</P>
+<UL>
+<LI>in (<I>x</I> : <I>A</I>) <CODE>-&gt;</CODE> <I>B</I>, <I>x</I> is bound in <I>B</I>
+<LI>in <CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>b</I>, <I>x</I> is bound in <I>b</I>
+<LI>in <CODE>def</CODE> <I>f</I> <i>p</i><sub>1</sub> ... <i>p</i><sub>n</sub> = <I>t</I>, any pattern variable introduced in
+ any <i>p</i><sub>i</sub> is bound in <I>t</I> (as defined <a href="#patternmatching">here</a>)
+</UL>
+
+<P>
+As syntactic sugar, function types have sharing of types and
+suppression of variables, in the same way as contexts
+(defined <a href="#contexts">here</a>):
+</P>
+<UL>
+<LI>variables can share a type,
+<center>
+<CODE>(</CODE> <I>x,y</I> <CODE>:</CODE> <I>A</I> <CODE>)</CODE> <CODE>-&gt;</CODE> <I>B</I> ===
+ <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>A</I> <CODE>) -&gt; (</CODE> <I>y</I> <CODE>:</CODE> <I>A</I> <CODE>) -&gt;</CODE> <I>B</I>
+</center>
+<LI>a <B>wildcard</B> can be used for a variable not occurring later in the type,
+<center>
+<CODE>(</CODE> <CODE>_</CODE> <CODE>:</CODE> <I>A</I> <CODE>) -&gt;</CODE> <I>B</I> ===
+ <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>A</I> <CODE>) -&gt;</CODE> <I>B</I>
+</center>
+<LI>if the variable does not occur later, it can be omitted altogether, and
+ parentheses are not used,
+<center>
+ <I>A</I> <CODE>-&gt;</CODE> <I>B</I> === <CODE>(</CODE> <I>_</I> <CODE>:</CODE> <I>A</I> <CODE>) -&gt;</CODE> <I>B</I>
+</center>
+</UL>
+
+<P>
+There is analogous syntactic sugar for constant functions,
+<center>
+<CODE>\</CODE><I>_</I> <CODE>-&gt;</CODE> <I>t</I> === <CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>t</I>
+</center>
+where <I>x</I> does not occur in <I>t</I>, and for multiple lambda abstractions:
+<center>
+<CODE>\</CODE><I>p,q</I> <CODE>-&gt;</CODE> <I>t</I> === <CODE>\</CODE><I>p</I> <CODE>-&gt;</CODE> <CODE>\</CODE><I>q</I> <CODE>-&gt;</CODE> <I>t</I>
+</center>
+where <I>p</I> and <I>q</I> are variables or wild cards (<CODE>_</CODE>).
+</P>
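+<P>
+The sugar above can be illustrated with a small abstract syntax fragment.
+This is only a sketch: the categories <CODE>A</CODE> and <CODE>B</CODE> and the
+functions <CODE>f</CODE> and <CODE>g</CODE> are hypothetical names chosen for the example.
+</P>
+<PRE>
+  cat A ; B ;
+  fun
+    f : (x,y : A) -&gt; B ;   -- sugar for (x : A) -&gt; (y : A) -&gt; B
+    g : (_ : A) -&gt; B ;     -- the same type as A -&gt; B
+
+  -- in the same way, for abstractions:
+  --   \x,_ -&gt; x   is sugar for   \x -&gt; \_ -&gt; x
+</PRE>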
+<A NAME="toc34"></A>
+<H3>Conversions</H3>
+<P>
+<a name="conversions"></a>
+</P>
+<P>
+Among expressions, there is a relation of <B>definitional equality</B> defined
+by four <B>conversion rules</B>:
+</P>
+<UL>
+<LI><B>alpha conversion</B>:
+ <CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>b</I> = <CODE>\</CODE><I>y</I> <CODE>-&gt;</CODE> <I>b</I>{<I>x</I>=<I>y</I>}
+</UL>
+
+<UL>
+<LI><B>beta conversion</B>: (<CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>b</I>) <I>a</I> = <I>b</I>{<I>x</I>=<I>a</I>}
+</UL>
+
+<UL>
+<LI><B>delta conversion</B>: <I>f</I> <i>a</i><sub>1</sub> ... <i>a</i><sub>n</sub> = <I>tg</I>, if
+ <UL>
+ <LI>there is a definition <CODE>def</CODE> <I>f</I> <i>p</i><sub>1</sub> ... <i>p</i><sub>n</sub> = <I>t</I>
+ <LI>this definition is the first for <I>f</I> that matches the sequence
+ <i>a</i><sub>1</sub> ... <i>a</i><sub>n</sub>, with the substitution <I>g</I>
+ </UL>
+</UL>
+
+<UL>
+<LI><B>eta conversion</B>: <I>c</I> = <CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>c x</I>,
+ if <I>c</I> : (<I>x</I> : <I>A</I>) <CODE>-&gt;</CODE> <I>B</I>
+</UL>
+
+<P>
+Pattern matching substitution used in delta conversion
+is defined <a href="#patternmatching">here</a>.
+</P>
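+<P>
+The four rules can be traced on a small example. The following is a sketch
+under stated assumptions: <CODE>Nat</CODE>, <CODE>Zero</CODE>, <CODE>Succ</CODE>, and
+<CODE>plus</CODE> are hypothetical names, and <CODE>Zero</CODE> and <CODE>Succ</CODE> are
+assumed to be usable as constructors in the patterns of the <CODE>def</CODE> equations.
+</P>
+<PRE>
+  cat Nat ;
+  fun Zero : Nat ;            -- assumed to be a constructor
+      Succ : Nat -&gt; Nat ;     -- assumed to be a constructor
+      plus : Nat -&gt; Nat -&gt; Nat ;
+  def plus Zero y = y ;
+      plus (Succ x) y = Succ (plus x y) ;
+
+  -- delta: plus (Succ Zero) Zero
+  --          = Succ (plus Zero Zero)   -- second equation, g = {x = Zero, y = Zero}
+  --          = Succ Zero               -- first equation,  g = {y = Zero}
+  -- beta:  (\x -&gt; Succ x) Zero = Succ Zero
+  -- alpha: \x -&gt; Succ x = \y -&gt; Succ y
+  -- eta:   Succ = \x -&gt; Succ x
+</PRE>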
+<P>
+An expression is in <B>beta-eta-normal form</B> if
+</P>
+<UL>
+<LI>it has no subexpressions to which beta conversion applies (beta normality)
+<LI>each constant or variable whose type is a function type must be
+ <B>eta-expanded</B>, i.e. made into an abstraction equal to it by eta conversion
+ (eta normality)
+</UL>
+
+<P>
+Notice that the iteration of eta expansion would lead to an expression not
+in beta-normal form.
+</P>
+<A NAME="toc35"></A>
+<H3>Syntax trees</H3>
+<P>
+<a name="syntaxtrees"></a>
+</P>
+<P>
+The <B>syntax trees</B> defined by an abstract syntax are well-typed
+expressions of basic types in beta-eta normal form.
+Linearization defined in concrete
+syntax applies to all and only these expressions.
+</P>
+<P>
+There is also a direct definition of syntax trees, which does not
+refer to beta and eta conversions: keeping in mind that a type always has
+the form
+<center>
+(<i>x</i><sub>1</sub> : <i>A</i><sub>1</sub>) <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> (<i>x</i><sub>n</sub> : <i>A</i><sub>n</sub>) <CODE>-&gt;</CODE> <I>B</I>
+</center>
+where the <i>A</i><sub>i</sub> are types and <I>B</I> is a basic type, a syntax tree is an expression
+<center>
+<I>b</I> <i>t</i><sub>1</sub> ... <i>t</i><sub>n</sub> : <I>B'</I>
+</center>
+where
+</P>
+<UL>
+<LI><I>B'</I> is the basic type <I>B</I>{<i>x</i><sub>1</sub> = <i>t</i><sub>1</sub>,...,<i>x</i><sub>n</sub> = <i>t</i><sub>n</sub>}
+<LI><CODE>fun</CODE> <I>b</I> : (<i>x</i><sub>1</sub> : <i>A</i><sub>1</sub>) <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> (<i>x</i><sub>n</sub> : <i>A</i><sub>n</sub>) <CODE>-&gt;</CODE> <I>B</I>
+<LI>each <i>t</i><sub>i</sub> has the form <CODE>\</CODE><i>z</i><sub>1</sub>,...,<i>z</i><sub>m</sub> <CODE>-&gt;</CODE> <I>c</I> where <i>A</i><sub>i</sub> is
+<center>
+(<i>y</i><sub>1</sub> : <i>B</i><sub>1</sub>) <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> (<i>y</i><sub>m</sub> : <i>B</i><sub>m</sub>) <CODE>-&gt;</CODE> <I>B</I>
+</center>
+</UL>
+
+<A NAME="toc36"></A>
+<H3>Predefined types in abstract syntax</H3>
+<P>
+<a name="predefabs"></a>
+</P>
+<P>
+GF provides three predefined categories for abstract syntax, with predefined
+expressions:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>category</TH>
+<TH COLSPAN="2">expressions</TH>
+</TR>
+<TR>
+<TD ALIGN="center"><CODE>Int</CODE></TD>
+<TD>integer literals, e.g. <CODE>123</CODE></TD>
+</TR>
+<TR>
+<TD ALIGN="center"><CODE>Float</CODE></TD>
+<TD>floating point literals, e.g. <CODE>12.34</CODE></TD>
+</TR>
+<TR>
+<TD ALIGN="center"><CODE>String</CODE></TD>
+<TD>string literals, e.g. <CODE>"foo"</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+These categories take no arguments, and they can be used as basic
+types in the same way as if they were introduced in <CODE>cat</CODE> judgements.
+However, it is not legal to define <CODE>fun</CODE> functions that have any
+of these types as value type: their only well-typed expressions are
+literals as defined in the above table.
+</P>
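+<P>
+A short sketch of what this restriction means in practice; the category
+<CODE>Item</CODE> and the functions <CODE>numbered</CODE>, <CODE>named</CODE>, and
+<CODE>zero</CODE> are hypothetical names used only for illustration.
+</P>
+<PRE>
+  cat Item ;
+  fun
+    numbered : Int -&gt; Item ;      -- legal: Int is used as an argument type
+    named    : String -&gt; Item ;   -- legal: String is used as an argument type
+    -- zero  : Int ;              -- not legal: Int would be the value type
+
+  -- numbered 123  and  named "foo"  are then expressions of type Item
+</PRE>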
+<A NAME="toc37"></A>
+<H3>Overview of expressions in concrete syntax</H3>
+<P>
+<a name="cnctypes"></a>
+</P>
+<P>
+Concrete syntax is about defining mappings from abstract syntax trees
+to <B>concrete syntax objects</B>. These objects comprise
+</P>
+<UL>
+<LI>records
+<LI>tables
+<LI>strings
+<LI>parameter values
+</UL>
+
+<P>
+Thus functions are not concrete syntax objects; however, the
+mappings themselves are expressed as functions, and the source code
+of a concrete syntax can use functions under the condition that
+they can be eliminated from the final compiled grammar (which they
+can; this is one of the fundamental properties of compilation, as
+explained in more detail in the <I>JFP</I> article).
+</P>
+<P>
+Concrete syntax thus has the same function types and expression forms as
+abstract syntax, specified <a href="#functiontype">here</a>. The basic types defined
+by categories (<CODE>cat</CODE> judgements) are available via grammar reuse
+explained <a href="#reuse">here</a>; this also comprises the
+predefined categories <CODE>Float</CODE> and <CODE>String</CODE>.
+</P>
+<A NAME="toc38"></A>
+<H3>Values, canonical forms, and run-time variables</H3>
+<P>
+In abstract syntax, the conversion rules given <a href="#conversions">here</a>
+define a computational relation
+among expressions, but there is no separate notion of a <B>value</B> of
+computation: the value (the end point) of a computation chain is
+simply an expression to which no more conversions apply. In general,
+we are interested in expressions that satisfy the conditions of being
+syntax trees (as defined <a href="#syntaxtrees">here</a>), but there can be many computationally
+equivalent syntax trees which nonetheless are distinct syntax trees
+and hence have different linearizations. The main use of computation
+in abstract syntax is to compare types in dependent type checking.
+</P>
+<P>
+In concrete syntax, the notion of values is central. At run time,
+we want to compute the values of linearizations; at compile time, we want
+to perform <B>partial evaluation</B>, which computes expressions as far as
+possible.
+To specify what happens
+in computation we therefore have to distinguish between <B>canonical forms</B>
+and other forms of expressions. The canonical forms are defined separately
+for each form of type, whereas the other forms may usually produce expressions
+of any type.
+</P>
+<P>
+<a name="linexpansion"></a>
+<a name="runtimevariables"></a>
+</P>
+<P>
+What is done at compile time is the elimination of any noncanonical forms,
+except for those depending on <B>run-time variables</B>. Run-time variables are
+the same as the <B>argument variables</B> of linearization rules, i.e. the
+variables <i>x</i><sub>1</sub>,...,<i>x</i><sub>n</sub> in
+<center>
+<CODE>lin</CODE> <I>f</I> <CODE>= \</CODE> <i>x</i><sub>1</sub>,...,<i>x</i><sub>n</sub> <CODE>-&gt;</CODE> <I>t</I>
+</center>
+where
+<center>
+<CODE>fun</CODE> <I>f</I> <CODE>:</CODE>
+(<i>x</i><sub>1</sub> : <i>A</i><sub>1</sub>) <CODE>-&gt;</CODE> ... <CODE>-&gt;</CODE> (<i>x</i><sub>n</sub> : <i>A</i><sub>n</sub>) <CODE>-&gt;</CODE> <I>B</I>
+</center>
+Notice that this definition refers to the <B>eta-expanded</B> linearization term,
+which has one abstracted variable for each argument type of <I>f</I>. These variables
+are not necessarily explicit in GF source code, but introduced by the compiler.
+</P>
+<P>
+Since certain expression forms should be eliminated in compilation but
+cannot be eliminated if run-time variables appear in them, errors can
+appear late in compilation. This is an issue with the following
+expression forms:
+</P>
+<UL>
+<LI>gluing (<CODE>s + t</CODE>), defined <a href="#gluing">here</a>
+<LI>pattern matching on strings, defined <a href="#patternmatching">here</a>
+<LI>predefined string operations, defined <a href="#predefcnc">here</a> (those taking
+ <CODE>Str</CODE> arguments)
+</UL>
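+<P>
+The difference between compile-time and run-time operands of gluing can be
+sketched as follows. All names here (<CODE>Number</CODE>, <CODE>N</CODE>, <CODE>regN</CODE>,
+<CODE>Car</CODE>, <CODE>Compound</CODE>) are hypothetical, and the abstract syntax is
+assumed to declare <CODE>fun Car : N</CODE> and <CODE>fun Compound : N -&gt; N -&gt; N</CODE>.
+</P>
+<PRE>
+  param Number = Sg | Pl ;
+  lincat N = {s : Number =&gt; Str} ;
+
+  oper regN : Str -&gt; {s : Number =&gt; Str} = \w -&gt;
+    {s = table {Sg =&gt; w ; Pl =&gt; w + "s"}} ;
+
+  -- unproblematic: after regN is eliminated, "car" + "s" is glued to "cars"
+  lin Car = regN "car" ;
+
+  -- problematic: the operands of + depend on the run-time variables x and y
+  -- lin Compound x y = {s = \\n =&gt; x.s ! Sg + y.s ! n} ;
+</PRE>
+<P>
+In the first rule the argument of <CODE>regN</CODE> is a string literal, so the gluing
+can be computed away; in the commented-out rule it cannot.
+</P>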
+
+<A NAME="toc39"></A>
+<H3>Token lists, tokens, and strings</H3>
+<P>
+<a name="strtype"></a>
+</P>
+<P>
+The most prominent basic type is <CODE>Str</CODE>, the type of <B>token lists</B>.
+This type is often sloppily referred to as the type of <B>strings</B>;
+but it should be kept in mind that the objects of <CODE>Str</CODE> are
+<I>lists</I> of strings rather than single strings.
+</P>
+<P>
+Expressions of type <CODE>Str</CODE> have the following canonical forms:
+</P>
+<UL>
+<LI><B>tokens</B>, i.e. <B>string literals</B>, in double quotes, e.g. <CODE>"foo"</CODE>
+<LI><B>the empty token list</B>, <CODE>[]</CODE>
+<LI><B>concatenation</B>, <I>s</I> <CODE>++</CODE> <I>t</I>, where <I>s,t</I> : <CODE>Str</CODE>
+<LI><B>prefix-dependent choice</B>,
+ <CODE>pre {</CODE> <I>s</I> ; <i>s</i><sub>1</sub> <CODE>/</CODE> <i>p</i><sub>1</sub> ; ... ; <i>s</i><sub>n</sub> <CODE>/</CODE> <i>p</i><sub>n</sub>}, where
+ <UL>
+ <LI><I>s</I>, <i>s</i><sub>1</sub>,...,<i>s</i><sub>n</sub>, <i>p</i><sub>1</sub>,...,<i>p</i><sub>n</sub> : <CODE>Str</CODE>
+ </UL>
+</UL>
+
+<P>
+For convenience, the notation is overloaded so that tokens are identified
+with singleton token lists, and there is no separate type of tokens
+(this is a change from the <I>JFP</I> article).
+The notion of a token
+is still important for compilation: all tokens introduced by
+the grammar must be known at compile time. This, in turn, is
+required by the parsing algorithms used for parsing with GF grammars.
+</P>
+<P>
+In addition to string literals, tokens can be formed by a specific
+non-canonical operator:
+</P>
+<UL>
+<LI><B>gluing</B>, <I>s</I> <CODE>+</CODE> <I>t</I>, where <I>s,t</I> : <CODE>Str</CODE>
+</UL>
+
+<P>
+<a name="gluing"></a>
+</P>
+<P>
+Being noncanonical, gluing is equipped with a computation rule:
+string literals are glued by forming a new string literal, and
+empty token lists can be ignored:
+</P>
+<UL>
+<LI><CODE>"foo" + "bar"</CODE> ==> <CODE>"foobar"</CODE>
+<LI><I>t</I> <CODE>+ []</CODE> ==> <I>t</I>
+<LI><CODE>[] +</CODE> <I>t</I> ==> <I>t</I>
+</UL>
+
+<P>
+Since tokens must be known at compile time,
+the operands of gluing may not depend on run-time variables,
+as defined <a href="#runtimevariables">here</a>.
+</P>
+<P>
+As syntactic sugar, token lists can be given as bracketed string literals, where
+spaces separate tokens:
+</P>
+<UL>
+<LI><B>token lists</B>, <CODE>["one two three"]</CODE> === <CODE>"one" ++ "two" ++ "three"</CODE>
+</UL>
+
+<P>
+Notice that there are no empty tokens, but the expression <CODE>[]</CODE>
+can be used in a context requiring a token, in particular in gluing expressions.
+Since <CODE>[]</CODE> denotes an empty token list, the following computation laws
+are valid:
+</P>
+<UL>
+<LI><I>t</I> <CODE>++ []</CODE> ==> <I>t</I>
+<LI><CODE>[] ++</CODE> <I>t</I> ==> <I>t</I>
+</UL>
+
+<P>
+Moreover, concatenation and gluing are associative:
+</P>
+<UL>
+<LI>s <CODE>+</CODE> (t <CODE>+</CODE> u) ==> s <CODE>+</CODE> t <CODE>+</CODE> u
+<LI>s <CODE>++</CODE> (t <CODE>++</CODE> u) ==> s <CODE>++</CODE> t <CODE>++</CODE> u
+</UL>
+
+<P>
+For the programmer, associativity and the empty token laws mean
+that the compiler can use them to simplify string expressions.
+It also means that these laws are respected in pattern matching
+on strings.
+</P>
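+<P>
+As a worked illustration, the laws given in this section simplify the
+following expression step by step. The steps are derived by hand from the
+rules above and are not compiler output:
+</P>
+<PRE>
+  "foo" + "bar" ++ [] ++ ["one two"]
+    ==&gt; "foobar" ++ [] ++ ["one two"]    -- gluing of two string literals
+    ==&gt; "foobar" ++ ["one two"]          -- empty token list law
+    ==&gt; "foobar" ++ "one" ++ "two"       -- bracketed literal expanded
+</PRE>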
+<P>
+A prime example of the prefix-dependent choice operation is the following
+approximate expression for the English indefinite article:
+</P>
+<PRE>
+ pre {"a" ; "an" / variants {"a" ; "e" ; "i" ; "o"}}
+</PRE>
+<P>
+This expression can be computed in the context of a subsequent token:
+</P>
+<UL>
+<LI><CODE>pre {</CODE> <I>s</I> ; <i>s</i><sub>1</sub> <CODE>/</CODE> <i>p</i><sub>1</sub> ; ... ; <i>s</i><sub>n</sub> <CODE>/</CODE> <i>p</i><sub>n</sub><CODE>} ++</CODE> <I>t</I>
+ ==>
+ <UL>
+ <LI><i>s</i><sub>i</sub> for the first <I>i</I> such that the prefix <i>p</i><sub>i</sub>
+ matches <I>t</I>, if it exists
+ <LI><I>s</I> otherwise
+ </UL>
+</UL>
+
+<P>
+The <B>matching prefix</B> is defined by comparing the string with the prefix of
+the token. If the prefix is a variant list of strings, then it matches
+the token if any of the strings in the list matches it.
+</P>
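+<P>
+Applied to the indefinite article expression above, the computation rule
+gives the following results. The tokens <CODE>"apple"</CODE>, <CODE>"car"</CODE>, and
+<CODE>"umbrella"</CODE> are chosen only for illustration, and <CODE>art</CODE> is a
+hypothetical abbreviation for the <CODE>pre</CODE> expression.
+</P>
+<PRE>
+  -- art = pre {"a" ; "an" / variants {"a" ; "e" ; "i" ; "o"}}
+
+  art ++ "apple"     ==&gt;  "an" ++ "apple"     -- "a" is a prefix of "apple"
+  art ++ "car"       ==&gt;  "a" ++ "car"        -- no listed prefix matches
+  art ++ "umbrella"  ==&gt;  "a" ++ "umbrella"   -- "u" is not in the variant list
+</PRE>
+<P>
+The last line shows why the expression is only an approximation of English usage.
+</P>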
+<P>
+The computation rule can sometimes be applied at compile time, but in general,
+prefix-dependent choices need to be passed to the run-time grammar, because
+they are not given a subsequent token to compare with, or because the
+subsequent token depends on a run-time variable.
+</P>
+<P>
+The prefix-dependent choice expression itself may not depend on run-time
+variables.
+</P>
+<P>
+<I>In GF prior to 3.0, a specific type</I> <CODE>Strs</CODE>
+<I>is used for defining prefixes,</I>
+<I>instead of just</I> <CODE>variants</CODE> <I>of</I> <CODE>Str</CODE>.
+</P>
+<A NAME="toc40"></A>
+<H3>Records and record types</H3>
+<P>
+A <B>record</B> is a collection of objects of possibly different types,
+accessible by <B>projections</B> from the record with <B>labels</B> pointing
+to these objects. A record is also itself an object, whose type is
+a <B>record type</B>. Record types have the form
+<center>
+ <CODE>{</CODE> <i>r</i><sub>1</sub> : <i>A</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <i>r</i><sub>n</sub> : <i>A</i><sub>n</sub> <CODE>}</CODE>
+</center>
+where <I>n</I> &gt;= 0, each <i>A</i><sub>i</sub> is a type, and the labels <i>r</i><sub>i</sub> are
+distinct. A record of this type has the form
+<center>
+ <CODE>{</CODE> <i>r</i><sub>1</sub> = <i>a</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <i>r</i><sub>n</sub> = <i>a</i><sub>n</sub> <CODE>}</CODE>
+</center>
+where each <i>a</i><sub>i</sub> : <i>A</i><sub>i</sub>. A limiting case is the <B>empty record type</B>
+<CODE>{}</CODE>, which has the object <CODE>{}</CODE>, the <B>empty record</B>.
+</P>
+<P>
+The <B>fields</B> of a record type are its parts of the form <I>r</I> : <I>A</I>,
+also called <B>typings</B>. The <B>fields</B> of a record are of the form
+<I>r</I> = <I>a</I>, also called <B>value assignments</B>. Value assignments
+may optionally indicate the type, as in <I>r</I> : <I>A</I> = <I>a</I>.
+</P>
+<P>
+The order of fields in record types and records is insignificant: two record
+types (or records) are equal if they have the same fields, in any order, and a
+record is an object of a record type, if it has type-correct value assignments
+for all fields of the record type.
+The latter definition implies the even stronger
+principle of <B>record subtyping</B>: a record can have any type that has some
+subset of its fields. This principle is explained further
+<a href="#subtyping">here</a>.
+</P>
+<P>
+All fields in a record must have distinct labels. Thus it is not possible
+e.g. to "redefine" a field "later" in a record.
+</P>
+<P>
+Lexically, labels are identifiers (defined <a href="#identifiers">here</a>).
+This is with the exception
+of the labels selecting bound variables in the linearization of higher-order
+abstract syntax, which have the form <CODE>$</CODE><I>i</I> for an integer <I>i</I>,
+as specified <a href="#HOAS">here</a>.
+In source code, these labels should not appear in record fields,
+but only in selections.
+</P>
+<P>
+Labels occur only in syntactic positions where they cannot be confused with
+constants or variables. Therefore it is safe to write, as in <CODE>Prelude</CODE>,
+</P>
+<PRE>
+ ss : Str -&gt; {s : Str} = \s -&gt; {s = s} ;
+</PRE>
+<P>
+A <B>projection</B> is an expression of the form
+<center>
+ <I>t</I>.<I>r</I>
+</center>
+where <I>t</I> must be a record and <I>r</I> must be a label defined in it.
+The type of the projection is the type of that field.
+The computation rule for projection returns the value assigned to that field:
+<center>
+<CODE>{</CODE> ... <CODE>;</CODE> <I>r</I> = <I>a</I> <CODE>;</CODE> ... <CODE>}.</CODE><I>r</I> ==> <I>a</I>
+</center>
+Notice that the dot notation <I>t</I>.<I>r</I> is also used for qualified names
+as specified <a href="#qualifiednames">here</a>.
+This ambiguity follows tradition and convenience. It is
+resolved by the following rules (before type checking):
+</P>
+<OL>
+<LI>if <I>t</I> is a bound variable or a constant in scope,
+ <I>t</I>.<I>r</I> is type-checked as a projection
+<LI>otherwise, <I>t</I>.<I>r</I> is type-checked as a qualified name
+</OL>
+
+<P>
+As syntactic sugar, types and values can be shared:
+</P>
+<UL>
+<LI><CODE>{</CODE> ... <CODE>;</CODE> <I>r,s</I> : <I>A</I> <CODE>;</CODE> ... <CODE>}</CODE> ===
+ <CODE>{</CODE> ... <CODE>;</CODE> <I>r</I> : <I>A</I> <CODE>;</CODE> <I>s</I> : <I>A</I> <CODE>;</CODE> ... <CODE>}</CODE>
+<LI><CODE>{</CODE> ... <CODE>;</CODE> <I>r,s</I> = <I>a</I> <CODE>;</CODE> ... <CODE>}</CODE> ===
+ <CODE>{</CODE> ... <CODE>;</CODE> <I>r</I> = <I>a</I> <CODE>;</CODE> <I>s</I> = <I>a</I> <CODE>;</CODE> ... <CODE>}</CODE>
+</UL>
+
+<P>
+Further syntactic sugar is provided by <B>tuple types</B> and <B>tuples</B>, which are translated
+by endowing their unlabelled fields with the labels <CODE>p1</CODE>, <CODE>p2</CODE>,... in the
+order of appearance of the fields:
+</P>
+<UL>
+<LI><i>A</i><sub>1</sub> <CODE>*</CODE> ... <CODE>*</CODE> <i>A</i><sub>n</sub> ===
+ <CODE>{</CODE> <CODE>p1</CODE> : <i>A</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <CODE>pn</CODE> : <i>A</i><sub>n</sub> <CODE>}</CODE>
+<LI><CODE>&lt;</CODE><i>a</i><sub>1</sub> <CODE>,</CODE> ... <CODE>,</CODE> <i>a</i><sub>n</sub> <CODE>&gt;</CODE> ===
+ <CODE>{</CODE> <CODE>p1</CODE> = <i>a</i><sub>1</sub><CODE>;</CODE> ... <CODE>;</CODE> <CODE>pn</CODE> = <i>a</i><sub>n</sub> <CODE>}</CODE>
+</UL>
+
+<P>
+A <B>record extension</B> is formed by adding fields to a record or a record type.
+The general syntax involves two expressions,
+<center>
+ <I>R</I> <CODE>**</CODE> <I>S</I>
+</center>
+The result is a record type or a record with a union of the fields of <I>R</I> and
+<I>S</I>. It is therefore well-formed if
+</P>
+<UL>
+<LI>both <I>R</I> and <I>S</I> are either records or record types
+<LI>the labels in <I>R</I> and <I>S</I> are distinct
+</UL>
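+<P>
+The record constructs of this section can be summarized in a few concrete
+instances. This is a sketch: <CODE>Number</CODE> is assumed to be a parameter type
+with the values <CODE>Sg</CODE> and <CODE>Pl</CODE>, and the right-hand sides are derived
+by hand from the rules above.
+</P>
+<PRE>
+  {s = "foo" ; n = Sg}.s       ==&gt;  "foo"                         -- projection
+  {s,t : Str}                  ===  {s : Str ; t : Str}           -- shared type
+  {s,t = "foo"}                ===  {s = "foo" ; t = "foo"}       -- shared value
+  Number * Str                 ===  {p1 : Number ; p2 : Str}      -- tuple type
+  &lt;Sg, "foo"&gt;                  ===  {p1 = Sg ; p2 = "foo"}        -- tuple
+
+  {s : Str} ** {n : Number}    -- the record type {s : Str ; n : Number}
+  {s = "foo"} ** {n = Sg}      -- the record {s = "foo" ; n = Sg}
+  -- {s = "foo"} ** {s = "bar"} violates the second condition: s occurs on both sides
+</PRE>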
+
+<A NAME="toc41"></A>
+<H3>Subtyping</H3>
+<P>
+<a name="subtyping"></a>
+</P>
+<P>
+The possibility of having superfluous fields in a record forms the basis of
+the <B>subtyping</B> relation.
+That <I>A</I> is a subtype of <I>B</I> means that <I>a : A</I> implies <I>a : B</I>.
+This is clearly satisfied for records with superfluous fields:
+</P>
+<UL>
+<LI>if <I>R</I> is a record type without the label <I>r</I>,
+ then <I>R</I> <CODE>** {</CODE> <I>r</I> : <I>A</I> <CODE>}</CODE> is a subtype of <I>R</I>
+</UL>
+
+<P>
+The GF grammar compiler extends subtyping to function types by <B>covariance</B>
+and <B>contravariance</B>:
+</P>
+<UL>
+<LI>covariance: if <I>A</I> is a subtype of <I>B</I>,
+ then <I>C</I> <CODE>-&gt;</CODE> <I>A</I> is a subtype of <I>C</I> <CODE>-&gt;</CODE> <I>B</I>
+<LI>contravariance: if <I>A</I> is a subtype of <I>B</I>,
+ then <I>B</I> <CODE>-&gt;</CODE> <I>C</I> is a subtype of <I>A</I> <CODE>-&gt;</CODE> <I>C</I>
+</UL>
+
+<P>
+The logic of these rules is natural: if a function returns a value
+in a subtype, then this value is <I>a fortiori</I> in the supertype.
+If a function is defined for some type, then it is <I>a fortiori</I> defined
+for any subtype.
+</P>
+<P>
+In addition to the well-known principles of record subtyping and co- and
+contravariance, GF implements subtyping for initial segments of integers:
+</P>
+<UL>
+<LI>if <I>m</I> &lt; <I>n</I>, then <CODE>Ints</CODE> <I>m</I> is a subtype of <CODE>Ints</CODE> <I>n</I>
+<LI><CODE>Ints</CODE> <I>n</I> is a subtype of <CODE>Integer</CODE>
+</UL>
+
+<P>
+As the last rule, subtyping is transitive:
+</P>
+<UL>
+<LI>if <I>A</I> is a subtype of <I>B</I> and <I>B</I> is a subtype of <I>C</I>, then
+ <I>A</I> is a subtype of <I>C</I>.
+</UL>
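+<P>
+A minimal sketch of record subtyping in use; the operations <CODE>getS</CODE> and
+<CODE>foo</CODE> are hypothetical, and <CODE>Number</CODE> is assumed to be a parameter
+type with the value <CODE>Sg</CODE>.
+</P>
+<PRE>
+  oper getS : {s : Str} -&gt; Str = \r -&gt; r.s ;
+
+  -- the argument has type {s : Str ; n : Number}, which is a subtype
+  -- of {s : Str}, so the application is well-typed
+  oper foo : Str = getS {s = "foo" ; n = Sg} ;
+</PRE>
+<P>
+The superfluous field <CODE>n</CODE> is simply ignored by <CODE>getS</CODE>.
+</P>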
+
+<A NAME="toc42"></A>
+<H3>Tables and table types</H3>
+<P>
+<a name="tables"></a>
+</P>
+<P>
+One of the most characteristic constructs of GF is <B>tables</B>, also called
+<B>finite functions</B>. That these functions are finite means that it
+is possible to finitely enumerate all argument-value pairs; this, in
+turn, is possible because the argument types are finite.
+</P>
+<P>
+A <B>table type</B> has the form
+<center>
+<I>P</I> <CODE>=&gt;</CODE> <I>T</I>
+</center>
+where <I>P</I> must be a parameter type in the sense defined <a href="#paramtypes">here</a>, whereas
+<I>T</I> can be any type.
+</P>
+<P>
+Canonical expressions of table types are <B>tables</B>, of the form
+<center>
+<CODE>table</CODE> <CODE>{</CODE> <i>V</i><sub>1</sub> <CODE>=&gt;</CODE> <i>t</i><sub>1</sub> ; ... ; <i>V</i><sub>n</sub> <CODE>=&gt;</CODE> <i>t</i><sub>n</sub> <CODE>}</CODE>
+</center>
+where <i>V</i><sub>1</sub>,...,<i>V</i><sub>n</sub> is the complete list of the parameter values of
+the argument type <I>P</I> (defined <a href="#paramvalues">here</a>), and each <i>t</i><sub>i</sub> is
+an expression of the value type <I>T</I>.
+</P>
+<P>
+In addition to explicit enumerations,
+tables can be given by <B>pattern matching</B>,
+<center>
+<CODE>table</CODE> <CODE>{</CODE><i>p</i><sub>1</sub> <CODE>=&gt;</CODE> <i>t</i><sub>1</sub> ; ... ; <i>p</i><sub>m</sub> <CODE>=&gt;</CODE> <i>t</i><sub>m</sub><CODE>}</CODE>
+</center>
+where <i>p</i><sub>1</sub>,...,<i>p</i><sub>m</sub> is a list of patterns that covers all values of type <I>P</I>.
+Each pattern <i>p</i><sub>i</sub> may bind some variables, on which the expression <i>t</i><sub>i</sub>
+may depend. A complete account of patterns and pattern matching is given
+<a href="#patternmatching">here</a>.
+</P>
+<P>
+A <B>course-of-values table</B> omits the patterns and just lists all
+values. It uses the enumeration of all values of the argument type <I>P</I>
+to pair the values with arguments:
+<center>
+<CODE>table</CODE> <I>P</I> <CODE>[</CODE><i>t</i><sub>1</sub> ; ... ; <i>t</i><sub>n</sub><CODE>]</CODE>
+</center>
+This format is not recommended for GF source code, since the
+ordering of parameter values is not specified and therefore a
+compiler-internal decision.
+</P>
+<P>
+The argument type can be indicated in ordinary tables as well, which is
+sometimes helpful for type inference:
+<center>
+<CODE>table</CODE> <I>P</I> <CODE>{</CODE> ... <CODE>}</CODE>
+</center>
+</P>
+<P>
+The <B>selection</B> operator <CODE>!</CODE>, applied to a table <I>t</I> and to an expression
+<I>v</I> of its argument type
+<center>
+<I>t</I> <CODE>!</CODE> <I>v</I>
+</center>
+returns the first pattern matching result from <I>t</I> with <I>v</I>, as defined
+<a href="#patternmatching">here</a>. The order of patterns is thus significant as long as the
+patterns contain variables or wildcards. When the compiler reorders the
+patterns following the enumeration of all values of the argument type,
+this order no longer matters, because no overlap remains between patterns.
+</P>
+<P>
+The GF compiler performs <B>table expansion</B>, i.e. an analogue of
+eta expansion defined <a href="#conversions">here</a>, where a table is applied to all
+values of its argument type:
+<center>
+<I>t</I> : <I>P</I> <CODE>=&gt;</CODE> <I>T</I> ==>
+<CODE>table</CODE> <I>P</I> <CODE>[</CODE><I>t</I> <CODE>!</CODE> <i>V</i><sub>1</sub> ; ... ; <I>t</I> <CODE>!</CODE> <i>V</i><sub>n</sub><CODE>]</CODE>
+</center>
+As syntactic sugar, one-branch tables can be written in a way similar to
+lambda abstractions:
+<center>
+<CODE>\\</CODE><I>p</I> <CODE>=&gt;</CODE> <I>t</I> === <CODE>table {</CODE><I>p</I> <CODE>=&gt;</CODE> <I>t</I> <CODE>}</CODE>
+</center>
+where <I>p</I> is either a variable or a wildcard (<CODE>_</CODE>). Multiple bindings
+can be abbreviated:
+<center>
+<CODE>\\</CODE><I>p,q</I> <CODE>=&gt;</CODE> <I>t</I> === <CODE>\\</CODE><I>p</I> <CODE>=&gt;</CODE> <CODE>\\</CODE><I>q</I> <CODE>=&gt;</CODE> <I>t</I>
+</center>
+<B>Case expressions</B> are syntactic sugar for selections:
+<center>
+<CODE>case</CODE> <I>e</I> <CODE>of {</CODE>...<CODE>}</CODE> === <CODE>table {</CODE>...<CODE>} !</CODE> <I>e</I>
+</center>
+</P>
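+<P>
+The table forms of this section can be put side by side in a small sketch.
+The parameter type <CODE>Number</CODE> and the operations <CODE>house</CODE>,
+<CODE>sheep</CODE>, and <CODE>count</CODE> are hypothetical names, and the computations
+in the comments are derived by hand from the rules above.
+</P>
+<PRE>
+  param Number = Sg | Pl ;
+
+  oper
+    house : Number =&gt; Str = table {Sg =&gt; "house" ; Pl =&gt; "houses"} ;
+    sheep : Number =&gt; Str = \\_ =&gt; "sheep" ;      -- one-branch table
+    count : Number -&gt; Str = \n -&gt;
+      case n of {Sg =&gt; "one" ; _ =&gt; "many"} ;     -- sugar for a selection
+
+  -- house ! Pl  ==&gt;  "houses"
+  -- sheep ! Sg  ==&gt;  "sheep"
+  -- count Pl    ===  table {Sg =&gt; "one" ; _ =&gt; "many"} ! Pl  ==&gt;  "many"
+</PRE>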
+<A NAME="toc43"></A>
+<H3>Pattern matching</H3>
+<P>
+<a name="patternmatching"></a>
+</P>
+<P>
+We will list all forms of patterns that can be used in table branches.
+We define their <B>variable bindings</B> and <B>matching substitutions</B>.
+</P>
+<P>
+We start with the patterns available for all parameter types, as well
+as for the types <CODE>Integer</CODE> and <CODE>Str</CODE>.
+</P>
+<UL>
+<LI>A constructor pattern <I>C</I> <i>p</i><sub>1</sub>...<i>p</i><sub>n</sub>
+ binds the union of all variables bound in the subpatterns
+ <i>p</i><sub>1</sub>,...,<i>p</i><sub>n</sub>.
+ It matches any value
+ <I>C</I> <i>V</i><sub>1</sub>...<i>V</i><sub>n</sub> where each <i>p</i><sub>i</sub> matches <i>V</i><sub>i</sub>,
+ and the matching substitution is the union of these substitutions.
+<LI>A record pattern
+ <CODE>{</CODE> <i>r</i><sub>1</sub> <CODE>=</CODE> <i>p</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <i>r</i><sub>n</sub> <CODE>=</CODE> <i>p</i><sub>n</sub> <CODE>}</CODE>
+ binds the union of all variables bound in the subpatterns
+ <i>p</i><sub>1</sub>,...,<i>p</i><sub>n</sub>.
+ It matches any value
+ <CODE>{</CODE> <i>r</i><sub>1</sub> <CODE>=</CODE> <i>V</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <i>r</i><sub>n</sub> <CODE>=</CODE> <i>V</i><sub>n</sub> <CODE>;</CODE> ...<CODE>}</CODE>
+ where each <i>p</i><sub>i</sub> matches <i>V</i><sub>i</sub>,
+ and the matching substitution is the union of these substitutions.
+<LI>A variable pattern <I>x</I>
+ (identifier other than parameter constructor)
+ binds the variable <I>x</I>.
+ It matches any value <I>V</I>, with the substitution {<I>x</I> = <I>V</I>}.
+<LI>The wild card <CODE>_</CODE> binds no variables.
+ It matches any value, with the empty substitution.
+<LI>A disjunctive pattern <I>p</I> <CODE>|</CODE> <I>q</I> binds the intersection of
+ the variables bound by <I>p</I> and <I>q</I>.
+ It matches anything that
+ either <I>p</I> or <I>q</I> matches. The substitution is that of the first
+ pattern that matches, trying <I>p</I> before <I>q</I>, from which those
+ variables that are not bound by both patterns are removed.
+<LI>A negative pattern <CODE>-</CODE> <I>p</I> binds no variables.
+ It matches anything that <I>p</I> does <I>not</I> match, with the empty
+ substitution.
+<LI>An alias pattern <I>x</I> <CODE>@</CODE> <I>p</I> binds <I>x</I> and all the variables
+ bound by <I>p</I>. It matches any value <I>V</I> that <I>p</I> matches, with
+ the same substitution extended by {<I>x</I> = <I>V</I>}.
+</UL>
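+
+<P>
+The following sketch combines constructor, disjunctive, negative, and
+wild card patterns in one case expression. The parameter types and the
+operation <CODE>describe</CODE> are hypothetical, introduced only for this
+illustration.
+</P>
+<PRE>
+  param
+    Number = Sg | Pl ;
+    Case   = Nom | Acc | Gen ;
+    Agr    = Ag Number Case ;
+  oper
+    describe : Agr -&gt; Str = \a -&gt; case a of {
+      Ag Sg (Nom | Acc) =&gt; "singular, direct case" ;  -- constructor with a disjunctive subpattern
+      Ag Pl (-Gen)      =&gt; "plural, direct case" ;    -- negative subpattern: anything but Gen
+      Ag _ Gen          =&gt; "genitive"                 -- wild card for the Number argument
+      } ;
+</PRE>
+<P>
+The three branches together cover all six values of <CODE>Agr</CODE>, and none
+of them binds a variable.
+</P>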
+
+<P>
+The following patterns are only available for the type <CODE>Str</CODE>:
+</P>
+<UL>
+<LI>A string literal pattern, e.g. <CODE>"s"</CODE>, binds no variables.
+ It matches the same string, with the empty substitution.
+<LI>A concatenation pattern, <I>p</I> <CODE>+</CODE> <I>q</I>,
+ binds the union of variables bound by <I>p</I> and <I>q</I>.
+ It matches any string that consists
+ of a prefix matching <I>p</I> and a suffix matching <I>q</I>,
+ with the union of substitutions corresponding to the first match (see below).
+<LI>A repetition pattern <I>p</I><CODE>*</CODE> binds no variables.
+ It matches any string that can be decomposed
+ into strings that match <I>p</I>, with the empty substitution.
+</UL>
+
+<P>
+The following pattern is only available for the types <CODE>Integer</CODE>
+and <CODE>Ints</CODE> <I>n</I>:
+</P>
+<UL>
+<LI>An integer literal pattern, e.g. <CODE>214</CODE>, binds no variables.
+ It matches the same integer, with
+ the empty substitution.
+</UL>
+
+<P>
+All patterns must be <B>linear</B>: the same pattern variable may occur
+only once in them. This is what makes it straightforward to speak
+about unions of binding sets and substitutions.
+</P>
+<P>
+Pattern matching is performed in the order in which the branches
+appear in the source code: the branch of the first matching pattern is followed.
+In concrete syntax, the type checker rejects sets of patterns that are
+not exhaustive, and warns about completely overshadowed patterns.
+It also checks the type correctness of patterns with respect to the
+argument type. In abstract syntax, only type correctness is checked,
+no exhaustiveness or overshadowing.
+</P>
+<P>
+It follows from the definition of record pattern matching
+that it can utilize partial records: the branch
+</P>
+<PRE>
+ {g = Fem} =&gt; t
+</PRE>
+<P>
+in a table of type <CODE>{g : Gender ; n : Number} =&gt; T</CODE> means the same as
+</P>
+<PRE>
+ {g = Fem ; n = _} =&gt; t
+</PRE>
+<P>
+Variables in regular expression patterns
+are always bound to the <B>first match</B>, which is the first
+in the sequence of binding lists. For example:
+</P>
+<UL>
+<LI><CODE>x + "e" + y</CODE> matches <CODE>"peter"</CODE> with <CODE>x = "p", y = "ter"</CODE>
+<LI><CODE>x + "er"*</CODE> matches <CODE>"burgerer"</CODE> with <CODE>x = "burg"</CODE>
+</UL>
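+
+<P>
+String patterns are typically used in "smart paradigms" that inspect the
+ending of a word. The operation <CODE>plural</CODE> below is a hypothetical,
+deliberately incomplete sketch (it would wrongly turn "boy" into "boies");
+it shows that branch order matters, since the last branch matches every string.
+</P>
+<PRE>
+  oper
+    plural : Str -&gt; Str = \w -&gt; case w of {
+      _ + ("s" | "x" | "sh" | "ch") =&gt; w + "es" ;     -- bus, box, dish, church
+      stem + "y"                    =&gt; stem + "ies" ; -- fly: stem is bound to "fl"
+      _                             =&gt; w + "s"        -- every other word
+      } ;
+</PRE>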
+
+<A NAME="toc44"></A>
+<H3>Free variation</H3>
+<P>
+An expression of the form
+<center>
+<CODE>variants</CODE> <CODE>{</CODE><i>t</i><sub>1</sub> ; ... ; <i>t</i><sub>n</sub><CODE>}</CODE>
+</center>
+where all <i>t</i><sub>i</sub> are of the same type <I>T</I>, has itself type <I>T</I>.
+This expression presents <i>t</i><sub>1</sub>,...,<i>t</i><sub>n</sub> as being in <B>free variation</B>:
+the choice between them is not determined by semantics or parameters.
+A limiting case is
+<center>
+<CODE>variants {}</CODE>
+</center>
+which encodes a rule saying that there is no way to express a certain
+thing, e.g. that a certain inflectional form does not exist.
+</P>
+<P>
+A common wisdom in linguistics is that "there is no free variation", which
+refers to the situation where <I>all</I> aspects are taken into account. For
+instance, the English negation contraction could be expressed as free variation,
+</P>
+<PRE>
+ variants {"don't" ; "do" ++ "not"}
+</PRE>
+<P>
+if only semantics is taken into account, but if stylistic aspects are included,
+then the proper formulation might be with a parameter distinguishing between
+informal and formal style:
+</P>
+<PRE>
+ case style of {Informal =&gt; "don't" ; Formal =&gt; "do" ++ "not"}
+</PRE>
+<P>
+Since there is no way to choose a particular element from a <CODE>variants</CODE> list,
+free variation is normally not adequate in libraries, nor in grammars meant for
+natural language generation. In application grammars
+meant to parse user input, free variation is a way to avoid cluttering the
+abstract syntax with semantically insignificant distinctions and even to
+tolerate some grammatical errors.
+</P>
+<P>
+Permitting <CODE>variants</CODE> in all types involves a major modification of the
+semantics of GF expressions. All computation rules have to be lifted to
+deal with lists of expressions and values. For instance,
+<center>
+<I>t</I> <CODE>!</CODE> <CODE>variants</CODE> <CODE>{</CODE><i>t</i><sub>1</sub> ; ... ; <i>t</i><sub>n</sub><CODE>}</CODE> ==>
+<CODE>variants</CODE> <CODE>{</CODE><I>t</I> <CODE>!</CODE> <i>t</i><sub>1</sub> ; ... ; <I>t</I> <CODE>!</CODE> <i>t</i><sub>n</sub><CODE>}</CODE>
+</center>
+This is done in such a way that
+variation does not distribute to records (or other product-like structures).
+For instance, variants of records,
+</P>
+<PRE>
+ variants {{s = "Auto" ; g = Neutr} ; {s = "Wagen" ; g = Masc}}
+</PRE>
+<P>
+is <I>not</I> the same as a record of variants,
+</P>
+<PRE>
+ {s = variants {"Auto" ; "Wagen"} ; g = variants {Neutr ; Masc}}
+</PRE>
+<P>
+Variants of variants are flattened,
+<center>
+<CODE>variants</CODE> <CODE>{</CODE>...; <CODE>variants</CODE> <CODE>{</CODE><i>t</i><sub>1</sub> ;...; <i>t</i><sub>n</sub><CODE>}</CODE> ;...<CODE>}</CODE>
+==>
+<CODE>variants</CODE> <CODE>{</CODE>...; <i>t</i><sub>1</sub> ;...; <i>t</i><sub>n</sub> ;...<CODE>}</CODE>
+</center>
+and singleton variants are eliminated,
+<center>
+<CODE>variants</CODE> <CODE>{</CODE><I>t</I><CODE>}</CODE> ==> <I>t</I>
+</center>
+</P>
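+<P>
+The lifting of selection to variants can be traced on a small instance.
+The table and the parameter values <CODE>Sg</CODE> and <CODE>Pl</CODE> are assumed
+for illustration only; the steps use just the rule for selection given above
+and ordinary table selection.
+</P>
+<PRE>
+  table {Sg =&gt; "mouse" ; Pl =&gt; "mice"} ! variants {Sg ; Pl}
+    ==&gt; variants {table {Sg =&gt; "mouse" ; Pl =&gt; "mice"} ! Sg ;
+                  table {Sg =&gt; "mouse" ; Pl =&gt; "mice"} ! Pl}
+    ==&gt; variants {"mouse" ; "mice"}
+</PRE>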
+<A NAME="toc45"></A>
+<H3>Local definitions</H3>
+<P>
+A <B>local definition</B>, i.e. a <B>let expression</B>, has the form
+<center>
+<CODE>let</CODE> <I>x</I> : <I>T</I> = <I>t</I> <CODE>in</CODE> <I>e</I>
+</center>
+The type of <I>x</I> must be <I>T</I>, which also has to be the type of <I>t</I>.
+Computation is performed by substituting <I>t</I> for <I>x</I> in <I>e</I>:
+<center>
+<CODE>let</CODE> <I>x</I> : <I>T</I> = <I>t</I> <CODE>in</CODE> <I>e</I> ==> <I>e</I> {<I>x</I> = <I>t</I>}
+</center>
+As syntactic sugar, the type can be omitted if the type checker is
+able to infer it:
+<center>
+<CODE>let</CODE> <I>x</I> = <I>t</I> <CODE>in</CODE> <I>e</I>
+</center>
+It is possible to compress several local definitions into one block:
+<center>
+<CODE>let</CODE> <I>x</I> : <I>T</I> = <I>t</I> <CODE>;</CODE> <I>y</I> : <I>U</I> = <I>u</I> <CODE>in</CODE> <I>e</I>
+===
+<CODE>let</CODE> <I>x</I> : <I>T</I> = <I>t</I> <CODE>in</CODE> <CODE>let</CODE> <I>y</I> : <I>U</I> = <I>u</I> <CODE>in</CODE> <I>e</I>
+</center>
+Another notational variant is a definition block appearing after the main
+expression:
+<center>
+<I>e</I> <CODE>where</CODE> <CODE>{</CODE>...<CODE>}</CODE> === <CODE>let</CODE> <CODE>{</CODE>...<CODE>}</CODE> <CODE>in</CODE> <I>e</I>
+</center>
+Curly brackets are obligatory in the <CODE>where</CODE> form, and can
+also be optionally used in the <CODE>let</CODE> form.
+</P>
+<P>
+Since a block of definitions is treated as syntactic sugar
+for a nested <CODE>let</CODE> expression, a constant must be defined before it
+is used: the scope is not mutual, unlike in a module body.
+Furthermore, unlike in <CODE>lin</CODE> and <CODE>oper</CODE> definitions, it is <I>not</I> possible
+to bind variables on the left of the equality sign.
+</P>
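+<P>
+A sketch of both notations follows; <CODE>Number</CODE>, <CODE>regVerb</CODE>, and
+<CODE>regVerb2</CODE> are hypothetical names. In the first definition,
+<CODE>walks</CODE> must precede <CODE>forms</CODE>, because later definitions
+in a block can only see earlier ones.
+</P>
+<PRE>
+  param Number = Sg | Pl ;
+  oper
+    regVerb : Str -&gt; Number =&gt; Str = \walk -&gt;
+      let
+        walks : Str = walk + "s" ;
+        forms : Number =&gt; Str = table {Sg =&gt; walks ; Pl =&gt; walk}  -- walks is in scope here
+      in forms ;
+
+    -- the same operation with a where block; the curly brackets are obligatory
+    regVerb2 : Str -&gt; Number =&gt; Str = \walk -&gt;
+      table {Sg =&gt; walks ; Pl =&gt; walk} where {walks : Str = walk + "s"} ;
+</PRE>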
+<A NAME="toc46"></A>
+<H3>Function applications in concrete syntax</H3>
+<P>
+<a name="functionelimination"></a>
+</P>
+<P>
+Fully compiled concrete syntax may not include expressions of function types
+except on the outermost level of <CODE>lin</CODE> rules, as defined <a href="#linexpansion">here</a>.
+However,
+in the source code, and especially in <CODE>oper</CODE> definitions, functions
+are the main vehicle of code reuse and abstraction. Thus function types and
+functions follow the same rules as in abstract syntax, as specified
+<a href="#functiontype">here</a>. In
+particular, the application of a lambda abstract is computed by beta conversion.
+</P>
+<P>
+To ensure the elimination of functions, GF uses a special computation rule
+for pushing function applications inside tables, since otherwise run-time
+variables could block their applications:
+<center>
+(<CODE>table</CODE> <CODE>{</CODE><i>p</i><sub>1</sub> <CODE>=&gt;</CODE> <i>f</i><sub>1</sub> ; ... ;
+ <i>p</i><sub>n</sub> <CODE>=&gt;</CODE> <i>f</i><sub>n</sub> <CODE>}</CODE> <CODE>!</CODE> <I>e</I>) <I>a</I>
+ ==>
+ <CODE>table</CODE> <CODE>{</CODE><i>p</i><sub>1</sub> <CODE>=&gt;</CODE> <i>f</i><sub>1</sub> <I>a</I> ; ... ;
+ <i>p</i><sub>n</sub> <CODE>=&gt;</CODE> <i>f</i><sub>n</sub> <I>a</I><CODE>}</CODE> <CODE>!</CODE> <I>e</I>
+</center>
+Also parameter constructors with non-empty contexts, as defined
+<a href="#paramjudgements">here</a>,
+result in expressions in application form. These expressions are never
+a problem if their arguments are just constructors, because they can then
+be translated to integers corresponding to the position of the expression
+in the enumeration of the values of its type.
+However, a constructor
+applied to a run-time variable may need to be converted as follows:
+<center>
+<I>C</I>...<I>x</I>... ==> <CODE>case</CODE> <I>x</I> <CODE>of {_ =&gt;</CODE> <I>C</I>...<I>x</I>...<CODE>}</CODE>
+</center>
+The resulting expression, when processed by table expansion as explained
+<a href="#tables">here</a>,
+results in <I>C</I> being applied to just values of the type of <I>x</I>, and the
+application thereby disappears.
+</P>
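+<P>
+As an illustration of the rule for pushing applications inside tables,
+assume a run-time variable <CODE>n</CODE> of a parameter type with the values
+<CODE>Sg</CODE> and <CODE>Pl</CODE> (both assumed here). The selection cannot be
+computed at compile time, but the application can still be eliminated:
+</P>
+<PRE>
+  (table {Sg =&gt; \s -&gt; s ; Pl =&gt; \s -&gt; s + "s"} ! n) "cat"
+    ==&gt; table {Sg =&gt; (\s -&gt; s) "cat" ; Pl =&gt; (\s -&gt; s + "s") "cat"} ! n
+    ==&gt; table {Sg =&gt; "cat" ; Pl =&gt; "cats"} ! n
+</PRE>
+<P>
+The last step is beta conversion in each branch; the result contains no
+expressions of function type.
+</P>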
+<A NAME="toc47"></A>
+<H3>Reusing top-level grammars as resources</H3>
+<P>
+<a name="reuse"></a>
+</P>
+<P>
+<I>This section is valid for GF 3.0, which abandons the "lock field"</I>
+<I>discipline of GF 2.8.</I>
+</P>
+<P>
+As explained <a href="#openabstract">here</a>,
+abstract syntax modules can be opened as interfaces
+and concrete syntaxes as their instances. This means that judgements are,
+as it were, translated in the following way:
+</P>
+<UL>
+<LI><CODE>cat</CODE> <I>C</I> <I>G</I> ===&gt; <CODE>oper</CODE> <I>C</I> : <CODE>Type</CODE>
+<LI><CODE>fun</CODE> <I>f</I> : <I>T</I> ===&gt; <CODE>oper</CODE> <I>f</I> : <I>T</I>
+<LI><CODE>lincat</CODE> <I>C</I> = <I>T</I> ===&gt; <CODE>oper</CODE> <I>C</I> : <CODE>Type</CODE> = <I>C</I>
+<LI><CODE>lin</CODE> <I>f</I> = <I>t</I> ===&gt; <CODE>oper</CODE> <I>f</I> = <I>t</I>
+</UL>
+
+<P>
+Notice that the value <I>T</I> of <CODE>lincat</CODE> definitions is not disclosed
+in the translation. This means that the type <I>C</I> remains abstract: the
+only ways of building an object of type <I>C</I> are the operations <I>f</I>
+obtained from <CODE>fun</CODE> and <CODE>lin</CODE> rules.
+</P>
+<P>
+The purpose of keeping linearization types abstract is to enforce
+<B>grammar checking via type checking</B>. This means that any well-typed
+operation application is also well-typed in the sense of the original
+grammar. If the types were disclosed, then we could for instance easily
+confuse all categories that have the linearization
+type <CODE>{s : Str}</CODE>. Yet another reason is that revealing the types
+makes it impossible for the library programmers to change their type
+definitions afterwards.
+</P>
+<P>
+Library writers may occasionally want to have access to the values of
+linearization types. The way to make it possible is to add an extra
+construction operation to a module in which the linearization type
+is available:
+</P>
+<PRE>
+ oper MkC : T -&gt; C = \x -&gt; x
+</PRE>
+<P>
+In object-oriented terms, the type <I>C</I> itself is <B>protected</B>, whereas
+<I>MkC</I> is a <B>public constructor</B> of <I>C</I>. Of course, it is possible to
+make these constructors overloaded (concept explained <a href="#overloading">here</a>),
+to enable easy access to special cases.
+</P>
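+<P>
+The following sketch shows the intended usage under the translation
+described in this section. All module and constant names
+(<CODE>Food</CODE>, <CODE>FoodEng</CODE>, <CODE>Menu</CODE>, <CODE>mkKind</CODE>, etc.)
+are hypothetical.
+</P>
+<PRE>
+  abstract Food = {
+    cat Kind ;
+    fun Wine : Kind ;
+  }
+
+  concrete FoodEng of Food = {
+    lincat Kind = {s : Str} ;
+    lin Wine = {s = "wine"} ;
+    oper mkKind : Str -&gt; Kind = \w -&gt; {s = w} ;   -- public constructor of Kind
+  }
+
+  resource Menu = open FoodEng in {
+    oper
+      house  : Kind = Wine ;              -- operation obtained from the fun and lin rules
+      cheese : Kind = mkKind "cheese" ;   -- built with the constructor
+  }
+</PRE>
+<P>
+Since <CODE>Kind</CODE> is abstract in <CODE>Menu</CODE>, a definition that relies
+on its being the record type <CODE>{s : Str}</CODE> is not expected to type check there.
+</P>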
+<A NAME="toc48"></A>
+<H3>Predefined concrete syntax types</H3>
+<P>
+<a name="predefcnc"></a>
+</P>
+<P>
+The following concrete syntax types are predefined:
+</P>
+<UL>
+<LI><CODE>Str</CODE>, the type of tokens and token lists (defined <a href="#strtype">here</a>)
+<LI><CODE>Integer</CODE>, the type of nonnegative integers
+<LI><CODE>Ints</CODE> <I>n</I>, the type of integers from <I>0</I> to <I>n</I>
+<LI><CODE>Type</CODE>, the type of (concrete syntax) types
+<LI><CODE>PType</CODE>, the type of parameter types
+</UL>
+
+<P>
+The last two types are, in a way, extended by user-written grammars,
+since new parameter types can be defined in the way shown <a href="#paramjudgements">here</a>,
+and every parameter type is also a type. From the point of view of the values
+of expressions, however, a <CODE>param</CODE> declaration does not extend
+<CODE>PType</CODE>, since all parameter types get compiled to initial
+segments of integers.
+</P>
+<P>
+Notice the difference between the concrete syntax types
+<CODE>Str</CODE> and <CODE>Integer</CODE> on the one hand, and the abstract
+syntax categories <CODE>String</CODE> and <CODE>Int</CODE>, on the other.
+As <I>concrete syntax</I> types, the latter are treated in
+the same way as any reused categories: their objects
+can be formed by using syntax trees (string and integer
+literals).
+</P>
+<P>
+<I>The type name</I> <CODE>Integer</CODE> <I>replaces in GF 3.0 the name</I> <CODE>Int</CODE>,
+<I>to avoid confusion with the abstract syntax type and to be analogous</I>
+<I>with the</I> <CODE>Str</CODE> <I>vs.</I> <CODE>String</CODE> <I>distinction.</I>
+</P>
+<A NAME="toc49"></A>
+<H3>Predefined concrete syntax operations</H3>
+<P>
+The following predefined operations are defined in the resource module
+<CODE>prelude/Predef.gf</CODE>. Their implementations are defined as
+a part of the GF grammar compiler.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>operation</TH>
+<TH>type</TH>
+<TH COLSPAN="2">explanation</TH>
+</TR>
+<TR>
+<TD><CODE>PBool</CODE></TD>
+<TD><CODE>PType</CODE></TD>
+<TD><CODE>PTrue | PFalse</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>Error</CODE></TD>
+<TD><CODE>Type</CODE></TD>
+<TD>the empty type</TD>
+</TR>
+<TR>
+<TD><CODE>Int</CODE></TD>
+<TD><CODE>Type</CODE></TD>
+<TD>the type of integers</TD>
+</TR>
+<TR>
+<TD><CODE>Ints</CODE></TD>
+<TD><CODE>Integer -&gt; Type</CODE></TD>
+<TD>the type of integers from 0 to n</TD>
+</TR>
+<TR>
+<TD><CODE>error</CODE></TD>
+<TD><CODE>Str -&gt; Error</CODE></TD>
+<TD>forms error message</TD>
+</TR>
+<TR>
+<TD><CODE>length</CODE></TD>
+<TD><CODE>Str -&gt; Int</CODE></TD>
+<TD>length of string</TD>
+</TR>
+<TR>
+<TD><CODE>drop</CODE></TD>
+<TD><CODE>Integer -&gt; Str -&gt; Str</CODE></TD>
+<TD>drop prefix of length</TD>
+</TR>
+<TR>
+<TD><CODE>take</CODE></TD>
+<TD><CODE>Integer -&gt; Str -&gt; Str</CODE></TD>
+<TD>take prefix of length</TD>
+</TR>
+<TR>
+<TD><CODE>tk</CODE></TD>
+<TD><CODE>Integer -&gt; Str -&gt; Str</CODE></TD>
+<TD>drop suffix of length</TD>
+</TR>
+<TR>
+<TD><CODE>dp</CODE></TD>
+<TD><CODE>Integer -&gt; Str -&gt; Str</CODE></TD>
+<TD>take suffix of length</TD>
+</TR>
+<TR>
+<TD><CODE>eqInt</CODE></TD>
+<TD><CODE>Integer -&gt; Integer -&gt; PBool</CODE></TD>
+<TD>test if equal integers</TD>
+</TR>
+<TR>
+<TD><CODE>lessInt</CODE></TD>
+<TD><CODE>Integer -&gt; Integer -&gt; PBool</CODE></TD>
+<TD>test order of integers</TD>
+</TR>
+<TR>
+<TD><CODE>plus</CODE></TD>
+<TD><CODE>Integer -&gt; Integer -&gt; Integer</CODE></TD>
+<TD>add integers</TD>
+</TR>
+<TR>
+<TD><CODE>eqStr</CODE></TD>
+<TD><CODE>Str -&gt; Str -&gt; PBool</CODE></TD>
+<TD>test if equal strings</TD>
+</TR>
+<TR>
+<TD><CODE>occur</CODE></TD>
+<TD><CODE>Str -&gt; Str -&gt; PBool</CODE></TD>
+<TD>test if occurs as substring</TD>
+</TR>
+<TR>
+<TD><CODE>occurs</CODE></TD>
+<TD><CODE>Str -&gt; Str -&gt; PBool</CODE></TD>
+<TD>test if any char occurs</TD>
+</TR>
+<TR>
+<TD><CODE>show</CODE></TD>
+<TD><CODE>(P : Type) -&gt; P -&gt; Str</CODE></TD>
+<TD>convert param to string</TD>
+</TR>
+<TR>
+<TD><CODE>read</CODE></TD>
+<TD><CODE>(P : Type) -&gt; Str -&gt; P</CODE></TD>
+<TD>convert string to param</TD>
+</TR>
+<TR>
+<TD><CODE>toStr</CODE></TD>
+<TD><CODE>(L : Type) -&gt; L -&gt; Str</CODE></TD>
+<TD>find the "first" string</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+Compilation eliminates these operations, and they may therefore not
+take arguments that depend on run-time variables.
+</P>
+<P>
+The module <CODE>Predef</CODE> is included in the <I>opens</I> list of all
+modules, and therefore does not need to be opened explicitly.
+</P>
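+<P>
+The string operations can be illustrated as follows; the values in the
+comments follow from the explanations in the table above. The operation
+<CODE>yToIes</CODE> is a hypothetical example and not part of <CODE>Predef</CODE>.
+</P>
+<PRE>
+  Predef.tk 1 "baby"      -- "bab"  (suffix of length 1 dropped)
+  Predef.dp 1 "baby"      -- "y"    (suffix of length 1 taken)
+  Predef.take 2 "baby"    -- "ba"   (prefix of length 2 taken)
+  Predef.drop 2 "baby"    -- "by"   (prefix of length 2 dropped)
+
+  oper yToIes : Str -&gt; Str = \w -&gt; Predef.tk 1 w + "ies" ;  -- yToIes "baby" gives "babies"
+</PRE>
+<P>
+Since these operations are eliminated in compilation, <CODE>yToIes</CODE> can
+only be applied to strings that are known at compile time.
+</P>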
+<A NAME="toc50"></A>
+<H2>Flags and pragmas</H2>
+<A NAME="toc51"></A>
+<H3>Some flags and their values</H3>
+<P>
+<a name="flagvalues"></a>
+</P>
+<P>
+The flag <CODE>coding</CODE> in concrete syntax sets the <B>character encoding</B>
+used in the grammar. Internally, GF uses Unicode, and <CODE>.gfcc</CODE> files
+are always written in UTF-8 encoding. The presence of the flag
+<CODE>coding=utf8</CODE> prevents GF from encoding an already encoded
+file.
+</P>
+<P>
+The flag <CODE>lexer</CODE> in concrete syntax sets the lexer,
+i.e. the processor that turns
+strings into token lists sent to the parser. Some GF implementations
+support the following lexers.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>lexer</TH>
+<TH COLSPAN="2">description</TH>
+</TR>
+<TR>
+<TD><CODE>words</CODE></TD>
+<TD>(default) tokens are separated by spaces or newlines</TD>
+</TR>
+<TR>
+<TD><CODE>literals</CODE></TD>
+<TD>like words, but integer and string literals recognized</TD>
+</TR>
+<TR>
+<TD><CODE>chars</CODE></TD>
+<TD>each character is a token</TD>
+</TR>
+<TR>
+<TD><CODE>code</CODE></TD>
+<TD>program code conventions (uses Haskell's lex)</TD>
+</TR>
+<TR>
+<TD><CODE>text</CODE></TD>
+<TD>with conventions on punctuation and capital letters</TD>
+</TR>
+<TR>
+<TD><CODE>codelit</CODE></TD>
+<TD>like code, but recognize literals (unknown words as strings)</TD>
+</TR>
+<TR>
+<TD><CODE>textlit</CODE></TD>
+<TD>like text, but recognize literals (unknown words as strings)</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+The flag <CODE>startcat</CODE> in abstract syntax sets the default start category for
+parsing, random generation, and any other grammar operation that depends
+on category. Its legal values are the categories defined or inherited in
+the abstract syntax.
+</P>
+<P>
+The flag <CODE>unlexer</CODE> in concrete syntax sets the unlexer,
+i.e. the processor that turns
+token lists obtained from the linearizer to strings. Some GF implementations
+support the following unlexers.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>unlexer</TH>
+<TH COLSPAN="2">description</TH>
+</TR>
+<TR>
+<TD><CODE>unwords</CODE></TD>
+<TD>(default) space-separated token list</TD>
+</TR>
+<TR>
+<TD><CODE>text</CODE></TD>
+<TD>format as text: punctuation, capitals, paragraph &lt;p&gt;</TD>
+</TR>
+<TR>
+<TD><CODE>code</CODE></TD>
+<TD>format as code (spacing, indentation)</TD>
+</TR>
+<TR>
+<TD><CODE>textlit</CODE></TD>
+<TD>like text, but remove string literal quotes</TD>
+</TR>
+<TR>
+<TD><CODE>codelit</CODE></TD>
+<TD>like code, but remove string literal quotes</TD>
+</TR>
+<TR>
+<TD><CODE>concat</CODE></TD>
+<TD>remove all spaces</TD>
+</TR>
+</TABLE>
+
+<P></P>
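+<P>
+As a sketch, the flags of this section could be set as follows. The module
+names, the category <CODE>Phrase</CODE>, and the chosen flag values are assumptions
+made for the example only.
+</P>
+<PRE>
+  abstract Food = {
+    flags startcat = Phrase ;   -- default category for parsing and generation
+    cat Phrase ;
+  }
+
+  concrete FoodEng of Food = {
+    flags coding = utf8 ; lexer = text ; unlexer = text ;
+    lincat Phrase = {s : Str} ;
+  }
+</PRE>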
+<A NAME="toc52"></A>
+<H3>Compiler pragmas</H3>
+<P>
+<B>Compiler pragmas</B> are a special form of comments prefixed with <CODE>--#</CODE>.
+Currently GF interprets the following pragmas.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>pragma</TH>
+<TH COLSPAN="2">explanation</TH>
+</TR>
+<TR>
+<TD><CODE>-path=</CODE>PATH</TD>
+<TD>path list for searching modules</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+For instance, the line
+</P>
+<PRE>
+ --# -path=.:present:prelude:/home/aarne/GF/tmp
+</PRE>
+<P>
+at the top of <CODE>FILE.gf</CODE> causes the GF compiler, when invoked on <CODE>FILE.gf</CODE>,
+to search through the current directory (<CODE>.</CODE>) and the directories
+<CODE>present</CODE>, <CODE>prelude</CODE>, and <CODE>/home/aarne/GF/tmp</CODE>, in this order.
+If a directory <CODE>DIR</CODE> is not found relative to the working directory,
+also <CODE>$(GF_LIB_PATH)/DIR</CODE> is searched.
+</P>
+<A NAME="toc53"></A>
+<H2>Alternative grammar input formats</H2>
+<P>
+While the GF language as specified in this document is the most versatile
+and powerful way of writing GF grammars, there are several other formats
+that a GF compiler may make available for users, either to get started
+with small grammars or to semiautomatically convert grammars from other
+formats to GF. Here are the ones supported by GF 2.8 and 3.0.
+</P>
+<A NAME="toc54"></A>
+<H3>Old GF without modules</H3>
+<P>
+<a name="oldgf"></a>
+</P>
+<P>
+Before GF compiler version 2.0, there was no module system, and
+all kinds of judgement could be written in all files, without
+any headers. This format is still available, and the compiler
+(version 2.8) detects automatically if a file is in the current
+or the old format. However, the old format is not recommended
+because of poor modularity and missing separate compilation,
+and also because libraries are not available, since the old
+and the new format cannot be mixed. With version 2.8, grammars
+in the old format can be converted to modular grammars with the
+command
+</P>
+<PRE>
+ &gt; import -o FILE.gf
+</PRE>
+<P>
+which rewrites the grammar divided into three files:
+an abstract, a concrete, and a resource module.
+</P>
+<A NAME="toc55"></A>
+<H3>Context-free grammars</H3>
+<P>
+A quick way to write a GF grammar is to use the context-free format,
+also known as BNF. Files of this form are recognized by the suffix
+<CODE>.cf</CODE>. Rules in these files have the form
+<center>
+<I>Label</I> <CODE>.</CODE> <I>Cat</I> <CODE>::=</CODE> (<I>String</I> | <I>Cat</I>)* <CODE>;</CODE>
+</center>
+where <I>Label</I> and <I>Cat</I> are identifiers and <I>String</I> quoted strings.
+</P>
+<P>
+There is a shortcut form generating labels automatically,
+<center>
+<I>Cat</I> <CODE>::=</CODE> (<I>String</I> | <I>Cat</I>)* <CODE>;</CODE>
+</center>
+In the shortcut form, vertical bars (<CODE>|</CODE>) can be used to give
+several right-hand-sides at a time. An empty right-hand side
+means the singleton of an empty sequence, and not an empty union.
+</P>
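+<P>
+A small <CODE>.cf</CODE> file in the labelled form could look as follows; the
+labels, categories, and words are invented for illustration.
+</P>
+<PRE>
+  Pred.  S  ::= NP VP ;
+  Compl. VP ::= V NP ;
+  John.  NP ::= "John" ;
+  Loves. V  ::= "loves" ;
+</PRE>
+<P>
+In the shortcut form, further noun phrases could be added in a single rule
+without labels:
+</P>
+<PRE>
+  NP ::= "Mary" | "the" "world" ;
+</PRE>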
+<P>
+Just like old-style GF files (previous section), context-free grammar
+files can be converted to modular GF by using the <CODE>-o</CODE> option to
+the compiler in GF 2.8.
+</P>
+<A NAME="toc56"></A>
+<H3>Extended BNF grammars</H3>
+<P>
+Extended BNF (<CODE>FILE.ebnf</CODE>)
+goes one step further than the shortcut notation of the previous section.
+The rules have the form
+<center>
+<I>Cat</I> <CODE>::=</CODE> <I>RHS</I> <CODE>;</CODE>
+</center>
+where an <I>RHS</I> can be any regular expression
+built from quoted strings and category symbols, in the following ways:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>RHS item</TH>
+<TH COLSPAN="2">explanation</TH>
+</TR>
+<TR>
+<TD><I>Cat</I></TD>
+<TD>nonterminal</TD>
+</TR>
+<TR>
+<TD><I>String</I></TD>
+<TD>terminal</TD>
+</TR>
+<TR>
+<TD><I>RHS</I> <I>RHS</I></TD>
+<TD>sequence</TD>
+</TR>
+<TR>
+<TD><I>RHS</I> <CODE>|</CODE> <I>RHS</I></TD>
+<TD>alternatives</TD>
+</TR>
+<TR>
+<TD><I>RHS</I> <CODE>?</CODE></TD>
+<TD>optional</TD>
+</TR>
+<TR>
+<TD><I>RHS</I> <CODE>*</CODE></TD>
+<TD>repetition</TD>
+</TR>
+<TR>
+<TD><I>RHS</I> <CODE>+</CODE></TD>
+<TD>non-empty repetition</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+Parentheses are used to override standard precedences, where
+<CODE>|</CODE> binds weaker than sequencing, which binds weaker than the unary operations.
+</P>
+<P>
+The compiler generates not only labels, but also new categories corresponding
+to the regular expression combinations actually in use.
+</P>
+<P>
+Just like <CODE>.cf</CODE> files (previous section), <CODE>.ebnf</CODE>
+files can be converted to modular GF by using the <CODE>-o</CODE> option to
+the compiler in GF 2.8.
+</P>
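+<P>
+As an illustration of the notation, an <CODE>.ebnf</CODE> file might contain the
+rules below; the categories and words are invented for the example.
+The parentheses in the <CODE>NP</CODE> rule are needed because
+<CODE>|</CODE> binds weaker than sequencing.
+</P>
+<PRE>
+  S   ::= NP VP ;
+  NP  ::= ("the" | "a") Adj* N ;
+  VP  ::= V NP? ;
+  Adj ::= "old" | "young" ;
+  N   ::= "man" | "woman" ;
+  V   ::= "sees" | "sleeps" ;
+</PRE>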
+<A NAME="toc57"></A>
+<H3>Example-based grammars</H3>
+<P>
+<B>Example-based grammars</B> (<CODE>.gfe</CODE>) provide a way to use
+resource grammar libraries without having to know the names
+of functions in them. The compiler works as a preprocessor,
+saving the result in a (<CODE>.gf</CODE>) file, which can be compiled
+as usual.
+</P>
+<P>
+If a library is implemented as an abstract and concrete syntax,
+it can be used for parsing. Calls of library functions can therefore
+be formed by parsing strings in the library. GF has an expression
+format for this,
+<center>
+<CODE>in</CODE> <I>C</I> <I>String</I>
+</center>
+where <I>C</I> is the category in which to parse (it can be qualified by
+the module name) and the string is the input to the parser. Expressions
+of this form are replaced by the syntax trees that result. These
+trees are always type-correct. If several parses are found, all but
+the first one are given in comments.
+</P>
+<P>
+Here is an example, from <CODE>GF/examples/animal/</CODE>:
+</P>
+<PRE>
+ --# -resource=../../lib/present/LangEng.gfc
+ --# -path=.:present:prelude
+
+ incomplete concrete QuestionsI of Questions = open Lang in {
+ lincat
+ Phrase = Phr ;
+ Entity = N ;
+ Action = V2 ;
+ lin
+ Who love_V2 man_N = in Phr "who loves men" ;
+ Whom man_N love_V2 = in Phr "whom does the man love" ;
+ Answer woman_N love_V2 man_N = in Phr "the woman loves men" ;
+ }
+</PRE>
+<P>
+The <CODE>resource</CODE> pragma shows the grammar that is used for parsing
+the examples.
+</P>
+<P>
+Notice that the variables <CODE>love_V2</CODE>, <CODE>man_N</CODE>, etc., are
+actually constants in the library. In the resulting rules, such as
+</P>
+<PRE>
+ lin Whom = \man_N -&gt; \love_V2 -&gt;
+ PhrUtt NoPConj (UttQS (UseQCl TPres ASimul PPos
+ (QuestSlash whoPl_IP (SlashV2 (DetCN (DetSg (SgQuant
+ DefArt)NoOrd)(UseN man_N)) love_V2)))) NoVoc ;
+</PRE>
+<P>
+those constants are nonetheless treated as variables, following
+the normal binding conventions, as stated <a href="#renaming">here</a>.
+</P>
+<A NAME="toc58"></A>
+<H2>The grammar of GF</H2>
+<P>
+The following grammar is actually used in the parser of GF, although we have
+omitted
+some obsolete rules still included in the parser for backward compatibility
+reasons.
+</P>
+<P>
+This document was automatically generated by the <I>BNF-Converter</I>. It was generated together with the lexer, the parser, and the abstract syntax module, which guarantees that the document matches with the implementation of the language (provided no hand-hacking has taken place).
+</P>
+<A NAME="toc59"></A>
+<H2>The lexical structure of GF</H2>
+<A NAME="toc60"></A>
+<H3>Identifiers</H3>
+<P>
+Identifiers <I>Ident</I> are unquoted strings beginning with a letter,
+followed by any combination of letters, digits, and the characters <CODE>_</CODE> and <CODE>'</CODE>,
+reserved words excluded.
+</P>
+<A NAME="toc61"></A>
+<H3>Literals</H3>
+<P>
+Integer literals <I>Integer</I> are nonempty sequences of digits.
+</P>
+<P>
+String literals <I>String</I> have the form
+<CODE>"</CODE><I>x</I><CODE>"</CODE>, where <I>x</I> is any sequence of any characters
+except <CODE>"</CODE> unless preceded by <CODE>\</CODE>.
+</P>
+<P>
+Double-precision float literals <I>Double</I> have the structure
+indicated by the regular expression <CODE>digit+ '.' digit+ ('e' ('-')? digit+)?</CODE>, i.e.
+two sequences of digits separated by a decimal point, optionally
+followed by an unsigned or negative exponent.
+</P>
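+<P>
+For illustration, here are some tokens of each kind, chosen to match the
+definitions above (the particular identifiers are arbitrary):
+</P>
+<PRE>
+  love_V2   x'   Noun2         -- identifiers
+  0   214                      -- integer literals
+  "wine"   "say \"hi\""        -- string literals; \" is an escaped quote
+  3.14   1.0e-3   2.5e10       -- double-precision float literals
+</PRE>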
+<A NAME="toc62"></A>
+<H3>Reserved words and symbols</H3>
+<P>
+The set of reserved words is the set of terminals appearing in the grammar. Those reserved words that consist of non-letter characters are called symbols, and they are treated in a different way from those that are similar to identifiers. The lexer follows rules familiar from languages like Haskell, C, and Java, including longest match and spacing conventions.
+</P>
+<P>
+The reserved words used in GF are the following:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4">
+<TR>
+<TD><CODE>PType</CODE></TD>
+<TD><CODE>Str</CODE></TD>
+<TD><CODE>Strs</CODE></TD>
+<TD><CODE>Type</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>abstract</CODE></TD>
+<TD><CODE>case</CODE></TD>
+<TD><CODE>cat</CODE></TD>
+<TD><CODE>concrete</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>data</CODE></TD>
+<TD><CODE>def</CODE></TD>
+<TD><CODE>flags</CODE></TD>
+<TD><CODE>fun</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>in</CODE></TD>
+<TD><CODE>incomplete</CODE></TD>
+<TD><CODE>instance</CODE></TD>
+<TD><CODE>interface</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>let</CODE></TD>
+<TD><CODE>lin</CODE></TD>
+<TD><CODE>lincat</CODE></TD>
+<TD><CODE>lindef</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>of</CODE></TD>
+<TD><CODE>open</CODE></TD>
+<TD><CODE>oper</CODE></TD>
+<TD><CODE>param</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>pre</CODE></TD>
+<TD><CODE>printname</CODE></TD>
+<TD><CODE>resource</CODE></TD>
+<TD><CODE>strs</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>table</CODE></TD>
+<TD><CODE>transfer</CODE></TD>
+<TD><CODE>variants</CODE></TD>
+<TD><CODE>where</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>with</CODE></TD>
+<TD></TD>
+<TD></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+The symbols used in GF are the following:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4">
+<TR>
+<TD>;</TD>
+<TD>=</TD>
+<TD>:</TD>
+<TD>-&gt;</TD>
+</TR>
+<TR>
+<TD>{</TD>
+<TD>}</TD>
+<TD>**</TD>
+<TD>,</TD>
+</TR>
+<TR>
+<TD>(</TD>
+<TD>)</TD>
+<TD>[</TD>
+<TD>]</TD>
+</TR>
+<TR>
+<TD>-</TD>
+<TD>.</TD>
+<TD>|</TD>
+<TD>?</TD>
+</TR>
+<TR>
+<TD>&lt;</TD>
+<TD>&gt;</TD>
+<TD>@</TD>
+<TD>!</TD>
+</TR>
+<TR>
+<TD>*</TD>
+<TD>+</TD>
+<TD>++</TD>
+<TD>\</TD>
+</TR>
+<TR>
+<TD>=&gt;</TD>
+<TD>_</TD>
+<TD>$</TD>
+<TD>/</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<A NAME="toc63"></A>
+<H3>Comments</H3>
+<P>
+Single-line comments begin with <CODE>--</CODE>. Multiple-line comments are enclosed with <CODE>{-</CODE> and <CODE>-}</CODE>.
+</P>
+<A NAME="toc64"></A>
+<H2>The syntactic structure of GF</H2>
+<P>
+Non-terminals are enclosed between &lt; and &gt;.
+The symbols -&gt; (production), <B>|</B> (union)
+and <B>eps</B> (empty rule) belong to the BNF notation.
+All other symbols are terminals.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4">
+<TR>
+<TD><I>Grammar</I></TD>
+<TD>-&gt;</TD>
+<TD><I>[ModDef]</I></TD>
+</TR>
+<TR>
+<TD><I>[ModDef]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>ModDef</I> <I>[ModDef]</I></TD>
+</TR>
+<TR>
+<TD><I>ModDef</I></TD>
+<TD>-&gt;</TD>
+<TD><I>ModDef</I> <CODE>;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>ComplMod</I> <I>ModType</I> <CODE>=</CODE> <I>ModBody</I></TD>
+</TR>
+<TR>
+<TD><I>ModType</I></TD>
+<TD>-&gt;</TD>
+<TD><CODE>abstract</CODE> <I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>resource</CODE> <I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>interface</CODE> <I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>concrete</CODE> <I>Ident</I> <CODE>of</CODE> <I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>instance</CODE> <I>Ident</I> <CODE>of</CODE> <I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>transfer</CODE> <I>Ident</I> <CODE>:</CODE> <I>Open</I> <CODE>-&gt;</CODE> <I>Open</I></TD>
+</TR>
+<TR>
+<TD><I>ModBody</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Extend</I> <I>Opens</I> <CODE>{</CODE> <I>[TopDef]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>[Included]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Included</I> <CODE>with</CODE> <I>[Open]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Included</I> <CODE>with</CODE> <I>[Open]</I> <CODE>**</CODE> <I>Opens</I> <CODE>{</CODE> <I>[TopDef]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>[Included]</I> <CODE>**</CODE> <I>Included</I> <CODE>with</CODE> <I>[Open]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>[Included]</I> <CODE>**</CODE> <I>Included</I> <CODE>with</CODE> <I>[Open]</I> <CODE>**</CODE> <I>Opens</I> <CODE>{</CODE> <I>[TopDef]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD><I>[TopDef]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>TopDef</I> <I>[TopDef]</I></TD>
+</TR>
+<TR>
+<TD><I>Extend</I></TD>
+<TD>-&gt;</TD>
+<TD><I>[Included]</I> <CODE>**</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD><I>[Open]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Open</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Open</I> <CODE>,</CODE> <I>[Open]</I></TD>
+</TR>
+<TR>
+<TD><I>Opens</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>open</CODE> <I>[Open]</I> <CODE>in</CODE></TD>
+</TR>
+<TR>
+<TD><I>Open</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>(</CODE> <I>QualOpen</I> <I>Ident</I> <CODE>)</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>(</CODE> <I>QualOpen</I> <I>Ident</I> <CODE>=</CODE> <I>Ident</I> <CODE>)</CODE></TD>
+</TR>
+<TR>
+<TD><I>ComplMod</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>incomplete</CODE></TD>
+</TR>
+<TR>
+<TD><I>QualOpen</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD><I>[Included]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Included</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Included</I> <CODE>,</CODE> <I>[Included]</I></TD>
+</TR>
+<TR>
+<TD><I>Included</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I> <CODE>[</CODE> <I>[Ident]</I> <CODE>]</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I> <CODE>-</CODE> <CODE>[</CODE> <I>[Ident]</I> <CODE>]</CODE></TD>
+</TR>
+<TR>
+<TD><I>Def</I></TD>
+<TD>-&gt;</TD>
+<TD><I>[Name]</I> <CODE>:</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>[Name]</I> <CODE>=</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Name</I> <I>[Patt]</I> <CODE>=</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>[Name]</I> <CODE>:</CODE> <I>Exp</I> <CODE>=</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD><I>TopDef</I></TD>
+<TD>-&gt;</TD>
+<TD><CODE>cat</CODE> <I>[CatDef]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>fun</CODE> <I>[FunDef]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>data</CODE> <I>[FunDef]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>def</CODE> <I>[Def]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>data</CODE> <I>[DataDef]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>param</CODE> <I>[ParDef]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>oper</CODE> <I>[Def]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>lincat</CODE> <I>[PrintDef]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>lindef</CODE> <I>[Def]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>lin</CODE> <I>[Def]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>printname</CODE> <CODE>cat</CODE> <I>[PrintDef]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>printname</CODE> <CODE>fun</CODE> <I>[PrintDef]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>flags</CODE> <I>[FlagDef]</I></TD>
+</TR>
+<TR>
+<TD><I>CatDef</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I> <I>[DDecl]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>[</CODE> <I>Ident</I> <I>[DDecl]</I> <CODE>]</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>[</CODE> <I>Ident</I> <I>[DDecl]</I> <CODE>]</CODE> <CODE>{</CODE> <I>Integer</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD><I>FunDef</I></TD>
+<TD>-&gt;</TD>
+<TD><I>[Ident]</I> <CODE>:</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD><I>DataDef</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I> <CODE>=</CODE> <I>[DataConstr]</I></TD>
+</TR>
+<TR>
+<TD><I>DataConstr</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I> <CODE>.</CODE> <I>Ident</I></TD>
+</TR>
+<TR>
+<TD><I>[DataConstr]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>DataConstr</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>DataConstr</I> <CODE>|</CODE> <I>[DataConstr]</I></TD>
+</TR>
+<TR>
+<TD><I>ParDef</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I> <CODE>=</CODE> <I>[ParConstr]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I> <CODE>=</CODE> <CODE>(</CODE> <CODE>in</CODE> <I>Ident</I> <CODE>)</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD><I>ParConstr</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I> <I>[DDecl]</I></TD>
+</TR>
+<TR>
+<TD><I>PrintDef</I></TD>
+<TD>-&gt;</TD>
+<TD><I>[Name]</I> <CODE>=</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD><I>FlagDef</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I> <CODE>=</CODE> <I>Ident</I></TD>
+</TR>
+<TR>
+<TD><I>[Def]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Def</I> <CODE>;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Def</I> <CODE>;</CODE> <I>[Def]</I></TD>
+</TR>
+<TR>
+<TD><I>[CatDef]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>CatDef</I> <CODE>;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>CatDef</I> <CODE>;</CODE> <I>[CatDef]</I></TD>
+</TR>
+<TR>
+<TD><I>[FunDef]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>FunDef</I> <CODE>;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>FunDef</I> <CODE>;</CODE> <I>[FunDef]</I></TD>
+</TR>
+<TR>
+<TD><I>[DataDef]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>DataDef</I> <CODE>;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>DataDef</I> <CODE>;</CODE> <I>[DataDef]</I></TD>
+</TR>
+<TR>
+<TD><I>[ParDef]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>ParDef</I> <CODE>;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>ParDef</I> <CODE>;</CODE> <I>[ParDef]</I></TD>
+</TR>
+<TR>
+<TD><I>[PrintDef]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>PrintDef</I> <CODE>;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>PrintDef</I> <CODE>;</CODE> <I>[PrintDef]</I></TD>
+</TR>
+<TR>
+<TD><I>[FlagDef]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>FlagDef</I> <CODE>;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>FlagDef</I> <CODE>;</CODE> <I>[FlagDef]</I></TD>
+</TR>
+<TR>
+<TD><I>[ParConstr]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>ParConstr</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>ParConstr</I> <CODE>|</CODE> <I>[ParConstr]</I></TD>
+</TR>
+<TR>
+<TD><I>[Ident]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I> <CODE>,</CODE> <I>[Ident]</I></TD>
+</TR>
+<TR>
+<TD><I>Name</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>[</CODE> <I>Ident</I> <CODE>]</CODE></TD>
+</TR>
+<TR>
+<TD><I>[Name]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Name</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Name</I> <CODE>,</CODE> <I>[Name]</I></TD>
+</TR>
+<TR>
+<TD><I>LocDef</I></TD>
+<TD>-&gt;</TD>
+<TD><I>[Ident]</I> <CODE>:</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>[Ident]</I> <CODE>=</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>[Ident]</I> <CODE>:</CODE> <I>Exp</I> <CODE>=</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD><I>[LocDef]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>LocDef</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>LocDef</I> <CODE>;</CODE> <I>[LocDef]</I></TD>
+</TR>
+<TR>
+<TD><I>Exp6</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Sort</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>String</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Integer</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Double</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>?</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>[</CODE> <CODE>]</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>data</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>[</CODE> <I>Ident</I> <I>Exps</I> <CODE>]</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>[</CODE> <I>String</I> <CODE>]</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>{</CODE> <I>[LocDef]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>&lt;</CODE> <I>[TupleComp]</I> <CODE>&gt;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>&lt;</CODE> <I>Exp</I> <CODE>:</CODE> <I>Exp</I> <CODE>&gt;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>(</CODE> <I>Exp</I> <CODE>)</CODE></TD>
+</TR>
+<TR>
+<TD><I>Exp5</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Exp5</I> <CODE>.</CODE> <I>Label</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp6</I></TD>
+</TR>
+<TR>
+<TD><I>Exp4</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Exp4</I> <I>Exp5</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>table</CODE> <CODE>{</CODE> <I>[Case]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>table</CODE> <I>Exp6</I> <CODE>{</CODE> <I>[Case]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>table</CODE> <I>Exp6</I> <CODE>[</CODE> <I>[Exp]</I> <CODE>]</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>case</CODE> <I>Exp</I> <CODE>of</CODE> <CODE>{</CODE> <I>[Case]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>variants</CODE> <CODE>{</CODE> <I>[Exp]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>pre</CODE> <CODE>{</CODE> <I>Exp</I> <CODE>;</CODE> <I>[Altern]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>strs</CODE> <CODE>{</CODE> <I>[Exp]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I> <CODE>@</CODE> <I>Exp6</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp5</I></TD>
+</TR>
+<TR>
+<TD><I>Exp3</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Exp3</I> <CODE>!</CODE> <I>Exp4</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp3</I> <CODE>*</CODE> <I>Exp4</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp3</I> <CODE>**</CODE> <I>Exp4</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp4</I></TD>
+</TR>
+<TR>
+<TD><I>Exp1</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Exp2</I> <CODE>+</CODE> <I>Exp1</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp2</I></TD>
+</TR>
+<TR>
+<TD><I>Exp</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Exp1</I> <CODE>++</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>\</CODE> <I>[Bind]</I> <CODE>-&gt;</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>\</CODE> <CODE>\</CODE> <I>[Bind]</I> <CODE>=&gt;</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Decl</I> <CODE>-&gt;</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp3</I> <CODE>=&gt;</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>let</CODE> <CODE>{</CODE> <I>[LocDef]</I> <CODE>}</CODE> <CODE>in</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>let</CODE> <I>[LocDef]</I> <CODE>in</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp3</I> <CODE>where</CODE> <CODE>{</CODE> <I>[LocDef]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>in</CODE> <I>Exp5</I> <I>String</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp1</I></TD>
+</TR>
+<TR>
+<TD><I>Exp2</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Exp3</I></TD>
+</TR>
+<TR>
+<TD><I>[Exp]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp</I> <CODE>;</CODE> <I>[Exp]</I></TD>
+</TR>
+<TR>
+<TD><I>Exps</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp6</I> <I>Exps</I></TD>
+</TR>
+<TR>
+<TD><I>Patt2</I></TD>
+<TD>-&gt;</TD>
+<TD><CODE>_</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I> <CODE>.</CODE> <I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Integer</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Double</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>String</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>{</CODE> <I>[PattAss]</I> <CODE>}</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>&lt;</CODE> <I>[PattTupleComp]</I> <CODE>&gt;</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>(</CODE> <I>Patt</I> <CODE>)</CODE></TD>
+</TR>
+<TR>
+<TD><I>Patt1</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I> <I>[Patt]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I> <CODE>.</CODE> <I>Ident</I> <I>[Patt]</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Patt2</I> <CODE>*</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Ident</I> <CODE>@</CODE> <I>Patt2</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>-</CODE> <I>Patt2</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Patt2</I></TD>
+</TR>
+<TR>
+<TD><I>Patt</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Patt</I> <CODE>|</CODE> <I>Patt1</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Patt</I> <CODE>+</CODE> <I>Patt1</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Patt1</I></TD>
+</TR>
+<TR>
+<TD><I>PattAss</I></TD>
+<TD>-&gt;</TD>
+<TD><I>[Ident]</I> <CODE>=</CODE> <I>Patt</I></TD>
+</TR>
+<TR>
+<TD><I>Label</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>$</CODE> <I>Integer</I></TD>
+</TR>
+<TR>
+<TD><I>Sort</I></TD>
+<TD>-&gt;</TD>
+<TD><CODE>Type</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>PType</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>Str</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>Strs</CODE></TD>
+</TR>
+<TR>
+<TD><I>[PattAss]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>PattAss</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>PattAss</I> <CODE>;</CODE> <I>[PattAss]</I></TD>
+</TR>
+<TR>
+<TD><I>[Patt]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Patt2</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Patt2</I> <I>[Patt]</I></TD>
+</TR>
+<TR>
+<TD><I>Bind</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Ident</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><CODE>_</CODE></TD>
+</TR>
+<TR>
+<TD><I>[Bind]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Bind</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Bind</I> <CODE>,</CODE> <I>[Bind]</I></TD>
+</TR>
+<TR>
+<TD><I>Decl</I></TD>
+<TD>-&gt;</TD>
+<TD><CODE>(</CODE> <I>[Bind]</I> <CODE>:</CODE> <I>Exp</I> <CODE>)</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp4</I></TD>
+</TR>
+<TR>
+<TD><I>TupleComp</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Exp</I></TD>
+</TR>
+<TR>
+<TD><I>PattTupleComp</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Patt</I></TD>
+</TR>
+<TR>
+<TD><I>[TupleComp]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>TupleComp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>TupleComp</I> <CODE>,</CODE> <I>[TupleComp]</I></TD>
+</TR>
+<TR>
+<TD><I>[PattTupleComp]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>PattTupleComp</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>PattTupleComp</I> <CODE>,</CODE> <I>[PattTupleComp]</I></TD>
+</TR>
+<TR>
+<TD><I>Case</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Patt</I> <CODE>=&gt;</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD><I>[Case]</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Case</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Case</I> <CODE>;</CODE> <I>[Case]</I></TD>
+</TR>
+<TR>
+<TD><I>Altern</I></TD>
+<TD>-&gt;</TD>
+<TD><I>Exp</I> <CODE>/</CODE> <I>Exp</I></TD>
+</TR>
+<TR>
+<TD><I>[Altern]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Altern</I></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Altern</I> <CODE>;</CODE> <I>[Altern]</I></TD>
+</TR>
+<TR>
+<TD><I>DDecl</I></TD>
+<TD>-&gt;</TD>
+<TD><CODE>(</CODE> <I>[Bind]</I> <CODE>:</CODE> <I>Exp</I> <CODE>)</CODE></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>Exp6</I></TD>
+</TR>
+<TR>
+<TD><I>[DDecl]</I></TD>
+<TD>-&gt;</TD>
+<TD><B>eps</B></TD>
+</TR>
+<TR>
+<TD></TD>
+<TD ALIGN="center"><B>|</B></TD>
+<TD><I>DDecl</I> <I>[DDecl]</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
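+<P>
+To see how the productions above fit together, here is a minimal sketch of two modules.
+All names in it (<CODE>Hello</CODE>, <CODE>HelloEng</CODE>, <CODE>Greeting</CODE>,
+<CODE>Recipient</CODE>, <CODE>World</CODE>) are assumptions made for the example and are not
+defined by this manual. Each module is a <I>ModDef</I> with an empty <I>ComplMod</I>,
+a <I>ModType</I> (<CODE>abstract</CODE> or <CODE>concrete</CODE> ... <CODE>of</CODE>), and a
+<I>ModBody</I> consisting of a list of <I>TopDef</I>s in braces.
+</P>
+<PRE>
+  -- ModType: abstract Ident
+  abstract Hello = {
+    flags startcat = Greeting ;          -- TopDef: flags [FlagDef]
+    cat Greeting ; Recipient ;           -- TopDef: cat [CatDef]
+    fun                                  -- TopDef: fun [FunDef]
+      Hello : Recipient -&gt; Greeting ;
+      World : Recipient ;
+  }
+
+  -- ModType: concrete Ident of Ident
+  concrete HelloEng of Hello = {
+    lincat Greeting, Recipient = {s : Str} ;   -- TopDef: lincat [PrintDef]
+    lin                                        -- TopDef: lin [Def]
+      Hello recip = {s = "hello" ++ recip.s} ; -- Def: Name [Patt] = Exp
+      World = {s = "world"} ;                  -- Def: [Name] = Exp
+  }
+</PRE>
+<P></P>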
+
+<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
+<!-- cmdline: txt2tags -thtml -\-toc gf-refman.txt -->
+</BODY></HTML>
diff --git a/doc/gf-tutorial.html b/doc/gf-tutorial.html
new file mode 100644
index 000000000..1e6d961b8
--- /dev/null
+++ b/doc/gf-tutorial.html
@@ -0,0 +1,7952 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
+<HTML>
+<HEAD>
+<META NAME="generator" CONTENT="http://txt2tags.sf.net">
+<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
+<TITLE>Grammatical Framework Tutorial</TITLE>
+</HEAD><BODY BGCOLOR="white" TEXT="black">
+<P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1>
+<FONT SIZE="4">
+<I>Aarne Ranta</I><BR>
+Draft, November 2007
+</FONT></CENTER>
+
+<P></P>
+<HR NOSHADE SIZE=1>
+<P></P>
+ <UL>
+ <LI><A HREF="#toc1">Getting started with GF</A>
+ <UL>
+ <LI><A HREF="#toc2">What GF is</A>
+ <LI><A HREF="#toc3">Getting the GF system</A>
+ <LI><A HREF="#toc4">Running the GF system</A>
+ <LI><A HREF="#toc5">A "Hello World" grammar</A>
+ <UL>
+ <LI><A HREF="#toc6">The program: abstract syntax and concrete syntaxes</A>
+ <LI><A HREF="#toc7">Using the grammar in the GF system</A>
+ </UL>
+ <LI><A HREF="#toc8">Using grammars from outside GF</A>
+ <LI><A HREF="#toc9">What else can be done with the grammar</A>
+ <LI><A HREF="#toc10">Summary of GF language features</A>
+ <UL>
+ <LI><A HREF="#toc11">Modules</A>
+ <LI><A HREF="#toc12">Judgements</A>
+ <LI><A HREF="#toc13">Types and terms</A>
+ <LI><A HREF="#toc14">Type checking</A>
+ </UL>
+ </UL>
+ <LI><A HREF="#toc15">Designing a grammar for complex phrases</A>
+ <UL>
+ <LI><A HREF="#toc16">The abstract syntax Food</A>
+ <LI><A HREF="#toc17">The concrete syntax FoodEng</A>
+ <LI><A HREF="#toc18">Commands for testing grammars</A>
+ <UL>
+ <LI><A HREF="#toc19">Generating trees and strings</A>
+ <LI><A HREF="#toc20">More on pipes; tracing</A>
+ <LI><A HREF="#toc21">Writing and reading files</A>
+ <LI><A HREF="#toc22">Visualizing trees</A>
+ <LI><A HREF="#toc23">System commands</A>
+ </UL>
+ <LI><A HREF="#toc24">An Italian concrete syntax</A>
+ <LI><A HREF="#toc25">Free variation</A>
+ <LI><A HREF="#toc26">More application of multilingual grammars</A>
+ <UL>
+ <LI><A HREF="#toc27">Multilingual treebanks</A>
+ <LI><A HREF="#toc28">Translation session</A>
+ <LI><A HREF="#toc29">Translation quiz</A>
+ <LI><A HREF="#toc30">Multilingual syntax editing</A>
+ </UL>
+ <LI><A HREF="#toc31">Context-free grammars and GF</A>
+ <UL>
+ <LI><A HREF="#toc32">The "cf" grammar format</A>
+ <LI><A HREF="#toc33">Restrictions of context-free grammars</A>
+ </UL>
+ <LI><A HREF="#toc34">Modules and files</A>
+ <LI><A HREF="#toc35">Using operations and resource modules</A>
+ <UL>
+ <LI><A HREF="#toc36">The golden rule of functional programming</A>
+ <LI><A HREF="#toc37">Operation definitions</A>
+ <LI><A HREF="#toc38">The ``resource`` module type</A>
+ <LI><A HREF="#toc39">Opening a resource</A>
+ <LI><A HREF="#toc40">Partial application</A>
+ <LI><A HREF="#toc41">Testing resource modules</A>
+ </UL>
+ <LI><A HREF="#toc42">Grammar architecture</A>
+ <UL>
+ <LI><A HREF="#toc43">Extending a grammar</A>
+ <LI><A HREF="#toc44">Multiple inheritance</A>
+ <LI><A HREF="#toc45">Visualizing module structure</A>
+ </UL>
+ <LI><A HREF="#toc46">Summary of GF language features</A>
+ <UL>
+ <LI><A HREF="#toc47">Modules</A>
+ <LI><A HREF="#toc48">Judgements</A>
+ <LI><A HREF="#toc49">Free variation</A>
+ <LI><A HREF="#toc50">The context-free grammar format</A>
+ <LI><A HREF="#toc51">Character encoding</A>
+ </UL>
+ </UL>
+ <LI><A HREF="#toc52">Grammars with parameters</A>
+ <UL>
+ <LI><A HREF="#toc53">The problem: words have to be inflected</A>
+ <LI><A HREF="#toc54">Parameters and tables</A>
+ <LI><A HREF="#toc55">Inflection tables and paradigms</A>
+ <LI><A HREF="#toc56">Using parameters in concrete syntax</A>
+ <UL>
+ <LI><A HREF="#toc57">Agreement</A>
+ <LI><A HREF="#toc58">Determiners</A>
+ <LI><A HREF="#toc59">Parametric vs. inherent features</A>
+ </UL>
+ <LI><A HREF="#toc60">An English concrete syntax for Foods with parameters</A>
+ <LI><A HREF="#toc61">More on inflection paradigms</A>
+ <UL>
+ <LI><A HREF="#toc62">Worst-case functions</A>
+ <LI><A HREF="#toc63">Intelligent paradigms</A>
+ <LI><A HREF="#toc64">Function types with variables</A>
+ <LI><A HREF="#toc65">Separating operation types and definitions</A>
+ <LI><A HREF="#toc66">Overloading of operations</A>
+ <LI><A HREF="#toc67">Morphological analysis and morphology quiz</A>
+ </UL>
+ <LI><A HREF="#toc68">The Italian Foods grammar</A>
+ <LI><A HREF="#toc69">Discontinuous constituents</A>
+ <LI><A HREF="#toc70">Strings at compile time vs. run time</A>
+ <LI><A HREF="#toc71">Summary of GF language features</A>
+ <UL>
+ <LI><A HREF="#toc72">Parameter and table types</A>
+ <LI><A HREF="#toc73">Pattern matching</A>
+ <LI><A HREF="#toc74">Overloading</A>
+ <LI><A HREF="#toc75">Local definitions</A>
+ <LI><A HREF="#toc76">Supplementary constructs</A>
+ </UL>
+ </UL>
+ <LI><A HREF="#toc77">Using the resource grammar library</A>
+ <UL>
+ <LI><A HREF="#toc78">The coverage of the library</A>
+ <LI><A HREF="#toc79">The structure of the library</A>
+ <UL>
+ <LI><A HREF="#toc80">Lexical vs. phrasal rules</A>
+ <LI><A HREF="#toc81">Lexical categories</A>
+ <LI><A HREF="#toc82">Lexical rules</A>
+ <LI><A HREF="#toc83">Phrasal categories</A>
+ </UL>
+ <LI><A HREF="#toc84">The resource API</A>
+ <LI><A HREF="#toc85">Example: English</A>
+ <LI><A HREF="#toc86">Functor implementation of multilingual grammars</A>
+ <LI><A HREF="#toc87">Interfaces and instances</A>
+ <LI><A HREF="#toc88">Adding languages to a functor implementation</A>
+ <LI><A HREF="#toc89">Division of labour revisited</A>
+ <LI><A HREF="#toc90">Restricted inheritance</A>
+ <LI><A HREF="#toc91">Grammar reuse</A>
+ <LI><A HREF="#toc92">Browsing the resource with GF commands</A>
+ <LI><A HREF="#toc93">An extended Foods grammar</A>
+ <UL>
+ <LI><A HREF="#toc94">Abstract syntax</A>
+ <LI><A HREF="#toc95">Linearization types</A>
+ <LI><A HREF="#toc96">Linearization rules</A>
+ </UL>
+ <LI><A HREF="#toc97">Tenses</A>
+ <LI><A HREF="#toc98">Summary of GF language features</A>
+ <UL>
+ <LI><A HREF="#toc99">Interfaces and instances</A>
+ <LI><A HREF="#toc100">Grammar reuse</A>
+ <LI><A HREF="#toc101">Functors</A>
+ <LI><A HREF="#toc102">Restricted inheritance</A>
+ </UL>
+ </UL>
+ <LI><A HREF="#toc103">Refining semantics in abstract syntax</A>
+ <UL>
+ <LI><A HREF="#toc104">GF as a logical framework</A>
+ <LI><A HREF="#toc105">Dependent types</A>
+ <LI><A HREF="#toc106">Polymorphism</A>
+ <UL>
+ <LI><A HREF="#toc107">Digression: dependent types in concrete syntax</A>
+ </UL>
+ <LI><A HREF="#toc108">Proof objects</A>
+ <UL>
+ <LI><A HREF="#toc109">Proof-carrying documents</A>
+ </UL>
+ <LI><A HREF="#toc110">Restricted polymorphism</A>
+ <LI><A HREF="#toc111">Variable bindings</A>
+ <LI><A HREF="#toc112">Semantic definitions</A>
+ <LI><A HREF="#toc113">Summary of GF language features</A>
+ <UL>
+ <LI><A HREF="#toc114">Judgements</A>
+ <LI><A HREF="#toc115">Dependent function types</A>
+ </UL>
+ </UL>
+ <LI><A HREF="#toc116">Grammars of formal languages</A>
+ <UL>
+ <LI><A HREF="#toc117">Arithmetic expressions</A>
+ <UL>
+ <LI><A HREF="#toc118">Abstract syntax</A>
+ <LI><A HREF="#toc119">Concrete syntax: a simple approach</A>
+ </UL>
+ <LI><A HREF="#toc120">Lexing and unlexing</A>
+ <LI><A HREF="#toc121">Precedence and fixity</A>
+ <LI><A HREF="#toc122">Code generation as linearization</A>
+ <LI><A HREF="#toc123">Speaking aloud arithmetic expressions</A>
+ <LI><A HREF="#toc124">Programs with variables</A>
+ <UL>
+ <LI><A HREF="#toc125">The concrete syntax of assignments</A>
+ <LI><A HREF="#toc126">A liberal syntax of variables</A>
+ </UL>
+ <LI><A HREF="#toc127">Conclusion</A>
+ <LI><A HREF="#toc128">Summary of GF language constructs</A>
+ <UL>
+ <LI><A HREF="#toc129">Lexers and unlexers</A>
+ <LI><A HREF="#toc130">Built-in abstract syntax types</A>
+ </UL>
+ </UL>
+ <LI><A HREF="#toc131">Embedded grammars</A>
+ <UL>
+ <LI><A HREF="#toc132">The portable grammar format</A>
+ <LI><A HREF="#toc133">The embedded interpreter and its API</A>
+ <LI><A HREF="#toc134">Embedded GF applications in Haskell</A>
+ <UL>
+ <LI><A HREF="#toc135">The EmbedAPI module</A>
+ <LI><A HREF="#toc136">First application: a translator</A>
+ <LI><A HREF="#toc137">A looping translator</A>
+ <LI><A HREF="#toc138">A question-answer system</A>
+ <LI><A HREF="#toc139">Exporting GF datatypes</A>
+ <LI><A HREF="#toc140">Putting it all together</A>
+ </UL>
+ <LI><A HREF="#toc141">Embedded GF applications in Java</A>
+ <UL>
+ <LI><A HREF="#toc142">Translets</A>
+ <LI><A HREF="#toc143">Dialogue systems</A>
+ </UL>
+ <LI><A HREF="#toc144">Language models for speech recognition</A>
+ <LI><A HREF="#toc145">Dependent types and spoken language models</A>
+ <UL>
+ <LI><A HREF="#toc146">Statistical language models</A>
+ </UL>
+ </UL>
+ </UL>
+
+<P></P>
+<HR NOSHADE SIZE=1>
+<P></P>
+<P>
+<h2>Overview</h2>
+</P>
+<P>
+This tutorial gives a hands-on introduction to grammar writing in GF.
+It has been written for all programmers
+who want to learn to write grammars in GF.
+It will go through the programming concepts of GF, and also
+explain, without presupposing them, the main ingredients of GF:
+linguistics, functional programming, and type theory.
+This knowledge will be introduced as a part of grammar writing
+practice.
+Thus the tutorial should be accessible to anyone who has some
+previous experience with a programming language; the basics
+of using computers are also presupposed, e.g. the use of
+text editors and the management of files.
+</P>
+<P>
+We start in <a href="#chaptwo">the second chapter</a>
+by building a "Hello World" grammar, which covers greetings
+in three languages: English (<I>hello world</I>),
+Finnish (<I>terve maailma</I>), and Italian (<I>ciao mondo</I>).
+This <B>multilingual grammar</B> is based on the most central idea of GF:
+the distinction between <B>abstract syntax</B>
+(the logical structure) and <B>concrete syntax</B> (the
+sequence of words).
+</P>
+<P>
+From the "Hello World" example, we proceed
+in <a href="#chapthree">the third chapter</a>
+to a larger grammar for the domain of food.
+In this grammar, you can say things like
+<center>
+<I>this Italian cheese is delicious</I>
+</center>
+in English and Italian. This grammar illustrates how translation is
+more than just replacement of words. For instance, the order of
+words may have to be changed:
+<center>
+<I>Italian cheese</I>
+</P>
+<P>
+<I>formaggio italiano</I>
+</center>
+Moreover, words can have different forms, and which forms
+they have varies from language to language. For instance,
+Italian adjectives usually have four forms where English
+has just one:
+<center>
+<I>delicious</I> (<I>wine, wines, pizza, pizzas</I>)
+</P>
+<P>
+<I>vino delizioso, vini deliziosi, pizza deliziosa, pizze deliziose</I>
+</center>
+The <B>morphology</B> of a language describes the
+forms of its words, and the basics of implementing morphology and
+integrating it with syntax are covered in <a href="#chapfour">the fourth chapter</a>.
+</P>
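+<P>
+The word-order difference can be sketched with one abstract syntax and two concrete syntaxes.
+This is only an illustration: the module names (<CODE>Food</CODE>, <CODE>FoodEng</CODE>,
+<CODE>FoodIta</CODE>) and the function name <CODE>QKind</CODE> are assumptions made here,
+and the grammar developed in the third chapter may differ in its details.
+Each module would normally be placed in a file of its own.
+</P>
+<PRE>
+  -- shared abstract syntax: a quality modifies a kind
+  abstract Food = {
+    cat Kind ; Quality ;
+    fun
+      QKind   : Quality -&gt; Kind -&gt; Kind ;
+      Cheese  : Kind ;
+      Italian : Quality ;
+  }
+
+  -- English: the quality comes before the kind ("Italian cheese")
+  concrete FoodEng of Food = {
+    lincat Kind, Quality = {s : Str} ;
+    lin
+      QKind quality kind = {s = quality.s ++ kind.s} ;
+      Cheese  = {s = "cheese"} ;
+      Italian = {s = "Italian"} ;
+  }
+
+  -- Italian: the kind comes before the quality ("formaggio italiano")
+  concrete FoodIta of Food = {
+    lincat Kind, Quality = {s : Str} ;
+    lin
+      QKind quality kind = {s = kind.s ++ quality.s} ;
+      Cheese  = {s = "formaggio"} ;
+      Italian = {s = "italiano"} ;
+  }
+</PRE>
+<P>
+The sketch handles word order only; it does not yet account for the different word forms,
+which need the parameters and tables of the fourth chapter.
+</P>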
+<P>
+In GF, the complete description of the morphology and syntax of natural
+languages is preferably left to the <B>resource grammar library</B>.
+Its use is therefore an important part of GF programming, and
+it is covered in <a href="#chapfive">the fifth chapter</a>. How to contribute to resource
+grammars as an author will only be covered in Part III;
+however, the tutorial does go through all the
+programming concepts of GF, including those involved in
+resource grammars.
+</P>
+<P>
+In addition to multilinguality, <B>semantics</B> is an important aspect of GF
+grammars. The "purely linguistic" aspects (morphology and syntax) belong to
+the concrete syntax part of GF, whereas semantics is expressed in the abstract
+syntax. After the presentation of concrete syntax constructs, we proceed
+in <a href="#chapsix">the sixth chapter</a> to the enrichment of abstract syntax with <B>dependent types</B>,
+<B>variable bindings</B>, and <B>semantic definitions</B>.
+<a href="#chapseven">The seventh chapter</a> concludes the tutorial with technical tips for implementing formal
+languages. It will also illustrate the close relation between GF grammars
+and compilers by actually implementing a small compiler from C-like statements
+and expressions to machine code similar to that of the Java Virtual Machine.
+</P>
+<P>
+English and Italian are used as example languages in many grammars.
+Of course, we will not presuppose that the reader knows any Italian.
+We have chosen Italian because it has a rich structure
+that illustrates very well the capacities of GF.
+Moreover, even those readers who don't know Italian will find many of
+its words familiar, due to the Latin heritage.
+The exercises will encourage the reader to
+port the examples to other languages as well; in particular,
+it should be instructive for the reader to look at her
+own native language from the point of view of writing a grammar
+implementation.
+</P>
+<P>
+To learn how to write GF grammars is not the only goal of
+this tutorial. We will also explain the most important
+commands of the GF system, mostly in passing. With these commands,
+simple application programs, such as translation and
+quiz systems, can be built just by writing scripts for the
+GF system. More complicated applications, such as natural-language
+interfaces and dialogue systems, also require programming in
+some general-purpose language; such applications are covered in <a href="#chapeight">the eighth chapter</a>.
+</P>
+<A NAME="toc1"></A>
+<H1>Getting started with GF</H1>
+<P>
+<a name="chaptwo"></a>
+</P>
+<P>
+In this chapter, we will introduce the GF system and write the first GF grammar,
+a "Hello World" grammar. While extremely small, this grammar already illustrates
+how GF can be used for the tasks of translation and multilingual
+generation.
+</P>
+<A NAME="toc2"></A>
+<H2>What GF is</H2>
+<P>
+We use the term GF for three different things:
+</P>
+<UL>
+<LI>a <B>system</B> (computer program) used for working with grammars
+<LI>a <B>programming language</B> in which grammars can be written
+<LI>a <B>theory</B> about grammars and languages
+</UL>
+
+<P>
+The relation between these things is obvious: the GF system is an implementation
+of the GF programming language, which in turn is built on the ideas of the
+GF theory. The main focus of this book is on the GF programming language.
+We learn how grammars are written in this language. At the same time, we learn
+the way of thinking in the GF theory. To make this all useful and fun, and
+to encourage experimenting, we make the grammars run on a computer by
+using the GF system.
+</P>
+<P>
+A GF program is called a <B>grammar</B>. A grammar is, traditionally, a
+definition of a language. From this definition, different language
+processing components can be derived:
+</P>
+<UL>
+<LI><B>parsing</B>: to analyse the language
+<LI><B>linearization</B>: to generate the language
+<LI><B>translation</B>: to analyse one language and generate another
+</UL>
+
+<P>
+A GF grammar is thus a declarative program from which these
+procedures can be automatically derived. In general, a GF grammar
+is <B>multilingual</B>: it defines many languages and translations between them.
+</P>
+<A NAME="toc3"></A>
+<H2>Getting the GF system</H2>
+<P>
+The GF system is open-source free software, which can be downloaded via the
+GF Homepage:
+<center>
+<CODE>gf.digitalgrammars.com</CODE>
+</center>
+There you can download
+</P>
+<UL>
+<LI>binaries for Linux, Mac OS X, and Windows
+<LI>source code and documentation
+<LI>grammar libraries and examples
+</UL>
+
+<P>
+In particular, many of the examples in this book are included in the
+subdirectory <CODE>examples/tutorial</CODE> of the source distribution package.
+This directory is also available
+<A HREF="http://digitalgrammars.com/gf/examples/tutorial">online</A>.
+</P>
+<P>
+If you want to compile GF from source, you need a Haskell compiler.
+To compile the interactive editor, you also need a Java compiler.
+But normally you don't have to compile anything yourself, and you definitely
+don't need to know Haskell or Java to use GF.
+</P>
+<P>
+We are assuming the availability of a Unix shell. Linux and Mac OS X users
+have it automatically, the latter under the name "terminal".
+Windows users are advised to install Cygwin, a free Unix shell for Windows.
+</P>
+<A NAME="toc4"></A>
+<H2>Running the GF system</H2>
+<P>
+To start the GF system, assuming you have installed it, just type
+<CODE>gf</CODE> in the Unix (or Cygwin) shell:
+</P>
+<PRE>
+ % gf
+</PRE>
+<P>
+You will see GF's welcome message and the prompt <CODE>&gt;</CODE>.
+The command
+</P>
+<PRE>
+ &gt; help
+</PRE>
+<P>
+will give you a list of available commands.
+</P>
+<P>
+As a common convention in this book, we will use
+</P>
+<UL>
+<LI><CODE>%</CODE> as a prompt that marks system commands
+<LI><CODE>&gt;</CODE> as a prompt that marks GF commands
+</UL>
+
+<P>
+Thus you should not type these prompts, but only the characters that
+follow them.
+</P>
+<A NAME="toc5"></A>
+<H2>A "Hello World" grammar</H2>
+<P>
+The tradition in programming language tutorials is to start with a
+program that prints "Hello World" on the terminal. GF should be no
+exception. But our program has features that distinguish it from
+most "Hello World" programs:
+</P>
+<UL>
+<LI><B>Multilinguality</B>: the message is printed in many languages.
+<LI><B>Reversibility</B>: in addition to printing, you can <B>parse</B> the
+ message and <B>translate</B> it to other languages.
+</UL>
+
+<A NAME="toc6"></A>
+<H3>The program: abstract syntax and concrete syntaxes</H3>
+<P>
+A GF program, in general, is a <B>multilingual grammar</B>. Its main parts
+are
+</P>
+<UL>
+<LI>an <B>abstract syntax</B>
+<LI>one or more <B>concrete syntaxes</B>
+</UL>
+
+<P>
+The abstract syntax defines, in a language-independent way, what <B>meanings</B>
+can be expressed in the grammar. In the "Hello World" grammar we want
+to express <I>Greetings</I>, where we greet a <I>Recipient</I>, which can be
+<I>World</I> or <I>Mum</I> or <I>Friends</I>. Here is the entire
+GF code for the abstract syntax:
+</P>
+<PRE>
+ -- a "Hello World" grammar
+ abstract Hello = {
+
+ flags startcat = Greeting ;
+
+ cat Greeting ; Recipient ;
+
+ fun
+ Hello : Recipient -&gt; Greeting ;
+ World, Mum, Friends : Recipient ;
+ }
+</PRE>
+<P>
+The code has the following parts:
+</P>
+<UL>
+<LI>a <B>comment</B> (optional), saying what the module is doing
+<LI>a <B>module header</B> indicating that it is an abstract syntax
+ module named <CODE>Hello</CODE>
+<LI>a <B>module body</B> in braces, consisting of
+ <UL>
+ <LI>a <B>startcat flag declaration</B> stating that <CODE>Greeting</CODE> is the
+ main category, i.e. the one in which parsing and generation are
+ performed by default
+ <LI><B>category declarations</B> stating that <CODE>Greeting</CODE> and <CODE>Recipient</CODE>
+ are categories, i.e. types of meanings
+ <LI><B>function declarations</B> stating what meaning-building functions there
+ are; these are the function <CODE>Hello</CODE> constructing a greeting from a recipient,
+ as well as the three possible recipients
+ </UL>
+</UL>
+
+<P>
+A concrete syntax defines a mapping from the abstract meanings to their
+expressions in a language. We first give an English concrete syntax:
+</P>
+<PRE>
+ concrete HelloEng of Hello = {
+
+ lincat Greeting, Recipient = {s : Str} ;
+
+ lin
+ Hello recip = {s = "hello" ++ recip.s} ;
+ World = {s = "world"} ;
+ Mum = {s = "mum"} ;
+ Friends = {s = "friends"} ;
+ }
+</PRE>
+<P>
+The major parts of this code are:
+</P>
+<UL>
+<LI>a module header indicating that it is a concrete syntax of the abstract syntax
+ <CODE>Hello</CODE>, itself named <CODE>HelloEng</CODE>
+<LI>a module body in curly brackets, consisting of
+ <UL>
+ <LI><B>linearization type definitions</B> stating that
+ <CODE>Greeting</CODE> and <CODE>Recipient</CODE> are <B>records</B> with a <B>string</B> <CODE>s</CODE>
+ <LI><B>linearization definitions</B> telling what records are assigned to
+ each of the meanings defined in the abstract syntax; the recipients are
+ linearized to records containing single words, whereas the <CODE>Hello</CODE> greeting
+ has a function telling that the word <CODE>hello</CODE> is prefixed to the string
+ <CODE>s</CODE> contained in the record <CODE>recip</CODE>
+ </UL>
+</UL>
+
+<P>
+To make the grammar truly multilingual, we add a Finnish and an Italian concrete
+syntax:
+</P>
+<PRE>
+ concrete HelloFin of Hello = {
+ lincat Greeting, Recipient = {s : Str} ;
+ lin
+ Hello recip = {s = "terve" ++ recip.s} ;
+ World = {s = "maailma"} ;
+ Mum = {s = "äiti"} ;
+ Friends = {s = "ystävät"} ;
+ }
+
+ concrete HelloIta of Hello = {
+ lincat Greeting, Recipient = {s : Str} ;
+ lin
+ Hello recip = {s = "ciao" ++ recip.s} ;
+ World = {s = "mondo"} ;
+ Mum = {s = "mamma"} ;
+ Friends = {s = "amici"} ;
+ }
+</PRE>
+<P>
+Now we have a trilingual grammar usable for translation and
+many other tasks, which we will now start experimenting with.
+</P>
+<A NAME="toc7"></A>
+<H3>Using the grammar in the GF system</H3>
+<P>
+In order to compile the grammar in GF, each of the four modules
+has to be put into a file named <I>Modulename</I><CODE>.gf</CODE>:
+</P>
+<PRE>
+ Hello.gf HelloEng.gf HelloFin.gf HelloIta.gf
+</PRE>
+<P>
+The first GF command needed when using a grammar is to <B>import</B> it.
+The command has a long name, <CODE>import</CODE>, and a short name, <CODE>i</CODE>.
+When you have started GF (by the shell command <CODE>gf</CODE>), you can thus type either
+</P>
+<PRE>
+ &gt; import HelloEng.gf
+</PRE>
+<P>
+or
+</P>
+<PRE>
+ &gt; i HelloEng.gf
+</PRE>
+<P>
+to get the same effect. In general, all GF commands have a long and a short name;
+short names are convenient when typing commands by hand, whereas long command
+names are more readable in scripts, i.e. files that include sequences of commands.
+</P>
+<P>
+The effect of <CODE>import</CODE> is that the GF system <B>compiles</B> your grammar
+into an internal representation, and shows a new prompt when it is ready.
+It will also show how much CPU time was consumed:
+</P>
+<PRE>
+ &gt; i HelloEng.gf
+ - compiling Hello.gf... wrote file Hello.gfc 8 msec
+ - compiling HelloEng.gf... wrote file HelloEng.gfc 12 msec
+
+ 12 msec
+ &gt;
+</PRE>
+<P>
+You can now use GF for <B>parsing</B>:
+</P>
+<PRE>
+ &gt; parse "hello world"
+ Hello World
+</PRE>
+<P>
+The <CODE>parse</CODE> (= <CODE>p</CODE>) command takes a <B>string</B>
+(in double quotes) and returns an <B>abstract syntax tree</B> --- the meaning
+of the string as defined in the abstract syntax.
+In general, a tree is easier than a string
+for a machine to understand and to process further, although this
+is not so obvious in this simple grammar. The syntax for trees is that
+of <B>function application</B>, which in GF is written
+</P>
+<PRE>
+ function argument1 ... argumentn
+</PRE>
+<P>
+Parentheses are only needed for grouping. For instance, <CODE>f (a b)</CODE> is
+<CODE>f</CODE> applied to the application of <CODE>a</CODE> to <CODE>b</CODE>. This is different
+from <CODE>f a b</CODE>, which is <CODE>f</CODE> applied to <CODE>a</CODE> and <CODE>b</CODE>.
+</P>
+<P>
+Strings that return a tree when parsed do so in virtue of the grammar
+you imported. If you try to parse something that is not in the grammar, parsing fails:
+</P>
+<PRE>
+ &gt; parse "hello dad"
+ Unknown words: dad
+
+ &gt; parse "world hello"
+ no tree found
+</PRE>
+<P>
+In the first example, the failure is caused by an unknown word.
+In the second example, the combination of words is ungrammatical.
+</P>
+<P>
+In addition to parsing, you can also use GF for <B>linearization</B>
+(<CODE>linearize = l</CODE>). This is the inverse of
+parsing, taking trees into strings:
+</P>
+<PRE>
+ &gt; linearize Hello World
+ hello world
+</PRE>
+<P>
+What is the use of this? Typically not that you type in a tree at
+the GF prompt. The utility of linearization comes from the fact that
+you can obtain a tree from somewhere else --- for instance, from
+a parser. A prime example of this is <B>translation</B>: you parse
+with one concrete syntax and linearize with another. Let us
+now do this by first importing the Italian grammar:
+</P>
+<PRE>
+ &gt; import HelloIta.gf
+</PRE>
+<P>
+We can now parse with <CODE>HelloEng</CODE> and <B>pipe</B> the result
+into linearizing with <CODE>HelloIta</CODE>:
+</P>
+<PRE>
+ &gt; parse -lang=HelloEng "hello mum" | linearize -lang=HelloIta
+ ciao mamma
+</PRE>
+<P>
+Notice that, since there are now two concrete syntaxes read into the
+system, the commands use a <B>language flag</B> to indicate
+which concrete syntax is used in each operation. If no language flag is
+given, the last-imported language is applied.
+</P>
+<P>
+To conclude the translation exercise, we import the Finnish grammar
+and pipe English parsing into <B>multilingual generation</B>:
+</P>
+<PRE>
+ &gt; import HelloFin.gf
+ &gt; parse -lang=HelloEng "hello friends" | linearize -multi
+ terve ystävät
+ ciao amici
+ hello friends
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Test the parsing and translation examples shown above, as well as
+some other examples, in different combinations of languages.
+</P>
+<P>
+<B>Exercise</B>. Extend the grammar <CODE>Hello.gf</CODE> and some of the
+concrete syntaxes by five new recipients and one new greeting
+form.
+</P>
+<P>
+<B>Exercise</B>. Add a concrete syntax for some other
+languages you might know.
+</P>
+<P>
+<B>Exercise</B>. Add a pair of greetings that are expressed in one and the same way in
+one language and in two different ways in another. For instance, <I>good morning</I>
+and <I>good afternoon</I> in English are both expressed as <I>buongiorno</I> in Italian.
+Test what happens when you translate <I>buongiorno</I> to English in GF.
+</P>
+<P>
+<B>Exercise</B>. Inject errors in the <CODE>Hello</CODE> grammars, for example, leave out
+some line, omit a variable in a <CODE>lin</CODE> rule, or change the name in one occurrence
+of a variable. Inspect the error messages generated by GF.
+</P>
+<A NAME="toc8"></A>
+<H2>Using grammars from outside GF</H2>
+<P>
+A normal "hello world" program written in C is executable from the
+Unix shell and prints its output on the terminal. This is possible in GF
+as well, by using the <CODE>gf</CODE> program in a Unix pipe. The <CODE>gf</CODE> program
+can be invoked with grammar names as arguments,
+</P>
+<PRE>
+ % gf HelloEng.gf HelloFin.gf HelloIta.gf
+</PRE>
+<P>
+which has the same effect as opening <CODE>gf</CODE> and then importing the
+grammars. A command can be sent to this <CODE>gf</CODE> state by piping it from
+Unix's <CODE>echo</CODE> command:
+</P>
+<PRE>
+ % echo "l -multi Hello World" | gf HelloEng.gf HelloFin.gf HelloIta.gf
+</PRE>
+<P>
+which will execute the command and then quit. Alternatively, one can write
+a <B>script</B>, a file containing the lines
+</P>
+<PRE>
+ import HelloEng.gf
+ import HelloFin.gf
+ import HelloIta.gf
+ linearize -multi Hello World
+</PRE>
+<P>
+If we name this script <CODE>hello.gfs</CODE>, we can do
+</P>
+<PRE>
+ % gf -batch -s &lt;hello.gfs
+
+ ciao mondo
+ terve maailma
+ hello world
+</PRE>
+<P>
+The options <CODE>-batch</CODE> and <CODE>-s</CODE> ("silent") remove prompts, CPU time,
+and other messages. Writing GF scripts and Unix shell scripts that call
+GF is the simplest way to build application programs that use GF grammars.
+In <a href="#chapeight">the eighth chapter</a>, we will see how to build stand-alone programs that don't need
+the GF system to run.
+</P>
+<P>
+<B>Exercise</B>. (For Unix hackers.) Write a GF application that reads
+an English string from the standard input and writes an Italian
+translation to the output.
+</P>
+<A NAME="toc9"></A>
+<H2>What else can be done with the grammar</H2>
+<P>
+Now we have built our first multilingual grammar and seen the basic
+functionalities of GF: parsing and linearization. We have tested
+these functionalities inside the GF program. In the forthcoming
+chapters, we will build larger grammars and can then get more out of
+these functionalities. But we will also introduce new ones:
+</P>
+<UL>
+<LI><B>morphological analysis</B>: find out the possible inflection forms of words
+<LI><B>morphological synthesis</B>: generate all inflection forms of words
+<LI><B>random generation</B>: generate random expressions
+<LI><B>corpus generation</B>: generate all expressions
+<LI><B>treebank generation</B>: generate a list of trees with their linearizations
+<LI><B>teaching quizzes</B>: train morphology and translation
+<LI><B>multilingual authoring</B>: create a document in many languages simultaneously
+<LI><B>speech input</B>: optimize a speech recognition system for a grammar
+</UL>
+
+<P>
+The usefulness of GF would be quite limited if grammars were
+usable only inside the GF system. In <a href="#chapeight">the eighth chapter</a>,
+we will see other ways of using grammars:
+</P>
+<UL>
+<LI>compile them to new formats, such as speech recognition grammars
+<LI>embed them in Java and Haskell programs
+<LI>build applications using compilation and embedding:
+ <UL>
+ <LI>voice commands
+ <LI>spoken language translators
+ <LI>dialogue systems
+ <LI>user interfaces
+ <LI>localization: parametrize the messages printed by a program
+ to support different languages
+ </UL>
+</UL>
+
+<P>
+All GF functionalities, both those inside the GF program and those
+ported to other environments,
+are of course already applicable to the simplest of grammars,
+such as the <CODE>Hello</CODE> grammars presented above. But the main focus
+of this tutorial will be on grammar writing. Thus we will show
+how larger and more expressive grammars can be built by using
+the constructs of the GF programming language, before entering the
+applications.
+</P>
+<A NAME="toc10"></A>
+<H2>Summary of GF language features</H2>
+<P>
+As the last section of each chapter, we will give a summary of the GF language
+features covered in the chapter. The presentation is rather technical and intended
+as a reference for later use, rather than to be read at once. The summaries
+may cover some new features, which complement the discussion in the main chapter.
+</P>
+<A NAME="toc11"></A>
+<H3>Modules</H3>
+<P>
+A GF grammar consists of <B>modules</B>,
+into which judgements are grouped. The most important
+module forms are
+</P>
+<UL>
+<LI><CODE>abstract</CODE> A <CODE>= {...}</CODE> , abstract syntax A with judgements in
+ the <B>module body</B> <CODE>{...}</CODE>.
+<LI><CODE>concrete</CODE> C <CODE>of</CODE> A <CODE>= {...}</CODE>, concrete syntax C of the
+ abstract syntax A, with judgements in the module body <CODE>{...}</CODE>.
+</UL>
+
+<P>
+Each module is written in a file named <I>Modulename</I><CODE>.gf</CODE>.
+</P>
+<A NAME="toc12"></A>
+<H3>Judgements</H3>
+<P>
+<a name="secjment"></a>
+</P>
+<P>
+Rules in a module body are called <B>judgements</B>. Keywords such as
+<CODE>fun</CODE> and <CODE>lin</CODE> are used for distinguishing between
+<B>judgement forms</B>. Here is a summary of the most important
+judgement forms, which we have considered by now:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>form</TH>
+<TH>reading</TH>
+<TH COLSPAN="2">module type</TH>
+</TR>
+<TR>
+<TD><CODE>cat</CODE> <I>C</I></TD>
+<TD><I>C</I> is a category</TD>
+<TD>abstract</TD>
+</TR>
+<TR>
+<TD><CODE>fun</CODE> <I>f</I> <CODE>:</CODE> <I>A</I></TD>
+<TD><I>f</I> is a function of type <I>A</I></TD>
+<TD>abstract</TD>
+</TR>
+<TR>
+<TD><CODE>lincat</CODE> <I>C</I> <CODE>=</CODE> <I>T</I></TD>
+<TD>category <I>C</I> has linearization type <I>T</I></TD>
+<TD>concrete</TD>
+</TR>
+<TR>
+<TD><CODE>lin</CODE> <I>f <i>x</i><sub>1</sub> ... <i>x</i><sub>n</sub></I> <CODE>=</CODE> <I>t</I></TD>
+<TD>function <I>f</I> has linearization <I>t</I></TD>
+<TD>concrete</TD>
+</TR>
+<TR>
+<TD><CODE>flags</CODE> <I>p</I> <CODE>=</CODE> <I>v</I></TD>
+<TD>flag <I>p</I> has value <I>v</I></TD>
+<TD>any</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+Both abstract and concrete modules may moreover contain <B>comments</B> of the forms
+</P>
+<UL>
+<LI><CODE>--</CODE> <I>anything until a newline</I>
+<LI><CODE>{-</CODE> <I>anything except hyphen followed by closing brace</I> <CODE>-}</CODE>
+</UL>
+
+<P>
+Judgements are terminated by semicolons. Shorthands permit the sharing of
+the keyword in subsequent judgements,
+</P>
+<PRE>
+ cat C ; D ; === cat C ; cat D ;
+</PRE>
+<P>
+and of the right-hand-side in subsequent judgements of the same form
+</P>
+<PRE>
+ fun f, g : A ; === fun f : A ; g : A ;
+</PRE>
+<P>
+We will use the symbol <CODE>===</CODE> to indicate <B>syntactic sugar</B> when
+speaking about GF. Thus it is not a symbol of the GF language.
+</P>
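+<P>
+As an illustration of these shorthands, the abstract syntax <CODE>Hello</CODE>
+from this chapter could also be written with every keyword and every type
+spelled out. This is only a sketch of the expanded form, not a different grammar:
+</P>
+<PRE>
+ abstract Hello = {
+
+   flags startcat = Greeting ;
+
+   -- one keyword per judgement instead of a shared cat
+   cat Greeting ;
+   cat Recipient ;
+
+   -- one type per function instead of World, Mum, Friends : Recipient
+   fun Hello   : Recipient -&gt; Greeting ;
+   fun World   : Recipient ;
+   fun Mum     : Recipient ;
+   fun Friends : Recipient ;
+ }
+</PRE>
+<P>
+The compact form used earlier in the chapter is usually easier to read, which is
+why the shorthands are used throughout the tutorial.
+</P>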
+<P>
+Each judgement declares a <B>name</B>, which is an <B>identifier</B>.
+An identifier is a letter followed by a sequence of letters, digits, and
+characters <CODE>'</CODE> or <CODE>_</CODE>. Each identifier can only be
+defined once in the same module (that is, as the name directly following the judgement keyword;
+local variables such as those in <CODE>lin</CODE> judgements can be
+reused in other judgements).
+</P>
+<P>
+Names are in <B>scope</B> in the rest of the module, i.e. usable in the other
+judgements of the module (subject to type restrictions, of course). Also
+the name of the module is an identifier in scope.
+</P>
+<P>
+The order of judgements in a module is free. In particular, an identifier
+need not be declared before it is used.
+</P>
+<A NAME="toc13"></A>
+<H3>Types and terms</H3>
+<P>
+A <B>type</B> in an abstract syntax is either a <B>basic type</B>,
+i.e. one introduced in a <CODE>cat</CODE> judgement, or a
+<B>function type</B> of the form
+</P>
+<PRE>
+ A1 -&gt; ... -&gt; An -&gt; A
+</PRE>
+<P>
+where each of <CODE>A1, ..., An, A</CODE> is a basic type.
+The last type in the arrow-separated sequence
+is the <B>value type</B> of the function type, and the earlier types are
+its <B>argument types</B>.
+</P>
+<P>
+In a concrete syntax, the available types include
+</P>
+<UL>
+<LI>the type of <B>token lists</B>, <CODE>Str</CODE>
+<LI><B>record types</B> of form <CODE>{</CODE> r1 : T1 ; ... ; rn : Tn <CODE>}</CODE>
+</UL>
+
+<P>
+Token lists are often briefly called <B>strings</B>.
+</P>
+<P>
+Each semi-colon separated part in a record type is called a
+<B>field</B>. The identifier introduced by the left-hand-side of a field
+is called a <B>label</B>.
+</P>
+<P>
+A <B>term</B> in abstract syntax is a <B>function application</B> of form
+</P>
+<PRE>
+ f a1 ... an
+</PRE>
+<P>
+where <CODE>f</CODE> is a function declared in a <CODE>fun</CODE> judgement and <CODE>a1 ... an</CODE>
+are terms. These terms are also called <B>abstract syntax trees</B>, or just
+<B>trees</B>.
+The tree above is well-typed and has the type A, if
+</P>
+<PRE>
+ f : A1 -&gt; ... -&gt; An -&gt; A
+</PRE>
+<P>
+and each <CODE>ai</CODE> has type <CODE>Ai</CODE>.
+</P>
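+<P>
+As a small worked example, based only on the <CODE>Hello</CODE> grammar of this chapter,
+the comments below apply these typing rules to a few candidate trees:
+</P>
+<PRE>
+ -- declared in Hello.gf:
+ --   Hello : Recipient -&gt; Greeting
+ --   World, Mum, Friends : Recipient
+
+ -- Hello World          well-typed, of type Greeting:
+ --                      World has the argument type Recipient
+ -- Mum                  well-typed, of type Recipient:
+ --                      a function with no arguments is already a tree
+ -- Hello (Hello World)  ill-typed:
+ --                      Hello World is a Greeting, not a Recipient
+</PRE>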
+<P>
+A term used in concrete syntax has one of the forms
+</P>
+<UL>
+<LI>quoted string: <CODE>"foo"</CODE>, of type <CODE>Str</CODE>
+<LI>concatenation of strings: <CODE>"foo" ++ "bar"</CODE>, of type <CODE>Str</CODE>
+<LI>record: <CODE>{</CODE> r1 = t1 ; ... ; rn = tn <CODE>}</CODE>,
+ of type <CODE>{</CODE> r1 : T1 ; ... ; rn : Tn <CODE>}</CODE>
+<LI>projection <CODE>t.r</CODE> of a term <CODE>t</CODE> that has a record type,
+ with the record label <CODE>r</CODE>; the projection has the corresponding record
+ field type
+<LI>argument variable <CODE>x</CODE> bound by the left-hand-side of a <CODE>lin</CODE> rule,
+ of the corresponding linearization type
+</UL>
+
+<P>
+Each quoted string is treated as one <B>token</B>, and strings concatenated by
+<CODE>++</CODE> are treated as separate tokens. Tokens are, by default, written with
+a space in between. This behaviour can be changed by <CODE>lexer</CODE> and <CODE>unlexer</CODE>
+flags, as will be explained later. Therefore it is usually
+not correct to have a space in a token. Writing
+</P>
+<PRE>
+ "hello world"
+</PRE>
+<P>
+in a grammar would give the parser the task to find a token with a space
+in it, rather than two tokens <CODE>"hello"</CODE> and <CODE>"world"</CODE>. If the latter is
+what is meant, it is possible to use the shorthand
+</P>
+<PRE>
+ ["hello world"] === "hello" ++ "world"
+</PRE>
+<P>
+The <B>empty string</B> is denoted by <CODE>[]</CODE> or, equivalently, <CODE>""</CODE> or <CODE>[""]</CODE>.
+</P>
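+<P>
+To make the difference between tokens concrete, here is a hedged sketch of a
+linearization rule for a hypothetical function <CODE>GoodMorning : Greeting</CODE>.
+This function is not part of the <CODE>Hello</CODE> grammar above and would first have
+to be declared in the abstract syntax:
+</P>
+<PRE>
+ lin
+   -- two tokens, written with explicit concatenation
+   GoodMorning = {s = "good" ++ "morning"} ;
+
+   -- the same two tokens, using the bracket shorthand:
+   -- GoodMorning = {s = ["good morning"]} ;
+
+   -- probably not intended: one single token that contains a space:
+   -- GoodMorning = {s = "good morning"} ;
+</PRE>
+<P>
+The alternatives are commented out because a name can be defined only once
+in a module.
+</P>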
+<A NAME="toc14"></A>
+<H3>Type checking</H3>
+<P>
+An important functionality of the GF system is <B>static type checking</B>.
+This means that the grammars are checked to be well-formed, so that all
+run-time errors are eliminated. The main type checking principles are the
+following:
+</P>
+<UL>
+<LI>a concrete syntax must define the <CODE>lincat</CODE> of each <CODE>cat</CODE> and a <CODE>lin</CODE>
+ for each <CODE>fun</CODE> in the abstract syntax that it is "<CODE>of</CODE>"
+<LI><CODE>lin</CODE> rules are type checked with respect to the <CODE>lincat</CODE> and <CODE>fun</CODE>
+ rules
+<LI>terms have types as defined in the previous section
+</UL>
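+<P>
+As an illustration of how <CODE>lin</CODE> rules are checked, consider two variants of the
+rule for <CODE>Hello</CODE> in <CODE>HelloEng</CODE>. The wording of the error message is not
+reproduced here, since it may differ between GF versions:
+</P>
+<PRE>
+ -- given in the grammar:
+ --   fun    Hello : Recipient -&gt; Greeting
+ --   lincat Greeting, Recipient = {s : Str}
+
+ lin
+   -- accepted: recip is a record of type {s : Str}, so recip.s is a Str
+   Hello recip = {s = "hello" ++ recip.s} ;
+
+   -- rejected by the type checker: recip itself is a record, not a Str,
+   -- so it cannot be concatenated with ++
+   -- Hello recip = {s = "hello" ++ recip} ;
+</PRE>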
+
+<A NAME="toc15"></A>
+<H1>Designing a grammar for complex phrases</H1>
+<P>
+<a name="chapthree"></a>
+</P>
+<P>
+In this chapter, we will write a grammar that has much more structure than
+the <CODE>Hello</CODE> grammar. We will look at how the abstract syntax
+is divided into suitable categories, and how infinitely many
+phrases can be generated by using recursive rules. We will also
+introduce modularity by showing how a grammar can be
+divided into modules, and how functional programming
+can be used to share code in and among modules.
+</P>
+<A NAME="toc16"></A>
+<H2>The abstract syntax Food</H2>
+<P>
+We will write a grammar that
+defines a set of phrases usable for speaking about food:
+</P>
+<UL>
+<LI>the start category is <CODE>Phrase</CODE>
+<LI>a <CODE>Phrase</CODE> can be built by assigning a <CODE>Quality</CODE> to an <CODE>Item</CODE>
+ (e.g. <I>this cheese is Italian</I>)
+<LI>an <CODE>Item</CODE> is built from a <CODE>Kind</CODE> by prefixing <I>this</I> or <I>that</I>
+ (e.g. <I>this wine</I>)
+<LI>a <CODE>Kind</CODE> is either <B>atomic</B> (e.g. <I>cheese</I>), or formed by
+ qualifying a given <CODE>Kind</CODE> with a <CODE>Quality</CODE> (e.g. <I>Italian cheese</I>)
+<LI>a <CODE>Quality</CODE> is either atomic (e.g. <I>Italian</I>),
+ or built by modifying a given <CODE>Quality</CODE> with the word <I>very</I> (e.g. <I>very warm</I>)
+</UL>
+
+<P>
+These verbal descriptions can be expressed as the following abstract syntax:
+</P>
+<PRE>
+ abstract Food = {
+
+ flags startcat = Phrase ;
+
+ cat
+ Phrase ; Item ; Kind ; Quality ;
+
+ fun
+ Is : Item -&gt; Quality -&gt; Phrase ;
+ This, That : Kind -&gt; Item ;
+ QKind : Quality -&gt; Kind -&gt; Kind ;
+ Wine, Cheese, Fish : Kind ;
+ Very : Quality -&gt; Quality ;
+ Fresh, Warm, Italian, Expensive, Delicious, Boring : Quality ;
+ }
+</PRE>
+<P>
+In this abstract syntax, we can build <CODE>Phrase</CODE>s such as
+</P>
+<PRE>
+ Is (This (QKind Delicious (QKind Italian Wine))) (Very (Very Expensive))
+</PRE>
+<P>
+In the English concrete syntax, we will want to linearize this into
+</P>
+<PRE>
+ this delicious Italian wine is very very expensive
+</PRE>
+<P></P>
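+<P>
+As a worked check that uses only the declarations of <CODE>Food</CODE> above, the type
+of the example tree can be computed bottom-up, one function application at a time:
+</P>
+<PRE>
+ -- QKind : Quality -&gt; Kind -&gt; Kind,  This : Kind -&gt; Item,
+ -- Very  : Quality -&gt; Quality,       Is   : Item -&gt; Quality -&gt; Phrase
+
+ -- Wine                                         : Kind
+ -- QKind Italian Wine                           : Kind
+ -- QKind Delicious (QKind Italian Wine)         : Kind
+ -- This (QKind Delicious (QKind Italian Wine))  : Item
+ -- Very Expensive                               : Quality
+ -- Very (Very Expensive)                        : Quality
+ -- Is (This (QKind Delicious (QKind Italian Wine))) (Very (Very Expensive))
+ --                                              : Phrase
+</PRE>
+<P>
+Since <CODE>QKind</CODE> and <CODE>Very</CODE> return the same category that they take as an
+argument, they can be applied any number of times.
+</P>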
+<A NAME="toc17"></A>
+<H2>The concrete syntax FoodEng</H2>
+<P>
+The English concrete syntax gives no surprises:
+</P>
+<PRE>
+ concrete FoodEng of Food = {
+
+ lincat
+ Phrase, Item, Kind, Quality = {s : Str} ;
+
+ lin
+ Is item quality = {s = item.s ++ "is" ++ quality.s} ;
+ This kind = {s = "this" ++ kind.s} ;
+ That kind = {s = "that" ++ kind.s} ;
+ QKind quality kind = {s = quality.s ++ kind.s} ;
+ Wine = {s = "wine"} ;
+ Cheese = {s = "cheese"} ;
+ Fish = {s = "fish"} ;
+ Very quality = {s = "very" ++ quality.s} ;
+ Fresh = {s = "fresh"} ;
+ Warm = {s = "warm"} ;
+ Italian = {s = "Italian"} ;
+ Expensive = {s = "expensive"} ;
+ Delicious = {s = "delicious"} ;
+ Boring = {s = "boring"} ;
+ }
+</PRE>
+<P>
+Let us test how the grammar works in parsing:
+</P>
+<PRE>
+ &gt; import FoodEng.gf
+ &gt; parse "this delicious wine is very very Italian"
+ Is (This (QKind Delicious Wine)) (Very (Very Italian))
+</PRE>
+<P>
+We can also try parsing in other categories than the <CODE>startcat</CODE>,
+by setting the command-line <CODE>cat</CODE> flag:
+</P>
+<PRE>
+ &gt; p -cat=Kind "very Italian wine"
+ QKind (Very Italian) Wine
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Extend the <CODE>Food</CODE> grammar by ten new food kinds and
+qualities, and run the parser with new kinds of examples.
+</P>
+<P>
+<B>Exercise</B>. Add a rule that enables question phrases of the form
+<I>is this cheese Italian</I>.
+</P>
+<P>
+<B>Exercise</B>. Enable the optional prefixing of
+phrases with the words "excuse me but". Do this in such a way that
+the prefix can occur at most once.
+</P>
+<A NAME="toc18"></A>
+<H2>Commands for testing grammars</H2>
+<A NAME="toc19"></A>
+<H3>Generating trees and strings</H3>
+<P>
+When we have a grammar above a trivial size, especially a recursive
+one, we need more efficient ways of testing it than just by parsing
+sentences that happen to come to our minds. One way to do this is
+based on automatic generation, which can be either
+<B>random generation</B> or <B>exhaustive generation</B>.
+</P>
+<P>
+Random generation (<CODE>generate_random = gr</CODE>) is an operation that
+builds a random tree in accordance with an abstract syntax:
+</P>
+<PRE>
+ &gt; generate_random
+ Is (This (QKind Italian Fish)) Fresh
+</PRE>
+<P>
+By using a pipe, random generation can be fed into linearization:
+</P>
+<PRE>
+ &gt; generate_random | linearize
+ this Italian fish is fresh
+</PRE>
+<P>
+Random generation is a good way to test a grammar. It can also give results
+that are surprising, which shows how fast we lose intuition
+when we write complex grammars.
+</P>
+<P>
+By using the <CODE>number</CODE> flag, several trees can be generated
+in one command:
+</P>
+<PRE>
+ &gt; gr -number=10 | l
+ that wine is boring
+ that fresh cheese is fresh
+ that cheese is very boring
+ this cheese is Italian
+ that expensive cheese is expensive
+ that fish is fresh
+ that wine is very Italian
+ this wine is Italian
+ this cheese is boring
+ this fish is boring
+</PRE>
+<P>
+To generate <I>all</I> phrases that a grammar can produce,
+GF provides the command <CODE>generate_trees = gt</CODE>.
+</P>
+<PRE>
+ &gt; generate_trees | l
+ that cheese is very Italian
+ that cheese is very boring
+ that cheese is very delicious
+ that cheese is very expensive
+ that cheese is very fresh
+ ...
+ this wine is expensive
+ this wine is fresh
+ this wine is warm
+
+</PRE>
+<P>
+We get quite a few trees but not all of them: only up to a given
+<B>depth</B> of trees. The default depth is 3; the depth can be
+set by using the <CODE>depth</CODE> flag:
+</P>
+<PRE>
+ &gt; generate_trees -depth=5 | l
+</PRE>
+<P>
+Other options to the generation commands (like all commands) can be seen
+with GF's <CODE>help = h</CODE> command:
+</P>
+<PRE>
+ &gt; help gr
+ &gt; help gt
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. If the command <CODE>gt</CODE> generated all
+trees in your grammar, it would never terminate. Why?
+</P>
+<P>
+<B>Exercise</B>. Measure how many trees the grammar gives with depths 4 and 5,
+respectively. <B>Hint</B>. You can
+use the Unix <B>word count</B> command <CODE>wc</CODE> to count lines.
+</P>
+<A NAME="toc20"></A>
+<H3>More on pipes; tracing</H3>
+<P>
+A pipe of GF commands can have any length, but the "output type"
+(either string or tree) of one command must always match the "input type"
+of the next command, in order for the result to make sense.
+</P>
+<P>
+The intermediate results in a pipe can be observed by putting the
+<B>tracing</B> option <CODE>-tr</CODE> to each command whose output you
+want to see:
+</P>
+<PRE>
+ &gt; gr -tr | l -tr | p
+
+ Is (This Cheese) Boring
+ this cheese is boring
+ Is (This Cheese) Boring
+</PRE>
+<P>
+This facility is useful for test purposes: the pipe above can show
+if a grammar is <B>ambiguous</B>, i.e.
+contains strings that can be parsed in more than one way.
+</P>
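+<P>
+As a sketch of such a test, several random sentences can be checked in one go.
+This assumes that the <CODE>-lines</CODE> flag of <CODE>parse</CODE>, explained in the next
+section for files, treats piped linearizations in the same way. A sentence that is
+followed by more than one tree is ambiguous:
+</P>
+<PRE>
+  &gt; gr -number=10 | l -tr | p -lines
+</PRE>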
+<P>
+<B>Exercise</B>. Extend the <CODE>Food</CODE> grammar so that it produces ambiguous
+strings, and try out the ambiguity test.
+</P>
+<A NAME="toc21"></A>
+<H3>Writing and reading files</H3>
+<P>
+To save the output of GF commands into a file, you can
+pipe it to the <CODE>write_file = wf</CODE> command:
+</P>
+<PRE>
+ &gt; gr -number=10 | linearize | write_file exx.tmp
+</PRE>
+<P>
+You can read the file back into GF with the
+<CODE>read_file = rf</CODE> command:
+</P>
+<PRE>
+ &gt; read_file exx.tmp | parse -lines
+</PRE>
+<P>
+Notice the flag <CODE>-lines</CODE> given to the parsing
+command. This flag tells GF to parse each line of
+the file separately. Without the flag, the grammar could
+not recognize the string in the file, because it is not
+a sentence but a sequence of ten sentences.
+</P>
+<P>
+Files with examples can be used for <B>regression testing</B>
+of grammars. The most systematic way to do this is by
+generating treebanks; see <a href="#sectreebank">here</a>.
+</P>
+<A NAME="toc22"></A>
+<H3>Visualizing trees</H3>
+<P>
+The parenthesized expressions returned by the parser do not
+look much like trees. Why are they called trees? From the abstract mathematical
+point of view, trees are a data structure that
+represents <B>nesting</B>: trees are branching entities, and the branches
+are themselves trees. Parentheses give a linear representation of trees,
+useful for the computer. But the human eye may prefer to see a visualization;
+for this purpose, GF provides the command <CODE>visualize_tree = vt</CODE>, to which
+parsing (and any other tree-producing command) can be piped:
+</P>
+<PRE>
+ &gt; parse "this delicious cheese is very Italian" | visualize_tree
+</PRE>
+<P></P>
+<P>
+<IMG ALIGN="middle" SRC="mytree.png" BORDER="0" ALT="">
+</P>
+<P>
+This command uses the programs Graphviz and Ghostview, which you
+might not have, but which are freely available on the web.
+</P>
+<P>
+Alternatively, you can print the tree into a file,
+e.g. a <CODE>.png</CODE> file that
+can be viewed with e.g. an HTML browser and also included in an
+HTML document. You can do this
+by saving the file <CODE>grphtmp.dot</CODE>, which the command <CODE>vt</CODE>
+produces. Then you can process this file with the <CODE>dot</CODE>
+program (from the Graphviz package):
+</P>
+<PRE>
+ % dot -Tpng grphtmp.dot &gt; mytree.png
+</PRE>
+<P></P>
+<A NAME="toc23"></A>
+<H3>System commands</H3>
+<P>
+If you don't have Ghostview, or want to view graphs in some other way,
+you can call <CODE>dot</CODE> and a suitable
+viewer (e.g. <CODE>open</CODE> on a Mac) without leaving GF, by using
+a <B>system command</B>: <CODE>!</CODE> followed by a Unix command:
+</P>
+<PRE>
+ &gt; ! dot -Tpng grphtmp.dot &gt; mytree.png
+ &gt; ! open mytree.png
+</PRE>
+<P>
+Another form of system command receives its input from
+a GF pipe. The escape symbol
+is then <CODE>?</CODE>.
+</P>
+<PRE>
+ &gt; generate_trees | ? wc
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. (Exercise from 3.3.1 revisited.)
+Measure how many trees the grammar <CODE>FoodEng</CODE> gives with depths 4 and 5,
+respectively. Use the Unix <B>word count</B> command <CODE>wc</CODE> to count lines, and
+a pipe from a GF command into a Unix command.
+</P>
+<A NAME="toc24"></A>
+<H2>An Italian concrete syntax</H2>
+<P>
+<a name="secanitalian"></a>
+</P>
+<P>
+We write the Italian grammar in a straightforward way, by replacing
+English words with their dictionary equivalents:
+</P>
+<PRE>
+ concrete FoodIta of Food = {
+
+ lincat
+ Phrase, Item, Kind, Quality = {s : Str} ;
+
+ lin
+ Is item quality = {s = item.s ++ "è" ++ quality.s} ;
+ This kind = {s = "questo" ++ kind.s} ;
+ That kind = {s = "quello" ++ kind.s} ;
+ QKind quality kind = {s = kind.s ++ quality.s} ;
+ Wine = {s = "vino"} ;
+ Cheese = {s = "formaggio"} ;
+ Fish = {s = "pesce"} ;
+ Very quality = {s = "molto" ++ quality.s} ;
+ Fresh = {s = "fresco"} ;
+ Warm = {s = "caldo"} ;
+ Italian = {s = "italiano"} ;
+ Expensive = {s = "caro"} ;
+ Delicious = {s = "delizioso"} ;
+ Boring = {s = "noioso"} ;
+ }
+</PRE>
+<P>
+An alert reader, or one who already knows Italian, may notice one point in
+which the change is more substantial than just replacement of words: the order of
+a quality and the kind it modifies in
+</P>
+<PRE>
+ QKind quality kind = {s = kind.s ++ quality.s} ;
+</PRE>
+<P>
+Thus Italian says <CODE>vino italiano</CODE> for <CODE>Italian wine</CODE>. (Some Italian adjectives
+are put before the noun. This distinction can be controlled by parameters, which
+are introduced in <a href="#chaptwo">the fourth chapter</a>.)
+</P>
+<P>
+<B>Exercise</B>. Write a concrete syntax of <CODE>Food</CODE> for some other language.
+You will probably end up with grammatically incorrect linearizations --- but don't
+worry about this yet.
+</P>
+<P>
+<B>Exercise</B>. If you have written <CODE>Food</CODE> for German, Swedish, or some
+other language, test with random or exhaustive generation what constructs
+come out incorrect, and prepare a list of those ones that cannot be helped
+with the currently available fragment of GF. You can return to your list
+after having worked out <a href="#chaptwo">the fourth chapter</a>.
+</P>
+<A NAME="toc25"></A>
+<H2>Free variation</H2>
+<P>
+Sometimes there are alternative ways to define a concrete syntax.
+For instance, if we use the <CODE>Food</CODE> grammars in a restaurant phrase
+book, we may want to accept different words for expressing the quality
+"delicious" --- and different languages can differ in how many
+such words they have. Then we don't want to put the distinctions into
+the abstract syntax, but into concrete syntaxes. Such semantically
+neutral distinctions are known as <B>free variation</B> in linguistics.
+</P>
+<P>
+The <CODE>variants</CODE> construct of GF expresses free variation. For example,
+</P>
+<PRE>
+ lin Delicious = {s = variants {"delicious" ; "exquisite" ; "tasty"}} ;
+</PRE>
+<P>
+says that <CODE>Delicious</CODE> can be linearized to any of <I>delicious</I>,
+<I>exquisite</I>, and <I>tasty</I>. As a consequence, all three words result in the
+tree <CODE>Delicious</CODE> when parsed. By default, the <CODE>linearize</CODE> command
+shows only the first variant from each <CODE>variants</CODE> list; to see them
+all, the option <CODE>-all</CODE> can be used:
+</P>
+<PRE>
+ &gt; p "this exquisite wine is delicious" | l -all
+ this delicious wine is delicious
+ this delicious wine is exquisite
+ ...
+</PRE>
+<P>
+In linguistics, it is well known that free variation is almost
+non-existent if all aspects of expressions, including style, are taken into account.
+Therefore, free variation should not be used in grammars that are meant as
+libraries for other grammars, as in <a href="#chapfive">the fifth chapter</a>. However, in a specific
+application, free variation is an excellent way to scale up the parser to
+variations in user input that make no difference in the semantic
+treatment.
+</P>
+<P>
+An example that clearly illustrates these points is the
+English negation. If we added to the <CODE>Food</CODE> grammar the negation
+of a quality, we could accept both contracted and uncontracted <I>not</I>:
+</P>
+<PRE>
+ fun IsNot : Item -&gt; Quality -&gt; Phrase ;
+ lin IsNot item qual =
+ {s = item.s ++ variants {"isn't" ; ["is not"]} ++ qual.s} ;
+</PRE>
+<P>
+Both forms are likely to occur in user input. Since there is no
+corresponding contrast in Italian, we do not want to put the distinction
+in the abstract syntax. Yet there is a stylistic difference between
+these two forms. In particular, if we are doing generation rather
+than parsing, we will want to choose the one or
+the other depending on the kind of language we want to generate.
+</P>
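+<P>
+For comparison, here is a sketch of how the same function might be linearized in
+Italian. This rule is an illustrative assumption and not a part of <CODE>FoodIta</CODE>
+as given above. There is only one form, so no <CODE>variants</CODE> are needed:
+</P>
+<PRE>
+  lin IsNot item qual =
+    {s = item.s ++ "non" ++ "è" ++ qual.s} ;
+</PRE>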
+<P>
+A limiting case of free variation is an empty variant list:
+</P>
+<PRE>
+ variants {}
+</PRE>
+<P>
+It can be used e.g. if a word lacks a certain inflection form.
+</P>
+<P>
+Free variation works for all types in concrete syntax; all terms in
+a <CODE>variants</CODE> list must be of the same type.
+</P>
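+<P>
+As an illustration of variation on a type other than strings, the following sketch
+(an assumed alternative formulation, not a rule from the <CODE>Food</CODE> grammars)
+puts the variants on whole records of type <CODE>{s : Str}</CODE>:
+</P>
+<PRE>
+  lin Delicious = variants {{s = "delicious"} ; {s = "tasty"}} ;
+</PRE>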
+<P>
+<B>Exercise</B>. Modify <CODE>FoodIta</CODE> in such a way that a quality can
+be assigned to an item by using two different word orders, exemplified
+by <I>questo vino è delizioso</I> and <I>è delizioso questo vino</I>
+(a real variation in Italian),
+and that it is impossible to say that something is boring
+(a rather contrived example).
+</P>
+<A NAME="toc26"></A>
+<H2>More applications of multilingual grammars</H2>
+<A NAME="toc27"></A>
+<H3>Multilingual treebanks</H3>
+<P>
+<a name="sectreebank"></a>
+</P>
+<P>
+A <B>multilingual treebank</B> is a set of trees with their
+translations in different languages:
+</P>
+<PRE>
+ &gt; gr -number=2 | tree_bank
+
+ Is (That Cheese) (Very Boring)
+ quello formaggio è molto noioso
+ that cheese is very boring
+
+ Is (That Cheese) Fresh
+ quello formaggio è fresco
+ that cheese is fresh
+</PRE>
+<P>
+There is also an XML format for treebanks and a set of commands
+suitable for regression testing; see <CODE>help tb</CODE> for more details.
+</P>
+<A NAME="toc28"></A>
+<H3>Translation session</H3>
+<P>
+If translation is what you want to do with a set of grammars, a convenient
+way to do it is to open a <CODE>translation_session = ts</CODE>. In this session,
+you can translate between all the languages that are in scope.
+A dot <CODE>.</CODE> terminates the translation session.
+</P>
+<PRE>
+ &gt; ts
+
+ trans&gt; that very warm cheese is boring
+ quello formaggio molto caldo è noioso
+ that very warm cheese is boring
+
+ trans&gt; questo vino molto italiano è molto delizioso
+ questo vino molto italiano è molto delizioso
+ this very Italian wine is very delicious
+
+ trans&gt; .
+ &gt;
+</PRE>
+<P></P>
+<A NAME="toc29"></A>
+<H3>Translation quiz</H3>
+<P>
+This is a simple language exercise that can be automatically
+generated from a multilingual grammar. The system generates a set of
+random sentences, displays them in one language, and checks the user's
+answer given in another language. The command <CODE>translation_quiz = tq</CODE>
+does this in a subshell of GF.
+</P>
+<PRE>
+ &gt; translation_quiz FoodEng FoodIta
+
+ Welcome to GF Translation Quiz.
+ The quiz is over when you have done at least 10 examples
+ with at least 75 % success.
+ You can interrupt the quiz by entering a line consisting of a dot ('.').
+
+ this fish is warm
+ questo pesce è caldo
+ &gt; Yes.
+ Score 1/1
+
+ this cheese is Italian
+ questo formaggio è noioso
+ &gt; No, not questo formaggio è noioso, but
+ questo formaggio è italiano
+
+ Score 1/2
+ this fish is expensive
+</PRE>
+<P>
+You can also generate a list of translation exercises and save it in a
+file for later use, by the command <CODE>translation_list = tl</CODE>:
+</P>
+<PRE>
+ &gt; translation_list -number=25 FoodEng FoodIta | write_file transl.txt
+</PRE>
+<P>
+The <CODE>number</CODE> flag gives the number of sentences generated.
+</P>
+<A NAME="toc30"></A>
+<H3>Multilingual syntax editing</H3>
+<P>
+<a name="secediting"></a>
+</P>
+<P>
+Any multilingual grammar can be used in the graphical syntax editor, which is
+opened by the shell
+command <CODE>gfeditor</CODE> followed by the names of the grammar files.
+Thus
+</P>
+<PRE>
+ % gfeditor FoodEng.gf FoodIta.gf
+</PRE>
+<P>
+opens the editor for the two <CODE>Food</CODE> grammars.
+</P>
+<P>
+The editor supports commands for manipulating an abstract syntax tree.
+The process is started by choosing a category from the "New" menu.
+Choosing <CODE>Phrase</CODE> creates a new tree of type <CODE>Phrase</CODE>. A new tree
+is in general completely unknown: it consists of a <B>metavariable</B>
+<CODE>?1</CODE>. However, since the category <CODE>Phrase</CODE> in <CODE>Food</CODE> has
+only one possible constructor, <CODE>Is</CODE>, the tree is readily
+given the form <CODE>Is ?1 ?2</CODE>. Here is what the editor looks like at
+this stage:
+</P>
+<P>
+ <IMG ALIGN="right" SRC="food1.png" BORDER="0" ALT="">
+</P>
+<P>
+Editing goes on by <B>refinements</B>, i.e. choices of constructors from
+the menu, until no metavariables remain. Here is a tree resulting from the
+current editing session:
+</P>
+<P>
+ <IMG ALIGN="right" SRC="food2.png" BORDER="0" ALT="">
+</P>
+<P>
+Editing can be continued even when the tree is finished. The user can shift
+the <B>focus</B> to any of the subtrees by clicking on it or on the corresponding
+part of a linearization. In the picture, the focus is on "fish".
+Since there are no metavariables,
+the menu shows no refinements, but some other possible actions:
+</P>
+<UL>
+<LI>to <B>change</B> "fish" to "cheese" or "wine"
+<LI>to <B>delete</B> "fish", i.e. change it to a metavariable
+<LI>to <B>wrap</B> "fish" in a qualification, i.e. change it to
+ <CODE>QKind ? Fish</CODE>, where the quality can be given in a later refinement
+</UL>
+
+<P>
+In addition to menu-based editing, the tool supports refinement by parsing,
+which is accessible by middle-clicking in the tree or in the linearization field.
+</P>
+<P>
+<B>Exercise</B>. Construct the sentence
+<I>this very expensive cheese is very very delicious</I>
+and its Italian translation by using <CODE>gfeditor</CODE>.
+</P>
+<A NAME="toc31"></A>
+<H2>Context-free grammars and GF</H2>
+<P>
+Readers not familiar with context-free grammars, also known as BNF grammars, can
+skip this section. Those who are familiar with them will find here the exact
+relation between GF and context-free grammars. We will moreover show how
+the BNF format can be used as input to the GF program; it is often more
+concise than GF proper, but also more restricted in expressive power.
+</P>
+<A NAME="toc32"></A>
+<H3>The "cf" grammar format</H3>
+<P>
+The grammar <CODE>FoodEng</CODE> could be written in a BNF format as follows:
+</P>
+<PRE>
+ Is. Phrase ::= Item "is" Quality ;
+ That. Item ::= "that" Kind ;
+ This. Item ::= "this" Kind ;
+ QKind. Kind ::= Quality Kind ;
+ Cheese. Kind ::= "cheese" ;
+ Fish. Kind ::= "fish" ;
+ Wine. Kind ::= "wine" ;
+ Italian. Quality ::= "Italian" ;
+ Boring. Quality ::= "boring" ;
+ Delicious. Quality ::= "delicious" ;
+ Expensive. Quality ::= "expensive" ;
+ Fresh. Quality ::= "fresh" ;
+ Very. Quality ::= "very" Quality ;
+ Warm. Quality ::= "warm" ;
+</PRE>
+<P>
+In this format, each rule is prefixed by a <B>label</B> that gives
+the name of the constructor function that GF declares in its <CODE>fun</CODE> rules. In fact,
+each context-free rule is a fusion of a <CODE>fun</CODE> and a <CODE>lin</CODE> rule:
+it states simultaneously that
+</P>
+<UL>
+<LI>the label is a function from the nonterminal categories
+ on the right-hand side to the category on the left-hand side;
+ the first rule gives
+<PRE>
+ fun Is : Item -&gt; Quality -&gt; Phrase
+</PRE>
+<LI>trees built by the label are linearized in the way indicated
+ by the right-hand side;
+ the first rule gives
+<PRE>
+ lin Is item quality = {s = item.s ++ "is" ++ quality.s}
+</PRE>
+</UL>
+
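+<P>
+The same reading applies to the other rules. As a further illustration, the rules labelled
+<CODE>QKind</CODE> and <CODE>Cheese</CODE> can be spelled out as follows
+(the variable names are chosen freely):
+</P>
+<PRE>
+  -- QKind. Kind ::= Quality Kind ;
+  fun QKind : Quality -&gt; Kind -&gt; Kind ;
+  lin QKind quality kind = {s = quality.s ++ kind.s} ;
+
+  -- Cheese. Kind ::= "cheese" ;
+  fun Cheese : Kind ;
+  lin Cheese = {s = "cheese"} ;
+</PRE>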
+<P>
+The translation from BNF to GF described above is in fact used in
+the GF system to convert BNF grammars into GF. BNF files are recognized
+by the file name suffix <CODE>.cf</CODE>; thus the grammar above can be
+put into a file named <CODE>food.cf</CODE> and read into GF by
+</P>
+<PRE>
+ &gt; import food.cf
+</PRE>
+<P></P>
+<A NAME="toc33"></A>
+<H3>Restrictions of context-free grammars</H3>
+<P>
+Even though we managed to write <CODE>FoodEng</CODE> in the context-free format,
+we cannot do this for GF grammars in general. It is enough to try this
+with <CODE>FoodIta</CODE> at the same time as <CODE>FoodEng</CODE>:
+we lose an important aspect of multilinguality, namely
+that the order of constituents is defined only in concrete syntax.
+Thus we could not use context-free <CODE>FoodEng</CODE> and <CODE>FoodIta</CODE> in a multilingual
+grammar that supports translation via common abstract syntax: the
+qualification function <CODE>QKind</CODE> has different types in the two
+grammars.
+</P>
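+<P>
+To make the difference concrete, here is a sketch of the two rules and the function
+types that they would give rise to. The Italian rule is an assumption of how
+<CODE>FoodIta</CODE> would look in the BNF format:
+</P>
+<PRE>
+  -- English: QKind. Kind ::= Quality Kind ;
+  fun QKind : Quality -&gt; Kind -&gt; Kind ;
+
+  -- Italian: QKind. Kind ::= Kind Quality ;
+  fun QKind : Kind -&gt; Quality -&gt; Kind ;
+</PRE>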
+<P>
+In general terms, the separation of concrete and abstract syntax allows
+three deviations from context-free grammar:
+</P>
+<UL>
+<LI><B>permutation</B>: changing the order of constituents
+<LI><B>suppression</B>: omitting constituents
+<LI><B>reduplication</B>: repeating constituents
+</UL>
+
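+<P>
+The three deviations can be sketched with linearization rules of the following kind.
+The category <CODE>C</CODE> and the functions <CODE>Swap</CODE>, <CODE>Drop</CODE>, and <CODE>Twice</CODE>
+are hypothetical names, not a part of the <CODE>Food</CODE> grammars:
+</P>
+<PRE>
+  -- assumed abstract syntax:
+  --   cat C ;
+  --   fun Swap, Drop : C -&gt; C -&gt; C ;
+  --   fun Twice : C -&gt; C ;
+
+  lincat C = {s : Str} ;
+  lin
+    Swap x y = {s = y.s ++ x.s} ;   -- permutation
+    Drop x y = {s = x.s} ;          -- suppression of y
+    Twice x  = {s = x.s ++ x.s} ;   -- reduplication
+</PRE>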
+<P>
+The third property is the one that definitely shows that GF is
+stronger than context-free: GF can define the <B>copy language</B>
+<CODE>{x x | x &lt;- (a|b)*}</CODE>, which is known not to be context-free.
+The other properties have more to do with the kind of trees that
+the grammar can associate with strings: permutation is important
+in multilingual grammars, and suppression is exploited in grammars
+where trees carry some hidden semantic information (see <a href="#chapsix">the sixth chapter</a>
+below).
+</P>
+<P>
+Of course, context-free grammars are also restricted from the
+grammar engineering point of view. They give no support to
+modules, functions, and parameters, which are so central
+for the productivity of GF. Despite the initial conciseness
+of context-free grammars, it is easy to write GF grammars in which
+30 lines of GF code do the work of hundreds of lines of
+context-free grammar code; see exercises
+<a href="#secitalian">here</a> and <a href="#sectense">here</a>.
+</P>
+<P>
+<B>Exercise</B>. GF can also interpret unlabelled BNF grammars, by
+creating labels automatically. The right-hand sides of BNF rules
+can moreover be disjunctions, e.g.
+</P>
+<PRE>
+ Quality ::= "fresh" | "Italian" | "very" Quality ;
+</PRE>
+<P>
+Experiment with this format in GF, possibly with a grammar that
+you import from some other source, such as a programming language
+document.
+</P>
+<P>
+<B>Exercise</B>. Define the copy language <CODE>{x x | x &lt;- (a|b)*}</CODE> in GF.
+</P>
+<A NAME="toc34"></A>
+<H2>Modules and files</H2>
+<P>
+GF uses suffixes to recognize different file formats. The most
+important ones are:
+</P>
+<UL>
+<LI>Source files: <I>Modulename</I><CODE>.gf</CODE>
+<LI>Target files: <I>Modulename</I><CODE>.gfc</CODE>
+</UL>
+
+<P>
+When you import <CODE>FoodEng.gf</CODE>, you see the target files being
+generated:
+</P>
+<PRE>
+ &gt; i FoodEng.gf
+ - compiling Food.gf... wrote file Food.gfc 16 msec
+ - compiling FoodEng.gf... wrote file FoodEng.gfc 20 msec
+</PRE>
+<P>
+You also see that the GF program does not only read the file
+<CODE>FoodEng.gf</CODE>, but also all other files that it
+depends on --- in this case, <CODE>Food.gf</CODE>.
+</P>
+<P>
+For each file that is compiled, a <CODE>.gfc</CODE> file
+is generated. The GFC format (="GF Canonical") is the
+"machine code" of GF, which is faster to process than
+GF source files. When reading a module, GF decides whether
+to use an existing <CODE>.gfc</CODE> file or to generate
+a new one, by looking at modification times.
+</P>
+<P>
+<I>In GF version 3, the</I> <CODE>gfc</CODE> <I>format is replaced by the format suffixed</I>
+<CODE>gfo</CODE>, <I>"GF object"</I>.
+</P>
+<P>
+<B>Exercise</B>. What happens when you import <CODE>FoodEng.gf</CODE> for
+a second time? Try this in different situations:
+</P>
+<UL>
+<LI>Right after importing it the first time (the modules are kept in
+ the memory of GF and need no reloading).
+<LI>After issuing the command <CODE>empty</CODE> (<CODE>e</CODE>), which clears the memory
+ of GF.
+<LI>After making a small change in <CODE>FoodEng.gf</CODE>, be it only an added space.
+<LI>After making a change in <CODE>Food.gf</CODE>.
+</UL>
+
+<A NAME="toc35"></A>
+<H2>Using operations and resource modules</H2>
+<A NAME="toc36"></A>
+<H3>The golden rule of functional programming</H3>
+<P>
+When writing a grammar, you have to type lots of
+characters. You have probably
+done this by the copy-and-paste method, which is a universally
+available way to avoid repeating work.
+</P>
+<P>
+However, there is a more elegant way to avoid repeating work than
+the copy-and-paste
+method. The <B>golden rule of functional programming</B> says that
+</P>
+<UL>
+<LI>whenever you find yourself programming by copy-and-paste,
+ write a function instead.
+</UL>
+
+<P>
+A function separates the shared parts of different computations from the
+changing parts, its <B>arguments</B>, or <B>parameters</B>.
+In functional programming languages, such as
+Haskell, it is possible to share much more
+code with functions than in languages such as C and Java, because
+of higher-order functions (functions that take functions as arguments).
+</P>
+<A NAME="toc37"></A>
+<H3>Operation definitions</H3>
+<P>
+GF is a functional programming language, not only in the sense that
+the abstract syntax is a system of functions (<CODE>fun</CODE>), but also because
+functional programming can be used when defining concrete syntax. This is
+done by using a new form of judgement, with the keyword <CODE>oper</CODE> (for
+<B>operation</B>), distinct from <CODE>fun</CODE> for the sake of clarity.
+Here is a simple example of an operation:
+</P>
+<PRE>
+ oper ss : Str -&gt; {s : Str} = \x -&gt; {s = x} ;
+</PRE>
+<P>
+The operation can be <B>applied</B> to an argument, and GF will
+<B>compute</B> the application into a value. For instance,
+</P>
+<PRE>
+ ss "boy" ===&gt; {s = "boy"}
+</PRE>
+<P>
+We use the symbol <CODE>===&gt;</CODE> to indicate how an expression is
+computed into a value; this symbol is not a part of GF.
+</P>
+<P>
+Thus an <CODE>oper</CODE> judgement includes the name of the defined operation,
+its type, and an expression defining it. As for the syntax of the defining
+expression, notice the <B>lambda abstraction</B> form <CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>t</I> of
+the function. It reads: function with variable <I>x</I> and <B>function body</B>
+<I>t</I>. Any occurrence of <I>x</I> in <I>t</I> is said to be <B>bound</B> in <I>t</I>.
+</P>
+<P>
+For lambda abstraction with multiple arguments, we have the shorthand
+</P>
+<PRE>
+ \x,y -&gt; t === \x -&gt; \y -&gt; t
+</PRE>
+<P>
+The notation we have used for linearization rules, where
+variables are bound on the left-hand side, is actually syntactic
+sugar for abstraction:
+</P>
+<PRE>
+ lin f x = t === lin f = \x -&gt; t
+</PRE>
+<P></P>
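+<P>
+For instance, under this reading, each of the following pairs of rules, written
+in the style of the English concrete syntax, expresses one and the same definition.
+This is only a sketch: in a grammar, one writes either form, but not both.
+</P>
+<PRE>
+  lin This kind = {s = "this" ++ kind.s} ;
+  lin This = \kind -&gt; {s = "this" ++ kind.s} ;
+
+  lin Is item quality = {s = item.s ++ "is" ++ quality.s} ;
+  lin Is = \item,quality -&gt; {s = item.s ++ "is" ++ quality.s} ;
+</PRE>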
+<A NAME="toc38"></A>
+<H3>The <CODE>resource</CODE> module type</H3>
+<P>
+Operation definitions can be included in a concrete syntax.
+But they are usually not really tied to a particular
+set of linearization rules.
+They should rather be seen as <B>resources</B>
+usable in many concrete syntaxes.
+</P>
+<P>
+The <CODE>resource</CODE> module type is used to package
+<CODE>oper</CODE> definitions into reusable resources. Here is
+an example, with a handful of operations to manipulate
+strings and records.
+</P>
+<PRE>
+ resource StringOper = {
+ oper
+ SS : Type = {s : Str} ;
+ ss : Str -&gt; SS = \x -&gt; {s = x} ;
+ cc : SS -&gt; SS -&gt; SS = \x,y -&gt; ss (x.s ++ y.s) ;
+ prefix : Str -&gt; SS -&gt; SS = \p,x -&gt; ss (p ++ x.s) ;
+ }
+</PRE>
+<P></P>
+<A NAME="toc39"></A>
+<H3>Opening a resource</H3>
+<P>
+Any number of <CODE>resource</CODE> modules can be
+<B>open</B>ed in a <CODE>concrete</CODE> syntax, which
+makes definitions contained
+in the resource usable in the concrete syntax. Here is
+an example, where the resource <CODE>StringOper</CODE> is
+opened in a new version of <CODE>FoodEng</CODE>.
+</P>
+<PRE>
+ concrete FoodEng of Food = open StringOper in {
+
+ lincat
+ Phrase, Item, Kind, Quality = SS ;
+
+ lin
+ Is item quality = cc item (prefix "is" quality) ;
+ This k = prefix "this" k ;
+ That k = prefix "that" k ;
+ QKind q k = cc q k ;
+ Wine = ss "wine" ;
+ Cheese = ss "cheese" ;
+ Fish = ss "fish" ;
+ Very = prefix "very" ;
+ Fresh = ss "fresh" ;
+ Warm = ss "warm" ;
+ Italian = ss "Italian" ;
+ Expensive = ss "expensive" ;
+ Delicious = ss "delicious" ;
+ Boring = ss "boring" ;
+ }
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Use the same string operations to write <CODE>FoodIta</CODE>
+more concisely.
+</P>
+<A NAME="toc40"></A>
+<H3>Partial application</H3>
+<P>
+<a name="secpartapp"></a>
+</P>
+<P>
+GF, like Haskell, permits <B>partial application</B> of
+functions. An example of this is the rule
+</P>
+<PRE>
+ lin This k = prefix "this" k ;
+</PRE>
+<P>
+which can be written more concisely
+</P>
+<PRE>
+ lin This = prefix "this" ;
+</PRE>
+<P>
+The first form is perhaps more intuitive to write
+but, once you get used to partial application, you will appreciate its
+conciseness and elegance. The logic of partial application
+is known as <B>currying</B>, with a reference to Haskell B. Curry.
+The idea is that any <I>n</I>-place function can be seen as a 1-place
+function whose value is an (<I>n</I>-1)-place function. Thus
+</P>
+<PRE>
+ oper prefix : Str -&gt; SS -&gt; SS ;
+</PRE>
+<P>
+can be used as a 1-place function that takes a <CODE>Str</CODE> into a
+function <CODE>SS -&gt; SS</CODE>. The expected linearization of <CODE>This</CODE> is exactly
+a function of such a type, operating on an argument of type <CODE>Kind</CODE>
+whose linearization is of type <CODE>SS</CODE>. Thus we can define the
+linearization directly as <CODE>prefix "this"</CODE>.
+</P>
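+<P>
+The step-by-step application can be summarized by the following typings, where
+<CODE>k</CODE> is assumed to be a variable of type <CODE>SS</CODE>. This is an informal
+notation and not GF code:
+</P>
+<PRE>
+  prefix           : Str -&gt; SS -&gt; SS
+  prefix "this"    : SS -&gt; SS
+  prefix "this" k  : SS
+</PRE>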
+<P>
+An important part of the art of functional programming is to decide the order
+of arguments in a function, so that partial application can be used as much
+as possible. For instance, we know that the operation <CODE>prefix</CODE>
+will typically be applied to a constant string first and to a linearization variable second.
+This is the reason to put the <CODE>Str</CODE> argument before the <CODE>SS</CODE> argument --- not
+the prefixity. A <CODE>postfix</CODE> function would have exactly the same order of arguments.
+</P>
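+<P>
+A possible definition of such a <CODE>postfix</CODE> operation, given here only as a
+sketch in the style of <CODE>StringOper</CODE> (it is not defined in that module), is
+</P>
+<PRE>
+  oper postfix : Str -&gt; SS -&gt; SS = \p,x -&gt; ss (x.s ++ p) ;
+</PRE>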
+<P>
+<B>Exercise</B>. Define an operation <CODE>infix</CODE> analogous to <CODE>prefix</CODE>,
+such that it allows you to write
+</P>
+<PRE>
+ lin Is = infix "is" ;
+</PRE>
+<P></P>
+<A NAME="toc41"></A>
+<H3>Testing resource modules</H3>
+<P>
+To test a <CODE>resource</CODE> module independently, you must import it
+with the flag <CODE>-retain</CODE>, which tells GF to retain <CODE>oper</CODE> definitions
+in the memory; the usual behaviour is that <CODE>oper</CODE> definitions
+are just applied to compile linearization rules
+(this is called <B>inlining</B>) and then thrown away.
+</P>
+<PRE>
+ &gt; import -retain StringOper.gf
+</PRE>
+<P>
+The command <CODE>compute_concrete = cc</CODE> computes any expression
+formed by operations and other GF constructs. For example,
+</P>
+<PRE>
+ &gt; compute_concrete prefix "in" (ss "addition")
+ {
+ s : Str = "in" ++ "addition"
+ }
+</PRE>
+<P></P>
+<A NAME="toc42"></A>
+<H2>Grammar architecture</H2>
+<P>
+<a name="secarchitecture"></a>
+</P>
+<A NAME="toc43"></A>
+<H3>Extending a grammar</H3>
+<P>
+The module system of GF makes it possible to write a new module that <B>extend</B>s
+an old one. The syntax of extension is
+shown by the following example. We extend <CODE>Food</CODE> into <CODE>Morefood</CODE> by
+adding a category of questions and two new functions.
+</P>
+<PRE>
+ abstract Morefood = Food ** {
+ cat
+ Question ;
+ fun
+ QIs : Item -&gt; Quality -&gt; Question ;
+ Pizza : Kind ;
+
+ }
+</PRE>
+<P>
+Parallel to the abstract syntax, extensions can
+be built for concrete syntaxes:
+</P>
+<PRE>
+ concrete MorefoodEng of Morefood = FoodEng ** {
+ lincat
+ Question = {s : Str} ;
+ lin
+ QIs item quality = {s = "is" ++ item.s ++ quality.s} ;
+ Pizza = {s = "pizza"} ;
+ }
+</PRE>
+<P>
+The effect of extension is that all of the contents of the extended
+and extending module are put together. We also say that the new
+module <B>inherits</B> the contents of the old module.
+</P>
+<P>
+At the same time as extending a module of the same type, a concrete
+syntax module may open resources. Since <CODE>open</CODE> takes effect in
+the module body and not in the extended module, its logical place
+in the module header is after the extend part:
+</P>
+<PRE>
+ concrete MorefoodIta of Morefood = FoodIta ** open StringOper in {
+ lincat
+ Question = SS ;
+ lin
+ QIs item quality = ss (item.s ++ "è" ++ quality.s) ;
+ Pizza = ss "pizza" ;
+ }
+</PRE>
+<P>
+Resource modules can extend other resource modules, in the
+same way as modules of other types can extend modules of the
+same type. Thus it is possible to build resource hierarchies.
+</P>
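+<P>
+A minimal sketch of such a hierarchy, with a hypothetical module
+<CODE>MoreStringOper</CODE> that adds one operation to <CODE>StringOper</CODE>:
+</P>
+<PRE>
+  resource MoreStringOper = StringOper ** {
+    oper
+      -- repeats a string, using the inherited operation cc
+      twice : SS -&gt; SS = \x -&gt; cc x x ;
+  }
+</PRE>
+<P>
+A concrete syntax that opens <CODE>MoreStringOper</CODE> can then use <CODE>twice</CODE>
+as well as the operations inherited from <CODE>StringOper</CODE>.
+</P>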
+<A NAME="toc44"></A>
+<H3>Multiple inheritance</H3>
+<P>
+Specialized vocabularies can be represented as small grammars that
+only do "one thing" each. For instance, the following are grammars
+for fruit and mushrooms
+</P>
+<PRE>
+ abstract Fruit = {
+ cat Fruit ;
+ fun Apple, Peach : Fruit ;
+ }
+
+ abstract Mushroom = {
+ cat Mushroom ;
+ fun Cep, Agaric : Mushroom ;
+ }
+</PRE>
+<P>
+They can afterwards be combined into bigger grammars by using
+<B>multiple inheritance</B>, i.e. extension of several grammars at the
+same time:
+</P>
+<PRE>
+ abstract Foodmarket = Food, Fruit, Mushroom ** {
+ fun
+ FruitKind : Fruit -&gt; Kind ;
+ MushroomKind : Mushroom -&gt; Kind ;
+ }
+</PRE>
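+<P>
+The concrete syntaxes can be combined in the same way. Here is a sketch for English;
+the module names <CODE>FruitEng</CODE>, <CODE>MushroomEng</CODE>, and <CODE>FoodmarketEng</CODE>
+are assumptions, and <CODE>MushroomEng</CODE> is assumed to be written analogously
+to <CODE>FruitEng</CODE>:
+</P>
+<PRE>
+  concrete FruitEng of Fruit = {
+    lincat Fruit = {s : Str} ;
+    lin
+      Apple = {s = "apple"} ;
+      Peach = {s = "peach"} ;
+  }
+
+  concrete FoodmarketEng of Foodmarket = FoodEng, FruitEng, MushroomEng ** {
+    lin
+      -- fruits and mushrooms are used as kinds without any change
+      FruitKind x = x ;
+      MushroomKind x = x ;
+  }
+</PRE>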
+<P>
+The main advantages of splitting a grammar into modules are
+<B>reusability</B>, <B>separate compilation</B>, and <B>division of labour</B>.
+Reusability means
+that one and the same module can be put into different uses; for instance,
+a module with mushroom names might be used in a mycological information system
+as well as in a restaurant phrasebook. Separate compilation means that a module
+once compiled into <CODE>.gfc</CODE> need not be compiled again unless changes have
+taken place.
+Division of labour means simply that programmers who are experts in
+special areas can work on modules belonging to those areas.
+</P>
+<P>
+<B>Exercise</B>. Refactor <CODE>Food</CODE> by taking apart <CODE>Wine</CODE> into a special
+<CODE>Drink</CODE> module.
+</P>
+<A NAME="toc45"></A>
+<H3>Visualizing module structure</H3>
+<P>
+When you have created all the abstract syntaxes and
+one set of concrete syntaxes needed for <CODE>Foodmarket</CODE>,
+your grammar consists of eight GF modules. To see what their
+dependencies look like, you can use the command
+<CODE>visualize_graph = vg</CODE>:
+</P>
+<PRE>
+ &gt; visualize_graph
+</PRE>
+<P>
+and the graph will pop up in a separate window:
+</P>
+<P>
+<IMG ALIGN="middle" SRC="foodmarket.png" BORDER="0" ALT="">
+</P>
+<P>
+The graph uses
+</P>
+<UL>
+<LI>oval boxes for abstract modules
+<LI>square boxes for concrete modules
+<LI>black-headed arrows for inheritance
+<LI>white-headed arrows for the concrete-of-abstract relation
+</UL>
+
+<P>
+Just as for the <CODE>visualize_tree = vt</CODE> command, the freely available tools
+Ghostview and Graphviz are needed. As an alternative, you can again print
+the graph into a <CODE>.dot</CODE> file by using the command <CODE>print_multi = pm</CODE>:
+</P>
+<PRE>
+ &gt; print_multi -printer=graph | write_file Foodmarket.dot
+ &gt; ! dot -Tpng Foodmarket.dot &gt; Foodmarket.png
+</PRE>
+<P></P>
+<A NAME="toc46"></A>
+<H2>Summary of GF language features</H2>
+<A NAME="toc47"></A>
+<H3>Modules</H3>
+<P>
+The general form of a module is
+<center>
+ <I>Moduletype</I> <I>M</I> <I>Of</I> <CODE>=</CODE> (<I>Extends</I> <CODE>**</CODE>)? (<CODE>open</CODE> <I>Opens</I> <CODE>in</CODE>)? <I>Body</I>
+</center>
+where <I>Moduletype</I> is one of <CODE>abstract</CODE>, <CODE>concrete</CODE>, and <CODE>resource</CODE>.
+</P>
+<P>
+If <I>Moduletype</I> is <CODE>concrete</CODE>, the <I>Of</I>-part has the form <CODE>of</CODE> <I>A</I>,
+where <I>A</I> is the name of an abstract module. Otherwise it is empty.
+</P>
+<P>
+The name of the module is given by the identifier <I>M</I>.
+</P>
+<P>
+The optional <I>Extends</I> part is a comma-separated
+list of module names, which have to be modules of
+the same <I>Moduletype</I>. The contents of these modules are <B>inherited</B> by
+<I>M</I>. This means that they are both usable in <I>Body</I> and exported by <I>M</I>,
+i.e. inherited when <I>M</I> is inherited and available when <I>M</I> is opened.
+(Exception: <CODE>oper</CODE> and <CODE>param</CODE> judgements are not inherited from
+<CODE>concrete</CODE> modules.)
+</P>
+<P>
+The optional <I>Opens</I> part is a comma-separated
+list of resource module names. The contents of these
+modules are usable in the <I>Body</I>, but they are not exported.
+</P>
+<P>
+Opening can be <B>qualified</B>, e.g.
+</P>
+<PRE>
+ concrete C of A = open (P = Prelude) in ...
+</PRE>
+<P>
+This means that the names from <CODE>Prelude</CODE> are only available in the form
+<CODE>P.name</CODE>. This form of qualifying a name is always possible, and it can
+be used to resolve <B>name conflicts</B>, which result when the same name is
+declared in more than one module that is in scope.
+</P>
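+<P>
+As a minimal sketch of how qualification resolves a name conflict, suppose a
+resource module (here given the hypothetical name <CODE>Exclaim</CODE>) defines its own
+operation <CODE>ss</CODE>, while still needing the <CODE>Prelude</CODE> operation of the same name.
+The module and its operations are invented for illustration; only <CODE>Prelude.ss</CODE>
+and <CODE>Prelude.SS</CODE> are taken from the library.
+</P>
+<PRE>
+ resource Exclaim = open (P = Prelude) in {
+   oper
+     -- a local operation that happens to share its name with Prelude.ss
+     ss : Str -&gt; Str = \s -&gt; s ++ "!" ;
+     -- the qualified name P.ss unambiguously refers to the Prelude operation
+     hello : P.SS = P.ss (ss "hello") ;
+ }
+</PRE>
+<P>
+Since <CODE>Prelude</CODE> is opened in qualified form only, the unqualified <CODE>ss</CODE> can
+only mean the local operation.
+</P>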
+<A NAME="toc48"></A>
+<H3>Judgements</H3>
+<P>
+The <I>Body</I> part consists of judgements. The table of judgement forms (<a href="#secjment">here</a>)
+is extended with the following forms:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>form</TH>
+<TH>reading</TH>
+<TH>module type</TH>
+</TR>
+<TR>
+<TD ALIGN="center"><CODE>oper</CODE> <I>h</I> <CODE>:</CODE> <I>T</I> <CODE>=</CODE> <I>t</I></TD>
+<TD>operation <I>h</I> of type <I>T</I> is defined as <I>t</I></TD>
+<TD>resource, concrete</TD>
+</TR>
+<TR>
+<TD ALIGN="center"><CODE>param</CODE> <I>P</I> <CODE>=</CODE> <I>C1</I> <CODE>|</CODE> ... <CODE>|</CODE> <I>Cn</I></TD>
+<TD>parameter type <I>P</I> has constructors <I>C1</I>...<I>Cn</I></TD>
+<TD>resource, concrete</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+The <CODE>param</CODE> judgement will be explained in the next chapter.
+</P>
+<P>
+The type part of an <CODE>oper</CODE> judgement can be omitted, if the type can be inferred
+by the GF compiler.
+</P>
+<PRE>
+ oper hello = "hello" ++ "world" ;
+</PRE>
+<P>
+As a rule, type inference works for all terms except lambda abstracts.
+</P>
+<P>
+<B>Lambda abstracts</B> are expressions of the form <CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>t</I>,
+where <I>x</I> is a variable <B>bound</B> in the expression <I>t</I>, which is the
+<B>body</B> of the lambda abstract. The type of the lambda abstract is
+<I>A</I> <CODE>-&gt;</CODE> <I>B</I>, where <I>A</I> is the type of the variable <I>x</I> and
+<I>B</I> the type of the body <I>t</I>.
+</P>
+<P>
+For multiple lambda abstractions, there is a shorthand
+</P>
+<PRE>
+ \x,y -&gt; t === \x -&gt; \y -&gt; t
+</PRE>
+<P>
+For <CODE>lin</CODE> judgements, there is the shorthand
+</P>
+<PRE>
+ lin f x = t === lin f = \x -&gt; t
+</PRE>
+<P></P>
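+<P>
+As an illustration, the two shorthands can be combined. The following sketch
+assumes a string-based <CODE>Is</CODE> rule like the one in the <CODE>Food</CODE> grammar and shows
+three equivalent ways of writing the same judgement. Only one of them would
+appear in an actual module.
+</P>
+<PRE>
+ lin Is item quality = {s = item.s ++ "is" ++ quality.s} ;
+
+ lin Is = \item,quality -&gt; {s = item.s ++ "is" ++ quality.s} ;
+
+ lin Is = \item -&gt; \quality -&gt; {s = item.s ++ "is" ++ quality.s} ;
+</PRE>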
+<A NAME="toc49"></A>
+<H3>Free variation</H3>
+<P>
+The <CODE>variants</CODE> construct of GF can be used to give a list of
+concrete syntax terms, of the same type, in free variation. For example,
+</P>
+<PRE>
+ variants {["does not"] ; "doesn't"}
+</PRE>
+<P>
+A limiting case is the empty variant list <CODE>variants {}</CODE>.
+</P>
+<A NAME="toc50"></A>
+<H3>The context-free grammar format</H3>
+<P>
+The <CODE>.cf</CODE> file format is used for <B>context-free grammars</B>, which are
+always interpretable as GF grammars. Files of this format consist of
+rules of the form
+<center>
+ (<I>Label</I> <CODE>.</CODE>)? <I>Cat</I> <CODE>::=</CODE> <I>RHS</I> <CODE>;</CODE>
+</center>
+where the <I>RHS</I> is a sequence of terminals (quoted strings) and
+nonterminals (identifiers). The optional <I>Label</I> gives the abstract
+syntax function created. If it is omitted, a function name is generated
+automatically. Then it is also possible to have more than one <I>RHS</I>,
+separated by <CODE>|</CODE>. An empty <I>RHS</I> is interpreted as an empty sequence
+of terminals, not as an empty disjunction.
+</P>
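+<P>
+As a sketch of what such a file might look like, here is a fragment of the
+<CODE>Food</CODE> grammar in this format. The file name <CODE>food.cf</CODE> and the choice of
+rules are assumptions made for illustration. The last rule has no label,
+so it may list several right-hand sides.
+</P>
+<PRE>
+ Is.     Phrase  ::= Item "is" Quality ;
+ This.   Item    ::= "this" Kind ;
+ That.   Item    ::= "that" Kind ;
+ Wine.   Kind    ::= "wine" ;
+ Cheese. Kind    ::= "cheese" ;
+ Very.   Quality ::= "very" Quality ;
+ Quality ::= "fresh" | "warm" | "boring" ;
+</PRE>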
+<P>
+The <B>Extended BNF</B> format (<B>EBNF</B>) can also be used, in files suffixed <CODE>.ebnf</CODE>.
+This format does not allow user-written labels. The right-hand side of a rule
+can contain everything that is possible in the <CODE>.cf</CODE> format, but also
+optional parts (<CODE>p ?</CODE>), sequences (<CODE>p *</CODE>) and non-empty sequences (<CODE>p +</CODE>).
+For example, the phrases in <CODE>FoodEng</CODE> could be recognized with the following
+EBNF grammar:
+</P>
+<PRE>
+ Phrase ::=
+ ("this" | "that") Quality* ("wine" | "cheese" | "fish") "is" Quality ;
+ Quality ::=
+ ("very"* ("fresh" | "warm" | "boring" | "Italian" | "expensive")) ;
+</PRE>
+<P></P>
+<A NAME="toc51"></A>
+<H3>Character encoding</H3>
+<P>
+The default encoding is iso-latin-1. UTF-8 can be set by the flag <CODE>coding=utf8</CODE>
+in the grammar. The resource grammar libraries are in iso-latin-1, except Russian
+and Arabic, which are in UTF-8. The resources may be changed to UTF-8 in the future.
+Letters in identifiers must currently be iso-latin-1.
+</P>
+<A NAME="toc52"></A>
+<H1>Grammars with parameters</H1>
+<P>
+<a name="chapfour"></a>
+</P>
+<P>
+In this chapter, we will introduce the techniques needed for
+describing the inflection of words, as well as the rules by
+which correct word forms are selected in syntactic combinations.
+These techniques are already needed in a very slight extension
+of the Food grammar of the previous chapter. While explaining
+how the linguistic problems are solved for English and Italian,
+we also cover all the language constructs GF has for
+defining concrete syntax.
+</P>
+<P>
+It is in principle possible to skip this chapter and go directly
+to the next, since the use of the GF Resource Grammar library
+makes it unnecessary to use any more constructs of GF than we
+have already covered: parameters could be left to library implementors.
+</P>
+<A NAME="toc53"></A>
+<H2>The problem: words have to be inflected</H2>
+<P>
+Suppose we want to say, with the vocabulary included in
+<CODE>Food.gf</CODE>, things like
+<center>
+<I>these Italian wines are delicious</I>
+</center>
+The new grammatical facilities we need are the plural forms
+of nouns and verbs (<I>wines, are</I>), as opposed to their
+singular forms.
+</P>
+<P>
+The introduction of plural forms requires two things:
+</P>
+<UL>
+<LI>the <B>inflection</B> of nouns and verbs in singular and plural
+<LI>the <B>agreement</B> of the verb with the subject:
+ the verb must have the same number as the subject
+</UL>
+
+<P>
+Different languages have different types of inflection and agreement.
+For instance, Italian also has agreement in gender (masculine vs. feminine).
+In a multilingual grammar,
+we want to express such differences between languages in the
+concrete syntax while ignoring them in the abstract syntax.
+</P>
+<P>
+To be able to do all this, we need one new judgement form
+and some new expression forms.
+We also need to generalize linearization types
+from strings to more complex types.
+</P>
+<P>
+<B>Exercise</B>. Make a list of the possible forms that nouns,
+adjectives, and verbs can have in some languages that you know.
+</P>
+<A NAME="toc54"></A>
+<H2>Parameters and tables</H2>
+<P>
+We define the <B>parameter type</B> of number in English by
+using a new form of judgement:
+</P>
+<PRE>
+ param Number = Sg | Pl ;
+</PRE>
+<P>
+This judgement defines the parameter type <CODE>Number</CODE> by listing
+its two <B>constructors</B>, <CODE>Sg</CODE> and <CODE>Pl</CODE> (common shorthands for
+singular and plural).
+</P>
+<P>
+To state that <CODE>Kind</CODE> expressions in English have a linearization
+depending on number, we replace the linearization type <CODE>{s : Str}</CODE>
+with a type where the <CODE>s</CODE> field is a <B>table</B> depending on number:
+</P>
+<PRE>
+ lincat Kind = {s : Number =&gt; Str} ;
+</PRE>
+<P>
+The <B>table type</B> <CODE>Number =&gt; Str</CODE> is in many respects similar to
+a function type (<CODE>Number -&gt; Str</CODE>). The main difference is that the
+argument type of a table type must always be a parameter type. This means
+that the argument-value pairs can be listed in a finite table. The following
+example shows such a table:
+</P>
+<PRE>
+ lin Cheese = {
+ s = table {
+ Sg =&gt; "cheese" ;
+ Pl =&gt; "cheeses"
+ }
+ } ;
+</PRE>
+<P>
+The table consists of <B>branches</B>, where a <B>pattern</B> on the
+left of the arrow <CODE>=&gt;</CODE> is assigned a <B>value</B> on the right.
+</P>
+<P>
+The application of a table to a parameter is done by the <B>selection</B>
+operator <CODE>!</CODE>, which is computed by <B>pattern matching</B>: it returns
+the value from the first branch whose pattern matches the
+selection argument. For instance,
+</P>
+<PRE>
+ table {Sg =&gt; "cheese" ; Pl =&gt; "cheeses"} ! Pl
+ ===&gt; "cheeses"
+</PRE>
+<P>
+As syntactic sugar for table selections, we can define the
+<B>case expressions</B>, which are common in functional programming and also
+handy to use in GF.
+</P>
+<PRE>
+ case e of {...} === table {...} ! e
+</PRE>
+<P></P>
+<P>
+A parameter type can have any number of constructors, and these can
+also take arguments from other parameter types. For instance, an accurate
+type system for English verbs (except <I>be</I>) is
+</P>
+<PRE>
+ param VerbForm = VPresent Number | VPast | VPastPart | VPresPart ;
+</PRE>
+<P>
+This system expresses accurately the fact that only the present tense has
+number variation. (Agreement also requires variation in person, but
+this can be defined in syntax rules, by picking the singular form for third-person
+singular subjects and the plural form for all others.) As an example of
+a table, here are the forms of the verb <I>drink</I>:
+</P>
+<PRE>
+ table {
+ VPresent Sg =&gt; "drinks" ;
+ VPresent Pl =&gt; "drink" ;
+ VPast =&gt; "drank" ;
+ VPastPart =&gt; "drunk" ;
+ VPresPart =&gt; "drinking"
+ }
+</PRE>
+<P></P>
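+<P>
+To experiment with such a table, one possibility is to give it a name in a
+<CODE>resource</CODE> module. The following is a minimal sketch; the module name
+<CODE>VerbsEng</CODE> and the operation name <CODE>drink</CODE> are assumptions made for this
+example.
+</P>
+<PRE>
+ resource VerbsEng = {
+   param
+     Number = Sg | Pl ;
+     VerbForm = VPresent Number | VPast | VPastPart | VPresPart ;
+   oper
+     -- the inflection table of "drink", named so that it can be tested
+     drink : VerbForm =&gt; Str = table {
+       VPresent Sg =&gt; "drinks" ;
+       VPresent Pl =&gt; "drink" ;
+       VPast       =&gt; "drank" ;
+       VPastPart   =&gt; "drunk" ;
+       VPresPart   =&gt; "drinking"
+     } ;
+ }
+</PRE>
+<P>
+Assuming that the module is saved in the file <CODE>VerbsEng.gf</CODE> and imported with
+the <CODE>-retain</CODE> flag, so that <CODE>oper</CODE> definitions are kept in memory, a selection
+from the table can then be computed:
+</P>
+<PRE>
+ &gt; import -retain VerbsEng.gf
+ &gt; compute_concrete drink ! (VPresent Pl)
+</PRE>
+<P>
+The selection picks the branch whose pattern is <CODE>VPresent Pl</CODE>, whose value is
+<CODE>"drink"</CODE>.
+</P>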
+<P>
+<B>Exercise</B>. In an earlier exercise (previous section),
+you made a list of the possible
+forms that nouns, adjectives, and verbs can have in some languages that
+you know. Now take some of the results and implement them by
+using parameter type definitions and tables. Write them into a <CODE>resource</CODE>
+module, which you can test by using the command <CODE>compute_concrete</CODE>.
+</P>
+<A NAME="toc55"></A>
+<H2>Inflection tables and paradigms</H2>
+<P>
+All English common nouns are inflected for number, most of them in the
+same way: the plural form is obtained from the singular by adding the
+ending <I>s</I>. This rule is an example of
+a <B>paradigm</B> --- a formula telling how a class of words is inflected.
+</P>
+<P>
+From the GF point of view, a paradigm is a function that takes
+a <B>lemma</B> --- also known as a <B>dictionary form</B> or a <B>citation form</B> --- and
+returns an inflection
+table of desired type. Paradigms are not functions in the sense of the
+<CODE>fun</CODE> judgements of abstract syntax (which operate on trees and not
+on strings), but operations defined in <CODE>oper</CODE> judgements.
+The following operation defines the regular noun paradigm of English:
+</P>
+<PRE>
+ oper regNoun : Str -&gt; {s : Number =&gt; Str} = \dog -&gt; {
+ s = table {
+ Sg =&gt; dog ;
+ Pl =&gt; dog + "s"
+ }
+ } ;
+</PRE>
+<P>
+The <B>gluing</B> operator <CODE>+</CODE> says that
+the string held in the variable <CODE>dog</CODE> and the ending <CODE>"s"</CODE>
+are written together to form one <B>token</B>. Thus, for instance,
+</P>
+<PRE>
+ (regNoun "cheese").s ! Pl ===&gt; "cheese" + "s" ===&gt; "cheeses"
+</PRE>
+<P>
+A more complex example is the paradigm for regular verbs:
+</P>
+<PRE>
+ oper regVerb : Str -&gt; {s : VerbForm =&gt; Str} = \talk -&gt; {
+ s = table {
+ VPresent Sg =&gt; talk + "s" ;
+ VPresent Pl =&gt; talk ;
+ VPresPart =&gt; talk + "ing" ;
+ _ =&gt; talk + "ed"
+ }
+ } ;
+</PRE>
+<P>
+Notice how a catch-all case for the past tense and the past participle
+is expressed by using a <B>wild card</B> pattern <CODE>_</CODE>. Here again, pattern matching
+tries all patterns in order until it finds a matching pattern;
+and it is the wild card that is the first match for both <CODE>VPast</CODE> and
+<CODE>VPastPart</CODE>.
+</P>
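+<P>
+The effect of the branch order can be traced with a few computations, written
+in the same <CODE>===&gt;</CODE> notation as above (these are computation steps, not GF
+source code; the verb <I>walk</I> is chosen just as an example):
+</P>
+<PRE>
+ (regVerb "walk").s ! VPresent Sg ===&gt; "walk" + "s"   ===&gt; "walks"
+ (regVerb "walk").s ! VPresPart   ===&gt; "walk" + "ing" ===&gt; "walking"
+ (regVerb "walk").s ! VPast       ===&gt; "walk" + "ed"  ===&gt; "walked"
+ (regVerb "walk").s ! VPastPart   ===&gt; "walk" + "ed"  ===&gt; "walked"
+</PRE>
+<P>
+If the wild card branch were placed first, it would match every form, and all
+four selections would return <CODE>"walked"</CODE>.
+</P>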
+<P>
+<B>Exercise</B>. Identify cases in which the <CODE>regNoun</CODE> paradigm does not
+apply in English, and implement some alternative paradigms.
+</P>
+<P>
+<B>Exercise</B>. Implement some regular paradigms for other languages you have
+considered in earlier exercises.
+</P>
+<A NAME="toc56"></A>
+<H2>Using parameters in concrete syntax</H2>
+<P>
+We can now enrich the concrete syntax definitions to
+comprise morphology. This will permit a more radical
+variation between languages (e.g. English and Italian)
+than just the use of different words. In general,
+parameters and linearization types are different in
+different languages --- but this does not prevent using a
+common abstract syntax.
+</P>
+<P>
+We consider a grammar <CODE>Foods</CODE>, which is similar to
+<CODE>Food</CODE>, with the addition of two rules for forming plural items:
+</P>
+<PRE>
+ fun These, Those : Kind -&gt; Item ;
+</PRE>
+<P>
+We also add a noun which in Italian has the feminine gender; all nouns in
+<CODE>Food</CODE> were carefully chosen to be masculine!
+</P>
+<PRE>
+ fun Pizza : Kind ;
+</PRE>
+<P>
+This noun will force us to deal with gender in the Italian grammar,
+which is what we need for the grammar to scale up for larger applications.
+</P>
+<A NAME="toc57"></A>
+<H3>Agreement</H3>
+<P>
+In the English <CODE>Foods</CODE> grammar, we need just one type of parameters:
+<CODE>Number</CODE> as defined above. The phrase-forming rule
+</P>
+<PRE>
+ fun Is : Item -&gt; Quality -&gt; Phrase ;
+</PRE>
+<P>
+is affected by the number because of <B>subject-verb agreement</B>.
+In English, agreement says that the verb of a sentence
+must be inflected in the number of the subject. Thus we will linearize
+</P>
+<PRE>
+ Is (This Pizza) Warm ===&gt; "this pizza is warm"
+ Is (These Pizza) Warm ===&gt; "these pizzas are warm"
+</PRE>
+<P>
+Here it is the <B>copula</B>, i.e. the verb <I>be</I>, that is affected. We define
+the copula as the operation
+</P>
+<PRE>
+ oper copula : Number -&gt; Str = \n -&gt;
+ case n of {
+ Sg =&gt; "is" ;
+ Pl =&gt; "are"
+ } ;
+</PRE>
+<P>
+We don't need to inflect the copula for person and tense in this grammar.
+</P>
+<P>
+The form of the copula in a sentence depends on the
+<B>subject</B> of the sentence, i.e. the item
+that is qualified. This means that an <CODE>Item</CODE> must have such a number to provide.
+The obvious way to guarantee this is by including a number field in
+the linearization type:
+</P>
+<PRE>
+ lincat Item = {s : Str ; n : Number} ;
+</PRE>
+<P>
+Now we can write precisely the <CODE>Is</CODE> rule that expresses agreement:
+</P>
+<PRE>
+ lin Is item qual = {s = item.s ++ copula item.n ++ qual.s} ;
+</PRE>
+<P>
+The copula receives the number that it needs from the subject item.
+</P>
+<A NAME="toc58"></A>
+<H3>Determiners</H3>
+<P>
+Let us turn to <CODE>Item</CODE> subjects and see how they receive their
+numbers. The two rules
+</P>
+<PRE>
+ fun This, These : Kind -&gt; Item ;
+</PRE>
+<P>
+form <CODE>Item</CODE>s from <CODE>Kind</CODE>s by adding <B>determiners</B>, either
+<I>this</I> or <I>these</I>. The determiners
+require different numbers of their <CODE>Kind</CODE> arguments: <CODE>This</CODE>
+requires the singular (<I>this pizza</I>) and <CODE>These</CODE> the plural
+(<I>these pizzas</I>). The <CODE>Kind</CODE> is the same in both cases: <CODE>Pizza</CODE>.
+Thus a <CODE>Kind</CODE> must have both singular and plural forms.
+The obvious way to express this is by using a table:
+</P>
+<PRE>
+ lincat Kind = {s : Number =&gt; Str} ;
+</PRE>
+<P>
+The linearization rules for <CODE>This</CODE> and <CODE>These</CODE> can now be written
+</P>
+<PRE>
+ lin This kind = {
+ s = "this" ++ kind.s ! Sg ;
+ n = Sg
+ } ;
+
+ lin These kind = {
+ s = "these" ++ kind.s ! Pl ;
+ n = Pl
+ } ;
+</PRE>
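+<P>
+To see how the number travels from the determiner to the copula, the
+linearization of a plural sentence can be traced step by step. The trace is a
+sketch that assumes <CODE>Pizza</CODE> is linearized with <CODE>regNoun "pizza"</CODE> and <CODE>Warm</CODE>
+with the record <CODE>{s = "warm"}</CODE>:
+</P>
+<PRE>
+ These Pizza
+   ===&gt; {s = "these" ++ "pizzas" ; n = Pl}
+
+ Is (These Pizza) Warm
+   ===&gt; {s = "these" ++ "pizzas" ++ copula Pl ++ "warm"}
+   ===&gt; {s = "these" ++ "pizzas" ++ "are" ++ "warm"}
+</PRE>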
+<P>
+The grammatical relation between the determiner and the noun is similar to
+agreement, but due to some differences into which we don't go here
+it is often called <B>government</B>.
+</P>
+<P>
+Since the same pattern for determination is used four times in
+the <CODE>FoodsEng</CODE> grammar, we codify it as an operation,
+</P>
+<PRE>
+ oper det :
+ Number -&gt; Str -&gt; {s : Number =&gt; Str} -&gt; {s : Str ; n : Number} =
+ \n,det,kind -&gt; {
+ s = det ++ kind.s ! n ;
+ n = n
+ } ;
+</PRE>
+<P>
+Now we can write, for instance,
+</P>
+<PRE>
+ lin This = det Sg "this" ;
+ lin These = det Pl "these" ;
+</PRE>
+<P>
+Notice the order of arguments that permits partial
+application (<a href="#secpartapp">here</a>).
+</P>
+<P>
+In a more <B>lexicalized</B> grammar, determiners would be made into a
+category of their own and given an inherent number:
+</P>
+<PRE>
+ lincat Det = {s : Str ; n : Number} ;
+ fun Det : Det -&gt; Kind -&gt; Item ;
+ lin Det det kind = {
+ s = det.s ++ kind.s ! det.n ;
+ n = det.n
+ } ;
+</PRE>
+<P>
+Linguistically motivated grammars, such as the GF resource grammars,
+usually favour lexicalized treatments of words; see <a href="#seclexical">here</a> below.
+Notice that the fields of the record in <CODE>Det</CODE> are precisely the two
+arguments needed in the <CODE>det</CODE> operation.
+</P>
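+<P>
+In such a lexicalized grammar, the individual determiners would become lexical
+entries of their own. A possible sketch, in which the function names
+<CODE>this_Det</CODE> and <CODE>these_Det</CODE> are hypothetical and <CODE>Det</CODE> is assumed to be
+declared as a category in the abstract syntax:
+</P>
+<PRE>
+ fun this_Det, these_Det : Det ;
+
+ lin this_Det  = {s = "this"  ; n = Sg} ;
+ lin these_Det = {s = "these" ; n = Pl} ;
+</PRE>
+<P>
+With these entries, the determination rule selects the noun form by using the
+number stored in the determiner, and passes the same number on:
+</P>
+<PRE>
+ Det these_Det Pizza ===&gt; {s = "these" ++ "pizzas" ; n = Pl}
+</PRE>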
+<A NAME="toc59"></A>
+<H3>Parametric vs. inherent features</H3>
+<P>
+<CODE>Kind</CODE>s, as in general <B>common nouns</B> in English, have both singular
+and plural forms; what form is chosen is determined by the construction
+in which the noun is used. We say that the number is a
+<B>parametric feature</B> of nouns. In GF, parametric features
+appear as argument types of tables in linearization types.
+</P>
+<PRE>
+ lincat Kind = {s : Number =&gt; Str} ;
+</PRE>
+<P>
+<CODE>Item</CODE>s, as in general <B>noun phrases</B> in English, don't
+have variation in number. The number is instead an <B>inherent feature</B>,
+which the noun phrase passes to the verb. In GF, inherent features
+appear as record fields in linearization types.
+</P>
+<PRE>
+ lincat Item = {s : Str ; n : Number} ;
+</PRE>
+<P>
+A category can have both parametric and inherent features. As we will see
+in the Italian <CODE>Foods</CODE> grammar, nouns have parametric number and
+inherent gender:
+</P>
+<PRE>
+ lincat Kind = {s : Number =&gt; Str ; g : Gender} ;
+</PRE>
+<P>
+Nothing prevents the same parameter type from appearing both
+as parametric and inherent feature, or the appearance of several inherent
+features of the same type, etc. Determining the linearization types
+of categories is one of the most crucial steps in the design of a GF
+grammar. These two conditions must be in balance:
+</P>
+<UL>
+<LI>existence: what forms are possible to build by morphological and
+ other means?
+<LI>need: what features are expected via agreement or government?
+</UL>
+
+<P>
+Grammar books and dictionaries give good advice on existence; for instance,
+an Italian dictionary has entries such as
+<center>
+<B>uomo</B>, pl. <I>uomini</I>, n.m. "man"
+</center>
+which tells us that <I>uomo</I> is a masculine noun with the plural form <I>uomini</I>.
+From this alone, or with a couple more examples, we can generalize to the type
+for all nouns in Italian: they have both singular and plural forms and thus
+a parametric number, and they have an inherent gender.
+</P>
+<P>
+The distinction between parametric and inherent features can be stated in
+object-oriented programming terms: a linearization type is like a <B>class</B>,
+which has a <B>method</B> for linearization and also some <B>attributes</B>.
+In this class, the parametric features appear as arguments to the
+linearization method, whereas the inherent features appear as attributes.
+</P>
+<P>
+For words, inherent features are usually given <I>ad hoc</I> as lexical information.
+For combinations, they are typically <I>inherited</I> from some part of the construction.
+For instance, qualified noun constructs in Italian inherit their gender from the noun part
+(called the <B>head</B> of the construction in linguistics):
+</P>
+<PRE>
+ lin QKind qual kind =
+ let gen = kind.g in {
+ s = table {n =&gt; kind.s ! n ++ qual.s ! gen ! n} ;
+ g = gen
+ } ;
+</PRE>
+<P>
+This rule uses a <B>local definition</B> (also known as a <B>let expression</B>) to
+avoid computing <CODE>kind.g</CODE> twice, and also to express the linguistic
+generalization that it is the same gender that is both passed to
+the adjective and inherited by the construct.
+The parametric number feature is in this rule passed to both the noun and
+the adjective. In the table, a <B>variable pattern</B> is used to match
+any possible number. Variables introduced in patterns are in scope in
+the right-hand sides of corresponding branches. Again, it is good to
+use a variable to express the linguistic generalization that the number
+is passed to the parts, rather than expand the table into <CODE>Sg</CODE> and <CODE>Pl</CODE>
+branches.
+</P>
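+<P>
+The rule can be tried out on a concrete case. The following sketch assumes a
+gender parameter with the constructor names <CODE>Masc</CODE> and <CODE>Fem</CODE>, and a
+linearization type for <CODE>Quality</CODE> that matches the selection
+<CODE>qual.s ! gen ! n</CODE> used in the rule; the lexical entries are written out as
+explicit tables for the sake of illustration.
+</P>
+<PRE>
+ param Gender = Masc | Fem ;
+
+ lincat Quality = {s : Gender =&gt; Number =&gt; Str} ;
+
+ lin Pizza = {s = table {Sg =&gt; "pizza" ; Pl =&gt; "pizze"} ; g = Fem} ;
+
+ lin Warm = {
+   s = table {
+     Masc =&gt; table {Sg =&gt; "caldo" ; Pl =&gt; "caldi"} ;
+     Fem  =&gt; table {Sg =&gt; "calda" ; Pl =&gt; "calde"}
+   }
+ } ;
+</PRE>
+<P>
+With these definitions, the qualified kind inherits the gender of its head
+noun, and the adjective agrees with the noun in both gender and number:
+</P>
+<PRE>
+ (QKind Warm Pizza).g      ===&gt; Fem
+ (QKind Warm Pizza).s ! Sg ===&gt; "pizza" ++ "calda"
+ (QKind Warm Pizza).s ! Pl ===&gt; "pizze" ++ "calde"
+</PRE>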
+<P>
+Sometimes the puzzle of making agreement and government work in a grammar has
+several solutions. For instance, <B>precedence</B> in programming languages can
+be equivalently described by a parametric or an inherent feature
+(see <a href="#secprecedence">here</a> below).
+</P>
+<P>
+In natural language applications that use the resource grammar library,
+all parameters are hidden from the user, who thereby does not need to bother
+about them. The only thing that she has to think about is what linguistic
+categories are given as linearization types to each semantic category.
+</P>
+<P>
+For instance, the GF resource grammar library has a category <CODE>NP</CODE> of
+noun phrases, <CODE>AP</CODE> of adjectival phrases, and <CODE>Cl</CODE> of sentence-like clauses.
+In the implementation of <CODE>Foods</CODE> <a href="#secenglish">here</a>, we will define
+</P>
+<PRE>
+ lincat Phrase = Cl ; Item = NP ; Quality = AP ;
+</PRE>
+<P>
+To express that an item has a quality, we will use a resource function
+</P>
+<PRE>
+ mkCl : NP -&gt; AP -&gt; Cl ;
+</PRE>
+<P>
+in the linearization rule:
+</P>
+<PRE>
+ lin Is = mkCl ;
+</PRE>
+<P>
+In this way, we have no need to think about parameters and agreement.
+<a href="#chapfive">The fifth chapter</a> will show a complete implementation of <CODE>Foods</CODE> by the
+resource grammar, port it to many new languages, and extend it with
+many new constructs.
+</P>
+<A NAME="toc60"></A>
+<H2>An English concrete syntax for Foods with parameters</H2>
+<P>
+We repeat some of the rules above by showing the entire
+module <CODE>FoodsEng</CODE>, equipped with parameters. The parameters and
+operations are, for the sake of brevity, included in the same module
+and not in a separate <CODE>resource</CODE>. However, some string operations
+from the library <CODE>Prelude</CODE> are used.
+</P>
+<PRE>
+ --# -path=.:prelude
+
+ concrete FoodsEng of Foods = open Prelude in {
+
+ lincat
+ Phrase, Quality = SS ;
+ Kind = {s : Number =&gt; Str} ;
+ Item = {s : Str ; n : Number} ;
+
+ lin
+ Is item quality = ss (item.s ++ copula item.n ++ quality.s) ;
+ This = det Sg "this" ;
+ That = det Sg "that" ;
+ These = det Pl "these" ;
+ Those = det Pl "those" ;
+ QKind quality kind = {s = table {n =&gt; quality.s ++ kind.s ! n}} ;
+ Wine = regNoun "wine" ;
+ Cheese = regNoun "cheese" ;
+ Fish = noun "fish" "fish" ;
+ Pizza = regNoun "pizza" ;
+ Very = prefixSS "very" ;
+ Fresh = ss "fresh" ;
+ Warm = ss "warm" ;
+ Italian = ss "Italian" ;
+ Expensive = ss "expensive" ;
+ Delicious = ss "delicious" ;
+ Boring = ss "boring" ;
+
+ param
+ Number = Sg | Pl ;
+
+ oper
+ det : Number -&gt; Str -&gt; {s : Number =&gt; Str} -&gt; {s : Str ; n : Number} =
+ \n,d,cn -&gt; {
+ s = d ++ cn.s ! n ;
+ n = n
+ } ;
+ noun : Str -&gt; Str -&gt; {s : Number =&gt; Str} =
+ \man,men -&gt; {s = table {
+ Sg =&gt; man ;
+ Pl =&gt; men
+ }
+ } ;
+ regNoun : Str -&gt; {s : Number =&gt; Str} =
+ \car -&gt; noun car (car + "s") ;
+ copula : Number -&gt; Str =
+ \n -&gt; case n of {
+ Sg =&gt; "is" ;
+ Pl =&gt; "are"
+ } ;
+ }
+</PRE>
+<P>
+To find the Prelude library (or, in general,
+GF files located in other directories), a <B>path directive</B> is needed
+either on the command line or as the first line of
+the topmost file compiled.
+The paths in the path list are separated by colons (<CODE>:</CODE>), and every item
+is interpreted primarily relative to the current directory and, secondarily,
+to the value of <CODE>GF_LIB_PATH</CODE> (<B>GF library path</B>). Hence it is a
+good idea to make <CODE>GF_LIB_PATH</CODE> point to your <CODE>GF/lib/</CODE> directory whenever
+you start working in GF. For instance, in the Bash shell this is done by
+</P>
+<PRE>
+ % export GF_LIB_PATH=&lt;the location of GF/lib in your file system&gt;
+</PRE>
+<P></P>
+<A NAME="toc61"></A>
+<H2>More on inflection paradigms</H2>
+<P>
+<a name="secinflection"></a>
+</P>
+<P>
+Let us try to extend the English noun paradigms so that we can
+deal with all nouns, not just the regular ones. The goal is to
+provide a morphology module that is maximally easy to use when
+words are added to the lexicon. In fact, we can think of a
+division of labour where a linguistically trained grammarian
+writes a morphology and hands it over to the lexicon writer
+who knows much less about the rules of inflection.
+</P>
+<P>
+In passing, we will introduce some new GF constructs: local definitions,
+regular expression patterns, and operation overloading.
+</P>
+<A NAME="toc62"></A>
+<H3>Worst-case functions</H3>
+<P>
+To start with, it is useful to perform <B>data abstraction</B> from the type
+of nouns by writing a constructor operation, a <B>worst-case function</B>:
+</P>
+<PRE>
+ oper mkNoun : Str -&gt; Str -&gt; Noun = \x,y -&gt; {
+ s = table {
+ Sg =&gt; x ;
+ Pl =&gt; y
+ }
+ } ;
+</PRE>
+<P>
+This presupposes that we have defined
+</P>
+<PRE>
+ oper Noun : Type = {s : Number =&gt; Str} ;
+</PRE>
+<P>
+Using <CODE>mkNoun</CODE>, we can define
+</P>
+<PRE>
+ lin Mouse = mkNoun "mouse" "mice" ;
+</PRE>
+<P>
+and
+</P>
+<PRE>
+ oper regNoun : Str -&gt; Noun = \x -&gt; mkNoun x (x + "s") ;
+</PRE>
+<P>
+instead of writing the inflection tables explicitly.
+</P>
+<P>
+Nouns like <I>mouse</I>-<I>mice</I> are so irregular that
+it hardly makes sense to see them as instances of a
+paradigm that forms the plural from the singular form.
+But in general, as we will see, there can be different
+regular patterns in a language.
+</P>
+<P>
+The grammar engineering advantage of worst-case functions is that
+the author of the resource module may change the definitions of
+<CODE>Noun</CODE> and <CODE>mkNoun</CODE>, and still retain the
+interface (i.e. the system of type signatures) that makes it
+correct to use these functions in concrete modules. In programming
+terms, <CODE>Noun</CODE> is then treated as an <B>abstract datatype</B>:
+its definition is not available, but only an indirect way of constructing
+its objects.
+</P>
+<P>
+A case where a change of the <CODE>Noun</CODE> type could
+actually happen is if we introduce <B>case</B> (nominative or
+genitive) in the noun inflection:
+</P>
+<PRE>
+ param Case = Nom | Gen ;
+
+ oper Noun : Type = {s : Number =&gt; Case =&gt; Str} ;
+</PRE>
+<P>
+Now we have to redefine the worst-case function
+</P>
+<PRE>
+ oper mkNoun : Str -&gt; Str -&gt; Noun = \x,y -&gt; {
+ s = table {
+ Sg =&gt; table {
+ Nom =&gt; x ;
+ Gen =&gt; x + "'s"
+ } ;
+ Pl =&gt; table {
+ Nom =&gt; y ;
+ Gen =&gt; y + case last y of {
+ "s" =&gt; "'" ;
+ _ =&gt; "'s"
+ }
+ }
+ }
+ } ;
+</PRE>
+<P>
+But from this level up, we can retain the old definitions
+</P>
+<PRE>
+ lin Mouse = mkNoun "mouse" "mice" ;
+ oper regNoun : Str -&gt; Noun = \x -&gt; mkNoun x (x + "s") ;
+</PRE>
+<P>
+which will just compute to different values now.
+</P>
+<P>
+In the last definition of <CODE>mkNoun</CODE>, we used a case expression
+on the last character of the plural form to decide if the genitive
+should be formed with an <CODE>'</CODE> (as in <I>dogs</I>-<I>dogs'</I>) or with
+<CODE>'s</CODE> (as in <I>mice</I>-<I>mice's</I>). The expression <CODE>last y</CODE>
+uses the <CODE>Prelude</CODE> operation
+</P>
+<PRE>
+ last : Str -&gt; Str ;
+</PRE>
+<P>
+The case expression uses <B>pattern matching over strings</B>, which
+is supported in GF alongside pattern matching over
+parameters.
+</P>
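+<P>
+The following computation traces, given in the <CODE>===&gt;</CODE> notation used earlier,
+sketch how the string pattern decides the form of the genitive:
+</P>
+<PRE>
+ (mkNoun "dog" "dogs").s ! Pl ! Gen
+   ===&gt; "dogs" + case last "dogs" of {"s" =&gt; "'" ; _ =&gt; "'s"}
+   ===&gt; "dogs" + "'"
+   ===&gt; "dogs'"
+
+ (mkNoun "mouse" "mice").s ! Pl ! Gen
+   ===&gt; "mice" + case last "mice" of {"s" =&gt; "'" ; _ =&gt; "'s"}
+   ===&gt; "mice" + "'s"
+   ===&gt; "mice's"
+
+ (mkNoun "mouse" "mice").s ! Sg ! Gen
+   ===&gt; "mouse" + "'s"
+   ===&gt; "mouse's"
+</PRE>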
+<A NAME="toc63"></A>
+<H3>Intelligent paradigms</H3>
+<P>
+Between the completely regular <I>dog</I>-<I>dogs</I> and the completely
+irregular <I>mouse</I>-<I>mice</I>, there are some
+predictable variations:
+</P>
+<UL>
+<LI>nouns ending with a <I>y</I>: <I>fly</I>-<I>flies</I>, except if
+ a vowel precedes the <I>y</I>: <I>boy</I>-<I>boys</I>
+<LI>nouns ending with <I>s</I>, <I>ch</I>, and a number of
+ other endings: <I>bus</I>-<I>buses</I>, <I>leech</I>-<I>leeches</I>
+</UL>
+
+<P>
+One way to deal with them would be to provide alternative paradigms:
+</P>
+<PRE>
+ noun_y : Str -&gt; Noun = \fly -&gt; mkNoun fly (init fly + "ies") ;
+ noun_s : Str -&gt; Noun = \bus -&gt; mkNoun bus (bus + "es") ;
+</PRE>
+<P>
+The Prelude function <CODE>init</CODE> drops the last character of a token.
+But this solution has some drawbacks:
+</P>
+<UL>
+<LI>it can be difficult to select the correct paradigm
+<LI>it can be difficult to remember the names of the different paradigms
+</UL>
+
+<P>
+To help the lexicon builder in this task, the morphology programmer
+can put some intelligence in the regular noun paradigm. The easiest
+way to express this in GF is by the use of <B>regular expression patterns</B>:
+</P>
+<PRE>
+ regNoun : Str -&gt; Noun = \w -&gt;
+ let
+ ws : Str = case w of {
+ _ + ("a" | "e" | "i" | "o") + "o" =&gt; w + "s" ; -- bamboo
+ _ + ("s" | "x" | "sh" | "ch" | "o") =&gt; w + "es" ; -- bus, leech, hero
+ _ + "z" =&gt; w + "zes" ;-- quiz
+ _ + ("a" | "e" | "o" | "u") + "y" =&gt; w + "s" ; -- boy
+ x + "y" =&gt; x + "ies" ;-- fly
+ _ =&gt; w + "s" -- car
+ }
+ in
+ mkNoun w ws ;
+</PRE>
+<P>
+In this definition, we have used a local definition just in order to
+structure the code, even though there is no multiple evaluation to eliminate.
+In the case expression itself, we have used
+</P>
+<UL>
+<LI><B>disjunctive patterns</B> <I>P</I> <CODE>|</CODE> <I>Q</I>
+<LI><B>concatenation patterns</B> <I>P</I> <CODE>+</CODE> <I>Q</I>
+</UL>
+
+<P>
+The patterns are ordered in such a way that, for instance,
+the suffix <CODE>"oo"</CODE> prevents <I>bamboo</I> from matching the suffix
+<CODE>"o"</CODE>.
+</P>
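+<P>
+As a check of the branch order, here is what the plural form <CODE>ws</CODE> computes to
+for some sample nouns. For each noun, the branch used is the first one whose
+pattern matches; the traces are given in the <CODE>===&gt;</CODE> notation and show only the
+value of <CODE>ws</CODE>.
+</P>
+<PRE>
+ "bamboo" ===&gt; "bamboo" + "s"  ===&gt; "bamboos"   -- vowel + "o", first branch
+ "hero"   ===&gt; "hero" + "es"   ===&gt; "heroes"    -- "o" after a consonant
+ "bus"    ===&gt; "bus" + "es"    ===&gt; "buses"
+ "quiz"   ===&gt; "quiz" + "zes"  ===&gt; "quizzes"
+ "boy"    ===&gt; "boy" + "s"     ===&gt; "boys"      -- vowel + "y"
+ "fly"    ===&gt; "fl" + "ies"    ===&gt; "flies"     -- x is bound to "fl"
+ "car"    ===&gt; "car" + "s"     ===&gt; "cars"      -- wild card branch
+</PRE>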
+<P>
+<B>Exercise</B>. The same rules that form plural nouns in English also
+apply in the formation of third-person singular verbs.
+Write a regular verb paradigm that uses this idea, but first
+rewrite <CODE>regNoun</CODE> so that the analysis needed to build <I>s</I>-forms
+is factored out as a separate <CODE>oper</CODE>, which is shared with
+<CODE>regVerb</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Extend the verb paradigms to cover all verb forms
+in English, with special care taken of variations with the suffix
+<I>ed</I> (e.g. <I>try</I>-<I>tried</I>, <I>use</I>-<I>used</I>).
+</P>
+<P>
+<B>Exercise</B>. Implement the German <B>Umlaut</B> operation on word stems.
+The operation changes the vowel of the stressed stem syllable as follows:
+<I>a</I> to <I>ä</I>, <I>au</I> to <I>äu</I>, <I>o</I> to <I>ö</I>, and <I>u</I> to <I>ü</I>. You
+can assume that the operation only takes syllables as arguments. Test the
+operation to see whether it correctly changes <I>Arzt</I> to <I>Ärzt</I>,
+<I>Baum</I> to <I>Bäum</I>, <I>Topf</I> to <I>Töpf</I>, and <I>Kuh</I> to <I>Küh</I>.
+</P>
+<A NAME="toc64"></A>
+<H3>Function types with variables</H3>
+<P>
+In <a href="#chapsix">the sixth chapter</a>, we will introduce <B>dependent function types</B>, where
+the value type depends on the argument. To this end, we need a notation
+that binds a variable to the argument type, as in
+</P>
+<PRE>
+ switchOff : (k : Kind) -&gt; Action k
+</PRE>
+<P>
+Function types <I>without</I>
+variables are actually a shorthand notation: writing
+</P>
+<PRE>
+ PredVP : NP -&gt; VP -&gt; S
+</PRE>
+<P>
+is shorthand for
+</P>
+<PRE>
+ PredVP : (x : NP) -&gt; (y : VP) -&gt; S
+</PRE>
+<P>
+or any other naming of the variables. Actually the use of variables
+sometimes shortens the code, since they can share a type:
+</P>
+<PRE>
+ octuple : (x,y,z,u,v,w,s,t : Str) -&gt; Str
+</PRE>
+<P>
+If a bound variable is not used, it can here, as elsewhere in GF, be replaced by
+a wildcard:
+</P>
+<PRE>
+ octuple : (_,_,_,_,_,_,_,_ : Str) -&gt; Str
+</PRE>
+<P>
+A good practice for functions with many arguments of the same type
+is to indicate the number of arguments:
+</P>
+<PRE>
+ octuple : (x1,_,_,_,_,_,_,x8 : Str) -&gt; Str
+</PRE>
+<P>
+One can also use heuristic variable names to document what
+information each argument is expected to provide.
+This is very handy in the types of inflection paradigms:
+</P>
+<PRE>
+ mkNoun : (mouse,mice : Str) -&gt; Noun
+</PRE>
+<P></P>
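+<P>
+As a minimal sketch of this naming convention, a three-form verb paradigm could
+document its arguments as shown below. The parameter type <CODE>PrincipalPart</CODE> and the
+operation <CODE>mkVerb</CODE> are hypothetical names chosen for illustration; they are not
+library definitions.
+</P>
+<PRE>
+  param PrincipalPart = Infinitive | PastTense | PastParticiple ;
+
+  -- the variable names in the type show which forms are expected, and in which order
+  oper mkVerb : (sing,sang,sung : Str) -&gt; {s : PrincipalPart =&gt; Str} =
+    \sing,sang,sung -&gt; {
+      s = table {
+        Infinitive     =&gt; sing ;
+        PastTense      =&gt; sang ;
+        PastParticiple =&gt; sung
+      }
+    } ;
+</PRE>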
+<A NAME="toc65"></A>
+<H3>Separating operation types and definitions</H3>
+<P>
+In grammars intended as libraries, it is useful to separate operation
+definitions from their type signatures. The user is only interested
+in the type, whereas the definition is kept for the implementor and
+the maintainer. This is possible by using separate <CODE>oper</CODE> fragments
+for the two parts:
+</P>
+<PRE>
+ oper regNoun : Str -&gt; Noun ;
+ oper regNoun s = mkNoun s (s + "s") ;
+</PRE>
+<P>
+The type checker combines the two into one <CODE>oper</CODE> judgement to see
+if the definition matches the type. Notice that, in this syntax, it
+is moreover possible to bind the argument variables on the left hand side
+instead of using lambda abstraction.
+</P>
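+<P>
+To illustrate the point about argument binding, the following sketch writes the same
+operation first as a single judgement with lambda abstraction, then as a separate
+signature and definition. It assumes that <CODE>Noun</CODE> and <CODE>mkNoun</CODE> are defined as
+earlier in this tutorial; only one of the two variants would appear in an actual module.
+</P>
+<PRE>
+  -- variant 1: one judgement, argument bound by lambda abstraction
+  oper regNoun : Str -&gt; Noun = \s -&gt; mkNoun s (s + "s") ;
+
+  -- variant 2: separate signature and definition, argument bound on the left-hand side
+  oper regNoun : Str -&gt; Noun ;
+  oper regNoun s = mkNoun s (s + "s") ;
+</PRE>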
+<P>
+In the library module, the type signatures are typically placed in
+the beginning and the definitions in the end. A more radical separation
+can be achieved by using the <CODE>interface</CODE> and <CODE>instance</CODE> module types
+(see <a href="#secinterface">here</a>): the type signatures are placed in the interface
+and the definitions in the instance.
+</P>
+<A NAME="toc66"></A>
+<H3>Overloading of operations</H3>
+<P>
+Large libraries, such as the GF Resource Grammar Library, may define
+hundreds of names. This can be impractical
+for both the library author and the user: the author has to invent longer
+and longer names which are not always intuitive,
+and the user has to learn or at least be able to find all these names.
+A solution to this problem, adopted by languages such as C++,
+is <B>overloading</B>: one and the same name can be used for several functions.
+When such a name is used, the
+compiler performs <B>overload resolution</B> to find out which of
+the possible functions is meant. Overload resolution is based on
+the types of the functions: all functions that
+have the same name must have different types.
+</P>
+<P>
+In C++, functions with the same name can be scattered everywhere in the program.
+In GF, they must be grouped together in <CODE>overload</CODE> groups. Here is an example
+of an overload group, giving the different ways to define nouns in English:
+</P>
+<PRE>
+ oper mkN : overload {
+ mkN : (dog : Str) -&gt; Noun ; -- regular nouns
+ mkN : (mouse,mice : Str) -&gt; Noun ; -- irregular nouns
+ }
+</PRE>
+<P>
+Intuitively, the function comes very close to the way in which
+regular and irregular words are given in most dictionaries. If the
+word is regular, just one form is needed. If it is irregular,
+more forms are given. There is no need to use explicit paradigm
+names.
+</P>
+<P>
+The <CODE>mkN</CODE> example gives only the possible types of the overloaded
+operation. Their definitions can be given separately, possibly in another module.
+Here is a definition of the above overload group:
+</P>
+<PRE>
+ oper mkN = overload {
+ mkN : (dog : Str) -&gt; Noun = regNoun ;
+ mkN : (mouse,mice : Str) -&gt; Noun = mkNoun ;
+ }
+</PRE>
+<P>
+Notice that the types of the branches must be repeated so that they can be
+associated with proper definitions; the order of the branches has no
+significance.
+</P>
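+<P>
+As a sketch of how such a group might be used (the constants <CODE>dog_N</CODE> and
+<CODE>mouse_N</CODE> are hypothetical names), overload resolution selects the branch from
+the number and types of the arguments:
+</P>
+<PRE>
+  oper dog_N   : Noun = mkN "dog" ;           -- one Str argument: the regNoun branch
+  oper mouse_N : Noun = mkN "mouse" "mice" ;  -- two Str arguments: the mkNoun branch
+</PRE>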
+<P>
+<B>Exercise</B>. Design a system of English verb paradigms presented by
+an overload group.
+</P>
+<A NAME="toc67"></A>
+<H3>Morphological analysis and morphology quiz</H3>
+<P>
+Even though morphology is in GF
+mostly used as an auxiliary for syntax, it
+can also be useful in its own right. The command <CODE>morpho_analyse = ma</CODE>
+can be used to read a text and return for each word the analyses that
+it has in the current concrete syntax.
+</P>
+<PRE>
+ &gt; read_file bible.txt | morpho_analyse
+</PRE>
+<P>
+In the same way as translation exercises, morphological exercises can
+be generated, by the command <CODE>morpho_quiz = mq</CODE>. Usually,
+the category is then set to some lexical category. For instance,
+French irregular verbs in the resource grammar library can be trained as
+follows:
+</P>
+<PRE>
+ % gf -path=alltenses:prelude $GF_LIB_PATH/alltenses/IrregFre.gfc
+
+ &gt; morpho_quiz -cat=V
+
+ Welcome to GF Morphology Quiz.
+ ...
+
+ réapparaître : VFin VCondit Pl P2
+ réapparaitriez
+ &gt; No, not réapparaitriez, but
+ réapparaîtriez
+ Score 0/1
+</PRE>
+<P>
+Just like translation exercises, a list of morphological exercises can be generated
+off-line and saved in a
+file for later use, by the command <CODE>morpho_list = ml</CODE>:
+</P>
+<PRE>
+ &gt; morpho_list -number=25 -cat=V | write_file exx.txt
+</PRE>
+<P>
+The <CODE>number</CODE> flag gives the number of exercises generated.
+</P>
+<A NAME="toc68"></A>
+<H2>The Italian Foods grammar</H2>
+<P>
+<a name="secitalian"></a>
+</P>
+<P>
+We conclude the parametrization of the Food grammar by presenting an
+Italian variant, now complete with parameters, inflection, and
+agreement.
+</P>
+<P>
+The header part is similar to English:
+</P>
+<PRE>
+ --# -path=.:prelude
+
+ concrete FoodsIta of Foods = open Prelude in {
+</PRE>
+<P>
+Parameters include not only number but also gender.
+</P>
+<PRE>
+ param
+ Number = Sg | Pl ;
+ Gender = Masc | Fem ;
+</PRE>
+<P>
+Qualities are inflected for gender and number, whereas kinds
+have a parametric number (as in English) and an inherent gender.
+Items have an inherent number (as in English) but also gender.
+</P>
+<PRE>
+ lincat
+ Phr = SS ;
+ Quality = {s : Gender =&gt; Number =&gt; Str} ;
+ Kind = {s : Number =&gt; Str ; g : Gender} ;
+ Item = {s : Str ; g : Gender ; n : Number} ;
+</PRE>
+<P>
+A Quality is expressed by an adjective, which in Italian has one form for each
+gender-number combination.
+</P>
+<PRE>
+ oper
+ adjective : (_,_,_,_ : Str) -&gt; {s : Gender =&gt; Number =&gt; Str} =
+ \nero,nera,neri,nere -&gt; {
+ s = table {
+ Masc =&gt; table {
+ Sg =&gt; nero ;
+ Pl =&gt; neri
+ } ;
+ Fem =&gt; table {
+ Sg =&gt; nera ;
+ Pl =&gt; nere
+ }
+ }
+ } ;
+</PRE>
+<P>
+The very common case of regular adjectives works by adding
+endings to the stem.
+</P>
+<PRE>
+ regAdj : Str -&gt; {s : Gender =&gt; Number =&gt; Str} = \nero -&gt;
+ let ner = init nero
+ in adjective nero (ner + "a") (ner + "i") (ner + "e") ;
+</PRE>
+<P></P>
+<P>
+For noun inflection, there are several paradigms; since only two forms
+are ever needed, we will just give them explicitly (the resource grammar
+library also has a paradigm that takes the singular form and infers the
+plural and the gender from it).
+</P>
+<PRE>
+ noun : Str -&gt; Str -&gt; Gender -&gt; {s : Number =&gt; Str ; g : Gender} =
+ \vino,vini,g -&gt; {
+ s = table {
+ Sg =&gt; vino ;
+ Pl =&gt; vini
+ } ;
+ g = g
+ } ;
+</PRE>
+<P>
+As in <CODE>FoodsEng</CODE>, we need only number variation for the copula.
+</P>
+<PRE>
+ copula : Number -&gt; Str =
+ \n -&gt; case n of {
+ Sg =&gt; "è" ;
+ Pl =&gt; "sono"
+ } ;
+</PRE>
+<P>
+Determination is more complex than in English, because of gender:
+it uses separate determiner forms for the two genders, and selects
+one of them as a function of the noun determined.
+</P>
+<PRE>
+ det : Number -&gt; Str -&gt; Str -&gt; {s : Number =&gt; Str ; g : Gender} -&gt;
+ {s : Str ; g : Gender ; n : Number} =
+ \n,m,f,cn -&gt; {
+ s = case cn.g of {Masc =&gt; m ; Fem =&gt; f} ++ cn.s ! n ;
+ g = cn.g ;
+ n = n
+ } ;
+</PRE>
+<P>
+Here is, finally, the complete set of linearization rules.
+</P>
+<PRE>
+ lin
+ Is item quality =
+ ss (item.s ++ copula item.n ++ quality.s ! item.g ! item.n) ;
+ This = det Sg "questo" "questa" ;
+ That = det Sg "quello" "quella" ;
+ These = det Pl "questi" "queste" ;
+ Those = det Pl "quelli" "quelle" ;
+ QKind quality kind = {
+ s = \\n =&gt; kind.s ! n ++ quality.s ! kind.g ! n ;
+ g = kind.g
+ } ;
+ Wine = noun "vino" "vini" Masc ;
+ Cheese = noun "formaggio" "formaggi" Masc ;
+ Fish = noun "pesce" "pesci" Masc ;
+ Pizza = noun "pizza" "pizze" Fem ;
+ Very qual = {s = \\g,n =&gt; "molto" ++ qual.s ! g ! n} ;
+ Fresh = adjective "fresco" "fresca" "freschi" "fresche" ;
+ Warm = regAdj "caldo" ;
+ Italian = regAdj "italiano" ;
+ Expensive = regAdj "caro" ;
+ Delicious = regAdj "delizioso" ;
+ Boring = regAdj "noioso" ;
+ }
+</PRE>
+<P></P>
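+<P>
+To check the agreement rules, one can linearize a few trees. The session below is a
+sketch that assumes the grammar has been saved in a file named <CODE>FoodsIta.gf</CODE>; the
+outputs shown are the strings that follow from the rules above.
+</P>
+<PRE>
+  &gt; import FoodsIta.gf
+
+  &gt; linearize Is (This (QKind Italian Wine)) Expensive
+  questo vino italiano è caro
+
+  &gt; linearize Is (These (QKind (Very Warm) Pizza)) Delicious
+  queste pizze molto calde sono deliziose
+</PRE>
+<P>
+In the second tree, the feminine gender of <I>pizza</I> and the plural number of the
+determiner are passed on to both the attributive and the predicative adjective.
+</P>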
+<P>
+<B>Exercise</B>. Experiment with multilingual generation and translation in the
+<CODE>Foods</CODE> grammars.
+</P>
+<P>
+<B>Exercise</B>. Add items, qualities, and determiners to the grammar, and try to get
+their inflection and inherent features right.
+</P>
+<P>
+<B>Exercise</B>. Write a concrete syntax of <CODE>Food</CODE> for a language of your choice,
+now aiming for complete grammatical correctness by the use of parameters.
+</P>
+<P>
+<B>Exercise</B>. Measure the size of the context-free grammar corresponding to
+<CODE>FoodsIta</CODE>. You can do this by printing the grammar in the context-free format
+(<CODE>print_grammar -printer=cfg</CODE>) and counting the lines.
+</P>
+<A NAME="toc69"></A>
+<H2>Discontinuous constituents</H2>
+<P>
+A linearization type may contain more strings than one.
+An example of where this is useful is English particle
+verbs, such as <I>switch off</I>. The linearization of
+a sentence may place the object between the verb and the particle:
+<I>he switched it off</I>.
+</P>
+<P>
+The following judgement defines transitive verbs as
+<B>discontinuous constituents</B>, i.e. as having a linearization
+type with two strings and not just one.
+</P>
+<PRE>
+ lincat TV = {s : Number =&gt; Str ; part : Str} ;
+</PRE>
+<P>
+In the abstract syntax, we can now have a rule that combines a subject and an
+object item with a transitive verb to form a sentence:
+</P>
+<PRE>
+ fun AppTV : Item -&gt; TV -&gt; Item -&gt; Phrase ;
+</PRE>
+<P>
+The linearization rule places the object between the two parts of the verb:
+</P>
+<PRE>
+ lin AppTV subj tv obj =
+ {s = subj.s ++ tv.s ! subj.n ++ obj.s ++ tv.part} ;
+</PRE>
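+<P>
+A lexical entry for such a verb might look as follows. The function name
+<CODE>SwitchOff</CODE> is a hypothetical example and not part of the grammars above.
+</P>
+<PRE>
+  fun SwitchOff : TV ;
+
+  lin SwitchOff = {
+    s = table {Sg =&gt; "switches" ; Pl =&gt; "switch"} ;  -- verb form agrees with the subject
+    part = "off"                                       -- particle, placed after the object
+  } ;
+</PRE>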
+<P>
+There is no restriction on the number of discontinuous constituents
+(or other fields) a <CODE>lincat</CODE> may contain. The only condition is that
+the fields must be built from records, tables,
+parameters, and <CODE>Str</CODE>, but not functions.
+</P>
+<P>
+Notice that the parsing and linearization commands only give accurate
+results for categories whose linearization type has a unique <CODE>Str</CODE>
+valued field labelled <CODE>s</CODE>. Therefore, discontinuous constituents
+are not a good idea in top-level categories accessed by the users
+of a grammar application.
+</P>
+<P>
+<B>Exercise</B>. Define the language <CODE>a^n b^n c^n</CODE> in GF, i.e.
+any number of <I>a</I>'s followed by the same number of <I>b</I>'s and
+the same number of <I>c</I>'s. This language is not context-free,
+but can be defined in GF by using discontinuous constituents.
+</P>
+<A NAME="toc70"></A>
+<H2>Strings at compile time vs. run time</H2>
+<P>
+A common source of difficulty in GF is the set of conditions under which tokens
+can be created. Tokens are created in the following ways:
+</P>
+<UL>
+<LI>quoted string: <CODE>"foo"</CODE>
+<LI>gluing : <CODE>t + s</CODE>
+<LI>predefined operations <CODE>init, tail, tk, dp</CODE>
+<LI>pattern matching over strings
+</UL>
+
+<P>
+The general principle is that
+<I>tokens must be known at compile time</I>. This means that the above operations
+may not have <B>run-time variables</B> in their arguments. Run-time variables, in
+turn, are the variables that stand for function arguments in linearization rules.
+</P>
+<P>
+Hence it is not legal to write
+</P>
+<PRE>
+ cat Noun ;
+ fun Plural : Noun -&gt; Noun ;
+ lin Plural n = {s = n.s + "s"} ;
+</PRE>
+<P>
+because <CODE>n</CODE> is a run-time variable. Also
+</P>
+<PRE>
+ lin Plural n = {s = (regNoun n).s ! Pl} ;
+</PRE>
+<P>
+is incorrect with <CODE>regNoun</CODE> as defined <a href="#secinflection">here</a>, because the run-time
+variable is eventually sent to string pattern matching and gluing.
+</P>
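+<P>
+A legal alternative, sketched here under the assumption that <CODE>regNoun</CODE> returns a
+record of type <CODE>{s : Number =&gt; Str}</CODE>, is to build all tokens in lexical rules, where
+the argument is a constant string, and to let the rules with run-time variables merely
+select among the forms. The category and function names below are hypothetical.
+</P>
+<PRE>
+  cat Thing ; Things ;
+  fun Dog : Thing ;
+  fun Plural : Thing -&gt; Things ;
+
+  lincat Thing = {s : Number =&gt; Str} ;
+  lincat Things = {s : Str} ;
+
+  lin Dog = regNoun "dog" ;         -- gluing applies to a string known at compile time
+  lin Plural n = {s = n.s ! Pl} ;   -- selection only; no new token is created
+</PRE>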
+<P>
+Writing tokens together without a space is an often-wanted behaviour, for instance,
+with punctuation marks. Thus one might try to write
+</P>
+<PRE>
+ lin Question p = {s = p.s + "?"} ;
+</PRE>
+<P>
+which is incorrect. The way to go is to use an <B>unlexer</B> that creates correct spacing
+after linearization. Correspondingly, a <B>lexer</B> that e.g. analyses <CODE>"warm?"</CODE> into
+two tokens is needed before parsing. This can be done by using flags:
+</P>
+<PRE>
+ flags lexer=text ; unlexer=text ;
+</PRE>
+<P>
+This setting works in the desired way for English text. More on lexers and unlexers
+follows <a href="#seclexing">here</a>.
+</P>
+<A NAME="toc71"></A>
+<H2>Summary of GF language features</H2>
+<A NAME="toc72"></A>
+<H3>Parameter and table types</H3>
+<P>
+A judgement of the form
+<center>
+ <CODE>param</CODE> <I>P</I> <CODE>=</CODE> <I>C1</I> <I>X1</I> <CODE>|</CODE> ... <CODE>|</CODE> <I>Cn</I> <I>Xn</I>
+</center>
+defines a <B>parameter type</B> <I>P</I> with <B>constructors</B> <I>C1</I> ... <I>Cn</I>.
+Each constructor has a <B>context</B> <I>X</I>, which is a (possibly empty)
+sequence of parameter types. A <B>parameter value</B> is an application
+of a constructor to a sequence of parameter values from each type in
+its context.
+</P>
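+<P>
+For instance, in the following sketch (the types <CODE>Person</CODE> and <CODE>VForm</CODE> and their
+constructors are hypothetical examples), the constructor <CODE>VFin</CODE> has a context of
+two parameter types, whereas <CODE>VInf</CODE> has an empty context:
+</P>
+<PRE>
+  param
+    Number = Sg | Pl ;
+    Person = P1 | P2 | P3 ;
+    VForm  = VInf | VFin Number Person ;
+
+  -- VForm has 1 + 2*3 = 7 values: VInf, VFin Sg P1, ..., VFin Pl P3
+</PRE>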
+<P>
+In addition to types defined in <CODE>param</CODE> judgements,
+records of parameter types are also parameter types. Their values are records
+of corresponding field values.
+</P>
+<P>
+Moreover, the type <CODE>Ints</CODE> <I>n</I> is a parameter type for any positive
+integer <I>n</I>, and its values are <CODE>0</CODE>, ..., <I>n</I>.
+</P>
+<P>
+A <B>table type</B> <I>P</I> <CODE>=&gt;</CODE> <I>T</I> must have a parameter type <I>P</I> as
+its argument type. The normal form of an object of this type is a <B>table</B>
+<center>
+ <CODE>table {</CODE> <I>V1</I> <CODE>=&gt;</CODE> <I>t1</I> <CODE>;</CODE> ... <CODE>;</CODE> <I>Vm</I> <CODE>=&gt;</CODE> <I>tm</I> <CODE>}</CODE>
+</center>
+which has a <B>branch</B> for every parameter value <I>Vi</I> of type <I>P</I>.
+A table can be given in many other ways by using pattern matching.
+</P>
+<P>
+Tables with only one branch are a common special case.
+GF provides syntactic sugar for writing one-branch tables concisely:
+</P>
+<PRE>
+ \\P,...,Q =&gt; t === table {P =&gt; ... table {Q =&gt; t} ...}
+</PRE>
+<P></P>
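+<P>
+As an illustration, the rule for <CODE>Very</CODE> in the Italian grammar above uses this
+notation. The sketch below places it next to the expansion given by the equation; the
+two rules are alternatives and mean the same thing.
+</P>
+<PRE>
+  -- with syntactic sugar
+  lin Very qual = {s = \\g,n =&gt; "molto" ++ qual.s ! g ! n} ;
+
+  -- expanded into nested one-branch tables with variable patterns
+  lin Very qual = {s = table {g =&gt; table {n =&gt; "molto" ++ qual.s ! g ! n}}} ;
+</PRE>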
+<A NAME="toc73"></A>
+<H3>Pattern matching</H3>
+<P>
+<a name="secmatching"></a>
+</P>
+<P>
+We will list all forms of patterns that can be used in table branches.
+The following are available for any parameter types, as well
+as for the types <CODE>Int</CODE> and <CODE>Str</CODE>:
+</P>
+<UL>
+<LI>a constructor pattern <I>C P1 ... Pn</I> matches any value <I>C V1 ... Vn</I> where
+ each <I>Vi</I> matches <I>Pi</I>,
+ and binds the union of all variables bound in the subpatterns <I>Pi</I>
+<LI>a record pattern
+ <CODE>{</CODE> <I>r1</I> <CODE>=</CODE> <I>P1</I> <CODE>;</CODE> ... <CODE>;</CODE> <I>rn</I> <CODE>=</CODE> <I>Pn</I> <CODE>}</CODE>
+ matches any record whose corresponding fields have values matching the subpatterns <I>Pi</I>,
+ and binds the union of all variables bound in the subpatterns <I>Pi</I>
+<LI>a variable pattern <I>x</I>
+ (identifier other than constant parameter) matches any value, and
+ binds <I>x</I> to this value
+<LI>the wild card <CODE>_</CODE> matches any value
+<LI>a disjunctive pattern <I>P</I> <CODE>|</CODE> <I>Q</I> matches anything that
+ either <I>P</I> or <I>Q</I> matches; bindings must be the same in both
+<LI>a negative pattern <CODE>-</CODE><I>P</I> matches anything that <I>P</I> does not match;
+ no bindings are returned
+<LI>an alias pattern <I>x</I> <CODE>@</CODE> <I>P</I> matches whatever value <I>P</I> matches and
+ binds <I>x</I> to this value; also the bindings in <I>P</I> are returned
+</UL>
+
+<P>
+The following patterns are only available for the type <CODE>Str</CODE>:
+</P>
+<UL>
+<LI>a string literal pattern, e.g. <CODE>"s"</CODE>, matches the same string
+<LI>a concatenation pattern <I>P</I> <CODE>+</CODE> <I>Q</I> matches any string that consists
+ of a prefix matching <I>P</I> and a suffix matching <I>Q</I>;
+ the union of bindings is returned
+<LI>a repetition pattern <I>P</I><CODE>*</CODE> matches any string that can be decomposed
+ into strings that match <I>P</I>; no bindings are returned
+</UL>
+
+<P>
+The following pattern is only available for the types <CODE>Int</CODE> and <CODE>Ints</CODE> <I>n</I>:
+</P>
+<UL>
+<LI>an integer literal pattern, e.g. <CODE>214</CODE>, matches the same integer
+</UL>
+
+<P>
+Pattern matching is performed in the order in which the branches
+appear in the table: the branch of the first matching pattern is followed.
+The type checker rejects sets of patterns that are not exhaustive, and
+warns about completely overshadowed patterns.
+To guarantee exhaustivity when the infinite types <CODE>Int</CODE> and <CODE>Str</CODE> are
+used as argument types, the last pattern must be a "catch-all" variable
+or wild card.
+</P>
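+<P>
+The following sketch shows both points: the order of the branches decides which rule
+applies, and the final wild card makes the table exhaustive. The operation
+<CODE>plural</CODE> is a hypothetical example covering only a few cases of English.
+</P>
+<PRE>
+  oper plural : Str -&gt; Str = \w -&gt; case w of {
+    "mouse"         =&gt; "mice" ;     -- literal pattern, tried first
+    _ + ("s" | "x") =&gt; w + "es" ;   -- concatenation with a disjunctive suffix
+    _               =&gt; w + "s"      -- catch-all, needed because Str is infinite
+  } ;
+</PRE>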
+<P>
+It follows from the definition of record pattern matching
+that it can utilize partial records: the branch
+</P>
+<PRE>
+ {g = Fem} =&gt; t
+</PRE>
+<P>
+in a table of type <CODE>{g : Gender ; n : Number} =&gt; T</CODE> means the same as
+</P>
+<PRE>
+ {g = Fem ; n = _} =&gt; t
+</PRE>
+<P>
+Variables in regular expression patterns
+are always bound to the <B>first match</B>, which is the first
+in the sequence of binding lists. For example:
+</P>
+<UL>
+<LI><CODE>x + "e" + y</CODE> matches <CODE>"peter"</CODE> with <CODE>x = "p", y = "ter"</CODE>
+<LI><CODE>x + "er"*</CODE> matches <CODE>"burgerer"</CODE> with <CODE>x = "burg"</CODE>
+</UL>
+
+<A NAME="toc74"></A>
+<H3>Overloading</H3>
+<P>
+Judgements of the <CODE>oper</CODE> form can introduce overloaded functions.
+The syntax is record-like, but all fields must have the same
+name and different types.
+</P>
+<PRE>
+ oper mkN = overload {
+ mkN : (dog : Str) -&gt; Noun = regNoun ;
+ mkN : (mouse,mice : Str) -&gt; Noun = mkNoun ;
+ }
+</PRE>
+<P>
+To give just the type of an overloaded operation, the record type
+syntax is used.
+</P>
+<PRE>
+ oper mkN : overload {
+ mkN : (dog : Str) -&gt; Noun ; -- regular nouns
+ mkN : (mouse,mice : Str) -&gt; Noun ; -- irregular nouns
+ }
+</PRE>
+<P>
+Overloading is not possible in other forms of judgement.
+</P>
+<A NAME="toc75"></A>
+<H3>Local definitions</H3>
+<P>
+Local definitions ("<CODE>let</CODE> expressions") can appear in groups:
+</P>
+<PRE>
+ oper regNoun : Str -&gt; Noun = \vino -&gt;
+ let
+ vin : Str = init vino ;
+ o = last vino
+ in
+ ...
+</PRE>
+<P>
+The type can be omitted if it can be inferred. Later definitions may
+refer to earlier ones.
+</P>
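+<P>
+A complete operation using such a group might look as follows. This is a simplified
+sketch: the name <CODE>regNounIta</CODE> is hypothetical, and the rule covers only nouns
+whose plural ends in <I>e</I> (singular in <I>a</I>) or <I>i</I> (all other endings).
+</P>
+<PRE>
+  oper regNounIta : Str -&gt; {s : Number =&gt; Str} = \vino -&gt;
+    let
+      vin : Str = init vino ;   -- type given explicitly
+      o = last vino ;           -- type inferred
+      e : Str = case o of {     -- refers to the earlier definition of o
+        "a" =&gt; "e" ;
+        _   =&gt; "i"
+      }
+    in
+      {s = table {Sg =&gt; vino ; Pl =&gt; vin + e}} ;
+</PRE>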
+<A NAME="toc76"></A>
+<H3>Supplementary constructs</H3>
+<P>
+The rest of the GF language constructs are presented for the sake
+of completeness. They will not be used in the rest of this tutorial.
+</P>
+<H4>Record extension and subtyping</H4>
+<P>
+Record types and records can be <B>extended</B> with new fields. For instance,
+in German it is natural to see transitive verbs as verbs with a case, which
+is usually accusative or dative, and is passed to the object of the verb.
+The symbol <CODE>**</CODE> is used for both record types and record objects.
+</P>
+<PRE>
+ lincat TV = Verb ** {c : Case} ;
+
+ lin Follow = regVerb "folgen" ** {c = Dative} ;
+</PRE>
+<P>
+To extend a record type or a record with a field whose label it
+already has is a type error. It is also an error to extend a type or
+object that is not a record.
+</P>
+<P>
+A record type <I>T</I> is a <B>subtype</B> of another one <I>R</I>, if <I>T</I> has
+all the fields of <I>R</I> and possibly other fields. For instance,
+an extension of a record type is always a subtype of it.
+If <I>T</I> is a subtype of <I>R</I>, then <I>R</I> is a <B>supertype</B> of <I>T</I>.
+</P>
+<P>
+If <I>T</I> is a subtype of <I>R</I>, an object of <I>T</I> can be used whenever
+an object of <I>R</I> is required.
+For instance, a transitive verb can be used whenever a verb is required.
+</P>
+<P>
+<B>Covariance</B> means that a function returning a record <I>T</I> as value can
+also be used to return a value of a supertype <I>R</I>.
+<B>Contravariance</B> means that a function taking an <I>R</I> as argument
+can also be applied to any object of a subtype <I>T</I>.
+</P>
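+<P>
+The sketch below illustrates subtyping with deliberately simplified types; the
+definitions of <CODE>Case</CODE>, <CODE>Verb</CODE>, <CODE>infinitive</CODE>, and the lexical items are
+assumptions made for this example only.
+</P>
+<PRE>
+  param Case = Nominative | Accusative | Dative | Genitive ;
+
+  oper Verb : Type = {s : Str} ;
+  oper TV   : Type = Verb ** {c : Case} ;
+
+  oper infinitive : Verb -&gt; Str = \v -&gt; v.s ;
+
+  oper folgen_TV : TV = {s = "folgen" ; c = Dative} ;
+
+  -- accepted by the type checker, since TV is a subtype of Verb
+  oper folgenInf : Str = infinitive folgen_TV ;
+</PRE>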
+<H4>Tuples and product types</H4>
+<P>
+Product types and tuples are syntactic sugar for record types and records:
+</P>
+<PRE>
+ T1 * ... * Tn === {p1 : T1 ; ... ; pn : Tn}
+ &lt;t1, ..., tn&gt; === {p1 = t1 ; ... ; pn = tn}
+</PRE>
+<P>
+Thus the labels <CODE>p1, p2,...</CODE> are hard-coded.
+As patterns, tuples are translated to record patterns in the
+same way as tuples to records; partial patterns make it
+possible to write, slightly surprisingly,
+</P>
+<PRE>
+ case &lt;g,n,p&gt; of {
+ &lt;Fem&gt; =&gt; t
+ ...
+ }
+</PRE>
+<P></P>
+<H4>Prefix-dependent choices</H4>
+<P>
+Sometimes a token has different forms depending on the token
+that follows. An example is the English indefinite article,
+which is <I>an</I> if a vowel follows, <I>a</I> otherwise.
+Which form is chosen can only be decided at run time, i.e.
+when a string is actually built. GF has a special construct for
+such tokens, the <CODE>pre</CODE> construct exemplified in
+</P>
+<PRE>
+ oper artIndef : Str =
+ pre {"a" ; "an" / strs {"a" ; "e" ; "i" ; "o"}} ;
+</PRE>
+<P>
+Thus
+</P>
+<PRE>
+ artIndef ++ "cheese" ---&gt; "a" ++ "cheese"
+ artIndef ++ "apple" ---&gt; "an" ++ "apple"
+</PRE>
+<P>
+This very example does not work in all situations: the prefix
+<I>u</I> has no general rules, and some problematic words are
+<I>euphemism, one-eyed, n-gram</I>. Since the branches are matched in
+order, it is possible to write
+</P>
+<PRE>
+ oper artIndef : Str =
+ pre {"a" ;
+ "a" / strs {"eu" ; "one"} ;
+ "an" / strs {"a" ; "e" ; "i" ; "o" ; "n-"}
+ } ;
+</PRE>
+<P>
+Somewhat illogically, the default value is given as the first element in the list.
+</P>
+<P>
+<I>Prefix-dependent choice may be deprecated in GF version 3.</I>
+</P>
+<A NAME="toc77"></A>
+<H1>Using the resource grammar library</H1>
+<P>
+<a name="chapfive"></a>
+</P>
+<P>
+In this chapter, we will take a look at the GF resource grammar library.
+We will use the library to implement the <CODE>Foods</CODE> grammar of the
+previous chapter
+and port it to some new languages. Some new concepts of GF's module system
+are also introduced, most notably the technique of <B>parametrized modules</B>,
+which has become an important "design pattern" for multilingual grammars.
+</P>
+<A NAME="toc78"></A>
+<H2>The coverage of the library</H2>
+<P>
+The GF Resource Grammar Library contains grammar rules for
+10 languages. In addition, 2 languages are available as yet incomplete
+implementations, and a few more are under construction. The purpose
+of the library is to define the low-level morphological and syntactic
+rules of languages, and thereby enable application programmers
+to concentrate on the semantic and stylistic
+aspects of their grammars. The guiding principle is that
+<center>
+grammar checking becomes type checking
+</center>
+that is, whatever is type-correct in the resource grammar is also
+grammatically correct.
+</P>
+<P>
+The intended level of application grammarians
+is that of a skilled programmer with
+a practical knowledge of the target languages, but without
+theoretical knowledge about their grammars.
+Such a combination of
+skills is typical of programmers who, for instance, want to localize
+language software to new languages.
+</P>
+<P>
+The current resource languages are
+</P>
+<UL>
+<LI><CODE>Ara</CODE>bic (incomplete)
+<LI><CODE>Cat</CODE>alan (incomplete)
+<LI><CODE>Dan</CODE>ish
+<LI><CODE>Eng</CODE>lish
+<LI><CODE>Fin</CODE>nish
+<LI><CODE>Fre</CODE>nch
+<LI><CODE>Ger</CODE>man
+<LI><CODE>Ita</CODE>lian
+<LI><CODE>Nor</CODE>wegian
+<LI><CODE>Rus</CODE>sian
+<LI><CODE>Spa</CODE>nish
+<LI><CODE>Swe</CODE>dish
+</UL>
+
+<P>
+The first three letters (<CODE>Eng</CODE> etc) are used in grammar module names.
+We use the three-letter codes for languages from the ISO 639 standard.
+</P>
+<P>
+The incomplete Arabic and Catalan implementations are
+sufficient for use in some applications; they both contain, among other
+things, complete inflectional morphology.
+</P>
+<A NAME="toc79"></A>
+<H2>The structure of the library</H2>
+<P>
+<a name="seclexical"></a>
+</P>
+<A NAME="toc80"></A>
+<H3>Lexical vs. phrasal rules</H3>
+<P>
+So far we have looked at grammars from a semantic point of view:
+a grammar defines a system of meanings (specified in the abstract syntax) and
+tells how they are expressed in some language (as specified in the concrete syntax).
+In resource grammars, as often in the linguistic tradition, the goal is more modest:
+to specify the <B>grammatically correct combinations of words</B>, whatever their
+meanings are. With this more modest goal, it is possible to achieve a much
+wider coverage than with semantic grammars.
+</P>
+<P>
+Given the focus on <I>words</I> and their combinations,
+the resource grammar has two kinds of categories and two kinds of rules:
+</P>
+<UL>
+<LI>lexical:
+ <UL>
+ <LI>lexical categories, to classify words
+ <LI>lexical rules, to define words and their properties
+ </UL>
+</UL>
+
+<UL>
+<LI>phrasal (combinatorial, syntactic):
+ <UL>
+ <LI>phrasal categories, to classify phrases of arbitrary size
+ <LI>phrasal rules, to combine phrases into larger phrases
+ </UL>
+</UL>
+
+<P>
+Some grammar formalisms make a formal distinction between
+the lexical and syntactic
+components; sometimes it is necessary to use separate formalisms for these
+two kinds of rules. GF has no such restrictions.
+Nevertheless, it has turned out
+to be a good discipline to maintain a distinction between
+the lexical and syntactic components in the resource grammar. This also fits
+well with what is needed in applications: while syntactic structures
+are more or less the same across applications, vocabularies can be
+very different.
+</P>
+<A NAME="toc81"></A>
+<H3>Lexical categories</H3>
+<P>
+Within lexical categories, there is a further classification
+into <B>closed</B> and <B>open</B> categories. The defining property
+of closed categories is that the
+words in them can easily be enumerated; it is very seldom that any
+new words are introduced in them. In general, closed categories
+contain <B>structural words</B>, also known as <B>function words</B>.
+Examples of closed categories are
+</P>
+<PRE>
+ QuantSg ; -- singular quantifier e.g. "this"
+ QuantPl ; -- plural quantifier e.g. "those"
+ AdA ; -- adadjective e.g. "very"
+</PRE>
+<P>
+We have already used words of all these categories in the <CODE>Food</CODE>
+examples; they have just not been assigned a category, but
+treated as <B>syncategorematic</B>. In GF, a syncategorematic
+word is one that is introduced in a linearization rule of
+some construction alongside some other expressions that
+are combined; there is no abstract syntax tree for that word
+alone. Thus in the rules
+</P>
+<PRE>
+ fun That : Kind -&gt; Item ;
+ lin That k = {s = "that" ++ k.s} ;
+</PRE>
+<P>
+the word <I>that</I> is syncategorematic. In linguistically motivated
+grammars, syncategorematic words are avoided, whereas in
+semantically motivated grammars, structural words are typically treated
+as syncategorematic. This is partly so because the function expressed
+by a structural word in one language is often expressed by some other
+means than an individual word in another. For instance, the definite
+article <I>the</I> is a determiner word in English, whereas Swedish expresses
+determination by inflecting the determined noun: <I>the wine</I> is <I>vinet</I>
+in Swedish.
+</P>
+<P>
+As for open categories, we will start with these two:
+</P>
+<PRE>
+ N ; -- noun e.g. "pizza"
+ A ; -- adjective e.g. "good"
+</PRE>
+<P>
+Later in this chapter we will also need verbs of different kinds.
+</P>
+<P>
+<I>Note</I>. Having adadjectives as a closed category is not quite right, because
+one can form adadjectives from adjectives: <I>incredibly warm</I>.
+</P>
+<A NAME="toc82"></A>
+<H3>Lexical rules</H3>
+<P>
+The words of closed categories can be listed once and for all in a
+library. In the first example, the <CODE>Foods</CODE> grammar of the previous section,
+we will use the following structural words from the <CODE>Syntax</CODE> module:
+</P>
+<PRE>
+ this_QuantSg, that_QuantSg : QuantSg ;
+ these_QuantPl, those_QuantPl : QuantPl ;
+ very_AdA : AdA ;
+</PRE>
+<P>
+The naming convention for lexical rules is that we use a word followed by
+the category. In this way we can for instance distinguish the quantifier
+<I>that</I> from the conjunction <I>that</I>.
+</P>
+<P>
+Open lexical categories have no objects in <CODE>Syntax</CODE>. Such objects
+will be built as they are needed in applications. The abstract
+syntax of words in applications is already familiar, e.g.
+</P>
+<PRE>
+ fun Wine : Kind ;
+</PRE>
+<P>
+The concrete syntax can be given directly, e.g.
+</P>
+<PRE>
+ lin Wine = mkN "wine" ;
+</PRE>
+<P>
+by using the morphological paradigm library <CODE>ParadigmsEng</CODE>.
+However, there are some advantages in giving the concrete syntax
+indirectly, via the creation of a <B>resource lexicon</B>. In this lexicon,
+there will be entries such as
+</P>
+<PRE>
+ oper wine_N : N = mkN "wine" ;
+</PRE>
+<P>
+which can then be used in the linearization rules,
+</P>
+<PRE>
+ lin Wine = wine_N ;
+</PRE>
+<P>
+One advantage of this indirect method is that each new word gives
+an addition to a reusable resource lexicon, instead of just doing
+the job of implementing the application. Another advantage will
+be shown <a href="#secfunctor">here</a>: the possibility to write functors over
+lexicon interfaces.
+</P>
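+<P>
+A resource lexicon of this kind could be sketched as follows. The module name
+<CODE>LexFoodsEng</CODE> is hypothetical, and the sketch assumes that <CODE>ParadigmsEng</CODE>
+provides an adjective paradigm <CODE>mkA</CODE> analogous to <CODE>mkN</CODE> and that the
+categories <CODE>N</CODE> and <CODE>A</CODE> are in scope through <CODE>SyntaxEng</CODE>.
+</P>
+<PRE>
+  resource LexFoodsEng = open SyntaxEng, ParadigmsEng in {
+    oper
+      wine_N  : N = mkN "wine" ;
+      pizza_N : N = mkN "pizza" ;
+      warm_A  : A = mkA "warm" ;
+  }
+</PRE>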
+<A NAME="toc83"></A>
+<H3>Phrasal categories</H3>
+<P>
+There are just four phrasal categories needed in the first application:
+</P>
+<PRE>
+ Cl ; -- clause e.g. "this pizza is good"
+ NP ; -- noun phrase e.g. "this pizza"
+ CN ; -- common noun e.g. "warm pizza"
+ AP ; -- adjectival phrase e.g. "very warm"
+</PRE>
+<P>
+Clauses are, roughly, the same as declarative sentences; we will
+define <a href="#secextended">here</a> a sentence <CODE>S</CODE> as a clause that has a fixed tense.
+The distinction between common nouns and noun phrases is that common nouns
+cannot generally be used alone as subjects (?<I>dog sleeps</I>),
+whereas noun phrases can (<I>the dog sleeps</I>).
+Noun phrases can be built from common nouns by adding determiners,
+such as quantifiers; but there are also other kinds of noun phrases, e.g.
+pronouns.
+</P>
+<P>
+The syntactic combinations we need are the following:
+</P>
+<PRE>
+ mkCl : NP -&gt; AP -&gt; Cl ; -- e.g. "this pizza is very warm"
+ mkNP : QuantSg -&gt; CN -&gt; NP ; -- e.g. "this pizza"
+ mkNP : QuantPl -&gt; CN -&gt; NP ; -- e.g. "these pizzas"
+ mkCN : AP -&gt; CN -&gt; CN ; -- e.g. "warm pizza"
+ mkAP : AdA -&gt; AP -&gt; AP ; -- e.g. "very warm"
+</PRE>
+<P>
+To start building phrases, we need rules of <B>lexical insertion</B>, which
+form phrases from single words:
+</P>
+<PRE>
+ mkCN : N -&gt; CN ;
+ mkAP : A -&gt; AP ;
+</PRE>
+<P>
+Notice that all (or, as many as possible) operations in the resource library
+have the name <CODE>mk</CODE><I>C</I>, where <I>C</I> is the value category of the operation.
+This of course means heavy overloading. For instance, the current library
+(version 1.2) has no fewer than 23 operations named <CODE>mkNP</CODE>!
+</P>
+<P>
+Now the sentence
+<center>
+<I>these very warm pizzas are Italian</I>
+</center>
+can be built as follows:
+</P>
+<PRE>
+ mkCl
+ (mkNP these_QuantPl
+ (mkCN (mkAP very_AdA (mkAP warm_A)) (mkCN pizza_N)))
+ (mkAP italian_A)
+</PRE>
+<P>
+The task we are facing now is to define the concrete syntax of <CODE>Foods</CODE> so that
+this syntactic tree gives the value of linearizing the semantic tree
+</P>
+<PRE>
+ Is (These (QKind (Very Warm) Pizza)) Italian
+</PRE>
+<P></P>
+<A NAME="toc84"></A>
+<H2>The resource API</H2>
+<P>
+The resource library API is divided into language-specific
+and language-independent parts. To put it roughly,
+</P>
+<UL>
+<LI>the syntax API is language-independent, i.e. has the same types and
+ functions for all languages.
+ Its name is <CODE>Syntax</CODE><I>L</I> for each language <I>L</I>
+<LI>the morphology API is language-specific, i.e. has partly
+ different types and functions
+ for different languages.
+ Its name is <CODE>Paradigms</CODE><I>L</I> for each language <I>L</I>
+</UL>
+
+<P>
+Full documentation of the API is available on-line in the
+<B>resource synopsis</B>.
+For the examples of this chapter, we will only need a
+fragment of the full API. The fragment needed for <CODE>Foods</CODE> has
+already been introduced, but let us summarize the descriptions
+by giving tables of the same form as used in the resource synopsis.
+</P>
+<P>
+Thus we will make use of the following categories from the module <CODE>Syntax</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Category</TH>
+<TH>Explanation</TH>
+<TH COLSPAN="2">Example</TH>
+</TR>
+<TR>
+<TD><CODE>Cl</CODE></TD>
+<TD>clause (sentence), with all tenses</TD>
+<TD><I>she looks at this</I></TD>
+</TR>
+<TR>
+<TD><CODE>AP</CODE></TD>
+<TD>adjectival phrase</TD>
+<TD><I>very warm</I></TD>
+</TR>
+<TR>
+<TD><CODE>CN</CODE></TD>
+<TD>common noun (without determiner)</TD>
+<TD><I>red house</I></TD>
+</TR>
+<TR>
+<TD><CODE>NP</CODE></TD>
+<TD>noun phrase (subject or object)</TD>
+<TD><I>the red house</I></TD>
+</TR>
+<TR>
+<TD><CODE>AdA</CODE></TD>
+<TD>adjective-modifying adverb</TD>
+<TD><I>very</I></TD>
+</TR>
+<TR>
+<TD><CODE>QuantSg</CODE></TD>
+<TD>singular quantifier</TD>
+<TD><I>this</I></TD>
+</TR>
+<TR>
+<TD><CODE>QuantPl</CODE></TD>
+<TD>plural quantifier</TD>
+<TD><I>these</I></TD>
+</TR>
+<TR>
+<TD><CODE>A</CODE></TD>
+<TD>one-place adjective</TD>
+<TD><I>warm</I></TD>
+</TR>
+<TR>
+<TD><CODE>N</CODE></TD>
+<TD>common noun</TD>
+<TD><I>house</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+We will use the following syntax rules from <CODE>Syntax</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH>Type</TH>
+<TH COLSPAN="2">Example</TH>
+</TR>
+<TR>
+<TD><CODE>mkCl</CODE></TD>
+<TD><CODE>NP -&gt; AP -&gt; Cl</CODE></TD>
+<TD><I>John is very old</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkNP</CODE></TD>
+<TD><CODE>QuantSg -&gt; CN -&gt; NP</CODE></TD>
+<TD><I>this old man</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkNP</CODE></TD>
+<TD><CODE>QuantPl -&gt; CN -&gt; NP</CODE></TD>
+<TD><I>these old men</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkCN</CODE></TD>
+<TD><CODE>N -&gt; CN</CODE></TD>
+<TD><I>house</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkCN</CODE></TD>
+<TD><CODE>AP -&gt; CN -&gt; CN</CODE></TD>
+<TD><I>very big blue house</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkAP</CODE></TD>
+<TD><CODE>A -&gt; AP</CODE></TD>
+<TD><I>old</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkAP</CODE></TD>
+<TD><CODE>AdA -&gt; AP -&gt; AP</CODE></TD>
+<TD><I>very very old</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+We will use the following structural words from <CODE>Syntax</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH>Type</TH>
+<TH COLSPAN="2">In English</TH>
+</TR>
+<TR>
+<TD><CODE>this_QuantSg</CODE></TD>
+<TD><CODE>QuantSg</CODE></TD>
+<TD><I>this</I></TD>
+</TR>
+<TR>
+<TD><CODE>that_QuantSg</CODE></TD>
+<TD><CODE>QuantSg</CODE></TD>
+<TD><I>that</I></TD>
+</TR>
+<TR>
+<TD><CODE>these_QuantPl</CODE></TD>
+<TD><CODE>QuantPl</CODE></TD>
+<TD><I>these</I></TD>
+</TR>
+<TR>
+<TD><CODE>those_QuantPl</CODE></TD>
+<TD><CODE>QuantPl</CODE></TD>
+<TD><I>those</I></TD>
+</TR>
+<TR>
+<TD><CODE>very_AdA</CODE></TD>
+<TD><CODE>AdA</CODE></TD>
+<TD><I>very</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+For English, we will use the following part of <CODE>ParadigmsEng</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH COLSPAN="2">Type</TH>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(dog : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(man,men : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(cold : Str) -&gt; A</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
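+<P>
+To see how the overloaded names are resolved, each subphrase of an expression can be given an explicit type.
+The following is a minimal sketch, assuming that <CODE>SyntaxEng</CODE> and <CODE>ParadigmsEng</CODE> provide
+the functions tabulated above; the module name <CODE>OverloadDemo</CODE> and the <CODE>oper</CODE> names are
+made up for illustration.
+</P>
+<PRE>
+ resource OverloadDemo = open SyntaxEng, ParadigmsEng in {
+   oper
+     -- each comment names the overload instance selected by the argument types
+     pizzaCN     : CN = mkCN (mkN "pizza") ;                 -- mkCN : N -&gt; CN
+     warmAP      : AP = mkAP (mkA "warm") ;                  -- mkAP : A -&gt; AP
+     veryWarmAP  : AP = mkAP very_AdA warmAP ;               -- mkAP : AdA -&gt; AP -&gt; AP
+     warmPizzaCN : CN = mkCN veryWarmAP pizzaCN ;            -- mkCN : AP -&gt; CN -&gt; CN
+     thesePizzas : NP = mkNP these_QuantPl warmPizzaCN ;     -- mkNP : QuantPl -&gt; CN -&gt; NP
+     theClause   : Cl = mkCl thesePizzas (mkAP (mkA "Italian")) ; -- mkCl : NP -&gt; AP -&gt; Cl
+ }
+</PRE>
+<P>
+The last definition corresponds to the clause <I>these very warm pizzas are Italian</I>.
+</P>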
+<P>
+For Italian, we need just the following part of <CODE>ParadigmsIta</CODE>
+(Exercise). The "smart" paradigms will take care of variations
+such as <I>formaggio</I>-<I>formaggi</I>, and also infer the genders
+correctly.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH COLSPAN="2">Type</TH>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(vino : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(caro : Str) -&gt; A</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+For German, we will use the following part of <CODE>ParadigmsGer</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH COLSPAN="2">Type</TH>
+</TR>
+<TR>
+<TD><CODE>Gender</CODE></TD>
+<TD><CODE>Type</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>masculine</CODE></TD>
+<TD><CODE>Gender</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>feminine</CODE></TD>
+<TD><CODE>Gender</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>neuter</CODE></TD>
+<TD><CODE>Gender</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(Stufe : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(Bild,Bilder : Str) -&gt; Gender -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(klein : Str) -&gt; A</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(gut,besser,beste : Str) -&gt; A</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+For Finnish, we only need the smart regular paradigms:
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH COLSPAN="2">Type</TH>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(talo : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(hieno : Str) -&gt; A</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+<B>Exercise</B>. Try out the morphological paradigms in different languages. Do
+as follows:
+</P>
+<PRE>
+ &gt; i -path=alltenses:prelude -retain alltenses/ParadigmsGer.gfr
+ &gt; cc mkN "Farbe"
+ &gt; cc mkA "gut" "besser" "beste"
+</PRE>
+<P></P>
+<A NAME="toc85"></A>
+<H2>Example: English</H2>
+<P>
+<a name="secenglish"></a>
+</P>
+<P>
+We work with the abstract syntax <CODE>Foods</CODE> from <a href="#chaptwo">the fourth chapter</a>, and
+build first an English implementation. Now we can do it without
+thinking about inflection and agreement, by just picking appropriate
+functions from the resource grammar library.
+</P>
+<P>
+The concrete syntax opens <CODE>SyntaxEng</CODE> and <CODE>ParadigmsEng</CODE>
+to get access to the resource libraries needed. In order to find
+the libraries, a <CODE>path</CODE> directive is prepended. It contains
+two resource subdirectories --- <CODE>present</CODE> and <CODE>prelude</CODE> ---
+which are found relative to the environment variable <CODE>GF_LIB_PATH</CODE>.
+It also contains the current directory <CODE>.</CODE> and the directory <CODE>../foods</CODE>,
+in which <CODE>Foods.gf</CODE> resides.
+</P>
+<PRE>
+ --# -path=.:../foods:present:prelude
+
+ concrete FoodsEng of Foods = open SyntaxEng,ParadigmsEng in {
+</PRE>
+<P>
+As linearization types, we will use clauses for <CODE>Phrase</CODE>, noun phrases
+for <CODE>Item</CODE>, common nouns for <CODE>Kind</CODE>, and adjectival phrases for <CODE>Quality</CODE>.
+</P>
+<PRE>
+ lincat
+ Phrase = Cl ;
+ Item = NP ;
+ Kind = CN ;
+ Quality = AP ;
+</PRE>
+<P>
+These types fit perfectly with the way we have used the categories
+in the application; hence
+the combination rules we need almost write themselves automatically:
+</P>
+<PRE>
+ lin
+ Is item quality = mkCl item quality ;
+ This kind = mkNP this_QuantSg kind ;
+ That kind = mkNP that_QuantSg kind ;
+ These kind = mkNP these_QuantPl kind ;
+ Those kind = mkNP those_QuantPl kind ;
+ QKind quality kind = mkCN quality kind ;
+ Very quality = mkAP very_AdA quality ;
+</PRE>
+<P>
+We write the lexical part of the grammar by using resource paradigms directly.
+Notice that we have to apply the lexical insertion rules to get type-correct
+linearizations. Notice also that we need to use the two-place noun paradigm for
+<I>fish</I>, but everything else is regular.
+</P>
+<PRE>
+ Wine = mkCN (mkN "wine") ;
+ Pizza = mkCN (mkN "pizza") ;
+ Cheese = mkCN (mkN "cheese") ;
+ Fish = mkCN (mkN "fish" "fish") ;
+ Fresh = mkAP (mkA "fresh") ;
+ Warm = mkAP (mkA "warm") ;
+ Italian = mkAP (mkA "Italian") ;
+ Expensive = mkAP (mkA "expensive") ;
+ Delicious = mkAP (mkA "delicious") ;
+ Boring = mkAP (mkA "boring") ;
+ }
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Compile the grammar <CODE>FoodsEng</CODE> and generate
+and parse some sentences.
+</P>
+<P>
+<B>Exercise</B>. Write a concrete syntax of <CODE>Foods</CODE> for Italian
+or some other language included in the resource library. You can
+compare the results with the hand-written
+grammars presented earlier in this tutorial.
+</P>
+<A NAME="toc86"></A>
+<H2>Functor implementation of multilingual grammars</H2>
+<P>
+<a name="secfunctor"></a>
+</P>
+<P>
+If you did the exercise of writing a concrete syntax of <CODE>Foods</CODE> for some other
+language, you probably noticed that much of the code looks exactly the same
+as for English. The reason for this is that the <CODE>Syntax</CODE> API is the
+same for all languages. This is in turn possible because
+all languages (at least those in the resource package)
+implement the same syntactic structures. Moreover, languages tend to use the
+syntactic structures in similar ways, although there are exceptions.
+But usually, it is only the lexical parts of a concrete syntax that
+we need to write anew for a new language. Thus, to port a grammar to
+a new language, you
+</P>
+<OL>
+<LI>copy the concrete syntax of a given language
+<LI>change the words (strings and inflection paradigms)
+</OL>
+
+<P>
+Now, programming by copy-and-paste is not worthy of a functional programmer!
+So, can we write a <I>function</I> that takes care of the shared parts of grammar modules?
+Yes, we can. It is not a function in the <CODE>fun</CODE> or <CODE>oper</CODE> sense, but
+a function operating on modules, called a <B>functor</B>. This construct
+is familiar from the functional programming
+languages ML and OCaml, but it does not
+exist in Haskell. It also bears some resemblance to templates in C++.
+Functors are also known as <B>parametrized modules</B>.
+</P>
+<P>
+In GF, a functor is a module that <CODE>open</CODE>s one or more <B>interfaces</B>.
+An <CODE>interface</CODE> is a module similar to a <CODE>resource</CODE>, but it only
+contains the <I>types</I> of <CODE>oper</CODE>s, not their definitions. You can think
+of an interface as a kind of a record type. The <CODE>oper</CODE> names are the
+labels of this record type. The corresponding <I>record</I> is called an
+<B>instance</B> of the interface.
+Thus a functor is a module-level function taking instances as
+arguments and producing modules as values.
+</P>
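+<P>
+The record analogy can be spelled out with ordinary GF records. The following sketch is only an analogy, not
+the functor mechanism itself; the module name <CODE>FunctorAnalogy</CODE> and all <CODE>oper</CODE> names in it
+are hypothetical.
+</P>
+<PRE>
+ resource FunctorAnalogy = {
+   oper
+     -- plays the role of an interface: a record type with one label per word
+     Lex : Type = {wine : Str ; pizza : Str} ;
+
+     -- play the role of instances: records of that type
+     lexEng : Lex = {wine = "wine" ; pizza = "pizza"} ;
+     lexGer : Lex = {wine = "Wein" ; pizza = "Pizza"} ;
+
+     -- plays the role of a functor: a function from an instance to a value
+     wineAndPizza : Lex -&gt; Str = \lex -&gt; lex.wine ++ lex.pizza ;
+ }
+</PRE>
+<P>
+Applying <CODE>wineAndPizza</CODE> to <CODE>lexEng</CODE> or <CODE>lexGer</CODE> corresponds to instantiating a
+functor with an English or a German instance.
+</P>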
+<P>
+Let us now write a functor implementation of the <CODE>Foods</CODE> grammar.
+Consider its module header first:
+</P>
+<PRE>
+ incomplete concrete FoodsI of Foods = open Syntax, LexFoods in
+</PRE>
+<P>
+A functor is distinguished from an ordinary module by the leading
+keyword <CODE>incomplete</CODE>.
+</P>
+<P>
+In the functor-function analogy, <CODE>FoodsI</CODE> would be presented as a function
+with the following type signature:
+</P>
+<PRE>
+ FoodsI :
+ instance of Syntax -&gt; instance of LexFoods -&gt; concrete of Foods
+</PRE>
+<P>
+It takes as arguments instances of two interfaces:
+</P>
+<UL>
+<LI><CODE>Syntax</CODE>, the resource grammar interface
+<LI><CODE>LexFoods</CODE>, the domain-specific lexicon interface
+</UL>
+
+<P>
+Functors opening <CODE>Syntax</CODE> and a domain lexicon interface are in fact
+so typical in GF applications, that this structure could be called
+a <B>design pattern</B>
+for GF grammars. What makes this pattern so useful is, again, that
+languages tend to use the same syntactic structures and only differ in words.
+</P>
+<P>
+We will show the exact syntax of interfaces and instances in the next section.
+Here it is enough to know that we have
+</P>
+<UL>
+<LI><CODE>SyntaxGer</CODE>, an instance of <CODE>Syntax</CODE>
+<LI><CODE>LexFoodsGer</CODE>, an instance of <CODE>LexFoods</CODE>
+</UL>
+
+<P>
+Then we can complete the German implementation by "applying" the functor:
+</P>
+<PRE>
+ FoodsI SyntaxGer LexFoodsGer : concrete of Foods
+</PRE>
+<P>
+The GF syntax for doing so is
+</P>
+<PRE>
+ concrete FoodsGer of Foods = FoodsI with
+ (Syntax = SyntaxGer),
+ (LexFoods = LexFoodsGer) ;
+</PRE>
+<P>
+Notice that this is the <I>whole</I> module, not just a header of it.
+The module body is received from <CODE>FoodsI</CODE>, by instantiating the
+interface constants with their definitions given in the German
+instances. A module of this form, characterized by the keyword <CODE>with</CODE>, is
+called a <B>functor instantiation</B>.
+</P>
+<P>
+Here is the complete code for the functor <CODE>FoodsI</CODE>:
+</P>
+<PRE>
+ --# -path=.:../foods:present:prelude
+
+ incomplete concrete FoodsI of Foods = open Syntax, LexFoods in {
+ lincat
+ Phrase = Cl ;
+ Item = NP ;
+ Kind = CN ;
+ Quality = AP ;
+ lin
+ Is item quality = mkCl item quality ;
+ This kind = mkNP this_QuantSg kind ;
+ That kind = mkNP that_QuantSg kind ;
+ These kind = mkNP these_QuantPl kind ;
+ Those kind = mkNP those_QuantPl kind ;
+ QKind quality kind = mkCN quality kind ;
+ Very quality = mkAP very_AdA quality ;
+
+ Wine = mkCN wine_N ;
+ Pizza = mkCN pizza_N ;
+ Cheese = mkCN cheese_N ;
+ Fish = mkCN fish_N ;
+ Fresh = mkAP fresh_A ;
+ Warm = mkAP warm_A ;
+ Italian = mkAP italian_A ;
+ Expensive = mkAP expensive_A ;
+ Delicious = mkAP delicious_A ;
+ Boring = mkAP boring_A ;
+ }
+</PRE>
+<P></P>
+<A NAME="toc87"></A>
+<H2>Interfaces and instances</H2>
+<P>
+<a name="secinterface"></a>
+</P>
+<P>
+Let us now define the <CODE>LexFoods</CODE> interface:
+</P>
+<PRE>
+ interface LexFoods = open Syntax in {
+ oper
+ wine_N : N ;
+ pizza_N : N ;
+ cheese_N : N ;
+ fish_N : N ;
+ fresh_A : A ;
+ warm_A : A ;
+ italian_A : A ;
+ expensive_A : A ;
+ delicious_A : A ;
+ boring_A : A ;
+ }
+</PRE>
+<P>
+In this interface, only lexical items are declared. In general, an
+interface can declare any functions and also types. The <CODE>Syntax</CODE>
+interface does so.
+</P>
+<P>
+Here is a German instance of the interface.
+</P>
+<PRE>
+ instance LexFoodsGer of LexFoods = open SyntaxGer, ParadigmsGer in {
+ oper
+ wine_N = mkN "Wein" ;
+ pizza_N = mkN "Pizza" "Pizzen" feminine ;
+ cheese_N = mkN "Käse" "Käse" masculine ;
+ fish_N = mkN "Fisch" ;
+ fresh_A = mkA "frisch" ;
+ warm_A = mkA "warm" "wärmer" "wärmste" ;
+ italian_A = mkA "italienisch" ;
+ expensive_A = mkA "teuer" ;
+ delicious_A = mkA "köstlich" ;
+ boring_A = mkA "langweilig" ;
+ }
+</PRE>
+<P>
+Notice that when an interface opens an interface, as <CODE>LexFoods</CODE> opens <CODE>Syntax</CODE>
+here, its instance has to open an instance of that interface. But the instance
+may also open some other resources --- very typically, like here,
+a domain lexicon instance opens a <CODE>Paradigms</CODE> module.
+</P>
+<P>
+Just to complete the picture, we repeat the German functor instantiation
+for <CODE>FoodsI</CODE>, this time with a path directive that makes it compilable.
+</P>
+<PRE>
+ --# -path=.:../foods:present:prelude
+
+ concrete FoodsGer of Foods = FoodsI with
+ (Syntax = SyntaxGer),
+ (LexFoods = LexFoodsGer) ;
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Compile and test <CODE>FoodsGer</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Refactor <CODE>FoodsEng</CODE> into a functor instantiation.
+</P>
+<A NAME="toc88"></A>
+<H2>Adding languages to a functor implementation</H2>
+<P>
+Once we have an application grammar defined by using a functor,
+adding a new language is simple. Just two modules need to be written:
+</P>
+<UL>
+<LI>a domain lexicon instance
+<LI>a functor instantiation
+</UL>
+
+<P>
+The functor instantiation is completely mechanical to write.
+Here is one for Finnish:
+</P>
+<PRE>
+ --# -path=.:../foods:present:prelude
+
+ concrete FoodsFin of Foods = FoodsI with
+ (Syntax = SyntaxFin),
+ (LexFoods = LexFoodsFin) ;
+</PRE>
+<P>
+The domain lexicon instance requires some knowledge of the words of the
+language: what words are used for which concepts, how the words are
+inflected, plus features such as genders. Here is a lexicon instance for
+Finnish:
+</P>
+<PRE>
+ instance LexFoodsFin of LexFoods = open SyntaxFin, ParadigmsFin in {
+ oper
+ wine_N = mkN "viini" ;
+ pizza_N = mkN "pizza" ;
+ cheese_N = mkN "juusto" ;
+ fish_N = mkN "kala" ;
+ fresh_A = mkA "tuore" ;
+ warm_A = mkA "lämmin" ;
+ italian_A = mkA "italialainen" ;
+ expensive_A = mkA "kallis" ;
+ delicious_A = mkA "herkullinen" ;
+ boring_A = mkA "tylsä" ;
+ }
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Instantiate the functor <CODE>FoodsI</CODE> to some language of
+your choice.
+</P>
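+<P>
+As one possible answer, here is a sketch of the two modules for Italian. It assumes that the smart paradigms
+<CODE>mkN</CODE> and <CODE>mkA</CODE> of <CODE>ParadigmsIta</CODE> inflect all of these words correctly from a
+single form; words for which this fails would need a paradigm with more arguments. The module names
+<CODE>LexFoodsIta</CODE> and <CODE>FoodsIta</CODE> simply follow the naming used for German and Finnish.
+</P>
+<PRE>
+ instance LexFoodsIta of LexFoods = open SyntaxIta, ParadigmsIta in {
+   oper
+     wine_N = mkN "vino" ;
+     pizza_N = mkN "pizza" ;
+     cheese_N = mkN "formaggio" ;
+     fish_N = mkN "pesce" ;
+     fresh_A = mkA "fresco" ;
+     warm_A = mkA "caldo" ;
+     italian_A = mkA "italiano" ;
+     expensive_A = mkA "caro" ;
+     delicious_A = mkA "delizioso" ;
+     boring_A = mkA "noioso" ;
+ }
+</PRE>
+<PRE>
+ --# -path=.:../foods:present:prelude
+
+ concrete FoodsIta of Foods = FoodsI with
+   (Syntax = SyntaxIta),
+   (LexFoods = LexFoodsIta) ;
+</PRE>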
+<A NAME="toc89"></A>
+<H2>Division of labour revisited</H2>
+<P>
+One purpose of the resource grammars was stated to be a division
+of labour between linguists and application grammarians. We can now
+reflect on what this means more precisely, by asking ourselves what
+skills are required of grammarians working on different components.
+</P>
+<P>
+Building a GF application starts from the abstract syntax. Writing
+an abstract syntax requires
+</P>
+<UL>
+<LI>understanding of the semantic structure of the application domain
+<LI>knowledge of the GF fragment with categories and functions
+</UL>
+
+<P>
+If the concrete syntax is written by using a functor, the programmer
+has to decide what parts of the implementation are put in the interface
+and what parts are shared in the functor. This requires
+</P>
+<UL>
+<LI>knowing how the domain concepts are expressed in natural language
+<LI>knowledge of the resource grammar library --- the categories and combinators
+<LI>understanding what parts are likely to be expressed in language-dependent
+ ways, so that they are put in an interface and not in the functor
+<LI>knowledge of the GF fragment with function applications and strings
+</UL>
+
+<P>
+Instantiating a ready-made functor to a new language is less demanding.
+It requires essentially
+</P>
+<UL>
+<LI>knowing how the domain words are expressed in the language
+<LI>knowing, roughly, how these words are inflected
+<LI>knowledge of the paradigms available in the library
+<LI>knowledge of the GF fragment with function applications and strings
+</UL>
+
+<P>
+Notice that none of these tasks requires the use of GF records, tables,
+or parameters. Thus only a small fragment of GF is needed; the rest of
+GF is only relevant for those who write the libraries. Essentially,
+all the machinery introduced in <a href="#chaptwo">the fourth chapter</a> is unnecessary!
+</P>
+<P>
+Of course, grammar writing is not always just straightforward usage of libraries.
+For example, GF can be used for other languages than just those in the
+libraries --- for both natural and formal languages. A knowledge of records
+and tables can, unfortunately, also be needed for understanding GF's error
+messages.
+</P>
+<P>
+<B>Exercise</B>. Design a small grammar that can be used for controlling
+an MP3 player. The grammar should be able to recognize commands such
+as <I>play this song</I>, with the following variations:
+</P>
+<UL>
+<LI>verbs: <I>play</I>, <I>remove</I>
+<LI>objects: <I>song</I>, <I>artist</I>
+<LI>determiners: <I>this</I>, <I>the previous</I>
+<LI>verbs without arguments: <I>stop</I>, <I>pause</I>
+</UL>
+
+<P>
+The implementation proceeds in the following phases:
+</P>
+<OL>
+<LI>abstract syntax
+<LI>functor and lexicon interface
+<LI>lexicon instance for the first language
+<LI>functor instantiation for the first language
+<LI>lexicon instance for the second language
+<LI>functor instantiation for the second language
+<LI>...
+</OL>
+
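+<P>
+The first phase could start from an abstract syntax along the following lines. This is only a sketch of one
+possible design; the module name <CODE>Player</CODE> and all category and function names are hypothetical.
+</P>
+<PRE>
+ abstract Player = {
+
+   flags startcat=Command ;
+
+   cat
+     Command ;  -- what the user says to the player
+     Action ;   -- verb with an argument
+     Item ;     -- determined object, e.g. "this song"
+     Kind ;     -- kind of object, e.g. "song"
+
+   fun
+     Do : Action -&gt; Item -&gt; Command ;   -- play this song
+     Stop, Pause : Command ;            -- verbs without arguments
+     This, Previous : Kind -&gt; Item ;    -- this song, the previous song
+     Song, Artist : Kind ;
+     Play, Remove : Action ;
+ }
+</PRE>
+<P>
+The categories mirror those of <CODE>Foods</CODE>, so the functor and the lexicon interface of the second
+phase can follow the pattern of <CODE>FoodsI</CODE> and <CODE>LexFoods</CODE>.
+</P>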
+<A NAME="toc90"></A>
+<H2>Restricted inheritance</H2>
+<P>
+A functor implementation using the resource <CODE>Syntax</CODE> interface
+works well as long as all concepts are expressed by using the same structures
+in all languages. If this is not the case, the deviant linearization can
+be made into a parameter and moved to the domain lexicon interface.
+</P>
+<P>
+The <CODE>Foods</CODE> grammar works so well that we have to
+take a contrived example: assume that English has
+no word for <CODE>Pizza</CODE>, but has to use the paraphrase <I>Italian pie</I>.
+This paraphrase is no longer a noun <CODE>N</CODE>, but a complex phrase
+in the category <CODE>CN</CODE>. An obvious way to solve this problem is
+to change interface <CODE>LexFoods</CODE> so that the constant declared for
+<CODE>Pizza</CODE> gets a new type:
+</P>
+<PRE>
+ oper pizza_CN : CN ;
+</PRE>
+<P>
+But this solution is unstable: we may end up changing the interface
+and the functor with each new language, and we must every time also
+change the interface instances for the old languages to maintain
+type correctness.
+</P>
+<P>
+A better solution is to use <B>restricted inheritance</B>: the English
+instantiation inherits the functor implementation except for the
+constant <CODE>Pizza</CODE>. It is written as follows:
+</P>
+<PRE>
+ --# -path=.:../foods:present:prelude
+
+ concrete FoodsEng of Foods = FoodsI - [Pizza] with
+ (Syntax = SyntaxEng),
+ (LexFoods = LexFoodsEng) **
+ open SyntaxEng, ParadigmsEng in {
+
+ lin Pizza = mkCN (mkA "Italian") (mkN "pie") ;
+ }
+</PRE>
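+<P>
+The instantiation above presupposes an English lexicon instance <CODE>LexFoodsEng</CODE>, which has not been
+shown. A minimal sketch, assuming it simply reuses the paradigm applications of the earlier <CODE>FoodsEng</CODE>
+lexicon, could look as follows. Notice that <CODE>pizza_N</CODE> must still be defined, because the interface
+declares it, even though the excluded rule for <CODE>Pizza</CODE> no longer uses it.
+</P>
+<PRE>
+ instance LexFoodsEng of LexFoods = open SyntaxEng, ParadigmsEng in {
+   oper
+     wine_N = mkN "wine" ;
+     pizza_N = mkN "pizza" ;        -- required by the interface, unused in FoodsEng
+     cheese_N = mkN "cheese" ;
+     fish_N = mkN "fish" "fish" ;   -- two-place paradigm: plural equals singular
+     fresh_A = mkA "fresh" ;
+     warm_A = mkA "warm" ;
+     italian_A = mkA "Italian" ;
+     expensive_A = mkA "expensive" ;
+     delicious_A = mkA "delicious" ;
+     boring_A = mkA "boring" ;
+ }
+</PRE>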
+<P>
+Restricted inheritance is available for all inherited modules. One can for
+instance exclude some mushrooms and pick up just some fruit in
+the <CODE>Foodmarket</CODE> example (<a href="#secarchitecture">here</a>):
+</P>
+<PRE>
+ abstract Foodmarket = Food, Fruit [Peach], Mushroom - [Agaric]
+</PRE>
+<P>
+A concrete syntax of <CODE>Foodmarket</CODE> must then have the same inheritance
+restrictions, in order to be well-typed with respect to the abstract syntax.
+</P>
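+<P>
+For instance, the header of an English concrete syntax could mirror the abstract one as follows. This is a
+sketch that shows only the header, as above; the module names <CODE>FoodmarketEng</CODE>, <CODE>FoodEng</CODE>,
+<CODE>FruitEng</CODE>, and <CODE>MushroomEng</CODE> are assumed for illustration.
+</P>
+<PRE>
+ concrete FoodmarketEng of Foodmarket =
+   FoodEng, FruitEng [Peach], MushroomEng - [Agaric]
+</PRE>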
+<A NAME="toc91"></A>
+<H2>Grammar reuse</H2>
+<P>
+The alert reader has certainly noticed an analogy between <CODE>abstract</CODE>
+and <CODE>concrete</CODE>, on the one hand, and <CODE>interface</CODE> and <CODE>instance</CODE>,
+on the other. Why are these two pairs of module types kept separate
+at all? There is, in fact, a very close correspondence between
+judgements in the two kinds of modules:
+</P>
+<PRE>
+ cat C &lt;---&gt; oper C : Type
+
+ fun f : A &lt;---&gt; oper f : A
+
+ lincat C = T &lt;---&gt; oper C : Type = T
+
+ lin f = t &lt;---&gt; oper f : A = t
+</PRE>
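+<P>
+As an illustration, here is the same correspondence spelled out for the category <CODE>Kind</CODE> and the
+function <CODE>Wine</CODE> of the <CODE>Foods</CODE> grammar, assuming the linearizations used in
+<CODE>FoodsEng</CODE> above:
+</P>
+<PRE>
+ cat Kind                      &lt;---&gt;  oper Kind : Type
+
+ fun Wine : Kind               &lt;---&gt;  oper Wine : Kind
+
+ lincat Kind = CN              &lt;---&gt;  oper Kind : Type = CN
+
+ lin Wine = mkCN (mkN "wine")  &lt;---&gt;  oper Wine : Kind = mkCN (mkN "wine")
+</PRE>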
+<P>
+But there are also some differences:
+</P>
+<UL>
+<LI><CODE>abstract</CODE> and <CODE>concrete</CODE> modules define <B>top-level grammars</B>, i.e.
+ grammars that can be used for parsing and linearization
+<LI>this is possible because the types and terms in <CODE>concrete</CODE> modules are restricted to a subset
+ of those available in <CODE>interface</CODE>, <CODE>instance</CODE>, and <CODE>resource</CODE>
+<LI><CODE>param</CODE> judgements have no counterparts in top-level grammars
+</UL>
+
+<P>
+The term that can be used for interfaces, instances, and resources is
+<B>resource-level grammars</B>.
+From these explanations and the above translations it follows that top-level
+grammars are, in a sense, a special case of resource-level grammars.
+</P>
+<P>
+Thus, indeed, abstract syntax modules can be used like interfaces, and concrete syntaxes
+as their instances. The use of top-level grammars as resources
+is called <B>grammar reuse</B>. Whether a library module is a top-level or a
+resource-level module is mostly invisible to application programmers
+(see the Summary <a href="#seclock">here</a>
+for an exception to this). The GF resource grammar
+library itself is in fact built in two layers:
+</P>
+<UL>
+<LI>the <B>ground resource</B>: a set of top-level grammars for syntactic structures
+<LI>the <B>surface resource</B>: a resource-level grammar with overloaded operations
+ defined in terms of the ground resource
+</UL>
+
+<P>
+Both the ground
+resource and the surface resource can be used by application programmers,
+but it is the surface resource that we use in this book. Because of overloading,
+it has far fewer function names and also flatter trees. For instance, the clause
+<center>
+<I>these very warm pizzas are Italian</I>
+</center>
+which in the surface resource can be built as
+</P>
+<PRE>
+ mkCl
+ (mkNP these_QuantPl
+ (mkCN (mkAP very_AdA (mkAP warm_A)) (mkCN pizza_N)))
+ (mkAP italian_A)
+</PRE>
+<P>
+has in the ground resource the much more complex tree
+</P>
+<PRE>
+ PredVP
+ (DetCN (DetPl (PlQuant this_Quant) NoNum NoOrd)
+ (AdjCN (AdAP very_AdA (PositA warm_A)) (UseN pizza_N)))
+ (UseComp (CompAP (PositA italian_A)))
+</PRE>
+<P>
+The main advantage of using the ground resource is that the trees can then be found
+by using the parser, as shown in the next section. Otherwise, the overloaded surface
+resource constants are much easier to use.
+</P>
+<P>
+Needless to say, once a library has been defined in some way, it is easy to
+build layers of <B>derived libraries</B> on top of it, by using grammar reuse
+and, in the case of multilingual libraries, functors. This is indeed how
+the surface resource has been implemented: as a functor parametrized on
+the abstract syntax of the ground resource.
+</P>
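+<P>
+To give an idea of what such a derived layer looks like, the following fragment sketches how some of the
+overloaded operations could be defined in terms of the ground resource functions that appear in the tree
+above. It is an illustration under that assumption, not the actual library source, which has many more
+instances of each operation.
+</P>
+<PRE>
+ oper
+   mkAP = overload {
+     mkAP : A -&gt; AP         = PositA ;
+     mkAP : AdA -&gt; AP -&gt; AP = AdAP ;
+   } ;
+
+   mkCN = overload {
+     mkCN : N -&gt; CN         = UseN ;
+     mkCN : AP -&gt; CN -&gt; CN  = AdjCN ;
+   } ;
+
+   mkCl = overload {
+     -- the copula structure is hidden inside the definition
+     mkCl : NP -&gt; AP -&gt; Cl  = \np,ap -&gt; PredVP np (UseComp (CompAP ap)) ;
+   } ;
+</PRE>
+<P>
+Each instance is either a ground function under a new name or a fixed combination of ground functions,
+which is why the surface trees are flatter.
+</P>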
+<A NAME="toc92"></A>
+<H2>Browsing the resource with GF commands</H2>
+<P>
+<a name="secbrowsing"></a>
+</P>
+<P>
+In addition to reading the
+<A HREF="../../lib/resource-1.0/synopsis.html">resource synopsis</A>, you
+can find resource function combinations by using the parser. This
+is so because the resource library is in the end implemented as
+a top-level <CODE>abstract-concrete</CODE> grammar, on which parsing
+and linearization work.
+</P>
+<P>
+Unfortunately, currently (GF 2.8)
+only English and the Scandinavian languages can be
+parsed within acceptable computer resource limits when the full
+resource is used.
+</P>
+<P>
+To look for a syntax tree in the overload API by parsing, proceed as follows:
+</P>
+<PRE>
+ % gf -path=alltenses:prelude $GF_LIB_PATH/alltenses/OverLangEng.gfc
+
+ &gt; p -cat=S -overload "this grammar is too big"
+ mkS (mkCl (mkNP this_QuantSg grammar_N) (mkAP too_AdA big_A))
+</PRE>
+<P>
+The <CODE>-overload</CODE> option given to the parser is a directive to find the
+shallowest overloaded term that matches the parse tree.
+</P>
+<P>
+To view linearizations in all languages by parsing from English:
+</P>
+<PRE>
+ % gf $GF_LIB_PATH/alltenses/langs.gfcm
+
+ &gt; p -cat=S -lang=LangEng "this grammar is too big" | tb
+ UseCl TPres ASimul PPos (PredVP (DetCN (DetSg (SgQuant this_Quant)
+ NoOrd) (UseN grammar_N)) (UseComp (CompAP (AdAP too_AdA (PositA big_A)))))
+ Den här grammatiken är för stor
+ Esta gramática es demasiado grande
+ (Cyrillic: eta grammatika govorit des'at' jazykov)
+ Denne grammatikken er for stor
+ Questa grammatica è troppo grande
+ Diese Grammatik ist zu groß
+ Cette grammaire est trop grande
+ Tämä kielioppi on liian suuri
+ This grammar is too big
+ Denne grammatik er for stor
+</PRE>
+<P>
+This method shows the unambiguous ground resource functions and not
+the overloaded ones. It uses a precompiled grammar package of the GFCM or GFCC
+format; see <a href="#chapeight">the eighth chapter</a> for more information on this.
+</P>
+<P>
+Unfortunately, the Russian grammar uses at the moment a different
+character encoding than the rest and is therefore not displayed correctly
+in a terminal window. However, the GF syntax editor does display all
+examples correctly --- again, using the ground resource:
+</P>
+<PRE>
+ % gfeditor $GF_LIB_PATH/alltenses/langs.gfcm
+</PRE>
+<P>
+When you have constructed the tree, you will see the following screen:
+</P>
+<P>
+<center>
+</P>
+<P>
+ <IMG ALIGN="right" SRC="10lang-small.png" BORDER="0" ALT="">
+</P>
+<P>
+</center>
+</P>
+<P>
+<B>Exercise</B>. Find the resource grammar translations for the following
+English phrases (parse in the category <CODE>Phr</CODE>). You can first try to
+build the terms manually.
+</P>
+<P>
+<I>every man loves a woman</I>
+</P>
+<P>
+<I>this grammar speaks more than ten languages</I>
+</P>
+<P>
+<I>which languages aren't in the grammar</I>
+</P>
+<P>
+<I>which languages did you want to speak</I>
+</P>
+<A NAME="toc93"></A>
+<H2>An extended Foods grammar</H2>
+<P>
+<a name="secextended"></a>
+</P>
+<P>
+Now that we know how to find information in the resource grammar,
+we can easily extend the <CODE>Foods</CODE> fragment considerably. We shall enable
+the following new expressions:
+</P>
+<UL>
+<LI>questions: <I>Is this pizza Italian?</I> <I>Which pizza do you want to eat?</I>
+<LI>imperatives: <I>Eat that pizza please!</I>
+<LI>denials: <I>These pizzas are not Italian.</I>
+<LI>verbs: <I>eat</I>, <I>pay</I>
+<LI>guests, in addition to food items: <I>I, you, this lady</I>
+</UL>
+
+<A NAME="toc94"></A>
+<H3>Abstract syntax</H3>
+<P>
+Since we don't want to change the already existing <CODE>Foods</CODE> module,
+we build an extension of it, <CODE>ExtFoods</CODE>:
+</P>
+<PRE>
+ abstract ExtFoods = Foods ** {
+
+ flags startcat=Move ;
+
+ cat
+ Move ; -- dialogue move: declarative, question, or imperative
+ Verb ; -- transitive verb
+ Guest ; -- guest in restaurant
+ GuestKind ; -- type of guest
+
+ fun
+ MAssert : Phrase -&gt; Move ; -- This pizza is warm.
+ MDeny : Phrase -&gt; Move ; -- This pizza isn't warm.
+ MAsk : Phrase -&gt; Move ; -- Is this pizza warm?
+
+ PVerb : Guest -&gt; Verb -&gt; Item -&gt; Phrase ; -- we eat this pizza
+ PVerbWant : Guest -&gt; Verb -&gt; Item -&gt; Phrase ; -- we want to eat this pizza
+
+ WhichVerb :
+ Kind -&gt; Guest -&gt; Verb -&gt; Move ; -- Which pizza do you eat?
+ WhichVerbWant :
+ Kind -&gt; Guest -&gt; Verb -&gt; Move ; -- Which pizza do you want to eat?
+ WhichIs : Kind -&gt; Quality -&gt; Move ; -- Which wine is Italian?
+
+ Do : Verb -&gt; Item -&gt; Move ; -- Pay this wine!
+ DoPlease : Verb -&gt; Item -&gt; Move ; -- Pay this wine please!
+
+ I, You, We : Guest ;
+
+ GThis, GThat, GThese, GThose : GuestKind -&gt; Guest ;
+
+ Eat, Drink, Pay : Verb ;
+
+ Lady, Gentleman : GuestKind ;
+ }
+</PRE>
+<P>
+The concrete syntax is implemented by a functor that extends the
+already defined functor <CODE>FoodsI</CODE>.
+</P>
+<PRE>
+ incomplete concrete ExtFoodsI of ExtFoods =
+ FoodsI ** open Syntax, LexFoods in {
+
+ flags lexer=text ; unlexer=text ;
+</PRE>
+<P>
+The flags set up a lexer and unlexer that can deal with sentence-initial
+capital letters and proper spacing with punctuation (see <a href="#seclexing">here</a>
+for more information on lexers and unlexers).
+</P>
+<A NAME="toc95"></A>
+<H3>Linearization types</H3>
+<P>
+If we look at the resource documentation, we find several categories
+that are above the clause level and can thus host different kinds
+of dialogue moves:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Category</TH>
+<TH>Explanation</TH>
+<TH COLSPAN="2">Example</TH>
+</TR>
+<TR>
+<TD><CODE>Text</CODE></TD>
+<TD>text consisting of phrases</TD>
+<TD><I>He is here. Why?</I></TD>
+</TR>
+<TR>
+<TD><CODE>Phr</CODE></TD>
+<TD>phrase in a text</TD>
+<TD><I>but be quiet please</I></TD>
+</TR>
+<TR>
+<TD><CODE>Utt</CODE></TD>
+<TD>sentence, question, word...</TD>
+<TD><I>be quiet</I></TD>
+</TR>
+<TR>
+<TD><CODE>S</CODE></TD>
+<TD>declarative sentence</TD>
+<TD><I>she lived here</I></TD>
+</TR>
+<TR>
+<TD><CODE>QS</CODE></TD>
+<TD>question</TD>
+<TD><I>where did she live</I></TD>
+</TR>
+<TR>
+<TD><CODE>Imp</CODE></TD>
+<TD>imperative</TD>
+<TD><I>look at this</I></TD>
+</TR>
+<TR>
+<TD><CODE>QCl</CODE></TD>
+<TD>question clause, with all tenses</TD>
+<TD><I>why does she walk</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+We also find that only the category <CODE>Text</CODE> contains punctuation marks.
+So we choose this as the linearization type of <CODE>Move</CODE>. The other types
+are quite obvious.
+</P>
+<PRE>
+ lincat
+ Move = Text ;
+ Verb = V2 ;
+ Guest = NP ;
+ GuestKind = CN ;
+</PRE>
+<P>
+The category <CODE>V2</CODE> of <B>two-place verbs</B> includes both
+<B>transitive verbs</B> that take <B>direct objects</B> (e.g. <I>we watch him</I>)
+and verbs that take other kinds of <B>complements</B>, often with
+prepositions (<I>we look at him</I>). In a multilingual grammar, it is
+not guaranteed that transitive verbs are transitive in all languages,
+so the more general notion of two-place verb is more appropriate.
+</P>
+<A NAME="toc96"></A>
+<H3>Linearization rules</H3>
+<P>
+Now we need to find constructors that combine the new categories in
+appropriate ways. To form a text from a clause, we first make it into
+a sentence with <CODE>mkS</CODE>, and then apply <CODE>mkText</CODE>:
+</P>
+<PRE>
+ lin MAssert p = mkText (mkS p) ;
+</PRE>
+<P>
+The function <CODE>mkS</CODE> has in the resource synopsis been given the type
+</P>
+<PRE>
+ mkS : (Tense) -&gt; (Ant) -&gt; (Pol) -&gt; Cl -&gt; S
+</PRE>
+<P>
+Parentheses around type names do not make any difference for the GF compiler,
+but in the synopsis notation they indicate <B>optionality</B>: any of the
+optional arguments can be omitted, and a matching instance of <CODE>mkS</CODE>
+is still available. For each omitted argument, it uses the <B>default value</B> of that
+type, which for the <B>polarity</B> <CODE>Pol</CODE> is positive, i.e. unnegated.
+To build a negative sentence, we use an explicit polarity constructor:
+</P>
+<PRE>
+ MDeny p = mkText (mkS negativePol p) ;
+</PRE>
+<P>
+Of course, we could have used <CODE>positivePol</CODE> in the first rule, instead of
+relying on the default. (The types <CODE>Tense</CODE> and <CODE>Ant</CODE> will be explained
+<a href="#sectense">here</a>.)
+</P>
+<P>
+Phrases can be made into <B>question sentences</B>, which in turn can be
+made into texts in a similar way as sentences; the default
+punctuation mark is not the full stop but the question mark.
+</P>
+<PRE>
+ MAsk p = mkText (mkQS p) ;
+</PRE>
+<P>
+There is an <CODE>mkCl</CODE> instance that directly builds a clause from a noun phrase,
+a two-place verb, and another noun phrase.
+</P>
+<PRE>
+ PVerb = mkCl ;
+</PRE>
+<P>
+The auxiliary verb <I>want</I> requires a <B>verb phrase</B> (<CODE>VP</CODE>) as its complement. It
+can be built from a two-place verb and its noun phrase complement.
+</P>
+<PRE>
+ PVerbWant guest verb item = mkCl guest want_VV (mkVP verb item) ;
+</PRE>
+<P>
+The <B>interrogative determiner</B> (<CODE>IDet</CODE>) <I>which</I> can be combined with
+a common noun to form an <B>interrogative phrase</B> (<CODE>IP</CODE>). This <CODE>IP</CODE> can then
+be used as a subject in a <B>question clause</B> (<CODE>QCl</CODE>), which in turn is
+made into a <CODE>QS</CODE> and finally into a <CODE>Text</CODE>.
+</P>
+<PRE>
+ WhichIs kind quality =
+ mkText (mkQS (mkQCl (mkIP whichSg_IDet kind) (mkVP quality))) ;
+</PRE>
+<P>
+When interrogative phrases are used as <I>objects</I>, the resource library
+uses a category named <CODE>Slash</CODE> of
+objectless sentences. The name comes from the <B>slash categories</B> of the
+GPSG grammar formalism
+(Gazdar &amp; al. 1985). Slashes can be formed from subjects and two-place verbs,
+also with an intervening auxiliary verb.
+</P>
+<PRE>
+ WhichVerb kind guest verb =
+ mkText (mkQS (mkQCl (mkIP whichSg_IDet kind)
+ (mkSlash guest verb))) ;
+ WhichVerbWant kind guest verb =
+ mkText (mkQS (mkQCl (mkIP whichSg_IDet kind)
+ (mkSlash guest want_VV verb))) ;
+</PRE>
+<P>
+Finally, we form the <B>imperative</B> (<CODE>Imp</CODE>) of a transitive verb
+and its object. We make it into a <B>polite</B> form utterance, and finally
+into a <CODE>Text</CODE> with an exclamation mark.
+</P>
+<PRE>
+ Do verb item =
+ mkText
+ (mkPhr (mkUtt politeImpForm (mkImp verb item))) exclMarkPunct ;
+ DoPlease verb item =
+ mkText
+ (mkPhr (mkUtt politeImpForm (mkImp verb item)) please_Voc)
+ exclMarkPunct ;
+</PRE>
+<P>
+The rest of the concrete syntax is straightforward use of structural words,
+</P>
+<PRE>
+ I = mkNP i_Pron ;
+ You = mkNP youPol_Pron ;
+ We = mkNP we_Pron ;
+ GThis = mkNP this_QuantSg ;
+ GThat = mkNP that_QuantSg ;
+ GThese = mkNP these_QuantPl ;
+ GThose = mkNP those_QuantPl ;
+</PRE>
+<P>
+and of the food lexicon,
+</P>
+<PRE>
+ Eat = eat_V2 ;
+ Drink = drink_V2 ;
+ Pay = pay_V2 ;
+ Lady = lady_N ;
+ Gentleman = gentleman_N ;
+ }
+</PRE>
+<P>
+Notice that we have no reason to build an extension of <CODE>LexFoods</CODE>, but we just
+add words to the old one. Since <CODE>LexFoods</CODE> instances are resource modules,
+the superfluous definitions that they contain have no effect on the
+modules that just <CODE>open</CODE> them, and thus the smaller <CODE>Foods</CODE> grammars
+don't suffer from the additions we make.
+</P>
+<P>
+<B>Exercise</B>. Port the <CODE>ExtFoods</CODE> grammars to some new languages, building
+on the <CODE>Foods</CODE> implementations from previous sections, and using the functor
+defined in this section.
+</P>
+<A NAME="toc97"></A>
+<H2>Tenses</H2>
+<P>
+<a name="sectense"></a>
+</P>
+<P>
+When compiling the <CODE>ExtFoods</CODE> grammars, we have used the path
+</P>
+<PRE>
+ --# -path=.:../foods:present:prelude
+</PRE>
+<P>
+where the library subdirectory <CODE>present</CODE> refers to a restricted version
+of the resource that covers only the present tense of verbs and sentences.
+Having this version available is motivated by efficiency reasons: tenses
+produce in many languages a manifold of forms and combinations, which
+multiply the size of the grammar; at the same time, many applications,
+both technical ones and spoken dialogues, only need the present tense.
+</P>
+<P>
+But it is easy to change the grammars so that they admit the full set
+of tenses. It is enough to change the path to
+</P>
+<PRE>
+ --# -path=.:../foods:alltenses:prelude
+</PRE>
+<P>
+and recompile the grammars from source (flag <CODE>-src</CODE>); the libraries are
+not recompiled, because their sources cannot be found on the path list.
+Then it is possible to see all the tenses of
+phrases, by using the <CODE>-all</CODE> flag in linearization:
+</P>
+<PRE>
+ &gt; gr -cat=Phrase | l -all
+ This wine is delicious
+ Is this wine delicious
+ This wine isn't delicious
+ Isn't this wine delicious
+ This wine is not delicious
+ Is this wine not delicious
+ This wine has been delicious
+ Has this wine been delicious
+ This wine hasn't been delicious
+ Hasn't this wine been delicious
+ This wine has not been delicious
+ Has this wine not been delicious
+ This wine was delicious
+ Was this wine delicious
+ This wine wasn't delicious
+ Wasn't this wine delicious
+ This wine was not delicious
+ Was this wine not delicious
+ This wine had been delicious
+ Had this wine been delicious
+ This wine hadn't been delicious
+ Hadn't this wine been delicious
+ This wine had not been delicious
+ Had this wine not been delicious
+ This wine will be delicious
+ Will this wine be delicious
+ This wine won't be delicious
+ Won't this wine be delicious
+ This wine will not be delicious
+ Will this wine not be delicious
+ This wine will have been delicious
+ Will this wine have been delicious
+ This wine won't have been delicious
+ Won't this wine have been delicious
+ This wine will not have been delicious
+ Will this wine not have been delicious
+ This wine would be delicious
+ Would this wine be delicious
+ This wine wouldn't be delicious
+ Wouldn't this wine be delicious
+ This wine would not be delicious
+ Would this wine not be delicious
+ This wine would have been delicious
+ Would this wine have been delicious
+ This wine wouldn't have been delicious
+ Wouldn't this wine have been delicious
+ This wine would not have been delicious
+ Would this wine not have been delicious
+</PRE>
+<P>
+In addition to tenses, the linearization writes all parametric
+variations --- polarity and word order (direct vs. inverted) --- as
+well as the variation between contracted and full negation words.
+Of course, the list is even longer in languages that have more
+tenses and moods, e.g. the Romance languages.
+</P>
+<P>
+In the <CODE>ExtFoods</CODE> grammar, tenses never find their way to the
+top level of <CODE>Move</CODE>s. Therefore it is useless to carry around
+the clause and verb tenses given in the <CODE>alltenses</CODE> set of libraries.
+But with the library, it is easy to add tenses to <CODE>Move</CODE>s. For
+instance, one can add the rules
+</P>
+<PRE>
+ fun MAssertFut : Phrase -&gt; Move ; -- I will pay this wine
+ fun MAssertPastPerf : Phrase -&gt; Move ; -- I had paid that wine
+ lin MAssertFut p = mkText (mkS futureTense p) ;
+ lin MAssertPastPerf p = mkText (mkS pastTense anteriorAnt p) ;
+</PRE>
+<P>
+Comparison with <CODE>MAssert</CODE> above shows that the absence of the tense
+and anteriority features defaults to present simultaneous tenses.
+</P>
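+<P>
+As an illustration, the defaults can also be spelled out in full. The following
+sketch assumes that the resource synopsis provides the constants
+<CODE>presentTense</CODE>, <CODE>simultaneousAnt</CODE>, and <CODE>conditionalTense</CODE>
+alongside the ones used above; <CODE>MAssertExplicit</CODE> and <CODE>MDenyCond</CODE>
+are hypothetical names introduced only for this example.
+</P>
+<PRE>
+  -- intended to behave like MAssert, with every optional argument given
+  fun MAssertExplicit : Phrase -&gt; Move ;
+  lin MAssertExplicit p =
+    mkText (mkS presentTense simultaneousAnt positivePol p) ;
+
+  -- conditional and negated; only the anteriority is left to its default
+  fun MDenyCond : Phrase -&gt; Move ;
+  lin MDenyCond p = mkText (mkS conditionalTense negativePol p) ;
+</PRE>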
+<P>
+<B>Exercise</B>. Measure the size of the context-free grammar corresponding to
+some concrete syntax of <CODE>ExtFoods</CODE> with all tenses.
+You can do this by printing the grammar in the context-free format
+(<CODE>print_grammar -printer=cfg</CODE>) and counting the lines.
+</P>
+<A NAME="toc98"></A>
+<H2>Summary of GF language features</H2>
+<A NAME="toc99"></A>
+<H3>Interfaces and instances</H3>
+<P>
+An <B>interface module</B> (<CODE>interface</CODE> <I>I</I>) is like a <CODE>resource</CODE> module,
+the difference being that it does not need to give definitions in
+its <CODE>oper</CODE> and <CODE>param</CODE> judgements. Definitions are, however,
+allowed, and they may use constants that appear undefined in the
+module. For example, here is an interface for predication, which
+is parametrized on NP case and agreement features, and on the constituent
+order:
+</P>
+<PRE>
+ interface Predication = {
+ param
+ Case ;
+ Agreement ;
+ oper
+ subject : Case ;
+ object : Case ;
+      order : (verb,subj,obj : Str) -&gt; Str ;
+
+      NP : Type = {s : Case =&gt; Str ; a : Agreement} ;
+      TV : Type = {s : Agreement =&gt; Str} ;
+
+      -- pick the verb form by subject agreement and the NP forms by case
+      sentence : TV -&gt; NP -&gt; NP -&gt; {s : Str} = \verb,subj,obj -&gt; {
+        s = order (verb.s ! subj.a) (subj.s ! subject) (obj.s ! object)
+        } ;
+  }
+</PRE>
+<P>
+An <B>instance module</B> (<CODE>instance</CODE> <I>J</I> <CODE>of</CODE> <I>I</I>) is also like a
+<CODE>resource</CODE>, but it is compiled in union with the interface that it
+is an instance <CODE>of</CODE>. This means that the definitions given in the
+instance are type-checked with respect to the types given in the
+interface. Moreover, overwriting types or definitions given in the interface
+is not allowed. But it is legal for an instance to contain definitions
+not included in the corresponding interface. Here is an instance of
+<CODE>Predication</CODE>, suitable for languages like English.
+</P>
+<PRE>
+ instance PredicationSimpleSVO of Predication = {
+ param
+ Case = Nom | Acc | Gen ;
+ Agreement = Agr Number Person ;
+
+ -- two new types
+ Number = Sg | Pl ;
+ Person = P1 | P2 | P3 ;
+
+ oper
+ subject = Nom ;
+ object = Acc ;
+ order = \verb,subj,obj -&gt; subj ++ verb ++ obj ;
+
+ -- the rest of the definitions don't need repetition
+ }
+</PRE>
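+<P>
+To see what the interface buys us, a second instance can change the
+constituent order without touching <CODE>sentence</CODE>. The following is only
+a sketch: the module name <CODE>PredicationSimpleSOV</CODE> and its two-case
+system are assumptions made for this example and do not come from any library.
+</P>
+<PRE>
+  instance PredicationSimpleSOV of Predication = {
+    param
+      Case = Nom | Acc ;
+      Agreement = Agr Number Person ;
+      Number = Sg | Pl ;
+      Person = P1 | P2 | P3 ;
+
+    oper
+      subject = Nom ;
+      object = Acc ;
+      -- the only real difference from the SVO instance: the verb comes last
+      order = \verb,subj,obj -&gt; subj ++ obj ++ verb ;
+  }
+</PRE>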
+<P></P>
+<A NAME="toc100"></A>
+<H3>Grammar reuse</H3>
+<P>
+<a name="seclock"></a>
+</P>
+<P>
+Abstract syntax modules can be used like interfaces, and concrete syntaxes
+as their instances. The following translations then take place:
+</P>
+<PRE>
+ cat C ---&gt; oper C : Type
+
+ fun f : A ---&gt; oper f : A*
+
+ lincat C = T ---&gt; oper C : Type = T'
+
+ lin f = t ---&gt; oper f : A* = t'
+</PRE>
+<P>
+This translation is called <B>grammar reuse</B>. It uses a homomorphism
+from abstract types and terms to the concrete types and terms. For the
+sake of more type safety, the types are not exactly the same. Currently
+(GF 2.8), the type <I>T'</I> formed from the linearization type <I>T</I> of
+a category <I>C</I> is <I>T</I> extended with a dummy <B>lock field</B>. Thus
+</P>
+<PRE>
+ lincat C = T ---&gt; oper C = T ** {lock_C : {}}
+</PRE>
+<P>
+and the linearization terms are lifted correspondingly. The user of
+a GF library should never see any lock fields; when they appear in
+the compiler's warnings, they indicate that some library category is
+constructed improperly by a user program.
+</P>
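+<P>
+As a concrete illustration of the schema, consider a hypothetical category
+<CODE>Quality</CODE> whose linearization type is a plain string record.
+The second line is written by hand from the rule above; the exact form
+printed by the compiler may differ.
+</P>
+<PRE>
+  lincat Quality = {s : Str} ;
+
+  -- what a module that reuses the concrete syntax as a resource sees
+  oper Quality : Type = {s : Str ; lock_Quality : {}} ;
+</PRE>
+<P>
+The lock field is intended to keep <CODE>Quality</CODE> apart from other
+categories that also happen to be linearized as <CODE>{s : Str}</CODE>.
+</P>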
+<A NAME="toc101"></A>
+<H3>Functors</H3>
+<P>
+A <B>parametrized module</B>, also known as an <B>incomplete module</B> or a
+<B>functor</B>, is any module that <CODE>open</CODE>s an <CODE>interface</CODE> (or
+an <CODE>abstract</CODE>). Several interfaces may be opened by one
+functor. The module header must be prefixed by the word <CODE>incomplete</CODE>.
+Here is a typical example, using the resource <CODE>Syntax</CODE> and
+a domain specific lexicon:
+</P>
+<PRE>
+ incomplete concrete DomainI of Domain = open Syntax, Lex in {...} ;
+</PRE>
+<P>
+A <B>functor instantiation</B> is a module that inherits a functor and
+provides an instance to each of its open interfaces. Here is an example:
+</P>
+<PRE>
+ concrete DomainSwe of Domain = DomainI with
+ (Syntax = SyntaxSwe),
+ (Lex = LexSwe) ;
+</PRE>
+<P></P>
+<A NAME="toc102"></A>
+<H3>Restricted inheritance</H3>
+<P>
+A module of any type can use <B>restricted inheritance</B>, which either
+includes only the listed constants (<CODE>A[f,g]</CODE>) or excludes them (<CODE>B-[k]</CODE>):
+</P>
+<PRE>
+ module M = A[f,g], B-[k] ** ...
+</PRE>
+<P>
+A concrete syntax given to an abstract syntax that uses restricted inheritance
+must make the corresponding restrictions. In addition, the concrete syntax can
+make its own restrictions in order to redefine inherited linearization types and
+rules.
+</P>
+<P>
+Overriding old definitions without explicit restrictions is not allowed.
+</P>
+<A NAME="toc103"></A>
+<H1>Refining semantics in abstract syntax</H1>
+<P>
+<a name="chapsix"></a>
+</P>
+<P>
+While the concrete syntax constructs of GF have already been
+covered, there is much more that can be done in the abstract
+syntax. The techniques of <B>dependent types</B> and
+<B>higher order abstract syntax</B> are introduced in this chapter,
+which thereby concludes the presentation of the GF language.
+</P>
+<P>
+Many of the examples in this chapter are somewhat less close to
+applications than the ones shown before. Moreover, the tools for
+embedded grammars in <a href="#chapeight">the eighth chapter</a> do not yet fully support dependent
+types and higher order abstract syntax.
+</P>
+<A NAME="toc104"></A>
+<H2>GF as a logical framework</H2>
+<P>
+In this chapter, we will show how
+to encode advanced semantic concepts in an abstract syntax.
+We use concepts inherited from <B>type theory</B>. Type theory
+is the basis of many systems known as <B>logical frameworks</B>, which are
+used for representing mathematical theorems and their proofs on a computer.
+In fact, GF has a logical framework as its proper part:
+this part is the abstract syntax.
+</P>
+<P>
+In a logical framework, the formalization of a mathematical theory
+is a set of type and function declarations. The following is an example
+of such a theory, represented as an <CODE>abstract</CODE> module in GF.
+</P>
+<PRE>
+ abstract Arithm = {
+ cat
+ Prop ; -- proposition
+ Nat ; -- natural number
+ fun
+ Zero : Nat ; -- 0
+ Succ : Nat -&gt; Nat ; -- the successor of x
+ Even : Nat -&gt; Prop ; -- x is even
+ And : Prop -&gt; Prop -&gt; Prop ; -- A and B
+ }
+</PRE>
+<P>
+This example does not show any new type-theoretical constructs yet, but
+it could nevertheless be used as a part of a proof system for arithmetic.
+</P>
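+<P>
+For instance, the following trees are well-typed in <CODE>Arithm</CODE>. The
+comments give an informal reading and are not the output of any GF command.
+</P>
+<PRE>
+  Succ (Succ Zero)                      -- Nat: the number 2
+  Even (Succ (Succ Zero))               -- Prop: 2 is even
+  And (Even Zero) (Even (Succ Zero))    -- Prop: 0 is even and 1 is even
+</PRE>
+<P>
+Note that the last tree is well-formed although the proposition it expresses
+is false: the types control which expressions make sense, not which ones are true.
+</P>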
+<P>
+<B>Exercise</B>. Give a concrete syntax of <CODE>Arithm</CODE>, preferably
+by using the resource library.
+</P>
+<A NAME="toc105"></A>
+<H2>Dependent types</H2>
+<P>
+<a name="secsmarthouse"></a>
+</P>
+<P>
+<B>Dependent types</B> are a characteristic feature of GF,
+inherited from the <B>constructive type theory</B> of Martin-Löf and
+distinguishing GF from most other grammar formalisms and
+functional programming languages.
+</P>
+<P>
+Dependent types can be used for stating stronger
+<B>conditions of well-formedness</B> than ordinary types.
+A simple example is a "smart house" system, which
+defines voice commands for household appliances. This example
+is borrowed from the
+Regulus Book
+(Rayner &amp; al. 2006).
+</P>
+<P>
+One who enters a smart house can use a spoken <CODE>Command</CODE> to dim lights, switch
+on the fan, etc. For <CODE>Device</CODE>s of each <CODE>Kind</CODE>, there is a set of
+<CODE>Action</CODE>s that can be performed on them; thus one can dim the lights but
+not the fan, for example. These dependencies can be expressed
+by making the type <CODE>Action</CODE> dependent on <CODE>Kind</CODE>. We express these
+dependencies in <CODE>cat</CODE> declarations by attaching argument types to
+categories:
+</P>
+<PRE>
+ cat
+ Command ;
+ Kind ;
+ Device Kind ; -- argument type Kind
+ Action Kind ;
+</PRE>
+<P>
+The crucial use of the dependencies is made in the rule for forming commands:
+</P>
+<PRE>
+ fun CAction : (k : Kind) -&gt; Action k -&gt; Device k -&gt; Command ;
+</PRE>
+<P>
+In other words: an action and a device can be combined into a command only
+if they are of the same <CODE>Kind</CODE> <CODE>k</CODE>. If we have the functions
+</P>
+<PRE>
+ DKindOne : (k : Kind) -&gt; Device k ; -- the light
+
+ light, fan : Kind ;
+ dim : Action light ;
+</PRE>
+<P>
+we can form the syntax tree
+</P>
+<PRE>
+ CAction light dim (DKindOne light)
+</PRE>
+<P>
+but we cannot form the trees
+</P>
+<PRE>
+ CAction light dim (DKindOne fan)
+ CAction fan dim (DKindOne light)
+ CAction fan dim (DKindOne fan)
+</PRE>
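+<P>
+Spelling out the type checking shows where each of these trees fails.
+The annotations below are written by hand from the declarations above.
+</P>
+<PRE>
+  CAction light dim (DKindOne fan)
+    -- expected Device light, but DKindOne fan : Device fan
+
+  CAction fan dim (DKindOne light)
+    -- expected Action fan and Device fan,
+    -- but dim : Action light and DKindOne light : Device light
+
+  CAction fan dim (DKindOne fan)
+    -- expected Action fan, but dim : Action light
+</PRE>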
+<P>
+Linearization rules are written as usual: the concrete syntax does not
+know if a category is a dependent type. In English, one could write as follows:
+</P>
+<PRE>
+ lincat Action = {s : Str} ;
+ lin CAction _ act dev = {s = act.s ++ dev.s} ;
+</PRE>
+<P>
+Notice that the argument for <CODE>Kind</CODE> does not appear in the linearization;
+therefore it is good practice to make this clear by
+using a wild card for it, rather than a real
+variable.
+As we will show,
+the type checker can reconstruct the kind from the <CODE>dev</CODE> argument.
+</P>
+<P>
+Parsing with dependent types is performed in two phases:
+</P>
+<OL>
+<LI>context-free parsing
+<LI>filtering through type checker
+</OL>
+
+<P>
+If you just parse in the usual way, you don't enter the second phase, and
+the <CODE>Kind</CODE> argument is not found:
+</P>
+<PRE>
+ &gt; parse "dim the light"
+ CAction ? dim (DKindOne light)
+</PRE>
+<P>
+Moreover, type-incorrect commands are not rejected:
+</P>
+<PRE>
+ &gt; parse "dim the fan"
+ CAction ? dim (DKindOne fan)
+</PRE>
+<P>
+The question mark <CODE>?</CODE> is a <B>metavariable</B>, and is returned by the parser
+for any subtree that is suppressed by a linearization rule.
+These are exactly the same kind of metavariables as were used <a href="#secediting">here</a>
+to mark incomplete parts of trees in the syntax editor.
+</P>
+<P>
+To get rid of metavariables, we must feed the parse result into the
+second phase of <B>solving</B> them. The <CODE>solve</CODE> process uses the dependent
+type checker to restore the values of the metavariables. It is invoked by
+the command <CODE>put_tree = pt</CODE> with the flag <CODE>-transform=solve</CODE>:
+</P>
+<PRE>
+ &gt; parse "dim the light" | put_tree -transform=solve
+ CAction light dim (DKindOne light)
+</PRE>
+<P>
+The <CODE>solve</CODE> process may fail, in which case no tree is returned:
+</P>
+<PRE>
+ &gt; parse "dim the fan" | put_tree -transform=solve
+ no tree found
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Write an abstract syntax module with the above contents
+and an appropriate English concrete syntax. Try to parse the commands
+<I>dim the light</I> and <I>dim the fan</I>, with and without <CODE>solve</CODE> filtering.
+</P>
+<P>
+<B>Exercise</B>. Perform random and exhaustive generation, with and without
+<CODE>solve</CODE> filtering.
+</P>
+<P>
+<B>Exercise</B>. Add some device kinds and actions to the grammar.
+</P>
+<A NAME="toc106"></A>
+<H2>Polymorphism</H2>
+<P>
+<a name="secpolymorphic"></a>
+</P>
+<P>
+Sometimes an action can be performed on all kinds of devices. It would be
+possible to introduce separate <CODE>fun</CODE> constants for each kind-action pair,
+but this would be tedious. Instead, one can use <B>polymorphic</B> actions,
+i.e. actions that take a <CODE>Kind</CODE> as an argument and produce an <CODE>Action</CODE>
+for that <CODE>Kind</CODE>:
+</P>
+<PRE>
+ fun switchOn, switchOff : (k : Kind) -&gt; Action k ;
+</PRE>
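+<P>
+Combined with the declarations of the previous section, a polymorphic action
+is first applied to a kind and then combined with a device of that kind.
+The trees below are a sketch of what this looks like:
+</P>
+<PRE>
+  CAction light (switchOn light) (DKindOne light)
+  CAction fan (switchOff fan) (DKindOne fan)
+</PRE>
+<P>
+A mismatch such as <CODE>CAction fan (switchOn light) (DKindOne fan)</CODE> is still
+ill-typed, since <CODE>switchOn light</CODE> has the type <CODE>Action light</CODE>.
+</P>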
+<P>
+Functions that are not polymorphic are <B>monomorphic</B>. However, the
+dichotomy into monomorphism and full polymorphism is not always sufficient
+for good semantic modelling: very typically, some actions are defined
+for a proper subset of devices, but not just one. For instance, both doors and
+windows can be opened, whereas lights cannot.
+We will return to this problem by introducing the
+concept of <B>restricted polymorphism</B> later,
+after a section on proof objects.
+</P>
+<P>
+<B>Exercise</B>. The grammar <CODE>ExtFoods</CODE> <a href="#secextended">here</a> permits the
+formation of phrases such as <I>we drink this fish</I> and <I>we eat this wine</I>.
+A way to prevent them is to distinguish between eatable and drinkable food items.
+Another, related problem is that there is some duplicated code
+due to a category distinction between guests and food items, for instance,
+two constructors for the determiner <I>this</I>. This problem can also
+be solved by dependent types. Rewrite the abstract syntax in <CODE>Foods</CODE> and
+<CODE>ExtFoods</CODE> by using such a type system, and also update the concrete syntaxes.
+If you do this right, you only have to change the functor modules
+<CODE>FoodsI</CODE> and <CODE>ExtFoodsI</CODE> in the concrete syntax.
+</P>
+<A NAME="toc107"></A>
+<H3>Digression: dependent types in concrete syntax</H3>
+<P>
+The <B>functional fragment</B> of GF
+terms and types comprises function types, applications, lambda
+abstracts, constants, and variables. This fragment is the same in
+abstract and concrete syntax. In particular,
+dependent types are also available in concrete syntax.
+We have not made use of them yet,
+but we will now look at one example of how they
+can be used.
+</P>
+<P>
+Those readers who are familiar with functional programming languages
+like ML and Haskell, may already have missed <B>polymorphic</B>
+functions. For instance, Haskell programmers have access to
+the functions
+</P>
+<PRE>
+ const :: a -&gt; b -&gt; a
+ const c _ = c
+
+ flip :: (a -&gt; b -&gt; c) -&gt; b -&gt; a -&gt; c
+ flip f y x = f x y
+</PRE>
+<P>
+which can be used for any given types <CODE>a</CODE>,<CODE>b</CODE>, and <CODE>c</CODE>.
+</P>
+<P>
+The GF counterparts of polymorphic functions are <B>monomorphic</B>
+functions with explicit <B>type variables</B> --- a technique that we already
+used in abstract syntax for modelling actions that can be performed
+on all kinds of devices. Thus the above definitions can be written
+</P>
+<PRE>
+  oper const : (a,b : Type) -&gt; a -&gt; b -&gt; a =
+    \_,_,c,_ -&gt; c ;
+
+  oper flip : (a,b,c : Type) -&gt; (a -&gt; b -&gt; c) -&gt; b -&gt; a -&gt; c =
+    \_,_,_,f,x,y -&gt; f y x ;
+</PRE>
+<P>
+When the operations are used, the type checker requires
+them to be equipped with all their arguments, including the type arguments; this may be a nuisance
+for a Haskell or ML programmer. Dependent types in concrete syntax have not been used very much,
+except in the <CODE>Coordination</CODE> module of the resource library.
+</P>
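+<P>
+To make the cost of explicit type arguments visible, here is a small sketch of
+a resource module that defines and uses the two operations. The module name
+<CODE>Combinators</CODE> and the operations <CODE>glue</CODE>, <CODE>wine</CODE>, and
+<CODE>redWine</CODE> are hypothetical and serve only as illustration.
+</P>
+<PRE>
+  resource Combinators = {
+    oper
+      const : (a,b : Type) -&gt; a -&gt; b -&gt; a =
+        \_,_,c,_ -&gt; c ;
+      flip : (a,b,c : Type) -&gt; (a -&gt; b -&gt; c) -&gt; b -&gt; a -&gt; c =
+        \_,_,_,f,x,y -&gt; f y x ;
+
+      -- an ordinary two-place operation to experiment with
+      glue : Str -&gt; Str -&gt; Str = \x,y -&gt; x ++ y ;
+
+      -- the type arguments have to be written out at every use
+      wine : Str = const Str Str "wine" "beer" ;
+      redWine : Str = flip Str Str Str glue "wine" "red" ;
+  }
+</PRE>
+<P>
+By the definitions, <CODE>wine</CODE> reduces to <CODE>"wine"</CODE>, and
+<CODE>redWine</CODE> reduces to <CODE>glue "red" "wine"</CODE>, that is,
+<CODE>"red" ++ "wine"</CODE>.
+</P>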
+<A NAME="toc108"></A>
+<H2>Proof objects</H2>
+<P>
+Perhaps the most well-known idea in constructive type theory is
+the <B>Curry-Howard isomorphism</B>, also known as the
+<B>propositions as types principle</B>. Its earliest formulations
+were attempts to give semantics to the logical systems of
+propositional and predicate calculus. In this section, we will consider
+a more elementary example, showing how the notion of proof is useful
+outside mathematics, as well.
+</P>
+<P>
+We use the already shown category of unary (also known as Peano-style)
+natural numbers:
+</P>
+<PRE>
+ cat Nat ;
+ fun Zero : Nat ;
+ fun Succ : Nat -&gt; Nat ;
+</PRE>
+<P>
+The <B>successor function</B> <CODE>Succ</CODE> generates an infinite
+sequence of natural numbers, beginning from <CODE>Zero</CODE>.
+</P>
+<P>
+We then define what it means for a number <I>x</I> to be <I>less than</I>
+a number <I>y</I>. Our definition is based on two axioms:
+</P>
+<UL>
+<LI><CODE>Zero</CODE> is less than <CODE>Succ</CODE> <I>y</I> for any <I>y</I>.
+<LI>If <I>x</I> is less than <I>y</I>, then <CODE>Succ</CODE> <I>x</I> is less than <CODE>Succ</CODE> <I>y</I>.
+</UL>
+
+<P>
+The most straightforward way of expressing these axioms in type theory
+is with a dependent type <CODE>Less</CODE> <I>x y</I>, and two functions constructing
+its objects:
+</P>
+<PRE>
+ cat Less Nat Nat ;
+ fun lessZ : (y : Nat) -&gt; Less Zero (Succ y) ;
+ fun lessS : (x,y : Nat) -&gt; Less x y -&gt; Less (Succ x) (Succ y) ;
+</PRE>
+<P>
+Objects formed by <CODE>lessZ</CODE> and <CODE>lessS</CODE> are
+called <B>proof objects</B>: they establish the truth of certain
+mathematical propositions.
+For instance, the fact that 2 is less than
+4 has the proof object
+</P>
+<PRE>
+ lessS (Succ Zero) (Succ (Succ (Succ Zero)))
+ (lessS Zero (Succ (Succ Zero)) (lessZ (Succ Zero)))
+</PRE>
+<P>
+whose type is
+</P>
+<PRE>
+ Less (Succ (Succ Zero)) (Succ (Succ (Succ (Succ Zero))))
+</PRE>
+<P>
+which is the formalization of the proposition that 2 is less than 4.
+</P>
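+<P>
+The proof object can be read from the inside out: <CODE>lessZ</CODE> provides the
+base case, and each application of <CODE>lessS</CODE> adds one to both sides.
+The type annotations below are worked out by hand from the declarations above.
+</P>
+<PRE>
+  lessZ (Succ Zero)
+    : Less Zero (Succ (Succ Zero))                  -- 0 is less than 2
+
+  lessS Zero (Succ (Succ Zero)) (lessZ (Succ Zero))
+    : Less (Succ Zero) (Succ (Succ (Succ Zero)))    -- 1 is less than 3
+</PRE>
+<P>
+One more application of <CODE>lessS</CODE> to the second object gives the proof
+that 2 is less than 4.
+</P>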
+<P>
+GF grammars can be used to provide a <B>semantic control</B> of
+well-formedness of expressions. We have already seen examples of this:
+the grammar of well-formed actions on household devices. By introducing proof objects
+we have now added an even more powerful technique of expressing semantic conditions.
+</P>
+<P>
+A simple example of the use of proof objects is the definition of
+well-formed <I>time spans</I>: a time span is expected to be from an earlier to
+a later time:
+</P>
+<PRE>
+ from 3 to 8
+</PRE>
+<P>
+is thus well-formed, whereas
+</P>
+<PRE>
+ from 8 to 3
+</PRE>
+<P>
+is not. The following rules for spans impose this condition
+by using the <CODE>Less</CODE> predicate:
+</P>
+<PRE>
+ cat Span ;
+ fun span : (m,n : Nat) -&gt; Less m n -&gt; Span ;
+</PRE>
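+<P>
+As a small worked example, the span <I>from 1 to 2</I> corresponds to a tree
+that carries a proof that 1 is less than 2:
+</P>
+<PRE>
+  span (Succ Zero) (Succ (Succ Zero))
+       (lessS Zero (Succ Zero) (lessZ Zero))
+</PRE>
+<P>
+There is no corresponding tree for <I>from 2 to 1</I>, because neither
+<CODE>lessZ</CODE> nor <CODE>lessS</CODE> can build an object of the type
+<CODE>Less (Succ (Succ Zero)) (Succ Zero)</CODE>.
+</P>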
+<P></P>
+<P>
+<B>Exercise</B>. Write an abstract and concrete syntax with the
+concepts of this section, and experiment with it in GF.
+</P>
+<P>
+<B>Exercise</B>. Define the notions of "even" and "odd" in terms
+of proof objects. <B>Hint</B>. You need one function for proving
+that 0 is even, and two other functions for propagating the
+properties.
+</P>
+<A NAME="toc109"></A>
+<H3>Proof-carrying documents</H3>
+<P>
+Another possible application of proof objects is <B>proof-carrying documents</B>:
+to be semantically well-formed, the abstract syntax of a document must contain a proof
+of some property, although the proof is not shown in the concrete document.
+Think, for instance, of small documents describing flight connections:
+</P>
+<P>
+<I>To fly from Gothenburg to Prague, first take LH3043 to Frankfurt, then OK0537 to Prague.</I>
+</P>
+<P>
+The well-formedness of this text is partly expressible by dependent typing:
+</P>
+<PRE>
+ cat
+ City ;
+ Flight City City ;
+ fun
+ Gothenburg, Frankfurt, Prague : City ;
+ LH3043 : Flight Gothenburg Frankfurt ;
+ OK0537 : Flight Frankfurt Prague ;
+</PRE>
+<P>
+This rules out texts saying <I>take OK0537 from Gothenburg to Prague</I>.
+However, there is a
+further condition saying that it must be possible to
+change from LH3043 to OK0537 in Frankfurt.
+This can be modelled as a proof object of a suitable type,
+which is required by the constructor
+that connects flights.
+</P>
+<PRE>
+ cat
+ IsPossible (x,y,z : City)(Flight x y)(Flight y z) ;
+ fun
+ Connect : (x,y,z : City) -&gt;
+ (u : Flight x y) -&gt; (v : Flight y z) -&gt;
+ IsPossible x y z u v -&gt; Flight x z ;
+</PRE>
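+<P>
+To build the connection in the example text, some constant has to state that
+the change in Frankfurt is possible. The name <CODE>LH3043_OK0537</CODE> below is
+a hypothetical choice made for this sketch.
+</P>
+<PRE>
+  fun
+    LH3043_OK0537 : IsPossible Gothenburg Frankfurt Prague LH3043 OK0537 ;
+
+  -- the connecting flight, a tree of type Flight Gothenburg Prague
+  Connect Gothenburg Frankfurt Prague LH3043 OK0537 LH3043_OK0537
+</PRE>
+<P>
+Without such a proof object the last argument of <CODE>Connect</CODE> cannot
+be supplied, and the document has no complete abstract syntax tree.
+</P>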
+<P></P>
+<A NAME="toc110"></A>
+<H2>Restricted polymorphism</H2>
+<P>
+In the first version of the smart house grammar <CODE>Smart</CODE>,
+every Action was either
+</P>
+<UL>
+<LI><B>monomorphic</B>: defined for one Kind
+<LI><B>polymorphic</B>: defined for all Kinds
+</UL>
+
+<P>
+To make this scale up for new Kinds, we can refine this to
+<B>restricted polymorphism</B>: an Action is defined for all Kinds of a certain <B>class</B>.
+</P>
+<P>
+The notion of class can be expressed in abstract syntax
+by using the Curry-Howard isomorphism as follows:
+</P>
+<UL>
+<LI>a class is a <B>predicate</B> of Kinds --- i.e. a type depending on Kinds
+<LI>a Kind is in a class if there is a proof object of this type
+</UL>
+
+<P>
+Here is an example with switching and dimming. The classes are called
+<CODE>Switchable</CODE> and <CODE>Dimmable</CODE>.
+</P>
+<PRE>
+ cat
+ Switchable Kind ;
+ Dimmable Kind ;
+ fun
+ switchable_light : Switchable light ;
+ switchable_fan : Switchable fan ;
+ dimmable_light : Dimmable light ;
+
+ switchOn : (k : Kind) -&gt; Switchable k -&gt; Action k ;
+ dim : (k : Kind) -&gt; Dimmable k -&gt; Action k ;
+</PRE>
+<P>
+One advantage of this formalization is that classes for new
+actions can be added incrementally.
+</P>
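+<P>
+With these declarations, a command tree has to carry the proof that the
+kind belongs to the class. The following trees are a sketch built from the
+rules above and the earlier <CODE>CAction</CODE> and <CODE>DKindOne</CODE>:
+</P>
+<PRE>
+  CAction light (dim light dimmable_light) (DKindOne light)
+  CAction fan (switchOn fan switchable_fan) (DKindOne fan)
+</PRE>
+<P>
+No tree exists for <I>dim the fan</I>, since no constant of type
+<CODE>Dimmable fan</CODE> has been declared. In the concrete syntax, the proof
+argument can be suppressed in the same way as the kind; assuming the
+linearization type <CODE>{s : Str}</CODE> for <CODE>Action</CODE>, one could write
+</P>
+<PRE>
+  lin dim _ _ = {s = "dim"} ;
+</PRE>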
+<P>
+<B>Exercise</B>. Write a new version of the <CODE>Smart</CODE> grammar with
+classes, and test it in GF.
+</P>
+<P>
+<B>Exercise</B>. Add some actions, kinds, and classes to the grammar.
+Try to port the grammar to a new language. You will probably find
+out that restricted polymorphism works differently in different languages.
+For instance, in Finnish not only doors but also TVs and radios
+can be "opened", which means switching them on.
+</P>
+<A NAME="toc111"></A>
+<H2>Variable bindings</H2>
+<P>
+<a name="secbinding"></a>
+</P>
+<P>
+Mathematical notation and programming languages have
+expressions that <B>bind</B> variables. For instance,
+a universally quantified proposition
+</P>
+<PRE>
+ (All x)B(x)
+</PRE>
+<P>
+consists of the <B>binding</B> <CODE>(All x)</CODE> of the variable <CODE>x</CODE>,
+and the <B>body</B> <CODE>B(x)</CODE>, where the variable <CODE>x</CODE> can have
+<B>bound occurrences</B>.
+</P>
+<P>
+Variable bindings appear in informal mathematical language as well, for
+instance,
+</P>
+<PRE>
+ for all x, x is equal to x
+
+ the function that for any numbers x and y returns the maximum of x+y
+ and x*y
+
+ Let x be a natural number. Assume that x is even. Then x + 3 is odd.
+</PRE>
+<P>
+In type theory, variable-binding expression forms can be formalized
+as functions that take functions as arguments. The universal
+quantifier is defined
+</P>
+<PRE>
+ fun All : (Ind -&gt; Prop) -&gt; Prop
+</PRE>
+<P>
+where <CODE>Ind</CODE> is the type of individuals and <CODE>Prop</CODE>,
+the type of propositions. If we have, for instance, the equality predicate
+</P>
+<PRE>
+ fun Eq : Ind -&gt; Ind -&gt; Prop
+</PRE>
+<P>
+we may form the tree
+</P>
+<PRE>
+ All (\x -&gt; Eq x x)
+</PRE>
+<P>
+which corresponds to the ordinary notation
+</P>
+<PRE>
+ (All x)(x = x).
+</PRE>
+<P>
+An abstract syntax where trees have functions as arguments, as in
+the two examples above, has turned out to be precisely the right
+thing for the semantics and computer implementation of
+variable-binding expressions. The advantage lies in the fact that
+only one variable-binding expression form is needed, the lambda abstract
+<CODE>\x -&gt; b</CODE>, and all other bindings can be reduced to it.
+This makes it easier to implement mathematical theories and reason
+about them, since variable binding is tricky to implement and
+to reason about. The idea of using functions as arguments of
+syntactic constructors is known as <B>higher-order abstract syntax</B>.
+</P>
+<P>
+The question now arises: how to define linearization rules
+for variable-binding expressions?
+Let us first consider universal quantification,
+</P>
+<PRE>
+ fun All : (Ind -&gt; Prop) -&gt; Prop
+</PRE>
+<P>
+In GF, we write
+</P>
+<PRE>
+ lin All B = {s = "(" ++ "All" ++ B.$0 ++ ")" ++ B.s}
+</PRE>
+<P>
+to obtain the form shown above.
+This linearization rule brings in a new GF concept --- the <CODE>$0</CODE>
+field of <CODE>B</CODE> containing a bound variable symbol.
+The general rule is that, if an argument type of a function is
+itself a function type <CODE>A -&gt; C</CODE>, the linearization type of
+this argument is the linearization type of <CODE>C</CODE>
+together with a new field <CODE>$0 : Str</CODE>. In the linearization rule
+for <CODE>All</CODE>, the argument <CODE>B</CODE> thus has the linearization
+type
+</P>
+<PRE>
+ {$0 : Str ; s : Str},
+</PRE>
+<P>
+since the linearization type of <CODE>Prop</CODE> is
+</P>
+<PRE>
+ {s : Str}
+</PRE>
+<P>
+In other words, the linearization of a function
+consists of a linearization of the body together with a
+field for a linearization of the bound variable.
+Those familiar with type theory or lambda calculus
+should notice that GF requires trees to be in
+<B>eta-expanded</B> form in order for this to make sense:
+for any function of type
+</P>
+<PRE>
+ A -&gt; B
+</PRE>
+<P>
+an eta-expanded syntax tree has the form
+</P>
+<PRE>
+ \x -&gt; b
+</PRE>
+<P>
+where <CODE>b : B</CODE> under the assumption <CODE>x : A</CODE>.
+It is in this form that an expression can be analysed
+as having a bound variable and a body, which can be put into
+a linearization record.
+</P>
+<P>
+Given the linearization rule
+</P>
+<PRE>
+ lin Eq a b = {s = "(" ++ a.s ++ "=" ++ b.s ++ ")"}
+</PRE>
+<P>
+the linearization of
+</P>
+<PRE>
+ \x -&gt; Eq x x
+</PRE>
+<P>
+is the record
+</P>
+<PRE>
+ {$0 = "x" ; s = ["( x = x )"]}
+</PRE>
+<P>
+Thus we can compute the linearization of the formula,
+</P>
+<PRE>
+ All (\x -&gt; Eq x x) --&gt; {s = "( All x ) ( x = x )"}.
+</PRE>
+<P>
+But how did we get the linearization of the variable <CODE>x</CODE>
+into the string <CODE>"x"</CODE>? GF grammars have no rules for
+this: it is just hard-wired in GF that variable symbols are
+linearized into the same strings that represent them in
+the print-out of the abstract syntax.
+</P>
+<P>
+To be able to <I>parse</I> variable symbols, however, GF needs to know what
+to look for (instead of, e.g., trying to parse <I>any</I>
+string as a variable). Which strings are parsed as variable symbols
+is defined in the lexical analysis part of GF parsing:
+</P>
+<PRE>
+ &gt; p -cat=Prop -lexer=codevars "(All x)(x = x)"
+ All (\x -&gt; Eq x x)
+</PRE>
+<P>
+(see more details on lexers <a href="#seclexing">here</a>). If several variables are bound in the
+same argument, the labels are <CODE>$0, $1, $2</CODE>, etc.
+</P>
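+<P>
+As a sketch of how several bound variables are reached, consider a
+hypothetical quantifier <CODE>AllPair</CODE> (not part of the grammar above)
+that binds two variables in one argument. Assuming the field labelling just
+described, its rules might read:
+</P>
+<PRE>
+    fun AllPair : (Ind -&gt; Ind -&gt; Prop) -&gt; Prop ;
+
+    -- B is expected to have the linearization type
+    -- {$0 : Str ; $1 : Str ; s : Str}
+    lin AllPair B =
+      {s = "(" ++ "All" ++ B.$0 ++ "," ++ B.$1 ++ ")" ++ B.s} ;
+</PRE>
+<P>
+With the rule for <CODE>Eq</CODE> given above, the tree
+<CODE>AllPair (\x,y -&gt; Eq x y)</CODE> should then linearize to
+<CODE>( All x , y ) ( x = y )</CODE>.
+</P>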
+<P>
+<B>Exercise</B>. Write an abstract syntax of the whole
+<B>predicate calculus</B>, with the
+<B>connectives</B> "and", "or", "implies", and "not", and the
+<B>quantifiers</B> "exists" and "for all". Use higher-order functions
+to guarantee that unbound variables do not occur.
+</P>
+<P>
+<B>Exercise</B>. Write a concrete syntax for your favourite
+notation of predicate calculus. Use LaTeX as target language
+if you want nice output. You can also try producing boolean
+expressions of some programming language. Use as many parentheses as you need to
+guarantee non-ambiguity.
+</P>
+<A NAME="toc112"></A>
+<H2>Semantic definitions</H2>
+<P>
+<a name="secdefdef"></a>
+</P>
+<P>
+Just like any functional programming language, abstract syntax in
+GF has declarations of functions, telling what the type of a function is.
+But we have not yet shown how to <B>compute</B>
+these functions: all we can do is provide them with arguments
+and linearize the resulting terms.
+Since our main interest is the well-formedness of expressions,
+this has not yet bothered
+us very much. As we will see, however, computation does play a role
+even in the well-formedness of expressions when dependent types are
+present.
+</P>
+<P>
+GF has a form of judgement for <B>semantic definitions</B>,
+marked by the key word <CODE>def</CODE>. At its simplest, it is just
+the definition of one constant, e.g.
+</P>
+<PRE>
+ fun one : Nat ;
+ def one = Succ Zero ;
+</PRE>
+<P>
+Notice that a <CODE>def</CODE> definition can only be given to names declared by
+<CODE>fun</CODE> judgements in the same module; it is not possible to define
+an inherited name.
+</P>
+<P>
+We can also define a function with arguments,
+</P>
+<PRE>
+ fun twice : Nat -&gt; Nat ;
+ def twice x = plus x x ;
+</PRE>
+<P>
+which is still a special case of the most general notion of
+definition, that of a group of <B>pattern equations</B>:
+</P>
+<PRE>
+ fun plus : Nat -&gt; Nat -&gt; Nat ;
+ def
+ plus x Zero = x ;
+ plus x (Succ y) = Succ (plus x y) ;
+</PRE>
+<P>
+To compute a term is, as in functional programming languages,
+simply to follow a chain of reductions until no definition
+can be applied. For instance, we compute
+</P>
+<PRE>
+ plus one one --&gt;
+ plus (Succ Zero) (Succ Zero) --&gt;
+ Succ (plus (Succ Zero) Zero) --&gt;
+ Succ (Succ Zero)
+</PRE>
+<P>
+Computation in GF is performed with the <CODE>pt</CODE> command and the
+<CODE>compute</CODE> transformation, e.g.
+</P>
+<PRE>
+ &gt; p -tr "1 + 1" | pt -transform=compute -tr | l
+ plus one one
+ Succ (Succ Zero)
+ s(s(0))
+</PRE>
+<P></P>
+<P>
+The <CODE>def</CODE> definitions of a grammar induce a notion of
+<B>definitional equality</B> among trees: two trees are
+definitionally equal if they compute into the same tree.
+Thus, trivially, all trees in a chain of computation
+(such as the one above) are definitionally equal to each other.
+In general, there can be infinitely many definitionally equal trees.
+</P>
+<P>
+An important property of definitional equality is that it is
+<B>extensional</B>, i.e. has to do with the sameness of semantic value.
+Linearization, on the other hand, is an <B>intensional</B> operation,
+i.e. has to do with the sameness of expression. This means that
+<CODE>def</CODE> definitions are <I>not</I> evaluated as linearization steps.
+Intensionality is a crucial property of linearization, since we want
+to use it for things like tracing a chain of evaluation.
+For instance, each of the steps of the computation above
+has a different linearization into standard arithmetic notation:
+</P>
+<PRE>
+ 1 + 1
+ s(0) + s(0)
+ s(s(0) + 0)
+ s(s(0))
+</PRE>
+<P>
+In most programming languages, the operations that can be performed on
+expressions are extensional, i.e. give equal values to equal arguments.
+But GF has both extensional and intensional operations.
+Type checking is extensional:
+in the type theory with dependent types, types may depend on terms,
+and types depending on definitionally equal terms are
+equal types. For instance,
+</P>
+<PRE>
+ Less Zero one
+ Less Zero (Succ Zero)
+</PRE>
+<P>
+are equal types. Hence, any tree that type checks as a proof that
+0 is less than 1 also type checks as a proof that 0 is less than the successor of 0.
+(Recall, in this connection, that the
+arguments a category depends on never play any role
+in the linearization of trees of that category,
+nor in the definition of the linearization type.)
+</P>
+<P>
+When pattern matching is performed with <CODE>def</CODE> equations, it is
+crucial to distinguish between <B>constructors</B> and other functions
+(cf. <a href="#secmatching">here</a> on pattern matching in concrete syntax).
+GF has a judgement form <CODE>data</CODE> to tell that a category has
+certain functions as constructors:
+</P>
+<PRE>
+ data Nat = Succ | Zero ;
+</PRE>
+<P>
+Unlike in Haskell and ML, new constructors can be added to
+a type with new <CODE>data</CODE> judgements. The type signatures of constructors
+are given separately, in ordinary <CODE>fun</CODE> judgements.
+One can also write directly
+</P>
+<PRE>
+ data Succ : Nat -&gt; Nat ;
+</PRE>
+<P>
+which is syntactic sugar for the pair of judgements
+</P>
+<PRE>
+ fun Succ : Nat -&gt; Nat ;
+ data Nat = Succ ;
+</PRE>
+<P>
+If we did not mark <CODE>Zero</CODE> as <CODE>data</CODE>, the definition
+</P>
+<PRE>
+ fun isZero : Nat -&gt; Bool ;
+ def isZero Zero = True ;
+ def isZero _ = False ;
+</PRE>
+<P>
+would return <CODE>True</CODE> for all arguments, because the pattern <CODE>Zero</CODE>
+would be treated as a variable and it would hence match all values!
+This is a common pitfall in GF.
+</P>
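+<P>
+To put the pieces of this section together, here is a minimal sketch of an
+abstract syntax module with constructors and pattern equations. The module
+name <CODE>Arith</CODE> and the function <CODE>times</CODE> are illustrative
+assumptions and do not occur in the grammars above.
+</P>
+<PRE>
+    abstract Arith = {
+
+      cat Nat ;
+
+      -- constructors: declared with data, so that def patterns can match them
+      data
+        Zero : Nat ;
+        Succ : Nat -&gt; Nat ;
+
+      -- defined functions: declared with fun, computed by def equations
+      fun
+        plus, times : Nat -&gt; Nat -&gt; Nat ;
+      def
+        plus x Zero = x ;
+        plus x (Succ y) = Succ (plus x y) ;
+        times x Zero = Zero ;
+        times x (Succ y) = plus (times x y) x ;
+    }
+</PRE>
+<P>
+Following these equations by hand, <CODE>times (Succ (Succ Zero)) (Succ (Succ Zero))</CODE>
+reduces to <CODE>Succ (Succ (Succ (Succ Zero)))</CODE>. Had <CODE>Zero</CODE> and
+<CODE>Succ</CODE> been declared with <CODE>fun</CODE> only, the patterns in the
+equations would not match as intended.
+</P>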
+<P>
+<B>Exercise</B>. Implement an interpreter of a small functional programming
+language with natural numbers, lists, pairs, lambdas, etc. Use higher-order
+abstract syntax with semantic definitions. As object language, use
+your favourite programming language.
+</P>
+<A NAME="toc113"></A>
+<H2>Summary of GF language features</H2>
+<A NAME="toc114"></A>
+<H3>Judgements</H3>
+<P>
+We have generalized the <CODE>cat</CODE> judgement form and introduced two new forms
+for abstract syntax:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>form</TH>
+<TH COLSPAN="2">reading</TH>
+</TR>
+<TR>
+<TD><CODE>cat</CODE> <I>C</I> <I>G</I></TD>
+<TD><I>C</I> is a category in context <I>G</I></TD>
+</TR>
+<TR>
+<TD><CODE>def</CODE> <I>f</I> <I>P1</I> ... <I>Pn</I> <CODE>=</CODE> <I>t</I></TD>
+<TD>function <I>f</I> applied to <I>P1</I>...<I>Pn</I> has value <I>t</I></TD>
+</TR>
+<TR>
+<TD><CODE>data</CODE> <I>C</I> <CODE>=</CODE> <I>C1</I> <CODE>|</CODE> ... <CODE>|</CODE> <I>Cn</I></TD>
+<TD>category <I>C</I> has constructors <I>C1</I>...<I>Cn</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+The <B>context</B> in the <CODE>cat</CODE> judgement has the form
+</P>
+<PRE>
+ (x1 : T1) ... (xn : Tn)
+</PRE>
+<P>
+where the types <I>T1 ... Tn</I> may be increasingly dependent. To form a
+type, <I>C</I> must be equipped with arguments of each type in the
+context, satisfying the dependencies. As syntactic sugar, we have
+</P>
+<PRE>
+ T G === (x : T) G
+</PRE>
+<P>
+if <I>x</I> does not occur in <I>G</I>. The linearization type definition of a
+category does not mention the context.
+</P>
+<P>
+In <CODE>def</CODE> judgements, the arguments <I>P1</I>...<I>Pn</I> can be constructor and
+variable patterns as well as wild cards, and the binding and
+evaluation rules are the same as <a href="#secmatching">here</a>.
+</P>
+<P>
+A <CODE>data</CODE> judgement states that the names on the right-hand side are constructors
+of the category on the left-hand side. The precise types of the constructors are
+given in the <CODE>fun</CODE> judgements introducing them; the value type of a constructor
+of <I>C</I> must be of the form <I>C a1 ... am</I>. As syntactic sugar,
+</P>
+<PRE>
+ data f : A1 ... An -&gt; C a1 ... am ===
+ fun f : A1 ... An -&gt; C a1 ... am ; data C = f ;
+</PRE>
+<P></P>
+<A NAME="toc115"></A>
+<H3>Dependent function types</H3>
+<P>
+A <B>dependent function type</B> has the form
+</P>
+<PRE>
+ (x : A) -&gt; B
+</PRE>
+<P>
+where <I>B</I> depends on a variable <I>x</I> of type <I>A</I>. We have the
+following syntactic sugar:
+</P>
+<PRE>
+ (x,y : A) -&gt; B === (x : A) -&gt; (y : A) -&gt; B
+
+ (_ : A) -&gt; B === (x : A) -&gt; B if B does not depend on x
+
+ A -&gt; B === (_ : A) -&gt; B
+</PRE>
+<P>
+A <CODE>fun</CODE> function in abstract syntax may have function types as
+argument types. This is called <B>higher-order abstract syntax</B>.
+The linearization of an argument
+</P>
+<PRE>
+ \z0, ..., zn -&gt; b : (x0 : A0) -&gt; ... -&gt; (xn : An) -&gt; B
+</PRE>
+<P>
+is formed from the linearization <I>b*</I> of <I>b</I> by adding
+fields that hold the variable symbols:
+</P>
+<PRE>
+ b* ** {$0 = "z0" ; ... ; $n = "zn"}
+</PRE>
+<P>
+If an argument function is itself a higher-order function, its
+bound variables cannot be reached in linearization. Thus, in a sense,
+the higher-order abstract syntax of GF is just <B>second-order abstract syntax</B>.
+</P>
+<P>
+A <B>syntax tree</B> is a well-typed term in <B>beta-eta normal form</B>, which
+means that
+</P>
+<UL>
+<LI>its type is a basic type, i.e. it is not a partial application;
+<LI>its arguments are in eta normal form, i.e. either full applications or
+ lambda abstractions with bodies that are full applications;
+<LI>it has no beta redexes, i.e. applications of abstractions.
+</UL>
+
+<P>
+Terms that are not in this form may occur as arguments of dependent types
+and in <CODE>def</CODE> judgements, but they cannot be linearized.
+</P>
+<A NAME="toc116"></A>
+<H1>Grammars of formal languages</H1>
+<P>
+<a name="chapseven"></a>
+</P>
+<P>
+In this chapter, we will write a grammar for arithmetic expressions as known
+from school mathematics and many programming languages. We will see how to
+define precedences in GF, how to include built-in integers in grammars, and
+how to deal with spaces between tokens in desired ways. As an alternative concrete
+syntax, we will generate code for a JVM-like stack machine. We will conclude
+by extending the language with variable declarations and assignments, which
+are handled in a type-safe way by using higher-order abstract syntax.
+</P>
+<P>
+To write grammars for formal languages is usually less challenging than for
+natural languages. There are standard tools for this task, such as the YACC
+family of parser generators. Using GF would be overkill for many projects,
+and come with a penalty in efficiency. However, it is still worthwhile to
+look at this task. Typical applications of GF are natural-language interfaces
+to formal systems: in such applications, the translation between natural and
+formal language can be defined as a multilingual grammar. The use of higher-order
+abstract syntax, together with dependent types, provides a way to define a
+complete compiler in GF.
+</P>
+<A NAME="toc117"></A>
+<H2>Arithmetic expressions</H2>
+<A NAME="toc118"></A>
+<H3>Abstract syntax</H3>
+<P>
+We want to write a grammar for what is usually called <B>expressions</B>
+in programming languages. The expressions are built from integers by
+the binary operations of addition, subtraction, multiplication, and
+division. The abstract syntax is easy to write. We call it <CODE>Calculator</CODE>,
+since it can be used as the basis of a calculator.
+</P>
+<PRE>
+ abstract Calculator = {
+
+ cat Exp ;
+
+ fun
+ EPlus, EMinus, ETimes, EDiv : Exp -&gt; Exp -&gt; Exp ;
+ EInt : Int -&gt; Exp ;
+ }
+</PRE>
+<P>
+Notice the use of the category <CODE>Int</CODE>. It is a built-in category of
+integers. Its syntax trees are denoted by <B>integer literals</B>, which are
+sequences of digits. For instance,
+</P>
+<PRE>
+ 5457455814608954681 : Int
+</PRE>
+<P>
+These are the only objects of type <CODE>Int</CODE>:
+grammars are not allowed to declare functions with <CODE>Int</CODE> as value type.
+</P>
+<A NAME="toc119"></A>
+<H3>Concrete syntax: a simple approach</H3>
+<P>
+Arithmetic expressions should be unambiguous. If we write
+</P>
+<PRE>
+ 2 + 3 * 4
+</PRE>
+<P>
+it should be parsed as one, but not both, of
+</P>
+<PRE>
+ EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
+ ETimes (EPlus (EInt 2) (EInt 3)) (EInt 4)
+</PRE>
+<P>
+Under normal conventions, the former is chosen, because
+multiplication has <B>higher precedence</B> than addition.
+If we want to express the latter tree, we have to use
+parentheses:
+</P>
+<PRE>
+ (2 + 3) * 4
+</PRE>
+<P>
+However, it is not completely trivial to decide when to use
+parentheses and when not. We will therefore begin with a
+concrete syntax that always uses parentheses around binary
+operator applications.
+</P>
+<PRE>
+ concrete CalculatorP of Calculator = open Prelude in {
+
+ lincat
+ Exp = SS ;
+ lin
+ EPlus = infix "+" ;
+ EMinus = infix "-" ;
+ ETimes = infix "*" ;
+ EDiv = infix "/" ;
+ EInt i = i ;
+
+ oper
+ infix : Str -&gt; SS -&gt; SS -&gt; SS = \f,x,y -&gt;
+ ss ("(" ++ x.s ++ f ++ y.s ++ ")") ;
+ }
+</PRE>
+<P>
+Now we will obtain
+</P>
+<PRE>
+ &gt; linearize EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
+ ( 2 + ( 3 * 4 ) )
+</PRE>
+<P>
+The first problem, even more urgent than superfluous parentheses, is
+to get rid of superfluous spaces and to recognize integer literals
+in the parser.
+</P>
+<A NAME="toc120"></A>
+<H2>Lexing and unlexing</H2>
+<P>
+<a name="seclexing"></a>
+</P>
+<P>
+The input of parsing in GF is not just a string, but a list of
+<B>tokens</B>. By default, a list of tokens is obtained from a string
+by analysing it into <B>words</B>, which means chunks separated by
+spaces. Thus for instance
+</P>
+<PRE>
+ "(12 + (3 * 4))"
+</PRE>
+<P>
+is split into the tokens
+</P>
+<PRE>
+ "(12", "+", "(3", "*", "4))"
+</PRE>
+<P>
+The parser then tries to find each of these tokens among the terminals
+of the grammar, i.e. among the strings that can appear in linearizations.
+In our example, only the tokens <CODE>"+"</CODE> and <CODE>"*"</CODE> can be found, and
+parsing therefore fails.
+</P>
+<P>
+The proper way to split the above string into tokens would be
+</P>
+<PRE>
+ "(", "12", "+", "(", "3", "*", "4", ")", ")"
+</PRE>
+<P>
+Moreover, the tokens <CODE>"12"</CODE>, <CODE>"3"</CODE>, and <CODE>"4"</CODE> should not be sought
+among the terminals in the grammar, but treated as integer tokens, which
+are defined outside the grammar. Since GF aims to be fully general, such
+conventions are not built in: it must be possible for a grammar to have
+tokens such as <CODE>"12"</CODE> and <CODE>"12)"</CODE>. Therefore, GF has a way to select
+a <B>lexer</B>, a function that splits strings into tokens and classifies
+them into terminals, literals, etc.
+</P>
+<P>
+A lexer can be given as a flag to the parsing command:
+</P>
+<PRE>
+ &gt; parse -cat=Exp -lexer=codelit "(2 + (3 * 4))"
+ EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
+</PRE>
+<P>
+Since the lexer is usually a part of the language specification, it
+makes sense to put it in the concrete syntax by using the judgement
+</P>
+<PRE>
+ flags lexer = codelit ;
+</PRE>
+<P>
+The problem of getting correct spacing after linearization is likewise solved
+by an <B>unlexer</B>:
+</P>
+<PRE>
+ &gt; l -unlexer=code EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
+ (2 + (3 * 4))
+</PRE>
+<P>
+This flag, too, is usually put into the concrete syntax file.
+</P>
+<P>
+The lexers and unlexers that are available in the GF system can be
+seen by
+</P>
+<PRE>
+ &gt; help -lexer
+ &gt; help -unlexer
+</PRE>
+<P>
+A table of the most common lexers and unlexers is given in the Summary
+section 7.8.
+</P>
+<A NAME="toc121"></A>
+<H2>Precedence and fixity</H2>
+<P>
+<a name="secprecedence"></a>
+</P>
+<P>
+Here is a summary of the usual
+precedence rules in mathematics and programming languages:
+</P>
+<UL>
+<LI>Integer constants and expressions in parentheses have the highest precedence.
+<LI>Multiplication and division have equal precedence, lower than the highest
+ but higher than addition and subtraction, which are again equal.
+<LI>All four binary operations are <B>left-associative</B>, which means that
+ e.g. <CODE>1 + 2 + 3</CODE> means the same as <CODE>(1 + 2) + 3</CODE>.
+</UL>
+
+<P>
+One way of dealing with precedences in compiler books is by dividing expressions
+into three categories:
+</P>
+<UL>
+<LI>expressions: addition and subtraction
+<LI>terms: multiplication and division
+<LI>factors: constants and expressions in parentheses
+</UL>
+
+<P>
+The context-free grammar, also taking care of associativity, is the following:
+</P>
+<PRE>
+ Exp ::= Exp "+" Term | Exp "-" Term | Term ;
+ Term ::= Term "*" Fact | Term "/" Fact | Fact ;
+ Fact ::= Int | "(" Exp ")" ;
+</PRE>
+<P>
+A compiler, however, does not want to make a semantic distinction between the
+three categories. Nor does it want to build syntax trees with the
+<B>coercions</B> that enable the use of higher-level expressions on a lower level, and
+encode the use of parentheses. In compiler tools such as YACC, building abstract
+syntax trees is performed as a <B>semantic action</B>. For instance, if the parser
+recognizes an expression in parentheses, the action is to return only the
+expression, without encoding the parentheses.
+</P>
+<P>
+In GF, semantic actions could be encoded by using <CODE>def</CODE> definitions introduced
+<a href="#secdefdef">here</a>. But there is a more straightforward way of thinking about
+precedences: we introduce a parameter for precedence, and treat it as
+an inherent feature of expressions:
+</P>
+<PRE>
+ oper
+ Prec : PType = Ints 2 ;
+ TermPrec : Type = {s : Str ; p : Prec} ;
+
+ mkPrec : Prec -&gt; Str -&gt; TermPrec = \p,s -&gt; {s = s ; p = p} ;
+
+ lincat
+ Exp = TermPrec ;
+</PRE>
+<P>
+This example shows another way to use built-in integers in GF:
+the type <CODE>Ints 2</CODE> is a parameter type, whose values are the integers
+<CODE>0,1,2</CODE>. These are the three precedence levels we need. The main idea
+is to compare the inherent precedence of an expression with the context
+in which it is used. If the precedence is higher than or equal to
+the expected, then
+no parentheses are needed. Otherwise they are. We encode this rule in
+the operation
+</P>
+<PRE>
+ oper usePrec : TermPrec -&gt; Prec -&gt; Str = \x,p -&gt;
+ case lessPrec x.p p of {
+ True =&gt; "(" ++ x.s ++ ")" ;
+ False =&gt; x.s
+ } ;
+</PRE>
+<P>
+With this operation, we can build another one, that can be used for
+defining left-associative infix expressions:
+</P>
+<PRE>
+ infixl : Prec -&gt; Str -&gt; (_,_ : TermPrec) -&gt; TermPrec = \p,f,x,y -&gt;
+ mkPrec p (usePrec x p ++ f ++ usePrec y (nextPrec p)) ;
+</PRE>
+<P>
+Constant-like expressions (the highest level) can be built simply by
+</P>
+<PRE>
+ constant : Str -&gt; TermPrec = mkPrec 2 ;
+</PRE>
+<P>
+All these operations can be found in the library module <CODE>lib/prelude/Formal</CODE>,
+so we don't have to define them in our own code. Also the auxiliary operations
+<CODE>nextPrec</CODE> and <CODE>lessPrec</CODE> used in their definitions are defined there.
+The library has 5 levels instead of 3.
+</P>
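+<P>
+For the three-level <CODE>Prec</CODE> used here, the two auxiliary operations
+could be sketched as follows. This is only an illustration, written under the
+assumptions that integer patterns can be matched against values of
+<CODE>Ints 2</CODE> and that <CODE>Bool</CODE> is taken from <CODE>Prelude</CODE>;
+the actual definitions in <CODE>Formal</CODE> may differ in detail.
+</P>
+<PRE>
+    oper
+      -- the next higher level, staying at the highest level 2
+      nextPrec : Prec -&gt; Prec = \p -&gt; case &lt;p : Prec&gt; of {
+        0 =&gt; 1 ;
+        _ =&gt; 2
+        } ;
+
+      -- True if the first precedence is strictly lower than the second
+      lessPrec : Prec -&gt; Prec -&gt; Bool = \p,q -&gt;
+        case &lt;&lt;p,q&gt; : Prec * Prec&gt; of {
+          &lt;0,1&gt; | &lt;0,2&gt; | &lt;1,2&gt; =&gt; True ;
+          _ =&gt; False
+          } ;
+</PRE>
+<P>
+With these definitions, the right operand of <CODE>infixl 0 "+"</CODE> is
+expected at level 1, so a sum in that position has a lower precedence than
+expected and gets parentheses, whereas a product does not.
+</P>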
+<P>
+Now we can express the whole concrete syntax of <CODE>Calculator</CODE> compactly:
+</P>
+<PRE>
+ concrete CalculatorC of Calculator = open Formal, Prelude in {
+
+ flags lexer = codelit ; unlexer = code ; startcat = Exp ;
+
+ lincat Exp = TermPrec ;
+
+ lin
+ EPlus = infixl 0 "+" ;
+ EMinus = infixl 0 "-" ;
+ ETimes = infixl 1 "*" ;
+ EDiv = infixl 1 "/" ;
+ EInt i = constant i.s ;
+ }
+</PRE>
+<P>
+Let us just take one more look at the operation <CODE>usePrec</CODE>, which decides whether
+to put parentheses around a term or not. The case where parentheses are not needed
+around a string was defined as the string itself.
+However, this would imply that superfluous parentheses
+are never correct. A more liberal grammar is obtained by using the operation
+</P>
+<PRE>
+ parenthOpt : Str -&gt; Str = \s -&gt; variants {s ; "(" ++ s ++ ")"} ;
+</PRE>
+<P>
+which is actually used in the <CODE>Formal</CODE> library.
+But even in this way, we can only allow one pair of superfluous parentheses.
+Thus the parameter-based grammar has not quite reached the goal
+of implementing the same language as the expression-term-factor grammar.
+But it has the advantage of eliminating precedence distinctions from the
+abstract syntax.
+</P>
+<P>
+<B>Exercise</B>. Define non-associative and right-associative infix operations
+analogous to <CODE>infixl</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Add a constructor that puts parentheses around expressions
+to raise their precedence, but that is eliminated by a <CODE>def</CODE> definition.
+Test parsing with and without a pipe to <CODE>pt -transform=compute</CODE>.
+</P>
+<A NAME="toc122"></A>
+<H2>Code generation as linearization</H2>
+<P>
+The classical use of grammars of programming languages is in <B>compilers</B>,
+which translate one language into another. Typically the source language of
+a compiler is a high-level language and the target language is a machine
+language. The hub of a compiler is abstract syntax: the <B>front end</B> of
+the compiler parses source language strings into abstract syntax trees, and
+the <B>back end</B> linearizes these trees into the target language. This processing
+model is of course what GF uses for natural language translation; the main
+difference is that, in GF, the compiler could run in the opposite direction as
+well, that is, function as a <B>decompiler</B>. (In full-size compilers, the
+abstract syntax is also transformed by several layers of semantic analysis
+and optimizations, before the target code is generated; this can destroy
+reversibility and hence decompilation.)
+</P>
+<P>
+More for the sake of illustration
+than as a serious compiler, let us write a concrete
+syntax of <CODE>Calculator</CODE> that generates machine code similar to JVM (Java Virtual
+Machine). JVM is a so-called <B>stack machine</B>, whose code follows the
+<B>postfix</B> notation, also known as <B>reverse Polish</B> notation. Thus the
+expression
+</P>
+<PRE>
+ 2 + 3 * 4
+</PRE>
+<P>
+is translated to
+</P>
+<PRE>
+ iconst 2 ; iconst 3 ; iconst 4 ; imul ; iadd
+</PRE>
+<P>
+The linearization rules are not difficult to give:
+</P>
+<PRE>
+ lin
+ EPlus = postfix "iadd" ;
+ EMinus = postfix "isub" ;
+ ETimes = postfix "imul" ;
+ EDiv = postfix "idiv" ;
+ EInt i = ss ("iconst" ++ i.s) ;
+ oper
+ postfix : Str -&gt; SS -&gt; SS -&gt; SS = \op,x,y -&gt;
+ ss (x.s ++ ";" ++ y.s ++ ";" ++ op) ;
+</PRE>
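+<P>
+The rules above can be wrapped into a complete module roughly as follows.
+This is a sketch: the module name <CODE>CalculatorJ</CODE> is an assumption
+made for illustration, and <CODE>SS</CODE> and <CODE>ss</CODE> are taken
+from <CODE>Prelude</CODE>.
+</P>
+<PRE>
+    concrete CalculatorJ of Calculator = open Prelude in {
+
+      lincat
+        Exp = SS ;   -- no precedence field: postfix code needs no parentheses
+
+      lin
+        EPlus  = postfix "iadd" ;
+        EMinus = postfix "isub" ;
+        ETimes = postfix "imul" ;
+        EDiv   = postfix "idiv" ;
+        EInt i = ss ("iconst" ++ i.s) ;
+
+      oper
+        -- code for both operands first, then the instruction itself
+        postfix : Str -&gt; SS -&gt; SS -&gt; SS = \op,x,y -&gt;
+          ss (x.s ++ ";" ++ y.s ++ ";" ++ op) ;
+    }
+</PRE>
+<P>
+Since the operand order alone determines the structure of postfix code,
+the precedence parameter of the previous section is not needed here.
+</P>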
+<P></P>
+<A NAME="toc123"></A>
+<H2>Speaking aloud arithmetic expressions</H2>
+<P>
+Natural languages sometimes have difficulties in expressing mathematical
+formulas unambiguously, because they have no universal device of parentheses.
+For arithmetic formulas, a solution exists:
+</P>
+<PRE>
+ 2 + 3 * 4
+</PRE>
+<P>
+can be expressed
+</P>
+<PRE>
+ the sum of 2 and the product of 3 and 4
+</PRE>
+<P>
+However, this format is very verbose and unnatural, and becomes
+impossible to understand when the complexity of expressions grows.
+Fortunately, spoken language
+has a very nice way of using <B>pauses</B> for disambiguation. This device was
+introduced by Torbjörn Lager (personal communication, 2003)
+as an input mechanism to a calculator dialogue
+system; it seems to correspond very closely to how we actually speak when we
+want to communicate arithmetic expressions. Another application would be as
+a part of a programming assistant that reads code aloud.
+</P>
+<P>
+The idea is that, after every completed operation, there is a pause. Try this
+by speaking aloud the following lines, making a pause instead of pronouncing the
+word <CODE>PAUSE</CODE>:
+</P>
+<PRE>
+ 2 plus 3 times 4 PAUSE
+ 2 plus 3 PAUSE times 4 PAUSE
+</PRE>
+<P>
+A grammar implementing this convention is again simple to write:
+</P>
+<PRE>
+ lin
+ EPlus = infix "plus" ;
+ EMinus = infix "minus" ;
+ ETimes = infix "times" ;
+ EDiv = infix ["divided by"] ;
+ EInt i = i ;
+ oper
+ infix : Str -&gt; SS -&gt; SS -&gt; SS = \op,x,y -&gt;
+ ss (x.s ++ op ++ y.s ++ "PAUSE") ;
+</PRE>
+<P>
+Intuitively, a pause is taken to give the hearer time to compute an
+intermediate result.
+</P>
+<P>
+<B>Exercise</B>. Is the pause-based grammar unambiguous? Test with random examples!
+</P>
+<A NAME="toc124"></A>
+<H2>Programs with variables</H2>
+<P>
+A useful extension of arithmetic expressions is a <B>straight code</B> programming
+language. The programs of this language are <B>assignments</B> of the form <CODE>x = exp</CODE>,
+which assign expressions to variables. Expressions can moreover contain variables
+that have been given values in previous assignments.
+</P>
+<P>
+In this language, we use two new categories: programs and variables.
+A program is a sequence of assignments, where a variable is given a value.
+Logically, we want to distinguish <B>initializations</B> from other assignments:
+these are the assignments where a variable is given a value for the first time.
+Just like in C-like languages,
+we prefix an initializing assignment with the type of the variable.
+Here is an example of a piece of code written in the language:
+</P>
+<PRE>
+ int x = 2 + 3 ;
+ int y = x + 1 ;
+ x = x + 9 * y ;
+</PRE>
+<P>
+We define programs by the following constructors:
+</P>
+<PRE>
+ fun
+ PEmpty : Prog ;
+ PInit : Exp -&gt; (Var -&gt; Prog) -&gt; Prog ;
+ PAss : Var -&gt; Exp -&gt; Prog -&gt; Prog ;
+</PRE>
+<P>
+The interesting constructor is <CODE>PInit</CODE>, which uses
+higher-order abstract syntax for making the initialized variable available in
+the <B>continuation</B> of the program. The abstract syntax tree for the above code
+is
+</P>
+<PRE>
+ PInit (EPlus (EInt 2) (EInt 3)) (\x -&gt;
+ PInit (EPlus (EVar x) (EInt 1)) (\y -&gt;
+ PAss x (EPlus (EVar x) (ETimes (EInt 9) (EVar y)))
+ PEmpty))
+</PRE>
+<P>
+Since we want to prevent the use of uninitialized variables in programs, we
+don't give any constructors for <CODE>Var</CODE>! We just have a rule for using variables
+as expressions:
+</P>
+<PRE>
+ fun EVar : Var -&gt; Exp ;
+</PRE>
+<P>
+The rest of the grammar is just the same as for arithmetic expressions
+<a href="#secprecedence">here</a>. The best way to implement it is perhaps by writing a
+module that extends the expression module. The most natural start category
+of the extension is <CODE>Prog</CODE>.
+</P>
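+<P>
+Such an extension module might be sketched as follows. The module name
+<CODE>Straight</CODE> is hypothetical; the categories and functions are
+the ones introduced above.
+</P>
+<PRE>
+    abstract Straight = Calculator ** {
+
+      cat
+        Prog ; Var ;
+
+      fun
+        PEmpty : Prog ;
+        PInit  : Exp -&gt; (Var -&gt; Prog) -&gt; Prog ;  -- binds a new variable
+        PAss   : Var -&gt; Exp -&gt; Prog -&gt; Prog ;
+        EVar   : Var -&gt; Exp ;                        -- the only use of Var
+    }
+</PRE>
+<P>
+Because <CODE>Var</CODE> has no constructors, the only way to obtain
+a variable in a tree is to bind it in the function argument of <CODE>PInit</CODE>.
+</P>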
+<P>
+<B>Exercise</B>. Extend the straight-code language to expressions of type <CODE>float</CODE>.
+To guarantee type safety, you can define a category <CODE>Typ</CODE> of types, and
+make <CODE>Exp</CODE> and <CODE>Var</CODE> dependent on <CODE>Typ</CODE>. Basic floating point expressions
+can be formed from literals of the built-in GF type <CODE>Float</CODE>. The arithmetic
+operations should be made polymorphic (as <a href="#secpolymorphic">here</a>).
+</P>
+<A NAME="toc125"></A>
+<H3>The concrete syntax of assignments</H3>
+<P>
+We can define a C-like concrete syntax by using GF's <CODE>$</CODE> variables, as explained
+<a href="#secbinding">here</a>.
+</P>
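+<P>
+As a sketch, the program-level rules of such a C-like syntax could look
+like this. The fragment assumes that it extends <CODE>CalculatorC</CODE>
+(so that <CODE>Exp = TermPrec</CODE>) and that <CODE>Formal</CODE> and
+<CODE>Prelude</CODE> are open; it is an illustration rather than a finished grammar.
+</P>
+<PRE>
+    lincat
+      Prog, Var = SS ;
+    lin
+      PEmpty = ss [] ;
+      -- prog is a function argument, so prog.$0 holds the bound variable symbol
+      PInit exp prog =
+        ss ("int" ++ prog.$0 ++ "=" ++ exp.s ++ ";" ++ prog.s) ;
+      PAss vr exp prog =
+        ss (vr.s ++ "=" ++ exp.s ++ ";" ++ prog.s) ;
+      -- a variable behaves like a constant with respect to precedence
+      EVar v = constant v.s ;
+</PRE>
+<P>
+The right-hand side of an assignment is a position of the lowest precedence,
+which is why <CODE>exp.s</CODE> is used without added parentheses.
+</P>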
+<P>
+In a JVM-like syntax, we need two more instructions: <CODE>iload</CODE> <I>x</I>, which
+loads (pushes on the stack) the value of the variable <I>x</I>, and <CODE>istore</CODE> <I>x</I>,
+which stores the value of the currently topmost expression in the variable <I>x</I>.
+Thus the code for the example in the previous section is
+</P>
+<PRE>
+ iconst 2 ; iconst 3 ; iadd ; istore x ;
+ iload x ; iconst 1 ; iadd ; istore y ;
+ iload x ; iconst 9 ; iload y ; imul ; iadd ; istore x ;
+</PRE>
+<P>
+Those familiar with JVM will notice that we are using <B>symbolic addresses</B>, i.e.
+variable names instead of integer offsets in the memory. Neither real JVM nor
+our variant makes any distinction between the initialization and reassignment
+of a variable.
+</P>
+<P>
+<B>Exercise</B>. Finish the implementation of the
+C-to-JVM compiler by extending the expression modules
+to straight code programs.
+</P>
+<P>
+<B>Exercise</B>. If you did the exercise of adding floating point numbers to
+the language, you can now cash out the main advantage of type checking
+for code generation: selecting type-correct JVM instructions. The floating
+point instructions are precisely the same as the integer ones, except that
+the prefix is <CODE>f</CODE> instead of <CODE>i</CODE>, and that <CODE>fconst</CODE> takes floating
+point literals as arguments.
+</P>
+<A NAME="toc126"></A>
+<H3>A liberal syntax of variables</H3>
+<P>
+In many applications, the task of GF is just linearization and parsing;
+keeping track of bound variables and other semantic constraints is
+the task of other parts of the program. For instance, if we want to
+write a natural language interface that reads aloud C code, we can
+just as well use a context-free grammar of C, and leave it to the C
+compiler to check that variables make sense. In such a program, we may
+want to treat variables as <I>Strings</I>, i.e. to have a constructor
+</P>
+<PRE>
+ fun VString : String -&gt; Var ;
+</PRE>
+<P>
+The built-in category <CODE>String</CODE> has as its values <B>string literals</B>,
+which are strings in double quotes. The lexer and unlexer <CODE>codelit</CODE>
+restore and remove the quotes; when the lexer finds a token that is
+neither a terminal in the grammar nor an integer literal, it sends
+it to the parser as a string literal.
+</P>
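+<P>
+The classification rule in the previous paragraph can be sketched in Haskell as
+follows. This is a toy illustration and not the implementation of <CODE>codelit</CODE>:
+it splits the input with <CODE>words</CODE> instead of using Haskell's <CODE>lex</CODE>, and the
+names <CODE>Token</CODE> and <CODE>classify</CODE> as well as the list of terminals are
+assumptions made for the example.
+</P>
+<PRE>
+ module Main where
+
+ import Data.Char (isDigit)
+
+ -- Token classes sent to the parser.
+ data Token
+   = Terminal String    -- a word that occurs as a terminal in the grammar
+   | IntLit Integer     -- an integer literal
+   | StringLit String   -- anything else, treated as a string literal
+   deriving Show
+
+ -- Classify one word, given the terminals of the grammar.
+ classify :: [String] -&gt; String -&gt; Token
+ classify terminals w
+   | w `elem` terminals            = Terminal w
+   | not (null w) &amp;&amp; all isDigit w = IntLit (read w)
+   | otherwise                     = StringLit w
+
+ main :: IO ()
+ main = mapM_ (print . classify ["=", "+", ";"]) (words "foo = 12 + bar ;")
+</PRE>
+<P>
+With the hypothetical terminals <CODE>=</CODE>, <CODE>+</CODE>, and <CODE>;</CODE>, the words <CODE>foo</CODE> and
+<CODE>bar</CODE> are classified as string literals and <CODE>12</CODE> as an integer literal.
+</P>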
+<P>
+<B>Exercise</B>. Write a grammar for straight code without higher-order
+abstract syntax.
+</P>
+<P>
+<B>Exercise</B>. Extend the liberal straight code grammar to <CODE>while</CODE> loops and
+some other program constructs, and investigate if you can build a reasonable spoken
+language generator for this fragment.
+</P>
+<A NAME="toc127"></A>
+<H2>Conclusion</H2>
+<P>
+Since formal languages are syntactically simpler than natural languages, it
+is no wonder that their grammars can be defined in GF. Some thought is needed
+for dealing with precedences and spacing, but much of it is encoded in GF's
+libraries and built-in lexers and unlexers. If the sole purpose of a grammar
+is to implement a programming language, then the <B>BNF Converter</B> tool
+(BNFC) is more appropriate than GF:
+<center>
+<CODE>www.cs.chalmers.se/~markus/BNFC/</CODE>
+</center>
+BNFC uses standard YACC-like parser tools. GF has flags for printing
+grammars in the BNFC format.
+</P>
+<P>
+The most common applications of GF grammars of formal languages
+are in natural-language interfaces of various kinds.
+These systems don't usually need semantic control in GF abstract
+syntax. However, the situation can be different if the interface also comprises
+an interactive syntax editor, as in the GF-Key system
+(Beckert &amp; al. 2006, Burke &amp; Johannisson 2005).
+In that system, the editor is used for guiding programmers only to write
+type-correct code.
+</P>
+<P>
+The technique of continuations in modelling programming languages has recently
+been applied to natural language, for processing <B>anaphoric reference</B>,
+e.g. pronouns. It may be good to know that GF has the machinery available;
+for the time being, however (GF 2.8), dependent types and
+higher-order abstract syntax are not supported by the embedded GF implementations
+in Haskell and Java.
+</P>
+<P>
+<B>Exercise</B>. The book <I>The C Programming Language</I> by Kernighan and Ritchie
+(p. 123, 2nd edition, 1988) describes an English-like syntax for pointer and
+array declarations, and a C program for translating between English and C.
+The following example pair shows all the expression forms needed:
+</P>
+<PRE>
+ char (*(*x[3])())[5]
+
+ x: array[3] of pointer to function returning
+ pointer to array[5] of char
+</PRE>
+<P>
+Implement these translations by a GF grammar.
+</P>
+<P>
+<B>Exercise</B>. Design a natural-language interface to Unix command lines.
+It should be able to express verbally commands such as
+<CODE>cat, cd, grep, ls, mv, rm, wc</CODE> and also
+pipes built from them.
+</P>
+<A NAME="toc128"></A>
+<H2>Summary of GF language constructs</H2>
+<A NAME="toc129"></A>
+<H3>Lexers and unlexers</H3>
+<P>
+Lexers are set by the flag <CODE>lexer</CODE> and unlexers by the flag <CODE>unlexer</CODE>.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>lexer</TH>
+<TH COLSPAN="2">description</TH>
+</TR>
+<TR>
+<TD><CODE>words</CODE></TD>
+<TD>(default) tokens are separated by spaces or newlines</TD>
+</TR>
+<TR>
+<TD><CODE>literals</CODE></TD>
+<TD>like words, but integer and string literals recognized</TD>
+</TR>
+<TR>
+<TD><CODE>chars</CODE></TD>
+<TD>each character is a token</TD>
+</TR>
+<TR>
+<TD><CODE>code</CODE></TD>
+<TD>program code conventions (uses Haskell's lex)</TD>
+</TR>
+<TR>
+<TD><CODE>text</CODE></TD>
+<TD>with conventions on punctuation and capital letters</TD>
+</TR>
+<TR>
+<TD><CODE>codelit</CODE></TD>
+<TD>like code, but recognize literals (unknown words as strings)</TD>
+</TR>
+<TR>
+<TD><CODE>textlit</CODE></TD>
+<TD>like text, but recognize literals (unknown words as strings)</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>unlexer</TH>
+<TH COLSPAN="2">description</TH>
+</TR>
+<TR>
+<TD><CODE>unwords</CODE></TD>
+<TD>(default) space-separated token list</TD>
+</TR>
+<TR>
+<TD><CODE>text</CODE></TD>
+<TD>format as text: punctuation, capitals, paragraph &lt;p&gt;</TD>
+</TR>
+<TR>
+<TD><CODE>code</CODE></TD>
+<TD>format as code (spacing, indentation)</TD>
+</TR>
+<TR>
+<TD><CODE>textlit</CODE></TD>
+<TD>like text, but remove string literal quotes</TD>
+</TR>
+<TR>
+<TD><CODE>codelit</CODE></TD>
+<TD>like code, but remove string literal quotes</TD>
+</TR>
+<TR>
+<TD><CODE>concat</CODE></TD>
+<TD>remove all spaces</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<A NAME="toc130"></A>
+<H3>Built-in abstract syntax types</H3>
+<P>
+There are three built-in types. Their syntax trees are literals of corresponding kinds:
+</P>
+<UL>
+<LI><CODE>Int</CODE>, with nonnegative integer literals e.g. <CODE>987031434</CODE>
+<LI><CODE>Float</CODE>, with nonnegative floating-point literals e.g. <CODE>907.219807</CODE>
+<LI><CODE>String</CODE>, with string literals e.g. <CODE>"foo"</CODE>
+</UL>
+
+<P>
+Their linearization type is uniformly <CODE>{s : Str}</CODE>.
+</P>
+<A NAME="toc131"></A>
+<H1>Embedded grammars</H1>
+<P>
+<a name="chapeight"></a>
+</P>
+<P>
+GF grammars can be used as parts of programs written in other programming
+languages, such as Haskell and Java.
+This facility is based on several components:
+</P>
+<UL>
+<LI>a portable format for multilingual GF grammars
+<LI>an interpreter for this format written in the host language
+<LI>an API that enables reading grammar files and calling the interpreter
+<LI>a way to manipulate abstract syntax trees in the host language
+</UL>
+
+<P>
+In this chapter, we will show the basic ways of producing such
+<B>embedded grammars</B> and using them in Haskell, Java, and JavaScript programs.
+We will build a simple example application in each language:
+</P>
+<UL>
+<LI>a question-answering system in Haskell
+<LI>a translator GUI in Java
+<LI>a multilingual syntax editor in JavaScript
+</UL>
+
+<P>
+Moreover, we will show how grammar applications can be extended to
+spoken language by generating <B>language models</B> for speech recognition
+in various standard formats.
+</P>
+<A NAME="toc132"></A>
+<H2>The portable grammar format</H2>
+<P>
+The portable format is called GFCC, "GF Canonical Compiled". A file
+of this form can be produced from GF by the command
+</P>
+<PRE>
+ &gt; print_multi -printer=gfcc | write_file FILE.gfcc
+</PRE>
+<P>
+Files written in this format can also be imported in the GF system,
+which recognizes the suffix <CODE>.gfcc</CODE> and builds the multilingual
+grammar in memory.
+</P>
+<P>
+<I>This applies to GF version 3 and upwards. Older GF used a format suffixed</I>
+<CODE>.gfcm</CODE>.
+<I>At the time of writing, the Java interpreter also still uses the GFCM format.</I>
+</P>
+<P>
+GFCC is, in fact, the recommended format in
+which final grammar products are distributed, because GFCC files
+are stripped of superfluous information and can be started and applied
+faster than sets of separate modules.
+</P>
+<P>
+Application programmers never need to read or modify GFCC files.
+Also in this sense, they play the same role as machine code in
+general-purpose programming.
+</P>
+<A NAME="toc133"></A>
+<H2>The embedded interpreter and its API</H2>
+<P>
+The interpreter is a kind of miniature GF system, which can parse and
+linearize with grammars. But it can only perform a subset of the commands of
+the GF system. For instance, it
+cannot compile source grammars into the GFCC format; the compiler is the most
+heavy-weight component of the GF system, and should not be carried around
+in end-user applications.
+Since GFCC is much
+simpler than source GF, building an interpreter is relatively easy.
+Full-scale interpreters currently exist in Haskell and Java, and partial
+ones in C++, JavaScript, and Prolog. We will in this chapter focus
+on Haskell, Java, and JavaScript.
+</P>
+<P>
+Application programmers never need to read or modify the interpreter.
+They only need to access it via its API.
+</P>
+<A NAME="toc134"></A>
+<H2>Embedded GF applications in Haskell</H2>
+<P>
+Readers unfamiliar with Haskell, or who just want to program in Java, can safely
+skip this section. Everything will be repeated in the corresponding Java
+section. However, seeing the Haskell code may still be helpful because
+Haskell is in many ways closer to GF than Java is. In particular, recursive
+types of syntax trees and pattern matching over them are very similar in
+Haskell and GF,
+but require a complex encoding with classes and visitors in Java.
+</P>
+<A NAME="toc135"></A>
+<H3>The EmbedAPI module</H3>
+<P>
+The Haskell API contains (among other things) the following types and functions:
+</P>
+<PRE>
+ module EmbedAPI where
+
+ type MultiGrammar
+ type Language
+ type Category
+ type Tree
+
+ file2grammar :: FilePath -&gt; IO MultiGrammar
+
+ linearize :: MultiGrammar -&gt; Language -&gt; Tree -&gt; String
+ parse :: MultiGrammar -&gt; Language -&gt; Category -&gt; String -&gt; [Tree]
+
+ linearizeAll :: MultiGrammar -&gt; Tree -&gt; [String]
+ linearizeAllLang :: MultiGrammar -&gt; Tree -&gt; [(Language,String)]
+
+ parseAll :: MultiGrammar -&gt; Category -&gt; String -&gt; [[Tree]]
+ parseAllLang :: MultiGrammar -&gt; Category -&gt; String -&gt; [(Language,[Tree])]
+
+ languages :: MultiGrammar -&gt; [Language]
+ categories :: MultiGrammar -&gt; [Category]
+ startCat :: MultiGrammar -&gt; Category
+</PRE>
+<P>
+This is the only module that needs to be imported in the Haskell application.
+It is available as a part of the GF distribution, in the file
+<CODE>src/GF/GFCC/API.hs</CODE>.
+</P>
+<A NAME="toc136"></A>
+<H3>First application: a translator</H3>
+<P>
+Let us first build a stand-alone translator, which can translate
+in any multilingual grammar between any languages in the grammar.
+The whole code for this translator is here:
+</P>
+<PRE>
+ module Main where
+
+ import GF.GFCC.API
+ import System.Environment (getArgs)
+
+ main :: IO ()
+ main = do
+   file:_ &lt;- getArgs          -- the GFCC file is the first command-line argument
+   gr &lt;- file2grammar file
+   interact (translate gr)
+
+ -- Parse the input in any language, linearize the first tree in all other languages.
+ translate :: MultiGrammar -&gt; String -&gt; String
+ translate gr s = case parseAllLang gr (startCat gr) s of
+   (lg,t:_):_ -&gt; unlines [linearize gr l t | l &lt;- languages gr, l /= lg]
+   _ -&gt; "NO PARSE"
+</PRE>
+<P>
+To run the translator, first compile it by
+</P>
+<PRE>
+ % ghc --make -o trans Translator.hs
+</PRE>
+<P>
+Then produce a GFCC file. For instance, the <CODE>Food</CODE> grammar set can be
+compiled as follows:
+</P>
+<PRE>
+ % gfc --make FoodEng.gf FoodIta.gf
+</PRE>
+<P>
+This produces the file <CODE>Food.gfcc</CODE> (its name comes from the abstract syntax).
+</P>
+<P>
+<I>The gfc batch compiler program is available in GF 3 and upwards.</I>
+<I>In earlier versions, the appropriate command can be piped to gf:</I>
+</P>
+<PRE>
+ % echo "pm -printer=gfcc | wf Food.gfcc" | gf FoodEng.gf FoodIta.gf
+</PRE>
+<P>
+Equivalently, the grammars could be read into the GF shell and the <CODE>pm</CODE> command
+issued from there. But the Unix command has the advantage that it can
+be put into a <CODE>Makefile</CODE> to automate the compilation of an application.
+</P>
+<P>
+The Haskell library function <CODE>interact</CODE> makes the <CODE>trans</CODE> program work
+like a Unix filter, which reads from standard input and writes to standard
+output. Therefore it can be a part of a pipe and read and write files.
+The simplest way to translate is to <CODE>echo</CODE> input to the program:
+</P>
+<PRE>
+ % echo "this wine is delicious" | ./trans Food.gfcc
+ questo vino è delizioso
+</PRE>
+<P>
+The result is given in all languages except the input language.
+</P>
+<A NAME="toc137"></A>
+<H3>A looping translator</H3>
+<P>
+If the user wants to translate many expressions in a sequence, it
+is cumbersome to have to start the translator over and over again,
+because reading the grammar and building the parser always takes
+time. The translator of the previous section is easy to modify
+to enable this: just change <CODE>interact</CODE> in the main function to
+<CODE>loop</CODE>. It is not a standard Haskell function, so its definition has
+to be included:
+</P>
+<PRE>
+ loop :: (String -&gt; String) -&gt; IO ()
+ loop trans = do
+   s &lt;- getLine
+   if s == "quit" then putStrLn "bye" else do
+     putStrLn $ trans s
+     loop trans
+</PRE>
+<P>
+The loop keeps on translating line by line until the input line
+is <CODE>quit</CODE>.
+</P>
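+<P>
+When the program is used in a pipe, the input may end without a <CODE>quit</CODE> line,
+and <CODE>getLine</CODE> then raises an end-of-file exception. A possible variant,
+sketched here under the hypothetical name <CODE>loopEOF</CODE>, checks for the end of input
+first. To keep the example self-contained, it is run with <CODE>map toUpper</CODE> as a
+stand-in for the translation function.
+</P>
+<PRE>
+ module Main where
+
+ import Data.Char (toUpper)
+ import System.IO (isEOF)
+
+ -- Like loop, but also stops when standard input is exhausted.
+ loopEOF :: (String -&gt; String) -&gt; IO ()
+ loopEOF trans = do
+   eof &lt;- isEOF
+   if eof then return () else do
+     s &lt;- getLine
+     if s == "quit" then putStrLn "bye" else do
+       putStrLn (trans s)
+       loopEOF trans
+
+ main :: IO ()
+ main = loopEOF (map toUpper)   -- stand-in for (translate gr)
+</PRE>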
+<A NAME="toc138"></A>
+<H3>A question-answer system</H3>
+<P>
+<a name="secmathprogram"></a>
+</P>
+<P>
+The next application is also a translator, but it adds a
+<B>transfer</B> component to the grammar. Transfer is a function that
+takes the input syntax tree into some other syntax tree, which is
+then linearized and shown back to the user. The transfer function we
+are going to use is one that maps a question to its answer.
+The program accepts simple questions about arithmetic and answers
+"yes" or "no" in the language in which the question was asked:
+</P>
+<PRE>
+ Is 123 prime?
+ No.
+ 77 est impair ?
+ Oui.
+</PRE>
+<P>
+The main change that is needed to the pure translator is to give
+the type of <CODE>translate</CODE> an extra argument: a transfer function.
+</P>
+<PRE>
+ translate :: (Tree -&gt; Tree) -&gt; MultiGrammar -&gt; String -&gt; String
+</PRE>
+<P>
+You can think of ordinary translation as a special case where
+transfer is the identity function (<CODE>id</CODE> in Haskell).
+</P>
+<P>
+Also the behaviour of returning the reply in different languages
+should be changed so that the reply is returned in the <I>same</I> language.
+Here is the complete definition of <CODE>translate</CODE> in the new form.
+</P>
+<PRE>
+ translate tr gr s = case parseAllLang gr (startCat gr) s of
+   (lg,t:_):_ -&gt; linearize gr lg (tr t)
+   _ -&gt; "NO PARSE"
+</PRE>
+<P>
+To complete the system, we have to define the transfer function.
+So, how can we write a function from abstract syntax trees
+to abstract syntax trees? The embedded API does not make
+the constructors of the type <CODE>Tree</CODE> available for users. Even if it did, it would
+be quite complicated to use the type, and programs would be likely
+to produce trees that are ill-typed in GF and therefore cannot
+be linearized.
+</P>
+<A NAME="toc139"></A>
+<H3>Exporting GF datatypes</H3>
+<P>
+The way to go in defining transfer is to use GF's tree constructors, that
+is, the <CODE>fun</CODE> functions, as if they were Haskell's data constructors.
+There is enough resemblance between GF and Haskell to make this possible
+in most cases. It is even possible in Java, as we shall see later.
+</P>
+<P>
+Thus every category of GF is translated into a Haskell datatype, where the
+functions producing a value of that category are treated as constructors.
+The translation is obtained by using the batch compiler with the command
+</P>
+<PRE>
+ % gfc -haskell Food.gfcc
+</PRE>
+<P>
+It is also possible to produce the Haskell file together with GFCC, by
+</P>
+<PRE>
+ % gfc --make -haskell FoodEng.gf FoodIta.gf
+</PRE>
+<P>
+The result is a file named <CODE>GSyntax.hs</CODE>, containing a
+module named <CODE>GSyntax</CODE>.
+</P>
+<P>
+<I>In GF before version 3, the same result is obtained from within GF, by the command</I>
+</P>
+<PRE>
+ &gt; print_grammar -printer=gfcc_haskell | write_file GSyntax.hs
+</PRE>
+<P></P>
+<P>
+As an example, we take
+the grammar we are going to use for queries. The abstract syntax is
+</P>
+<PRE>
+ abstract Math = {
+
+   flags startcat = Question ;
+
+   cat Answer ; Question ; Object ;
+
+   fun
+     Even : Object -&gt; Question ;
+     Odd : Object -&gt; Question ;
+     Prime : Object -&gt; Question ;
+     Number : Int -&gt; Object ;
+
+     Yes : Answer ;
+     No : Answer ;
+ }
+</PRE>
+<P>
+It is translated to the following system of datatypes:
+</P>
+<PRE>
+ newtype GInt = GInt Integer
+
+ data GAnswer =
+     GYes
+   | GNo
+
+ data GObject = GNumber GInt
+
+ data GQuestion =
+     GPrime GObject
+   | GOdd GObject
+   | GEven GObject
+</PRE>
+<P>
+All type and constructor names are prefixed with a <CODE>G</CODE> to prevent clashes.
+</P>
+<P>
+Now it is possible to define functions from and to these datatypes in Haskell.
+Haskell's type checker guarantees that the functions are well-typed also with
+respect to GF. Here is a question-to-answer function for this language:
+</P>
+<PRE>
+ answer :: GQuestion -&gt; GAnswer
+ answer p = case p of
+   GOdd x -&gt; test odd x
+   GEven x -&gt; test even x
+   GPrime x -&gt; test prime x
+
+ value :: GObject -&gt; Int
+ value e = case e of
+   GNumber (GInt i) -&gt; fromInteger i
+
+ test :: (Int -&gt; Bool) -&gt; GObject -&gt; GAnswer
+ test f x = if f (value x) then GYes else GNo
+</PRE>
+<P>
+So it is the function <CODE>answer</CODE> that we want to apply as transfer.
+The only problem is the <I>type</I> of this function: the parsing and
+linearization methods of <CODE>API</CODE> work with <CODE>Tree</CODE>s and not
+with <CODE>GQuestion</CODE>s and <CODE>GAnswer</CODE>s.
+</P>
+<P>
+Fortunately the Haskell translation of GF takes care of translating
+between trees and the generated datatypes. This is done by using
+a class with the required translation methods:
+</P>
+<PRE>
+ class Gf a where
+   gf :: a -&gt; Tree
+   fg :: Tree -&gt; a
+</PRE>
+<P>
+The Haskell code generator also generates instances of this class
+for each datatype, for example,
+</P>
+<PRE>
+ instance Gf GQuestion where
+   gf (GEven x1) = DTr [] (AC (CId "Even")) [gf x1]
+   gf (GOdd x1) = DTr [] (AC (CId "Odd")) [gf x1]
+   gf (GPrime x1) = DTr [] (AC (CId "Prime")) [gf x1]
+   fg t =
+     case t of
+       DTr [] (AC (CId "Even")) [x1] -&gt; GEven (fg x1)
+       DTr [] (AC (CId "Odd")) [x1] -&gt; GOdd (fg x1)
+       DTr [] (AC (CId "Prime")) [x1] -&gt; GPrime (fg x1)
+       _ -&gt; error ("no Question " ++ show t)
+</PRE>
+<P>
+Needless to say, <CODE>GSyntax</CODE> is a module that a programmer
+never needs to look into, let alone change: it is enough to know that it
+contains a systematic encoding and decoding between an abstract syntax
+and Haskell datatypes, where
+</P>
+<UL>
+<LI>all GF names are in Haskell prefixed with <CODE>G</CODE>
+<LI><CODE>gf</CODE> translates from Haskell to GF
+<LI><CODE>fg</CODE> translates from GF to Haskell
+</UL>
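+<P>
+The round trip through <CODE>fg</CODE> and <CODE>gf</CODE> can be tried out without the GF
+library by replacing the API's <CODE>Tree</CODE> with a simplified stand-in. In the
+sketch below, the type <CODE>Tree</CODE> with its constructors <CODE>Node</CODE> and <CODE>Lit</CODE> is
+an assumption made for illustration; the real tree type and the generated
+instances look different. Only two of the question forms are covered.
+</P>
+<PRE>
+ module Main where
+
+ -- Stand-in for the abstract Tree type of the API (an assumption of this sketch).
+ data Tree = Node String [Tree] | Lit Integer
+   deriving (Eq, Show)
+
+ class Gf a where
+   gf :: a -&gt; Tree
+   fg :: Tree -&gt; a
+
+ newtype GInt = GInt Integer
+ data GObject = GNumber GInt
+ data GQuestion = GEven GObject | GOdd GObject
+ data GAnswer = GYes | GNo
+
+ instance Gf GInt where
+   gf (GInt i) = Lit i
+   fg (Lit i) = GInt i
+   fg t = error ("no Int " ++ show t)
+
+ instance Gf GObject where
+   gf (GNumber x) = Node "Number" [gf x]
+   fg (Node "Number" [x]) = GNumber (fg x)
+   fg t = error ("no Object " ++ show t)
+
+ instance Gf GQuestion where
+   gf (GEven x) = Node "Even" [gf x]
+   gf (GOdd x)  = Node "Odd" [gf x]
+   fg (Node "Even" [x]) = GEven (fg x)
+   fg (Node "Odd" [x])  = GOdd (fg x)
+   fg t = error ("no Question " ++ show t)
+
+ instance Gf GAnswer where
+   gf GYes = Node "Yes" []
+   gf GNo  = Node "No" []
+   fg (Node "Yes" []) = GYes
+   fg (Node "No" [])  = GNo
+   fg t = error ("no Answer " ++ show t)
+
+ -- A typed function on the generated datatypes.
+ answer :: GQuestion -&gt; GAnswer
+ answer q = case q of
+   GEven (GNumber (GInt i)) -&gt; if even i then GYes else GNo
+   GOdd  (GNumber (GInt i)) -&gt; if odd i then GYes else GNo
+
+ -- Decode the tree, compute on typed values, encode the result.
+ transfer :: Tree -&gt; Tree
+ transfer = gf . answer . fg
+
+ main :: IO ()
+ main = print (transfer (Node "Odd" [Node "Number" [Lit 77]]))
+ -- prints Node "Yes" []
+</PRE>
+<P>
+The point of the design is that <CODE>answer</CODE> is checked by Haskell's type checker,
+while all untyped tree handling is confined to the <CODE>Gf</CODE> instances.
+</P>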
+
+<A NAME="toc140"></A>
+<H3>Putting it all together</H3>
+<P>
+Here is the complete code for the Haskell module <CODE>TransferLoop.hs</CODE>.
+</P>
+<PRE>
+ module Main where
+
+ import GF.GFCC.API
+ import TransferDef (transfer)
+
+ main :: IO ()
+ main = do
+   gr &lt;- file2grammar "Math.gfcc"
+   loop (translate transfer gr)
+
+ loop :: (String -&gt; String) -&gt; IO ()
+ loop trans = do
+   s &lt;- getLine
+   if s == "quit" then putStrLn "bye" else do
+     putStrLn $ trans s
+     loop trans
+
+ -- Parse the input, apply transfer, linearize in the language of the input.
+ translate :: (Tree -&gt; Tree) -&gt; MultiGrammar -&gt; String -&gt; String
+ translate tr gr s = case parseAllLang gr (startCat gr) s of
+   (lg,t:_):_ -&gt; linearize gr lg (tr t)
+   _ -&gt; "NO PARSE"
+</PRE>
+<P>
+This is the <CODE>Main</CODE> module, which just needs a function <CODE>transfer</CODE> from
+<CODE>TransferDef</CODE> in order to compile. In the current application, this module
+looks as follows:
+</P>
+<PRE>
+ module TransferDef where
+
+ import GF.GFCC.API (Tree)
+ import GSyntax
+
+ transfer :: Tree -&gt; Tree
+ transfer = gf . answer . fg
+
+ answer :: GQuestion -&gt; GAnswer
+ answer p = case p of
+   GOdd x -&gt; test odd x
+   GEven x -&gt; test even x
+   GPrime x -&gt; test prime x
+
+ value :: GObject -&gt; Int
+ value e = case e of
+   GNumber (GInt i) -&gt; fromInteger i
+
+ test :: (Int -&gt; Bool) -&gt; GObject -&gt; GAnswer
+ test f x = if f (value x) then GYes else GNo
+
+ prime :: Int -&gt; Bool
+ prime x = elem x primes where
+   primes = sieve [2 .. x]
+   sieve (p:xs) = p : sieve [ n | n &lt;- xs, n `mod` p &gt; 0 ]
+   sieve [] = []
+</PRE>
+<P>
+This module, in turn, needs <CODE>GSyntax</CODE> to compile, and the main module
+needs <CODE>Math.gfcc</CODE> to run. To automate the production of the system,
+we write a <CODE>Makefile</CODE> as follows:
+</P>
+<PRE>
+ all:
+	gfc --make -haskell MathEng.gf MathFre.gf
+	ghc --make -o ./math TransferLoop.hs
+	strip math
+</PRE>
+<P>
+(Notice that the empty segments starting the command lines in a Makefile must be tabs.)
+Now we can compile the whole system by just typing
+</P>
+<PRE>
+ make
+</PRE>
+<P>
+Then you can run it by typing
+</P>
+<PRE>
+ ./math
+</PRE>
+<P>
+Well --- you will of course need some concrete syntaxes of <CODE>Math</CODE> in order
+to succeed. We have defined ours by using the resource functor design pattern,
+as explained <a href="#secfunctor">here</a>.
+</P>
+<P>
+Just to summarize, the source of the application consists of the following files:
+</P>
+<PRE>
+ Makefile -- a makefile
+ Math.gf -- abstract syntax
+ Math???.gf -- concrete syntaxes
+ TransferDef.hs -- definition of question-to-answer function
+ TransferLoop.hs -- Haskell Main module
+</PRE>
+<P></P>
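+<P>
+The function <CODE>prime</CODE> in <CODE>TransferDef</CODE> builds a sieve of all numbers up to
+<CODE>x</CODE> for every question, which becomes slow for large numbers. As a sketch of
+an alternative, and not as a part of the original application, the same test can
+be written with trial division, which only tries divisors up to the square root
+of <CODE>x</CODE>:
+</P>
+<PRE>
+ module Main where
+
+ -- Trial division: x is prime if it is at least 2 and has no divisor d
+ -- with d * d &lt;= x.
+ prime :: Int -&gt; Bool
+ prime x = x &gt;= 2 &amp;&amp; all (\d -&gt; x `mod` d /= 0) (takeWhile (\d -&gt; d * d &lt;= x) [2 ..])
+
+ main :: IO ()
+ main = print (map prime [1, 2, 77, 97, 123])
+ -- prints [False,True,False,True,False]
+</PRE>
+<P>
+This definition has the same type as the original one, so it could replace it in
+<CODE>TransferDef</CODE> without other changes.
+</P>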
+<A NAME="toc141"></A>
+<H2>Embedded GF applications in Java</H2>
+<P>
+When an API for GFCC in Java is available,
+we will write the same applications in Java as
+were written in Haskell above. Until then, we will
+build another kind of application, which does not require
+modification of generated Java code.
+</P>
+<P>
+More information on embedded GF grammars in Java can be found in the document
+</P>
+<PRE>
+ www.cs.chalmers.se/~bringert/gf/gf-java.html
+</PRE>
+<P>
+by Björn Bringert.
+</P>
+<A NAME="toc142"></A>
+<H3>Translets</H3>
+<P>
+A Java system needs many more files than a Haskell system.
+To get started, one can fetch the package <CODE>gfc2java</CODE> from
+</P>
+<PRE>
+ www.cs.chalmers.se/~bringert/darcs/gfc2java/
+</PRE>
+<P>
+by using the Darcs version control system as described in the <CODE>gf-java</CODE> home page.
+</P>
+<P>
+The <CODE>gfc2java</CODE> package contains a script <CODE>build-translet</CODE>, which can be applied
+to any <CODE>.gfcm</CODE> file to create a <B>translet</B>, a small translation GUI. For the <CODE>Food</CODE>
+grammars of <a href="#chapthree">the third chapter</a>, we first create a file <CODE>food.gfcm</CODE> by
+</P>
+<PRE>
+ % echo "pm | wf food.gfcm" | gf FoodEng.gf FoodIta.gf
+</PRE>
+<P>
+and then run
+</P>
+<PRE>
+ % build_translet food.gfcm
+</PRE>
+<P>
+The resulting file <CODE>translate-food.jar</CODE> can be run with
+</P>
+<PRE>
+ % java -jar translate-food.jar
+</PRE>
+<P>
+The translet looks like this:
+</P>
+<P>
+ <IMG ALIGN="right" SRC="food-translet.png" BORDER="0" ALT="">
+</P>
+<A NAME="toc143"></A>
+<H3>Dialogue systems</H3>
+<P>
+A question-answer system is a special case of a <B>dialogue system</B>, where the user and
+the computer communicate by writing or, even more properly, by speech. The <CODE>gf-java</CODE>
+homepage provides an example of the simplest dialogue system imaginable, where
+the conversation has just two rules:
+</P>
+<UL>
+<LI>if the user says <I>here you go</I>, the system says <I>thanks</I>
+<LI>if the user says <I>thanks</I>, the system says <I>you are welcome</I>
+</UL>
+
+<P>
+The conversation can be made in both English and Swedish; the user's initiative
+decides which language the system replies in. Thus the structure is very similar
+to the <CODE>math</CODE> program <a href="#secmathprogram">here</a>. The GF and
+Java sources of the program can be
+found in
+</P>
+<PRE>
+ www.cs.chalmers.se/~bringert/darcs/simpledemo
+</PRE>
+<P>
+again accessible with the Darcs version control system.
+</P>
+<A NAME="toc144"></A>
+<H2>Language models for speech recognition</H2>
+<P>
+The standard way of using GF in speech recognition is by building
+<B>grammar-based language models</B>. To this end, GF comes with compilers
+into several formats that are used in speech recognition systems.
+One such format is GSL, used in the <A HREF="http://www.nuance.com">Nuance speech recognizer</A>.
+It is produced from GF simply by printing a grammar with the flag
+<CODE>-printer=gsl</CODE>. The following example uses the smart house grammar defined
+<a href="#secsmarthouse">here</a>.
+</P>
+<PRE>
+ &gt; import -conversion=finite SmartEng.gf
+ &gt; print_grammar -printer=gsl
+
+ ;GSL2.0
+ ; Nuance speech recognition grammar for SmartEng
+ ; Generated by GF
+
+ .MAIN SmartEng_2
+
+ SmartEng_0 [("switch" "off") ("switch" "on")]
+ SmartEng_1 ["dim" ("switch" "off")
+ ("switch" "on")]
+ SmartEng_2 [(SmartEng_0 SmartEng_3)
+ (SmartEng_1 SmartEng_4)]
+ SmartEng_3 ("the" SmartEng_5)
+ SmartEng_4 ("the" SmartEng_6)
+ SmartEng_5 "fan"
+ SmartEng_6 "light"
+</PRE>
+<P>
+Other formats available via the <CODE>-printer</CODE> flag include:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Format</TH>
+<TH COLSPAN="2">Description</TH>
+</TR>
+<TR>
+<TD><CODE>gsl</CODE></TD>
+<TD>Nuance GSL speech recognition grammar</TD>
+</TR>
+<TR>
+<TD><CODE>jsgf</CODE></TD>
+<TD>Java Speech Grammar Format (JSGF)</TD>
+</TR>
+<TR>
+<TD><CODE>jsgf_sisr_old</CODE></TD>
+<TD>JSGF with semantic tags in SISR WD 20030401 format</TD>
+</TR>
+<TR>
+<TD><CODE>srgs_abnf</CODE></TD>
+<TD>SRGS ABNF format</TD>
+</TR>
+<TR>
+<TD><CODE>srgs_xml</CODE></TD>
+<TD>SRGS XML format</TD>
+</TR>
+<TR>
+<TD><CODE>srgs_xml_prob</CODE></TD>
+<TD>SRGS XML format, with weights</TD>
+</TR>
+<TR>
+<TD><CODE>slf</CODE></TD>
+<TD>finite automaton in the HTK SLF format</TD>
+</TR>
+<TR>
+<TD><CODE>slf_sub</CODE></TD>
+<TD>finite automaton with sub-automata in HTK SLF</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+All currently available formats can be seen in GF with <CODE>help -printer</CODE>.
+</P>
+<A NAME="toc145"></A>
+<H2>Dependent types and spoken language models</H2>
+<P>
+We have used dependent types to control semantic well-formedness
+in grammars. This is important in traditional type theory
+applications such as proof assistants, where only mathematically
+meaningful formulas should be constructed. But semantic filtering has
+also proved important in speech recognition, because it reduces the
+ambiguity of the results.
+</P>
+<P>
+Now, GSL is a context-free format, so how does it cope with dependent types?
+In general, dependent types can give rise to infinitely many basic types
+(exercise!), whereas a context-free grammar can by definition only have
+finitely many nonterminals.
+</P>
+<P>
+This is where the flag <CODE>-conversion=finite</CODE> is needed in the <CODE>import</CODE>
+command. Its effect is to convert a GF grammar with dependent types to
+one without, so that each instance of a dependent type is replaced by
+an atomic type. This can then be used as a nonterminal in a context-free
+grammar. The <CODE>finite</CODE> conversion presupposes that every
+dependent type has only finitely many instances, which is in fact
+the case in the <CODE>Smart</CODE> grammar.
+</P>
+<P>
+<B>Exercise</B>. If you have access to the Nuance speech recognizer,
+test it with GF-generated language models for <CODE>SmartEng</CODE>. Do this
+both with and without <CODE>-conversion=finite</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Construct an abstract syntax with infinitely many instances
+of dependent types.
+</P>
+<A NAME="toc146"></A>
+<H3>Statistical language models</H3>
+<P>
+An alternative to grammar-based language models is
+<B>statistical language models</B> (<B>SLM</B>s). An SLM is
+built from a <B>corpus</B>, i.e. a set of utterances. It specifies the
+probability of each <B>n-gram</B>, i.e. sequence of <I>n</I> words. The
+typical value of <I>n</I> is 2 (bigrams) or 3 (trigrams).
+</P>
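+<P>
+As an illustration of what such a model contains, the following Haskell sketch
+counts the bigrams of a small corpus and estimates the probability of a word
+given the previous word as a relative frequency. The corpus and the names
+<CODE>bigrams</CODE>, <CODE>bigramCounts</CODE>, and <CODE>bigramProb</CODE> are made up for this example;
+real SLM tools also apply smoothing, which is left out here.
+</P>
+<PRE>
+ module Main where
+
+ import qualified Data.Map as Map
+
+ -- All pairs of adjacent words in one utterance.
+ bigrams :: [String] -&gt; [(String, String)]
+ bigrams ws = zip ws (drop 1 ws)
+
+ -- Count how often each bigram occurs in a corpus of utterances.
+ bigramCounts :: [String] -&gt; Map.Map (String, String) Int
+ bigramCounts corpus =
+   Map.fromListWith (+) [ (b, 1) | u &lt;- corpus, b &lt;- bigrams (words u) ]
+
+ -- Relative frequency of w2 among all bigrams that start with w1.
+ -- (Unsmoothed: the result is NaN if w1 never starts a bigram.)
+ bigramProb :: Map.Map (String, String) Int -&gt; String -&gt; String -&gt; Double
+ bigramProb counts w1 w2 = fromIntegral n / fromIntegral total
+   where
+     n     = Map.findWithDefault 0 (w1, w2) counts
+     total = sum [ c | ((v, _), c) &lt;- Map.toList counts, v == w1 ]
+
+ main :: IO ()
+ main = do
+   -- A hypothetical three-utterance corpus.
+   let counts = bigramCounts ["switch on the fan", "switch off the light", "dim the light"]
+   print (bigramProb counts "the" "light")
+</PRE>
+<P>
+In this corpus, <I>the</I> is followed by <I>light</I> in two of its three occurrences,
+so the estimated probability is 2/3.
+</P>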
+<P>
+One advantage of SLMs over grammar-based models is that they are
+<B>robust</B>, i.e. they can be used to recognize sequences that would
+be out of the grammar or the corpus. Another advantage is that
+an SLM can be built "for free" if a corpus is available.
+</P>
+<P>
+However, collecting a corpus can require a lot of work, and writing
+a grammar can be less demanding, especially with tools such as GF or
+Regulus. This advantage of grammars can be combined with robustness
+by creating a back-up SLM from a <B>synthesized corpus</B>. This means
+simply that the grammar is used for generating such a corpus.
+In GF, this can be done with the <CODE>generate_trees</CODE> command.
+As with grammar-based models, the quality of the SLM is better
+if meaningless utterances are excluded from the corpus. Thus
+a good way to generate an SLM from a GF grammar is by using
+dependent types and filtering the results through the type checker:
+</P>
+<PRE>
+ &gt; generate_trees | put_trees -transform=solve | linearize
+</PRE>
+<P>
+The method of creating statistical language models from corpora synthesized
+from GF grammars is applied and evaluated in (Jonson 2006).
+</P>
+<P>
+<B>Exercise</B>. Measure the size of the corpus generated from
+<CODE>SmartEng</CODE> (defined <a href="#secsmarthouse">here</a>), with and without type checker filtering.
+</P>
+
+<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
+<!-- cmdline: txt2tags -thtml -\-toc gf-tutorial.txt -->
+</BODY></HTML>
diff --git a/doc/mytree.png b/doc/mytree.png
new file mode 100644
index 000000000..fafcc8772
--- /dev/null
+++ b/doc/mytree.png
Binary files differ