new tutorial and reference manual

author: aarne <aarne@cs.chalmers.se> 2007-12-21 15:10:38 +0000
committer: aarne <aarne@cs.chalmers.se> 2007-12-21 15:10:38 +0000
commit: 5ee1714fd23e974d1cf2511fa398b6ce310a9807 (patch)
tree: 7a82f85d4f4681086430fdefd7903e4a26015c3f /doc/gf-tutorial.html
parent: c5017f28aad7702838b9861aa3f6cbf7b3bacca5 (diff)
1 files changed, 7952 insertions, 0 deletions
diff --git a/doc/gf-tutorial.html b/doc/gf-tutorial.html
new file mode 100644
index 000000000..1e6d961b8
--- /dev/null
+++ b/doc/gf-tutorial.html
@@ -0,0 +1,7952 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
+<HTML>
+<HEAD>
+<META NAME="generator" CONTENT="http://txt2tags.sf.net">
+<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
+<TITLE>Grammatical Framework Tutorial</TITLE>
+</HEAD><BODY BGCOLOR="white" TEXT="black">
+<P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1>
+<FONT SIZE="4">
+<I>Aarne Ranta</I><BR>
+Draft, November 2007
+</FONT></CENTER>
+
+<P></P>
+<HR NOSHADE SIZE=1>
+<P></P>
+  <UL>
+  <LI><A HREF="#toc1">Getting started with GF</A>
+    <UL>
+    <LI><A HREF="#toc2">What GF is</A>
+    <LI><A HREF="#toc3">Getting the GF system</A>
+    <LI><A HREF="#toc4">Running the GF system</A>
+    <LI><A HREF="#toc5">A "Hello World" grammar</A>
+      <UL>
+      <LI><A HREF="#toc6">The program: abstract syntax and concrete syntaxes</A>
+      <LI><A HREF="#toc7">Using the grammar in the GF system</A>
+      </UL>
+    <LI><A HREF="#toc8">Using grammars from outside GF</A>
+    <LI><A HREF="#toc9">What else can be done with the grammar</A>
+    <LI><A HREF="#toc10">Summary of GF language features</A>
+      <UL>
+      <LI><A HREF="#toc11">Modules</A>
+      <LI><A HREF="#toc12">Judgements</A>
+      <LI><A HREF="#toc13">Types and terms</A>
+      <LI><A HREF="#toc14">Type checking</A>
+      </UL>
+    </UL>
+  <LI><A HREF="#toc15">Designing a grammar for complex phrases</A>
+    <UL>
+    <LI><A HREF="#toc16">The abstract syntax Food</A>
+    <LI><A HREF="#toc17">The concrete syntax FoodEng</A>
+    <LI><A HREF="#toc18">Commands for testing grammars</A>
+      <UL>
+      <LI><A HREF="#toc19">Generating trees and strings</A>
+      <LI><A HREF="#toc20">More on pipes; tracing</A>
+      <LI><A HREF="#toc21">Writing and reading files</A>
+      <LI><A HREF="#toc22">Visualizing trees</A>
+      <LI><A HREF="#toc23">System commands</A>
+      </UL>
+    <LI><A HREF="#toc24">An Italian concrete syntax</A>
+    <LI><A HREF="#toc25">Free variation</A>
+    <LI><A HREF="#toc26">More application of multilingual grammars</A>
+      <UL>
+      <LI><A HREF="#toc27">Multilingual treebanks</A>
+      <LI><A HREF="#toc28">Translation session</A>
+      <LI><A HREF="#toc29">Translation quiz</A>
+      <LI><A HREF="#toc30">Multilingual syntax editing</A>
+      </UL>
+    <LI><A HREF="#toc31">Context-free grammars and GF</A>
+      <UL>
+      <LI><A HREF="#toc32">The "cf" grammar format</A>
+      <LI><A HREF="#toc33">Restrictions of context-free grammars</A>
+      </UL>
+    <LI><A HREF="#toc34">Modules and files</A>
+    <LI><A HREF="#toc35">Using operations and resource modules</A>
+      <UL>
+      <LI><A HREF="#toc36">The golden rule of functional programming</A>
+      <LI><A HREF="#toc37">Operation definitions</A>
+      <LI><A HREF="#toc38">The ``resource`` module type</A>
+      <LI><A HREF="#toc39">Opening a resource</A>
+      <LI><A HREF="#toc40">Partial application</A>
+      <LI><A HREF="#toc41">Testing resource modules</A>
+      </UL>
+    <LI><A HREF="#toc42">Grammar architecture</A>
+      <UL>
+      <LI><A HREF="#toc43">Extending a grammar</A>
+      <LI><A HREF="#toc44">Multiple inheritance</A>
+      <LI><A HREF="#toc45">Visualizing module structure</A>
+      </UL>
+    <LI><A HREF="#toc46">Summary of GF language features</A>
+      <UL>
+      <LI><A HREF="#toc47">Modules</A>
+      <LI><A HREF="#toc48">Judgements</A>
+      <LI><A HREF="#toc49">Free variation</A>
+      <LI><A HREF="#toc50">The context-free grammar format</A>
+      <LI><A HREF="#toc51">Character encoding</A>
+      </UL>
+    </UL>
+  <LI><A HREF="#toc52">Grammars with parameters</A>
+    <UL>
+    <LI><A HREF="#toc53">The problem: words have to be inflected</A>
+    <LI><A HREF="#toc54">Parameters and tables</A>
+    <LI><A HREF="#toc55">Inflection tables and paradigms</A>
+    <LI><A HREF="#toc56">Using parameters in concrete syntax</A>
+      <UL>
+      <LI><A HREF="#toc57">Agreement</A>
+      <LI><A HREF="#toc58">Determiners</A>
+      <LI><A HREF="#toc59">Parametric vs. inherent features</A>
+      </UL>
+    <LI><A HREF="#toc60">An English concrete syntax for Foods with parameters</A>
+    <LI><A HREF="#toc61">More on inflection paradigms</A>
+      <UL>
+      <LI><A HREF="#toc62">Worst-case functions</A>
+      <LI><A HREF="#toc63">Intelligent paradigms</A>
+      <LI><A HREF="#toc64">Function types with variables</A>
+      <LI><A HREF="#toc65">Separating operation types and definitions</A>
+      <LI><A HREF="#toc66">Overloading of operations</A>
+      <LI><A HREF="#toc67">Morphological analysis and morphology quiz</A>
+      </UL>
+    <LI><A HREF="#toc68">The Italian Foods grammar</A>
+    <LI><A HREF="#toc69">Discontinuous constituents</A>
+    <LI><A HREF="#toc70">Strings at compile time vs. run time</A>
+    <LI><A HREF="#toc71">Summary of GF language features</A>
+      <UL>
+      <LI><A HREF="#toc72">Parameter and table types</A>
+      <LI><A HREF="#toc73">Pattern matching</A>
+      <LI><A HREF="#toc74">Overloading</A>
+      <LI><A HREF="#toc75">Local definitions</A>
+      <LI><A HREF="#toc76">Supplementary constructs</A>
+      </UL>
+    </UL>
+  <LI><A HREF="#toc77">Using the resource grammar library</A>
+    <UL>
+    <LI><A HREF="#toc78">The coverage of the library</A>
+    <LI><A HREF="#toc79">The structure of the library</A>
+      <UL>
+      <LI><A HREF="#toc80">Lexical vs. phrasal rules</A>
+      <LI><A HREF="#toc81">Lexical categories</A>
+      <LI><A HREF="#toc82">Lexical rules</A>
+      <LI><A HREF="#toc83">Phrasal categories</A>
+      </UL>
+    <LI><A HREF="#toc84">The resource API</A>
+    <LI><A HREF="#toc85">Example: English</A>
+    <LI><A HREF="#toc86">Functor implementation of multilingual grammars</A>
+    <LI><A HREF="#toc87">Interfaces and instances</A>
+    <LI><A HREF="#toc88">Adding languages to a functor implementation</A>
+    <LI><A HREF="#toc89">Division of labour revisited</A>
+    <LI><A HREF="#toc90">Restricted inheritance</A>
+    <LI><A HREF="#toc91">Grammar reuse</A>
+    <LI><A HREF="#toc92">Browsing the resource with GF commands</A>
+    <LI><A HREF="#toc93">An extended Foods grammar</A>
+      <UL>
+      <LI><A HREF="#toc94">Abstract syntax</A>
+      <LI><A HREF="#toc95">Linearization types</A>
+      <LI><A HREF="#toc96">Linearization rules</A>
+      </UL>
+    <LI><A HREF="#toc97">Tenses</A>
+    <LI><A HREF="#toc98">Summary of GF language features</A>
+      <UL>
+      <LI><A HREF="#toc99">Interfaces and instances</A>
+      <LI><A HREF="#toc100">Grammar reuse</A>
+      <LI><A HREF="#toc101">Functors</A>
+      <LI><A HREF="#toc102">Restricted inheritance</A>
+      </UL>
+    </UL>
+  <LI><A HREF="#toc103">Refining semantics in abstract syntax</A>
+    <UL>
+    <LI><A HREF="#toc104">GF as a logical framework</A>
+    <LI><A HREF="#toc105">Dependent types</A>
+    <LI><A HREF="#toc106">Polymorphism</A>
+      <UL>
+      <LI><A HREF="#toc107">Digression: dependent types in concrete syntax</A>
+      </UL>
+    <LI><A HREF="#toc108">Proof objects</A>
+      <UL>
+      <LI><A HREF="#toc109">Proof-carrying documents</A>
+      </UL>
+    <LI><A HREF="#toc110">Restricted polymorphism</A>
+    <LI><A HREF="#toc111">Variable bindings</A>
+    <LI><A HREF="#toc112">Semantic definitions</A>
+    <LI><A HREF="#toc113">Summary of GF language features</A>
+      <UL>
+      <LI><A HREF="#toc114">Judgements</A>
+      <LI><A HREF="#toc115">Dependent function types</A>
+      </UL>
+    </UL>
+  <LI><A HREF="#toc116">Grammars of formal languages</A>
+    <UL>
+    <LI><A HREF="#toc117">Arithmetic expressions</A>
+      <UL>
+      <LI><A HREF="#toc118">Abstract syntax</A>
+      <LI><A HREF="#toc119">Concrete syntax: a simple approach</A>
+      </UL>
+    <LI><A HREF="#toc120">Lexing and unlexing</A>
+    <LI><A HREF="#toc121">Precedence and fixity</A>
+    <LI><A HREF="#toc122">Code generation as linearization</A>
+    <LI><A HREF="#toc123">Speaking aloud arithmetic expressions</A>
+    <LI><A HREF="#toc124">Programs with variables</A>
+      <UL>
+      <LI><A HREF="#toc125">The concrete syntax of assignments</A>
+      <LI><A HREF="#toc126">A liberal syntax of variables</A>
+      </UL>
+    <LI><A HREF="#toc127">Conclusion</A>
+    <LI><A HREF="#toc128">Summary of GF language constructs</A>
+      <UL>
+      <LI><A HREF="#toc129">Lexers and unlexers</A>
+      <LI><A HREF="#toc130">Built-in abstract syntax types</A>
+      </UL>
+    </UL>
+  <LI><A HREF="#toc131">Embedded grammars</A>
+    <UL>
+    <LI><A HREF="#toc132">The portable grammar format</A>
+    <LI><A HREF="#toc133">The embedded interpreter and its API</A>
+    <LI><A HREF="#toc134">Embedded GF applications in Haskell</A>
+      <UL>
+      <LI><A HREF="#toc135">The EmbedAPI module</A>
+      <LI><A HREF="#toc136">First application: a translator</A>
+      <LI><A HREF="#toc137">A looping translator</A>
+      <LI><A HREF="#toc138">A question-answer system</A>
+      <LI><A HREF="#toc139">Exporting GF datatypes</A>
+      <LI><A HREF="#toc140">Putting it all together</A>
+      </UL>
+    <LI><A HREF="#toc141">Embedded GF applications in Java</A>
+      <UL>
+      <LI><A HREF="#toc142">Translets</A>
+      <LI><A HREF="#toc143">Dialogue systems</A>
+      </UL>
+    <LI><A HREF="#toc144">Language models for speech recognition</A>
+    <LI><A HREF="#toc145">Dependent types and spoken language models</A>
+      <UL>
+      <LI><A HREF="#toc146">Statistical language models</A>
+      </UL>
+    </UL>
+  </UL>
+
+<P></P>
+<HR NOSHADE SIZE=1>
+<P></P>
+<P>
+<h2>Overview</h2>
+</P>
+<P>
+This tutorial gives a hands-on introduction to grammar writing in GF.
+It has been written for all programmers 
+who want to learn to write grammars in GF. 
+It will go through the programming concepts of GF, and also
+explain, without presupposing them, the main ingredients of GF:
+linguistics, functional programming, and type theory.
+This knowledge will be introduced as a part of grammar writing
+practice.
+Thus the tutorial should be accessible to anyone who has some 
+previous experience from any programming language; the basics
+of using computers are also presupposed, e.g. the use of 
+text editors and the management of files.
+</P>
+<P>
+We start in <a href="#chaptwo">the second chapter</a> 
+by building a "Hello World" grammar, which covers greetings
+in three languages: English (<I>hello world</I>), 
+Finnish (<I>terve maailma</I>), and Italian (<I>ciao mondo</I>).
+This <B>multilingual grammar</B> is based on the most central idea of GF: 
+the distinction between <B>abstract syntax</B>
+(the logical structure) and <B>concrete syntax</B> (the
+sequence of words).
+</P>
+<P>
+From the "Hello World" example, we proceed 
+in <a href="#chapthree">the third chapter</a>
+to a larger grammar for the domain of food.
+In this grammar, you can say things like
+<center>
+<I>this Italian cheese is delicious</I>
+</center>
+in English and Italian. This grammar illustrates how translation is
+more than just replacement of words. For instance, the order of 
+words may have to be changed:
+<center>
+<I>Italian cheese</I> 
+</P>
+<P>
+<I>formaggio italiano</I>
+</center>
+Moreover, words can have different forms, and which forms
+they have vary from language to language. For instance,
+Italian adjectives usually have four forms where English
+has just one:
+<center>
+<I>delicious</I> (<I>wine, wines, pizza, pizzas</I>)
+</P>
+<P>
+<I>vino delizioso, vini deliziosi, pizza deliziosa, pizze deliziose</I>
+</center>
+The <B>morphology</B> of a language describes the
+forms of its words, and the basics of implementing morphology and
+integrating it with syntax are covered in <a href="#chaptwo">the fourth chapter</a>.
+</P>
+<P>
+The complete description of morphology and syntax in natural 
+languages is in GF preferably left to the <B>resource grammar library</B>.
+Its use is therefore an important part of GF programming, and
+it is covered in <a href="#chapfive">the fifth chapter</a>. How to contribute to resource 
+grammars as an author will only be covered in Part III;
+however, the tutorial does go through all the
+programming concepts of GF, including those involved in 
+resource grammars.
+</P>
+<P>
+In addition to multilinguality, <B>semantics</B> is an important aspect of GF
+grammars. The "purely linguistic" aspects (morphology and syntax) belong to 
+the concrete syntax part of GF, whereas semantics is expressed in the abstract
+syntax. After the presentation of concrete syntax constructs, we proceed
+in <a href="#chapsix">the sixth chapter</a> to the enrichment of abstract syntax with <B>dependent types</B>,
+<B>variable bindings</B>, and <B>semantic definitions</B>.
+<a href="#chapseven">the seventh chapter</a> concludes the tutorial by technical tips for implementing formal
+languages. It will also illustrate the close relation between GF grammars
+and compilers by actually implementing a small compiler from C-like statements
+and expressions to machine code similar to Java Virtual Machine.
+</P>
+<P>
+English and Italian are used as example languages in many grammars.
+Of course, we will not presuppose that the reader knows any Italian.
+We have chosen Italian because it has a rich structure 
+that illustrates very well the capacities of GF. 
+Moreover, even those readers who don't know Italian, will find many of
+its words familiar, due to the Latin heritage. 
+The exercises will encourage the reader to
+port the examples to other languages as well; in particular,
+it should be instructive for the reader to look at her 
+own native language from the point of view of writing a grammar
+implementation.
+</P>
+<P>
+To learn how to write GF grammars is not the only goal of
+this tutorial. We will also explain the most important 
+commands of the GF system, mostly in passing. With these commands,
+simple application programs such as translation and
+quiz systems, can be built simply by writing scripts for the
+GF system. More complicated applications, such as natural-language
+interfaces and dialogue systems, moreover require programming in
+some general-purpose language; such applications are covered in <a href="#chapeight">the eighth chapter</a>.
+</P>
+<A NAME="toc1"></A>
+<H1>Getting started with GF</H1>
+<P>
+<a name="chaptwo"></a>
+</P>
+<P>
+In this chapter, we will introduce the GF system and write the first GF grammar,
+a "Hello World" grammar. While extremely small, this grammar already illustrates
+how GF can be used for the tasks of translation and multilingual
+generation.
+</P>
+<A NAME="toc2"></A>
+<H2>What GF is</H2>
+<P>
+We use the term GF for three different things:
+</P>
+<UL>
+<LI>a <B>system</B> (computer program) used for working with grammars
+<LI>a <B>programming language</B> in which grammars can be written
+<LI>a <B>theory</B> about grammars and languages
+</UL>
+
+<P>
+The relation between these things is obvious: the GF system is an implementation
+of the GF programming language, which in turn is built on the ideas of the
+GF theory. The main focus of this book is on the GF programming language.
+We learn how grammars are written in this language. At the same time, we learn
+the way of thinking in the GF theory. To make this all useful and fun, and
+to encourage experimenting, we make the grammars run on a computer by 
+using the GF system. 
+</P>
+<P>
+A GF program is called a <B>grammar</B>. A grammar is, traditionally, a 
+definition of a language. From this definition, different language 
+processing components can be derived:
+</P>
+<UL>
+<LI><B>parsing</B>: to analyse the language
+<LI><B>linearization</B>: to generate the language
+<LI><B>translation</B>: to analyse one language and generate another
+</UL>
+
+<P>
+A GF grammar is thus a declarative program from which these
+procedures can be automatically derived. In general, a GF grammar 
+is <B>multilingual</B>: it defines many languages and translations between them.
+</P>
+<A NAME="toc3"></A>
+<H2>Getting the GF system</H2>
+<P>
+The GF system is open-source free software, which can be downloaded via the
+GF Homepage:
+<center>
+<CODE>gf.digitalgrammars.com</CODE>
+</center>
+There you can download
+</P>
+<UL>
+<LI>binaries for Linux, Mac OS X, and Windows
+<LI>source code and documentation
+<LI>grammar libraries and examples
+</UL>
+
+<P>
+In particular, many of the examples in this book are included in the
+subdirectory <CODE>examples/tutorial</CODE> of the source distribution package.
+This directory is also available 
+<A HREF="http://digitalgrammars.com/gf/examples/tutorial">online</A>.
+</P>
+<P>
+If you want to compile GF from source, you need a Haskell compiler.
+To compile the interactive editor, you also need a Java compilers. 
+But normally you don't have to compile anything yourself, and you definitely
+don't need to know Haskell or Java to use GF.
+</P>
+<P>
+We are assuming the availability of a Unix shell. Linux and Mac OS X users
+have it automatically, the latter under the name "terminal".
+Windows users are recommended to install Cywgin, the free Unix shell for Windows.
+</P>
+<A NAME="toc4"></A>
+<H2>Running the GF system</H2>
+<P>
+To start the GF system, assuming you have installed it, just type 
+<CODE>gf</CODE> in the Unix (or Cygwin) shell:
+</P>
+<PRE>
+    % gf
+</PRE>
+<P>
+You will see GF's welcome message and the prompt <CODE>&gt;</CODE>.
+The command
+</P>
+<PRE>
+    &gt; help
+</PRE>
+<P>
+will give you a list of available commands.
+</P>
+<P>
+As a common convention in this book, we will use
+</P>
+<UL>
+<LI><CODE>%</CODE> as a prompt that marks system commands
+<LI><CODE>&gt;</CODE> as a prompt that marks GF commands
+</UL>
+
+<P>
+Thus you should not type these prompts, but only the characters that
+follow them.
+</P>
+<A NAME="toc5"></A>
+<H2>A "Hello World" grammar</H2>
+<P>
+The tradition in programming language tutorials is to start with a
+program that prints "Hello World" on the terminal. GF should be no 
+exception. But our program has features that distinguish it from
+most "Hello World" programs:
+</P>
+<UL>
+<LI><B>Multilinguality</B>: the message is printed in many languages.
+<LI><B>Reversibility</B>: in addition to printing, you can <B>parse</B> the
+  message and <B>translate</B> it to other languages.
+</UL>
+
+<A NAME="toc6"></A>
+<H3>The program: abstract syntax and concrete syntaxes</H3>
+<P>
+A GF program, in general, is a <B>multilingual grammar</B>. Its main parts
+are
+</P>
+<UL>
+<LI>an <B>abstract syntax</B>
+<LI>one or more <B>concrete syntaxes</B>
+</UL>
+
+<P>
+The abstract syntax defines, in a language-independent way, what <B>meanings</B>
+can be expressed in the grammar. In the "Hello World" grammar we want
+to express <I>Greetings</I>, where we greet a <I>Recipient</I>, which can be 
+<I>World</I> or <I>Mum</I> or <I>Friends</I>. Here is the entire 
+GF code for the abstract syntax:
+</P>
+<PRE>
+    -- a "Hello World" grammar
+    abstract Hello = {
+  
+      flags startcat = Greeting ;
+  
+      cat Greeting ; Recipient ;
+  
+      fun 
+        Hello : Recipient -&gt; Greeting ;
+        World, Mum, Friends : Recipient ;
+    }
+</PRE>
+<P>
+The code has the following parts:
+</P>
+<UL>
+<LI>a <B>comment</B> (optional), saying what the module is doing 
+<LI>a <B>module header</B> indicating that it is an abstract syntax
+  module named <CODE>Hello</CODE>
+<LI>a <B>module body</B> in braces, consisting of
+  <UL>
+  <LI>a <B>startcat flag declaration</B> stating that <CODE>Greeting</CODE> is the
+    main category, i.e. the one in which parsing and generation are
+    performed by default
+  <LI><B>category declarations</B> stating that <CODE>Greeting</CODE> and <CODE>Recipient</CODE>
+    are categories, i.e. types of meanings
+  <LI><B>function declarations</B> stating what meaning-building functions there
+    are; these are the function <CODE>Hello</CODE> constructing a greeting from a recipient,
+    as well as the three possible recipients
+  </UL>
+</UL>
+
+<P>
+A concrete syntax defines a mapping from the abstract meanings to their
+expressions in a language. We first give an English concrete syntax: 
+</P>
+<PRE>
+    concrete HelloEng of Hello = {
+  
+      lincat Greeting, Recipient = {s : Str} ;
+  
+      lin 
+        Hello recip = {s = "hello" ++ recip.s} ;
+        World = {s = "world"} ;
+        Mum = {s = "mum"} ;
+        Friends = {s = "friends"} ;
+    }
+</PRE>
+<P>
+The major parts of this code are:
+</P>
+<UL>
+<LI>a module header indicating that it is a concrete syntax of the abstract syntax
+  <CODE>Hello</CODE>, itself named <CODE>HelloEng</CODE>
+<LI>a module body in curly brackets, consisting of
+  <UL>
+  <LI><B>linearization type definitions</B> stating that 
+    <CODE>Greeting</CODE> and <CODE>Recipient</CODE> are <B>records</B> with a <B>string</B> <CODE>s</CODE>
+  <LI><B>linearization definitions</B> telling what records are assigned to
+    each of the meanings defined in the abstract syntax; the recipients are
+    linearized to records containing single words, whereas the <CODE>Hello</CODE> greeting
+    has a function telling that the word <CODE>hello</CODE> is prefixed to the string
+    <CODE>s</CODE> contained in the record <CODE>recip</CODE>
+  </UL>
+</UL>
+
+<P>
+To make the grammar truly multilingual, we add a Finnish and an Italian concrete
+syntax:
+</P>
+<PRE>
+    concrete HelloFin of Hello = {
+      lincat Greeting, Recipient = {s : Str} ;
+      lin 
+        Hello recip = {s = "terve" ++ recip.s} ;
+        World = {s = "maailma"} ;
+        Mum = {s = "�iti"} ;
+        Friends = {s = "yst�v�t"} ;
+    }
+  
+    concrete HelloIta of Hello = {
+      lincat Greeting, Recipient = {s : Str} ;
+      lin 
+        Hello recip = {s = "ciao" ++ recip.s} ;
+        World = {s = "mondo"} ;
+        Mum = {s = "mamma"} ;
+        Friends = {s = "amici"} ;
+    }
+</PRE>
+<P>
+Now we have a trilingual grammar usable for translation and
+many other tasks, which we will now start experimenting with.
+</P>
+<A NAME="toc7"></A>
+<H3>Using the grammar in the GF system</H3>
+<P>
+In order to compile the grammar in GF, each of the four modules
+has to be put into a file named <I>Modulename</I><CODE>.gf</CODE>:
+</P>
+<PRE>
+    Hello.gf  HelloEng.gf  HelloFin.gf  HelloIta.gf
+</PRE>
+<P>
+The first GF command needed when using a grammar is to <B>import</B> it.
+The command has a long name, <CODE>import</CODE>, and a short name, <CODE>i</CODE>.
+When you have started GF (by the shell command <CODE>gf</CODE>), you can thus type either
+</P>
+<PRE>
+    &gt; import HelloEng.gf
+</PRE>
+<P>
+or
+</P>
+<PRE>
+    &gt; i HelloEng.gf
+</PRE>
+<P>
+to get the same effect. In general, all GF commands have a long and a short name;
+short names are convenient when typing commands by hand, whereas long command
+names are more readable in scripts, i.e. files that include sequences of commands.
+</P>
+<P>
+The effect of <CODE>import</CODE> is that the GF system <B>compiles</B> your grammar 
+into an internal representation, and shows a new prompt when it is ready. 
+It will also show how much CPU time was consumed:
+</P>
+<PRE>
+    &gt; i HelloEng.gf
+    - compiling Hello.gf...   wrote file Hello.gfc 8 msec
+    - compiling HelloEng.gf...   wrote file HelloEng.gfc 12 msec
+  
+    12 msec
+    &gt;
+</PRE>
+<P>
+You can now use GF for <B>parsing</B>:
+</P>
+<PRE>
+    &gt; parse "hello world"
+    Hello World
+</PRE>
+<P>
+The <CODE>parse</CODE> (= <CODE>p</CODE>) command takes a <B>string</B>
+(in double quotes) and returns an <B>abstract syntax tree</B> --- the meaning
+of the string as defined in the abstract syntax.
+A tree is, in general, something easier than a string 
+for a machine to understand and to process further, although this
+is not so obvious in this simple grammar. The syntax for trees is that
+of <B>function application</B>, which in GF is written
+</P>
+<PRE>
+    function argument1 ... argumentn
+</PRE>
+<P>
+Parentheses are only needed for grouping. For instance, <CODE>f (a b)</CODE> is
+<CODE>f</CODE> applied to the application of <CODE>a</CODE> to <CODE>b</CODE>. This is different
+from <CODE>f a b</CODE>, which is <CODE>f</CODE> applied to <CODE>a</CODE> and <CODE>b</CODE>.
+</P>
+<P>
+Strings that return a tree when parsed do so in virtue of the grammar
+you imported. Try to parse something that is not in grammar, and you will fail
+</P>
+<PRE>
+    &gt; parse "hello dad"
+    Unknown words: dad
+  
+    &gt; parse "world hello"
+    no tree found
+</PRE>
+<P>
+In the first example, the failure is caused by an unknown word. 
+In the second example, the combination of words is ungrammatical.
+</P>
+<P>
+In addition to parsing, you can also use GF for <B>linearization</B>
+(<CODE>linearize = l</CODE>). This is the inverse of
+parsing, taking trees into strings:
+</P>
+<PRE>
+    &gt; linearize Hello World
+    hello world
+</PRE>
+<P>
+What is the use of this? Typically not that you type in a tree at
+the GF prompt. The utility of linearization comes from the fact that
+you can obtain a tree from somewhere else --- for instance, from
+a parser. A prime example of this is <B>translation</B>: you parse
+with one concrete syntax and linearize with another. Let us
+now do this by first importing the Italian grammar:
+</P>
+<PRE>
+    &gt; import HelloIta.gf
+</PRE>
+<P>
+We can now parse with <CODE>HelloEng</CODE> and <B>pipe</B> the result
+into linearizing with <CODE>HelloIta</CODE>:
+</P>
+<PRE>
+    &gt; parse -lang=HelloEng "hello mum" | linearize -lang=HelloIta
+    ciao mamma
+</PRE>
+<P>
+Notice that, since there are now two concrete syntaxes read into the
+system, the commands use a <B>language flag</B> to indicate
+which concrete syntax is used in each operation. If no language flag is
+given, the last-imported language is applied.
+</P>
+<P>
+To conclude the translation exercise, we import the Finnish grammar
+and pipe English parsing into <B>multilingual generation</B>:
+</P>
+<PRE>
+    &gt; parse -lang=HelloEng "hello friends" | linearize -multi
+    terve yst�v�t
+    ciao amici
+    hello friends
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Test the parsing and translation examples shown above, as well as
+some other examples, in different combinations of languages.
+</P>
+<P>
+<B>Exercise</B>. Extend the grammar <CODE>Hello.gf</CODE> and some of the
+concrete syntaxes by five new recipients and one new greeting
+form.
+</P>
+<P>
+<B>Exercise</B>. Add a concrete syntax for some other
+languages you might know.
+</P>
+<P>
+<B>Exercise</B>. Add a pair of greetings that are expressed in one and the same way in
+one language and in two different ways in another. For instance, <I>good morning</I>
+and <I>good afternoon</I> in English are both expressed as <I>buongiorno</I> in Italian.
+Test what happens when you translate <I>buongiorno</I> to English in GF.
+</P>
+<P>
+<B>Exercise</B>. Inject errors in the <CODE>Hello</CODE> grammars, for example, leave out
+some line, omit a variable in a <CODE>lin</CODE> rule, or change the name in one occurrence
+of a variable. Inspect the error messages generated by GF.
+</P>
+<A NAME="toc8"></A>
+<H2>Using grammars from outside GF</H2>
+<P>
+A normal "hello world" program written in C is executable from the
+Unix shell and print its output on the terminal. This is possible in GF
+as well, by using the <CODE>gf</CODE> program in a Unix pipe. Invoking <CODE>gf</CODE>
+can be made with grammar names as arguments,
+</P>
+<PRE>
+    % gf HelloEng.gf HelloFin.gf HelloIta.gf
+</PRE>
+<P>
+which has the same effect as opening <CODE>gf</CODE> and then importing the
+grammars. A command can be send to this <CODE>gf</CODE> state by piping it from
+Unix's <CODE>echo</CODE> command:
+</P>
+<PRE>
+    % echo "l -multi Hello Wordl" | gf HelloEng.gf HelloFin.gf HelloIta.gf
+</PRE>
+<P>
+which will execute the command and then quit. Alternatively, one can write
+a <B>script</B>, a file containing the lines
+</P>
+<PRE>
+    import HelloEng.gf
+    import HelloFin.gf
+    import HelloIta.gf
+    linearize -multi Hello World
+</PRE>
+<P>
+If we name this script <CODE>hello.gfs</CODE>, we can do
+</P>
+<PRE>
+    $ gf -batch -s &lt;hello.gfs s
+  
+    ciao mondo
+    terve maailma
+    hello world
+</PRE>
+<P>
+The options <CODE>-batch</CODE> and <CODE>-s</CODE> ("silent") remove prompts, CPU time,
+and other messages. Writing GF scripts and Unix shell scripts that call 
+GF is the simplest way to build application programs that use GF grammars.
+In <a href="#chapeight">the eighth chapter</a>, we will see how to build stand-alone programs that don't need
+the GF system to run.
+</P>
+<P>
+<B>Exercise</B>. (For Unix hackers.) Write a GF application that reads
+an English string from the standard input and writes an Italian
+translation to the output.
+</P>
+<A NAME="toc9"></A>
+<H2>What else can be done with the grammar</H2>
+<P>
+Now we have built our first multilingual grammar and seen the basic
+functionalities of GF: parsing and linearization. We have tested
+these functionalities inside the GF program. In the forthcoming
+chapters, we will build larger grammars and can then get more out of
+these functionalities. But we will also introduce new ones:
+</P>
+<UL>
+<LI><B>morphological analysis</B>: find out the possible inflection forms of words
+<LI><B>morphological synthesis</B>: generate all inflection forms of words
+<LI><B>random generation</B>: generate random expressions
+<LI><B>corpus generation</B>: generate all expressions
+<LI><B>treebank generation</B>: generate a list of trees with their linearizations
+<LI><B>teaching quizzes</B>: train morphology and translation
+<LI><B>multilingual authoring</B>: create a document in many languages simultaneously
+<LI><B>speech input</B>: optimize a speech recognition system for a grammar
+</UL>
+
+<P>
+The usefulness of GF would be quite limited if grammars were
+usable only inside the GF system. In <a href="#chapeight">the eighth chapter</a>,
+we will see other ways of using grammars:
+</P>
+<UL>
+<LI>compile them to new formats, such as speech recognition grammars
+<LI>embed them in Java and Haskell programs
+<LI>build applications using compilation and embedding:
+  <UL>
+  <LI>voice commands
+  <LI>spoken language translators
+  <LI>dialogue systems
+  <LI>user interfaces
+  <LI>localization: parametrize the messages printed by a program
+    to support different languages
+  </UL>
+</UL>
+
+<P>
+All GF functionalities, both those inside the GF program and those 
+ported to other environments, 
+are of course already applicable to the simplest of grammars,
+such as the <CODE>Hello</CODE> grammars presented above. But the main focus
+of this tutorial will be on grammar writing. Thus we will show
+how larger and more expressive grammars can be built by using 
+the constructs of the GF programming language, before entering the
+applications.
+</P>
+<A NAME="toc10"></A>
+<H2>Summary of GF language features</H2>
+<P>
+As the last section of each chapter, we will give a summary of the GF language
+features covered in the chapter. The presentation is rather technical and intended
+as a reference for later use, rather than to be read at once. The summaries
+may cover some new features, which complement the discussion in the main chapter.
+</P>
+<A NAME="toc11"></A>
+<H3>Modules</H3>
+<P>
+A GF grammar consists of <B>modules</B>, 
+into which judgements are grouped. The most important
+module forms are
+</P>
+<UL>
+<LI><CODE>abstract</CODE> A <CODE>= {...}</CODE> , abstract syntax A with judgements in
+  the <B>module body</B> <CODE>{...}</CODE>.
+<LI><CODE>concrete</CODE> C <CODE>of</CODE> A <CODE>=  {...}</CODE>, concrete syntax C of the
+       abstract syntax A, with judgements in the module body <CODE>{...}</CODE>.
+</UL>
+
+<P>
+Each module is written in a file named <I>Modulename</I><CODE>.gf</CODE>.
+</P>
+<A NAME="toc12"></A>
+<H3>Judgements</H3>
+<P>
+<a name="secjment"></a>
+</P>
+<P>
+Rules in a module body are called <B>judgements</B>. Keywords such as
+<CODE>fun</CODE> and <CODE>lin</CODE> are used for distinguishing between
+<B>judgement forms</B>. Here is a summary of the most important
+judgement forms, which we have considered by now:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>form</TH>
+<TH>reading</TH>
+<TH COLSPAN="2">module type</TH>
+</TR>
+<TR>
+<TD><CODE>cat</CODE> <I>C</I></TD>
+<TD><I>C</I> is a category</TD>
+<TD>abstract</TD>
+</TR>
+<TR>
+<TD><CODE>fun</CODE> <I>f</I> <CODE>:</CODE> <I>A</I></TD>
+<TD><I>f</I> is a function of type <I>A</I></TD>
+<TD>abstract</TD>
+</TR>
+<TR>
+<TD><CODE>lincat</CODE> <I>C</I> <CODE>=</CODE> <I>T</I></TD>
+<TD>category <I>C</I> has linearization type <I>T</I></TD>
+<TD>concrete</TD>
+</TR>
+<TR>
+<TD><CODE>lin</CODE> <I>f <i>x</i><sub>1</sub> ... <i>x</i><sub>n</sub></I> <CODE>=</CODE> <I>t</I></TD>
+<TD>function <I>f</I> has linearization <I>t</I></TD>
+<TD>concrete</TD>
+</TR>
+<TR>
+<TD><CODE>flags</CODE> <I>p</I> <CODE>=</CODE> <I>v</I></TD>
+<TD>flag <I>p</I> has value <I>v</I></TD>
+<TD>any</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+Both abstract and concrete modules may moreover contain <B>comments</B> of the forms
+</P>
+<UL>
+<LI><CODE>--</CODE> <I>anything until a newline</I>
+<LI><CODE>{-</CODE> <I>anything except hyphen followed by closing brace</I> <CODE>-}</CODE>
+</UL>
+
+<P>
+Judgements are terminated by semicolons. Shorthands permit the sharing of
+the keyword in subsequent judgements, 
+</P>
+<PRE>
+    cat C ; D ; ===   cat C ; cat D ; 
+</PRE>
+<P>
+and of the right-hand-side in subsequent judgements of the same form
+</P>
+<PRE>
+    fun f, g : A ;  ===  fun f : A ; g : A ; 
+</PRE>
+<P>
+We will use the symbol <CODE>===</CODE> to indicate <B>syntactic sugar</B> when
+speaking about GF. Thus it is not a symbol of the GF language.
+</P>
+<P>
+Each judgement declares a <B>name</B>, which is an <B>identifier</B>. 
+An identifier is a letter followed by a sequence of letters, digits, and
+characters <CODE>'</CODE> or <CODE>_</CODE>. Each identifier can only be
+defined once in the same module (that is, as next to the judgement keyword;
+local variables such as those in <CODE>lin</CODE> judgemenrs can be
+reused in other judgements).
+</P>
+<P>
+Names are in <B>scope</B> in the rest of the module, i.e. usable in the other 
+judgements of the module (subject to type restrictions, of course). Also
+the name of the module is an identifier in scope.
+</P>
+<P>
+The order of judgements in a module is free. In particular, an identifier
+need not be declared before it is used. 
+</P>
+<A NAME="toc13"></A>
+<H3>Types and terms</H3>
+<P>
+A <B>type</B> in an abstract syntax are either a <B>basic type</B>,
+i.e. one introduced in a <CODE>cat</CODE> judgement, or a
+<B>function type</B> of the form
+</P>
+<PRE>
+    A1 -&gt; ... -&gt; An -&gt; A
+</PRE>
+<P>
+where each of <CODE>A1, ..., An, A</CODE> is a basic type. 
+The last type in the arrow-separated sequence 
+is the <B>value type</B> of the function type, and the earlier types are 
+its <B>argument types</B>.
+</P>
+<P>
+In a concrete syntax, the available types include
+</P>
+<UL>
+<LI>the type of <B>token lists</B>, <CODE>Str</CODE>
+<LI><B>record types</B> of form <CODE>{</CODE> r1 : T1 ; ... ; rn : Tn <CODE>}</CODE>
+</UL>
+
+<P>
+Token lists are often briefly called <B>strings</B>.
+</P>
+<P>
+Each semi-colon separated part in a record type is called a
+<B>field</B>. The identifier introduced by the left-hand-side of a field
+is called a <B>label</B>.
+</P>
+<P>
+A <B>term</B> in abstract syntax is a <B>function application</B> of  form
+</P>
+<PRE>
+    f a1 ... an
+</PRE>
+<P>
+where <CODE>f</CODE> is a function declared in a <CODE>fun</CODE> judgement and <CODE>a1 ... an</CODE>
+are terms. These terms are also called <B>abstract syntax trees</B>, or just 
+<B>trees</B>. 
+The tree above is well-typed and has the type A, if
+</P>
+<PRE>
+    f : A1 -&gt; ... -&gt; An -&gt; A
+</PRE>
+<P>
+and each <CODE>ai</CODE> has type <CODE>an</CODE>.
+</P>
+<P>
+A term used in concrete syntax has one the forms
+</P>
+<UL>
+<LI>quoted string: <CODE>"foo"</CODE>, of type <CODE>Str</CODE>
+<LI>concatenation of strings: <CODE>"foo" ++ "bar"</CODE>,
+<LI>record: <CODE>{</CODE> r1 = t1 ; ... ; rn = Tn <CODE>}</CODE>, 
+  of type <CODE>{</CODE> r1 : R1 ; ... ; rn : Rn <CODE>}</CODE>
+<LI>projection <CODE>t.r</CODE> of a term <CODE>t</CODE> that has a record type,
+  with the record label <CODE>r</CODE>; the projection has the corresponding record
+  field type
+<LI>argument variable <CODE>x</CODE> bound by the left-hand-side of a <CODE>lin</CODE> rule,
+  of the corresponding linearization type
+</UL>
+
+<P>
+Each quoted string is treated as one <B>token</B>, and strings concatenated by
+<CODE>++</CODE> are treated as separate tokens. Tokens are, by default, written with
+a space in between. This behaviour can be changed by <CODE>lexer</CODE> and <CODE>unlexer</CODE>
+flags, as will be explained later "Rseclexing. Therefore it is usually 
+not correct to have a space in a token. Writing
+</P>
+<PRE>
+    "hello world"
+</PRE>
+<P>
+in a grammar would give the parser the task to find a token with a space
+in it, rather than two tokens <CODE>"hello"</CODE> and <CODE>"world"</CODE>. If the latter is
+what is meant, it is possible to use the shorthand
+</P>
+<PRE>
+    ["hello world"]  ===  "hello" ++ "world"
+</PRE>
+<P>
+The <B>empty string</B> is denoted by <CODE>[]</CODE> or, equivalently, <CODE>`` or ``[]</CODE>.
+</P>
+<A NAME="toc14"></A>
+<H3>Type checking</H3>
+<P>
+An important functionality of the GF system is <B>static type checking</B>.
+This means that the grammars are controlled to be well-formed, so that all
+run-time errors are eliminated. The main type checking principles are the
+following:
+</P>
+<UL>
+<LI>a concrete syntax must define the <CODE>lincat</CODE> of each <CODE>cat</CODE> and a <CODE>lin</CODE>
+  for each <CODE>fun</CODE> in the abstract syntax that it is "<CODE>of</CODE>"
+<LI><CODE>lin</CODE> rules are type checked with respect to the <CODE>lincat</CODE> and <CODE>fun</CODE>
+  rules
+<LI>terms have types as defined in the previous section
+</UL>
+
+<A NAME="toc15"></A>
+<H1>Designing a grammar for complex phrases</H1>
+<P>
+<a name="chapthree"></a>
+</P>
+<P>
+In this chapter, we will write a grammar that has much more structure than
+the <CODE>Hello</CODE> grammar. We will look at how the abstract syntax
+is divided into suitable categories, and how infinitely many
+phrases can be generated by using recursive rules. We will also
+introduce modularity by showing how a grammar can be 
+divided into modules, and how functional programming
+can be used to share code in and among modules.
+</P>
+<A NAME="toc16"></A>
+<H2>The abstract syntax Food</H2>
+<P>
+We will write a grammar that
+defines a set of phrases usable for speaking about food:
+</P>
+<UL>
+<LI>the start category is <CODE>Phrase</CODE>
+<LI>a <CODE>Phrase</CODE> can be built by assigning a <CODE>Quality</CODE> to an <CODE>Item</CODE> 
+  (e.g. <I>this cheese is Italian</I>)
+<LI>an<CODE>Item</CODE> are build from a <CODE>Kind</CODE> by prefixing <I>this</I> or <I>that</I> 
+  (e.g. <I>this wine</I>)
+<LI>a <CODE>Kind</CODE> is either <B>atomic</B> (e.g. <I>cheese</I>), or formed 
+  qualifying a given <CODE>Kind</CODE> with a <CODE>Quality</CODE> (e.g. <I>Italian cheese</I>)
+<LI>a <CODE>Quality</CODE> is either atomic (e.g. <I>Italian</I>,
+  or built by modifying a given <CODE>Quality</CODE> with the word <I>very</I> (e.g. <I>very warm</I>)
+</UL>
+
+<P>
+These verbal descriptions can be expressed as the following abstract syntax:
+</P>
+<PRE>
+    abstract Food = {
+  
+      flags startcat = Phrase ;
+  
+      cat
+        Phrase ; Item ; Kind ; Quality ;
+  
+      fun
+        Is : Item -&gt; Quality -&gt; Phrase ;
+        This, That : Kind -&gt; Item ;
+        QKind : Quality -&gt; Kind -&gt; Kind ;
+        Wine, Cheese, Fish : Kind ;
+        Very : Quality -&gt; Quality ;
+        Fresh, Warm, Italian, Expensive, Delicious, Boring : Quality ;
+    }
+</PRE>
+<P>
+In this abstract syntax, we can build <CODE>Phrase</CODE>s such as
+</P>
+<PRE>
+    Is (This (QKind Delicious (QKind Italian Wine))) (Very (Very Expensive))
+</PRE>
+<P>
+In the English concrete syntax, we will want to linearize this into
+</P>
+<PRE>
+    this delicious Italian wine is very very expensive
+</PRE>
+<P></P>
+<A NAME="toc17"></A>
+<H2>The concrete syntax FoodEng</H2>
+<P>
+The English concrete syntax gives no surprises:
+</P>
+<PRE>
+    concrete FoodEng of Food = {
+  
+      lincat
+        Phrase, Item, Kind, Quality = {s : Str} ;
+  
+      lin
+        Is item quality = {s = item.s ++ "is" ++ quality.s} ;
+        This kind = {s = "this" ++ kind.s} ;
+        That kind = {s = "that" ++ kind.s} ;
+        QKind quality kind = {s = quality.s ++ kind.s} ;
+        Wine = {s = "wine"} ;
+        Cheese = {s = "cheese"} ;
+        Fish = {s = "fish"} ;
+        Very quality = {s = "very" ++ quality.s} ;
+        Fresh = {s = "fresh"} ;
+        Warm = {s = "warm"} ;
+        Italian = {s = "Italian"} ;
+        Expensive = {s = "expensive"} ;
+        Delicious = {s = "delicious"} ;
+        Boring = {s = "boring"} ;
+    }  
+</PRE>
+<P>
+Let us test how the grammar works in parsing:
+</P>
+<PRE>
+    &gt; import FoodEng.gf
+    &gt; parse "this delicious wine is very very Italian"
+    Is (This (QKind Delicious Wine)) (Very (Very Italian))
+</PRE>
+<P>
+We can also try parsing in other categories than the <CODE>startcat</CODE>,
+by setting the command-line <CODE>cat</CODE> flag:
+</P>
+<PRE>
+    p -cat=Kind "very Italian wine"
+    QKind (Very Italian) Wine
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Extend the <CODE>Food</CODE> grammar by ten new food kinds and
+qualities, and run the parser with new kinds of examples.
+</P>
+<P>
+<B>Exercise</B>. Add a rule that enables question phrases of the form 
+<I>is this cheese Italian</I>.
+</P>
+<P>
+<B>Exercise</B>. Enable the optional prefixing of 
+phrases with the words "excuse me but". Do this in such a way that
+the prefix can occur at most once.
+</P>
+<A NAME="toc18"></A>
+<H2>Commands for testing grammars</H2>
+<A NAME="toc19"></A>
+<H3>Generating trees and strings</H3>
+<P>
+When we have a grammar above a trivial size, especially a recursive
+one, we need more efficient ways of testing it than just by parsing
+sentences that happen to come to our minds. One way to do this is 
+based on automatic generation, which can be either
+<B>random generation</B> or <B>exhaustive generation</B>.
+</P>
+<P>
+Random generation (<CODE>generate_random = gr</CODE>) is an operation that 
+builds a random tree in accordance with an abstract syntax:
+</P>
+<PRE>
+    &gt; generate_random
+    Is (This (QKind Italian Fish)) Fresh
+</PRE>
+<P>
+By using a pipe, random generation can be fed into linearization:
+</P>
+<PRE>
+    &gt; generate_random | linearize
+    this Italian fish is fresh
+</PRE>
+<P>
+Random generation is a good way to test a grammar. It can also give results
+that are surprising, which shows how fast we lose intuition
+when we write complex grammars.
+</P>
+<P>
+By using the <CODE>number</CODE> flag, several trees can be generated
+in one command:
+</P>
+<PRE>
+    &gt; gr -number=10 | l
+    that wine is boring
+    that fresh cheese is fresh
+    that cheese is very boring
+    this cheese is Italian
+    that expensive cheese is expensive
+    that fish is fresh
+    that wine is very Italian
+    this wine is Italian
+    this cheese is boring
+    this fish is boring
+</PRE>
+<P>
+To generate <I>all</I> phrases that a grammar can produce,
+GF provides the command <CODE>generate_trees = gt</CODE>.
+</P>
+<PRE>
+    &gt; generate_trees | l
+    that cheese is very Italian
+    that cheese is very boring
+    that cheese is very delicious
+    that cheese is very expensive
+    that cheese is very fresh
+    ...
+    this wine is expensive
+    this wine is fresh
+    this wine is warm
+  
+</PRE>
+<P>
+We get quite a few trees but not all of them: only up to a given
+<B>depth</B> of trees. The default depth is 3; the depth can be
+set by using the <CODE>depth</CODE> flag:
+</P>
+<PRE>
+    &gt; generate_trees -depth=5 | l
+</PRE>
+<P>
+Other options to the generation commands (like all commands) can be seen
+by GF's <CODE>help = h</CODE> command:
+</P>
+<PRE>
+    &gt; help gr
+    &gt; help gt
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. If the command <CODE>gt</CODE> generated all
+trees in your grammar, it would never terminate. Why?
+</P>
+<P>
+<B>Exercise</B>. Measure how many trees the grammar gives with depths 4 and 5,
+respectively. <B>Hint</B>. You can 
+use the Unix <B>word count</B> command <CODE>wc</CODE> to count lines.
+</P>
+<A NAME="toc20"></A>
+<H3>More on pipes; tracing</H3>
+<P>
+A pipe of GF commands can have any length, but the "output type"
+(either string or tree) of one command must always match the "input type"
+of the next command, in order for the result to make sense.
+</P>
+<P>
+The intermediate results in a pipe can be observed by putting the
+<B>tracing</B> option <CODE>-tr</CODE> to each command whose output you
+want to see:
+</P>
+<PRE>
+    &gt; gr -tr | l -tr | p
+  
+    Is (This Cheese) Boring
+    this cheese is boring
+    Is (This Cheese) Boring  
+</PRE>
+<P>
+This facility is useful for test purposes: the pipe above can show
+if a grammar is <B>ambiguous</B>, i.e.
+contains strings that can be parsed in more than one way.
+</P>
+<P>
+<B>Exercise</B>. Extend the <CODE>Food</CODE> grammar so that it produces ambiguous 
+strings, and try out the ambiguity test.
+</P>
+<A NAME="toc21"></A>
+<H3>Writing and reading files</H3>
+<P>
+To save the outputs of GF commands into a file, you can
+pipe it to the <CODE>write_file = wf</CODE> command,
+</P>
+<PRE>
+    &gt; gr -number=10 | linearize | write_file exx.tmp
+</PRE>
+<P>
+You can read the file back to GF with the
+<CODE>read_file = rf</CODE> command,
+</P>
+<PRE>
+    &gt; read_file exx.tmp | parse -lines
+</PRE>
+<P>
+Notice the flag <CODE>-lines</CODE> given to the parsing
+command. This flag tells GF to parse each line of
+the file separately. Without the flag, the grammar could
+not recognize the string in the file, because it is not
+a sentence but a sequence of ten sentences.
+</P>
+<P>
+Files with examples can be used for <B>regression testing</B>
+of grammars. The most systematic way to do this is by
+generating treebanks; see <a href="#sectreebank">here</a>.
+</P>
+<A NAME="toc22"></A>
+<H3>Visualizing trees</H3>
+<P>
+The gibberish code with parentheses returned by the parser does not
+look like trees. Why is it called so? From the abstract mathematical
+point of view, trees are a data structure that
+represents <B>nesting</B>: trees are branching entities, and the branches
+are themselves trees. Parentheses give a linear representation of trees,
+useful for the computer. But the human eye may prefer to see a visualization;
+for this purpose, GF provides the command <CODE>visualize_tree = vt</CODE>, to which
+parsing (and any other tree-producing command) can be piped:
+</P>
+<PRE>
+    &gt; parse "this delicious cheese is very Italian" | visualize_tree
+</PRE>
+<P></P>
+<P>
+<IMG ALIGN="middle" SRC="mytree.png" BORDER="0" ALT="">
+</P>
+<P>
+This command uses the programs Graphviz and Ghostview, which you
+might not have, but which are freely available on the web.
+</P>
+<P>
+Alternatively, you can print the tree into a file
+e.g. a <CODE>.png</CODE> file that
+can be be viewed with e.g. an HTML browser and also included in an
+HTML document. You can do this
+by saving the file <CODE>grphtmp.dot</CODE>, which the command <CODE>vt</CODE>
+produces. Then you can process this file with the <CODE>dot</CODE> 
+program (from the Graphviz package).
+</P>
+<PRE>
+    % dot -Tpng grphtmp.dot &gt; mytree.png
+</PRE>
+<P></P>
+<A NAME="toc23"></A>
+<H3>System commands</H3>
+<P>
+If you don't have Ghostview, or want to view graphs in some other way,
+you can call <CODE>dot</CODE> and a suitable
+viewer (e.g. <CODE>open</CODE> in Mac) without leaving GF, by using
+a <B>system command</B>: <CODE>!</CODE> followed by a Unix command,
+</P>
+<PRE>
+    &gt; ! dot -Tpng grphtmp.dot &gt; mytree.png
+    &gt; ! open mytree.png
+</PRE>
+<P>
+Another form of system commands are those that receive arguments from
+GF pipes. The escape symbol
+is then <CODE>?</CODE>.
+</P>
+<PRE>
+    &gt; generate_trees | ? wc
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. (Exercise drom 3.3.1 revisited.)
+Measure how many trees the grammar <CODE>FoodEng</CODE> gives with depths 4 and 5,
+respectively. Use the Unix <B>word count</B> command <CODE>wc</CODE> to count lines, and
+a pipe from a GF command into a Unix command.
+</P>
+<A NAME="toc24"></A>
+<H2>An Italian concrete syntax</H2>
+<P>
+<a name="secanitalian"></a>
+</P>
+<P>
+We write the Italian grammar in a straightforward way, by replacing
+English words with their dictionary equivalents:
+</P>
+<PRE>
+    concrete FoodIta of Food = {
+  
+      lincat
+        Phrase, Item, Kind, Quality = {s : Str} ;
+  
+      lin
+        Is item quality = {s = item.s ++ "�" ++ quality.s} ;
+        This kind = {s = "questo" ++ kind.s} ;
+        That kind = {s = "quello" ++ kind.s} ;
+        QKind quality kind = {s = kind.s ++ quality.s} ;
+        Wine = {s = "vino"} ;
+        Cheese = {s = "formaggio"} ;
+        Fish = {s = "pesce"} ;
+        Very quality = {s = "molto" ++ quality.s} ;
+        Fresh = {s = "fresco"} ;
+        Warm = {s = "caldo"} ;
+        Italian = {s = "italiano"} ;
+        Expensive = {s = "caro"} ;
+        Delicious = {s = "delizioso"} ;
+        Boring = {s = "noioso"} ;
+    }
+</PRE>
+<P>
+An alert reader, or one who already knows Italian, may notice one point in
+which the change is more substantial than just replacement of words: the order of
+a quality and the kind it modifies in
+</P>
+<PRE>
+      QKind quality kind = {s = kind.s ++ quality.s} ;
+</PRE>
+<P>
+Thus Italian says <CODE>vino italiano</CODE> for <CODE>Italian wine</CODE>. (Some Italian adjectives
+are put before the noun. This distinction can be controlled by parameters, which
+are introduced in <a href="#chaptwo">the fourth chapter</a>.)
+</P>
+<P>
+<B>Exercise</B>. Write a concrete syntax of <CODE>Food</CODE> for some other language.
+You will probably end up with grammatically incorrect linearizations --- but don't
+worry about this yet.
+</P>
+<P>
+<B>Exercise</B>. If you have written <CODE>Food</CODE> for German, Swedish, or some
+other language, test with random or exhaustive generation what constructs
+come out incorrect, and prepare a list of those ones that cannot be helped
+with the currently available fragment of GF. You can return to your list
+after having worked out <a href="#chaptwo">the fourth chapter</a>.
+</P>
+<A NAME="toc25"></A>
+<H2>Free variation</H2>
+<P>
+Sometimes there are alternative ways to define a concrete syntax.
+For instance, if we use the <CODE>Food</CODE> grammars in a restaurant phrase
+book, we may want to accept different words for expressing the quality
+"delicious" ---- and different languages can differ in how many
+such words they have. Then we don't want to put the distinctions into
+the abstract syntax, but into concrete syntaxes. Such semantically
+neutral distinctions are known as <B>free variation</B> in linguistics.
+</P>
+<P>
+The <CODE>variants</CODE> construct of GF expresses free variation. For example,
+</P>
+<PRE>
+    lin Delicious = {s = variants {"delicious" ; "exquisit" ; "tasty"}} ;
+</PRE>
+<P>
+says that <CODE>Delicious</CODE> can be linearized to any of <I>delicious</I>,
+<I>exquisit</I>, and <I>tasty</I>. As a consequence, both these words result in the
+tree <CODE>Delicious</CODE> when parsed. By default, the <CODE>linearize</CODE> command
+shows only the first variant from each <CODE>variants</CODE> list; to see them
+all, the option <CODE>-all</CODE> can be used:
+</P>
+<PRE>
+    &gt; p "this exquisit wine is delicious" | l -all
+    this delicious wine is delicious
+    this delicious wine is exquisit
+    ...
+</PRE>
+<P>
+In linguistics, it is well known that free variation is almost
+non-existing, if all aspects of expressions are taken into account, including style.
+Therefore, free variation should not be used in grammars that are meant as
+libraries for other grammars, as in <a href="#chapfive">the fifth chapter</a>. However, in a specific
+application, free variation is an excellent way to scale up the parser to
+variations in user input that make no difference in the semantic
+treatment. 
+</P>
+<P>
+An example that clearly illustrates these points is the
+English negation. If we added to the <CODE>Food</CODE> grammar the negation
+of a quality, we could accept both contracted and uncontracted <I>not</I>:
+</P>
+<PRE>
+    fun IsNot : Item -&gt; Quality -&gt; Phrase ;
+    lin IsNot item qual = 
+      {s = item.s ++ variants {"isn't" ; ["is not"]} ++ qual.s} ;
+</PRE>
+<P>
+Both forms are likely to occur in user input. Since there is no
+corresponding contrast in Italian, we do not want to put the distinction
+in the abstract syntax. Yet there is a stylistic difference between
+these two forms. In particular, if we are doing generation rather
+than parsing, we will want to choose the one or
+the other depending on the kind of language we want to generate.
+</P>
+<P>
+A limiting case of free variation is an empty variant list
+</P>
+<PRE>
+    variants {}
+</PRE>
+<P>
+It can be used e.g. if a word lacks a certain inflection form. 
+</P>
+<P>
+Free variation works for all types in concrete syntax; all terms in
+a <CODE>variants</CODE> list must be of the same type.
+</P>
+<P>
+<B>Exercise</B>. Modify <CODE>FoodIta</CODE> in such a way that a quality can
+be assigned to an item by using two different word orders, exemplified
+by <I>questo vino � delizioso</I> and <I>� delizioso questo vino</I>
+(a real variation in Italian),
+and that it is impossible to say that something is boring
+(a rather contrived example).
+</P>
+<A NAME="toc26"></A>
+<H2>More application of multilingual grammars</H2>
+<A NAME="toc27"></A>
+<H3>Multilingual treebanks</H3>
+<P>
+<a name="sectreebank"></a>
+</P>
+<P>
+A <B>multilingual treebank</B> is a set of trees with their
+translations in different languages:
+</P>
+<PRE>
+    &gt; gr -number=2 | tree_bank
+  
+    Is (That Cheese) (Very Boring)
+    quello formaggio � molto noioso
+    that cheese is very boring
+  
+    Is (That Cheese) Fresh
+    quello formaggio � fresco
+    that cheese is fresh
+</PRE>
+<P>
+There is also an XML format for treebanks and a set of commands
+suitable for regression testing; see <CODE>help tb</CODE> for more details.
+</P>
+<A NAME="toc28"></A>
+<H3>Translation session</H3>
+<P>
+If translation is what you want to do with a set of grammars, a convenient
+way to do it is to open a <CODE>translation_session = ts</CODE>. In this session,
+you can translate between all the languages that are in scope.
+A dot <CODE>.</CODE> terminates the translation session.
+</P>
+<PRE>
+    &gt; ts
+  
+    trans&gt; that very warm cheese is boring
+    quello formaggio molto caldo � noioso
+    that very warm cheese is boring
+  
+    trans&gt; questo vino molto italiano � molto delizioso
+    questo vino molto italiano � molto delizioso
+    this very Italian wine is very delicious
+  
+    trans&gt; .
+    &gt;
+</PRE>
+<P></P>
+<A NAME="toc29"></A>
+<H3>Translation quiz</H3>
+<P>
+This is a simple language exercise that can be automatically
+generated from a multilingual grammar. The system generates a set of
+random sentences, displays them in one language, and checks the user's
+answer given in another language. The command <CODE>translation_quiz = tq</CODE>
+makes this in a subshell of GF.
+</P>
+<PRE>
+    &gt; translation_quiz FoodEng FoodIta
+  
+    Welcome to GF Translation Quiz.
+    The quiz is over when you have done at least 10 examples
+    with at least 75 % success.
+    You can interrupt the quiz by entering a line consisting of a dot ('.').
+  
+    this fish is warm
+    questo pesce � caldo
+    &gt; Yes.
+    Score 1/1
+  
+    this cheese is Italian
+    questo formaggio � noioso
+    &gt; No, not questo formaggio � noioso, but
+    questo formaggio � italiano
+  
+    Score 1/2
+    this fish is expensive
+</PRE>
+<P>
+You can also generate a list of translation exercises and save it in a
+file for later use, by the command <CODE>translation_list = tl</CODE>
+</P>
+<PRE>
+    &gt; translation_list -number=25 FoodEng FoodIta | write_file transl.txt
+</PRE>
+<P>
+The <CODE>number</CODE> flag gives the number of sentences generated.
+</P>
+<A NAME="toc30"></A>
+<H3>Multilingual syntax editing</H3>
+<P>
+<a name="secediting"></a>
+</P>
+<P>
+Any multilingual grammar can be used in the graphical syntax editor, which is
+opened by the shell 
+command <CODE>gfeditor</CODE> followed by the names of the grammar files.
+Thus
+</P>
+<PRE>
+    % gfeditor FoodEng.gf FoodIta.gf 
+</PRE>
+<P>
+opens the editor for the two <CODE>Food</CODE> grammars. 
+</P>
+<P>
+The editor supports commands for manipulating an abstract syntax tree.
+The process is started by choosing a category from the "New" menu.
+Choosing <CODE>Phrase</CODE> creates a new tree of type <CODE>Phrase</CODE>. A new tree
+is in general completely unknown: it consists of a <B>metavariable</B>
+<CODE>?1</CODE>. However, since the category <CODE>Phrase</CODE> in <CODE>Food</CODE> has
+only one possible constructor, <CODE>Is</CODE>, the tree is readily
+given the form <CODE>Is ?1 ?2</CODE>. Here is what the editor looks like at
+this stage:
+</P>
+<P>
+  <IMG ALIGN="right" SRC="food1.png" BORDER="0" ALT="">
+</P>
+<P>
+Editing goes on by <B>refinements</B>, i.e. choices of constructors from
+the menu, until no metavariables remain. Here is a tree resulting from the
+current editing session:
+</P>
+<P>
+  <IMG ALIGN="right" SRC="food2.png" BORDER="0" ALT="">
+</P>
+<P>
+Editing can be continued even when the tree is finished. The user can shift
+the <B>focus</B> to some of the subtrees by clicking at it or the corresponding
+part of a linearization. In the picture, the focus is on "fish".
+Since there are no metavariables, 
+the menu shows no refinements, but some other possible actions:
+</P>
+<UL>
+<LI>to <B>change</B> "fish" to "cheese" or "wine"
+<LI>to <B>delete</B> "fish", i.e. change it to a metavariable
+<LI>to <B>wrap</B> "fish" in a qualification, i.e. change it to
+  <CODE>QKind ? Fish</CODE>, where the quality can be given in a later refinement
+</UL>
+
+<P>
+In addition to menu-based editing, the tool supports refinement by parsing,
+which is accessible by middle-clicking in the tree or in the linearization field.
+</P>
+<P>
+<B>Exercise</B>. Construct the sentence 
+<I>this very expensive cheese is very very delicious</I>
+and its Italian translation by using <CODE>gfeditor</CODE>.
+</P>
+<A NAME="toc31"></A>
+<H2>Context-free grammars and GF</H2>
+<P>
+Readers not familar with context-free grammars, also known as BNF grammars, can
+skip this section. Those that are familar with them will find here the exact
+relation between GF and context-free grammars. We will moreover show how
+the BNF format can be used as input to the GF program; it is often more
+concise than GF proper, but also more restricted in expressive power.
+</P>
+<A NAME="toc32"></A>
+<H3>The "cf" grammar format</H3>
+<P>
+The grammar <CODE>FoodEng</CODE> could be written in a BNF format as follows:
+</P>
+<PRE>
+    Is.        Phrase  ::= Item "is" Quality ;
+    That.      Item    ::= "that" Kind ;
+    This.      Item    ::= "this" Kind ;
+    QKind.     Kind    ::= Quality Kind ;
+    Cheese.    Kind    ::= "cheese" ;
+    Fish.      Kind    ::= "fish" ;
+    Wine.      Kind    ::= "wine" ;
+    Italian.   Quality ::= "Italian" ;
+    Boring.    Quality ::= "boring" ;
+    Delicious. Quality ::= "delicious" ;
+    Expensive. Quality ::= "expensive" ;
+    Fresh.     Quality ::= "fresh" ;
+    Very.      Quality ::= "very" Quality ;
+    Warm.      Quality ::= "warm" ;
+</PRE>
+<P>
+In this format, each rule is prefixed by a <B>label</B> that gives
+the constructor function GF gives in its <CODE>fun</CODE> rules. In fact,
+each context-free rule is a fusion of a <CODE>fun</CODE> and a <CODE>lin</CODE> rule:
+it states simultaneously that
+</P>
+<UL>
+<LI>the label is a function from the nonterminal categories
+  on the right-hand side to the category on the left-hand side;
+  the first rule gives
+<PRE>
+    fun Is : Item -&gt; Quality -&gt; Phrase 
+</PRE>
+<LI>trees built by the label are linearized in the way indicated
+  by the right-hand side;
+  the first rule gives
+<PRE>
+    lin Is item quality = {s = item.s ++ "is" ++ quality.s}
+</PRE>
+</UL>
+
+<P>
+The translation from BNF to GF described above is in fact used in
+the GF system to convert BNF grammars into GF. BNF files are recognized
+by the file name suffix <CODE>.cf</CODE>; thus the grammar above can be
+put into a file named <CODE>food.cf</CODE> and read into GF by
+</P>
+<PRE>
+    &gt; import food.cf
+</PRE>
+<P></P>
+<A NAME="toc33"></A>
+<H3>Restrictions of context-free grammars</H3>
+<P>
+Even though we managed to write <CODE>FoodEng</CODE> in the context-free format,
+we cannot do this for GF grammars in general. It is enough to try this
+with <CODE>FoodIta</CODE> at the same time as <CODE>FoodEng</CODE>, 
+we lose an important aspect of multilinguality:
+that the order of constituents is defined only in concrete syntax.
+Thus we could not use context-free <CODE>FoodEng</CODE> and <CODE>FoodIta</CODE> in a multilingual
+grammar that supports translation via common abstract syntax: the
+qualification function <CODE>QKind</CODE> has different types in the two
+grammars.
+</P>
+<P>
+In general terms, the separation of concrete and abstract syntax allows
+three deviations from context-free grammar:
+</P>
+<UL>
+<LI><B>permutation</B>: changing the order of constituents
+<LI><B>suppression</B>: omitting constituents
+<LI><B>reduplication</B>: repeating constituents
+</UL>
+
+<P>
+The third property is the one that definitely shows that GF is 
+stronger than context-free: GF can define the <B>copy language</B>
+<CODE>{x x | x &lt;- (a|b)*}</CODE>, which is known not to be context-free.
+The other properties have more to do with the kind of trees that
+the grammar can associate with strings: permutation is important
+in multilingual grammars, and suppression is exploited in grammars
+where trees carry some hidden semantic information (see <a href="#chapsix">the sixth chapter</a>
+below).
+</P>
+<P>
+Of course, context-free grammars are also restricted from the
+grammar engineering point of view. They give no support to
+modules, functions, and parameters, which are so central
+for the productivity of GF. Despite the initial conciseness
+of context-free grammars, GF can easily produce grammars where
+30 lines of GF code would need hundreds of lines of
+context-free grammar code to produce; see exercises
+<a href="#secitalian">here</a> and <a href="#sectense">here</a>.
+</P>
+<P>
+<B>Exercise</B>. GF can also interpret unlabelled BNF grammars, by
+creating labels automatically. The right-hand sides of BNF rules
+can moreover be disjunctions, e.g.
+</P>
+<PRE>
+    Quality ::= "fresh" | "Italian" | "very" Quality ;
+</PRE>
+<P>
+Experiment with this format in GF, possibly with a grammar that
+you import from some other source, such as a programming language
+document.
+</P>
+<P>
+<B>Exercise</B>. Define the copy language <CODE>{x x | x &lt;- (a|b)*}</CODE> in GF.
+</P>
+<A NAME="toc34"></A>
+<H2>Modules and files</H2>
+<P>
+GF uses suffixes to recognize different file formats. The most
+important ones are:
+</P>
+<UL>
+<LI>Source files: <I>Modulename</I><CODE>.gf</CODE>
+<LI>Target files: <I>Modulename</I><CODE>.gfc</CODE>
+</UL>
+
+<P>
+When you import <CODE>FoodEng.gf</CODE>, you see the target files being
+generated:
+</P>
+<PRE>
+    &gt; i FoodEng.gf
+    - compiling Food.gf...   wrote file Food.gfc 16 msec
+    - compiling FoodEng.gf...   wrote file FoodEng.gfc 20 msec
+</PRE>
+<P>
+You also see that the GF program does not only read the file 
+<CODE>FoodEng.gf</CODE>, but also all other files that it
+depends on --- in this case, <CODE>Food.gf</CODE>.
+</P>
+<P>
+For each file that is compiled, a <CODE>.gfc</CODE> file
+is generated. The GFC format (="GF Canonical") is the
+"machine code" of GF, which is faster to process than
+GF source files. When reading a module, GF decides whether
+to use an existing <CODE>.gfc</CODE> file or to generate
+a new one, by looking at modification times.
+</P>
+<P>
+<I>In GF version 3, the</I> <CODE>gfc</CODE> <I>format is replaced by the format suffixed</I>
+<CODE>gfo</CODE>, <I>"GF object"</I>.
+</P>
+<P>
+<B>Exercise</B>.  What happens when you import <CODE>FoodEng.gf</CODE> for 
+a second time? Try this in different situations:
+</P>
+<UL>
+<LI>Right after importing it the first time (the modules are kept in
+  the memory of GF and need no reloading).
+<LI>After issuing the command <CODE>empty</CODE> (<CODE>e</CODE>), which clears the memory
+  of GF.
+<LI>After making a small change in <CODE>FoodEng.gf</CODE>, be it only an added space.
+<LI>After making a change in <CODE>Food.gf</CODE>.
+</UL>
+
+<A NAME="toc35"></A>
+<H2>Using operations and resource modules</H2>
+<A NAME="toc36"></A>
+<H3>The golden rule of functional programming</H3>
+<P>
+When writing a grammar, you have to type lots of 
+characters. You have probably
+done this by the copy-and-paste method, which is a universally
+available way to avoid repeating work.
+</P>
+<P>
+However, there is a more elegant way to avoid repeating work than 
+the copy-and-paste
+method. The <B>golden rule of functional programming</B> says that
+</P>
+<UL>
+<LI>whenever you find yourself programming by copy-and-paste, 
+  write a function instead.
+</UL>
+
+<P>
+A function separates the shared parts of different computations from the
+changing parts, its <B>arguments</B>, or <B>parameters</B>. 
+In functional programming languages, such as
+Haskell, it is possible to share much more 
+code with functions than in languages such as C and Java, because
+of higher-order functions (functions that takes functions as arguments).
+</P>
+<A NAME="toc37"></A>
+<H3>Operation definitions</H3>
+<P>
+GF is a functional programming language, not only in the sense that
+the abstract syntax is a system of functions (<CODE>fun</CODE>), but also because
+functional programming can be used when defining concrete syntax. This is
+done by using a new form of judgement, with the keyword <CODE>oper</CODE> (for
+<B>operation</B>), distinct from <CODE>fun</CODE> for the sake of clarity.
+Here is a simple example of an operation:
+</P>
+<PRE>
+    oper ss : Str -&gt; {s : Str} = \x -&gt; {s = x} ;
+</PRE>
+<P>
+The operation can be <B>applied</B> to an argument, and GF will
+<B>compute</B> the application into a value. For instance,
+</P>
+<PRE>
+    ss "boy" ===&gt; {s = "boy"}
+</PRE>
+<P>
+We use the symbol <CODE>===</CODE> to indicate how an expression is 
+computed into a value; this symbol is not a part of GF.
+</P>
+<P>
+Thus an <CODE>oper</CODE> judgement includes the name of the defined operation,
+its type, and an expression defining it. As for the syntax of the defining
+expression, notice the <B>lambda abstraction</B> form <CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>t</I> of
+the function. It reads: function with variable <I>x</I> and <B>function body</B>
+<I>t</I>. Any occurrence of <I>x</I> in <I>t</I> is said to be <B>bound</B> in <I>t</I>.
+</P>
+<P>
+For lambda abstraction with multiple arguments, we have the shorthand
+</P>
+<PRE>
+    \x,y -&gt; t   ===  \x -&gt; \y -&gt; t
+</PRE>
+<P>
+The notation we have used for linearization rules, where
+variables are bound on the left-hand side, is actually syntactic
+sugar for abstraction:
+</P>
+<PRE>
+    lin f x = t   ===  lin f = \x -&gt; t
+</PRE>
+<P></P>
+<A NAME="toc38"></A>
+<H3>The ``resource`` module type</H3>
+<P>
+Operator definitions can be included in a concrete syntax. 
+But they are usually not really tied to a particular 
+set of linearization rules.
+They should rather be seen as <B>resources</B>
+usable in many concrete syntaxes.
+</P>
+<P>
+The <CODE>resource</CODE> module type is used to package
+<CODE>oper</CODE> definitions into reusable resources. Here is
+an example, with a handful of operations to manipulate
+strings and records.
+</P>
+<PRE>
+    resource StringOper = {
+      oper
+        SS : Type = {s : Str} ;
+        ss : Str -&gt; SS = \x -&gt; {s = x} ;
+        cc : SS -&gt; SS -&gt; SS = \x,y -&gt; ss (x.s ++ y.s) ;
+        prefix : Str -&gt; SS -&gt; SS = \p,x -&gt; ss (p ++ x.s) ;
+    }
+</PRE>
+<P></P>
+<A NAME="toc39"></A>
+<H3>Opening a resource</H3>
+<P>
+Any number of <CODE>resource</CODE> modules can be
+<B>open</B>ed in a <CODE>concrete</CODE> syntax, which
+makes definitions contained
+in the resource usable in the concrete syntax. Here is
+an example, where the resource <CODE>StringOper</CODE> is
+opened in a new version of <CODE>FoodEng</CODE>.
+</P>
+<PRE>
+    concrete FoodEng of Food = open StringOper in {
+  
+      lincat
+        S, Item, Kind, Quality = SS ;
+  
+      lin
+        Is item quality = cc item (prefix "is" quality) ;
+        This k = prefix "this" k ;
+        That k = prefix "that" k ;
+        QKind k q = cc k q ;
+        Wine = ss "wine" ;
+        Cheese = ss "cheese" ;
+        Fish = ss "fish" ;
+        Very = prefix "very" ;
+        Fresh = ss "fresh" ;
+        Warm = ss "warm" ;
+        Italian = ss "Italian" ;
+        Expensive = ss "expensive" ;
+        Delicious = ss "delicious" ;
+        Boring = ss "boring" ;
+    }
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Use the same string operations to write <CODE>FoodIta</CODE>
+more concisely.
+</P>
+<A NAME="toc40"></A>
+<H3>Partial application</H3>
+<P>
+<a name="secpartapp"></a>
+</P>
+<P>
+GF, like Haskell, permits <B>partial application</B> of
+functions. An example of this is the rule
+</P>
+<PRE>
+    lin This k = prefix "this" k ;
+</PRE>
+<P>
+which can be written more concisely
+</P>
+<PRE>
+    lin This = prefix "this" ;
+</PRE>
+<P>
+The first form is perhaps more intuitive to write
+but, once you get used to partial application, you will appreciate its
+conciseness and elegance. The logic of partial application 
+is known as <B>currying</B>, with a reference to Haskell B. Curry. 
+The idea is that any <I>n</I>-place function can be seen as a 1-place
+function whose value is an <I>n-</I>1 -place function. Thus
+</P>
+<PRE>
+    oper prefix : Str -&gt; SS -&gt; SS ;
+</PRE>
+<P>
+can be used as a 1-place function that takes a <CODE>Str</CODE> into a
+function <CODE>SS -&gt; SS</CODE>. The expected linearization of <CODE>This</CODE> is exactly
+a function of such a type, operating on an argument of type <CODE>Kind</CODE>
+whose linearization is of type <CODE>SS</CODE>. Thus we can define the
+linearization directly as <CODE>prefix "this"</CODE>.
+</P>
+<P>
+An important part of the art of functional programming is to decide the order
+of arguments in a function, so that partial application can be used as much 
+as possible. For instance, of the operation <CODE>prefix</CODE> we know that it
+will be typically applied to linearization variables with constant strings.
+This is the reason to put the <CODE>Str</CODE> argument before the <CODE>SS</CODE> argument --- not
+the prefixity. A <CODE>postfix</CODE> function would have exactly the same order of arguments.
+</P>
+<P>
+<B>Exercise</B>. Define an operation <CODE>infix</CODE> analogous to <CODE>prefix</CODE>,
+such that it allows you to write
+</P>
+<PRE>
+    lin Is = infix "is" ;
+</PRE>
+<P></P>
+<A NAME="toc41"></A>
+<H3>Testing resource modules</H3>
+<P>
+To test a <CODE>resource</CODE> module independently, you must import it
+with the flag <CODE>-retain</CODE>, which tells GF to retain <CODE>oper</CODE> definitions
+in the memory; the usual behaviour is that <CODE>oper</CODE> definitions
+are just applied to compile linearization rules 
+(this is called <B>inlining</B>) and then thrown away.
+</P>
+<PRE>
+    &gt; import -retain StringOper.gf
+</PRE>
+<P>
+The command <CODE>compute_concrete = cc</CODE> computes any expression
+formed by operations and other GF constructs. For example,
+</P>
+<PRE>
+    &gt; compute_concrete prefix "in" (ss "addition")
+    {
+      s : Str = "in" ++ "addition"
+    }
+</PRE>
+<P></P>
+<A NAME="toc42"></A>
+<H2>Grammar architecture</H2>
+<P>
+<a name="secarchitecture"></a>
+</P>
+<A NAME="toc43"></A>
+<H3>Extending a grammar</H3>
+<P>
+The module system of GF makes it possible to write a new module that <B>extend</B>s
+an old one. The syntax of extension is
+shown by the following example. We extend <CODE>Food</CODE> into <CODE>MoreFood</CODE> by
+adding a category of questions and two new functions.
+</P>
+<PRE>
+    abstract Morefood = Food ** {
+      cat
+        Question ;
+      fun
+        QIs : Item -&gt; Quality -&gt; Question ;
+        Pizza : Kind ;
+        
+    }
+</PRE>
+<P>
+Parallel to the abstract syntax, extensions can
+be built for concrete syntaxes:
+</P>
+<PRE>
+    concrete MorefoodEng of Morefood = FoodEng ** {
+      lincat
+        Question = {s : Str} ;
+      lin
+        QIs item quality = {s = "is" ++ item.s ++ quality.s} ;
+        Pizza = {s = "pizza"} ;
+    }
+</PRE>
+<P>
+The effect of extension is that all of the contents of the extended
+and extending module are put together. We also say that the new 
+module <B>inherits</B> the contents of the old module.
+</P>
+<P>
+At the same time as extending a module of the same type, a concrete
+syntax module may open resources. Since <CODE>open</CODE> takes effect in
+the module body and not in the extended module, its logical place
+in the module header is after the extend part:
+</P>
+<PRE>
+    concrete MorefoodIta of Morefood = FoodIta ** open StringOper in {
+      lincat
+        Question = SS ;
+      lin
+        QIs item quality = ss (item.s ++ "�" ++ quality.s) ;
+        Pizza = ss "pizza" ;
+    }
+</PRE>
+<P>
+Resource modules can extend other resource modules, in the
+same way as modules of other types can extend modules of the
+same type. Thus it is possible to build resource hierarchies.
+</P>
+<A NAME="toc44"></A>
+<H3>Multiple inheritance</H3>
+<P>
+Specialized vocabularies can be represented as small grammars that
+only do "one thing" each. For instance, the following are grammars
+for fruit and mushrooms
+</P>
+<PRE>
+    abstract Fruit = {
+      cat Fruit ;
+      fun Apple, Peach : Fruit ;
+    }
+  
+    abstract Mushroom = {
+      cat Mushroom ;
+      fun Cep, Agaric : Mushroom ;
+    }
+</PRE>
+<P>
+They can afterwards be combined into bigger grammars by using
+<B>multiple inheritance</B>, i.e. extension of several grammars at the
+same time:
+</P>
+<PRE>
+    abstract Foodmarket = Food, Fruit, Mushroom ** {
+      fun 
+        FruitKind    : Fruit    -&gt; Kind ;
+        MushroomKind : Mushroom -&gt; Kind ;
+      }
+</PRE>
+<P>
+The main advantages with splitting a grammar to modules are
+<B>reusability</B>, <B>separate compilation</B>, and <B>division of labour</B>. 
+Reusability means
+that one and the same module can be put into different uses; for instance,
+a module with mushroom names might be used in a mycological information system
+as well as in a restaurant phrasebook. Separate compilation means that a module
+once compiled into <CODE>.gfc</CODE> need not be compiled again unless changes have
+taken place.
+Division of labour means simply that programmers that are experts in
+special areas can work on modules belonging to those areas.
+</P>
+<P>
+<B>Exercise</B>. Refactor <CODE>Food</CODE> by taking apart <CODE>Wine</CODE> into a special
+<CODE>Drink</CODE> module.
+</P>
+<A NAME="toc45"></A>
+<H3>Visualizing module structure</H3>
+<P>
+When you have created all the abstract syntaxes and
+one set of concrete syntaxes needed for <CODE>Foodmarket</CODE>,
+your grammar consists of eight GF modules. To see how their
+dependences look like, you can use the command 
+<CODE>visualize_graph = vg</CODE>,
+</P>
+<PRE>
+    &gt; visualize_graph
+</PRE>
+<P>
+and the graph will pop up in a separate window:
+</P>
+<P>
+<IMG ALIGN="middle" SRC="foodmarket.png" BORDER="0" ALT="">
+</P>
+<P>
+The graph uses
+</P>
+<UL>
+<LI>oval boxes for abstract modules
+<LI>square boxes  for concrete modules
+<LI>black-headed arrows for inheritance
+<LI>white-headed arrows for the concrete-of-abstract relation
+</UL>
+
+<P>
+Just as the <CODE>visualize_tree = vt</CODE> command, the freely available tools
+Ghostview and Graphviz are needed. As an alternative, you can again print
+the graph into a <CODE>.dot</CODE> file by using the command <CODE>print_multi = pm</CODE>:
+</P>
+<PRE>
+    &gt; print_multi -printer=graph | write_file Foodmarket.dot
+    &gt; ! dot -Tpng Foodmarket.dot &gt; Foodmarket.png
+</PRE>
+<P></P>
+<A NAME="toc46"></A>
+<H2>Summary of GF language features</H2>
+<A NAME="toc47"></A>
+<H3>Modules</H3>
+<P>
+The general form of a module is
+<center>
+  <I>Moduletype</I> <I>M</I> <I>Of</I> <CODE>=</CODE> (<I>Extends</I> <CODE>**</CODE>)? (<CODE>open</CODE> <I>Opens</I> <CODE>in</CODE>)? <I>Body</I>
+</center>
+where <I>Moduletype</I> is one of <CODE>abstract</CODE>, <CODE>concrete</CODE>, and <CODE>resource</CODE>.
+</P>
+<P>
+If <I>Moduletype</I> is <CODE>concrete</CODE>, the <I>Of</I>-part has the form <CODE>of</CODE> <I>A</I>,
+where <I>A</I> is the name of an abstract module. Otherwise it is empty.
+</P>
+<P>
+The name of the module is given by the identifier <I>M</I>.
+</P>
+<P>
+The optional <I>Extends</I> part is a comma-separated 
+list of module names, which have to be modules of
+the same <I>Moduletype</I>. The contents of these modules are <B>inherited</B> by 
+<I>M</I>. This means that they are both usable in <I>Body</I> and exported by <I>M</I>,
+i.e. inherited when <I>M</I> is inherited and available when <I>M</I> is opened.
+(Exception: <CODE>oper</CODE> and <CODE>param</CODE> judgements are not inherited from
+<CODE>concrete</CODE> modules.)
+</P>
+<P>
+The optional <I>Opens</I> part is a comma-separated
+list of resource module names. The contents of these
+modules are usable in the <I>Body</I>, but they are not exported.
+</P>
+<P>
+Opening can be <B>qualified</B>, e.g.
+</P>
+<PRE>
+    concrete C of A = open (P = Prelude) in ...
+</PRE>
+<P>
+This means that the names from <CODE>Prelude</CODE> are only available in the form
+<CODE>P.name</CODE>. This form of qualifying a name is always possible, and it can
+be used to resolve <B>name conflicts</B>, which result when the same name is
+declared in more than one module that is in scope.
+</P>
+<A NAME="toc48"></A>
+<H3>Judgements</H3>
+<P>
+The <I>Body</I> part consists of judgements. The judgement form table #secjment
+is extended with the following forms:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>form</TH>
+<TH>reading</TH>
+<TH COLSPAN="2">module type</TH>
+</TR>
+<TR>
+<TD ALIGN="center"><CODE>oper</CODE> <I>h</I> <CODE>:</CODE> <I>T</I> <CODE>=</CODE> <I>t</I></TD>
+<TD>operation <I>h</I> of type <I>T</I> is defined as <I>t</I></TD>
+<TD>resource, concrete</TD>
+</TR>
+<TR>
+<TD ALIGN="right"><CODE>param</CODE> <I>P</I> <CODE>=</CODE> <I>C1</I> <CODE>|</CODE> ... <CODE>|</CODE> <I>Cn</I></TD>
+<TD>parameter type P has constructors <I>C1...Cn</I></TD>
+<TD>resource, concrete</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+The <CODE>param</CODE> judgement will be explained in the next chapter.
+</P>
+<P>
+The type part of an <CODE>oper</CODE> judgement can be omitted, if the type can be inferred
+by the GF compiler.
+</P>
+<PRE>
+    oper hello = "hello" ++ "world" ;
+</PRE>
+<P>
+As a rule, type inference works for all terms except lambda abstracts.
+</P>
+<P>
+<B>Lambda abstracts</B> are expressions of the form <CODE>\</CODE><I>x</I> <CODE>-&gt;</CODE> <I>t</I>,
+where <I>x</I> is a variable <B>bound</B> in the expression <I>t</I>, which is the
+<B>body</B> of the lambda abstract. The type of the lambda abstract is
+<I>A</I> <CODE>-&gt;</CODE><I>B</I>, where <I>A</I> is the type of the variable <CODE>x</CODE> and
+<I>B</I> the type of the body <I>t</I>.
+</P>
+<P>
+For multiple lambda abstractions, there is a shorthand
+</P>
+<PRE>
+    \x,y -&gt; t   ===  \x -&gt; \y -&gt; t
+</PRE>
+<P>
+For <CODE>lin</CODE> judgements, there is the shorthand
+</P>
+<PRE>
+    lin f x = t   ===  lin f = \x -&gt; t
+</PRE>
+<P></P>
+<A NAME="toc49"></A>
+<H3>Free variation</H3>
+<P>
+The <CODE>variants</CODE> construct of GF can be used to give a list of 
+concrete syntax terms, of the same type, in free variation. For example,
+</P>
+<PRE>
+    variants {["does not"] ; "doesn't"}
+</PRE>
+<P>
+A limiting case is the empty variant list <CODE>variants {}</CODE>.
+</P>
+<A NAME="toc50"></A>
+<H3>The context-free grammar format</H3>
+<P>
+The <CODE>.cf</CODE> file format is used for <B>context-free grammars</B>, which are
+always interpretable as GF grammars. Files of this format consist of
+rules of the form
+<center>
+  (<I>Label</I> <CODE>.</CODE>)? <I>Cat</I> <CODE>::=</CODE> <I>RHS</I> <CODE>;</CODE>
+</center>
+where the <I>RHS</I> is a sequence of terminals (quoted strings) and
+nonterminals (identifiers). The optional <I>Label</I> gives the abstract
+syntax function created. If it is omitted, a function name is generated
+automatically. Then it is also possible to have more than one <I>RHS</I>,
+separated by <I>|</I>. An empty <I>RHS</I> is interpreted as an empty sequence
+of terminals, not as an empty disjunction.
+</P>
+<P>
+The <B>Extended BNF</B> format (<B>EBNF</B>) can also be used, in files suffixed <CODE>.ebnf</CODE>.
+This format does not allow user-written labels. The right-hand-side of a rule
+can contain everything that is possible in the <CODE>.cf</CODE> format, but also
+optional parts (<CODE>p ?</CODE>), sequences (<CODE>p *</CODE>) and non-empty sequences (<CODE>p +</CODE>).
+For example, the phrases in <CODE>FoodEng</CODE> could be recognized with the following 
+EBNF grammar:
+</P>
+<PRE>
+  Phrase  ::= 
+    ("this" | "that") Quality* ("wine" | "cheese" | "fish") "is" Quality ;
+  Quality ::= 
+    ("very"* ("fresh" | "warm" | "boring" | "Italian" | "expensive")) ;
+</PRE>
+<P></P>
+<A NAME="toc51"></A>
+<H3>Character encoding</H3>
+<P>
+The default encoding is iso-latin-1. UTF-8 can be set by the flag <CODE>coding=utf8</CODE>
+in the grammar. The resource grammar libraries are in iso-latin-1, except Russian
+and Arabic, which are in UTF-8. The resources may be changed to UTF-8 in future.
+Letters in identifiers must currently be iso-latin-1.
+</P>
+<A NAME="toc52"></A>
+<H1>Grammars with parameters</H1>
+<P>
+<a name="chapfour"></a>
+</P>
+<P>
+In this chapter, we will introduce the techniques needed for
+describing the inflection of words, as well as the rules by
+which correct word forms are selected in syntactic combinations.
+These techniques are already needed in a very slight extension
+of the Food grammar of the previous chapter. While explaining
+how the linguistic problems are solved for English and Italian,
+we also cover all the language constructs GF has for
+defining concrete syntax.
+</P>
+<P>
+It is in principle possible to skip this chapter and go directly
+to the next, since the use of the GF Resource Grammar library
+makes it unnecessary to use any more constructs of GF than we
+have already covered: parameters could be left to library implementors.
+</P>
+<A NAME="toc53"></A>
+<H2>The problem: words have to be inflected</H2>
+<P>
+Suppose we want to say, with the vocabulary included in
+<CODE>Food.gf</CODE>, things like
+<center>
+<I>these Italian wines are delicious</I>
+</center>
+The new grammatical facility we need are the plural forms
+of nouns and verbs (<I>wines, are</I>), as opposed to their
+singular forms.
+</P>
+<P>
+The introduction of plural forms requires two things:
+</P>
+<UL>
+<LI>the <B>inflection</B> of nouns and verbs in singular and plural
+<LI>the <B>agreement</B> of the verb to subject: 
+  the verb must have the same number as the subject
+</UL>
+
+<P>
+Different languages have different types of inflection and agreement.
+For instance, Italian has also agreement in gender (masculine vs. feminine).
+In a multilingual grammar,
+we want to express such differences between languages in the
+concrete syntax while ignoring them in the abstract syntax.
+</P>
+<P>
+To be able to do all this, we need one new judgement form
+and some new expression forms.
+We also need to generalize linearization types
+from strings to more complex types.
+</P>
+<P>
+<B>Exercise</B>. Make a list of the possible forms that nouns,
+adjectives, and verbs can have in some languages that you know.
+</P>
+<A NAME="toc54"></A>
+<H2>Parameters and tables</H2>
+<P>
+We define the <B>parameter type</B> of number in English by
+using a new form of judgement:
+</P>
+<PRE>
+    param Number = Sg | Pl ;
+</PRE>
+<P>
+This judgement defines the parameter type <CODE>Number</CODE> by listing
+its two <B>constructors</B>, <CODE>Sg</CODE> and <CODE>Pl</CODE> (common shorthands for
+singular and plural). 
+</P>
+<P>
+To state that <CODE>Kind</CODE> expressions in English have a linearization
+depending on number, we replace the linearization type <CODE>{s : Str}</CODE>
+with a type where the <CODE>s</CODE> field is a <B>table</B> depending on number:
+</P>
+<PRE>
+    lincat Kind = {s : Number =&gt; Str} ;
+</PRE>
+<P>
+The <B>table type</B> <CODE>Number =&gt; Str</CODE> is in many respects similar to
+a function type (<CODE>Number -&gt; Str</CODE>). The main difference is that the
+argument type of a table type must always be a parameter type. This means
+that the argument-value pairs can be listed in a finite table. The following
+example shows such a table:
+</P>
+<PRE>
+    lin Cheese = {
+      s = table {
+        Sg =&gt; "cheese" ;
+        Pl =&gt; "cheeses"
+      }
+    } ;
+</PRE>
+<P>
+The table consists of <B>branches</B>, where a <B>pattern</B> on the
+left of the arrow <CODE>=&gt;</CODE> is assigned a <B>value</B> on the right.
+</P>
+<P>
+The application of a table to a parameter is done by the <B>selection</B>
+operator <CODE>!</CODE>, which is computed by <B>pattern matching</B>: it returns
+the value from the first branch whose pattern matches the
+selection argument. For instance,
+</P>
+<PRE>
+     table {Sg =&gt; "cheese" ; Pl =&gt; "cheeses"} ! Pl 
+     ===&gt; "cheeses"
+</PRE>
+<P>
+As syntactic sugar for table selections, we can define the
+<B>case expressions</B>, which are common in functional programming and also
+handy to use in GF. 
+</P>
+<PRE>
+    case e of {...} ===  table {...} ! e
+</PRE>
+<P></P>
+<P>
+A parameter type can have any number of constructors, and these can
+also take arguments from other parameter types. For instance, an accurate
+type system for English verbs (except <I>be</I>) is
+</P>
+<PRE>
+    param VerbForm = VPresent Number | VPast | VPastPart | VPresPart ;
+</PRE>
+<P>
+This system expresses accurately the fact that only the present tense has
+number variation. (Agreement also requires variation in person, but
+this can be defined in syntax rules, by picking the singular form for third person 
+singular subjects and the plural forms for all others). As an example of
+a table, here are the forms of the verb <I>drink</I>:
+</P>
+<PRE>
+    table {
+      VPresent Sg =&gt; "drinks" ;
+      VPresent Pl =&gt; "drink" ;
+      VPast       =&gt; "drank" ;
+      VPastPart   =&gt; "drunk" ;
+      VPresPart   =&gt; "drinking"
+      }
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. In an earlier exercise (previous section), 
+you made a list of the possible 
+forms that nouns, adjectives, and verbs can have in some languages that 
+you know. Now take some of the results and implement them by
+using parameter type definitions and tables. Write them into a <CODE>resource</CODE>
+module, which you can test by using the command <CODE>compute_concrete</CODE>.
+</P>
+<A NAME="toc55"></A>
+<H2>Inflection tables and paradigms</H2>
+<P>
+All English common nouns are inflected for number, most of them in the
+same way: the plural form is obtained from the singular by adding the
+ending <I>s</I>. This rule is an example of 
+a <B>paradigm</B> --- a formula telling how a class of words is inflected.
+</P>
+<P>
+From the GF point of view, a paradigm is a function that takes 
+a <B>lemma</B> --- also known as a <B>dictionary form</B> or a <B>citation form</B> --- and 
+returns an inflection
+table of desired type. Paradigms are not functions in the sense of the
+<CODE>fun</CODE> judgements of abstract syntax (which operate on trees and not
+on strings), but operations defined in <CODE>oper</CODE> judgements.
+The following operation defines the regular noun paradigm of English:
+</P>
+<PRE>
+    oper regNoun : Str -&gt; {s : Number =&gt; Str} = \dog -&gt; {
+      s = table {
+        Sg =&gt; dog ;
+        Pl =&gt; dog + "s"
+        }
+      } ;
+</PRE>
+<P>
+The <B>gluing</B> operator <CODE>+</CODE> tells that
+the string held in the variable <CODE>dog</CODE> and the ending <CODE>"s"</CODE> 
+are written together to form one <B>token</B>. Thus, for instance,
+</P>
+<PRE>
+    (regNoun "cheese").s ! Pl  ===&gt; "cheese" + "s"  ===&gt;  "cheeses"
+</PRE>
+<P>
+A more complex example are regular verbs:
+</P>
+<PRE>
+    oper regVerb : Str -&gt; {s : VerbForm =&gt; Str} = \talk -&gt; {
+      s = table {
+        VPresent Sg =&gt; talk + "s" ;
+        VPresent Pl =&gt; talk ;
+        VPresPart   =&gt; talk + "ing" ;
+        _           =&gt; talk + "ed"
+        }
+      } ;
+</PRE>
+<P>
+Notice how a catch-all case for the past tense and the past participle
+is expressed by using a <B>wild card</B> pattern <CODE>_</CODE>. Here again, pattern matching
+tries all patterns in order until it finds a matching pattern;
+and it is the wild card that is the first match for both <CODE>VPast</CODE> and
+<CODE>VPastPart</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Identify cases in which the <CODE>regNoun</CODE> paradigm does not
+apply in English, and implement some alternative paradigms.
+</P>
+<P>
+<B>Exercise</B>. Implement some regular paradigms for other languages you have
+considered in earlier exercises.
+</P>
+<A NAME="toc56"></A>
+<H2>Using parameters in concrete syntax</H2>
+<P>
+We can now enrich the concrete syntax definitions to
+comprise morphology. This will permit a more radical
+variation between languages (e.g. English and Italian)
+than just the use of different words. In general,
+parameters and linearization types are different in
+different languages --- but this does not prevent using a
+the common abstract syntax.
+</P>
+<P>
+We consider a grammar <CODE>Foods</CODE>, which is similar to
+<CODE>Food</CODE>, with the addition two rules for forming plural items:
+</P>
+<PRE>
+    fun These, Those : Kind -&gt; Item ;
+</PRE>
+<P>
+We also add a noun which in Italian has the feminine case; all nouns in
+<CODE>Food</CODE> were carefully chosen to be masculine!
+</P>
+<PRE>
+    fun Pizza : Kind ;
+</PRE>
+<P>
+This noun will force us to deal with gender in the Italian grammar, 
+which is what we need for the grammar to scale up for larger applications.
+</P>
+<A NAME="toc57"></A>
+<H3>Agreement</H3>
+<P>
+In the English <CODE>Foods</CODE> grammar, we need just one type of parameters:
+<CODE>Number</CODE> as defined above. The phrase-forming rule
+</P>
+<PRE>
+    fun Is : Item -&gt; Quality -&gt; Phrase ;
+</PRE>
+<P>
+is affected by the number because of <B>subject-verb agreement</B>.
+In English, agreement says that the verb of a sentence
+must be inflected in the number of the subject. Thus we will linearize
+</P>
+<PRE>
+    Is (This Pizza) Warm   ===&gt;  "this pizza is warm"
+    Is (These Pizza) Warm  ===&gt;  "these pizzas are warm"
+</PRE>
+<P>
+Here it is the <B>copula</B>, i.e. the verb <I>be</I> that is affected. We define
+the copula as the operation
+</P>
+<PRE>
+    oper copula : Number -&gt; Str = \n -&gt; 
+      case n of {
+        Sg =&gt; "is" ;
+        Pl =&gt; "are"
+        } ;
+</PRE>
+<P>
+We don't need to inflect the copula for person and tense in this grammar.
+</P>
+<P>
+The form of the copula in a sentence depends on the 
+<B>subject</B> of the sentence, i.e. the item
+that is qualified. This means that an <CODE>Item</CODE> must have such a number to provide.
+The obvious way to guarantee this is by including a number field in
+the linearization type:
+</P>
+<PRE>
+    lincat Item = {s : Str ; n : Number} ;
+</PRE>
+<P>
+Now we can write precisely the <CODE>Is</CODE> rule that expresses agreement:
+</P>
+<PRE>
+    lin Is item qual = {s = item.s ++ copula item.n ++ qual.s} ;
+</PRE>
+<P>
+The copula receives the number that it needs from the subject item.
+</P>
+<A NAME="toc58"></A>
+<H3>Determiners</H3>
+<P>
+Let us turn to <CODE>Item</CODE> subjects and see how they receive their
+numbers. The two rules
+</P>
+<PRE>
+    fun This, These : Kind -&gt; Item ;
+</PRE>
+<P>
+form <CODE>Item</CODE>s from <CODE>Kind</CODE>s by adding <B>determiners</B>, either
+<I>this</I> or <I>these</I>. The determiners
+require different numbers of their <CODE>Kind</CODE> arguments: <CODE>This</CODE> 
+requires the singular (<I>this pizza</I>) and <CODE>These</CODE> the plural
+(<I>these pizzas</I>). The <CODE>Kind</CODE> is the same in both cases: <CODE>Pizza</CODE>.
+Thus a <CODE>Kind</CODE> must have both singular and plural forms.
+The obvious way to express this is by using a table:
+</P>
+<PRE>
+    lincat Kind = {s : Number =&gt; Str} ;
+</PRE>
+<P>
+The linearization rules for <CODE>This</CODE> and <CODE>These</CODE> can now be written
+</P>
+<PRE>
+    lin This kind = {
+      s = "this" ++ kind.s ! Sg ; 
+      n = Sg
+    } ; 
+  
+    lin These kind = {
+      s = "these" ++ kind.s ! Pl ; 
+      n = Pl
+    } ; 
+</PRE>
+<P>
+The grammatical relation between the determiner and the noun is similar to
+agreement, but due to some differences into which we don't go here
+it is often called <B>government</B>.
+</P>
+<P>
+Since the same pattern for determination is used four times in 
+the <CODE>FoodsEng</CODE> grammar, we codify it as an operation,
+</P>
+<PRE>
+    oper det : 
+      Str -&gt; Number -&gt; {s : Number =&gt; Str} -&gt; {s : Str ; n : Number} = 
+        \det,n,kind -&gt; {
+        s = det ++ kind.s ! n ; 
+        n = n
+      } ; 
+</PRE>
+<P>
+Now we can write, for instance,
+</P>
+<PRE>
+    lin This  = det Sg "this" ;
+    lin These = det Pl "these" ;
+</PRE>
+<P>
+Notice the order of arguments that permits partial
+application (<a href="#secpartapp">here</a>).
+</P>
+<P>
+In a more <B>lexicalized</B> grammar, determiners would be made into a 
+category of their own and given an inherent number:
+</P>
+<PRE>
+    lincat Det = {s : Str ; n : Number} ;
+    fun Det : Det -&gt; Kind -&gt; Item ;
+    lin Det det kind = {
+        s = det.s ++ kind.s ! det.n ; 
+        n = det.n
+      } ; 
+</PRE>
+<P>
+Linguistically motivated grammars, such as the GF resource grammars,
+usually favour lexicalized treatments of words; see <a href="#seclexical">here</a> below.
+Notice that the fields of the record in <CODE>Det</CODE> are precisely the two
+arguments needed in the <CODE>det</CODE> operation.
+</P>
+<A NAME="toc59"></A>
+<H3>Parametric vs. inherent features</H3>
+<P>
+<CODE>Kind</CODE>s, as in general <B>common nouns</B> in English, have both singular
+and plural forms; what form is chosen is determined by the construction
+in which the noun is used. We say that the number is a
+<B>parametric feature</B> of nouns. In GF, parametric features
+appear as argument types of tables in linearization types.
+</P>
+<PRE>
+    lincat Kind = {s : Number =&gt; Str} ;
+</PRE>
+<P>
+<CODE>Item</CODE>s, as in general <B>noun phrases</B> in English, don't
+have variation in number. The number is instead an <B>inherent feature</B>,
+which the noun phrase passes to the verb. In GF, inherent features
+appear as record fields in linearization types.
+</P>
+<PRE>
+    lincat Item = {s : Str ; n : Number} ;
+</PRE>
+<P>
+A category can have both parametric and inherent features. As we will see
+in the Italian <CODE>Foods</CODE> grammar, nouns have parametric number and
+inherent gender:
+</P>
+<PRE>
+    lincat Kind = {s : Number =&gt; Str ; g : Gender} ;
+</PRE>
+<P>
+Nothing prevents the same parameter type from appearing both
+as parametric and inherent feature, or the appearance of several inherent
+features of the same type, etc. Determining the linearization types
+of categories is one of the most crucial steps in the design of a GF
+grammar. These two conditions must be in balance:
+</P>
+<UL>
+<LI>existence: what forms are possible to build by morphological and
+  other means?
+<LI>need: what features are expected via agreement or government?
+</UL>
+
+<P>
+Grammar books and dictionaries give good advice on existence; for instance,
+an Italian dictionary has entries such as
+<center>
+<B>uomo</B>, pl. <I>uomini</I>, n.m. "man"
+</center>
+which tells that <I>uomo</I> is a masculine noun with the plural form <I>uomini</I>.
+From this alone, or with a couple more examples, we can generalize to the type 
+for all nouns in Italian: they have both singular and plural forms and thus
+a parametric number, and they have an inherent gender.
+</P>
+<P>
+The distinction between parametric and inherent features can be stated in
+object-oriented programming terms: a linearization type is like a <B>class</B>,
+which has a <B>method</B> for linearization and also some <B>attributes</B>.
+In this class, the parametric features appear as arguments to the
+linearization method, whereas the inherent features appear as attributes.
+</P>
+<P>
+For words, inherent features are usually given <I>ad hoc</I> as lexical information.
+For combinations, they are typically <I>inherited</I> from some part of the construction.
+For instance, qualified noun constructs in Italian inherit their gender from noun part
+(called the <B>head</B> of the construction in linguistics):
+</P>
+<PRE>
+    lin QKind qual kind = 
+      let gen = kind.g in {
+        s = table {n =&gt; kind.s ! n ++ qual.s ! gen ! n} ;
+        g = gen
+        } ;
+</PRE>
+<P>
+This rule uses a <B>local definition</B> (also known as a <B>let expression</B>) to
+avoid computing <CODE>kind.g</CODE> twice, and also to express the linguistic
+generalization that it is the same gender that is both passed to
+the adjective and inherited by the construct.
+The parametric number feature is in this rule passed to both the noun and
+the adjective. In the table, a <B>variable pattern</B> is used to match
+any possible number. Variables introduced in patterns are in scope in
+the right-hand sides of corresponding branches. Again, it is good to
+use a variable to express the linguistic generalization that the number
+is passed to the parts, rather than expand the table into <CODE>Sg</CODE> and <CODE>Pl</CODE>
+branches.
+</P>
+<P>
+Sometimes the puzzle of making agreement and government work in a grammar has
+several solutions. For instance, <B>precedence</B> in programming languages can
+be equivalently described by a parametric or an inherent feature 
+(see <a href="#secprecedence">here</a> below).
+</P>
+<P>
+In natural language applications that use the resource grammar library,
+all parameters are hidden from the user, who thereby does not need to bother
+about them. The only thing that she has to think about is what linguistic
+categories are given as linearization types to each semantic category.
+</P>
+<P>
+For instance, the GF resource grammar library has a category <CODE>NP</CODE> of
+noun phrases, <CODE>AP</CODE> of adjectival phrases, and <CODE>Cl</CODE> of sentence-like clauses.
+In the implementation of <CODE>Foods</CODE> <a href="#secenglish">here</a>, we will define
+</P>
+<PRE>
+    lincat Phrase = Cl ; Item = NP ; Quality = AP ;
+</PRE>
+<P>
+To express that an item has a quality, we will use a resource function
+</P>
+<PRE>
+    mkCl : NP -&gt; AP -&gt; Cl ;
+</PRE>
+<P>
+in the linearization rule:
+</P>
+<PRE>
+    lin Is = mkCl ;
+</PRE>
+<P>
+In this way, we have no need to think about parameters and agreement.
+<a href="#chapfive">the fifth chapter</a> will show a complete implementation of <CODE>Foods</CODE> by the
+resource grammar, port it to many new languages, and extend it with
+many new constructs.
+</P>
+<A NAME="toc60"></A>
+<H2>An English concrete syntax for Foods with parameters</H2>
+<P>
+We repeat some of the rules above by showing the entire
+module <CODE>FoodsEng</CODE>, equipped with parameters. The parameters and
+operations are, for the sake of brevity, included in the same module
+and not in a separate <CODE>resource</CODE>. However, some string operations
+from the library <CODE>Prelude</CODE> are used. 
+</P>
+<PRE>
+    --# -path=.:prelude
+  
+    concrete FoodsEng of Foods = open Prelude in {
+  
+    lincat
+      S, Quality = SS ; 
+      Kind = {s : Number =&gt; Str} ; 
+      Item = {s : Str ; n : Number} ; 
+  
+    lin
+      Is item quality = ss (item.s ++ copula item.n ++ quality.s) ;
+      This  = det Sg "this" ;
+      That  = det Sg "that" ;
+      These = det Pl "these" ;
+      Those = det Pl "those" ;
+      QKind quality kind = {s = table {n =&gt; quality.s ++ kind.s ! n}} ;
+      Wine = regNoun "wine" ;
+      Cheese = regNoun "cheese" ;
+      Fish = noun "fish" "fish" ;
+      Pizza = regNoun "pizza" ;
+      Very = prefixSS "very" ;
+      Fresh = ss "fresh" ;
+      Warm = ss "warm" ;
+      Italian = ss "Italian" ;
+      Expensive = ss "expensive" ;
+      Delicious = ss "delicious" ;
+      Boring = ss "boring" ;
+  
+    param
+      Number = Sg | Pl ;
+  
+    oper
+      det : Number -&gt; Str -&gt; {s : Number =&gt; Str} -&gt; {s : Str ; n : Number} = 
+        \n,d,cn -&gt; {
+          s = d ++ cn.s ! n ;
+          n = n
+        } ;
+      noun : Str -&gt; Str -&gt; {s : Number =&gt; Str} = 
+        \man,men -&gt; {s = table {
+          Sg =&gt; man ;
+          Pl =&gt; men 
+          }
+        } ;
+      regNoun : Str -&gt; {s : Number =&gt; Str} = 
+        \car -&gt; noun car (car + "s") ;
+      copula : Number -&gt; Str = 
+        \n -&gt; case n of {
+          Sg =&gt; "is" ;
+          Pl =&gt; "are"
+          } ;
+    }    
+</PRE>
+<P>
+To find the Prelude library --- or in general,
+GF files located in other directories, a <B>path directive</B> is needed
+either on the command line or as the first line of
+the topmost file compiled.
+The paths in the path list are separated by colons (<CODE>:</CODE>), and every item
+is interpreted primarily relative to the current directory and, secondarily,
+to the value of <CODE>GF_LIB_PATH</CODE> (<B>GF library path</B>). Hence it is a
+good idea to make <CODE>GF_LIB_PATH</CODE> to point into your <CODE>GF/lib/</CODE> whenever
+you start working in GF. For instance, in the Bash shell this is done by
+</P>
+<PRE>
+    % export GF_LIB_PATH=&lt;the location of GF/lib in your file system&gt;
+</PRE>
+<P></P>
+<A NAME="toc61"></A>
+<H2>More on inflection paradigms</H2>
+<P>
+<a name="secinflection"></a>
+</P>
+<P>
+Let us try to extend the English noun paradigms so that we can
+deal with all nouns, not just the regular ones. The goal is to
+provide a morphology module that is maximally easy to use when 
+words are added to the lexicon. In fact, we can think of a
+division of labour where a linguistically trained grammarian
+writes a morphology and hands it over to the lexicon writer
+who knows much less about the rules of inflection.
+</P>
+<P>
+In passing, we will introduce some new GF constructs: local definitions,
+regular expression patterns, and operation overloading.
+</P>
+<A NAME="toc62"></A>
+<H3>Worst-case functions</H3>
+<P>
+To start with, it is useful to perform <B>data abstraction</B> from the type
+of nouns by writing a constructor operation, a <B>worst-case function</B>:
+</P>
+<PRE>
+    oper mkNoun : Str -&gt; Str -&gt; Noun = \x,y -&gt; {
+      s = table {
+        Sg =&gt; x ;
+        Pl =&gt; y
+        }
+      } ;
+</PRE>
+<P>
+This presupposes that we have defined
+</P>
+<PRE>
+    oper Noun : Type = {s : Number =&gt; Str} ;
+</PRE>
+<P>
+Using <CODE>mkNoun</CODE>, we can define
+</P>
+<PRE>
+    lin Mouse = mkNoun "mouse" "mice" ;
+</PRE>
+<P>
+and
+</P>
+<PRE>
+    oper regNoun : Str -&gt; Noun = \x -&gt; mkNoun x (x + "s") ;
+</PRE>
+<P>
+instead of writing the inflection tables explicitly.
+</P>
+<P>
+Nouns like <I>mouse</I>-<I>mice</I>, are so irregular that
+it hardly makes sense to see them as instances of a 
+paradigm that forms the plural from the singular form. 
+But in general, as we will see, there can be different
+regular patterns in a language.
+</P>
+<P>
+The grammar engineering advantage of worst-case functions is that
+the author of the resource module may change the definitions of
+<CODE>Noun</CODE> and <CODE>mkNoun</CODE>, and still retain the
+interface (i.e. the system of type signatures) that makes it
+correct to use these functions in concrete modules. In programming
+terms, <CODE>Noun</CODE> is then treated as an <B>abstract datatype</B>:
+its definition is not available, but only an indirect way of constructing
+its objects.
+</P>
+<P>
+A case where a change of the <CODE>Noun</CODE> type could
+actually happen is if we introduces <B>case</B> (nominative or
+genitive) in the noun inflection:
+</P>
+<PRE>
+    param Case = Nom | Gen ;
+  
+    oper Noun : Type = {s : Number =&gt; Case =&gt; Str} ;
+</PRE>
+<P>
+Now we have to redefine the worst-case function
+</P>
+<PRE>
+    oper mkNoun : Str -&gt; Str -&gt; Noun = \x,y -&gt; {
+      s = table {
+        Sg =&gt; table {
+          Nom =&gt; x ;
+          Gen =&gt; x + "'s"
+          } ;
+        Pl =&gt; table {
+          Nom =&gt; y ;
+          Gen =&gt; y + case last y of {
+            "s" =&gt; "'" ;
+            _   =&gt; "'s"
+          }
+        }
+      } ;
+</PRE>
+<P>
+But up from this level, we can retain the old definitions
+</P>
+<PRE>
+    lin Mouse = mkNoun "mouse" "mice" ;
+    oper regNoun : Str -&gt; Noun = \x -&gt; mkNoun x (x + "s") ;
+</PRE>
+<P>
+which will just compute to different values now.
+</P>
+<P>
+In the last definition of <CODE>mkNoun</CODE>, we used a case expression
+on the last character of the plural form to decide if the genitive
+should be formed with an <CODE>'</CODE> (as in <I>dogs</I>-<I>dogs'</I>) or with
+<CODE>'s</CODE> (as in <I>mice</I>-<I>mice's</I>). The expression <CODE>last y</CODE>
+uses the <CODE>Prelude</CODE> operation
+</P>
+<PRE>
+    last : Str -&gt; Str ;
+</PRE>
+<P>
+The case expression uses <B>pattern matching over strings</B>, which
+is supported in GF, alongside with pattern matching over
+parameters.
+</P>
+<A NAME="toc63"></A>
+<H3>Intelligent paradigms</H3>
+<P>
+Between the completely regular <I>dog</I>-<I>dogs</I> and the completely
+irregular <I>mouse</I>-<I>mice</I>, there are some
+predictable variations:
+</P>
+<UL>
+<LI>nouns ending with an <I>y</I>: <I>fly</I>-<I>flies</I>, except if
+  a vowel precedes the <I>y</I>: <I>boy</I>-<I>boys</I>
+<LI>nouns ending with <I>s</I>, <I>ch</I>, and a number of
+  other endings: <I>bus</I>-<I>buses</I>, <I>leech</I>-<I>leeches</I>
+</UL>
+
+<P>
+One way to deal with them would be to provide alternative paradigms:
+</P>
+<PRE>
+    noun_y : Str -&gt; Noun = \fly -&gt; mkNoun fly (init fly + "ies") ;  
+    noun_s : Str -&gt; Noun = \bus -&gt; mkNoun bus (bus + "es") ;
+</PRE>
+<P>
+The Prelude function <CODE>init</CODE> drops the last character of a token.
+But this solution has some drawbacks: 
+</P>
+<UL>
+<LI>it can be difficult to select the correct paradigm
+<LI>it can be difficult to remember the names of the different paradigms
+</UL>
+
+<P>
+To help the lexicon builder in this task, the morphology programmer
+can put some intelligence in the regular noun paradigm. The easiest
+way to express this in GF is by the use of <B>regular expression patterns</B>:
+</P>
+<PRE>
+    regNoun : Str -&gt; Noun = \w -&gt; 
+      let 
+        ws : Str = case w of {
+          _ + ("a" | "e" | "i" | "o") + "o" =&gt; w + "s" ;  -- bamboo
+          _ + ("s" | "x" | "sh" | "o")      =&gt; w + "es" ; -- bus, hero
+          _ + "z"                           =&gt; w + "zes" ;-- quiz 
+          _ + ("a" | "e" | "o" | "u") + "y" =&gt; w + "s" ;  -- boy
+          x + "y"                           =&gt; x + "ies" ;-- fly
+          _                                 =&gt; w + "s"    -- car
+          } 
+      in 
+      mkNoun w ws
+</PRE>
+<P>
+In this definition, we have used a local definition just in order to
+structure the code, even though there is no multiple evaluation to eliminate.
+In the case expression itself, we have used
+</P>
+<UL>
+<LI><B>disjunctive patterns</B>  <I>P</I> <CODE>|</CODE> <I>Q</I>
+<LI><B>concatenation patterns</B> <I>P</I> <CODE>+</CODE> <I>Q</I>
+</UL>
+
+<P>
+The patterns are ordered in such a way that, for instance, 
+the suffix <CODE>"oo"</CODE> prevents <I>bamboo</I> from matching the suffix
+<CODE>"o"</CODE>.
+</P>
+<P>
+<B>Exercise</B>. The same rules that form plural nouns in English also
+apply in the formation of third-person singular verbs. 
+Write a regular verb paradigm that uses this idea, but first
+rewrite <CODE>regNoun</CODE> so that the analysis needed to build <I>s</I>-forms
+is factored out as a separate <CODE>oper</CODE>, which is shared with
+<CODE>regVerb</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Extend the verb paradigms to cover all verb forms
+in English, with special care taken of variations with the suffix
+<I>ed</I> (e.g. <I>try</I>-<I>tried</I>, <I>use</I>-<I>used</I>).
+</P>
+<P>
+<B>Exercise</B>. Implement the German <B>Umlaut</B> operation on word stems.
+The operation changes the vowel of the stressed stem syllable as follows:
+<I>a</I> to <I>�</I>, <I>au</I> to <I>�u</I>, <I>o</I> to <I>�</I>, and <I>u</I> to <I>�</I>. You
+can assume that the operation only takes syllables as arguments. Test the
+operation to see whether it correctly changes <I>Arzt</I> to <I>�rzt</I>,
+<I>Baum</I> to <I>B�um</I>, <I>Topf</I> to <I>T�pf</I>, and <I>Kuh</I> to <I>K�h</I>.
+</P>
+<A NAME="toc64"></A>
+<H3>Function types with variables</H3>
+<P>
+In <a href="#chapsix">the sixth chapter</a>, we will introduce <B>dependent function types</B>, where
+the value type depends on the argument. For this end, we need a notation
+that binds a variable to the argument type, as in
+</P>
+<PRE>
+    switchOff : (k : Kind) -&gt; Action k
+</PRE>
+<P>
+Function types <I>without</I>
+variables are actually a shorthand notation: writing
+</P>
+<PRE>
+    PredVP : NP -&gt; VP -&gt; S
+</PRE>
+<P>
+is shorthand for 
+</P>
+<PRE>
+    PredVP : (x : NP) -&gt; (y : VP) -&gt; S
+</PRE>
+<P>
+or any other naming of the variables. Actually the use of variables
+sometimes shortens the code, since they can share a type:
+</P>
+<PRE>
+    octuple : (x,y,z,u,v,w,s,t : Str) -&gt; Str
+</PRE>
+<P>
+If a bound variable is not used, it can here, as elsewhere in GF, be replaced by
+a wildcard:
+</P>
+<PRE>
+    octuple : (_,_,_,_,_,_,_,_ : Str) -&gt; Str
+</PRE>
+<P>
+A good practice for functions with many arguments of the same type
+is to indicate the number of arguments:
+</P>
+<PRE>
+    octuple : (x1,_,_,_,_,_,_,x8 : Str) -&gt; Str
+</PRE>
+<P>
+One can also use heuristic variable names to document what 
+information each argument is expected to provide.
+This is very handy in the types of inflection paradigms:
+</P>
+<PRE>
+    mkNoun : (mouse,mice : Str) -&gt; Noun
+</PRE>
+<P></P>
+<A NAME="toc65"></A>
+<H3>Separating operation types and definitions</H3>
+<P>
+In grammars intended as libraries, it is useful to separate oparation
+definitions from their type signatures. The user is only interested
+in the type, whereas the definition is kept for the implementor and
+the maintainer. This is possible by using separate <CODE>oper</CODE> fragments 
+for the two parts:
+</P>
+<PRE>
+    oper regNoun : Str -&gt; Noun ;
+    oper regNoun s = mkNoun s (s + "s") ;
+</PRE>
+<P>
+The type checker combines the two into one <CODE>oper</CODE> judgement to see
+if the definition matches the type. Notice that, in this syntax, it
+is moreover possible to bind the argument variables on the left hand side
+instead of using lambda abstration.
+</P>
+<P>
+In the library module, the type signatures are typically placed in
+the beginning and the definitions in the end. A more radical separation
+can be achieved by using the <CODE>interface</CODE> and <CODE>instance</CODE> module types
+(see <a href="#secinterface">here</a>): the type signatures are placed in the interface 
+and the definitions in the instance.
+</P>
+<A NAME="toc66"></A>
+<H3>Overloading of operations</H3>
+<P>
+Large libraries, such as the GF Resource Grammar Library, may define 
+hundreds of names. This can be unpractical
+for both the library author and the user: the author has to invent longer 
+and longer names which are not always intuitive,
+and the author has to learn or at least be able to find all these names. 
+A solution to this problem, adopted by languages such as C++, 
+is <B>overloading</B>: one and the same name can be used for several functions. 
+When such a name is used, the
+compiler performs <B>overload resolution</B> to find out which of 
+the possible functions is meant. Overload resolution is based on 
+the types of the functions: all functions that
+have the same name must have different types.
+</P>
+<P>
+In C++, functions with the same name can be scattered everywhere in the program.
+In GF, they must be grouped together in <CODE>overload</CODE> groups. Here is an example
+of an overload group, giving the different ways to define nouns in English:
+</P>
+<PRE>
+    oper mkN : overload {
+      mkN : (dog : Str) -&gt; Noun ;         -- regular nouns
+      mkN : (mouse,mice : Str) -&gt; Noun ;  -- irregular nouns
+    }
+</PRE>
+<P>
+Intuitively, the function comes very close to the way in which
+regular and irregular words are given in most dictionaries. If the
+word is regular, just one form is needed. If it is irregular,
+more forms are given. There is no need to use explicit paradigm
+names.
+</P>
+<P>
+The <CODE>mkN</CODE> example gives only the possible types of the overloaded
+operation. Their definitions can be given separately, possibly in another module. 
+Here is a definition of the above overload group:
+</P>
+<PRE>
+    oper mkN = overload {
+      mkN : (dog : Str) -&gt; Noun = regNoun ;
+      mkN : (mouse,mice : Str) -&gt; Noun = mkNoun ;
+    }
+</PRE>
+<P>
+Notice that the types of the branches must be repeated so that they can be
+associated with proper definitions; the order of the branches has no
+significance.
+</P>
+<P>
+<B>Exercise</B>. Design a system of English verb paradigms presented by
+an overload group.
+</P>
+<A NAME="toc67"></A>
+<H3>Morphological analysis and morphology quiz</H3>
+<P>
+Even though morphology is in GF
+mostly used as an auxiliary for syntax, it
+can also be useful on its own right. The command <CODE>morpho_analyse = ma</CODE>
+can be used to read a text and return for each word the analyses that
+it has in the current concrete syntax.
+</P>
+<PRE>
+    &gt; read_file bible.txt | morpho_analyse
+</PRE>
+<P>
+In the same way as translation exercises, morphological exercises can
+be generated, by the command <CODE>morpho_quiz = mq</CODE>. Usually,
+the category is then set to some lexical category. For instance,
+French irregular verbs in the resource grammar library can be trained as
+follows:
+</P>
+<PRE>
+    % gf -path=alltenses:prelude $GF_LIB_PATH/alltenses/IrregFre.gfc
+  
+    &gt; morpho_quiz -cat=V
+  
+    Welcome to GF Morphology Quiz.
+    ...
+  
+    r�appara�tre : VFin VCondit  Pl  P2
+    r�apparaitriez
+    &gt; No, not r�apparaitriez, but
+    r�appara�triez
+    Score 0/1
+</PRE>
+<P>
+Just like translation exercises, a list of morphological exercises can be generated
+off-line and saved in a
+file for later use, by the command <CODE>morpho_list = ml</CODE>
+</P>
+<PRE>
+    &gt; morpho_list -number=25 -cat=V | write_file exx.txt
+</PRE>
+<P>
+The <CODE>number</CODE> flag gives the number of exercises generated.
+</P>
+<A NAME="toc68"></A>
+<H2>The Italian Foods grammar</H2>
+<P>
+<a name="secitalian"></a>
+</P>
+<P>
+We conclude the parametrization of the Food grammar by presenting an
+Italian variant, now complete with parameters, inflection, and 
+agreement.
+</P>
+<P>
+The header part is similar to English:
+</P>
+<PRE>
+  --# -path=.:prelude
+  
+  concrete FoodsIta of Foods = open Prelude in {
+</PRE>
+<P>
+Parameters include not only number but also gender.
+</P>
+<PRE>
+    param
+      Number = Sg | Pl ;
+      Gender = Masc | Fem ;
+</PRE>
+<P>
+Qualities are inflected for gender and number, whereas kinds
+have a parametric number (as in English) and an inherent gender.
+Items have an inherent number (as in English) but also gender.
+</P>
+<PRE>
+    lincat
+      Phr = SS ; 
+      Quality = {s : Gender =&gt; Number =&gt; Str} ; 
+      Kind = {s : Number =&gt; Str ; g : Gender} ; 
+      Item = {s : Str ; g : Gender ; n : Number} ; 
+</PRE>
+<P>
+A Quality is expressed by an adjective, which in Italian has one form for each
+gender-number combination.
+</P>
+<PRE>
+    oper
+      adjective : (_,_,_,_ : Str) -&gt; {s : Gender =&gt; Number =&gt; Str} = 
+        \nero,nera,neri,nere -&gt; {
+          s = table {
+            Masc =&gt; table {
+              Sg =&gt; nero ;
+              Pl =&gt; neri
+              } ; 
+            Fem =&gt; table {
+              Sg =&gt; nera ;
+              Pl =&gt; nere
+              }
+            }
+        } ;
+</PRE>
+<P>
+The very common case of regular adjectives works by adding
+endings to the stem.
+</P>
+<PRE>
+      regAdj : Str -&gt; {s : Gender =&gt; Number =&gt; Str} = \nero -&gt;
+        let ner = init nero 
+        in adjective nero (ner + "a") (ner + "i") (ner + "e") ;
+</PRE>
+<P></P>
+<P>
+For noun inflection, there are several paradigms; since only two forms
+are ever needed, we will just give them explicitly (the resource grammar
+library also has a paradigm that takes the singular form and infers the
+plural and the gender from it).
+</P>
+<PRE>
+      noun : Str -&gt; Str -&gt; Gender -&gt; {s : Number =&gt; Str ; g : Gender} = 
+        \vino,vini,g -&gt; {
+          s = table {
+            Sg =&gt; vino ;
+            Pl =&gt; vini
+            } ;
+          g = g
+        } ;
+</PRE>
+<P>
+As in <CODE>FoodEng</CODE>, we need only number variation for the copula.
+</P>
+<PRE>
+      copula : Number -&gt; Str = 
+        \n -&gt; case n of {
+          Sg =&gt; "�" ;
+          Pl =&gt; "sono"
+          } ;
+</PRE>
+<P>
+Determination is more complex than in English, because of gender:
+it uses separate determiner forms for the two genders, and selects
+one of them as function of the noun determined.
+</P>
+<PRE>
+      det : Number -&gt; Str -&gt; Str -&gt; {s : Number =&gt; Str ; g : Gender} -&gt; 
+          {s : Str ; g : Gender ; n : Number} = 
+        \n,m,f,cn -&gt; {
+          s = case cn.g of {Masc =&gt; m ; Fem =&gt; f} ++ cn.s ! n ;
+          g = cn.g ;
+          n = n
+        } ;
+</PRE>
+<P>
+Here is, finally, the complete set of linearization rules.
+</P>
+<PRE>
+    lin
+      Is item quality = 
+        ss (item.s ++ copula item.n ++ quality.s ! item.g ! item.n) ;
+      This  = det Sg "questo" "questa" ;
+      That  = det Sg "quello" "quella" ;
+      These = det Pl "questi" "queste" ;
+      Those = det Pl "quelli" "quelle" ;
+      QKind quality kind = {
+        s = \\n =&gt; kind.s ! n ++ quality.s ! kind.g ! n ;
+        g = kind.g
+        } ;
+      Wine = noun "vino" "vini" Masc ;
+      Cheese = noun "formaggio" "formaggi" Masc ;
+      Fish = noun "pesce" "pesci" Masc ;
+      Pizza = noun "pizza" "pizze" Fem ;
+      Very qual = {s = \\g,n =&gt; "molto" ++ qual.s ! g ! n} ;
+      Fresh = adjective "fresco" "fresca" "freschi" "fresche" ;
+      Warm = regAdj "caldo" ;
+      Italian = regAdj "italiano" ;
+      Expensive = regAdj "caro" ;
+      Delicious = regAdj "delizioso" ;
+      Boring = regAdj "noioso" ;
+    }
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Experiment with multilingual generation and translation in the
+<CODE>Foods</CODE> grammars.
+</P>
+<P>
+<B>Exercise</B>. Add items, qualities, and determiners to the grammar, and try to get
+their inflection and inherent features right.
+</P>
+<P>
+<B>Exercise</B>. Write a concrete syntax of <CODE>Food</CODE> for a language of your choice,
+now aiming for complete grammatical correctness by the use of parameters.
+</P>
+<P>
+<B>Exercise</B>. Measure the size of the context-free grammar corresponding to
+<CODE>FoodsIta</CODE>. You can do this by printing the grammar in the context-free format
+(<CODE>print_grammar -printer=cfg</CODE>) and counting the lines.
+</P>
+<A NAME="toc69"></A>
+<H2>Discontinuous constituents</H2>
+<P>
+A linearization type may contain more strings than one. 
+An example of where this is useful are English particle
+verbs, such as <I>switch off</I>. The linearization of
+a sentence may place the object between the verb and the particle:
+<I>he switched it off</I>.
+</P>
+<P>
+The following judgement defines transitive verbs as
+<B>discontinuous constituents</B>, i.e. as having a linearization
+type with two strings and not just one. 
+</P>
+<PRE>
+    lincat TV = {s : Number =&gt; Str ; part : Str} ;
+</PRE>
+<P>
+In the abstract syntax, we can now have a rule that combines a subject and an
+object item with a transitive verb to form a sentence:
+</P>
+<PRE>
+    fun AppTV : Item -&gt; TV -&gt; Item -&gt; Phrase ;
+</PRE>
+<P>
+The linearization rule places the object between the two parts of the verb:
+</P>
+<PRE>
+    lin AppTV subj tv obj = 
+      {s = subj.s ++ tv.s ! subj.n ++ obj.s ++ tv.part} ;
+</PRE>
+<P>
+There is no restriction in the number of discontinuous constituents
+(or other fields) a  <CODE>lincat</CODE> may contain. The only condition is that
+the fields must be built from records, tables,
+parameters, and <CODE>Str</CODE>, but not functions. 
+</P>
+<P>
+Notice that the parsing and linearization commands only give accurate
+results for categories whose linearization type has a unique <CODE>Str</CODE> 
+valued field labelled <CODE>s</CODE>. Therefore, discontinuous constituents
+are not a good idea in top-level categories accessed by the users
+of a grammar application.
+</P>
+<P>
+<B>Exercise</B>. Define the language <CODE>a^n b^n c^n</CODE> in GF, i.e.
+any number of <I>a</I>'s followed by the same number of <I>b</I>'s and
+the same number of <I>c</I>'s. This language is not context-free,
+but can be defined in GF by using discontinuous constituents.
+</P>
+<A NAME="toc70"></A>
+<H2>Strings at compile time vs. run time</H2>
+<P>
+A common difficulty in GF are the conditions under which tokens
+can be created. Tokens are created in the following ways:
+</P>
+<UL>
+<LI>quoted string: <CODE>"foo"</CODE>
+<LI>gluing : <CODE>t + s</CODE>
+<LI>predefined operations <CODE>init, tail, tk, dp</CODE>
+<LI>pattern matching over strings
+</UL>
+
+<P>
+The general principle is that 
+<I>tokens must be known at compile time</I>. This means that the above operations
+may not have <B>run-time variables</B> in their arguments. Run-time variables, in
+turn, are the variables that stand for function arguments in linearization rules.
+</P>
+<P>
+Hence it is not legal to write
+</P>
+<PRE>
+    cat Noun ;
+    fun Plural : Noun -&gt; Noun ;
+    lin Plural n = {s = n.s + "s"} ;
+</PRE>
+<P>
+because <CODE>n</CODE> is a run-time variable. Also
+</P>
+<PRE>
+    lin Plural n = {s = (regNoun n).s ! Pl} ; 
+</PRE>
+<P>
+is incorrect with <CODE>regNoun</CODE> as defined <a href="#secinflection">here</a>, because the run-time 
+variable is eventually sent to string pattern matching and gluing.
+</P>
+<P>
+Writing tokens together without a space is an often-wanted behaviour, for instance,
+with punctuation marks. Thus one might try to write
+</P>
+<PRE>
+    lin Question p = {s = p + "?"} ;
+</PRE>
+<P>
+which is incorrect. The way to go is to use an <B>unlexer</B> that creates correct spacing
+after linearization. Correspondingly, a <B>lexer</B> that e.g. analyses <CODE>"warm?"</CODE> into
+to tokens is needed before parsing. This can be done by using flags:
+</P>
+<PRE>
+    flags lexer=text ; unlexer=text ;
+</PRE>
+<P>
+works in the desired way for English text. More on lexers and unlexers will be
+told <a href="#seclexing">here</a>.
+</P>
+<A NAME="toc71"></A>
+<H2>Summary of GF language features</H2>
+<A NAME="toc72"></A>
+<H3>Parameter and table types</H3>
+<P>
+A judgement of the form 
+<center>
+  <CODE>param</CODE> <I>P</I> <CODE>=</CODE> <I>C1</I> <I>X1</I> <CODE>|</CODE> ...  <CODE>|</CODE> <I>Cn</I> <I>Xn</I>
+</center>
+defines a <B>parameter type</B> <I>P</I> with <B>constructors</B> <I>C1</I> ... <I>Cn</I>.
+Each constructor has a <B>context</B> <I>X</I>, which is a (possibly empty)
+sequence of parameter types. A <B>parameter value</B> is an application
+of a constructor to a sequence of parameter values from each type in
+its context. 
+</P>
+<P>
+In addition to types defined in <CODE>param</CODE> judgements, also 
+records of parameter types are parameter types. Their values are records
+of corresponding field values.
+</P>
+<P>
+Moreover, the type <CODE>Ints</CODE> <I>n</I> is a parameter type for any positive
+integer <I>n</I>, and its values are <CODE>0</CODE>, ..., <I>n-1</I>.
+</P>
+<P>
+A <B>table type</B> <I>P</I> <CODE>=&gt;</CODE> <I>T</I> must have a parameter type <I>P</I> as
+its argument type. The normal form of an object of this type is a <B>table</B>
+<center>
+  <CODE>table {</CODE>  <I>V1</I> <CODE>=&gt;</CODE> <I>t1</I> <CODE>;</CODE> ...  <CODE>;</CODE> <I>Vm</I> <CODE>=&gt;</CODE> <I>tm</I> <CODE>}</CODE> 
+</center>
+which has a <B>branch</B> for every parameter value <I>Vi</I> of type <I>P</I>.
+A table can be given in many other ways by using pattern matching.
+</P>
+<P>
+Tables with only one branch are a common special case.
+GF provides syntactic sugar for writing one-branch tables concisely:
+</P>
+<PRE>
+    \\P,...,Q =&gt; t  ===  table {P =&gt; ... table {Q =&gt; t} ...}
+</PRE>
+<P></P>
+<A NAME="toc73"></A>
+<H3>Pattern matching</H3>
+<P>
+<a name="secmatching"></a>
+</P>
+<P>
+We will list all forms of patterns that can be used in table branches.
+the following are available for any parameter types, as well
+as for the types <CODE>Int</CODE> and <CODE>Str</CODE>
+</P>
+<UL>
+<LI>a constructor pattern <I>C P1 ... Pn</I> matches any value <I>C V1 ... Vn</I> where
+  each <I>Vi</I> matches <I>Pi</I>, 
+  and binds the union of all variables bound in the subpatterns <I>Pi</I>
+<LI>a record pattern 
+  <CODE>{</CODE> <I>r1</I> <CODE>=</CODE> <I>P1</I> <CODE>;</CODE> ... <CODE>;</CODE> <I>r1</I> <CODE>=</CODE> <I>P1</I> <CODE>}</CODE>
+  matches any record that has values of the corresponding fields.
+  and binds the union of all variables bound in the subpatterns <I>Pi</I>
+<LI>a variable pattern <I>x</I> 
+  (identifier other than constant parameter) matches any value, and
+  binds <I>x</I> to this value
+<LI>the wild card <CODE>_</CODE> matches any value
+<LI>a disjunctive pattern <I>P</I> <CODE>|</CODE> <I>Q</I> matches anything that 
+  either <I>P</I> or <I>Q</I> matches; bindings must be the same in both
+<LI>a negative pattern <CODE>-</CODE><I>P</I> matches anything that <I>P</I> does not match;
+  no bindings are returned
+<LI>an alias pattern <I>x</I> <CODE>@</CODE> <I>P</I> matches whatever value <I>P</I> matches and
+  binds <I>x</I> to this value; also the bindings in <I>P</I> are returned
+</UL>
+
+<P>
+The following patterns are only available for the type <CODE>Str</CODE>:
+</P>
+<UL>
+<LI>a string literal pattern, e.g. <CODE>"s"</CODE>, matches the same string 
+<LI>a concatenation pattern <I>P</I> <CODE>+</CODE> <I>Q</I> matches any string that consists
+  of a prefix matching <I>P</I> and a suffix matching <I>Q</I>;
+  the union of bindings is returned
+<LI>a repetition pattern <I>P</I><CODE>*</CODE> matches any string that can be decomposed
+  into strings that match <I>P</I>; no bindings are returned
+</UL>
+
+<P>
+The following pattern is only available for the types <CODE>Int</CODE> and <CODE>Ints</CODE> <I>n</I>:
+</P>
+<UL>
+<LI>an integer literal pattern, e.g. <CODE>214</CODE>, matches the same integer
+</UL>
+
+<P>
+Pattern matching is performed in the order in which the branches
+appear in the table: the branch of the first matching pattern is followed.
+The type checker reject sets of patterns that are not exhaustive, and
+warns for completely overshadowed patterns.
+To guarantee exhaustivity when the infinite types <CODE>Int</CODE> and <CODE>Str</CODE> are
+used as argument types, the last pattern must be a "catch-all" variable
+or wild card.
+</P>
+<P>
+It follows from the definition of record pattern matching 
+that it can utilize partial records: the branch
+</P>
+<PRE>
+    {g = Fem} =&gt; t
+</PRE>
+<P>
+in a table of type <CODE>{g : Gender ; n : Number} =&gt; T</CODE> means the same as
+</P>
+<PRE>
+    {g = Fem ; n = _} =&gt; t
+</PRE>
+<P>
+Variables in regular expression patterns
+are always bound to the <B>first match</B>, which is the first
+in the sequence of binding lists. For example:
+</P>
+<UL>
+<LI><CODE>x + "e" + y</CODE> matches <CODE>"peter"</CODE> with <CODE>x = "p", y = "ter"</CODE>
+<LI><CODE>x + "er"*</CODE> matches <CODE>"burgerer"</CODE> with ``x = "burg"
+</UL>
+
+<A NAME="toc74"></A>
+<H3>Overloading</H3>
+<P>
+Judgements of the <CODE>oper</CODE> form can introduce overloaded functions.
+The syntax is record-like, but all fields must have the same
+name and different types.
+</P>
+<PRE>
+    oper mkN = overload {
+      mkN : (dog : Str) -&gt; Noun = regNoun ;
+      mkN : (mouse,mice : Str) -&gt; Noun = mkNoun ;
+    }
+</PRE>
+<P>
+To give just the type of an overloaded operation, the record type
+syntax is used.
+</P>
+<PRE>
+    oper mkN : overload {
+      mkN : (dog : Str) -&gt; Noun ;         -- regular nouns
+      mkN : (mouse,mice : Str) -&gt; Noun ;  -- irregular nouns
+    }
+</PRE>
+<P>
+Overloading is not possible in other forms of judgement.
+</P>
+<A NAME="toc75"></A>
+<H3>Local definitions</H3>
+<P>
+Local definitions ("<CODE>let</CODE> expressions") can appear in groups:
+</P>
+<PRE>
+    oper regNoun : Str -&gt; Noun = \vino -&gt; 
+      let 
+        vin : Str = init vino ;
+        o   = last vino
+      in
+      ...
+</PRE>
+<P>
+The type can be omitted if it can be inferred. Later definitions may
+refer to earlier ones.
+</P>
+<A NAME="toc76"></A>
+<H3>Supplementary constructs</H3>
+<P>
+The rest of the GF language constructs are presented for the sake
+of completeness. They will not be used in the rest of this tutorial.
+</P>
+<H4>Record extension and subtyping</H4>
+<P>
+Record types and records can be <B>extended</B> with new fields. For instance,
+in German it is natural to see transitive verbs as verbs with a case, which
+is usually accusative or dative, and is passed to the object of the verb. 
+The symbol <CODE>**</CODE> is used for both record types and record objects.
+</P>
+<PRE>
+    lincat TV = Verb ** {c : Case} ;
+  
+    lin Follow = regVerb "folgen" ** {c = Dative} ; 
+</PRE>
+<P>
+To extend a record type or a record with a field whose label it
+already has is a type error. It is also an error to extend a type or
+object that is not a record.
+</P>
+<P>
+A record type <I>T</I> is a <B>subtype</B> of another one <I>R</I>, if <I>T</I> has
+all the fields of <I>R</I> and possibly other fields. For instance,
+an extension of a record type is always a subtype of it.
+If <I>T</I> is a subtype of <I>R</I>, then <I>R</I> is a <B>supertype</B> of <I>T</I>.
+</P>
+<P>
+If <I>T</I> is a subtype of <I>R</I>, an object of <I>T</I> can be used whenever
+an object of <I>R</I> is required. 
+For instance, a transitive verb can be used whenever a verb is required.
+</P>
+<P>
+<B>Covariance</B> means that a function returning a record <I>T</I> as value can
+also be used to return a value of a supertype <I>R</I>.
+<B>Contravariance</B> means that a function taking an <I>R</I> as argument
+can also be applied to any object of a subtype <I>T</I>.
+</P>
+<H4>Tuples and product types</H4>
+<P>
+Product types and tuples are syntactic sugar for record types and records:
+</P>
+<PRE>
+    T1 * ... * Tn   ===   {p1 : T1 ; ... ; pn : Tn}
+    &lt;t1, ...,  tn&gt;  ===   {p1 = T1 ; ... ; pn = Tn}
+</PRE>
+<P>
+Thus the labels <CODE>p1, p2,...</CODE> are hard-coded.
+As patterns, tuples are translated to record patterns in the
+same way as tuples to records; partial patterns make it
+possible to write, slightly surprisingly,
+</P>
+<PRE>
+    case &lt;g,n,p&gt; of {
+      &lt;Fem&gt; =&gt; t
+      ...
+      }
+</PRE>
+<P></P>
+<H4>Prefix-dependent choices</H4>
+<P>
+Sometimes a token has different forms depending on the token
+that follows. An example is the English indefinite article,
+which is <I>an</I> if a vowel follows, <I>a</I> otherwise.
+Which form is chosen can only be decided at run time, i.e.
+when a string is actually build. GF has a special construct for
+such tokens, the <CODE>pre</CODE> construct exemplified in
+</P>
+<PRE>
+    oper artIndef : Str = 
+      pre {"a" ; "an" / strs {"a" ; "e" ; "i" ; "o"}} ;
+</PRE>
+<P>
+Thus
+</P>
+<PRE>
+    artIndef ++ "cheese"  ---&gt;  "a" ++ "cheese"
+    artIndef ++ "apple"   ---&gt;  "an" ++ "apple"
+</PRE>
+<P>
+This very example does not work in all situations: the prefix
+<I>u</I> has no general rules, and some problematic words are
+<I>euphemism, one-eyed, n-gram</I>. Since the branches are matched in 
+order, it is possible to write
+</P>
+<PRE>
+    oper artIndef : Str = 
+      pre {"a" ; 
+           "a"  / strs {"eu" ; "one"} ;
+           "an" / strs {"a" ; "e" ; "i" ; "o" ; "n-"}
+          } ;
+</PRE>
+<P>
+Somewhat illogically, the default value is given as the first element in the list.
+</P>
+<P>
+<I>Prefix-dependent choice may be deprecated in GF version 3.</I>
+</P>
+<A NAME="toc77"></A>
+<H1>Using the resource grammar library</H1>
+<P>
+<a name="chapfive"></a>
+</P>
+<P>
+In this chapter, we will take a look at the GF resource grammar library.
+We will use the library to implement the <CODE>Foods</CODE> grammar of the 
+previous chapter
+and port it to some new languages. Some new concepts of GF's module system
+are also introduced, most notably the technique of <B>parametrized modules</B>,
+which has become an important "design pattern" for multilingual grammars.
+</P>
+<A NAME="toc78"></A>
+<H2>The coverage of the library</H2>
+<P>
+The GF Resource Grammar Library contains grammar rules for
+10 languages. In addition, 2 languages are available as yet incomplete
+implementations, and a few more are under construction. The purpose
+of the library is to define the low-level morphological and syntactic
+rules of languages, and thereby enable application programmers
+to concentrate on the semantic and stylistic
+aspects of their grammars. The guiding principle is that
+<center>
+grammar checking becomes type checking
+</center>
+that is, whatever is type-correct in the resource grammar is also
+grammatically correct. 
+</P>
+<P>
+The intended level of application grammarians
+is that of a skilled programmer with
+a practical knowledge of the target languages, but without
+theoretical knowledge about their grammars.
+Such a combination of
+skills is typical of programmers who, for instance, want to localize
+language software to new languages.
+</P>
+<P>
+The current resource languages are
+</P>
+<UL>
+<LI><CODE>Ara</CODE>bic (incomplete)
+<LI><CODE>Cat</CODE>alan (incomplete)
+<LI><CODE>Dan</CODE>ish
+<LI><CODE>Eng</CODE>lish
+<LI><CODE>Fin</CODE>nish
+<LI><CODE>Fre</CODE>nch
+<LI><CODE>Ger</CODE>man
+<LI><CODE>Ita</CODE>lian
+<LI><CODE>Nor</CODE>wegian
+<LI><CODE>Rus</CODE>sian
+<LI><CODE>Spa</CODE>nish
+<LI><CODE>Swe</CODE>dish
+</UL>
+
+<P>
+The first three letters (<CODE>Eng</CODE> etc) are used in grammar module names.
+We use the three-letter codes for languages from the ISO 639 standard.
+</P>
+<P>
+The incomplete Arabic and Catalan implementations are 
+sufficient for use in some applications; they both contain, amoung other 
+things, complete inflectional morphology.
+</P>
+<A NAME="toc79"></A>
+<H2>The structure of the library</H2>
+<P>
+<a name="seclexical"></a>
+</P>
+<A NAME="toc80"></A>
+<H3>Lexical vs. phrasal rules</H3>
+<P>
+So far we have looked at grammars from a semantic point of view:
+a grammar defines a system of meanings (specified in the abstract syntax) and
+tells how they are expressed in some language (as specified in the concrete syntax).
+In resource grammars, as often in the linguistic tradition, the goal is more modest:
+to specify the <B>grammatically correct combinations of words</B>, whatever their
+meanings are. With this more modest goal, it is possible to achieve a much
+wider coverage than with semantic grammars.
+</P>
+<P>
+Given the focus on <I>words</I> and their combinations, 
+the resource grammar has two kinds of categories and two kinds of rules:
+</P>
+<UL>
+<LI>lexical:
+  <UL>
+  <LI>lexical categories, to classify words
+  <LI>lexical rules, to define words and their properties
+  </UL>
+</UL>
+
+<UL>
+<LI>phrasal (combinatorial, syntactic):
+  <UL>
+  <LI>phrasal categories, to classify phrases of arbitrary size
+  <LI>phrasal rules, to combine phrases into larger phrases
+  </UL>
+</UL>
+
+<P>
+Some grammar formalisms make a formal distinction between 
+the lexical and syntactic
+components; sometimes it is necessary to use separate formalisms for these
+two kinds of rules. GF has no such restrictions. 
+Nevertheless, it has turned out
+to be a good discipline to maintain a distinction between 
+the lexical and syntactic components in the resource grammar. This fits
+also well with what is needed in applications: while syntactic structures
+are more or less the same across applications, vocabularies can be
+very different.
+</P>
+<A NAME="toc81"></A>
+<H3>Lexical categories</H3>
+<P>
+Within lexical categories, there is a further classification 
+into <B>closed</B> and <B>open</B> categories. The definining property 
+of closed categories is that the
+words in them can easily be enumerated; it is very seldom that any
+new words are introduced in them. In general, closed categories
+contain <B>structural words</B>, also known as <B>function words</B>.
+Examples of closed categories are
+</P>
+<PRE>
+    QuantSg ;  -- singular quantifier   e.g. "this"
+    QuantPl ;  -- plural quantifier     e.g. "those"
+    AdA ;      -- adadjective           e.g. "very"
+</PRE>
+<P>
+We have already used words of all these categories in the <CODE>Food</CODE>
+examples; they have just not been assigned a category, but
+treated as <B>syncategorematic</B>. In GF, a syncategoramatic
+word is one that is introduced in a linearization rule of
+some construction alongside with some other expressions that
+are combined; there is no abstract syntax tree for that word
+alone. Thus in the rules
+</P>
+<PRE>
+    fun That : Kind -&gt; Item ;
+    lin That k = {"that" ++ k.s} ;
+</PRE>
+<P>
+the word <I>that</I> is syncategoramatic. In linguistically motivated
+grammars, syncategorematic words are avoided, whereas in
+semantically motivated grammars, structural words are typically treated
+as syncategoramatic. This is partly so because the function expressed
+by a structural word in one language is often expressed by some other
+means than an individual word in another. For instance, the definite 
+article <I>the</I> is a determiner word in English, whereas Swedish expresses
+determination by inflecting the determined noun: <I>the wine</I> is <I>vinet</I>
+in Swedish.
+</P>
+<P>
+As for open categories, we will start with these two:
+</P>
+<PRE>
+    N ;    -- noun                  e.g. "pizza"
+    A ;    -- adjective             e.g. "good"
+</PRE>
+<P>
+Later in this chapter we will also need verbs of different kinds.
+</P>
+<P>
+<I>Note</I>. Having adadjectives as a closed category is not quite right, because
+one can form adadjectives from adjectives: <I>incredibly warm</I>.
+</P>
+<A NAME="toc82"></A>
+<H3>Lexical rules</H3>
+<P>
+The words of closed categories can be listed once and for all in a 
+library. In the first example, the <CODE>Foods</CODE> grammar of the previous section,
+we will use the following structural words from the <CODE>Syntax</CODE> module:
+</P>
+<PRE>
+    this_QuantSg, that_QuantSg : QuantSg ; 
+    these_QuantPl, those_QuantPl : QuantPl ; 
+    very_AdA  : AdA ;
+</PRE>
+<P>
+The naming convention for lexical rules is that we use a word followed by
+the category. In this way we can for instance distinguish the quantifier
+<I>that</I> from the conjunction <I>that</I>. 
+</P>
+<P>
+Open lexical categories have no objects in <CODE>Syntax</CODE>. Such objects
+will be built as they are needed in applications. The abstract
+syntax of words in applications is already familiar, e.g.
+</P>
+<PRE>
+    fun Wine : Kind ;
+</PRE>
+<P>
+The concrete syntax can be given directly, e.g.
+</P>
+<PRE>
+    lin Wine = mkN "wine" ;
+</PRE>
+<P>
+by using the morphological paradigm library <CODE>ParadigmsEng</CODE>.
+However, there are some advantages in giving the concrete syntax 
+indirectly, via the creation of a <B>resource lexicon</B>. In this lexicon,
+there will be entries such as
+</P>
+<PRE>
+    oper wine_N : N = mkN "wine" ;
+</PRE>
+<P>
+which can then be used in the linearization rules,
+</P>
+<PRE>
+    lin Wine = wine_N ;
+</PRE>
+<P>
+One advantage of this indirect method is that each new word gives
+an addition to a reusable resource lexicon, instead of just doing
+the job of implementing the application. Another advantage will
+be shown <a href="#secfunctor">here</a>: the possibility to write functors over
+lexicon interfaces.
+</P>
+<A NAME="toc83"></A>
+<H3>Phrasal categories</H3>
+<P>
+There are just four phrasal categories needed in the first application: 
+</P>
+<PRE>
+    Cl ;   -- clause             e.g. "this pizza is good"
+    NP ;   -- noun phrase        e.g. "this pizza"
+    CN ;   -- common noun        e.g. "warm pizza"
+    AP ;   -- adjectival phrase  e.g. "very warm"
+</PRE>
+<P>
+Clauses are, roughly, the same as declarative sentences; we will
+define in <a href="#secextended">here</a> a sentence <CODE>S</CODE> as a clause that has a fixed tense.
+The distinction between common nouns and noun phrases is that common nouns
+cannot generally be used alone as subjects (?<I>dog sleeps</I>), 
+whereas noun phrases can (<I>the dog sleeps</I>).
+Noun phrases can be built from common nouns by adding determiners,
+such as quantifiers; but there are also other kinds of noun phrases, e.g.
+pronouns.
+</P>
+<P>
+The syntactic combinations we need are the following:
+</P>
+<PRE>
+    mkCl : NP -&gt; AP -&gt; Cl ;      -- e.g. "this pizza is very warm"
+    mkNP : QuantSg -&gt; CN -&gt; NP ; -- e.g. "this pizza" 
+    mkNP : QuantPl -&gt; CN -&gt; NP ; -- e.g. "these pizzas"
+    mkCN : AP -&gt; CN -&gt; CN ;      -- e.g. "warm pizza"
+    mkAP : AdA -&gt; AP -&gt; AP ;     -- e.g. "very warm" 
+</PRE>
+<P>
+To start building phrases, we need rules of <B>lexical insertion</B>, which
+form phrases from single words:
+</P>
+<PRE>
+    mkCN : N -&gt; NP ;
+    mkAP : A -&gt; AP ;
+</PRE>
+<P>
+Notice that all (or, as many as possible) operations in the resource library
+have the name <CODE>mk</CODE><I>C</I>, where <I>C</I> is the value category of the operation.
+This means of course heavy overloading. For instance, the current library
+(version 1.2) has no less than 23 operations named <CODE>mkNP</CODE>!
+</P>
+<P>
+Now the sentence
+<center>
+<I>these very warm pizzas are Italian</I>
+</center>
+can be built as follows:
+</P>
+<PRE>
+    mkCl 
+      (mkNP these_QuantPl 
+         (mkCN (mkAP very_AdA (mkAP warm_A)) (mkCN pizza_CN)))
+      (mkAP italian_AP) 
+</PRE>
+<P>
+The task we are facing now is to define the concrete syntax of <CODE>Foods</CODE> so that
+this syntactic tree gives the value of linearizing the semantic tree
+</P>
+<PRE>
+    Is (These (QKind (Very Warm) Pizza)) Italian
+</PRE>
+<P></P>
+<A NAME="toc84"></A>
+<H2>The resource API</H2>
+<P>
+The resource library API is divided into language-specific 
+and language-independent parts. To put it roughly,
+</P>
+<UL>
+<LI>the syntax API is language-independent, i.e. has the same types and 
+  functions for all languages.
+  Its name is <CODE>Syntax</CODE><I>L</I> for each language <I>L</I>
+<LI>the morphology API is language-specific, i.e. has partly 
+  different types and functions
+  for different languages.
+  Its name is <CODE>Paradigms</CODE><I>L</I> for each language <I>L</I>
+</UL>
+
+<P>
+A full documentation of the API is available on-line in the
+<B>resource synopsis</B>. 
+For the examples of this chapter, we will only need a 
+fragment of the full API. The fragment needed for <CODE>Foods</CODE> has
+already been introduced, but let us summarize the descriptions
+by giving tables of the same form as used in the resource synopsis.
+</P>
+<P>
+Thus we will make use of the following categories from the module <CODE>Syntax</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Category</TH>
+<TH>Explanation</TH>
+<TH COLSPAN="2">Example</TH>
+</TR>
+<TR>
+<TD><CODE>Cl</CODE></TD>
+<TD>clause (sentence), with all tenses</TD>
+<TD><I>she looks at this</I></TD>
+</TR>
+<TR>
+<TD><CODE>AP</CODE></TD>
+<TD>adjectival phrase</TD>
+<TD><I>very warm</I></TD>
+</TR>
+<TR>
+<TD><CODE>CN</CODE></TD>
+<TD>common noun (without determiner)</TD>
+<TD><I>red house</I></TD>
+</TR>
+<TR>
+<TD><CODE>NP</CODE></TD>
+<TD>noun phrase (subject or object)</TD>
+<TD><I>the red house</I></TD>
+</TR>
+<TR>
+<TD><CODE>AdA</CODE></TD>
+<TD>adjective-modifying adverb,</TD>
+<TD><I>very</I></TD>
+</TR>
+<TR>
+<TD><CODE>QuantSg</CODE></TD>
+<TD>singular quantifier</TD>
+<TD><I>these</I></TD>
+</TR>
+<TR>
+<TD><CODE>QuantPl</CODE></TD>
+<TD>plural quantifier</TD>
+<TD><I>these</I></TD>
+</TR>
+<TR>
+<TD><CODE>A</CODE></TD>
+<TD>one-place adjective</TD>
+<TD><I>warm</I></TD>
+</TR>
+<TR>
+<TD><CODE>N</CODE></TD>
+<TD>common noun</TD>
+<TD><I>house</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+We will use the following syntax rules from <CODE>Syntax</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH>Type</TH>
+<TH COLSPAN="2">Example</TH>
+</TR>
+<TR>
+<TD><CODE>mkCl</CODE></TD>
+<TD><CODE>NP -&gt; AP -&gt; Cl</CODE></TD>
+<TD><I>John is very old</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkNP</CODE></TD>
+<TD><CODE>QuantSg -&gt; CN -&gt; NP</CODE></TD>
+<TD><I>this old man</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkNP</CODE></TD>
+<TD><CODE>QuantPl -&gt; CN -&gt; NP</CODE></TD>
+<TD><I>these old man</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkCN</CODE></TD>
+<TD><CODE>N -&gt; CN</CODE></TD>
+<TD><I>house</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkCN</CODE></TD>
+<TD><CODE>AP -&gt; CN -&gt; CN</CODE></TD>
+<TD><I>very big blue house</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkAP</CODE></TD>
+<TD><CODE>A -&gt; AP</CODE></TD>
+<TD><I>old</I></TD>
+</TR>
+<TR>
+<TD><CODE>mkAP</CODE></TD>
+<TD><CODE>AdA -&gt; AP -&gt; AP</CODE></TD>
+<TD><I>very very old</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+We will use the following structural words from <CODE>Syntax</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH>Type</TH>
+<TH COLSPAN="2">In English</TH>
+</TR>
+<TR>
+<TD><CODE>this_QuantSg</CODE></TD>
+<TD><CODE>QuantSg</CODE></TD>
+<TD><I>this</I></TD>
+</TR>
+<TR>
+<TD><CODE>that_QuantSg</CODE></TD>
+<TD><CODE>QuantSg</CODE></TD>
+<TD><I>that</I></TD>
+</TR>
+<TR>
+<TD><CODE>these_QuantPl</CODE></TD>
+<TD><CODE>QuantPl</CODE></TD>
+<TD><I>this</I></TD>
+</TR>
+<TR>
+<TD><CODE>those_QuantPl</CODE></TD>
+<TD><CODE>QuantPl</CODE></TD>
+<TD><I>that</I></TD>
+</TR>
+<TR>
+<TD><CODE>very_AdA</CODE></TD>
+<TD><CODE>AdA</CODE></TD>
+<TD><I>very</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+For English, we will use the following part of <CODE>ParadigmsEng</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH COLSPAN="2">Type</TH>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(dog : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(man,men : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(cold : Str) -&gt; A</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+For Italian, we need just the following part of <CODE>ParadigmsIta</CODE>
+(Exercise). The "smart" paradigms will take care of variations
+such as <I>formaggio</I>-<I>formaggi</I>, and also infer the genders
+correctly.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH COLSPAN="2">Type</TH>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(vino : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(caro : Str) -&gt; A</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+For German, we will use the following part of <CODE>ParadigmsGer</CODE>.
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH COLSPAN="2">Type</TH>
+</TR>
+<TR>
+<TD><CODE>Gender</CODE></TD>
+<TD><CODE>Type</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>masculine</CODE></TD>
+<TD><CODE>Gender</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>feminine</CODE></TD>
+<TD><CODE>Gender</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>neuter</CODE></TD>
+<TD><CODE>Gender</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(Stufe : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(Bild,Bilder : Str) -&gt; Gender -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(klein : Str) -&gt; A</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(gut,besser,beste : Str) -&gt; A</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+For Finnish, we only need the smart regular paradigms:
+</P>
+<TABLE CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Function</TH>
+<TH COLSPAN="2">Type</TH>
+</TR>
+<TR>
+<TD><CODE>mkN</CODE></TD>
+<TD><CODE>(talo : Str) -&gt; N</CODE></TD>
+</TR>
+<TR>
+<TD><CODE>mkA</CODE></TD>
+<TD><CODE>(hieno : Str) -&gt; A</CODE></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+<B>Exercise</B>. Try out the morphological paradigms in different languages. Do 
+as follows:
+</P>
+<PRE>
+    &gt; i -path=alltenses:prelude -retain alltenses/ParadigmsGer.gfr
+    &gt; cc mkN "Farbe"
+    &gt; cc mkA "gut" "besser" "beste"
+</PRE>
+<P></P>
+<A NAME="toc85"></A>
+<H2>Example: English</H2>
+<P>
+<a name="secenglish"></a>
+</P>
+<P>
+We work with the abstract syntax <CODE>Foods</CODE> from <a href="#chaptwo">the fourth chapter</a>, and
+build first an English implementation. Now we can do it without
+thinking about inflection and agreement, by just picking appropriate
+functions from the resource grammar library.
+</P>
+<P>
+The concrete syntax opens <CODE>SyntaxEng</CODE> and <CODE>ParadigmsEng</CODE>
+to get access to the resource libraries needed. In order to find
+the libraries, a <CODE>path</CODE> directive is prepended. It contains
+two resource subdirectories --- <CODE>present</CODE> and <CODE>prelude</CODE> --- 
+which are found relative to the environment variable <CODE>GF_LIB_PATH</CODE>.
+It also contains the current directory <CODE>.</CODE> and the directory <CODE>../foods</CODE>,
+in which <CODE>Foods.gf</CODE> resides.
+</P>
+<PRE>
+    --# -path=.:../foods:present:prelude
+  
+    concrete FoodsEng of Foods = open SyntaxEng,ParadigmsEng in {
+</PRE>
+<P>
+As linearization types, we will use clauses for <CODE>Phrase</CODE>, noun phrases
+for <CODE>Item</CODE>, common nouns for <CODE>Kind</CODE>, and adjectival phrases for <CODE>Quality</CODE>.
+</P>
+<PRE>
+    lincat
+      Phrase = Cl ; 
+      Item = NP ;
+      Kind = CN ;
+      Quality = AP ;
+</PRE>
+<P>
+These types fit perfectly with the way we have used the categories 
+in the application; hence
+the combination rules we need almost write themselves automatically:
+</P>
+<PRE>
+    lin
+      Is item quality = mkCl item quality ;
+      This kind = mkNP this_QuantSg kind ;
+      That kind = mkNP that_QuantSg kind ;
+      These kind = mkNP these_QuantPl kind ;
+      Those kind = mkNP those_QuantPl kind ;
+      QKind quality kind = mkCN quality kind ;
+      Very quality = mkAP very_AdA quality ;
+</PRE>
+<P>
+We write the lexical part of the grammar by using resource paradigms directly.
+Notice that we have to apply the lexical insertion rules to get type-correct
+linearizations. Notice also that we need to use the two-place noun paradigm for
+<I>fish</I>, but everythins else is regular.
+</P>
+<PRE>
+      Wine = mkCN (mkN "wine") ;
+      Pizza = mkCN (mkN "pizza") ;
+      Cheese = mkCN (mkN "cheese") ;
+      Fish = mkCN (mkN "fish" "fish") ;
+      Fresh = mkAP (mkA "fresh") ;
+      Warm = mkAP (mkA "warm") ;
+      Italian = mkAP (mkA "Italian") ;
+      Expensive = mkAP (mkA "expensive") ;
+      Delicious = mkAP (mkA "delicious") ;
+      Boring = mkAP (mkA "boring") ;
+    }
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Compile the grammar <CODE>FoodsEng</CODE> and generate 
+and parse some sentences.
+</P>
+<P>
+<B>Exercise</B>. Write a concrete syntax of <CODE>Foods</CODE> for Italian 
+or some other language included in the resource library. You can 
+compare the results with the hand-written
+grammars presented earlier in this tutorial.
+</P>
+<A NAME="toc86"></A>
+<H2>Functor implementation of multilingual grammars</H2>
+<P>
+<a name="secfunctor"></a>
+</P>
+<P>
+If you did the exercise of writing  a concrete syntax of <CODE>Foods</CODE> for some other
+language, you probably noticed that much of the code looks exactly the same
+as for English. The reason for this is that the <CODE>Syntax</CODE> API is the
+same for all languages. This is in turn possible because
+all languages (at least those in the resource package) 
+implement the same syntactic structures. Moreover, languages tend to use the
+syntactic structures in similar ways, even though this is not exceptionless.
+But usually, it is only the lexical parts of a concrete syntax that
+we need to write anew for a new language. Thus, to port a grammar to
+a new language, you
+</P>
+<OL>
+<LI>copy the concrete syntax of a given language
+<LI>change the words (strings and inflection paradigms)
+</OL>
+
+<P>
+Now, programming by copy-and-paste is not worthy of a functional programmer!
+So, can we write a <I>function</I> that takes care of the shared parts of grammar modules?
+Yes, we can. It is not a function in the <CODE>fun</CODE> or <CODE>oper</CODE> sense, but 
+a function operating on modules, called a <B>functor</B>. This construct
+is familiar from the functional programming 
+languages ML and OCaml, but it does not
+exist in Haskell. It also bears some resemblance to templates in C++.
+Functors are also known as <B>parametrized modules</B>.
+</P>
+<P>
+In GF, a functor is a module that <CODE>open</CODE>s one or more <B>interfaces</B>.
+An <CODE>interface</CODE> is a module similar to a <CODE>resource</CODE>, but it only
+contains the <I>types</I> of <CODE>oper</CODE>s, not their definitions. You can think
+of an interface as a kind of a record type. The <CODE>oper</CODE> names are the
+labels of this record type. The corresponding <I>record</I> is called an
+<B>instance</B> of the interface.
+Thus a functor is a module-level function taking instances as
+arguments and producing modules as values.
+</P>
+<P>
+Let us now write a functor implementation of the <CODE>Food</CODE> grammar.
+Consider its module header first:
+</P>
+<PRE>
+    incomplete concrete FoodsI of Foods = open Syntax, LexFoods in
+</PRE>
+<P>
+A functor is distinguished from an ordinary module by the leading
+keyword <CODE>incomplete</CODE>.
+</P>
+<P>
+In the functor-function analogy, <CODE>FoodsI</CODE> would be presented as a function
+with the following type signature:
+</P>
+<PRE>
+    FoodsI : 
+      instance of Syntax -&gt; instance of LexFoods -&gt; concrete of Foods
+</PRE>
+<P>
+It takes as arguments instances of two interfaces: 
+</P>
+<UL>
+<LI><CODE>Syntax</CODE>, the resource grammar interface
+<LI><CODE>LexFoods</CODE>, the domain-specific lexicon interface
+</UL>
+
+<P>
+Functors opening <CODE>Syntax</CODE> and a domain lexicon interface are in fact
+so typical in GF applications, that this structure could be called 
+a <B>design pattern</B>
+for GF grammars. What makes this pattern so useful is, again, that 
+languages tend to use the same syntactic structures and only differ in words.
+</P>
+<P>
+We will show the exact syntax of interfaces and instances in next Section.
+Here it is enough to know that we have
+</P>
+<UL>
+<LI><CODE>SyntaxGer</CODE>, an instance of <CODE>Syntax</CODE>
+<LI><CODE>LexFoodsGer</CODE>, an instance of <CODE>LexFoods</CODE>
+</UL>
+
+<P>
+Then we can complete the German implementation by "applying" the functor:
+</P>
+<PRE>
+    FoodI SyntaxGer LexFoodsGer : concrete of Foods
+</PRE>
+<P>
+The GF syntax for doing so is
+</P>
+<PRE>
+    concrete FoodsGer of Foods = FoodsI with 
+      (Syntax = SyntaxGer),
+      (LexFoods = LexFoodsGer) ;
+</PRE>
+<P>
+Notice that this is the <I>whole</I> module, not just a header of it.
+The module body is received from <CODE>FoodsI</CODE>, by instantiating the
+interface constants with their definitions given in the German
+instances. A module of this form, characterized by the keyword <CODE>with</CODE>, is
+called a <B>functor instantiation</B>.
+</P>
+<P>
+Here is the complete code for the functor <CODE>FoodsI</CODE>:
+</P>
+<PRE>
+    --# -path=.:../foods:present:prelude
+  
+    incomplete concrete FoodsI of Foods = open Syntax, LexFoods in {
+    lincat
+      Phrase = Cl ; 
+      Item = NP ;
+      Kind = CN ;
+      Quality = AP ;
+    lin
+      Is item quality = mkCl item quality ;
+      This kind = mkNP this_QuantSg kind ;
+      That kind = mkNP that_QuantSg kind ;
+      These kind = mkNP these_QuantPl kind ;
+      Those kind = mkNP those_QuantPl kind ;
+      QKind quality kind = mkCN quality kind ;
+      Very quality = mkAP very_AdA quality ;
+  
+      Wine = mkCN wine_N ;
+      Pizza = mkCN pizza_N ;
+      Cheese = mkCN cheese_N ;
+      Fish = mkCN fish_N ;
+      Fresh = mkAP fresh_A ;
+      Warm = mkAP warm_A ;
+      Italian = mkAP italian_A ;
+      Expensive = mkAP expensive_A ;
+      Delicious = mkAP delicious_A ;
+      Boring = mkAP boring_A ;
+    }
+</PRE>
+<P></P>
+<A NAME="toc87"></A>
+<H2>Interfaces and instances</H2>
+<P>
+<a name="secinterface"></a>
+</P>
+<P>
+Let us now define the <CODE>LexFoods</CODE> interface:
+</P>
+<PRE>
+    interface LexFoods = open Syntax in {
+    oper
+      wine_N : N ;
+      pizza_N : N ;
+      cheese_N : N ;
+      fish_N : N ;
+      fresh_A : A ;
+      warm_A : A ;
+      italian_A : A ;
+      expensive_A : A ;
+      delicious_A : A ;
+      boring_A : A ;
+    }
+</PRE>
+<P>
+In this interface, only lexical items are declared. In general, an
+interface can declare any functions and also types. The <CODE>Syntax</CODE>
+interface does so.
+</P>
+<P>
+Here is a German instance of the interface.
+</P>
+<PRE>
+    instance LexFoodsGer of LexFoods = open SyntaxGer, ParadigmsGer in {
+    oper
+      wine_N = mkN "Wein" ;
+      pizza_N = mkN "Pizza" "Pizzen" feminine ;
+      cheese_N = mkN "K�se" "K�sen" masculine ;
+      fish_N = mkN "Fisch" ;
+      fresh_A = mkA "frisch" ;
+      warm_A = mkA "warm" "w�rmer" "w�rmste" ;
+      italian_A = mkA "italienisch" ;
+      expensive_A = mkA "teuer" ;
+      delicious_A = mkA "k�stlich" ;
+      boring_A = mkA "langweilig" ;
+    }
+</PRE>
+<P>
+Notice that when an interface opens an interface, such as <CODE>Syntax</CODE>, 
+here, then its instance has to open an instance of it. But the instance 
+may also open some other resources --- very typically, like here,
+a domain lexicon instance opens a <CODE>Paradigms</CODE> module.
+</P>
+<P>
+Just to complete the picture, we repeat the German functor instantiation
+for <CODE>FoodsI</CODE>, this time with a path directive that makes it compilable.
+</P>
+<PRE>
+    --# -path=.:../foods:present:prelude
+  
+    concrete FoodsGer of Foods = FoodsI with 
+      (Syntax = SyntaxGer),
+      (LexFoods = LexFoodsGer) ;
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Compile and test <CODE>FoodsGer</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Refactor <CODE>FoodsEng</CODE> into a functor instantiation.
+</P>
+<A NAME="toc88"></A>
+<H2>Adding languages to a functor implementation</H2>
+<P>
+Once we have an application grammar defined by using a functor,
+adding a new language is simple. Just two modules need to be written:
+</P>
+<UL>
+<LI>a domain lexicon instance
+<LI>a functor instantiation
+</UL>
+
+<P>
+The functor instantiation is completely mechanical to write.
+Here is one for Finnish:
+</P>
+<PRE>
+    --# -path=.:../foods:present:prelude
+  
+    concrete FoodsFin of Foods = FoodsI with 
+      (Syntax = SyntaxFin),
+      (LexFoods = LexFoodsFin) ;
+</PRE>
+<P>
+The domain lexicon instance requires some knowledge of the words of the
+language: what words are used for which concepts, how the words are
+inflected, plus features such as genders. Here is a lexicon instance for
+Finnish:
+</P>
+<PRE>
+    instance LexFoodsFin of LexFoods = open SyntaxFin, ParadigmsFin in {
+    oper
+      wine_N = mkN "viini" ;
+      pizza_N = mkN "pizza" ;
+      cheese_N = mkN "juusto" ;
+      fish_N = mkN "kala" ;
+      fresh_A = mkA "tuore" ;
+      warm_A = mkA "l�mmin" ;
+      italian_A = mkA "italialainen" ;
+      expensive_A = mkA "kallis" ;
+      delicious_A = mkA "herkullinen" ;
+      boring_A = mkA "tyls�" ;
+    }
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Instantiate the functor <CODE>FoodsI</CODE> to some language of
+your choice.
+</P>
+<A NAME="toc89"></A>
+<H2>Division of labour revisited</H2>
+<P>
+One purpose with the resource grammars was stated to be a division
+of labour between linguists and application grammarians. We can now
+reflect on what this means more precisely, by asking ourselves what
+skills are required of grammarians working on different components.
+</P>
+<P>
+Building a GF application starts from the abstract syntax. Writing
+an abstract syntax requires
+</P>
+<UL>
+<LI>understanding of the semantic structure of the application domain
+<LI>knowledge of the GF fragment with categories and functions
+</UL>
+
+<P>
+If the concrete syntax is written by using a functor, the programmer
+has to decide what parts of the implementation are put to the interface
+and what parts are shared in the functor. This requires
+</P>
+<UL>
+<LI>knowing how the domain concepts are expressed in natural language
+<LI>knowledge of the resource grammar library --- the categories and combinators
+<LI>understanding what parts are likely to be expressed in language-dependent
+  ways, so that they are put to an interface and not the functor
+<LI>knowledge of the GF fragment with function applications and strings
+</UL>
+
+<P>
+Instantiating a ready-made functor to a new language is less demanding.
+It requires essentially
+</P>
+<UL>
+<LI>knowing how the domain words are expressed in the language
+<LI>knowing, roughly, how these words are inflected
+<LI>knowledge of the paradigms available in the library
+<LI>knowledge of the GF fragment with function applications and strings
+</UL>
+
+<P>
+Notice that none of these tasks requires the use of GF records, tables,
+or parameters. Thus only a small fragment of GF is needed; the rest of
+GF is only relevant for those who write the libraries. Essentially,
+all the machinery introduced in <a href="#chaptwo">the fourth chapter</a> is unnecessary!
+</P>
+<P>
+Of course, grammar writing is not always just straightforward usage of libraries.
+For example, GF can be used for other languages than just those in the
+libraries --- for both natural and formal languages. A knowledge of records
+and tables can, unfortunately, also be needed for understanding GF's error 
+messages.
+</P>
+<P>
+<B>Exercise</B>. Design a small grammar that can be used for controlling
+an MP3 player. The grammar should be able to recognize commands such
+as <I>play this song</I>, with the following variations:
+</P>
+<UL>
+<LI>verbs: <I>play</I>, <I>remove</I>
+<LI>objects: <I>song</I>, <I>artist</I>
+<LI>determiners: <I>this</I>, <I>the previous</I>
+<LI>verbs without arguments: <I>stop</I>, <I>pause</I>
+</UL>
+
+<P>
+The implementation goes in the following phases:
+</P>
+<OL>
+<LI>abstract syntax
+<LI>functor and lexicon interface
+<LI>lexicon instance for the first language
+<LI>functor instantiation for the first language
+<LI>lexicon instance for the second language
+<LI>functor instantiation for the second language
+<LI>...
+</OL>
+
+<A NAME="toc90"></A>
+<H2>Restricted inheritance</H2>
+<P>
+A functor implementation using the resource <CODE>Syntax</CODE> interface
+works well as long as all concepts are expressed by using the same structures
+in all languages. If this is not the case, the deviant linearization can
+be made into a parameter and moved to the domain lexicon interface.
+</P>
+<P>
+The <CODE>Foods</CODE> grammar works so well that we have to  
+take a contrived example: assume that English has
+no word for <CODE>Pizza</CODE>, but has to use the paraphrase <I>Italian pie</I>.
+This paraphrase is no longer a noun <CODE>N</CODE>, but a complex phrase
+in the category <CODE>CN</CODE>. An obvious way to solve this problem is
+to change interface <CODE>LexFoods</CODE> so that the constant declared for
+<CODE>Pizza</CODE> gets a new type:
+</P>
+<PRE>
+    oper pizza_CN : CN ;
+</PRE>
+<P>
+But this solution is unstable: we may end up changing the interface
+and the function with each new language, and we must every time also
+change the interface instances for the old languages to maintain
+type correctness.
+</P>
+<P>
+A better solution is to use <B>restricted inheritance</B>: the English 
+instantiation inherits the functor implementation except for the 
+constant <CODE>Pizza</CODE>. This is how we write:
+</P>
+<PRE>
+    --# -path=.:../foods:present:prelude
+  
+    concrete FoodsEng of Foods = FoodsI - [Pizza] with 
+      (Syntax = SyntaxEng),
+      (LexFoods = LexFoodsEng) ** 
+        open SyntaxEng, ParadigmsEng in {
+  
+      lin Pizza = mkCN (mkA "Italian") (mkN "pie") ;
+    }
+</PRE>
+<P>
+Restricted inheritance is available for all inherited modules. One can for
+instance exclude some mushrooms and pick up just some fruit in 
+the <CODE>FoodMarket</CODE> example "Rsecarchitecture:
+</P>
+<PRE>
+    abstract Foodmarket = Food, Fruit [Peach], Mushroom - [Agaric]
+</PRE>
+<P>
+A concrete syntax of <CODE>Foodmarket</CODE> must then have the same inheritance
+restrictions, in order to be well-typed with respect to the abstract syntax.
+</P>
+<A NAME="toc91"></A>
+<H2>Grammar reuse</H2>
+<P>
+The alert reader has certainly noticed an analogy between <CODE>abstract</CODE>
+and <CODE>concrete</CODE>, on the one hand, and <CODE>interface</CODE> and <CODE>instance</CODE>,
+on the other. Why are these two pairs of module types kept separate
+at all? There is, in fact, a very close correspondence between
+judgements in the two kinds of modules:
+</P>
+<PRE>
+    cat C         &lt;---&gt;  oper C : Type
+  
+    fun f : A     &lt;---&gt;  oper f : A
+  
+    lincat C = T  &lt;---&gt;  oper C : Type = T
+  
+    lin f = t     &lt;---&gt;  oper f : A = t
+</PRE>
+<P>
+But there are also some differences:
+</P>
+<UL>
+<LI><CODE>abstract</CODE> and <CODE>concrete</CODE> modules define <B>top-level grammars</B>, i.e.
+  grammars that can be used for parsing and linearization; this is because
+<LI>the types and terms in <CODE>concrete</CODE> modules are restricted to a subset
+  of those available in <CODE>interface</CODE>, <CODE>instance</CODE>, and <CODE>resource</CODE>
+<LI><CODE>param</CODE> judgements have no counterparts in top-level grammars
+</UL>
+
+<P>
+The term that can be used for interfaces, instances, and resources is
+<B>resource-level grammars</B>.
+From these explanations and the above translations it follows that top-level 
+grammars are, in a sense, a special case of resource-level grammars. 
+</P>
+<P>
+Thus, indeed, abstract syntax modules can be used like interfaces, and concrete syntaxes
+as their instances. The use of top-level grammars as resources 
+is called <B>grammar reuse</B>. Whether a library module is a top-level or a
+resource-level module is mostly invisible to application programmers 
+(see the Summary <a href="#seclock">here</a>
+for an exception to this).  The GF resource grammar
+library itself is in fact built in two layers:
+</P>
+<UL>
+<LI>the <B>ground resource</B>: a set of top-level grammars for syntactic structures 
+<LI>the <B>surface resource</B>: a resource-level grammar with overloaded operations
+  defined in terms of the ground resource
+</UL>
+
+<P>
+Both the ground 
+resource and the surface resource can be used by application programmers,
+but it is the surface resource that we use in this book. Because of overloading,
+it has much fewer function names and also flatter trees. For instance, the clause
+<center>
+<I>these very warm pizzas are Italian</I>
+</center>
+which in the surface resource can be built as
+</P>
+<PRE>
+    mkCl 
+      (mkNP these_QuantPl 
+        (mkCN (mkAP very_AdA (mkAP warm_A)) (mkCN pizza_CN)))
+      (mkAP italian_AP) 
+</PRE>
+<P>
+has in the ground resource the much more complex tree
+</P>
+<PRE>
+    PredVP 
+      (DetCN (DetPl (PlQuant this_Quant) NoNum NoOrd) 
+        (AdjCN (AdAP very_AdA (PositA warm_A)) (UseN pizza_N))) 
+      (UseComp (CompAP (PositA italian_A)))
+</PRE>
+<P>
+The main advantage of using the ground resource is that the trees can then be found
+by using the parser, as shown in the next section. Otherwise, the overloaded surface
+resource constants are much easier to use.
+</P>
+<P>
+Needless to say, once a library has been defined in some way, it is easy to
+build layers of <B>derived libraries</B> on top of it, by using grammar reuse
+and, in the case of multilingual libraries, functors. This is indeed how
+the surface resource has been implemented: as a functored parametrized on 
+the abstract syntax of the ground resource.
+</P>
+<A NAME="toc92"></A>
+<H2>Browsing the resource with GF commands</H2>
+<P>
+<a name="secbrowsing"></a>
+</P>
+<P>
+In addition to reading the 
+<A HREF="../../lib/resource-1.0/synopsis.html">resource synopsis</A>, you
+can find resource function combinations by using the parser. This
+is so because the resource library is in the end implemented as
+a top-level <CODE>abstract-concrete</CODE> grammar, on which parsing
+and linearization work.
+</P>
+<P>
+Unfortunately, currently (GF 2.8) 
+only English and the Scandinavian languages can be
+parsed within acceptable computer resource limits when the full
+resource is used. 
+</P>
+<P>
+To look for a syntax tree in the overload API by parsing, do like this:
+</P>
+<PRE>
+    % gf -path=alltenses:prelude $GF_LIB_PATH/alltenses/OverLangEng.gfc
+  
+    &gt; p -cat=S -overload "this grammar is too big"
+    mkS (mkCl (mkNP this_QuantSg grammar_N) (mkAP too_AdA big_A))
+</PRE>
+<P>
+The <CODE>-overload</CODE> option given to the parser is a directive to find the
+shallowest overloaded term that matches the parse tree.
+</P>
+<P>
+To view linearizations in all languages by parsing from English:
+</P>
+<PRE>
+    % gf $GF_LIB_PATH/alltenses/langs.gfcm
+  
+    &gt; p -cat=S -lang=LangEng "this grammar is too big" | tb
+    UseCl TPres ASimul PPos (PredVP (DetCN (DetSg (SgQuant this_Quant) 
+      NoOrd) (UseN grammar_N)) (UseComp (CompAP (AdAP too_AdA (PositA big_A)))))
+    Den h�r grammatiken �r f�r stor
+    Esta gram�tica es demasiado grande
+    (Cyrillic: eta grammatika govorit des'at' jazykov)
+    Denne grammatikken er for stor
+    Questa grammatica � troppo grande
+    Diese Grammatik ist zu gro�
+    Cette grammaire est trop grande
+    T�m� kielioppi on liian suuri
+    This grammar is too big
+    Denne grammatik er for stor
+</PRE>
+<P>
+This method shows the unambiguous ground resource functions and not
+the overloaded ones. It uses a precompiled grammar package of the GFCM or GFCC
+format; see <a href="#chapeight">the eighth chapter</a> for more information on this.
+</P>
+<P>
+Unfortunately, the Russian grammar uses at the moment a different
+character encoding than the rest and is therefore not displayed correctly
+in a terminal window. However, the GF syntax editor does display all
+examples correctly --- again, using the ground resource:
+</P>
+<PRE>
+    % gfeditor $GF_LIB_PATH/alltenses/langs.gfcm
+</PRE>
+<P>
+When you have constructed the tree, you will see the following screen:
+</P>
+<P>
+<center>
+</P>
+<P>
+ <IMG ALIGN="right" SRC="10lang-small.png" BORDER="0" ALT="">
+</P>
+<P>
+</center>
+</P>
+<P>
+<B>Exercise</B>. Find the resource grammar translations for the following
+English phrases (parse in the category <CODE>Phr</CODE>). You can first try to
+build the terms manually.
+</P>
+<P>
+<I>every man loves a woman</I>
+</P>
+<P>
+<I>this grammar speaks more than ten languages</I>
+</P>
+<P>
+<I>which languages aren't in the grammar</I>
+</P>
+<P>
+<I>which languages did you want to speak</I>
+</P>
+<A NAME="toc93"></A>
+<H2>An extended Foods grammar</H2>
+<P>
+<a name="secextended"></a>
+</P>
+<P>
+Now that we know how to find information in the resource grammar,
+we can easily extend the <CODE>Foods</CODE> fragment considerably. We shall enable
+the following new expressions:
+</P>
+<UL>
+<LI>questions: <I>Is this pizza Italian?</I> <I>Which pizza do you want to eat?</I>
+<LI>imperatives: <I>Eat that pizza please!</I>
+<LI>denials: <I>These pizzas are not Italian.</I>
+<LI>verbs: <I>eat</I>, <I>pay</I>
+<LI>guests, in addition to food items: <I>I, you, this lady</I>
+</UL>
+
+<A NAME="toc94"></A>
+<H3>Abstract syntax</H3>
+<P>
+Since we don't want to change the already existing <CODE>Foods</CODE> module, 
+we build an extension of it, <CODE>ExtFoods</CODE>:
+</P>
+<PRE>
+    abstract ExtFoods = Foods ** {
+  
+    flags startcat=Move ;
+  
+    cat
+      Move ;      -- dialogue move: declarative, question, or imperative
+      Verb ;      -- transitive verb
+      Guest ;     -- guest in restaurant
+      GuestKind ; -- type of guest
+  
+    fun
+      MAssert : Phrase -&gt; Move ;  -- This pizza is warm.
+      MDeny : Phrase -&gt; Move ;    -- This pizza isn't warm.
+      MAsk : Phrase -&gt; Move ;     -- Is this pizza warm?
+  
+      PVerb : Guest -&gt; Verb -&gt; Item -&gt; Phrase ;     -- we eat this pizza
+      PVerbWant : Guest -&gt; Verb -&gt; Item -&gt; Phrase ; -- we want to eat this pizza
+  
+      WhichVerb : 
+        Kind -&gt; Guest -&gt; Verb -&gt; Move ;   -- Which pizza do you eat?
+      WhichVerbWant : 
+        Kind -&gt; Guest -&gt; Verb -&gt; Move ;   -- Which pizza do you want to eat?
+      WhichIs : Kind -&gt; Quality -&gt; Move ; -- Which wine is Italian? 
+  
+      Do : Verb -&gt; Item -&gt; Move ;         -- Pay this wine!
+      DoPlease : Verb -&gt; Item -&gt; Move ;   -- Pay this wine please!
+  
+      I, You, We : Guest ;
+  
+      GThis, GThat, GThese, GThose : GuestKind -&gt; Guest ;
+      
+      Eat, Drink, Pay : Verb ;
+  
+      Lady, Gentleman : GuestKind ;    
+    }
+</PRE>
+<P>
+The concrete syntax is implemented by a functor that extends the
+already defined functor <CODE>FoodsI</CODE>.
+</P>
+<PRE>
+    incomplete concrete ExtFoodsI of ExtFoods = 
+              FoodsI ** open Syntax, LexFoods in {
+  
+      flags lexer=text ; unlexer=text ;
+</PRE>
+<P>
+The flags set up a lexer and unlexer that can deal with sentence-initial
+capital letters and proper spacing with punctuation (see <a href="#seclexing">here</a>
+for more information on lexers and unlexers).
+</P>
+<A NAME="toc95"></A>
+<H3>Linearization types</H3>
+<P>
+If we look at the resource documentation, we find several categories
+that are above the clause level and can thus host different kinds
+of dialogue moves:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Category</TH>
+<TH>Explanation</TH>
+<TH COLSPAN="2">Example</TH>
+</TR>
+<TR>
+<TD><CODE>Text</CODE></TD>
+<TD>text consisting of phrases</TD>
+<TD><I>He is here. Why?</I></TD>
+</TR>
+<TR>
+<TD><CODE>Phr</CODE></TD>
+<TD>phrase in a text</TD>
+<TD><I>but be quiet please</I></TD>
+</TR>
+<TR>
+<TD><CODE>Utt</CODE></TD>
+<TD>sentence, question, word...</TD>
+<TD><I>be quiet</I></TD>
+</TR>
+<TR>
+<TD><CODE>S</CODE></TD>
+<TD>declarative sentence</TD>
+<TD><I>she lived here</I></TD>
+</TR>
+<TR>
+<TD><CODE>QS</CODE></TD>
+<TD>question</TD>
+<TD><I>where did she live</I></TD>
+</TR>
+<TR>
+<TD><CODE>Imp</CODE></TD>
+<TD>imperative</TD>
+<TD><I>look at this</I></TD>
+</TR>
+<TR>
+<TD><CODE>QCl</CODE></TD>
+<TD>question clause, with all tenses</TD>
+<TD><I>why does she walk</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+We also find that only the category <CODE>Text</CODE> contains punctuation marks.
+So we choose this as the linearization type of <CODE>Move</CODE>. The other types
+are quite obvious.
+</P>
+<PRE>
+    lincat
+      Move = Text ;
+      Verb = V2 ;
+      Guest = NP ;
+      GuestKind = CN ;
+</PRE>
+<P>
+The category <CODE>V2</CODE> of <B>two-place verbs</B> includes both 
+<B>transitive verbs</B> that take <B>direct objects</B> (e.g. <I>we watch him</I>)
+and verbs that take other kinds of <B>complements</B>, often with
+prepositions (<I>we look at him</I>). In a multilingual grammar, it is
+not guaranteed that transitive verbs are transitive in all languages,
+so the more general notion of two-place verb is more appropriate.
+</P>
+<A NAME="toc96"></A>
+<H3>Linearization rules</H3>
+<P>
+Now we need to find constructors that combine the new categories in
+appropriate ways. To form a text from a clause, we first make it into
+a sentence with <CODE>mkS</CODE>, and then apply <CODE>mkText</CODE>:
+</P>
+<PRE>
+    lin MAssert p = mkText (mkS p) ;
+</PRE>
+<P>
+The function <CODE>mkS</CODE> has in the resource synopsis been given the type
+</P>
+<PRE>
+    mkS : (Tense) -&gt; (Ant) -&gt; (Pol) -&gt; Cl -&gt; S
+</PRE>
+<P>
+Parentheses around type names do not make any difference for the GF compiler,
+but in the synopsis notation they indicate <B>optionality</B>: any of the
+optional arguments can be omitted, and there is an instance of <CODE>mkS</CODE>
+available. For each optional type, it uses the <B>default value</B> for that
+type, which for the <B>polarity</B> <CODE>Pol</CODE> is positive i.e. unnegated.
+To build a negative sentence, we use an explicit polarity constructor:
+</P>
+<PRE>
+      MDeny p = mkText (mkS negativePol p) ;
+</PRE>
+<P>
+Of course, we could have used <CODE>positivePol</CODE> in the first rule, instead of
+relying on the default. (The types <CODE>Tense</CODE> and <CODE>Ant</CODE> will be explained
+<a href="#sectense">here</a>.)
+</P>
+<P>
+Phrases can be made into <B>question sentences</B>, which in turn can be
+made into texts in a similar way as sentences; the default
+punctuation mark is not the full stop but the question mark.
+</P>
+<PRE>
+      MAsk p = mkText (mkQS p) ;
+</PRE>
+<P>
+There is an <CODE>mkCl</CODE> instance that directly builds a clause from a noun phrase,
+a two-place verb, and another noun phrase.
+</P>
+<PRE>
+      PVerb = mkCl ;
+</PRE>
+<P>
+The auxiliary verb <I>want</I> requires a <B>verb phrase</B> (<CODE>VP</CODE>) as its complement. It
+can be built from a two-place verb and its noun phrase complement.
+</P>
+<PRE>
+      PVerbWant guest verb item = mkCl guest want_VV (mkVP verb item) ;
+</PRE>
+<P>
+The <B>interrogative determiner</B> (<CODE>IDet</CODE>) <I>which</I> can be combined with
+a common noun to form an <B>interrogative phrase</B> (<CODE>IP</CODE>). This <CODE>IP</CODE> can then
+be used as a subject in a <B>question clause</B> (<CODE>QCl</CODE>), which in turn is
+made into a <CODE>QS</CODE> and finally to a <CODE>Text</CODE>.
+</P>
+<PRE>
+      WhichIs kind quality = 
+        mkText (mkQS (mkQCl (mkIP whichSg_IDet kind) (mkVP quality))) ;
+</PRE>
+<P>
+When interrogative phrases are used as <I>objects</I>, the resource library
+uses a category named <CODE>Slash</CODE> of
+objectless sentences. The name cames from the <B>slash categories</B> of  the
+GPSG grammar formalism
+(Gazdar &amp; al. 1985). Slashes can be formed from subjects and two-place verbs,
+also with an intervening auxiliary verb.
+</P>
+<PRE>
+      WhichVerb kind guest verb = 
+        mkText (mkQS (mkQCl (mkIP whichSg_IDet kind)
+          (mkSlash guest verb))) ;
+      WhichVerbWant kind guest verb = 
+        mkText (mkQS (mkQCl (mkIP whichSg_IDet kind) 
+          (mkSlash guest want_VV verb))) ;
+</PRE>
+<P>
+Finally, we form the <B>imperative</B> (<CODE>Imp</CODE>) of a transitive verb
+and its object. We make it into a <B>polite</B> form utterance, and finally
+into a <CODE>Text</CODE> with an exclamation mark.
+</P>
+<PRE>
+      Do verb item = 
+        mkText 
+          (mkPhr (mkUtt politeImpForm (mkImp verb item))) exclMarkPunct ;
+      DoPlease verb item = 
+        mkText 
+          (mkPhr (mkUtt politeImpForm (mkImp verb item)) please_Voc) 
+          exclMarkPunct ;
+</PRE>
+<P>
+The rest of the concrete syntax is straightforward use of structural words,
+</P>
+<PRE>
+      I = mkNP i_Pron ;
+      You = mkNP youPol_Pron ;
+      We = mkNP we_Pron ;
+      GThis = mkNP this_QuantSg ;
+      GThat = mkNP that_QuantSg ;
+      GThese = mkNP these_QuantPl ;
+      GThose = mkNP those_QuantPl ;
+</PRE>
+<P>
+and of the food lexicon,
+</P>
+<PRE>
+      Eat = eat_V2 ;
+      Drink = drink_V2 ;
+      Pay = pay_V2 ;
+      Lady = lady_N ;
+      Gentleman = gentleman_N ;
+  }
+</PRE>
+<P>
+Notice that we have no reason to build an extension of <CODE>LexFoods</CODE>, but we just
+add words to the old one. Since <CODE>LexFoods</CODE> instances are resource modules,
+the superfluous definitions that they contain have no effect on the
+modules that just <CODE>open</CODE> them, and thus the smaller <CODE>Foods</CODE> grammars
+don't suffer from the additions we make.
+</P>
+<P>
+<B>Exercise</B>. Port the <CODE>ExtFoods</CODE> grammars to some new languages, building
+on the <CODE>Foods</CODE> implementations from previous sections, and using the functor
+defined in this section.
+</P>
+<A NAME="toc97"></A>
+<H2>Tenses</H2>
+<P>
+<a name="sectense"></a>
+</P>
+<P>
+When compiling the <CODE>ExtFoods</CODE> grammars, we have used the path
+</P>
+<PRE>
+    --# -path=.:../foods:present:prelude
+</PRE>
+<P>
+where the library subdirectory <CODE>present</CODE> refers to a restricted version
+of the resource that covers only the present tense of verbs and sentences.
+Having this version available is motivatad by efficiency reasons: tenses
+produce in many languages a manifold of forms and combinations, which
+multiply the size of the grammar; at the same time, many applications,
+both technical ones and spoken dialogues, only need the present tense.
+</P>
+<P>
+But it is easy change the grammars so that they admit of the full set
+of tenses. It is enough to change the path to
+</P>
+<PRE>
+    --# -path=.:../foods:alltenses:prelude
+</PRE>
+<P>
+and recompile the grammars from source (flag <CODE>-src</CODE>); the libraries are
+not recompiled, because their sources cannot be found on the path list. 
+Then it is possible to see all the tenses of
+phrases, by using the <CODE>-all</CODE> flag in linearization:
+</P>
+<PRE>
+    &gt; gr -cat=Phrase | l -all
+    This wine is delicious
+    Is this wine delicious
+    This wine isn't delicious
+    Isn't this wine delicious
+    This wine is not delicious
+    Is this wine not delicious
+    This wine has been delicious
+    Has this wine been delicious
+    This wine hasn't been delicious
+    Hasn't this wine been delicious
+    This wine has not been delicious
+    Has this wine not been delicious
+    This wine was delicious
+    Was this wine delicious
+    This wine wasn't delicious
+    Wasn't this wine delicious
+    This wine was not delicious
+    Was this wine not delicious
+    This wine had been delicious
+    Had this wine been delicious
+    This wine hadn't been delicious
+    Hadn't this wine been delicious
+    This wine had not been delicious
+    Had this wine not been delicious
+    This wine will be delicious
+    Will this wine be delicious
+    This wine won't be delicious
+    Won't this wine be delicious
+    This wine will not be delicious
+    Will this wine not be delicious
+    This wine will have been delicious
+    Will this wine have been delicious
+    This wine won't have been delicious
+    Won't this wine have been delicious
+    This wine will not have been delicious
+    Will this wine not have been delicious
+    This wine would be delicious
+    Would this wine be delicious
+    This wine wouldn't be delicious
+    Wouldn't this wine be delicious
+    This wine would not be delicious
+    Would this wine not be delicious
+    This wine would have been delicious
+    Would this wine have been delicious
+    This wine wouldn't have been delicious
+    Wouldn't this wine have been delicious
+    This wine would not have been delicious
+    Would this wine not have been delicious
+</PRE>
+<P>
+In addition to tenses, the linearization writes all parametric 
+variations --- polarity and word order (direct vs. inverted) --- as
+well as the variation between contracted and full negation words.
+Of course, the list is even longer in languages that have more
+tenses and moods, e.g. the Romance languages.
+</P>
+<P>
+In the <CODE>ExtFoods</CODE> grammar, tenses never find their way to the
+top level of <CODE>Move</CODE>s. Therefore it is useless to carry around
+the clause and verb tenses given in the <CODE>alltenses</CODE> set of libraries.
+But with the library, it is easy to add tenses to <CODE>Move</CODE>s. For
+instance, one can add the rules 
+</P>
+<PRE>
+    fun MAssertFut      : Phrase -&gt; Move ;  -- I will pay this wine
+    fun MAssertPastPerf : Phrase -&gt; Move ;  -- I had paid that wine
+    lin MAssertFut p = mkText (mkS futureTense p) ;
+    lin MAssertPastPerf p = mkText (mkS pastTense anteriorAnt p) ;
+</PRE>
+<P>
+Comparison with <CODE>MAssert</CODE> above shows that the absence of the tense
+and anteriority features defaults to present simultaneous tenses.
+</P>
+<P>
+<B>Exercise</B>. Measure the size of the context-free grammar corresponding to
+some concrete syntax of <CODE>ExtFoods</CODE> with all tenses. 
+You can do this by printing the grammar in the context-free format
+(<CODE>print_grammar -printer=cfg</CODE>) and counting the lines.
+</P>
+<A NAME="toc98"></A>
+<H2>Summary of GF language features</H2>
+<A NAME="toc99"></A>
+<H3>Interfaces and instances</H3>
+<P>
+An <B>interface module</B> (<CODE>interface</CODE> <I>I</I>) is like a <CODE>resource</CODE> module,
+the difference being that it does not need to give definitions in 
+its <CODE>oper</CODE> and <CODE>param</CODE> judgements. Definitions are, however,
+allowed, and they may use constants that appear undefined in the
+module. For example, here is an interface for predication, which
+is parametrized on NP case and agreement features, and on the constituent
+order:
+</P>
+<PRE>
+    interface Predication = {
+      param 
+        Case ;
+        Agreement ;
+      oper 
+        subject : Case ;
+        object : Case ;
+        order : (verb,subj,obj : String) -&gt; String ;
+  
+        NP : Type = {s : Case =&gt; Str ; a : Agreement} ;
+        TV : Type = {s : Agreement =&gt; Str} ;
+  
+        sentence : TV -&gt; NP -&gt; NP -&gt; {s : Str} = \verb,subj,obj -&gt; {
+          s = order (verb ! subj.a) (subj ! subject) (obj ! object) ;
+    }
+</PRE>
+<P>
+An <B>instance module</B> (<CODE>instance</CODE> <I>J</I> <CODE>of</CODE> <I>I</I>) is also like a
+<CODE>resource</CODE>, but it is compiled in union with the interface that it
+is an instance <CODE>of</CODE>. This means that the definitions given in the
+instance are type-checked with respect to the types given in the
+interface. Moreover, overwriting types or definitions given in the interface
+is not allowed. But it is legal for an instance to contain definitions
+not included in the corresponding interface. Here is an instance of
+<CODE>Predication</CODE>, suitable for languages like English.
+</P>
+<PRE>
+    instance PredicationSimpleSVO of Predication = {
+      param 
+        Case = Nom | Acc | Gen ;
+        Agreement = Agr Number Person ;
+  
+        -- two new types     
+        Number = Sg | Pl ;
+        Person = P1 | P2 | P3 ;
+  
+      oper 
+        subject = Nom ;
+        object = Acc ;
+        order = \verb,subj,obj -&gt; subj ++ verb ++ obj ;
+  
+        -- the rest of the definitions don't need repetition
+    }
+</PRE>
+<P></P>
+<A NAME="toc100"></A>
+<H3>Grammar reuse</H3>
+<P>
+<a name="seclock"></a>
+</P>
+<P>
+Abstract syntax modules can be used like interfaces, and concrete syntaxes
+as their instances. The following translations then take place:
+</P>
+<PRE>
+    cat C         ---&gt; oper C : Type
+  
+    fun f : A     ---&gt; oper f : A*
+  
+    lincat C = T  ---&gt; oper C : Type = T'
+  
+    lin f = t     ---&gt; oper f : A* = t'
+</PRE>
+<P>
+This translation is called <B>grammar reuse</B>. It uses a homomorphism
+from abstract types and terms to the concrete types and terms. For the
+sake of more type safety, the types are not exactly the same. Currently
+(GF 2.8), the type <I>T'</I> formed from the linearization type <I>T</I> of
+a category <I>C</I> is <I>T</I> extended with a dummy <B>lock field</B>. Thus
+</P>
+<PRE>
+    lincat C = T  ---&gt; oper C = T ** {lock_C : {}}
+</PRE>
+<P>
+and the linearization terms are lifted correspondingly. The user of
+a GF library should never see any lock fields; when they appear in
+the compiler's warnings, they indicate that some library category is
+constructed improperly by a user program.
+</P>
+<A NAME="toc101"></A>
+<H3>Functors</H3>
+<P>
+A <B>parametrized module</B>, aka. an <B>incomplete module</B>, or a 
+<B>functor</B>, is any module that <CODE>open</CODE>s an <CODE>interface</CODE>  (or
+an <CODE>abstract</CODE>). Several interfaces may be opened by one
+functor. The module header must be prefixed by the word <CODE>incomplete</CODE>.
+Here is a typical example, using the resource <CODE>Syntax</CODE> and
+a domain specific lexicon:
+</P>
+<PRE>
+    incomplete concrete DomainI of Domain = open Syntax, Lex in {...} ;
+</PRE>
+<P>
+A <B>functor instantiation</B> is a module that inherits a functor and
+provides an instance to each of its open interfaces. Here is an example:
+</P>
+<PRE>
+    concrete DomainSwe of Domain = DomainI with
+      (Syntax = SyntaxSwe),
+      (Lex = LexSwe) ;
+</PRE>
+<P></P>
+<A NAME="toc102"></A>
+<H3>Restricted inheritance</H3>
+<P>
+A module of any type can make <B>restricted inheritance</B>, which is
+either exclusion or inclusion:
+</P>
+<PRE>
+    module M = A[f,g], B-[k] ** ...
+</PRE>
+<P>
+A concrete syntax given to an abstract syntax that uses restricted inheritance
+must make the corresponding restrictions. In addition, the concrete syntax can
+make its own restrictions in order to redefine inherited linearization types and
+rules.
+</P>
+<P>
+Overriding old definitions without explicit restrictions is not allowed.
+</P>
+<A NAME="toc103"></A>
+<H1>Refining semantics in abstract syntax</H1>
+<P>
+<a name="chapsix"></a>
+</P>
+<P>
+While the concrete syntax constructs of GF have been already
+covered, there is much more that can be done in the abstract 
+syntax. The techniques of <B>dependent types</B> and 
+<B>higher order abstract syntax</B> are introduced in this chapter,
+which thereby concludes the presentation of the GF language.
+</P>
+<P>
+Many of the examples in this chapter are somewhat less close to
+applications than the ones shown before. Moreover, the tools for
+embedded grammars in <a href="#chapeight">the eighth chapter</a> do not yet fully support dependent
+types and higher order abstract syntax.
+</P>
+<A NAME="toc104"></A>
+<H2>GF as a logical framework</H2>
+<P>
+In this chapter, we will show how 
+to encode advanced semantic concepts in an abstract syntax.
+We use concepts inherited from <B>type theory</B>. Type theory
+is the basis of many systems known as <B>logical frameworks</B>, which are
+used for representing mathematical theorems and their proofs on a computer.
+In fact, GF has a logical framework as its proper part:
+this part is the abstract syntax.
+</P>
+<P>
+In a logical framework, the formalization of a mathematical theory
+is a set of type and function declarations. The following is an example
+of such a theory, represented as an <CODE>abstract</CODE> module in GF.
+</P>
+<PRE>
+  abstract Arithm = {
+    cat
+      Prop ;                        -- proposition
+      Nat ;                         -- natural number
+    fun
+      Zero : Nat ;                  -- 0
+      Succ : Nat -&gt; Nat ;           -- the successor of x
+      Even : Nat -&gt; Prop ;          -- x is even
+      And  : Prop -&gt; Prop -&gt; Prop ; -- A and B
+      } 
+</PRE>
+<P>
+This example does not show any new type-theoretical constructs yet, but
+it could nevertheless be used as a part of a proof system for arithmetic.
+</P>
+<P>
+<B>Exercise</B>. Give a concrete syntax of <CODE>Arithm</CODE>, preferably
+by using the resource library.
+</P>
+<A NAME="toc105"></A>
+<H2>Dependent types</H2>
+<P>
+<a name="secsmarthouse"></a>
+</P>
+<P>
+<B>Dependent types</B> are a characteristic feature of GF,
+inherited from the <B>constructive type theory</B> of Martin-L�f and
+distinguishing GF from most other grammar formalisms and
+functional programming languages.
+</P>
+<P>
+Dependent types can be used for stating stronger
+<B>conditions of well-formedness</B> than ordinary types.
+A simple example is a "smart house" system, which
+defines voice commands for household appliances.   This example
+is borrowed from the 
+Regulus Book
+(Rayner &amp; al. 2006).
+</P>
+<P>
+One who enters a smart house can use a spoken <CODE>Command</CODE> to dim lights, switch
+on the fan, etc. For <CODE>Device</CODE>s of each <CODE>Kind</CODE>, there is a set of
+<CODE>Action</CODE>s that can be performed on them; thus one can dim the lights but
+ not the fan, for example. These dependencies can be expressed 
+by making the type <CODE>Action</CODE> dependent on <CODE>Kind</CODE>. We express these
+dependencies in <CODE>cat</CODE> declarations by attaching argument types to
+categories:
+</P>
+<PRE>
+    cat
+      Command ;
+      Kind ; 
+      Device Kind ; -- argument type Kind 
+      Action Kind ; 
+</PRE>
+<P>
+The crucial use of the dependencies is made in the rule for forming commands:
+</P>
+<PRE>
+    fun CAction : (k : Kind) -&gt; Action k -&gt; Device k -&gt; Command ;
+</PRE>
+<P>
+In other words: an action and a device can be combined into a command only
+if they are of the same <CODE>Kind</CODE> <CODE>k</CODE>. If we have the functions
+</P>
+<PRE>
+    DKindOne  : (k : Kind) -&gt; Device k ;  -- the light
+  
+    light, fan : Kind ;
+    dim : Action light ;
+</PRE>
+<P>
+we can form the syntax tree
+</P>
+<PRE>
+    CAction light dim (DKindOne light)
+</PRE>
+<P>
+but we cannot form the trees
+</P>
+<PRE>
+    CAction light dim (DKindOne fan)
+    CAction fan   dim (DKindOne light)
+    CAction fan   dim (DKindOne fan)
+</PRE>
+<P>
+Linearization rules are written as usual: the concrete syntax does not
+know if a category is a dependent type. In English, one could write as follows:
+</P>
+<PRE>
+    lincat Action = {s : Str} ;
+    lin CAction _ act dev = {s = act.s ++ dev.s} ; 
+</PRE>
+<P>
+Notice that the argument for <CODE>Kind</CODE> does not appear in the linearization;
+therefore it is good practice to make this clear by
+using a wild card for it, rather than a real
+variable. 
+As we will show, 
+the type checker can reconstruct the kind from the <CODE>dev</CODE> argument.
+</P>
+<P>
+Parsing with dependent types is performed in two phases:
+</P>
+<OL>
+<LI>context-free parsing
+<LI>filtering through type checker
+</OL>
+
+<P>
+If you just parse in the usual way, you don't enter the second phase, and
+the <CODE>kind</CODE> argument is not found:
+</P>
+<PRE>
+    &gt; parse "dim the light"
+    CAction ? dim (DKindOne light)
+</PRE>
+<P>
+Moreover, type-incorrect commands are not rejected:
+</P>
+<PRE>
+    &gt; parse "dim the fan"
+    CAction ? dim (DKindOne fan)
+</PRE>
+<P>
+The question mark <CODE>?</CODE> is a <B>metavariable</B>, and is returned by the parser
+for any subtree that is suppressed by a linearization rule.
+These are exactly the same kind of metavariables as were used <a href="#secediting">here</a>
+to mark incomplete parts of trees in the syntax editor.
+</P>
+<P>
+To get rid of metavariables, we must feed the parse result into the
+second phase of <B>solving</B> them. The <CODE>solve</CODE> process uses the dependent
+type checker to restore the values of the metavariables. It is invoked by
+the command <CODE>put_tree = pt</CODE> with the flag <CODE>-transform=solve</CODE>:
+</P>
+<PRE>
+    &gt; parse "dim the light" | put_tree -transform=solve
+    CAction light dim (DKindOne light)
+</PRE>
+<P>
+The <CODE>solve</CODE> process may fail, in which case no tree is returned:
+</P>
+<PRE>
+    &gt; parse "dim the fan" | put_tree -transform=solve
+    no tree found
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Write an abstract syntax module with above contents
+and an appropriate English concrete syntax. Try to parse the commands
+<I>dim the light</I> and <I>dim the fan</I>, with and without <CODE>solve</CODE> filtering.
+</P>
+<P>
+<B>Exercise</B>. Perform random and exhaustive generation, with and without
+<CODE>solve</CODE> filtering.
+</P>
+<P>
+<B>Exercise</B>. Add some device kinds and actions to the grammar.
+</P>
+<A NAME="toc106"></A>
+<H2>Polymorphism</H2>
+<P>
+<a name="secpolymorphic"></a>
+</P>
+<P>
+Sometimes an action can be performed on all kinds of devices. It would be
+possible to introduce separate <CODE>fun</CODE> constants for each kind-action pair,
+but this would be tedious. Instead, one can use <B>polymorphic</B> actions,
+i.e. actions that take a <CODE>Kind</CODE> as an argument and produce an <CODE>Action</CODE>
+for that <CODE>Kind</CODE>:
+</P>
+<PRE>
+    fun switchOn, switchOff : (k : Kind) -&gt; Action k ;
+</PRE>
+<P>
+Functions that are not polymorphic are <B>monomorphic</B>. However, the 
+dichotomy into monomorphism and full polymorphism is not always sufficient
+for good semantic modelling: very typically, some actions are defined
+for a proper subset of devices, but not just one. For instance, both doors and
+windows can be opened, whereas lights cannot.
+We will return to this problem by introducing the
+concept of <B>restricted polymorphism</B> later,
+after a section on proof objects.
+</P>
+<P>
+<B>Exercise</B>. The grammar <CODE>ExtFoods</CODE> <a href="#secextended">here</a> permits the
+formation of phrases such as <I>we drink this fish</I> and <I>we eat this wine</I>.
+A way to prevent them is to distinguish between eatable and drinkable food items.
+Another, related problem is that there is some duplicated code
+due to a category distinction between guests and food items, for instance,
+two constructors for the determiner <I>this</I>. This problem can also
+be solved by dependent types. Rewrite the abstract syntax in <CODE>Foods</CODE> and
+<CODE>ExtFoods</CODE> by using such a type system, and also update the concrete syntaxes.
+If you do this right, you only have to change the functor modules
+<CODE>FoodsI</CODE> and <CODE>ExtFoodsI</CODE> in the concrete syntax.
+</P>
+<A NAME="toc107"></A>
+<H3>Digression: dependent types in concrete syntax</H3>
+<P>
+The <B>functional fragment</B> of GF
+terms and types comprises function types, applications, lambda
+abstracts, constants, and variables. This fragment is the same in
+abstract and concrete syntax. In particular,
+dependent types are also available in concrete syntax.
+We have not made use of them yet,
+but we will now look at one example of how they
+can be used.
+</P>
+<P>
+Those readers who are familiar with functional programming languages
+like ML and Haskell, may already have missed <B>polymorphic</B>
+functions. For instance, Haskell programmers have access to
+the functions
+</P>
+<PRE>
+    const :: a -&gt; b -&gt; a
+    const c _ = c
+  
+    flip :: (a -&gt; b -&gt; c) -&gt; b -&gt; a -&gt; c
+    flip f y x = f x y
+</PRE>
+<P>
+which can be used for any given types <CODE>a</CODE>,<CODE>b</CODE>, and <CODE>c</CODE>.
+</P>
+<P>
+The GF counterpart of polymorphic functions are <B>monomorphic</B>
+functions with explicit <B>type variables</B> --- a techniques that we already
+used in abstract syntax for modelling actions that can be performed
+on all kinds of devices. Thus the above definitions can be written
+</P>
+<PRE>
+    oper const :(a,b : Type) -&gt; a -&gt; b -&gt; a =
+      \_,_,c,_ -&gt; c ;
+  
+    oper flip : (a,b,c : Type) -&gt; (a -&gt; b -&gt;c) -&gt; b -&gt; a -&gt; c =
+      \_,_,_,f,x,y -&gt; f y x ;
+</PRE>
+<P>
+When the operations are used, the type checker requires
+them to be equipped with all their arguments; this may be a nuisance
+for a Haskell or ML programmer. They have not been used very much,
+except in the <CODE>Coordination</CODE> module of the resource library.
+</P>
+<A NAME="toc108"></A>
+<H2>Proof objects</H2>
+<P>
+Perhaps the most well-known idea in constructive type theory is
+the <B>Curry-Howard isomorphism</B>, also known as the
+<B>propositions as types principle</B>. Its earliest formulations
+were attempts to give semantics to the logical systems of
+propositional and predicate calculus. In this section, we will consider
+a more elementary example, showing how the notion of proof is useful
+outside mathematics, as well.
+</P>
+<P>
+We use the already shown category of unary (also known as Peano-style)
+natural numbers:
+</P>
+<PRE>
+    cat Nat ; 
+    fun Zero : Nat ;
+    fun Succ : Nat -&gt; Nat ;
+</PRE>
+<P>
+The <B>successor function</B> <CODE>Succ</CODE> generates an infinite
+sequence of natural numbers, beginning from <CODE>Zero</CODE>.
+</P>
+<P>
+We then define what it means for a number <I>x</I> to be <I>less than</I>
+a number <I>y</I>. Our definition is based on two axioms:
+</P>
+<UL>
+<LI><CODE>Zero</CODE> is less than <CODE>Succ</CODE> <I>y</I> for any <I>y</I>.
+<LI>If <I>x</I> is less than <I>y</I>, then <CODE>Succ</CODE> <I>x</I> is less than <CODE>Succ</CODE> <I>y</I>.
+</UL>
+
+<P>
+The most straightforward way of expressing these axioms in type theory
+is with a dependent type <CODE>Less</CODE> <I>x y</I>, and two functions constructing
+its objects:
+</P>
+<PRE>
+    cat Less Nat Nat ; 
+    fun lessZ : (y : Nat) -&gt; Less Zero (Succ y) ;
+    fun lessS : (x,y : Nat) -&gt; Less x y -&gt; Less (Succ x) (Succ y) ;
+</PRE>
+<P>
+Objects formed by <CODE>lessZ</CODE> and <CODE>lessS</CODE> are
+called <B>proof objects</B>: they establish the truth of certain
+mathematical propositions.
+For instance, the fact that 2 is less that
+4 has the proof object
+</P>
+<PRE>
+    lessS (Succ Zero) (Succ (Succ (Succ Zero)))
+          (lessS Zero (Succ (Succ Zero)) (lessZ (Succ Zero)))
+</PRE>
+<P>
+whose type is
+</P>
+<PRE>
+    Less (Succ (Succ Zero)) (Succ (Succ (Succ (Succ Zero))))
+</PRE>
+<P>
+which is the formalization of the proposition that 2 is less than 4.
+</P>
+<P>
+GF grammars can be used to provide a <B>semantic control</B> of
+well-formedness of expressions. We have already seen examples of this:
+the grammar of well-formed actions on household devices. By introducing proof objects
+we have now added an even more powerful technique of expressing semantic conditions.
+</P>
+<P>
+A simple example of the use of proof objects is the definition of
+well-formed <I>time spans</I>: a time span is expected to be from an earlier to
+a later time:
+</P>
+<PRE>
+    from 3 to 8
+</PRE>
+<P>
+is thus well-formed, whereas
+</P>
+<PRE>
+    from 8 to 3
+</PRE>
+<P>
+is not. The following rules for spans impose this condition
+by using the <CODE>Less</CODE> predicate:
+</P>
+<PRE>
+    cat Span ;
+    fun span : (m,n : Nat) -&gt; Less m n -&gt; Span ;
+</PRE>
+<P></P>
+<P>
+<B>Exercise</B>. Write an abstract and concrete syntax with the
+concepts of this section, and experiment with it in GF.
+</P>
+<P>
+<B>Exercise</B>. Define the notions of "even" and "odd" in terms
+of proof objects. <B>Hint</B>. You need one function for proving
+that 0 is even, and two other functions for propagating the
+properties.
+</P>
+<A NAME="toc109"></A>
+<H3>Proof-carrying documents</H3>
+<P>
+Another possible application of proof objects is <B>proof-carrying documents</B>:
+to be semantically well-formed, the abstract syntax of a document must contain a proof
+of some property, although the proof is not shown in the concrete document.
+Think, for instance, of small documents describing flight connections:
+</P>
+<P>
+<I>To fly from Gothenburg to Prague, first take LH3043 to Frankfurt, then OK0537 to Prague.</I>
+</P>
+<P>
+The well-formedness of this text is partly expressible by dependent typing: 
+</P>
+<PRE>
+    cat
+      City ;
+      Flight City City ;
+    fun
+      Gothenburg, Frankfurt, Prague : City ;
+      LH3043 : Flight Gothenburg Frankfurt ;
+      OK0537 : Flight Frankfurt Prague ;
+</PRE>
+<P>
+This rules out texts saying <I>take OK0537 from Gothenburg to Prague</I>. 
+However, there is a
+further condition saying that it must be possible to 
+change from LH3043 to OK0537 in Frankfurt.
+This can be modelled as a proof object of a suitable type, 
+which is required by the constructor
+that connects flights.
+</P>
+<PRE>
+    cat
+      IsPossible (x,y,z : City)(Flight x y)(Flight y z) ;
+    fun
+      Connect : (x,y,z : City) -&gt; 
+        (u : Flight x y) -&gt; (v : Flight y z) -&gt; 
+          IsPossible x y z u v -&gt; Flight x z ;
+</PRE>
+<P></P>
+<A NAME="toc110"></A>
+<H2>Restricted polymorphism</H2>
+<P>
+In the first version of the smart house grammar <CODE>Smart</CODE>, 
+all Actions were either of
+</P>
+<UL>
+<LI><B>monomorphic</B>: defined for one Kind
+<LI><B>polymorphic</B>: defined for all Kinds
+</UL>
+
+<P>
+To make this scale up for new Kinds, we can refine this to 
+<B>restricted polymorphism</B>: defined for Kinds of a certain <B>class</B>
+</P>
+<P>
+The notion of class can be expressed in abstract syntax 
+by using the Curry-Howard isomorphism as follows:
+</P>
+<UL>
+<LI>a class is a <B>predicate</B> of Kinds --- i.e. a type depending of Kinds
+<LI>a Kind is in a class if there is a proof object of this type
+</UL>
+
+<P>
+Here is an example with switching and dimming. The classes are called
+<CODE>switchable</CODE> and <CODE>dimmable</CODE>.
+</P>
+<PRE>
+  cat
+    Switchable Kind ;
+    Dimmable   Kind ;
+  fun
+    switchable_light : Switchable light ;
+    switchable_fan   : Switchable fan ;
+    dimmable_light   : Dimmable light ;
+  
+    switchOn : (k : Kind) -&gt; Switchable k -&gt; Action k ;
+    dim      : (k : Kind) -&gt; Dimmable k -&gt; Action k ;
+</PRE>
+<P>
+One advantage of this formalization is that classes for new 
+actions can be added incrementally. 
+</P>
+<P>
+<B>Exercise</B>. Write a new version of the <CODE>Smart</CODE> grammar with
+classes, and test it in GF.
+</P>
+<P>
+<B>Exercise</B>. Add some actions, kinds, and classes to the grammar.
+Try to port the grammar to a new language. You will probably find
+out that restricted polymorphism works differently in different languages.
+For instance, in Finnish not only doors but also TVs and radios 
+can be "opened", which means switching them on.
+</P>
+<A NAME="toc111"></A>
+<H2>Variable bindings</H2>
+<P>
+<a name="secbinding"></a>
+</P>
+<P>
+Mathematical notation and programming languages have
+expressions that <B>bind</B> variables. For instance,
+a universally quantifier proposition
+</P>
+<PRE>
+    (All x)B(x)
+</PRE>
+<P>
+consists of the <B>binding</B> <CODE>(All x)</CODE> of the variable <CODE>x</CODE>,
+and the <B>body</B> <CODE>B(x)</CODE>, where the variable <CODE>x</CODE> can have
+<B>bound occurrences</B>. 
+</P>
+<P>
+Variable bindings appear in informal mathematical language as well, for
+instance,
+</P>
+<PRE>
+    for all x, x is equal to x
+  
+    the function that for any numbers x and y returns the maximum of x+y
+    and x*y
+  
+    Let x be a natural number. Assume that x is even. Then x + 3 is odd.
+</PRE>
+<P>
+In type theory, variable-binding expression forms can be formalized
+as functions that take functions as arguments. The universal
+quantifier is defined
+</P>
+<PRE>
+    fun All : (Ind -&gt; Prop) -&gt; Prop
+</PRE>
+<P>
+where <CODE>Ind</CODE> is the type of individuals and <CODE>Prop</CODE>,
+the type of propositions. If we have, for instance, the equality predicate
+</P>
+<PRE>
+    fun Eq : Ind -&gt; Ind -&gt; Prop
+</PRE>
+<P>
+we may form the tree
+</P>
+<PRE>
+    All (\x -&gt; Eq x x)
+</PRE>
+<P>
+which corresponds to the ordinary notation
+</P>
+<PRE>
+    (All x)(x = x).
+</PRE>
+<P>
+An abstract syntax where trees have functions as arguments, as in
+the two examples above, has turned out to be precisely the right
+thing for the semantics and computer implementation of
+variable-binding expressions. The advantage lies in the fact that
+only one variable-binding expression form is needed, the lambda abstract
+<CODE>\x -&gt; b</CODE>, and all other bindings can be reduced to it.
+This makes it easier to implement mathematical theories and reason
+about them, since variable binding is tricky to implement and
+to reason about. The idea of using functions as arguments of
+syntactic constructors is known as <B>higher-order abstract syntax</B>.
+</P>
+<P>
+The question now arises: how to define linearization rules
+for variable-binding expressions?
+Let us first consider universal quantification,
+</P>
+<PRE>
+    fun All : (Ind -&gt; Prop) -&gt; Prop
+</PRE>
+<P>
+In GF, we write
+</P>
+<PRE>
+    lin All B = {s = "(" ++ "All" ++ B.$0 ++ ")" ++ B.s}
+</PRE>
+<P>
+to obtain the form shown above. 
+This linearization rule brings in a new GF concept --- the <CODE>$0</CODE>
+field of <CODE>B</CODE> containing a bound variable symbol.
+The general rule is that, if an argument type of a function is
+itself a function type <CODE>A -&gt; C</CODE>, the linearization type of
+this argument is the linearization type of <CODE>C</CODE>
+together with a new field <CODE>$0 : Str</CODE>. In the linearization rule
+for <CODE>All</CODE>, the argument <CODE>B</CODE> thus has the linearization
+type
+</P>
+<PRE>
+    {$0 : Str ; s : Str},
+</PRE>
+<P>
+since the linearization type of <CODE>Prop</CODE> is
+</P>
+<PRE>
+    {s : Str}
+</PRE>
+<P>
+In other words, the linearization of a function
+consists of a linearization of the body together with a
+field for a linearization of the bound variable.
+Those familiar with type theory or lambda calculus
+should notice that GF requires trees to be in
+<B>eta-expanded</B> form in order for this to make sense:
+for any function of type
+</P>
+<PRE>
+    A -&gt; B
+</PRE>
+<P>
+an eta-expanded syntax tree has the form
+</P>
+<PRE>
+    \x -&gt; b
+</PRE>
+<P>
+where <CODE>b : B</CODE> under the assumption <CODE>x : A</CODE>.
+It is in this form that an expression can be analysed
+as having a bound variable and a body, which can be put into
+a linearization record.
+</P>
+<P>
+Given the linearization rule
+</P>
+<PRE>
+    lin Eq a b = {s = "(" ++ a.s ++ "=" ++ b.s ++ ")"}
+</PRE>
+<P>
+the linearization of
+</P>
+<PRE>
+    \x -&gt; Eq x x
+</PRE>
+<P>
+is the record
+</P>
+<PRE>
+    {$0 = "x", s = ["( x = x )"]}
+</PRE>
+<P>
+Thus we can compute the linearization of the formula,
+</P>
+<PRE>
+    All (\x -&gt; Eq x x)  --&gt; {s = "[( All x ) ( x = x )]"}.
+</PRE>
+<P>
+But how did we get the linearization of the variable <CODE>x</CODE>
+into the string <CODE>"x"</CODE>? GF grammars have no rules for
+this: it is just hard-wired in GF that variable symbols are
+linearized into the same strings that represent them in
+the print-out of the abstract syntax. 
+</P>
+<P>
+To be able to <I>parse</I> variable symbols, however, GF needs to know what
+to look for (instead of e.g. trying to parse <I>any</I>
+string as a variable). What strings are parsed as variable symbols 
+is defined in the lexical analysis part of GF parsing 
+</P>
+<PRE>
+    &gt; p -cat=Prop -lexer=codevars "(All x)(x = x)"
+    All (\x -&gt; Eq x x)
+</PRE>
+<P>
+(see more details on lexers <a href="#seclexing">here</a>). If several variables are bound in the 
+same argument, the labels are <CODE>$0, $1, $2</CODE>, etc. 
+</P>
+<P>
+<B>Exercise</B>. Write an abstract syntax of the whole
+<B>predicate calculus</B>, with the
+<B>connectives</B> "and", "or", "implies", and "not", and the
+<B>quantifiers</B> "exists" and "for all". Use higher-order functions
+to guarantee that unbounded variables do not occur.
+</P>
+<P>
+<B>Exercise</B>. Write a concrete syntax for your favourite
+notation of predicate calculus. Use Latex as target language
+if you want nice output. You can also try producing boolean
+expressions of some programming language. Use as many parenthesis as you need to
+guarantee non-ambiguity.
+</P>
+<A NAME="toc112"></A>
+<H2>Semantic definitions</H2>
+<P>
+<a name="secdefdef"></a>
+</P>
+<P>
+Just like any functional programming language, abstract syntax in 
+GF has declarations of functions, telling what the type of a function is.
+But we have not yet shown how to <B>compute</B>
+these functions: all we can do is provide them with arguments
+and linearize the resulting terms.
+Since our main interest is the well-formedness of expressions,
+this has not yet bothered 
+us very much. As we will see, however, computation does play a role
+even in the well-formedness of expressions when dependent types are
+present.
+</P>
+<P>
+GF has a form of judgement for <B>semantic definitions</B>,
+marked by the key word <CODE>def</CODE>. At its simplest, it is just
+the definition of one constant, e.g.
+</P>
+<PRE>
+    fun one : Nat ;
+    def one = Succ Zero ;
+</PRE>
+<P>
+Notice a <CODE>def</CODE> definition can only be given to names declared by
+<CODE>fun</CODE> judgements in the same module; it is not possible to define
+an inherited name.
+</P>
+<P>
+We can also define a function with arguments,
+</P>
+<PRE>
+    fun twice : Nat -&gt; Nat ;
+    def twice x = plus x x ;
+</PRE>
+<P>
+which is still a special case of the most general notion of
+definition, that of a group of <B>pattern equations</B>:
+</P>
+<PRE>
+    fun plus : Nat -&gt; Nat -&gt; Nat ;
+    def 
+      plus x Zero = x ;
+      plus x (Succ y) = Succ (Sum x y) ;
+</PRE>
+<P>
+To compute a term is, as in functional programming languages,
+simply to follow a chain of reductions until no definition
+can be applied. For instance, we compute
+</P>
+<PRE>
+    Sum one one --&gt;
+    Sum (Succ Zero) (Succ Zero) --&gt;
+    Succ (sum (Succ Zero) Zero) --&gt;
+    Succ (Succ Zero)
+</PRE>
+<P>
+Computation in GF is performed with the <CODE>pt</CODE> command and the
+<CODE>compute</CODE> transformation, e.g.
+</P>
+<PRE>
+    &gt; p -tr "1 + 1" | pt -transform=compute -tr | l
+    sum one one
+    Succ (Succ Zero)
+    s(s(0))
+</PRE>
+<P></P>
+<P>
+The <CODE>def</CODE> definitions of a grammar induce a notion of
+<B>definitional equality</B> among trees: two trees are
+definitionally equal if they compute into the same tree.
+Thus, trivially, all trees in a chain of computation
+(such as the one above) are definitionally equal to each other. 
+In general, there can be infinitely many definitionally equal trees.
+</P>
+<P>
+An important property of definitional equality is that it is
+<B>extensional</B>, i.e. has to do with the sameness of semantic value. 
+Linearization, on the other hand, is an <B>intensional</B> operation,
+i.e. has to do with the sameness of expression. This means that 
+<CODE>def</CODE> definitions are <I>not</I> evaluated as linearization steps.
+Intensionality is a crucial property of linearization, since we want
+to use it for things like tracing a chain of evaluation.
+For instance, each of the steps of the computation above
+has a different linearization into standard arithmetic notation:
+</P>
+<PRE>
+    1 + 1
+    s(0) + s(0)
+    s(s(0) + 0)
+    s(s(0))
+</PRE>
+<P>
+In most programming languages, the operations that can be performed on
+expressions are extensional, i.e. give equal values to equal arguments.
+But GF has both extensional and intensional operations.  
+Type checking is extensional:
+in the type theory with dependent types, types may depend on terms,
+and types depending on definitionally equal terms are
+equal types. For instance,
+</P>
+<PRE>
+    Less Zero one
+    Less Zero (Succ Zero))
+</PRE>
+<P>
+are equal types. Hence, any tree that type checks as a proof that
+1 is odd also type checks as a proof that the successor of 0 is odd.
+(Recall, in this connection, that the
+arguments a category depends on never play any role
+in the linearization of trees of that category,
+nor in the definition of the linearization type.)
+</P>
+<P>
+When pattern matching is performed with <CODE>def</CODE> equations, it is
+crucial to distinguish between <B>constructors</B> and other functions
+(cf. <a href="#secmatching">here</a> on pattern matching in concrete syntax).
+GF has a judgement form <CODE>data</CODE> to tell that a category has 
+certain functions as constructors:
+</P>
+<PRE>
+    data Nat = Succ | Zero ;
+</PRE>
+<P>
+Unlike in Haskell and ML, new constructors can be added to
+a type with new <CODE>data</CODE> judgements. The type signatures of constructors
+are given separately, in ordinary <CODE>fun</CODE> judgements.
+One can also write directly
+</P>
+<PRE>
+    data Succ : Nat -&gt; Nat ;
+</PRE>
+<P>
+which is syntactic sugar for the pair of judgements
+</P>
+<PRE>
+    fun Succ : Nat -&gt; Nat ;
+    data Nat = Succ ;
+</PRE>
+<P>
+If we did not mark <CODE>Zero</CODE> as <CODE>data</CODE>, the definition
+</P>
+<PRE>
+    fun isZero : Nat -&gt; Bool ;
+    def isZero Zero = True ;
+    def isZero _  = False ;
+</PRE>
+<P>
+would return <CODE>True</CODE> for all arguments, because the pattern <CODE>Zero</CODE>
+would be treated as a variable and it would hence match all values!
+This is a common pitfall in GF.
+</P>
+<P>
+<B>Exercise</B>. Implement an interpreter of a small functional programming
+language with natural numbers, lists, pairs, lambdas, etc. Use higher-order
+abstract syntax with semantic definitions. As onject language, use
+your favourite programming language.
+</P>
+<A NAME="toc113"></A>
+<H2>Summary of GF language features</H2>
+<A NAME="toc114"></A>
+<H3>Judgements</H3>
+<P>
+We have generalized the <CODE>cat</CODE> judgement form and introduced two new forms
+for abstract syntax:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>form</TH>
+<TH COLSPAN="2">reading</TH>
+</TR>
+<TR>
+<TD><CODE>cat</CODE> <I>C</I> <I>G</I></TD>
+<TD><I>C</I> is a category in context <I>G</I></TD>
+</TR>
+<TR>
+<TD><CODE>def</CODE> <I>f</I> <I>P1</I> ... <I>Pn</I> <CODE>=</CODE> t</TD>
+<TD>function <I>f</I> applied to <I>P1</I>...<I>Pn</I> has value <I>t</I></TD>
+</TR>
+<TR>
+<TD><CODE>data</CODE> <I>C</I> <CODE>=</CODE> <I>C1</I> <CODE>|</CODE> ... <CODE>|</CODE> <I>Cn</I></TD>
+<TD>category <I>C</I> has constructors <I>C1</I>...<I>Cn</I></TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+The <B>context</B> in the <CODE>cat</CODE> judgement has the form
+</P>
+<PRE>
+    (x1 : T1) ... (xn : Tn)
+</PRE>
+<P>
+where the types <I>T1 ... Tn</I> may be increasingly dependent. To form a
+type, <I>C</I> must be equipped with arguments of each type in the
+context, satisfying the dependencies. As syntactic sugar, we have
+</P>
+<PRE>
+    T G  ===  (x : T) G 
+</PRE>
+<P>
+if <I>x</I> does not occur in <I>G</I>. The linearization type definition of a 
+category does not mention the context.
+</P>
+<P>
+In <CODE>def</CODE> judgements, the arguments <I>P1</I>...<I>Pn</I> can be constructor and
+variable patterns as well as wild cards, and the binding and
+evaluation rules are the same as <a href="#secmatching">here</a>.
+</P>
+<P>
+A <CODE>data</CODE> judgement states that the names on the right-hand side are constructors
+of the category on the left-hand side. The precise types of the constructors are
+given in the <CODE>fun</CODE> judgements introducing them; the value type of a constructor
+of <I>C</I> must be of the form <I>C a1 ... am</I>. As syntactic sugar,
+</P>
+<PRE>
+    data f : A1 ... An -&gt; C a1 ... am  ===   
+    fun f : A1 ... An -&gt; C a1 ... am  ; data C = f ;
+</PRE>
+<P></P>
+<A NAME="toc115"></A>
+<H3>Dependent function types</H3>
+<P>
+A <B>dependent function type</B> has the form
+</P>
+<PRE>
+    (x : A) -&gt; B
+</PRE>
+<P>
+where <I>B</I> depends on a variable <I>x</I> of type <I>A</I>. We have the
+following syntactic sugar:
+</P>
+<PRE>
+    (x,y : A) -&gt; B  ===  (x : A) -&gt; (y : A) -&gt; B
+    
+    (_ : A) -&gt; B    ===  (x : A) -&gt; B  if B does not depend on x
+  
+    A -&gt; B          ===  (_ : A) -&gt; B
+</PRE>
+<P>
+A <CODE>fun</CODE> function in abstract syntax may have function types as
+argument types. This is called <B>higher-order abstract syntax</B>.
+The linearization of an argument
+</P>
+<PRE>
+    \z0, ..., zn -&gt; b : (x0 : A1) -&gt; ... -&gt; (xn : An) -&gt; B
+</PRE>
+<P>
+if formed from the linearization of <I>b*</I> of <I>b</I> by adding
+fields that hold the variable symbols:
+</P>
+<PRE>
+    b* ** {$0 = "z0" ; ... ; $n = "zn"}
+</PRE>
+<P>
+If an argument function is itself a higher-order function, its
+bound variables cannot be reached in linearization. Thus, in a sense,
+the higher-order abstract syntax of GF is just <B>second-orde abstract syntax</B>.
+</P>
+<P>
+A <B>syntax tree</B> is a well-typed term in <B>beta-eta normal form</B>, which
+means that 
+</P>
+<UL>
+<LI>its type is a basic type, i.e. it is not a partial application;
+<LI>its arguments are in eta normal form, i.e. either full applications or
+  lambda abstractions with bodies that are full applications;
+<LI>it has no beta redexes, i.e. applications of abstractions.
+</UL>
+
+<P>
+Terms that are not in this form may occur as arguments of dependent types
+and in <CODE>def</CODE> judgements, but they cannot be linearized.
+</P>
+<A NAME="toc116"></A>
+<H1>Grammars of formal languages</H1>
+<P>
+<a name="chapseven"></a>
+</P>
+<P>
+In this chapter, we will write a grammar for arithmetic expressions as known
+from school mathematics and many programming languages. We will see how to
+define precedences in GF, how to include built-in integers in grammars, and 
+how to deal with spaces between tokens in desired ways. As an alternative concrete
+syntax, we will generate code for a JVM-like stack machine. We will conclude
+by extending the language with variable declarations and assignments, which
+are handled in a type-safe way by using higher-order abstract syntax.
+</P>
+<P>
+To write grammars for formal languages is usually less challenging than for
+natural languages. There are standard tools for this task, such as the YACC
+family of parser generators. Using GF would be overkill for many projects, 
+and come with a penalty in efficiency. However, it is still worth while to 
+look at this task. A typical application of GF are natural-language interfaces
+to formal systems: in such applications, the translation between natural and
+formal language can be defined as a multilingual grammar. The use of higher-order
+abstract syntax, together with dependent types, provides a way to define a
+complete compiler in GF.
+</P>
+<A NAME="toc117"></A>
+<H2>Arithmetic expressions</H2>
+<A NAME="toc118"></A>
+<H3>Abstract syntax</H3>
+<P>
+We want to write a grammar for what is usually called <B>expressions</B> 
+in programming languages. The expressions are built from integers by
+the binary operations of addition, subtraction, multiplication, and
+division. The abstract syntax is easy to write. We call it <CODE>Calculator</CODE>,
+since it can be used as the basis of a calculator.
+</P>
+<PRE>
+    abstract Calculator = {
+  
+    cat Exp ;
+  
+    fun
+      EPlus, EMinus, ETimes, EDiv : Exp -&gt; Exp -&gt; Exp ;
+      EInt : Int -&gt; Exp ;
+    }
+</PRE>
+<P>
+Notice the use of the category <CODE>Int</CODE>. It is a built-in category of
+integers. Its syntax trees are denoted by <B>integer literals</B>, which are
+sequences of digits. For instance,
+</P>
+<PRE>
+    5457455814608954681 : Int
+</PRE>
+<P>
+These are the only objects of type <CODE>Int</CODE>:
+grammars are not allowed to declare functions with <CODE>Int</CODE> as value type.
+</P>
+<A NAME="toc119"></A>
+<H3>Concrete syntax: a simple approach</H3>
+<P>
+Arithmetic expressions should be unambiguous. If we write
+</P>
+<PRE>
+    2 + 3 * 4
+</PRE>
+<P>
+it should be parsed as one, but not both, of
+</P>
+<PRE>
+    EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
+    ETimes (EPlus (EInt 2) (EInt 3)) (EInt 4)
+</PRE>
+<P>
+Under normal conventions, the former is chosen, because 
+multiplication has <B>higher precedence</B> than addition.
+If we want to express the latter tree, we have to use
+parentheses:
+</P>
+<PRE>
+    (2 + 3) * 4
+</PRE>
+<P>
+However, it is not completely trivial to decide when to use 
+parentheses and when not. We will therefore begin with a
+concrete syntax that always uses parentheses around binary
+operator applications. 
+</P>
+<PRE>
+    concrete CalculatorP of Calculator = {
+  
+    lincat 
+      Exp = SS ;
+    lin
+      EPlus  = infix "+" ;
+      EMinus = infix "-" ;
+      ETimes = infix "*" ;
+      EDiv   = infix "/" ;
+      EInt i = i ;
+  
+    oper
+      infix : Str -&gt; SS -&gt; SS -&gt; SS = \f,x,y -&gt; 
+        ss ("(" ++ x.s ++ f ++ y.s ++ ")") ;
+    }
+</PRE>
+<P>
+Now we will obtain
+</P>
+<PRE>
+    &gt; linearize EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
+    ( 2 + ( 3 * 4 ) )
+</PRE>
+<P>
+The first problem, even more urgent than superfluous parentheses, is
+to get rid of superfluous spaces and to recognize integer literals
+in the parser.
+</P>
+<A NAME="toc120"></A>
+<H2>Lexing and unlexing</H2>
+<P>
+<a name="seclexing"></a>
+</P>
+<P>
+The input of parsing in GF is not just a string, but a list of
+<B>tokens</B>. By default, a list of tokens is obtained from a string
+by analysing it into <B>words</B>, which means chunks separated by
+spaces. Thus for instance
+</P>
+<PRE>
+    "(12 + (3 * 4))"
+</PRE>
+<P>
+is split into the tokens
+</P>
+<PRE>
+    "(12", "+", "(3". "*". "4))"
+</PRE>
+<P>
+The parser then tries to find each of these tokens among the terminals
+of the grammar, i.e. among the strings that can appear in linearizations.
+In our example, only the tokens <CODE>"+"</CODE> and <CODE>"*"</CODE> can be found, and
+parsing therefore fails.
+</P>
+<P>
+The proper way to split the above string into tokens would be
+</P>
+<PRE>
+    "(", "12", "+", "(", "3", "*", "4", ")", ")"
+</PRE>
+<P>
+Moreover, the tokens <CODE>"12"</CODE>, <CODE>"3"</CODE>, and <CODE>"4"</CODE> should not be sought
+among the terminals in the grammar, but treated as integer tokens, which
+are defined outside the grammar. Since GF aims to be fully general, such
+conventions are not built in: it must be possible for a grammar to have
+tokens such as <CODE>"12"</CODE> and <CODE>"12)"</CODE>. Therefore, GF has a way to select
+a <B>lexer</B>, a function that splits strings into tokens and classifies 
+them into terminals, literalts, etc.
+</P>
+<P>
+A lexer can be given as a flag to the parsing command:
+</P>
+<PRE>
+    &gt; parse -cat=Exp -lexer=codelit "(2 + (3 * 4))"
+    EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
+</PRE>
+<P>
+Since the lexer is usually a part of the language specification, it
+makes sense to put it in the concrete syntax by using the judgement
+</P>
+<PRE>
+    flags lexer = codelit ;
+</PRE>
+<P>
+The problem of getting correct spacing after linearization is likewise solved
+by an <B>unlexer</B>:
+</P>
+<PRE>
+    &gt; l -unlexer=code EPlus (EInt 2) (ETimes (EInt 3) (EInt 4))
+    (2 + (3 * 4))
+</PRE>
+<P>
+Also this flag is usually put into the concrete syntax file.
+</P>
+<P>
+The lexers and unlexers that are available in the GF system can be
+seen by
+</P>
+<PRE>
+    &gt; help -lexer
+    &gt; help -unlexer
+</PRE>
+<P>
+A table of the most common lexers and unlexers is given in the Summary
+section 7.8.
+</P>
+<A NAME="toc121"></A>
+<H2>Precedence and fixity</H2>
+<P>
+<a name="secprecedence"></a>
+</P>
+<P>
+Here is a summary of the usual
+precedence rules in mathematics and programming languages:
+</P>
+<UL>
+<LI>Integer constants and expressions in parentheses have the highest precedence.
+<LI>Multiplication and division have equal precedence, lower than the highest
+  but higher than addition and subtraction, which are again equal.
+<LI>All the four binary operations are <B>left-associative</B>, which means that
+  e.g. <CODE>1 + 2 + 3</CODE> means the same as <CODE>(1 + 2) + 3</CODE>.
+</UL>
+
+<P>
+One way of dealing with precedences in compiler books is by dividing expressions
+into three categories:
+</P>
+<UL>
+<LI>expressions: addition and subtraction
+<LI>terms: multiplication and division
+<LI>factors: constants and expressions in parentheses
+</UL>
+
+<P>
+The context-free grammar, also taking care of associativity, is the following:
+</P>
+<PRE>
+    Exp  ::= Exp  "+" Term | Exp  "-" Term | Term ;
+    Term ::= Term "*" Fact | Term "/" Fact | Fact ;
+    Fact ::= Int | "(" Exp ")" ;
+</PRE>
+<P>
+A compiler, however, does not want to make a semantic distinction between the
+three categories. Nor does it want to build syntax trees with the
+<B>coercions</B> that enable the use of a higher level expressions on a lower, and
+encode the use of parentheses. In compiler tools such as YACC, building abstract
+syntax trees is performed as a <B>semantic action</B>. For instance, if the parser
+recognizes an expression in parentheses, the action is to return only the
+expression, without encoding the parentheses.
+</P>
+<P>
+In GF, semantic actions could be encoded by using <CODE>def</CODE> definitions introduced
+<a href="#secdefdef">here</a>. But there is a more straightforward way of thinking about
+precedences: we introduce a parameter for precedence, and treat it as 
+an inherent feature of expressions:
+</P>
+<PRE>
+    oper
+      param Prec = Ints 2 ;
+      TermPrec : Type = {s : Str ; p : Prec} ;
+  
+      mkPrec : Prec -&gt; Str -&gt; TermPrec = \p,s -&gt; {s = s ; p = p} ;
+  
+    lincat 
+      Exp = TermPrec ;
+</PRE>
+<P>
+This example shows another way to use built-in integers in GF:
+the type <CODE>Ints 2</CODE> is a parameter type, whose values are the integers
+<CODE>0,1,2</CODE>. These are the three precedence levels we need. The main idea
+is to compare the inherent precedence of an expression with the context
+in which it is used. If the precedence is higher than or equal to 
+the expected, then
+no parentheses are needed. Otherwise they are. We encode this rule in 
+the operation
+</P>
+<PRE>
+    oper usePrec : TermPrec -&gt; Prec -&gt; Str = \x,p -&gt;
+      case lessPrec x.p p of {
+        True  =&gt; "(" x.s ")" ;
+        False =&gt; x.s
+      } ;
+</PRE>
+<P>
+With this operation, we can build another one, that can be used for
+defining left-associative infix expressions:
+</P>
+<PRE>
+    infixl : Prec -&gt; Str -&gt; (_,_ : TermPrec) -&gt; TermPrec = \p,f,x,y -&gt;
+      mkPrec p (usePrec x p ++ f ++ usePrec y (nextPrec p)) ;
+</PRE>
+<P>
+Constant-like expressions (the highest level) can be built simply by
+</P>
+<PRE>
+    constant : Str -&gt; TermPrec = mkPrec 2 ;
+</PRE>
+<P>
+All these operations can be found in the library module <CODE>lib/prelude/Formal</CODE>,
+so we don't have to define them in our own code. Also the auxiliary operations
+<CODE>nextPrec</CODE> and <CODE>lessPrec</CODE> used in their definitions are defined there.
+The library has 5 levels instead of 3.
+</P>
+<P>
+Now we can express the whole concrete syntax of <CODE>Calculator</CODE> compactly:
+</P>
+<PRE>
+    concrete CalculatorC of Calculator = open Formal, Prelude in {
+  
+    flags lexer = codelit ; unlexer = code ; startcat = Exp ;
+  
+    lincat Exp = TermPrec ;
+  
+    lin
+      EPlus  = infixl 0 "+" ;
+      EMinus = infixl 0 "-" ;
+      ETimes = infixl 1 "*" ;
+      EDiv   = infixl 1 "/" ;
+      EInt i = constant i.s ;
+    }
+</PRE>
+<P>
+Let us just take one more look at the operation <CODE>usePrec</CODE>, which decides whether
+to put parentheses around a term or not. The case where parentheses are not needed
+around a string was defined as the string itself.
+However, this would imply that superfluous parentheses
+are never correct. A more liberal grammar is obtained by using the operation
+</P>
+<PRE>
+    parenthOpt : Str -&gt; Str = \s -&gt; variants {s ; "(" ++ s ++ ")"} ;
+</PRE>
+<P>
+which is actually used in the <CODE>Formal</CODE> library.
+But even in this way, we can only allow one pair of superfluous parentheses.
+Thus the parameter-based grammar has not quite reached the goal
+of implementing the same language as the expression-term-factor grammar.
+But it has the advantage of eliminating precedence distinctions from the
+abstract syntax.
+</P>
+<P>
+<B>Exercise</B>. Define non-associative and right-associative infix operations
+analogous to <CODE>infixl</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Add a constructor that puts parentheses around expressions
+to raise their precedence, but that is eliminated by a <CODE>def</CODE> definition.
+Test parsing with and without a pipe to <CODE>pt -transform=compute</CODE>.
+</P>
+<A NAME="toc122"></A>
+<H2>Code generation as linearization</H2>
+<P>
+The classical use of grammars of programming languages is in <B>compilers</B>,
+which translate one language into another. Typically the source language of
+a compiler is a high-level language and the target language is a machine
+language. The hub of a compiler is abstract syntax: the <B>front end</B> of
+the compiler parses source language strings into abstract syntax trees, and
+the <B>back end</B> linearizes these trees into the target language. This processing
+model is of course what GF uses for natural language translation; the main
+difference is that, in GF, the compiler could run in the opposite direction as
+well, that is, function as a <B>decompiler</B>. (In full-size compilers, the
+abstract syntax is also transformed by several layers of semantic analysis
+and optimizations, before the target code is generated; this can destroy
+reversibility and hence decompilation.)
+</P>
+<P>
+More for the sake of illustration 
+than as a serious compiler, let us write a concrete 
+syntax of <CODE>Calculator</CODE> that generates machine code similar to JVM (Java Virtual
+Machine). JVM is a so-called <B>stack machine</B>, whose code follows the
+<B>postfix</B> notation, also known as <B>reverse Polish</B> notation. Thus the
+expression
+</P>
+<PRE>
+    2 + 3 * 4
+</PRE>
+<P>
+is translated to
+</P>
+<PRE>
+    iconst 2 : iconst 3 ; iconst 4 ; imul ; iadd
+</PRE>
+<P>
+The linearization rules are not difficult to give:
+</P>
+<PRE>
+    lin
+      EPlus  = postfix "iadd" ;
+      EMinus = postfix "isub" ;
+      ETimes = postfix "imul" ;
+      EDiv   = postfix "idiv" ;
+      EInt i = ss ("iconst" ++ i.s) ;
+    oper
+      postfix : Str -&gt; SS -&gt; SS -&gt; SS = \op,x,y -&gt; 
+        ss (x.s ++ ";" ++ y.s ++ ";" ++ op) ;
+</PRE>
+<P></P>
+<A NAME="toc123"></A>
+<H2>Speaking aloud arithmetic expressions</H2>
+<P>
+Natural languages have sometimes difficulties in expressing mathematical
+formulas unambiguously, because they have no universal device of parentheses. 
+For arithmetic formulas, a solution exists: 
+</P>
+<PRE>
+    2 + 3 * 4
+</PRE>
+<P>
+can be expressed
+</P>
+<PRE>
+    the sum of 2 and the product of 3 and 4
+</PRE>
+<P>
+However, this format is very verbose and unnatural, and becomes
+impossible to understand when the complexity of expressions grows. 
+Fortunately, spoken language
+has a very nice way of using <B>pauses</B> for disambiguation. This device was
+introduced by Torbj�rn Lager (personal communication, 2003) 
+as an input mechanism to a calculator dialogue
+system; it seems to correspond very closely to how we actually speak when we
+want to communicate arithmetic expressions. Another application would be as
+a part of a programming assistant that reads aloud code.
+</P>
+<P>
+The idea is that, after every completed operation, there is a pause. Try this
+by speaking aloud the following lines, making a pause instead of pronouncing the
+word <CODE>PAUSE</CODE>:
+</P>
+<PRE>
+    2 plus 3 times 4 PAUSE
+    2 plus 3 PAUSE times 4 PAUSE
+</PRE>
+<P>
+A grammar implementing this convention is again simple to write:
+</P>
+<PRE>
+    lin
+      EPlus  = infix "plus" ;
+      EMinus = infix "minus" ;
+      ETimes = infix "times" ;
+      EDiv   = infix ["divided by"] ;
+      EInt i = i ;
+    oper
+      infix : Str -&gt; SS -&gt; SS -&gt; SS = \op,x,y -&gt; 
+        ss (x.s ++ op ++ y.s ++ "PAUSE") ;
+</PRE>
+<P>
+Intuitively, a pause is taken to give the hearer time to compute an
+intermediate result.
+</P>
+<P>
+<B>Exercise</B>. Is the pause-based grammar unambiguous? Test with random examples!
+</P>
+<A NAME="toc124"></A>
+<H2>Programs with variables</H2>
+<P>
+A useful extension of arithmetic expressions is a <B>straight code</B> programming
+language. The programs of this language are <B>assignments</B> of the form <CODE>x = exp</CODE>, 
+which assign expressions to variables. Expressions can moreover contain variables 
+that have been given values in previous assignments. 
+</P>
+<P>
+In this language, we use two new categories: programs and variables.
+A program is a sequence of assignments, where a variable is given a value. 
+Logically, we want to distinguish <B>initializations</B> from other assignments:
+these are the assignments where a variable is given a value for the first time.
+Just like in C-like languages, 
+we prefix an initializing assignment with the type of the variable.
+Here is an example of a piece of code written in the language:
+</P>
+<PRE>
+    int x = 2 + 3 ;  
+    int y = x + 1 ; 
+    x = x + 9 * y ;
+</PRE>
+<P>
+We define programs by the following constructors:
+</P>
+<PRE>
+    fun
+      PEmpty : Prog ;
+      PInit  : Exp -&gt; (Var -&gt; Prog) -&gt; Prog ;
+      PAss   : Var -&gt; Exp  -&gt; Prog  -&gt; Prog ;
+</PRE>
+<P>
+The interesting constructor is <CODE>PInit</CODE>, which uses
+higher-order abstract syntax for making the initialized variable available in
+the <B>continuation</B> of the program. The abstract syntax tree for the above code
+is
+</P>
+<PRE>
+    PInit (EPlus (EInt 2) (EInt 3)) (\x -&gt; 
+      PInit (EPlus (EVar x) (EInt 1)) (\y -&gt; 
+        PAss x (EPlus (EVar x) (ETimes (EInt 9) (EVar y))) 
+          PEmpty))
+</PRE>
+<P>
+Since we want to prevent the use of uninitialized variables in programs, we
+don't give any constructors for <CODE>Var</CODE>! We just have a rule for using variables
+as expressions:
+</P>
+<PRE>
+    fun EVar : Var -&gt; Exp ;
+</PRE>
+<P>
+The rest of the grammar is just the same as for arithmetic expressions
+<a href="#secprecedence">here</a>. The best way to implement it is perhaps by writing a
+module that extends the expression module. The most natural start category
+of the extension is <CODE>Prog</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Extend the straight-code language to expressions of type <CODE>float</CODE>.
+To guarantee type safety, you can define a category <CODE>Typ</CODE> of types, and
+make <CODE>Exp</CODE> and <CODE>Var</CODE> dependent on <CODE>Typ</CODE>. Basic floating point expressions
+can be formed from literal of the built-in GF type <CODE>Float</CODE>. The arithmetic
+operations should be made polymorphic (as <a href="#secpolymorphic">here</a>).
+</P>
+<A NAME="toc125"></A>
+<H3>The concrete syntax of assignments</H3>
+<P>
+We can define a C-like concrete syntax by using GF's <CODE>$</CODE> variables, as explained 
+<a href="#secbinding">here</a>.
+</P>
+<P>
+In a JVM-like syntax, we need two more instructions: <CODE>iload</CODE> <I>x</I>, which
+loads (pushes on the stack) the value of the variable <I>x</I>, and <CODE>istore</CODE> <I>x</I>,
+which stores the value of the currently topmost expression in the variable <I>x</I>.
+Thus the code for the example in the previous section is
+</P>
+<PRE>
+    iconst 2 ; iconst 3 ; iadd ; istore x ;
+    iload x ; iconst 1 ; iadd ; istore y ;
+    iload x ; iconst 9 ; iload y ; imul ; iadd ; istore x ;
+</PRE>
+<P>
+Those familiar with JVM will notice that we are using <B>symbolic addresses</B>, i.e.
+variable names instead of integer offsets in the memory. Neither real JVM nor
+our variant makes any distinction between the initialization and reassignment
+of a variable. 
+</P>
+<P>
+<B>Exercise</B>. Finish the implementation of the 
+C-to-JVM compiler by extending the expression modules
+to straight code programs.
+</P>
+<P>
+<B>Exercise</B>. If you made the exercise of adding floating point numbers to
+the language, you can now cash out the main advantage of type checking
+for code generation: selecting type-correct JVM instructions. The floating
+point instructions are precisely the same as the integer one, except that
+the prefix is <CODE>f</CODE> instead of <CODE>i</CODE>, and that <CODE>fconst</CODE> takes floating
+point literals as arguments.
+</P>
+<A NAME="toc126"></A>
+<H3>A liberal syntax of variables</H3>
+<P>
+In many applications, the task of GF is just linearization and parsing; 
+keeping track of bound variables and other semantic constraints is
+the task of other parts of the program. For instance, if we want to
+write a natural language interface that reads aloud C code, we can
+quite as well use a context-free grammar of C, and leave it to the C
+compiler to check that variables make sense. In such a program, we may
+want to treat variables as <I>Strings</I>, i.e. to have a constructor
+</P>
+<PRE>
+    fun VString : String -&gt; Var ;
+</PRE>
+<P>
+The built-in category <CODE>String</CODE> has as its values <B>string literals</B>,
+which are strings in double quotes. The lexer and unlexer <CODE>codelit</CODE>
+restore and remove the quotes; when the lexer finds a token that is
+neither a terminal in the grammar nor an integer literal, it sends 
+it to the parser as a string literal.
+</P>
+<P>
+<B>Exercise</B>. Write a grammar for straight code without higher-order
+abstract syntax.
+</P>
+<P>
+<B>Exercise</B>. Extend the liberal straight code grammar to <CODE>while</CODE> loops and
+some other program constructs, and investigate if you can build a reasonable spoken
+language generator for this fragment.
+</P>
+<A NAME="toc127"></A>
+<H2>Conclusion</H2>
+<P>
+Since formal languages are syntactically simpler than natural languages, it
+is no wonder that their grammars can be defined in GF. Some thought is needed
+for dealing with precedences and spacing, but much of it is encoded in GF's
+libraries and built-in lexers and unlexers. If the sole purpose of a grammar
+is to implement a programming language, then the <B>BNF Converter</B> tool
+(BNFC) is more appropriate than GF:
+<center>
+<CODE>www.cs.chalmers.se/~markus/BNFC/</CODE>
+</center>
+BNFC uses standard YACC-like parser tools. GF has flags for printing 
+grammars in the BNFC format.
+</P>
+<P>
+The most common applications of GF grammars of formal languages 
+are in natural-language interfaces of various kinds. 
+These systems don't usually need semantic control in GF abstract 
+syntax. However, the situation can be different if the interface also comprises
+an interactive syntax editor, as in the GF-Key system 
+(Beckert &amp; al. 2006, Burke &amp; Johannisson 2005).
+In that system, the editor is used for guiding programmers only to write
+type-correct code.
+</P>
+<P>
+The technique of continuations in modelling programming languages has recently
+been applied to natural language, for processing <B>anaphoric reference</B>, 
+e.g. pronouns. It may be good to know that GF has the machinery available;
+for the time being, however (GF 2.8), dependent types and 
+higher-order abstract syntax are not supported by the embedded GF implementations
+in Haskell and Java.
+</P>
+<P>
+<B>Exercise</B>. The book <I>C programming language</I> by Kernighan and Ritchie
+(p. 123, 2nd edition, 1988) describes an English-like syntax for pointer and
+array declarations, and a C program for translating between English and C.
+The following example pair shows all the expression forms needed:
+</P>
+<PRE>
+    char (*(*x[3])())[5]
+  
+    x: array[3] of pointer to function returning 
+    pointer to array[5] of char
+</PRE>
+<P>
+Implement these translations by a GF grammar.
+</P>
+<P>
+<B>Exercise</B>. Design a natural-language interface to Unix command lines. 
+It should be able to express verbally commands such as 
+<CODE>cat, cd, grep, ls, mv, rm, wc</CODE> and also
+pipes built from them.
+</P>
+<A NAME="toc128"></A>
+<H2>Summary of GF language constructs</H2>
+<A NAME="toc129"></A>
+<H3>Lexers and unlexers</H3>
+<P>
+Lexers are set by the flag <CODE>lexer</CODE> and unlexers by the flag <CODE>unlexer</CODE>.
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>lexer</TH>
+<TH COLSPAN="2">description</TH>
+</TR>
+<TR>
+<TD><CODE>words</CODE></TD>
+<TD>(default) tokens are separated by spaces or newlines</TD>
+</TR>
+<TR>
+<TD><CODE>literals</CODE></TD>
+<TD>like words, but integer and string literals recognized</TD>
+</TR>
+<TR>
+<TD><CODE>chars</CODE></TD>
+<TD>each character is a token</TD>
+</TR>
+<TR>
+<TD><CODE>code</CODE></TD>
+<TD>program code conventions (uses Haskell's lex)</TD>
+</TR>
+<TR>
+<TD><CODE>text</CODE></TD>
+<TD>with conventions on punctuation and capital letters</TD>
+</TR>
+<TR>
+<TD><CODE>codelit</CODE></TD>
+<TD>like code, but recognize literals (unknown words as strings)</TD>
+</TR>
+<TR>
+<TD><CODE>textlit</CODE></TD>
+<TD>like text, but recognize literals (unknown words as strings)</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>unlexer</TH>
+<TH COLSPAN="2">description</TH>
+</TR>
+<TR>
+<TD><CODE>unwords</CODE></TD>
+<TD>(default) space-separated token list</TD>
+</TR>
+<TR>
+<TD><CODE>text</CODE></TD>
+<TD>format as text: punctuation, capitals, paragraph &lt;p&gt;</TD>
+</TR>
+<TR>
+<TD><CODE>code</CODE></TD>
+<TD>format as code (spacing, indentation)</TD>
+</TR>
+<TR>
+<TD><CODE>textlit</CODE></TD>
+<TD>like text, but remove string literal quotes</TD>
+</TR>
+<TR>
+<TD><CODE>codelit</CODE></TD>
+<TD>like code, but remove string literal quotes</TD>
+</TR>
+<TR>
+<TD><CODE>concat</CODE></TD>
+<TD>remove all spaces</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<A NAME="toc130"></A>
+<H3>Built-in abstract syntax types</H3>
+<P>
+There are three built-in types. Their syntax trees are literals of corresponding kinds:
+</P>
+<UL>
+<LI><CODE>Int</CODE>, with nonnegative integer literals e.g. <CODE>987031434</CODE>
+<LI><CODE>Float</CODE>, with nonnegative floating-point literals e.g. <CODE>907.219807</CODE>
+<LI><CODE>String</CODE>, with string literals e.g. <CODE>"foo"</CODE>
+</UL>
+
+<P>
+Their linearization type is uniformly <CODE>{s : Str}</CODE>.
+</P>
+<A NAME="toc131"></A>
+<H1>Embedded grammars</H1>
+<P>
+<a name="chapeight"></a>
+</P>
+<P>
+GF grammars can be used as parts of programs written in other programming
+languages. Haskell and Java.
+This facility is based on several components:
+</P>
+<UL>
+<LI>a portable format for multilingual GF grammars
+<LI>an interpreter for this format written in the host language
+<LI>an API that enables reading grammar files and calling the interpreter
+<LI>a way to manipulate abstract syntax trees in the host language
+</UL>
+
+<P>
+In this chapter, we will show the basic ways of producing such
+<B>embedded grammars</B> and using them in Haskell, Java, and JavaScript programs.
+We will build a simple example application in each language:
+</P>
+<UL>
+<LI>a question-answering system in Haskell
+<LI>a translator GUI in Java
+<LI>a multilingual syntax editor in JavaScript
+</UL>
+
+<P>
+Moreover, we will use how grammar applications can be extended to
+spoken language by generating <B>language models</B> for speech recognition
+in various standard formats.
+</P>
+<A NAME="toc132"></A>
+<H2>The portable grammar format</H2>
+<P>
+The portable format is called GFCC, "GF Canonical Compiled". A file
+of this form can be produced from GF by the command
+</P>
+<PRE>
+    &gt; print_multi -printer=gfcc | write_file FILE.gfcc
+</PRE>
+<P>
+Files written in this format can also be imported in the GF system, 
+which recognizes the suffix <CODE>.gfcc</CODE> and builds the multilingual
+grammar in memory. 
+</P>
+<P>
+<I>This applies to GF version 3 and upwards. Older GF used a format suffixed</I> 
+<CODE>.gfcm</CODE>.
+<I>At the moment of writing, also the Java interpreter still uses the GFCM format.</I>
+</P>
+<P>
+GFCC is, in fact, the recommended format in 
+which final grammar products are distributed, because they
+are stripped from superfluous information and can be started and applied
+faster than sets of separate modules.
+</P>
+<P>
+Application programmers have never any need to read or modify GFCC files. 
+Also in this sense, they play the same role as machine code in 
+general-purpose programming.
+</P>
+<A NAME="toc133"></A>
+<H2>The embedded interpreter and its API</H2>
+<P>
+The interpreter is a kind of a miniature GF system, which can parse and
+linearize with grammars. But it can only perform a subset of the commands of
+the GF system. For instance, it
+cannot compile source grammars into the GFCC format; the compiler is the most
+heavy-weight component of the GF system, and should not be carried around
+in end-user applications. 
+Since GFCC is much
+simpler than source GF, building an interpreter is relatively easy.
+Full-scale interpreters currently exist in Haskell and Java, and partial
+ones in C++, JavaScript, and Prolog. We will in this chapter focus
+on Haskell, Java, and JavaScript.
+</P>
+<P>
+Application programmers never need to read or modify the interpreter.
+They only need to access it via its API. 
+</P>
+<A NAME="toc134"></A>
+<H2>Embedded GF applications in Haskell</H2>
+<P>
+Readers unfamiliar with Haskell, or who just want to program in Java, can safely
+skip this section. Everything will be repeated in the corresponding Java
+section. However, seeing the Haskell code may still be helpful because
+Haskell is in many ways closer to GF than Java is. In particular, recursive
+types of syntax trees and pattern matching over them are very similar in 
+Haskell and GF,
+but require a complex encoding with classes and visitors in Java.
+</P>
+<A NAME="toc135"></A>
+<H3>The EmbedAPI module</H3>
+<P>
+The Haskell API contains (among other things) the following types and functions:
+</P>
+<PRE>
+  module EmbedAPI where
+  
+  type MultiGrammar 
+  type Language     
+  type Category     
+  type Tree         
+  
+  file2grammar :: FilePath -&gt; IO MultiGrammar
+  
+  linearize :: MultiGrammar -&gt; Language -&gt; Tree -&gt; String
+  parse     :: MultiGrammar -&gt; Language -&gt; Category -&gt; String -&gt; [Tree]
+  
+  linearizeAll     :: MultiGrammar -&gt; Tree -&gt; [String]
+  linearizeAllLang :: MultiGrammar -&gt; Tree -&gt; [(Language,String)]
+  
+  parseAll     :: MultiGrammar -&gt; Category -&gt; String -&gt; [[Tree]]
+  parseAllLang :: MultiGrammar -&gt; Category -&gt; String -&gt; [(Language,[Tree])]
+  
+  languages  :: MultiGrammar -&gt; [Language]
+  categories :: MultiGrammar -&gt; [Category]
+  startCat   :: MultiGrammar -&gt; Category
+</PRE>
+<P>
+This is the only module that needs to be imported in the Haskell application.
+It is available as a part of the GF distribution, in the file
+<CODE>src/GF/GFCC/API.hs</CODE>.
+</P>
+<A NAME="toc136"></A>
+<H3>First application: a translator</H3>
+<P>
+Let us first build a stand-alone translator, which can translate
+in any multilingual grammar between any languages in the grammar.
+The whole code for this translator is here:
+</P>
+<PRE>
+  module Main where
+  
+  import GF.GFCC.API
+  import System (getArgs)
+  
+  main :: IO () 
+  main = do
+    file:_ &lt;- getArgs
+    gr &lt;- file2grammar file
+    interact (translate gr)
+  
+  translate :: MultiGrammar -&gt; String -&gt; String
+  translate gr = case parseAllLang gr (startCat gr) s of
+    (lg,t:_):_ -&gt; unlines [linearize gr l t | l &lt;- languages gr, l /= lg]
+    _ -&gt; "NO PARSE"
+</PRE>
+<P>
+To run the translator, first compile it by
+</P>
+<PRE>
+    % ghc --make -o trans Translator.hs 
+</PRE>
+<P>
+Then produce a GFCC file. For instance, the <CODE>Food</CODE> grammar set can be 
+compiled as follows:
+</P>
+<PRE>
+    % gfc --make FoodEng.gf FoodIta.gf
+</PRE>
+<P>
+This produces the file <CODE>Food.gfcc</CODE> (its name comes from the abstract syntax).
+</P>
+<P>
+<I>The gfc batch compiler program is available in GF 3 and upwards.</I>
+<I>In earlier versions, the appropriate command can be piped to gf:</I>
+</P>
+<PRE>
+    % echo "pm -printer=gfcc | wf Food.gfcc" | gf FoodEng.gf FoodIta.gf
+</PRE>
+<P>
+Equivalently, the grammars could be read into GF shell and the <CODE>pm</CODE> command
+issued from there. But the unix command has the advantage that it can
+be put into a <CODE>Makefile</CODE> to automate the compilation of an application.
+</P>
+<P>
+The Haskell library function <CODE>interact</CODE> makes the <CODE>trans</CODE> program work
+like a Unix filter, which reads from standard input and writes to standard
+output. Therefore it can be a part of a pipe and read and write files.
+The simplest way to translate is to <CODE>echo</CODE> input to the program:
+</P>
+<PRE>
+    % echo "this wine is delicious" | ./trans Food.gfcc
+    questo vino � delizioso
+</PRE>
+<P>
+The result is given in all languages except the input language.
+</P>
+<A NAME="toc137"></A>
+<H3>A looping translator</H3>
+<P>
+If the user wants to translate many expressions in a sequence, it
+is cumbersome to have to start the translator over and over again,
+because reading the grammar and building the parser always takes 
+time. The translator of the previous section is easy to modify
+to enable this: just change <CODE>interact</CODE> in the main function to
+<CODE>loop</CODE>. It is not a standard Haskell function, so its definition has
+to be included:
+</P>
+<PRE>
+  loop :: (String -&gt; String) -&gt; IO ()
+  loop trans = do 
+    s &lt;- getLine
+    if s == "quit" then putStrLn "bye" else do  
+      putStrLn $ trans s
+      loop trans
+</PRE>
+<P>
+The loop keeps on translating line by line until the input line
+is <CODE>quit</CODE>.
+</P>
+<A NAME="toc138"></A>
+<H3>A question-answer system</H3>
+<P>
+<a name="secmathprogram"></a>
+</P>
+<P>
+The next application is also a translator, but it adds a
+<B>transfer</B> component to the grammar. Transfer is a function that
+takes the input syntax tree into some other syntax tree, which is
+then linearized and shown back to the user. The transfer function we
+are going to use is one that computes a question into an answer.
+The program accepts simple questions about arithmetic and answers
+"yes" or "no" in the language in which the question was made:
+</P>
+<PRE>
+    Is 123 prime?
+    No.
+    77 est impair ?
+    Oui.
+</PRE>
+<P>
+The main change that is needed to the pure translator is to give
+the type of <CODE>translate</CODE> an extra argument: a transfer function.
+</P>
+<PRE>
+    translate :: (Tree -&gt; Tree) -&gt; MultiGrammar -&gt; String -&gt; String
+</PRE>
+<P>
+You can think of ordinary translation as a special case where
+transfer is the identity function (<CODE>id</CODE> in Haskell).
+</P>
+<P>
+Also the behaviour of returning the reply in different languages
+should be changed so that the reply is returned in the <I>same</I> language.
+Here is the complete definition of <CODE>translate</CODE> in the new form.
+</P>
+<PRE>
+    translate tr gr = case parseAllLang gr (startCat gr) s of
+      (lg,t:_):_ -&gt; linearize gr lg (tr t)
+      _ -&gt; "NO PARSE"
+</PRE>
+<P>
+To complete the system, we have to define the transfer function.
+So, how can we write a function from from abstract syntax trees
+to abstract syntax trees? The embedded API does not make 
+the constructors of the type <CODE>Tree</CODE> available for users. Even if it did, it would
+be quite complicated to use the type, and programs would be likely
+to produce trees that are ill-typed in GF and therefore cannot
+be linearized.
+</P>
+<A NAME="toc139"></A>
+<H3>Exporting GF datatypes</H3>
+<P>
+The way to go in defining transfer is to use GF's tree constructors, that
+is, the <CODE>fun</CODE> functions, as if they were Haskell's data constructors.
+There is enough resemblance between GF and Haskell to make this possible
+in most cases. It is even possible in Java, as we shall see later.
+</P>
+<P>
+Thus every category of GF is translated into a Haskell datatype, where the
+functions producing a value of that category are treated as constructors.
+The translation is obtained by using the batch compiler with the command
+</P>
+<PRE>
+    % gfc -haskell Food.gfcc
+</PRE>
+<P>
+It is also possible to produce the Haskell file together with GFCC, by
+</P>
+<PRE>
+    % gfc --make -haskell FoodEng.gf FoodIta.gf
+</PRE>
+<P>
+The result is a file named <CODE>GSyntax.hs</CODE>, containing a 
+module named <CODE>GSyntax</CODE>. 
+</P>
+<P>
+<I>In GF before version 3, the same result is obtained from within GF, by the command</I>
+</P>
+<PRE>
+    &gt; print_grammar -printer=gfcc_haskell | write_file GSyntax.hs
+</PRE>
+<P></P>
+<P>
+As an example, we take
+the grammar we are going to use for queries. The abstract syntax is
+</P>
+<PRE>
+    abstract Math = {
+  
+    flags startcat = Question ;
+  
+    cat Answer ; Question ; Object ;
+  
+    fun 
+      Even   : Object -&gt; Question ;
+      Odd    : Object -&gt; Question ;
+      Prime  : Object -&gt; Question ;
+      Number : Int -&gt; Object ;
+  
+      Yes : Answer ;
+      No  : Answer ;
+    }
+</PRE>
+<P>
+It is translated to the following system of datatypes:
+</P>
+<PRE>
+  newtype GInt = GInt Integer
+  
+  data GAnswer =
+     GYes 
+   | GNo 
+  
+  data GObject = GNumber GInt 
+  
+  data GQuestion =
+     GPrime GObject 
+   | GOdd GObject 
+   | GEven GObject 
+</PRE>
+<P>
+All type and constructor names are prefixed with a <CODE>G</CODE> to prevent clashes.
+</P>
+<P>
+Now it is possible to define functions from and to these datatype, in Haskell.
+Haskell's type checker guarantees that the functions are well-typed also with
+respect to GF. Here is a question-to-answer function for this language:
+</P>
+<PRE>
+  answer :: GQuestion -&gt; GAnswer
+  answer p = case p of
+    GOdd x   -&gt; test odd x
+    GEven x  -&gt; test even x
+    GPrime x -&gt; test prime x
+  
+  value :: GObject -&gt; Int
+  value e = case e of
+    GNumber (GInt i) -&gt; fromInteger i
+  
+  test :: (Int -&gt; Bool) -&gt; GObject -&gt; GAnswer
+  test f x = if f (value x) then GYes else GNo
+</PRE>
+<P>
+So it is the function <CODE>answer</CODE> that we want to apply as transfer.
+The only problem is the <I>type</I> of this function: the parsing and
+linearization method of <CODE>API</CODE> work with <CODE>Tree</CODE>s and not
+with <CODE>GQuestion</CODE>s and <CODE>GAnswers</CODE>.
+</P>
+<P>
+Fortunately the Haskell translation of GF takes care of translating
+between trees and the generated datatypes. This is done by using
+a class with the required translation methods:
+</P>
+<PRE>
+  class Gf a where 
+    gf :: a -&gt; Tree
+    fg :: Tree -&gt; a
+</PRE>
+<P>
+The Haskell code generator also generates instances of these classes
+for each datatype, for example,
+</P>
+<PRE>
+  instance Gf GQuestion where
+    gf (GEven x1) = DTr [] (AC (CId "Even")) [gf x1]
+    gf (GOdd x1) = DTr [] (AC (CId "Odd")) [gf x1]
+    gf (GPrime x1) = DTr [] (AC (CId "Prime")) [gf x1]
+    fg t =
+      case t of
+        DTr [] (AC (CId "Even")) [x1] -&gt; GEven (fg x1)
+        DTr [] (AC (CId "Odd")) [x1] -&gt; GOdd (fg x1)
+        DTr [] (AC (CId "Prime")) [x1] -&gt; GPrime (fg x1)
+        _ -&gt; error ("no Question " ++ show t)
+</PRE>
+<P>
+Needless to say, <CODE>GSyntax</CODE> is a module that a programmer
+never needs to look into, let alone change: it is enough to know that it
+contains a systematic encoding and decoding between an abstract syntax
+and Haskell datatypes, where 
+</P>
+<UL>
+<LI>all GF names are in Haskell prefixed with <CODE>G</CODE>
+<LI><CODE>gf</CODE> translates from Haskell to GF
+<LI><CODE>fg</CODE> translates from GF to Haskell
+</UL>
+
+<A NAME="toc140"></A>
+<H3>Putting it all together</H3>
+<P>
+Here is the complete code for the Haskell module <CODE>TransferLoop.hs</CODE>.
+</P>
+<PRE>
+  module Main where
+  
+  import GF.GFCC.API
+  import TransferDef (transfer)
+  
+  main :: IO () 
+  main = do
+    gr &lt;- file2grammar "Math.gfcc"
+    loop (translate transfer gr)
+  
+  loop :: (String -&gt; String) -&gt; IO ()
+  loop trans = do 
+    s &lt;- getLine
+    if s == "quit" then putStrLn "bye" else do  
+      putStrLn $ trans s
+      loop trans
+  
+  translate :: (Tree -&gt; Tree) -&gt; MultiGrammar -&gt; String -&gt; String
+  translate tr gr = case parseAllLang gr (startCat gr) s of
+    (lg,t:_):_ -&gt; linearize gr lg (tr t)
+    _ -&gt; "NO PARSE"
+</PRE>
+<P>
+This is the <CODE>Main</CODE> module, which just needs a function <CODE>transfer</CODE> from 
+<CODE>TransferDef</CODE> in order to compile. In the current application, this module
+looks as follows:
+</P>
+<PRE>
+  module TransferDef where
+  
+  import GF.GFCC.API (Tree)
+  import GSyntax
+  
+  transfer :: Tree -&gt; Tree
+  transfer = gf . answer . fg
+  
+  answer :: GQuestion -&gt; GAnswer
+  answer p = case p of
+    GOdd x   -&gt; test odd x
+    GEven x  -&gt; test even x
+    GPrime x -&gt; test prime x
+  
+  value :: GObject -&gt; Int
+  value e = case e of
+    GNumber (GInt i) -&gt; fromInteger i
+  
+  test :: (Int -&gt; Bool) -&gt; GObject -&gt; GAnswer
+  test f x = if f (value x) then GYes else GNo
+  
+  prime :: Int -&gt; Bool
+  prime x = elem x primes where
+    primes = sieve [2 .. x]
+    sieve (p:xs) = p : sieve [ n | n &lt;- xs, n `mod` p &gt; 0 ]
+    sieve [] = []
+</PRE>
+<P>
+This module, in turn, needs <CODE>GSyntax</CODE> to compile, and the main module
+needs <CODE>Math.gfcc</CODE> to run. To automate the production of the system,
+we write a <CODE>Makefile</CODE> as follows:
+</P>
+<PRE>
+  all:
+          gfc --make -haskell MathEng.gf MathFre.gf
+          ghc --make -o ./math TransferLoop.hs
+          strip math
+</PRE>
+<P>
+(Notice that the empty segments starting the command lines in a Makefile must be tabs.)
+Now we can compile the whole system by just typing 
+</P>
+<PRE>
+    make
+</PRE>
+<P>
+Then you can run it by typing
+</P>
+<PRE>
+    ./math
+</PRE>
+<P>
+Well --- you will of course need some concrete syntaxes of <CODE>Math</CODE> in order
+to succeed. We have defined ours by using the resource functor design pattern,
+as explained <a href="#secfunctor">here</a>.
+</P>
+<P>
+Just to summarize, the source of the application consists of the following files:
+</P>
+<PRE>
+    Makefile         -- a makefile
+    Math.gf          -- abstract syntax
+    Math???.gf       -- concrete syntaxes
+    TransferDef.hs   -- definition of question-to-answer function
+    TransferLoop.hs  -- Haskell Main module
+</PRE>
+<P></P>
+<A NAME="toc141"></A>
+<H2>Embedded GF applications in Java</H2>
+<P>
+When an API for GFCC in Java is available,
+we will write the same applications in Java as
+were written in Haskell above. Until then, we will
+build another kind of application, which does not require
+modification of generated Java code.
+</P>
+<P>
+More information on embedded GF grammars in Java can be found in the document
+</P>
+<PRE>
+    www.cs.chalmers.se/~bringert/gf/gf-java.html
+</PRE>
+<P>
+by Bj�rn Bringert.
+</P>
+<A NAME="toc142"></A>
+<H3>Translets</H3>
+<P>
+A Java system needs many more files than a Haskell system.
+To get started, one can fetch the package <CODE>gfc2java</CODE> from
+</P>
+<PRE>
+    www.cs.chalmers.se/~bringert/darcs/gfc2java/
+</PRE>
+<P>
+by using the Darcs version control system as described in the <CODE>gf-java</CODE> home page.
+</P>
+<P>
+The <CODE>gfc2java</CODE> package contains a script <CODE>build-translet</CODE>, which can be applied
+to any <CODE>.gfcm</CODE> file to create a <B>translet</B>, a small translation GUI. Foor the <CODE>Food</CODE>
+grammars of <a href="#chapthree">the third chapter</a>, we first create a file <CODE>food.gfcm</CODE> by
+</P>
+<PRE>
+    % echo "pm | wf food.gfcm" | gf FoodEng.gf FoodIta.gf
+</PRE>
+<P>
+and then run
+</P>
+<PRE>
+    % build_translet food.gfcm
+</PRE>
+<P>
+The resulting file <CODE>translate-food.jar</CODE> can be run with
+</P>
+<PRE>
+    % java -jar translate-food.jar
+</PRE>
+<P>
+The translet looks like this:
+</P>
+<P>
+  <IMG ALIGN="right" SRC="food-translet.png" BORDER="0" ALT="">
+</P>
+<A NAME="toc143"></A>
+<H3>Dialogue systems</H3>
+<P>
+A question-answer system is a special case of a <B>dialogue system</B>, where the user and
+the computer communicate by writing or, even more properly, by speech. The <CODE>gf-java</CODE>
+homepage provides an example of a most simple dialogue system imaginable, where two
+the conversation has just two rules:
+</P>
+<UL>
+<LI>if the user says <I>here you go</I>, the system says <I>thanks</I>
+<LI>if the user says <I>thanks</I>, the system says <I>you are welcome</I>
+</UL>
+
+<P>
+The conversation can be made in both English and Swedish; the user's initiative
+decides which language the system replies in. Thus the structure is very similar
+to the <CODE>math</CODE> program <a href="#secmathprogram">here</a>. The GF and 
+Java sources of the program can be
+found in 
+</P>
+<PRE>
+    www.cs.chalmers.se/~bringert/darcs/simpledemo
+</PRE>
+<P>
+again accessible with the Darcs version control system.
+</P>
+<A NAME="toc144"></A>
+<H2>Language models for speech recognition</H2>
+<P>
+The standard way of using GF in speech recognition is by building
+<B>grammar-based language models</B>. To this end, GF comes with compilers
+into several formats that are used in speech recognition systems.
+One such format is GSL, used in the <A HREF="http://www.nuance.com">Nuance speech recognizer</A>.
+It is produced from GF simply by printing a grammar with the flag
+<CODE>-printer=gsl</CODE>. The following example uses the smart house grammar defined
+<a href="#secsmarthouse">here</a>.
+</P>
+<PRE>
+    &gt; import -conversion=finite SmartEng.gf
+    &gt; print_grammar -printer=gsl
+  
+    ;GSL2.0
+    ; Nuance speech recognition grammar for SmartEng
+    ; Generated by GF
+  
+    .MAIN SmartEng_2
+  
+    SmartEng_0 [("switch" "off") ("switch" "on")]
+    SmartEng_1 ["dim" ("switch" "off")
+                ("switch" "on")]
+    SmartEng_2 [(SmartEng_0 SmartEng_3)
+                (SmartEng_1 SmartEng_4)]
+    SmartEng_3 ("the" SmartEng_5)
+    SmartEng_4 ("the" SmartEng_6)
+    SmartEng_5 "fan"
+    SmartEng_6 "light"
+</PRE>
+<P>
+Other formats available via the <CODE>-printer</CODE> flag include:
+</P>
+<TABLE ALIGN="center" CELLPADDING="4" BORDER="1">
+<TR>
+<TH>Format</TH>
+<TH COLSPAN="2">Description</TH>
+</TR>
+<TR>
+<TD><CODE>gsl</CODE></TD>
+<TD>Nuance GSL speech recognition grammar</TD>
+</TR>
+<TR>
+<TD><CODE>jsgf</CODE></TD>
+<TD>Java Speech Grammar Format (JSGF)</TD>
+</TR>
+<TR>
+<TD><CODE>jsgf_sisr_old</CODE></TD>
+<TD>JSGF with semantic tags in SISR  WD 20030401 format</TD>
+</TR>
+<TR>
+<TD><CODE>srgs_abnf</CODE></TD>
+<TD>SRGS ABNF format</TD>
+</TR>
+<TR>
+<TD><CODE>srgs_xml</CODE></TD>
+<TD>SRGS XML format</TD>
+</TR>
+<TR>
+<TD><CODE>srgs_xml_prob</CODE></TD>
+<TD>SRGS XML format, with weights</TD>
+</TR>
+<TR>
+<TD><CODE>slf</CODE></TD>
+<TD>finite automaton in the HTK SLF format</TD>
+</TR>
+<TR>
+<TD><CODE>slf_sub</CODE></TD>
+<TD>finite automaton with sub-automata in HTK SLF</TD>
+</TR>
+</TABLE>
+
+<P></P>
+<P>
+All currently available formats can be seen in gf with <CODE>help -printer</CODE>.
+</P>
+<A NAME="toc145"></A>
+<H2>Dependent types and spoken language models</H2>
+<P>
+We have used dependent types to control semantic well-formedness
+in grammars. This is important in traditional type theory 
+applications such as proof assistants, where only mathematically
+meaningful formulas should be constructed. But semantic filtering has
+also proved important in speech recognition, because it reduces the
+ambiguity of the results.
+</P>
+<P>
+Now, GSL is a context-free format, so how does it cope with dependent types?
+In general, dependent types can give rise to infinitely many basic types
+(exercise!), whereas a context-free grammar can by definition only have
+finitely many nonterminals.
+</P>
+<P>
+This is where the flag <CODE>-conversion=finite</CODE> is needed in the <CODE>import</CODE>
+command. Its effect is to convert a GF grammar with dependent types to
+one without, so that each instance of a dependent type is replaced by
+an atomic type. This can then be used as a nonterminal in a context-free
+grammar. The <CODE>finite</CODE> conversion presupposes that every 
+dependent type has only finitely many instances, which is in fact
+the case in the <CODE>Smart</CODE> grammar.
+</P>
+<P>
+<B>Exercise</B>. If you have access to the Nuance speech recognizer,
+test it with GF-generated language models for <CODE>SmartEng</CODE>. Do this
+both with and without <CODE>-conversion=finite</CODE>.
+</P>
+<P>
+<B>Exercise</B>. Construct an abstract syntax with infinitely many instances
+of dependent types.
+</P>
+<A NAME="toc146"></A>
+<H3>Statistical language models</H3>
+<P>
+An alternative to grammar-based language models are
+<B>statistical language models</B> (<B>SLM</B>s). An SLM is 
+built from a <B>corpus</B>, i.e. a set of utterances. It specifies the
+probability of each <B>n-gram</B>, i.e. sequence of <I>n</I> words. The
+typical value of <I>n</I> is 2 (bigrams) or 3 (trigrams).
+</P>
+<P>
+One advantage of SLMs over grammar-based models is that they are
+<B>robust</B>, i.e. they can be used to recognize sequences that would
+be out of the grammar or the corpus. Another advantage is that
+an SLM can be built "for free" if a corpus is available.
+</P>
+<P>
+However, collecting a corpus can require a lot of work, and writing
+a grammar can be less demanding, especially with tools such as GF or
+Regulus. This advantage of grammars can be combined with robustness
+by creating a back-up SLM from a <B>synthesized corpus</B>. This means
+simply that the grammar is used for generating such a corpus. 
+In GF, this can be done with the <CODE>generate_trees</CODE> command.
+As with grammar-based models, the quality of the SLM is better
+if meaningless utterances are excluded from the corpus. Thus 
+a good way to generate an SLM from a GF grammar is by using
+dependent types and filter the results through the type checker:
+</P>
+<PRE>
+    &gt; generate_trees | put_trees -transform=solve | linearize
+</PRE>
+<P>
+The method of creating statistical language model from corpora synthesized
+from GF grammars is applied and evaluated in (Jonson 2006).
+</P>
+<P>
+<B>Exercise</B>. Measure the size of the corpus generated from
+<CODE>SmartEng</CODE> (defined <a href="#secsmarthouse">here</a>), with and without type checker filtering.
+</P>
+
+<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) -->
+<!-- cmdline: txt2tags -thtml -\-toc gf-tutorial.txt -->
+</BODY></HTML>
author	aarne <aarne@cs.chalmers.se>	2007-12-21 15:10:38 +0000
committer	aarne <aarne@cs.chalmers.se>	2007-12-21 15:10:38 +0000
commit	5ee1714fd23e974d1cf2511fa398b6ce310a9807 (patch)
tree	7a82f85d4f4681086430fdefd7903e4a26015c3f /doc/gf-tutorial.html
parent	c5017f28aad7702838b9861aa3f6cbf7b3bacca5 (diff)