| author | aarne <aarne@chalmers.se> | 2010-12-23 09:33:11 +0000 |
|---|---|---|
| committer | aarne <aarne@chalmers.se> | 2010-12-23 09:33:11 +0000 |
| commit | 3036881d206332a8ca9ed4d97715bf24142227a4 (patch) | |
| tree | cd729ea6c66bd654efbd4d22ee83fb9b30901d0b /doc/tutorial/gf-tutorial.html | |
| parent | 0b4aad88f6924672f45f3f3405ad4854700e85ab (diff) | |
updated tutorial and quickstart for 3.2
Diffstat (limited to 'doc/tutorial/gf-tutorial.html')
| -rw-r--r-- | doc/tutorial/gf-tutorial.html | 520 |
1 files changed, 468 insertions, 52 deletions
diff --git a/doc/tutorial/gf-tutorial.html b/doc/tutorial/gf-tutorial.html index 46b17b96b..3652df3a1 100644 --- a/doc/tutorial/gf-tutorial.html +++ b/doc/tutorial/gf-tutorial.html @@ -8,12 +8,264 @@ <P ALIGN="center"><CENTER><H1>Grammatical Framework Tutorial</H1> <FONT SIZE="4"> <I>Aarne Ranta</I><BR> -December 2010 (November 2008) +December 2010 for GF 3.2 </FONT></CENTER> +<P></P> +<HR NOSHADE SIZE=1> +<P></P> + <UL> + <LI><A HREF="#toc1">Overview</A> + <UL> + <LI><A HREF="#toc2">Outline</A> + <LI><A HREF="#toc3">Slides</A> + </UL> + <LI><A HREF="#toc4">Lesson 1: Getting Started with GF</A> + <UL> + <LI><A HREF="#toc5">What GF is</A> + <LI><A HREF="#toc6">GF grammars and language processing tasks</A> + <LI><A HREF="#toc7">Getting the GF system</A> + <LI><A HREF="#toc8">Running the GF system</A> + <LI><A HREF="#toc9">A "Hello World" grammar</A> + <UL> + <LI><A HREF="#toc10">The program: abstract syntax and concrete syntaxes</A> + <LI><A HREF="#toc11">Using grammars in the GF system</A> + <LI><A HREF="#toc12">Exercises on the Hello World grammar</A> + </UL> + <LI><A HREF="#toc13">Using grammars from outside GF</A> + <LI><A HREF="#toc14">GF scripts</A> + <LI><A HREF="#toc15">What else can be done with the grammar</A> + <LI><A HREF="#toc16">Embedded grammar applications</A> + </UL> + <LI><A HREF="#toc17">Lesson 2: Designing a grammar for complex phrases</A> + <UL> + <LI><A HREF="#toc18">The abstract syntax Food</A> + <LI><A HREF="#toc19">The concrete syntax FoodEng</A> + <UL> + <LI><A HREF="#toc20">Exercises on the Food grammar</A> + </UL> + <LI><A HREF="#toc21">Commands for testing grammars</A> + <UL> + <LI><A HREF="#toc22">Generating trees and strings</A> + <LI><A HREF="#toc23">Exercises on generation</A> + <LI><A HREF="#toc24">More on pipes: tracing</A> + <LI><A HREF="#toc25">Writing and reading files</A> + <LI><A HREF="#toc26">Visualizing trees</A> + <LI><A HREF="#toc27">System commands</A> + </UL> + <LI><A HREF="#toc28">An Italian concrete syntax</A> + <UL> + 
<LI><A HREF="#toc29">Exercises on multilinguality</A> + </UL> + <LI><A HREF="#toc30">Free variation</A> + <LI><A HREF="#toc31">More application of multilingual grammars</A> + <UL> + <LI><A HREF="#toc32">Multilingual treebanks</A> + <LI><A HREF="#toc33">Translation quiz</A> + </UL> + <LI><A HREF="#toc34">Context-free grammars and GF</A> + <UL> + <LI><A HREF="#toc35">The "cf" grammar format</A> + <LI><A HREF="#toc36">Restrictions of context-free grammars</A> + </UL> + <LI><A HREF="#toc37">Modules and files</A> + <LI><A HREF="#toc38">Using operations and resource modules</A> + <UL> + <LI><A HREF="#toc39">Operation definitions</A> + <LI><A HREF="#toc40">The ``resource`` module type</A> + <LI><A HREF="#toc41">Opening a resource</A> + <LI><A HREF="#toc42">Partial application</A> + <LI><A HREF="#toc43">Testing resource modules</A> + </UL> + <LI><A HREF="#toc44">Grammar architecture</A> + <UL> + <LI><A HREF="#toc45">Extending a grammar</A> + <LI><A HREF="#toc46">Multiple inheritance</A> + </UL> + </UL> + <LI><A HREF="#toc47">Lesson 3: Grammars with parameters</A> + <UL> + <LI><A HREF="#toc48">The problem: words have to be inflected</A> + <LI><A HREF="#toc49">Parameters and tables</A> + <LI><A HREF="#toc50">Inflection tables and paradigms</A> + <UL> + <LI><A HREF="#toc51">Exercises on morphology</A> + </UL> + <LI><A HREF="#toc52">Using parameters in concrete syntax</A> + <UL> + <LI><A HREF="#toc53">Agreement</A> + <LI><A HREF="#toc54">Determiners</A> + <LI><A HREF="#toc55">Parametric vs. 
inherent features</A> + </UL> + <LI><A HREF="#toc56">An English concrete syntax for Foods with parameters</A> + <LI><A HREF="#toc57">More on inflection paradigms</A> + <UL> + <LI><A HREF="#toc58">Worst-case functions</A> + <LI><A HREF="#toc59">Smart paradigms</A> + <LI><A HREF="#toc60">Exercises on regular patterns</A> + <LI><A HREF="#toc61">Function types with variables</A> + <LI><A HREF="#toc62">Separating operation types and definitions</A> + <LI><A HREF="#toc63">Overloading of operations</A> + <LI><A HREF="#toc64">Morphological analysis and morphology quiz</A> + </UL> + <LI><A HREF="#toc65">The Italian Foods grammar</A> + <UL> + <LI><A HREF="#toc66">Exercises on using parameters</A> + </UL> + <LI><A HREF="#toc67">Discontinuous constituents</A> + <LI><A HREF="#toc68">Strings at compile time vs. run time</A> + <UL> + <LI><A HREF="#toc69">Supplementary constructs for concrete syntax</A> + </UL> + </UL> + <LI><A HREF="#toc70">Lesson 4: Using the resource grammar library</A> + <UL> + <LI><A HREF="#toc71">The coverage of the library</A> + <LI><A HREF="#toc72">The structure of the library</A> + <UL> + <LI><A HREF="#toc73">Lexical vs. 
phrasal rules</A> + <LI><A HREF="#toc74">Lexical categories</A> + <LI><A HREF="#toc75">Lexical rules</A> + <LI><A HREF="#toc76">Resource lexicon</A> + <LI><A HREF="#toc77">Phrasal categories</A> + <LI><A HREF="#toc78">Syntactic combinations</A> + <LI><A HREF="#toc79">Example syntactic combination</A> + </UL> + <LI><A HREF="#toc80">The resource API</A> + <UL> + <LI><A HREF="#toc81">A miniature resource API: categories</A> + <LI><A HREF="#toc82">A miniature resource API: rules</A> + <LI><A HREF="#toc83">A miniature resource API: structural words</A> + <LI><A HREF="#toc84">A miniature resource API: paradigms</A> + <LI><A HREF="#toc85">A miniature resource API: more paradigms</A> + <LI><A HREF="#toc86">Exercises</A> + </UL> + <LI><A HREF="#toc87">Example: English</A> + <UL> + <LI><A HREF="#toc88">English example: linearization types and combination rules</A> + <LI><A HREF="#toc89">English example: lexical rules</A> + <LI><A HREF="#toc90">English example: exercises</A> + </UL> + <LI><A HREF="#toc91">Functor implementation of multilingual grammars</A> + <UL> + <LI><A HREF="#toc92">New language by copy and paste</A> + <LI><A HREF="#toc93">Functors: functions on the module level</A> + <LI><A HREF="#toc94">Code for the Foods functor</A> + <LI><A HREF="#toc95">Code for the LexFoods interface</A> + <LI><A HREF="#toc96">Code for a German instance of the lexicon</A> + <LI><A HREF="#toc97">Code for a German functor instantiation</A> + <LI><A HREF="#toc98">Adding languages to a functor implementation</A> + <LI><A HREF="#toc99">Example: adding Finnish</A> + <LI><A HREF="#toc100">A design pattern</A> + <LI><A HREF="#toc101">Functors: exercises</A> + </UL> + <LI><A HREF="#toc102">Restricted inheritance</A> + <UL> + <LI><A HREF="#toc103">A problem with functors</A> + <LI><A HREF="#toc104">Restricted inheritance: include or exclude</A> + <LI><A HREF="#toc105">The functor problem solved</A> + </UL> + <LI><A HREF="#toc106">Grammar reuse</A> + <UL> + <LI><A HREF="#toc107">Library 
exercises</A> + </UL> + <LI><A HREF="#toc108">Tenses</A> + </UL> + <LI><A HREF="#toc109">Lesson 5: Refining semantics in abstract syntax</A> + <UL> + <LI><A HREF="#toc110">Dependent types</A> + <UL> + <LI><A HREF="#toc111">A dependent type system</A> + <LI><A HREF="#toc112">Examples of devices and actions</A> + <LI><A HREF="#toc113">Linearization and parsing with dependent types</A> + <LI><A HREF="#toc114">Solving metavariables</A> + </UL> + <LI><A HREF="#toc115">Polymorphism</A> + <UL> + <LI><A HREF="#toc116">Dependent types: exercises</A> + </UL> + <LI><A HREF="#toc117">Proof objects</A> + <UL> + <LI><A HREF="#toc118">Proof-carrying documents</A> + </UL> + <LI><A HREF="#toc119">Restricted polymorphism</A> + <UL> + <LI><A HREF="#toc120">Example: classes for switching and dimming</A> + </UL> + <LI><A HREF="#toc121">Variable bindings</A> + <UL> + <LI><A HREF="#toc122">Higher-order abstract syntax</A> + <LI><A HREF="#toc123">Higher-order abstract syntax: linearization</A> + <LI><A HREF="#toc124">Eta expansion</A> + <LI><A HREF="#toc125">Parsing variable bindings</A> + <LI><A HREF="#toc126">Exercises on variable bindings</A> + </UL> + <LI><A HREF="#toc127">Semantic definitions</A> + <UL> + <LI><A HREF="#toc128">Computing a tree</A> + <LI><A HREF="#toc129">Definitional equality</A> + <LI><A HREF="#toc130">Judgement forms for constructors</A> + <LI><A HREF="#toc131">Exercises on semantic definitions</A> + </UL> + <LI><A HREF="#toc132">Lesson 6: Grammars of formal languages</A> + <UL> + <LI><A HREF="#toc133">Arithmetic expressions</A> + <LI><A HREF="#toc134">Concrete syntax: a simple approach</A> + </UL> + <LI><A HREF="#toc135">Lexing and unlexing</A> + <UL> + <LI><A HREF="#toc136">Most common lexers and unlexers</A> + </UL> + <LI><A HREF="#toc137">Precedence and fixity</A> + <UL> + <LI><A HREF="#toc138">Precedence as a parameter</A> + <LI><A HREF="#toc139">Fixities</A> + <LI><A HREF="#toc140">Exercises on precedence</A> + </UL> + <LI><A HREF="#toc141">Code generation as 
linearization</A> + <UL> + <LI><A HREF="#toc142">Programs with variables</A> + <LI><A HREF="#toc143">Exercises on code generation</A> + </UL> + </UL> + <LI><A HREF="#toc144">Lesson 7: Embedded grammars</A> + <UL> + <LI><A HREF="#toc145">Functionalities of an embedded grammar format</A> + <LI><A HREF="#toc146">The portable grammar format</A> + <UL> + <LI><A HREF="#toc147">Haskell: the EmbedAPI module</A> + <LI><A HREF="#toc148">First application: a translator</A> + <LI><A HREF="#toc149">Producing PGF for the translator</A> + <LI><A HREF="#toc150">A translator loop</A> + <LI><A HREF="#toc151">A question-answer system</A> + <LI><A HREF="#toc152">Abstract syntax of the query system</A> + <LI><A HREF="#toc153">Exporting GF datatypes to Haskell</A> + <LI><A HREF="#toc154">The question-answer function</A> + <LI><A HREF="#toc155">Converting between Haskell and GF trees</A> + <LI><A HREF="#toc156">Putting it all together: the transfer definition</A> + <LI><A HREF="#toc157">Putting it all together: the Main module</A> + <LI><A HREF="#toc158">Putting it all together: the Makefile</A> + </UL> + <LI><A HREF="#toc159">Web server applications</A> + <LI><A HREF="#toc160">JavaScript applications</A> + <UL> + <LI><A HREF="#toc161">Compiling to JavaScript</A> + <LI><A HREF="#toc162">Using the JavaScript grammar</A> + </UL> + <LI><A HREF="#toc163">Language models for speech recognition</A> + <UL> + <LI><A HREF="#toc164">More speech recognition grammar formats</A> + </UL> + </UL> + </UL> + +<P></P> +<HR NOSHADE SIZE=1> +<P></P> <P> <!-- NEW --> </P> +<A NAME="toc1"></A> <H1>Overview</H1> <P> This is a hands-on introduction to grammar writing in GF. @@ -40,6 +292,7 @@ Prerequisites: <P> <!-- NEW --> </P> +<A NAME="toc2"></A> <H2>Outline</H2> <P> <a href="#chaptwo">Lesson 1</a>: a multilingual "Hello World" grammar. English, Finnish, Italian. @@ -66,6 +319,7 @@ and <B>semantic definitions</B>. 
<P> <!-- NEW --> </P> +<A NAME="toc3"></A> <H2>Slides</H2> <P> You can chop this tutorial into a set of slides by the command @@ -89,6 +343,7 @@ upper left corner of each slide, and the links behind the "Contents" link. <P> <!-- NEW --> </P> +<A NAME="toc4"></A> <H1>Lesson 1: Getting Started with GF</H1> <P> <a name="chaptwo"></a> @@ -105,6 +360,7 @@ Goals: <P> <!-- NEW --> </P> +<A NAME="toc5"></A> <H2>What GF is</H2> <P> We use the term GF for three different things: @@ -133,6 +389,7 @@ using the GF system. <P> <!-- NEW --> </P> +<A NAME="toc6"></A> <H2>GF grammars and language processing tasks</H2> <P> A GF program is called a <B>grammar</B>. @@ -160,6 +417,7 @@ In general, a GF grammar is <B>multilingual</B>: <P> <!-- NEW --> </P> +<A NAME="toc7"></A> <H2>Getting the GF system</H2> <P> Open-source free software, downloaded via the GF Homepage: @@ -188,6 +446,7 @@ instructions in the <A HREF="../gf-developers.html">Developers Guide</A>. <P> <!-- NEW --> </P> +<A NAME="toc8"></A> <H2>Running the GF system</H2> <P> Type <CODE>gf</CODE> in the Unix (or Cygwin) shell: @@ -220,6 +479,7 @@ follow them. <P> <!-- NEW --> </P> +<A NAME="toc9"></A> <H2>A "Hello World" grammar</H2> <P> Like most programming language tutorials, we start with a @@ -237,6 +497,7 @@ Extra features: <P> <!-- NEW --> </P> +<A NAME="toc10"></A> <H3>The program: abstract syntax and concrete syntaxes</H3> <P> A GF program, in general, is a <B>multilingual grammar</B>. Its main parts @@ -356,6 +617,7 @@ Finnish and an Italian concrete syntaxes: <P> <!-- NEW --> </P> +<A NAME="toc11"></A> <H3>Using grammars in the GF system</H3> <P> In order to compile the grammar in GF, @@ -462,6 +724,7 @@ Linearization is by default to all available languages. <P> <!-- NEW --> </P> +<A NAME="toc12"></A> <H3>Exercises on the Hello World grammar</H3> <OL> <LI>Test the parsing and translation examples shown above, as well as @@ -491,6 +754,7 @@ of a variable. Inspect the error messages generated by GF. 
<P> <!-- NEW --> </P> +<A NAME="toc13"></A> <H2>Using grammars from outside GF</H2> <P> You can use the <CODE>gf</CODE> program in a Unix pipe. @@ -516,6 +780,7 @@ You can also write a <B>script</B>, a file containing the lines <P> <!-- NEW --> </P> +<A NAME="toc14"></A> <H2>GF scripts</H2> <P> If we name this script <CODE>hello.gfs</CODE>, we can do @@ -541,6 +806,7 @@ translation to the output. <P> <!-- NEW --> </P> +<A NAME="toc15"></A> <H2>What else can be done with the grammar</H2> <P> Some more functions that will be covered: @@ -559,6 +825,7 @@ Some more functions that will be covered: <P> <!-- NEW --> </P> +<A NAME="toc16"></A> <H2>Embedded grammar applications</H2> <P> Application programs, using techniques from <a href="#chapeight">Lesson 7</a>: @@ -580,6 +847,7 @@ Application programs, using techniques from <a href="#chapeight">Lesson 7</a>: <P> <!-- NEW --> </P> +<A NAME="toc17"></A> <H1>Lesson 2: Designing a grammar for complex phrases</H1> <P> <a name="chapthree"></a> @@ -596,6 +864,7 @@ Goals: <P> <!-- NEW --> </P> +<A NAME="toc18"></A> <H2>The abstract syntax Food</H2> <P> Phrases usable for speaking about food: @@ -643,6 +912,7 @@ Example <CODE>Phrase</CODE> <P> <!-- NEW --> </P> +<A NAME="toc19"></A> <H2>The concrete syntax FoodEng</H2> <PRE> concrete FoodEng of Food = { @@ -690,6 +960,7 @@ Parse in other categories setting the <CODE>cat</CODE> flag: <P> <!-- NEW --> </P> +<A NAME="toc20"></A> <H3>Exercises on the Food grammar</H3> <OL> <LI>Extend the <CODE>Food</CODE> grammar by ten new food kinds and @@ -706,7 +977,9 @@ the prefix can occur at most once. 
<P> <!-- NEW --> </P> +<A NAME="toc21"></A> <H2>Commands for testing grammars</H2> +<A NAME="toc22"></A> <H3>Generating trees and strings</H3> <P> Random generation (<CODE>generate_random = gr</CODE>): build @@ -768,6 +1041,7 @@ What options a command has can be seen by the <CODE>help = h</CODE> command: <P> <!-- NEW --> </P> +<A NAME="toc23"></A> <H3>Exercises on generation</H3> <OL> <LI>If the command <CODE>gt</CODE> generated all @@ -781,6 +1055,7 @@ use the Unix <B>word count</B> command <CODE>wc</CODE> to count lines. <P> <!-- NEW --> </P> +<A NAME="toc24"></A> <H3>More on pipes: tracing</H3> <P> Put the <B>tracing</B> option <CODE>-tr</CODE> to each command whose output you @@ -805,6 +1080,7 @@ strings, and try out the ambiguity test. <P> <!-- NEW --> </P> +<A NAME="toc25"></A> <H3>Writing and reading files</H3> <P> To save the outputs into a file, pipe it to the <CODE>write_file = wf</CODE> command, @@ -829,6 +1105,7 @@ of grammars - the most systematic way to do this is by <P> <!-- NEW --> </P> +<A NAME="toc26"></A> <H3>Visualizing trees</H3> <P> Parentheses give a linear representation of trees, @@ -867,10 +1144,21 @@ program (from the Graphviz package). <PRE> % dot -Tpng _grph.dot > mytree.png </PRE> +<P> +You can also visualize <B>parse trees</B>, which show categories and words instead of +function symbols. The command is <CODE>visualize_parse = vp</CODE>: +</P> +<PRE> + > parse "this delicious cheese is very Italian" | visualize_parse +</PRE> <P></P> <P> +<IMG ALIGN="middle" SRC="myparse.png" BORDER="0" ALT=""> +</P> +<P> <!-- NEW --> </P> +<A NAME="toc27"></A> <H3>System commands</H3> <P> You can give a <B>system command</B> without leaving GF: @@ -882,10 +1170,10 @@ You can give a <B>system command</B> without leaving GF: </PRE> <P> A system command may also receive its argument from -a GF pipes. It then has the name <CODE>sp</CODE> = <CODE>system_pipe</CODE>: +a GF pipes. 
It then uses the symbol <CODE>?</CODE>: </P> <PRE> - > generate_trees -depth=4 | sp -command="wc -l" + > generate_trees -depth=4 | ? wc -l </PRE> <P> This command example returns the number of generated trees. @@ -899,6 +1187,7 @@ a system pipe from a GF command into a Unix command. <P> <!-- NEW --> </P> +<A NAME="toc28"></A> <H2>An Italian concrete syntax</H2> <P> <a name="secanitalian"></a> @@ -953,6 +1242,7 @@ which are introduced in <a href="#chaptwo">Lesson 3</a>.) <P> <!-- NEW --> </P> +<A NAME="toc29"></A> <H3>Exercises on multilinguality</H3> <OL> <LI>Write a concrete syntax of <CODE>Food</CODE> for some other language. @@ -970,6 +1260,7 @@ after having worked out <a href="#chaptwo">Lesson 3</a>. <P> <!-- NEW --> </P> +<A NAME="toc30"></A> <H2>Free variation</H2> <P> Semantically indistinguishable ways of expressing a thing. @@ -1017,7 +1308,9 @@ a variant list must be of the same type. <P> <!-- NEW --> </P> +<A NAME="toc31"></A> <H2>More application of multilingual grammars</H2> +<A NAME="toc32"></A> <H3>Multilingual treebanks</H3> <P> <a name="sectreebank"></a> @@ -1041,6 +1334,7 @@ linearizations in different languages: <P> <!-- NEW --> </P> +<A NAME="toc33"></A> <H3>Translation quiz</H3> <P> <CODE>translation_quiz = tq</CODE>: @@ -1072,7 +1366,9 @@ answer given in another language. <P> <!-- NEW --> </P> +<A NAME="toc34"></A> <H2>Context-free grammars and GF</H2> +<A NAME="toc35"></A> <H3>The "cf" grammar format</H3> <P> The grammar <CODE>FoodEng</CODE> can be written in a BNF format as follows: @@ -1106,6 +1402,7 @@ The compiler creates separate abstract and concrete modules internally. <P> <!-- NEW --> </P> +<A NAME="toc36"></A> <H3>Restrictions of context-free grammars</H3> <P> Separating concrete and abstract syntax allows @@ -1124,6 +1421,7 @@ copy language <CODE>{x x | x <- (a|b)*}</CODE> in GF. 
<P> <!-- NEW --> </P> +<A NAME="toc37"></A> <H2>Modules and files</H2> <P> GF uses suffixes to recognize different file formats: @@ -1169,7 +1467,9 @@ a second time? Try this in different situations: <P> <!-- NEW --> </P> +<A NAME="toc38"></A> <H2>Using operations and resource modules</H2> +<A NAME="toc39"></A> <H3>Operation definitions</H3> <P> The golden rule of functional programmin: @@ -1231,6 +1531,7 @@ sugar for abstraction: <P> <!-- NEW --> </P> +<A NAME="toc40"></A> <H3>The ``resource`` module type</H3> <P> The <CODE>resource</CODE> module type is used to package @@ -1249,6 +1550,7 @@ The <CODE>resource</CODE> module type is used to package <P> <!-- NEW --> </P> +<A NAME="toc41"></A> <H3>Opening a resource</H3> <P> Any number of <CODE>resource</CODE> modules can be @@ -1281,6 +1583,7 @@ Any number of <CODE>resource</CODE> modules can be <P> <!-- NEW --> </P> +<A NAME="toc42"></A> <H3>Partial application</H3> <P> <a name="secpartapp"></a> @@ -1318,6 +1621,7 @@ such that it allows you to write <P> <!-- NEW --> </P> +<A NAME="toc43"></A> <H3>Testing resource modules</H3> <P> Import with the flag <CODE>-retain</CODE>, @@ -1336,10 +1640,12 @@ Compute the value with <CODE>compute_concrete = cc</CODE>, <P> <!-- NEW --> </P> +<A NAME="toc44"></A> <H2>Grammar architecture</H2> <P> <a name="secarchitecture"></a> </P> +<A NAME="toc45"></A> <H3>Extending a grammar</H3> <P> A new module can <B>extend</B> an old one: @@ -1395,6 +1701,7 @@ possible to build resource hierarchies. <P> <!-- NEW --> </P> +<A NAME="toc46"></A> <H3>Multiple inheritance</H3> <P> Extend several grammars at the same time: @@ -1428,6 +1735,7 @@ where <P> <!-- NEW --> </P> +<A NAME="toc47"></A> <H1>Lesson 3: Grammars with parameters</H1> <P> <a name="chapfour"></a> @@ -1456,6 +1764,7 @@ could be left to library implementors. 
<P> <!-- NEW --> </P> +<A NAME="toc48"></A> <H2>The problem: words have to be inflected</H2> <P> Plural forms are needed in things like @@ -1488,6 +1797,7 @@ adjectives, and verbs can have in some languages that you know. <P> <!-- NEW --> </P> +<A NAME="toc49"></A> <H2>Parameters and tables</H2> <P> We define the <B>parameter type</B> of number in English by @@ -1598,6 +1908,7 @@ module, which you can test by using the command <CODE>compute_concrete</CODE>. <P> <!-- NEW --> </P> +<A NAME="toc50"></A> <H2>Inflection tables and paradigms</H2> <P> A morphological <B>paradigm</B> is a formula telling how a class of @@ -1649,6 +1960,7 @@ uses a <B>wild card</B> pattern <CODE>_</CODE>. <P> <!-- NEW --> </P> +<A NAME="toc51"></A> <H3>Exercises on morphology</H3> <OL> <LI>Identify cases in which the <CODE>regNoun</CODE> paradigm does not @@ -1661,6 +1973,7 @@ considered in earlier exercises. <P> <!-- NEW --> </P> +<A NAME="toc52"></A> <H2>Using parameters in concrete syntax</H2> <P> Purpose: a more radical @@ -1685,6 +1998,7 @@ This will force us to deal with gender- <P> <!-- NEW --> </P> +<A NAME="toc53"></A> <H3>Agreement</H3> <P> In English, the phrase-forming rule @@ -1726,6 +2040,7 @@ Now we can write <P> <!-- NEW --> </P> +<A NAME="toc54"></A> <H3>Determiners</H3> <P> How does an <CODE>Item</CODE> subject receive its number? The rules @@ -1795,6 +2110,7 @@ In a more <B>lexicalized</B> grammar, determiners would be a category: <P> <!-- NEW --> </P> +<A NAME="toc55"></A> <H3>Parametric vs. inherent features</H3> <P> <CODE>Kind</CODE>s have number as a <B>parametric feature</B>: both singular and plural @@ -1862,6 +2178,7 @@ Notice <P> <!-- NEW --> </P> +<A NAME="toc56"></A> <H2>An English concrete syntax for Foods with parameters</H2> <P> We use some string operations from the library <CODE>Prelude</CODE> are used. @@ -1926,6 +2243,7 @@ We use some string operations from the library <CODE>Prelude</CODE> are used. 
<P> <!-- NEW --> </P> +<A NAME="toc57"></A> <H2>More on inflection paradigms</H2> <P> <a name="secinflection"></a> @@ -1939,6 +2257,7 @@ add words to a lexicon. <P> <!-- NEW --> </P> +<A NAME="toc58"></A> <H3>Worst-case functions</H3> <P> We perform <B>data abstraction</B> from the type @@ -2028,6 +2347,7 @@ parameters. <P> <!-- NEW --> </P> +<A NAME="toc59"></A> <H3>Smart paradigms</H3> <P> The regular <I>dog</I>-<I>dogs</I> paradigm has @@ -2094,6 +2414,7 @@ the suffix <CODE>"oo"</CODE> prevents <I>bamboo</I> from matching the suffix <P> <!-- NEW --> </P> +<A NAME="toc60"></A> <H3>Exercises on regular patterns</H3> <OL> <LI>The same rules that form plural nouns in English also @@ -2118,6 +2439,7 @@ operation to see whether it correctly changes <I>Arzt</I> to <I>Ärzt</I>, <P> <!-- NEW --> </P> +<A NAME="toc61"></A> <H3>Function types with variables</H3> <P> In <a href="#chapsix">Lesson 5</a>, <B>dependent function types</B> need a notation @@ -2173,6 +2495,7 @@ looking like the expected forms: <P> <!-- NEW --> </P> +<A NAME="toc62"></A> <H3>Separating operation types and definitions</H3> <P> In librarues, it is useful to group type signatures separately from @@ -2192,6 +2515,7 @@ With the <CODE>interface</CODE> and <CODE>instance</CODE> module types <P> <!-- NEW --> </P> +<A NAME="toc63"></A> <H3>Overloading of operations</H3> <P> <B>Overloading</B>: different functions can be given the same name, as e.g. in C++. @@ -2233,6 +2557,7 @@ an overload group. 
<P> <!-- NEW --> </P> +<A NAME="toc64"></A> <H3>Morphological analysis and morphology quiz</H3> <P> The command <CODE>morpho_analyse = ma</CODE> @@ -2269,6 +2594,7 @@ To create a list for later use, use the command <CODE>morpho_list = ml</CODE> <P> <!-- NEW --> </P> +<A NAME="toc65"></A> <H2>The Italian Foods grammar</H2> <P> <a name="secitalian"></a> @@ -2406,6 +2732,7 @@ The complete set of linearization rules: <P> <!-- NEW --> </P> +<A NAME="toc66"></A> <H3>Exercises on using parameters</H3> <OL> <LI>Experiment with multilingual generation and translation in the @@ -2425,6 +2752,7 @@ now aiming for complete grammatical correctness by the use of parameters. <P> <!-- NEW --> </P> +<A NAME="toc67"></A> <H2>Discontinuous constituents</H2> <P> A linearization record may contain more strings than one, and those @@ -2462,6 +2790,7 @@ but can be defined in GF by using discontinuous constituents. <P> <!-- NEW --> </P> +<A NAME="toc68"></A> <H2>Strings at compile time vs. run time</H2> <P> Tokens are created in the following ways: @@ -2520,6 +2849,7 @@ This topic will be covered in <a href="#seclexing">here</a>. 
<P> <!-- NEW --> </P> +<A NAME="toc69"></A> <H3>Supplementary constructs for concrete syntax</H3> <H4>Record extension and subtyping</H4> <P> @@ -2581,6 +2911,7 @@ Thus <P> <!-- NEW --> </P> +<A NAME="toc70"></A> <H1>Lesson 4: Using the resource grammar library</H1> <P> <a name="chapfive"></a> @@ -2597,32 +2928,38 @@ Goals: <P> <!-- NEW --> </P> +<A NAME="toc71"></A> <H2>The coverage of the library</H2> <P> -The current 12 resource languages are +The current 16 resource languages (GF version 3.2, December 2010) are </P> <UL> <LI><CODE>Bul</CODE>garian <LI><CODE>Cat</CODE>alan <LI><CODE>Dan</CODE>ish +<LI><CODE>Dut</CODE>ch <LI><CODE>Eng</CODE>lish <LI><CODE>Fin</CODE>nish <LI><CODE>Fre</CODE>nch <LI><CODE>Ger</CODE>man <LI><CODE>Ita</CODE>lian <LI><CODE>Nor</CODE>wegian +<LI><CODE>Pol</CODE>ish +<LI><CODE>Ron</CODE>, Romanian <LI><CODE>Rus</CODE>sian <LI><CODE>Spa</CODE>nish <LI><CODE>Swe</CODE>dish +<LI><CODE>Urd</CODE>u </UL> <P> The first three letters (<CODE>Eng</CODE> etc) are used in grammar module names -(ISO 639 standard). +(ISO 639-3 standard). </P> <P> <!-- NEW --> </P> +<A NAME="toc72"></A> <H2>The structure of the library</H2> <P> <a name="seclexical"></a> @@ -2644,6 +2981,7 @@ wider coverage than with semantic grammars. <P> <!-- NEW --> </P> +<A NAME="toc73"></A> <H3>Lexical vs. phrasal rules</H3> <P> A resource grammar has two kinds of categories and two kinds of rules: @@ -2671,6 +3009,7 @@ But it is a good discipline to follow. <P> <!-- NEW --> </P> +<A NAME="toc74"></A> <H3>Lexical categories</H3> <P> Two kinds of lexical categories: @@ -2683,8 +3022,7 @@ Two kinds of lexical categories: <LI>structural words / function words, e.g. <PRE> Conj ; -- conjunction e.g. "and" - QuantSg ; -- singular quantifier e.g. "this" - QuantPl ; -- plural quantifier e.g. "this" + Det ; -- determiner e.g. 
"this" </PRE> <P></P> </UL> @@ -2703,13 +3041,13 @@ Two kinds of lexical categories: <P> <!-- NEW --> </P> +<A NAME="toc75"></A> <H3>Lexical rules</H3> <P> Closed classes: module <CODE>Syntax</CODE>. In the <CODE>Foods</CODE> grammar, we need </P> <PRE> - this_QuantSg, that_QuantSg : QuantSg ; - these_QuantPl, those_QuantPl : QuantPl ; + this_Det, that_Det, these_Det, those_Det : Det ; very_AdA : AdA ; </PRE> <P> @@ -2735,6 +3073,7 @@ where we use <CODE>mkN</CODE> from <CODE>ParadigmsEng</CODE>: <P> <!-- NEW --> </P> +<A NAME="toc76"></A> <H3>Resource lexicon</H3> <P> Alternative concrete syntax for @@ -2765,6 +3104,7 @@ Advantages: <P> <!-- NEW --> </P> +<A NAME="toc77"></A> <H3>Phrasal categories</H3> <P> In <CODE>Foods</CODE>, we need just four phrasal categories: @@ -2785,14 +3125,14 @@ Common nouns are made into noun phrases by adding determiners. <P> <!-- NEW --> </P> +<A NAME="toc78"></A> <H3>Syntactic combinations</H3> <P> We need the following combinations: </P> <PRE> mkCl : NP -> AP -> Cl ; -- e.g. "this pizza is very warm" - mkNP : QuantSg -> CN -> NP ; -- e.g. "this pizza" - mkNP : QuantPl -> CN -> NP ; -- e.g. "these pizzas" + mkNP : Det -> CN -> NP ; -- e.g. "this pizza" mkCN : AP -> CN -> CN ; -- e.g. "warm pizza" mkAP : AdA -> AP -> AP ; -- e.g. 
"very warm" </PRE> @@ -2813,6 +3153,7 @@ Heavy overloading: the current library <P> <!-- NEW --> </P> +<A NAME="toc79"></A> <H3>Example syntactic combination</H3> <P> The sentence @@ -2823,7 +3164,7 @@ can be built as follows: </P> <PRE> mkCl - (mkNP these_QuantPl + (mkNP these_Det (mkCN (mkAP very_AdA (mkAP warm_A)) (mkCN pizza_CN))) (mkAP italian_AP) </PRE> @@ -2838,6 +3179,7 @@ this syntactic tree gives the value of linearizing the semantic tree <P> <!-- NEW --> </P> +<A NAME="toc80"></A> <H2>The resource API</H2> <P> Language-specific and language-independent parts - roughly, @@ -2854,11 +3196,12 @@ Language-specific and language-independent parts - roughly, Full API documentation on-line: the <B>resource synopsis</B>, </P> <P> -<A HREF="http://grammaticalframework.org/lib/doc/synopsis.html"><CODE>grammaticalframework.org/lib/resource/doc/synopsis.html</CODE></A> +<A HREF="http://grammaticalframework.org/lib/doc/synopsis.html"><CODE>grammaticalframework.org/lib/doc/synopsis.html</CODE></A> </P> <P> <!-- NEW --> </P> +<A NAME="toc81"></A> <H3>A miniature resource API: categories</H3> <TABLE CELLPADDING="4" BORDER="1"> <TR> @@ -2892,16 +3235,11 @@ Full API documentation on-line: the <B>resource synopsis</B>, <TD><I>very</I></TD> </TR> <TR> -<TD><CODE>QuantSg</CODE></TD> -<TD>singular quantifier</TD> +<TD><CODE>Det</CODE></TD> +<TD>determiner</TD> <TD><I>these</I></TD> </TR> <TR> -<TD><CODE>QuantPl</CODE></TD> -<TD>plural quantifier</TD> -<TD><I>this</I></TD> -</TR> -<TR> <TD><CODE>A</CODE></TD> <TD>one-place adjective</TD> <TD><I>warm</I></TD> @@ -2916,6 +3254,7 @@ Full API documentation on-line: the <B>resource synopsis</B>, <P> <!-- NEW --> </P> +<A NAME="toc82"></A> <H3>A miniature resource API: rules</H3> <TABLE CELLPADDING="4" BORDER="1"> <TR> @@ -2930,12 +3269,7 @@ Full API documentation on-line: the <B>resource synopsis</B>, </TR> <TR> <TD><CODE>mkNP</CODE></TD> -<TD><CODE>QuantSg -> CN -> NP</CODE></TD> -<TD><I>this old man</I></TD> -</TR> -<TR> 
-<TD><CODE>mkNP</CODE></TD> -<TD><CODE>QuantPl -> CN -> NP</CODE></TD> +<TD><CODE>Det -> CN -> NP</CODE></TD> <TD><I>these old man</I></TD> </TR> <TR> @@ -2963,6 +3297,7 @@ Full API documentation on-line: the <B>resource synopsis</B>, <P> <!-- NEW --> </P> +<A NAME="toc83"></A> <H3>A miniature resource API: structural words</H3> <TABLE CELLPADDING="4" BORDER="1"> <TR> @@ -2971,23 +3306,23 @@ Full API documentation on-line: the <B>resource synopsis</B>, <TH COLSPAN="2">In English</TH> </TR> <TR> -<TD><CODE>this_QuantSg</CODE></TD> -<TD><CODE>QuantSg</CODE></TD> +<TD><CODE>this_Det</CODE></TD> +<TD><CODE>Det</CODE></TD> <TD><I>this</I></TD> </TR> <TR> -<TD><CODE>that_QuantSg</CODE></TD> -<TD><CODE>QuantSg</CODE></TD> +<TD><CODE>that_Det</CODE></TD> +<TD><CODE>Det</CODE></TD> <TD><I>that</I></TD> </TR> <TR> -<TD><CODE>these_QuantPl</CODE></TD> -<TD><CODE>QuantPl</CODE></TD> +<TD><CODE>these_Det</CODE></TD> +<TD><CODE>Det</CODE></TD> <TD><I>this</I></TD> </TR> <TR> -<TD><CODE>those_QuantPl</CODE></TD> -<TD><CODE>QuantPl</CODE></TD> +<TD><CODE>those_Det</CODE></TD> +<TD><CODE>Det</CODE></TD> <TD><I>that</I></TD> </TR> <TR> @@ -3000,6 +3335,7 @@ Full API documentation on-line: the <B>resource synopsis</B>, <P> <!-- NEW --> </P> +<A NAME="toc84"></A> <H3>A miniature resource API: paradigms</H3> <P> From <CODE>ParadigmsEng</CODE>: @@ -3044,6 +3380,7 @@ From <CODE>ParadigmsIta</CODE>: <P> <!-- NEW --> </P> +<A NAME="toc85"></A> <H3>A miniature resource API: more paradigms</H3> <P> From <CODE>ParadigmsGer</CODE>: @@ -3108,6 +3445,7 @@ From <CODE>ParadigmsFin</CODE>: <P> <!-- NEW --> </P> +<A NAME="toc86"></A> <H3>Exercises</H3> <P> 1. Try out the morphological paradigms in different languages. 
Do @@ -3122,6 +3460,7 @@ as follows: <P> <!-- NEW --> </P> +<A NAME="toc87"></A> <H2>Example: English</H2> <P> <a name="secenglish"></a> @@ -3155,6 +3494,7 @@ Thus the beginning of the module is <P> <!-- NEW --> </P> +<A NAME="toc88"></A> <H3>English example: linearization types and combination rules</H3> <P> As linearization types, we use clauses for <CODE>Phrase</CODE>, noun phrases @@ -3173,10 +3513,10 @@ Now the combination rules we need almost write themselves automatically: <PRE> lin Is item quality = mkCl item quality ; - This kind = mkNP this_QuantSg kind ; - That kind = mkNP that_QuantSg kind ; - These kind = mkNP these_QuantPl kind ; - Those kind = mkNP those_QuantPl kind ; + This kind = mkNP this_Det kind ; + That kind = mkNP that_Det kind ; + These kind = mkNP these_Det kind ; + Those kind = mkNP those_Det kind ; QKind quality kind = mkCN quality kind ; Very quality = mkAP very_AdA quality ; </PRE> @@ -3184,6 +3524,7 @@ Now the combination rules we need almost write themselves automatically: <P> <!-- NEW --> </P> +<A NAME="toc89"></A> <H3>English example: lexical rules</H3> <P> We use resource paradigms and lexical insertion rules. @@ -3209,6 +3550,7 @@ The two-place noun paradigm is needed only once, for <P> <!-- NEW --> </P> +<A NAME="toc90"></A> <H3>English example: exercises</H3> <P> 1. Compile the grammar <CODE>FoodsEng</CODE> and generate @@ -3223,10 +3565,12 @@ grammars presented earlier in this tutorial. <P> <!-- NEW --> </P> +<A NAME="toc91"></A> <H2>Functor implementation of multilingual grammars</H2> <P> <a name="secfunctor"></a> </P> +<A NAME="toc92"></A> <H3>New language by copy and paste</H3> <P> If you write a concrete syntax of <CODE>Foods</CODE> for some other @@ -3257,6 +3601,7 @@ Can we avoid this programming by copy-and-paste? 
<P> <!-- NEW --> </P> +<A NAME="toc93"></A> <H3>Functors: functions on the module level</H3> <P> <B>Functors</B> familiar from the functional programming languages ML and OCaml, @@ -3301,6 +3646,7 @@ we can write a <B>functor instantiation</B>, <P> <!-- NEW --> </P> +<A NAME="toc94"></A> <H3>Code for the Foods functor</H3> <PRE> --# -path=.:../foods @@ -3313,10 +3659,10 @@ we can write a <B>functor instantiation</B>, Quality = AP ; lin Is item quality = mkCl item quality ; - This kind = mkNP this_QuantSg kind ; - That kind = mkNP that_QuantSg kind ; - These kind = mkNP these_QuantPl kind ; - Those kind = mkNP those_QuantPl kind ; + This kind = mkNP this_Det kind ; + That kind = mkNP that_Det kind ; + These kind = mkNP these_Det kind ; + Those kind = mkNP those_Det kind ; QKind quality kind = mkCN quality kind ; Very quality = mkAP very_AdA quality ; @@ -3336,6 +3682,7 @@ we can write a <B>functor instantiation</B>, <P> <!-- NEW --> </P> +<A NAME="toc95"></A> <H3>Code for the LexFoods interface</H3> <P> <a name="secinterface"></a> @@ -3359,6 +3706,7 @@ we can write a <B>functor instantiation</B>, <P> <!-- NEW --> </P> +<A NAME="toc96"></A> <H3>Code for a German instance of the lexicon</H3> <PRE> instance LexFoodsGer of LexFoods = open SyntaxGer, ParadigmsGer in { @@ -3379,6 +3727,7 @@ we can write a <B>functor instantiation</B>, <P> <!-- NEW --> </P> +<A NAME="toc97"></A> <H3>Code for a German functor instantiation</H3> <PRE> --# -path=.:../foods:present @@ -3391,6 +3740,7 @@ we can write a <B>functor instantiation</B>, <P> <!-- NEW --> </P> +<A NAME="toc98"></A> <H3>Adding languages to a functor implementation</H3> <P> Just two modules are needed: @@ -3416,6 +3766,7 @@ language: <P> <!-- NEW --> </P> +<A NAME="toc99"></A> <H3>Example: adding Finnish</H3> <P> Lexicon instance @@ -3449,6 +3800,7 @@ Functor instantiation <P> <!-- NEW --> </P> +<A NAME="toc100"></A> <H3>A design pattern</H3> <P> This can be seen as a <I>design pattern</I> for multilingual grammars: @@ 
-3471,6 +3823,7 @@ Of the hand-written modules, only <CODE>LexDomainL</CODE> is language-dependent. <P> <!-- NEW --> </P> +<A NAME="toc101"></A> <H3>Functors: exercises</H3> <P> 1. Compile and test <CODE>FoodsGer</CODE>. @@ -3511,7 +3864,9 @@ The implementation goes in the following phases: <P> <!-- NEW --> </P> +<A NAME="toc102"></A> <H2>Restricted inheritance</H2> +<A NAME="toc103"></A> <H3>A problem with functors</H3> <P> Problem: a functor only works when all languages use the resource <CODE>Syntax</CODE> @@ -3541,6 +3896,7 @@ Problem with this solution: <P> <!-- NEW --> </P> +<A NAME="toc104"></A> <H3>Restricted inheritance: include or exclude</H3> <P> A module may inherit just a selection of names. @@ -3561,6 +3917,7 @@ A concrete syntax of <CODE>Foodmarket</CODE> must make the analogous restriction <P> <!-- NEW --> </P> +<A NAME="toc105"></A> <H3>The functor problem solved</H3> <P> The English instantiation inherits the functor @@ -3582,6 +3939,7 @@ is defined in the body instead: <P> <!-- NEW --> </P> +<A NAME="toc106"></A> <H2>Grammar reuse</H2> <P> Abstract syntax modules can be used as interfaces, @@ -3603,6 +3961,7 @@ The following correspondencies are then applied: <P> <!-- NEW --> </P> +<A NAME="toc107"></A> <H3>Library exercises</H3> <P> 1. Find resource grammar terms for the following @@ -3627,6 +3986,7 @@ Then translate the phrases to other languages. <P> <!-- NEW --> </P> +<A NAME="toc108"></A> <H2>Tenses</H2> <P> <a name="sectense"></a> @@ -3718,6 +4078,7 @@ tenses and moods, e.g. the Romance languages. <P> <!-- NEW --> </P> +<A NAME="toc109"></A> <H1>Lesson 5: Refining semantics in abstract syntax</H1> <P> <a name="chapsix"></a> @@ -3745,6 +4106,7 @@ GF = logical framework + concrete syntax. <P> <!-- NEW --> </P> +<A NAME="toc110"></A> <H2>Dependent types</H2> <P> <a name="secsmarthouse"></a> @@ -3772,6 +4134,7 @@ defines voice commands for household appliances. 
<P> <!-- NEW --> </P> +<A NAME="toc111"></A> <H3>A dependent type system</H3> <P> Ontology: @@ -3800,6 +4163,7 @@ Abstract syntax formalizing this: <P> <!-- NEW --> </P> +<A NAME="toc112"></A> <H3>Examples of devices and actions</H3> <P> Assume the kinds <CODE>light</CODE> and <CODE>fan</CODE>, @@ -3832,6 +4196,7 @@ but we cannot form the trees <P> <!-- NEW --> </P> +<A NAME="toc113"></A> <H3>Linearization and parsing with dependent types</H3> <P> Concrete syntax does not know if a category is a dependent type. @@ -3874,6 +4239,7 @@ to mark incomplete parts of trees in the syntax editor. <P> <!-- NEW --> </P> +<A NAME="toc114"></A> <H3>Solving metavariables</H3> <P> Use the command <CODE>put_tree = pt</CODE> with the option <CODE>-typecheck</CODE>: @@ -3896,6 +4262,7 @@ is shown and no tree is returned: <P> <!-- NEW --> </P> +<A NAME="toc115"></A> <H2>Polymorphism</H2> <P> <a name="secpolymorphic"></a> @@ -3928,6 +4295,7 @@ to express Haskell-type library functions: <P> <!-- NEW --> </P> +<A NAME="toc116"></A> <H3>Dependent types: exercises</H3> <P> 1. Write an abstract syntax module with above contents @@ -3944,6 +4312,7 @@ and an appropriate English concrete syntax. 
Try to parse the commands <P> <!-- NEW --> </P> +<A NAME="toc117"></A> <H2>Proof objects</H2> <P> <B>Curry-Howard isomorphism</B> = <B>propositions as types principle</B>: @@ -3988,6 +4357,7 @@ Example: the fact that 2 is less that 4 has the proof object <P> <!-- NEW --> </P> +<A NAME="toc118"></A> <H3>Proof-carrying documents</H3> <P> Idea: to be semantically well-formed, the abstract syntax of a document @@ -4031,6 +4401,7 @@ A legal connection is formed by the function <P> <!-- NEW --> </P> +<A NAME="toc119"></A> <H2>Restricted polymorphism</H2> <P> Above, all Actions were either of @@ -4055,6 +4426,7 @@ The notion of class uses the Curry-Howard isomorphism as follows: <P> <!-- NEW --> </P> +<A NAME="toc120"></A> <H3>Example: classes for switching and dimming</H3> <P> We modify the smart house grammar: @@ -4077,6 +4449,7 @@ Classes for new actions can be added incrementally. <P> <!-- NEW --> </P> +<A NAME="toc121"></A> <H2>Variable bindings</H2> <P> <a name="secbinding"></a> @@ -4110,6 +4483,7 @@ Examples from informal mathematical language: <P> <!-- NEW --> </P> +<A NAME="toc122"></A> <H3>Higher-order abstract syntax</H3> <P> Abstract syntax can use functions as arguments: @@ -4147,6 +4521,7 @@ expressed using higher-order syntactic constructors. <P> <!-- NEW --> </P> +<A NAME="toc123"></A> <H3>Higher-order abstract syntax: linearization</H3> <P> HOAS has proved to be useful in the semantics and computer implementation of @@ -4180,6 +4555,7 @@ If there are more bindings, we add <CODE>$1</CODE>, <CODE>$2</CODE>, etc. <P> <!-- NEW --> </P> +<A NAME="toc124"></A> <H3>Eta expansion</H3> <P> To make sense of linearization, syntax trees must be @@ -4228,6 +4604,7 @@ The linearization of the variable <CODE>x</CODE> is, <P> <!-- NEW --> </P> +<A NAME="toc125"></A> <H3>Parsing variable bindings</H3> <P> GF can treat any one-word string as a variable symbol. 
@@ -4247,6 +4624,7 @@ Variables must be bound if they are used: <P> <!-- NEW --> </P> +<A NAME="toc126"></A> <H3>Exercises on variable bindings</H3> <P> 1. Write an abstract syntax of the whole @@ -4265,6 +4643,7 @@ guarantee non-ambiguity. <P> <!-- NEW --> </P> +<A NAME="toc127"></A> <H2>Semantic definitions</H2> <P> <a name="secdefdef"></a> @@ -4303,6 +4682,7 @@ The key word is <CODE>def</CODE>: <P> <!-- NEW --> </P> +<A NAME="toc128"></A> <H3>Computing a tree</H3> <P> Computation: follow a chain of definition until no definition @@ -4328,6 +4708,7 @@ Computation in GF is performed with the <CODE>put_term</CODE> command and the <P> <!-- NEW --> </P> +<A NAME="toc129"></A> <H3>Definitional equality</H3> <P> Two trees are definitionally equal if they compute into the same tree. @@ -4355,6 +4736,7 @@ so that an object of one also is an object of the other. <P> <!-- NEW --> </P> +<A NAME="toc130"></A> <H3>Judgement forms for constructors</H3> <P> The judgement form <CODE>data</CODE> tells that a category has @@ -4384,6 +4766,7 @@ marked as <CODE>data</CODE> will be treated as variables. <P> <!-- NEW --> </P> +<A NAME="toc131"></A> <H3>Exercises on semantic definitions</H3> <P> 1. Implement an interpreter of a small functional programming @@ -4399,6 +4782,7 @@ Type checking can be invoked with <CODE>put_term -transform=solve</CODE>. 
<P> <!-- NEW --> </P> +<A NAME="toc132"></A> <H2>Lesson 6: Grammars of formal languages</H2> <P> <a name="chapseven"></a> @@ -4415,6 +4799,7 @@ Goals: <P> <!-- NEW --> </P> +<A NAME="toc133"></A> <H3>Arithmetic expressions</H3> <P> We construct a calculator with addition, subtraction, multiplication, and @@ -4445,6 +4830,7 @@ grammars are not allowed to declare functions with <CODE>Int</CODE> as value typ <P> <!-- NEW --> </P> +<A NAME="toc134"></A> <H3>Concrete syntax: a simple approach</H3> <P> We begin with a @@ -4486,6 +4872,7 @@ First problems: <P> <!-- NEW --> </P> +<A NAME="toc135"></A> <H2>Lexing and unlexing</H2> <P> <a name="seclexing"></a> @@ -4538,6 +4925,7 @@ In linearization, we use a corresponding <B>unlexer</B>: <P> <!-- NEW --> </P> +<A NAME="toc136"></A> <H3>Most common lexers and unlexers</H3> <TABLE ALIGN="center" CELLPADDING="4" BORDER="1"> <TR> @@ -4575,6 +4963,7 @@ In linearization, we use a corresponding <B>unlexer</B>: <P> <!-- NEW --> </P> +<A NAME="toc137"></A> <H2>Precedence and fixity</H2> <P> Arithmetic expressions should be unambiguous. If we write @@ -4613,6 +5002,7 @@ The usual precedence rules: <P> <!-- NEW --> </P> +<A NAME="toc138"></A> <H3>Precedence as a parameter</H3> <P> Precedence can be made into an inherent feature of expressions: @@ -4657,6 +5047,7 @@ This idea is encoded in the operation <P> <!-- NEW --> </P> +<A NAME="toc139"></A> <H3>Fixities</H3> <P> We can define left-associative infix expressions: @@ -4697,6 +5088,7 @@ Now we can write the whole concrete syntax of <CODE>Calculator</CODE> compactly: <P> <!-- NEW --> </P> +<A NAME="toc140"></A> <H3>Exercises on precedence</H3> <P> 1. Define non-associative and right-associative infix operations @@ -4710,6 +5102,7 @@ Test parsing with and without a pipe to <CODE>pt -transform=compute</CODE>. 
<P> <!-- NEW --> </P> +<A NAME="toc141"></A> <H2>Code generation as linearization</H2> <P> Translate arithmetic (infix) to JVM (postfix): @@ -4739,6 +5132,7 @@ Just give linearization rules for JVM: <P> <!-- NEW --> </P> +<A NAME="toc142"></A> <H3>Programs with variables</H3> <P> A <B>straight code</B> programming language, with @@ -4787,6 +5181,7 @@ of the extension is <CODE>Prog</CODE>. <P> <!-- NEW --> </P> +<A NAME="toc143"></A> <H3>Exercises on code generation</H3> <P> 1. Define a C-like concrete syntax of the straight-code language. @@ -4827,6 +5222,7 @@ point literals as arguments. <P> <!-- NEW --> </P> +<A NAME="toc144"></A> <H1>Lesson 7: Embedded grammars</H1> <P> <a name="chapeight"></a> @@ -4844,6 +5240,7 @@ Goals: <P> <!-- NEW --> </P> +<A NAME="toc145"></A> <H2>Functionalities of an embedded grammar format</H2> <P> GF grammars can be used as parts of programs written in other programming @@ -4860,16 +5257,17 @@ This facility is based on several components: <P> <!-- NEW --> </P> +<A NAME="toc146"></A> <H2>The portable grammar format</H2> <P> The portable format is called PGF, "Portable Grammar Format". </P> <P> -This format is produced by the GF batch compiler <CODE>gf</CODE>, -executable from the operative system shell: +This format is produced by using GF as batch compiler, with the option <CODE>-make</CODE>, +from the operative system shell: </P> <PRE> - % gf --make SOURCE.gf + % gf -make SOURCE.gf </PRE> <P> PGF is the recommended format in @@ -4887,6 +5285,7 @@ general-purpose programming (or bytecode in Java). 
<P> <!-- NEW --> </P> +<A NAME="toc147"></A> <H3>Haskell: the EmbedAPI module</H3> <P> The Haskell API contains (among other things) the following types and functions: @@ -4915,6 +5314,7 @@ It is available as a part of the GF distribution, in the file <P> <!-- NEW --> </P> +<A NAME="toc148"></A> <H3>First application: a translator</H3> <P> Let us first build a stand-alone translator, which can translate @@ -4941,7 +5341,7 @@ in any multilingual grammar between any languages in the grammar. To run the translator, first compile it by </P> <PRE> - % ghc --make -o trans Translator.hs + % ghc -make -o trans Translator.hs </PRE> <P> For this, you need the Haskell compiler <A HREF="http://www.haskell.org/ghc">GHC</A>. @@ -4949,13 +5349,14 @@ For this, you need the Haskell compiler <A HREF="http://www.haskell.org/ghc">GHC <P> <!-- NEW --> </P> +<A NAME="toc149"></A> <H3>Producing PGF for the translator</H3> <P> Then produce a PGF file. For instance, the <CODE>Food</CODE> grammar set can be compiled as follows: </P> <PRE> - % gf --make FoodEng.gf FoodIta.gf + % gf -make FoodEng.gf FoodIta.gf </PRE> <P> This produces the file <CODE>Food.pgf</CODE> (its name comes from the abstract syntax). @@ -4976,6 +5377,7 @@ The result is given in all languages except the input language. <P> <!-- NEW --> </P> +<A NAME="toc150"></A> <H3>A translator loop</H3> <P> To avoid starting the translator over and over again: @@ -4997,6 +5399,7 @@ is <CODE>quit</CODE>. 
<P> <!-- NEW --> </P> +<A NAME="toc151"></A> <H3>A question-answer system</H3> <P> <a name="secmathprogram"></a> @@ -5041,6 +5444,7 @@ To reply in the <I>same</I> language as the question: <P> <!-- NEW --> </P> +<A NAME="toc152"></A> <H3>Abstract syntax of the query system</H3> <P> Input: abstract syntax judgements @@ -5067,6 +5471,7 @@ Input: abstract syntax judgements <P> <!-- NEW --> </P> +<A NAME="toc153"></A> <H3>Exporting GF datatypes to Haskell</H3> <P> To make it easy to define a transfer function, we export the @@ -5079,7 +5484,7 @@ abstract syntax to a system of Haskell datatypes: It is also possible to produce the Haskell file together with PGF, by </P> <PRE> - % gf --make --output-format=haskell QueryEng.gf + % gf -make --output-format=haskell QueryEng.gf </PRE> <P> The result is a file named <CODE>Query.hs</CODE>, containing a @@ -5117,6 +5522,7 @@ The Haskell module name is the same as the abstract syntax name. <P> <!-- NEW --> </P> +<A NAME="toc154"></A> <H3>The question-answer function</H3> <P> Haskell's type checker guarantees that the functions are well-typed also with @@ -5140,6 +5546,7 @@ respect to GF. <P> <!-- NEW --> </P> +<A NAME="toc155"></A> <H3>Converting between Haskell and GF trees</H3> <P> The generated Haskell module also contains @@ -5172,6 +5579,7 @@ For the programmer, it is enougo to know: <P> <!-- NEW --> </P> +<A NAME="toc156"></A> <H3>Putting it all together: the transfer definition</H3> <PRE> module TransferDef where @@ -5205,6 +5613,7 @@ For the programmer, it is enougo to know: <P> <!-- NEW --> </P> +<A NAME="toc157"></A> <H3>Putting it all together: the Main module</H3> <P> Here is the complete code in the Haskell file <CODE>TransferLoop.hs</CODE>. @@ -5236,13 +5645,14 @@ Here is the complete code in the Haskell file <CODE>TransferLoop.hs</CODE>. 
<P> <!-- NEW --> </P> +<A NAME="toc158"></A> <H3>Putting it all together: the Makefile</H3> <P> To automate the production of the system, we write a <CODE>Makefile</CODE> as follows: </P> <PRE> all: - gf --make --output-format=haskell QueryEng + gf -make --output-format=haskell QueryEng ghc --make -o ./math TransferLoop.hs strip math </PRE> @@ -5273,6 +5683,7 @@ Just to summarize, the source of the application consists of the following files <P> <!-- NEW --> </P> +<A NAME="toc159"></A> <H2>Web server applications</H2> <P> PGF files can be used in web servers, for which there is a Haskell library included @@ -5291,6 +5702,7 @@ is an example of its application to the <CODE>Foods</CODE> grammars. <P> <!-- NEW --> </P> +<A NAME="toc160"></A> <H2>JavaScript applications</H2> <P> JavaScript is a programming language that has interpreters built in in most @@ -5304,13 +5716,14 @@ program compiled from GF grammars as run on an iPhone. <P> <!-- NEW --> </P> +<A NAME="toc161"></A> <H3>Compiling to JavaScript</H3> <P> JavaScript is one of the output formats of the GF batch compiler. Thus the following command generates a JavaScript file from two <CODE>Food</CODE> grammars. </P> <PRE> - % gf --make --output-format=js FoodEng.gf FoodIta.gf + % gf -make --output-format=js FoodEng.gf FoodIta.gf </PRE> <P> The name of the generated file is <CODE>Food.js</CODE>, derived from the top-most abstract @@ -5319,6 +5732,7 @@ syntax name. This file contains the multilingual grammar as a JavaScript object. <P> <!-- NEW --> </P> +<A NAME="toc162"></A> <H3>Using the JavaScript grammar</H3> <P> To perform parsing and linearization, the run-time library @@ -5344,6 +5758,7 @@ With these changes, the translator works for any multilingual grammar. 
<P> <!-- NEW --> </P> +<A NAME="toc163"></A> <H2>Language models for speech recognition</H2> <P> The standard way of using GF in speech recognition is by building @@ -5361,7 +5776,7 @@ GSL is produced from GF by running <CODE>gf</CODE> with the flag Example: GSL generated from <CODE>FoodsEng.gf</CODE>. </P> <PRE> - % gf --make --output-format=gsl FoodsEng.gf + % gf -make --output-format=gsl FoodsEng.gf % more FoodsEng.gsl ;GSL2.0 @@ -5390,6 +5805,7 @@ Example: GSL generated from <CODE>FoodsEng.gf</CODE>. <P> <!-- NEW --> </P> +<A NAME="toc164"></A> <H3>More speech recognition grammar formats</H3> <P> Other formats available via the <CODE>--output-format</CODE> flag include: @@ -5438,5 +5854,5 @@ All currently available formats can be seen with <CODE>gf --help</CODE>. </P> <!-- html code generated by txt2tags 2.4 (http://txt2tags.sf.net) --> -<!-- cmdline: txt2tags gf-tutorial.txt --> +<!-- cmdline: txt2tags -\-toc gf-tutorial.t2t --> </BODY></HTML>
