diff options
| author | John J. Camilleri <john@digitalgrammars.com> | 2018-12-09 20:38:02 +0100 |
|---|---|---|
| committer | John J. Camilleri <john@digitalgrammars.com> | 2018-12-09 20:38:02 +0100 |
| commit | d82a53ebc660cf0319cd18edb4ddf9a2256bda3f (patch) | |
| tree | 0279bfd09ec0eaf8cba04fc405395793c2a01370 /doc | |
| parent | 5006b520d10b235d6327525f434de0b5538b38a8 (diff) | |
Replace gf-refman.html with Markdown version gf-refman.md
The raw HTML was invalid, and this way we use the common website template
for a uniform look without any duplication.
It seems gf-refman.html was once generated from txt2tags, although I have
been unable to find this original .t2t file.
I also tried to re-generate txt2tags from HTML but was not able to.
However I was able to convert HTML to Markdown using Pandoc and I think
the result is pretty good, so I think we should use this.
The original gf-refman.html can be obtained from git history, e.g.:
https://github.com/GrammaticalFramework/gf-core/blob/a7e43d872f5e612f93131f2d8caf811fbee9aa83/doc/gf-refman.html
Diffstat (limited to 'doc')
| -rw-r--r-- | doc/gf-refman.html | 4622 | ||||
| -rw-r--r-- | doc/gf-refman.md | 2770 |
2 files changed, 2770 insertions, 4622 deletions
diff --git a/doc/gf-refman.html b/doc/gf-refman.html deleted file mode 100644 index 31456b09a..000000000 --- a/doc/gf-refman.html +++ /dev/null @@ -1,4622 +0,0 @@ -<!DOCTYPE html> -<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> -<HEAD> - <TITLE>GF Language Reference Manual</TITLE> - <meta charset="utf-8" /> - <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.3/css/bootstrap.min.css" integrity="sha384-MCw98/SFnGE8fJT3GXwEOngsV7Zt27NXFoaoApmYm81iuXoPkFOJwJ8ERdknLPMO" crossorigin="anonymous"> - <style> - code { - color: inherit; - } - center { - margin-bottom: 1rem; - } - </style> -</HEAD> -<BODY> -<div class="container-fluid my-5" style="max-width:1200px"> - -<P ALIGN="center"> - <CENTER> - <a href=".."> - <img src="Logos/gf1.svg" height="200px" class="mb-3" alt="GF Logo"> - </a> - <H1>GF Language Reference Manual</H1> -<FONT SIZE="4"> -<I>Aarne Ranta</I>, <I>Krasimir Angelov</I><BR>June 2014, GF 3.6 -</FONT></CENTER> - -<P></P> - <UL> - <LI><A HREF="#toc1">Overview of GF</A> - <LI><A HREF="#toc2">The module system</A> - <UL> - <LI><A HREF="#toc3">Top-level and supplementary module structure</A> - <LI><A HREF="#toc4">Compilation units</A> - <LI><A HREF="#toc5">Names</A> - <LI><A HREF="#toc6">The structure of a module</A> - <LI><A HREF="#toc7">Module types, headers, and bodies</A> - <LI><A HREF="#toc8">Digression: the logic of module types</A> - <LI><A HREF="#toc9">Inheritance</A> - <LI><A HREF="#toc10">Opening</A> - <LI><A HREF="#toc11">Name resolution</A> - <LI><A HREF="#toc12">Functor instantiations</A> - <LI><A HREF="#toc13">Completeness</A> - </UL> - <LI><A HREF="#toc14">Judgements</A> - <UL> - <LI><A HREF="#toc15">Overview of the forms of judgement</A> - <LI><A HREF="#toc16">Category declarations, cat</A> - <LI><A HREF="#toc17">Hypotheses and contexts</A> - <LI><A HREF="#toc18">Function declarations, fun</A> - <LI><A HREF="#toc19">Function definitions, def</A> - <LI><A HREF="#toc20">Data constructor definitions, data</A> - <LI><A HREF="#toc21">The semantic status of an abstract syntax function</A> - <LI><A HREF="#toc22">Linearization type definitions, lincat</A> - <LI><A HREF="#toc23">Linearization definitions, lin</A> - <LI><A HREF="#toc24">Linearization default definitions, lindef</A> - <LI><A HREF="#toc24_r">Linearization reference definitions, linref</A> - <LI><A HREF="#toc25">Printname definitions, printname cat and printname fun</A> - <LI><A HREF="#toc26">Parameter type definitions, param</A> - <LI><A HREF="#toc27">Parameter values</A> - <LI><A HREF="#toc28">Operation definitions, oper</A> - <LI><A HREF="#toc29">Operation overloading</A> - <LI><A HREF="#toc30">Flag definitions, flags</A> - </UL> - <LI><A HREF="#toc31">Types and expressions</A> - <UL> - <LI><A HREF="#toc32">Overview of expression forms</A> - <LI><A HREF="#toc33">The functional fragment: expressions in abstract syntax</A> - <LI><A HREF="#toc34">Conversions</A> - <LI><A HREF="#toc35">Syntax trees</A> - <LI><A HREF="#toc36">Predefined types in abstract syntax</A> - <LI><A HREF="#toc37">Overview of expressions in concrete syntax</A> - <LI><A HREF="#toc38">Values, canonical forms, and run-time variables</A> - <LI><A HREF="#toc39">Token lists, tokens, and strings</A> - <LI><A HREF="#toc40">Records and record types</A> - <LI><A HREF="#toc41">Subtyping</A> - <LI><A HREF="#toc42">Tables and table types</A> - <LI><A HREF="#toc43">Pattern matching</A> - <LI><A HREF="#toc44">Free variation</A> - <LI><A HREF="#toc45">Local definitions</A> - <LI><A HREF="#toc46">Function applications in concrete syntax</A> - <LI><A HREF="#toc47">Reusing top-level grammars as resources</A> - <LI><A HREF="#toc48">Predefined concrete syntax types</A> - <LI><A HREF="#toc49">Predefined concrete syntax operations</A> - </UL> - <LI><A HREF="#toc50">Flags and pragmas</A> - <UL> - <LI><A HREF="#toc51">Some flags and their values</A> - <LI><A HREF="#toc52">Compiler pragmas</A> - </UL> - <LI><A HREF="#toc53">Alternative grammar input formats</A> - <UL> - <LI><A HREF="#toc54">Old GF without modules</A> - <LI><A HREF="#toc55">Context-free grammars</A> - <LI><A HREF="#toc56">Extended BNF grammars</A> - <LI><A HREF="#toc57">Example-based grammars</A> - </UL> - <LI><A HREF="#toc58">The grammar of GF</A> - <LI><A HREF="#toc59">The lexical structure of GF</A> - <UL> - <LI><A HREF="#toc60">Identifiers</A> - <LI><A HREF="#toc61">Literals</A> - <LI><A HREF="#toc62">Reserved words and symbols</A> - <LI><A HREF="#toc63">Comments</A> - </UL> - <LI><A HREF="#toc64">The syntactic structure of GF</A> - </UL> - -<P></P> -<HR> -<P></P> -<P> - -</P> -<P> -This document is a reference manual to the GF programming language. -GF, Grammatical Framework, is a special-purpose programming language, -designed to support definitions of grammars. -</P> -<P> -This document is not an introduction to GF; such introduction can be -found in the GF tutorial available on line on the GF web page, -</P> -<P> -<A HREF="http://grammaticalframework.org"><CODE>grammaticalframework.org</CODE></A> -</P> -<P> -This manual covers only the language, not the GF compiler or -interactive system. We will however make some references to different -compiler versions, if they involve changes of behaviour having to -do with the language specification. -</P> -<P> -This manual is meant to be fully compatible with GF version 3.0. -Main discrepancies with version 2.8 are indicated, -as well as with the reference article on GF, -</P> -<P> -A. Ranta, "Grammatical Framework. A Type Theoretical Grammar Formalism", -<I>The Journal of Functional Programming</I> 14(2), 2004, pp. 145-189. -</P> -<P> -This article will referred to as "the JFP article". -</P> -<P> -As metalinguistic notation, we will use the symbols -</P> -<UL> -<LI><I>a</I> === <I>b</I> to say that <I>a</I> is syntactic sugar for <I>b</I> -<LI><I>a</I> ==> <I>b</I> to say that <I>a</I> is computed (or compiled) to <I>b</I> -</UL> - -<A NAME="toc1"></A> -<H2>Overview of GF</H2> -<P> -GF is a typed functional language, -borrowing many of its constructs from ML and Haskell: algebraic datatypes, -higher-order functions, pattern matching. The module system bears resemblance -to ML (functors) but also to object-oriented languages (inheritance). -The type theory used in the abstract syntax part of GF is inherited from -logical frameworks, in particular ALF ("Another Logical Framework"; in a -sense, GF is Yet Another ALF). From ALF comes also the use of dependent -types, including the use of explicit type variables instead of -Hindley-Milner polymorphism. -</P> -<P> -The look and feel of GF is close to Java and -C, due to the use of curly brackets and semicolons in structuring the code; -the expression syntax, however, follows Haskell in using juxtaposition for -function application and parentheses only for grouping. -</P> -<P> -To understand the constructs of GF, and especially their limitations in comparison -to general-purpose programming languages, it is essential to keep in mind that -GF is a special-purpose and non-turing-complete language. Every GF program is -ultimately compiled to a <B>multilingual grammar</B>, which consists of an -<B>abstract syntax</B> and a set of <B>concrete syntaxes</B>. The abstract syntax -defines a system of <B>syntax trees</B>, and each concrete syntax defines a -mapping from those syntax trees to <B>nested tuples</B> of strings and integers. -This mapping is <B>compositional</B>, i.e. <B>homomorphic</B>, and moreover -<B>reversible</B>: given a nested tuple, there exists an effective way of finding -the set of syntax trees that map to this tuple. The procedure of applying the -mapping to a tree to produce a tuple is called <B>linearization</B>, and the -reverse search procedure is called <B>parsing</B>. It is ultimately the requirement -of reversibility that restricts GF to be less than turing-complete. This is -reflected in restrictions to recursion in concrete syntax. Tree formation in -abstract syntax, however, is fully recursive. -</P> -<P> -Even though run-time GF grammars manipulate just nested tuples, at compile -time these are represented by by the more fine-grained labelled records -and finite functions over algebraic datatypes. This enables the programmer -to write on a higher abstraction level, and also adds type distinctions -and hence raises the level of checking of programs. -</P> -<A NAME="toc2"></A> -<H2>The module system</H2> -<A NAME="toc3"></A> -<H3>Top-level and supplementary module structure</H3> -<P> -The big picture of GF as a programming language for multilingual grammars -explains its principal module structure. Any GF grammar must have an -abstract syntax module; it can in addition have any number of concrete -syntax modules matching that abstract syntax. Before going to details, -we give a simple example: a module defining the <B>category</B> <CODE>A</CODE> -of adjectives and one adjective-forming <B>function</B>, the zero-place function -<CODE>Even</CODE>. We give the module the name <CODE>Adj</CODE>. The GF code for the -module looks as follows: -</P> -<PRE> - abstract Adj = { - cat A ; - fun Even : A ; - } -</PRE> -<P> -Here are two concrete syntax modules, one intended for mapping the trees -to English, the other to Swedish. The mappling is defined by -<CODE>lincat</CODE> definitions assigning a <B>linearization type</B> to each category, -and <CODE>lin</CODE> definitions assigning a <B>linearization</B> to each function. -</P> -<PRE> - concrete AdjEng of Adj = { - lincat A = {s : Str} ; - lin Even = {s = "even"} ; - } - - concrete AdjSwe of Adj = { - lincat A = {s : AForm => Str} ; - lin Even = {s = table { - ASg Utr => "jämn" ; - ASg Neutr => "jämnt" ; - APl => "jämna" - } - } ; - param AForm = ASg Gender | APl ; - param Gender = Utr | Neutr ; - } -</PRE> -<P> -These examples illustrate the main ideas of multilingual grammars: -</P> -<UL> -<LI>the concrete syntax must match the abstract syntax: - <UL> - <LI>every <CODE>cat</CODE> is given a <CODE>lincat</CODE> - <LI>every <CODE>fun</CODE> is given a <CODE>lin</CODE> - </UL> -</UL> - -<UL> -<LI>the concrete syntax is internally coherent: - <UL> - <LI>the <CODE>lin</CODE> rules respect the types defined by <CODE>lincat</CODE> rules - </UL> -</UL> - -<UL> -<LI>concrete syntaxes are independent of each other - <UL> - <LI>they can use different <CODE>lincat</CODE> and <CODE>lin</CODE> definitions - <LI>they can define their own <B>parameter types</B> (<CODE>param</CODE>) - </UL> -</UL> - -<P> -The first two ideas form the core of the <B>static checking</B> of GF -grammars, eliminating the possibility of run-time errors in -linearization and parsing. The third idea gives GF the expressive -power needed to map abstract syntax to vastly different languages. -</P> -<P> -Abstract and concrete modules are called <B>top-level grammar modules</B>, -since they are the ones that remain in grammar systems at run time. -However, in order to support <B>modular grammar engineering</B>, GF provides -much more module structure than strictly required in top-level grammars. -</P> -<P> -<B>Inheritance</B>, also known as <B>extension</B>, means that a module can inherit the -contents of one or more other modules to which new judgements are added, -e.g. -</P> -<PRE> - abstract MoreAdj = Adj ** { - fun Odd : A ; - } -</PRE> -<P> -<B>Resource modules</B> define parameter types and <B>operations</B> usable -in several concrete syntaxes, -</P> -<PRE> - resource MorphoFre = { - param Number = Sg | Pl ; - param Gender = Masc | Fem ; - oper regA : Str -> {s : Gender => Number => Str} = - \fin -> { - s = table { - Masc => table {Sg => fin ; Pl => fin + "s"} ; - Fem => table {Sg => fin + "e" ; Pl => fin + "es"} - } - } ; - } -</PRE> -<P> -By <B>opening</B>, a module can use the contents of a resource module -without inheriting them, e.g. -</P> -<PRE> - concrete AdjFre of Adj = open MorphoFre in { - lincat A = {s : Gender => Number => Str} ; - lin Even = regA "pair" ; - } -</PRE> -<P> -<B>Interfaces</B> and <B>instances</B> separate the contents of a resource module -to type signatures and definitions, in a way analogous to abstract vs. concrete -modules, e.g. -</P> -<PRE> - interface Lexicon = { - oper Adjective : Type ; - oper even_A : Adjective ; - } - - instance LexiconEng of Lexicon = { - oper Adjective = {s : Str} ; - oper even_A = {s = "even"} ; - } -</PRE> -<P> -<B>Functors</B> i.e. <B>parametrized modules</B> i.e. <B>incomplete modules</B>, defining -a concrete syntax in terms of an interface. -</P> -<PRE> - incomplete concrete AdjI of Adj = open Lexicon in { - lincat A = Adjective ; - lin Even = even_A ; - } -</PRE> -<P> -A functor can be <B>instantiated</B> by providing instances of its open interfaces. -</P> -<PRE> - concrete AdjEng of Adj = AdjI with (Lexicon = LexiconEng) ; -</PRE> -<P></P> -<A NAME="toc4"></A> -<H3>Compilation units</H3> -<P> -The compilation unit of GF source code is a file that contains a module. -Judgements outside modules are supported only for backward compatibility, -as explained <a href="#oldgf">here</a>. -Every source file, suffixed <CODE>.gf</CODE>, is compiled to a "GF object file", -suffixed <CODE>.gfo</CODE> (as of GF Version 3.0 and later). For runtime grammar objects -used for parsing and linearization, a set of <CODE>.gfo</CODE> files is linked to -a single file suffixed <CODE>.pgf</CODE>. While <CODE>.gf</CODE> and <CODE>.gfo</CODE> files may contain -modules of any kinds, a <CODE>.pgf</CODE> file always contains a multilingual grammar -with one abstract and a set of concrete syntaxes. -</P> -<P> -The following diagram summarizes the files involved in the compilation process. -<center> -<CODE>module1.gf module2.gf ... modulen.gf</CODE> -</P> -<P> -==> -</P> -<P> -<CODE>module1.gfo module2.gfo ... modulen.gfo</CODE> -</P> -<P> -==> -</P> -<P> -grammar.pgf -</center> -Both <CODE>.gf</CODE> and <CODE>.gfo</CODE> files are written in the GF source language; -<CODE>.pgf</CODE> files are written in a lower-level format. The process of translating -<CODE>.gf</CODE> to <CODE>.gfo</CODE> consists of <B>name resolution</B>, <B>type annotation</B>, -<B>partial evaluation</B>, and <B>optimization</B>. -There is a great advantage in the possibility to do this -separately for GF modules and saving the result in <CODE>.gfo</CODE> files. The partial -evaluation phase, in particular, is time and memory consuming, and GF libraries -are therefore distributed in <CODE>.gfo</CODE> to make their use less arduous. -</P> -<P> -<I>In GF before version 3.0, the object files are in a format called <CODE>.gfc</CODE>,</I> -<I>and the multilingual runtime grammar is in a format called <CODE>.gfcm</CODE>.</I> -</P> -<P> -The standard compiler has a built-in <B>make facility</B>, which finds out what -other modules are needed when compiling an explicitly given module. -This facility builds a dependency graph and decides which of the involved -modules need recompilation (from <CODE>.gf</CODE> to <CODE>.gfo</CODE>), and for which the -GF object can be used directly. -</P> -<A NAME="toc5"></A> -<H3>Names</H3> -<P> -Each module <I>M</I> defines a set of <B>names</B>, which are visible in <I>M</I> -itself, in all modules extending <I>M</I> (unless excluded, as explained -<a href="#restrictedinheritance">here</a>), and -all modules opening <I>M</I>. These names can stand for abstract syntax -categories and functions, parameter types and parameter constructors, -and operations. All these names live in the same <B>name space</B>, which -means that a name entering a module more than once due to inheritance or -opening can lead to a <B>conflict</B>. It is specified -<a href="#renaming">here</a> how these -conflicts are resolved. -</P> -<P> -The names of modules live in a name space separate from the other names. -Even here, all names must be distinct in a set of files compiled to a -multilingual grammar. In particular, even files residing in different directories -must have different names, since GF has no notion of hierarchic -module names. -</P> -<P> -Lexically, names belong to the class of <B>identifiers</B>. An idenfifier is -a letter followed by any number of letters, digits, undercores (<CODE>_</CODE>) and -primes (<CODE>'</CODE>). Upper- and lower-case letters are treated as distinct. -Nothing dictates the choice of upper or lower-case initials, but -the standard libraries follow conventions similar to Haskell: -</P> -<UL> -<LI>upper case is used for modules, abstract syntax categories and functions, - parameter types and constructors, and type synonyms -<LI>lower case is used for non-type-valued operations and for variables -</UL> - -<P> -<a name="identifiers"></a> -</P> -<P> -"Letters" as mentioned in the identifier syntax include all 7-bit ASCII -letters. Iso-latin-1 and Unicode letters are supported in varying degrees -by different tools and platforms, and are hence not recommended in identifiers. -</P> -<A NAME="toc6"></A> -<H3>The structure of a module</H3> -<P> -Modules of all types have the following structure: -<center> -<I>moduletype</I> <I>name</I> <CODE>=</CODE> <I>extends</I> <I>opens</I> <I>body</I> -</center> -The part of the module preceding the body is its <B>header</B>. The header -defines the type of the module and tells what other modules it inherits -and opens. The body consists of the judgements that introduce all the new -names defined by the module. -</P> -<P> -Any of the parts <I>extends</I>, <I>opens</I>, and <I>body</I> may be empty. -If they are all filled, delimiters and keywords separate the parts in the -following way: -<center> -<I>moduletype</I> <I>name</I> <CODE>=</CODE> - <I>extends</I> <CODE>**</CODE> <CODE>open</CODE> <I>opens</I> <CODE>in</CODE> <CODE>{</CODE> <I>body</I> <CODE>}</CODE> -</center> -The part <I>moduletype</I> <I>name</I> looks slightly different if the -type is <CODE>concrete</CODE> or <CODE>instance</CODE>: the <I>name</I> intrudes between -the type keyword and the name of the module being implemented and which -really belongs to the type of the module: -<center> - <CODE>concrete</CODE> <I>name</I> <CODE>of</CODE> <I>abstractname</I> -</center> -The only exception to the schema of functor syntax -is functor instantiations: the instantiation -list is given in a special way between <I>extends</I> and <I>opens</I>: -<center> -<CODE>incomplete concrete</CODE> <I>name</I> <CODE>of</CODE> <I>abstractname</I> <CODE>=</CODE> - <I>extends</I> <CODE>**</CODE> <I>functorname</I> <CODE>with</CODE> <I>instantiations</I> <CODE>**</CODE> - <CODE>open</CODE> <I>opens</I> <CODE>in</CODE> <CODE>{</CODE> <I>body</I> <CODE>}</CODE> -</center> -Logically, the part "<I>functorname</I> <CODE>with</CODE> <I>instantiations</I>" should -really be one of the <I>extends</I>. This is also shown by the fact that -it can have restricted inheritance (concept defined <a href="#restrictedinheritance">here</a>). -</P> -<A NAME="toc7"></A> -<H3>Module types, headers, and bodies</H3> -<P> -The <I>extends</I> and <I>opens</I> parts of a module header are lists of -module names (with possible qualifications, as defined below <a href="#qualifiednames">here</a>). -The first step of type checking a module consists of verifying that -these names stand for modules of approptiate module types. As a rule -of thumb, -</P> -<UL> -<LI>the <I>extends</I> of a module must have the same <I>moduletype</I> -<LI>the <I>opens</I> of a module must be of type <CODE>resource</CODE> -</UL> - -<P> -However, the precise rules are a little more fine-grained, because -of the presence of interfaces and their instances, and the possibility -to reuse abstract and concrete modules as resources. The following table -gives, for all module types, the possible module types of their <I>extends</I> -and <I>opens</I>, as well as the forms of judgement legal in that module type. -</P> -<TABLE class="table"> -<TR> -<TH>module type</TH> -<TH>extends</TH> -<TH>opens</TH> -<TH COLSPAN="2">body</TH> -</TR> -<TR> -<TD><CODE>abstract</CODE></TD> -<TD>abstract</TD> -<TD>-</TD> -<TD><CODE>cat, fun, def, data</CODE></TD> -</TR> -<TR> -<TD><CODE>concrete of</CODE> <I>abstract</I></TD> -<TD>concrete</TD> -<TD>resource*</TD> -<TD><CODE>lincat, cat, oper, param</CODE></TD> -</TR> -<TR> -<TD><CODE>resource</CODE></TD> -<TD>resource*</TD> -<TD>resource*</TD> -<TD><CODE>oper, param</CODE></TD> -</TR> -<TR> -<TD><CODE>interface</CODE></TD> -<TD>resource+</TD> -<TD>resource*</TD> -<TD><CODE>oper, param</CODE></TD> -</TR> -<TR> -<TD><CODE>instance of</CODE> <I>interface</I></TD> -<TD>resource*</TD> -<TD>resource*</TD> -<TD><CODE>oper, param</CODE></TD> -</TR> -<TR> -<TD><CODE>incomplete</CODE> concrete</TD> -<TD>concrete+</TD> -<TD>resource+</TD> -<TD><CODE>lincat, cat, oper, param</CODE></TD> -</TR> -</TABLE> - -<P></P> -<P> -The table uses the following shorthands for lists of module types: -</P> -<UL> -<LI>resource*: resource, instance, concrete -<LI>resource+: resource*, interface, abstract -<LI>concrete+: concrete, incomplete concrete -</UL> - -<P> -The legality of judgements in the body is checked before the judgements -themselves are checked. -</P> -<P> -The forms of judgement are explained <a href="#judgementforms">here</a>. -</P> -<A NAME="toc8"></A> -<H3>Digression: the logic of module types</H3> -<P> -Why are the legality conditions of opens and extends so complicated? The best way -to grasp them is probably to consider a simplified logical model of the module -system, replacing modules by types and functions. This model could actually -be developed towards treating modules in GF as first-class objects; so far, -however, this step has not been motivated by any practical needs. -</P> -<TABLE class="table"> -<TR> -<TH>module</TH> -<TH COLSPAN="2">object and type</TH> -</TR> -<TR> -<TD>abstract A = B</TD> -<TD>A = B : type</TD> -</TR> -<TR> -<TD>concrete C of A = B</TD> -<TD>C = B : A -> S</TD> -</TR> -<TR> -<TD>interface I = B</TD> -<TD>I = B : type</TD> -</TR> -<TR> -<TD>instance J of I = B</TD> -<TD>J = B : I</TD> -</TR> -<TR> -<TD>incomplete concrete C of A = open I in B</TD> -<TD>C = B : I -> A -> S</TD> -</TR> -<TR> -<TD>concrete K of A = C with (I=J)</TD> -<TD>K = B(J) : A -> S</TD> -</TR> -<TR> -<TD>resource R = B</TD> -<TD>R = B : I</TD> -</TR> -<TR> -<TD>concrete C of A = open R in B</TD> -<TD>C = B(R) : A -> S</TD> -</TR> -</TABLE> - -<P></P> -<P> -A further step of defining modules as first-class objects would use -GADTs and record types: -</P> -<UL> -<LI>an abstract syntax is a Generalized Algebraic Datatype (GADT) -<LI>the target type <CODE>S</CODE> of concrete syntax is the type of nested - tuples over strings and integers -<LI>an interface is a labelled record type -<LI>an instance is a record of the type defined by the interface -<LI>a functor, with a module body opening an interface, is a function - on its instances -<LI>the instantiation of a functor is an application of the function to - some instance -<LI>a resource is a typed labelled record, putting together an interface and - an instance of it -<LI>the body of a module opening a resource is as a function on the interface - implicit in the resource; this function is immediately applied to the instance - defined in the resource -</UL> - -<P> -Slightly unexpectedly, interfaces and instances are easier to understand -in this way than resources - a resource is, indeed, more complex, since -it fuses together an interface and an instance. -</P> -<P> -<a name="openabstract"></a> -</P> -<P> -When an abstract is used as an interface and a concrete as its instance, they -are actually reinterpreted so that they match the model. Then the abstract is -no longer a GADT, but a system of <I>abstract</I> datatypes, with a record field -of type <CODE>Type</CODE> for each category, and a function among these types for each -abstract syntax function. A concrete syntax instantiates this record with -linearization types and linearizations. -</P> -<A NAME="toc9"></A> -<H3>Inheritance</H3> -<P> -After checking that the <I>extends</I> of a module are of appropriate -module types, the compiler adds the inherited judgements to the -judgements included in the body. The inherited judgements are -not copied entirely, but their names with links to the inherited module. -Conflicts may arise in this process: a name can have two definitions in the combined -pool of inherited and added judgements. Such a conflict is always an -error: GF provides no way to redefine an inherited constant. -</P> -<P> -Simple as the definition of a conflict may sound, it has to take care of the -inheritance hierarchy. A very common pattern of inheritance is the -<B>diamond</B>: inheritance from two modules which themselves inherit a common -base module. Assume that the base module defines a name <CODE>f</CODE>: -</P> -<PRE> - N - / \ - M1 M2 - \ / - Base {f} -</PRE> -<P> -Now, <CODE>N</CODE> inherits <CODE>f</CODE> from both <CODE>M1</CODE> and <CODE>M2</CODE>, so is there a -conflict? The answer in GF is <I>no</I>, because the "two" <CODE>f</CODE>'s are in the -end the same: the one defined in <CODE>Base</CODE>. The situation is thus simpler -than in <B>multiple inheritance</B> in languages like C++, because definitions in -GF are <B>immutable</B>: neither <CODE>M1</CODE> nor <CODE>M2</CODE> can possibly have changed -the definition of <CODE>f</CODE> given in <CODE>Base</CODE>. In practice, the compiler manages -inheritance through hierarchy in a very simple way, by just always creating -a link not to the immediate parent, but the original ancestor; this ancestor -can be read from the link provided by the immediate parent. Here is how -links are created from source modules by the compiler: -</P> -<PRE> - Base {f} - M1 {m1} ===> M1 {Base.f, m1} - M2 {m2} ===> M2 {Base.f, m2} - N {n} ===> N {Base.f, M1.m1, M2.m2, n} -</PRE> -<P></P> -<P> -<a name="restrictedinheritance"></a> -</P> -<P> -Inheritance can be <B>restricted</B>. This means that a module can be specified -as inheriting <I>only</I> explicitly listed constants, or all constants -<I>except</I> ones explicitly listed. The syntax uses constant names in brackets, -prefixed by a minus sign in the case of an exclusion list. In the following -configuration, N inherits <CODE>a,b,c</CODE> from <CODE>M1</CODE>, and all names but <CODE>d</CODE> -from <CODE>M2</CODE> -</P> -<PRE> - N = M1 {a,b,c}, M2-{d} -</PRE> -<P> -Restrictions are performed as a part of inheritance linking, module by module: -the link is created for a constant if and only if it is both -included in the module and compatible with the restriction. Thus, -for instance, an inadvertent usage can exclude a constant from one module -but inherit it from another one. In the following -configuration, <CODE>f</CODE> is inherited via <CODE>M1</CODE>, if <CODE>M1</CODE> inherits it. -</P> -<PRE> - N = M1 [a,b,c], M2-[f] -</PRE> -<P> -Unintended inheritance may cause problems later in compilation, in the -judgement-level dependency analysis phase. For instance, suppose a function -<CODE>f</CODE> has category <CODE>C</CODE> as its type in <CODE>M</CODE>, and we only include <CODE>f</CODE>. The -exclusion has the effect of creating an ill-formed module: -</P> -<PRE> - abstract M = {cat C ; fun f : C ;} - M [f] ===> {fun f : C ;} -</PRE> -<P> -One might expect inheritance restriction to be transitive: if an included -constant <I>b</I> depends on some other constant <I>a</I>, then <I>a</I> should be -included automatically. However, this rule would leave to hard-to-detect -inheritances. And it could only be applied later in the compilation phase, -when the compiler has not only collected the names defined, but also -resolved the names used in definitions. -</P> -<P> -Yet another pitfall with restricted inheritance is that it must be stated -for each module separately. For instance, a concrete syntax of an abstract -must exclude all those names that the abstract does, and a functor instantiation -must replicate all restrictions of the functor. -</P> -<A NAME="toc10"></A> -<H3>Opening</H3> -<P> -Opening makes constants from other modules usable in judgements, without -inheriting them. This means that, unlike inheritance, opening is not -transitive. -</P> -<P> -<a name="qualifiednames"></a> -</P> -<P> -Opening cannot be restricted as inheritance can, but it can be <B>qualified</B>. -This means that the names from the opened modules cannot be used as such, but -only as prefixed by a qualifier and a dot (<CODE>.</CODE>). The qualifier can be any -identifier, including the name of the module. Here is an example of -an <I>opens</I> list: -</P> -<PRE> - open A, (X = XSLTS), (Y = XSLTS), B -</PRE> -<P> -If <CODE>A</CODE> defines the constant <CODE>a</CODE>, it can be accessed by the names -</P> -<PRE> - a A.a -</PRE> -<P> -If <CODE>XSLTS</CODE> defines the constant <CODE>x</CODE>, it can be accessed by the names -</P> -<PRE> - X.x Y.x XSLTS.x -</PRE> -<P> -Thus qualification by real module name is always possible, and one and the same -module can be qualified in different ways at the same time (the latter can -be useful if you want to be able to change the implementations of some -constants to a different resource later). Since the qualification with real -module name is always possible, it is not possible to "swap" the names of -modules locally: -</P> -<PRE> - open (A=B), (B=A) -- NOT POSSIBLE! -</PRE> -<P> -The list of qualifiers names and module names in a module header may -thus not contain any duplicates. -</P> -<A NAME="toc11"></A> -<H3>Name resolution</H3> -<P> -<a name="renaming"></a> -</P> -<P> -<B>Name resolution</B> is the compiler phase taking place after inheritance -linking. It qualifies all names occurring in the definition parts of judgements -(that is, just excluding the defined names themselves) with the names of -the modules they come from. If a name can come from different modules (that is, -not from their common ancestor), a conflict is reported; this decision is -hence not dependent on e.g. types, which are known only at a later phase. -</P> -<P> -Qualification of names is the main device for avoiding conflicts in -name resolution. No other information is used, such as priorities between -modules. However, if a name is defined in different opened modules -but never used in the module body, -a conflict does not arise: conflicts arise only -when names are used. Also in this respect, opening is thus different from -inheritance, where conflicts are checked independently of use. -</P> -<P> -As usual, inner scope has priority in name resolution. This means that -if an identifier is in scope as a bound variable, it will not be -interpreted as a constant, unless qualified by a module name -(variable bindings are explained <a href="#variablebinding">here</a>). -</P> -<A NAME="toc12"></A> -<H3>Functor instantiations</H3> -<P> -We have dealt with the principles of module headers, inheritance, and -names in a general way that applies to all module types. The exception -is functor instantiations, that have an extra part of the instantiating -equations, assigning an instance to every interface. Here is a typical -example, displaying the full generality: -</P> -<PRE> - concrete FoodsEng of Foods = PhrasesEng ** - FoodsI-[Pizza] with - (Syntax = SyntaxEng), - (LexFoods = LexFoodsEng) ** - open SyntaxEng, ParadigmsEng in { - lin Pizza = mkCN (mkA "Italian") (mkN "pie") ; - } -</PRE> -<P> -(The example is modified from Section 5.9 in the GF Tutorial.) -</P> -<P> -The instantiation syntax is similar to qualified <I>opens</I>. The left-hand-side -names must be interfaces, the right-hand-side names their instances. (Recall -that <CODE>abstract</CODE> can be use as <CODE>interface</CODE> and <CODE>concrete</CODE> as its -<CODE>instance</CODE>.) Inheritance from the functor can be restricted, typically -in the purpose of defining some excluded functions in language-specific -ways in the module body. -</P> -<A NAME="toc13"></A> -<H3>Completeness</H3> -<P> -<a name="completeness"></a> -</P> -<P> -(This section refers to the forms of judgement introduced <a href="#judgementforms">here</a>.) -</P> -<P> -A <CODE>concrete</CODE> is complete with respect to an <CODE>abstract</CODE>, if it -contains a <CODE>lincat</CODE> definition for every <CODE>cat</CODE> declaration, and -a <CODE>lin</CODE> definition for every <CODE>fun</CODE> declaration. -</P> -<P> -The same completeness criterion applies to functor instantiations. -It is not possible to use a partial functor instantiation, leading -to another functor. -</P> -<P> -Functors do not need to be complete in the sense concrete modules need. -The missing definitions can then be provided in the body of each -functor instantiation. -</P> -<P> -A <CODE>resource</CODE> is complete, if all its <CODE>oper</CODE> and <CODE>param</CODE> judgements -have a definition part. While a <CODE>resource</CODE> must be complete, an -<CODE>interface</CODE> need not. For an <CODE>interface</CODE>, it is the definition -parts of judgements are optional. -</P> -<P> -An <CODE>instance</CODE> is complete with respect to an <CODE>interface</CODE>, if it -gives the definition parts of all <CODE>oper</CODE> and <CODE>param</CODE> judgements -that are omitted in the <CODE>interface</CODE>. Giving definitions to judgements -that have already been defined in the <CODE>interface</CODE> is illegal. -Type signatures, on the other hand, can be repeated if the same types -are used. -</P> -<P> -In addition to completing the definitions in an <CODE>interface</CODE>, -its instance may contain other judgements, but these must all -be complete with definitions. -</P> -<P> -Here is an example of an instance and its interface showing the -above variations: -</P> -<PRE> - interface Pos = { - param Case ; -- no definition - param Number = Sg | Pl ; -- definition given - oper Noun : Type = { -- relative definition given - s : Number => Case => Str - } ; - oper regNoun : Str -> Noun ; -- no definition - } - - instance PosEng of Pos = { - param Case = Nom | Gen ; -- definition of Case - -- Number and Noun inherited - oper regNoun = \dog -> { -- type of regNoun inherited - s = table { -- definition of regNoun - Sg => table { - Nom => dog - -- etc - } - } ; - oper house_N : Noun = -- new definition - regNoun "house" ; - } -</PRE> -<P></P> -<A NAME="toc14"></A> -<H2>Judgements</H2> -<A NAME="toc15"></A> -<H3>Overview of the forms of judgement</H3> -<P> -<a name="judgementforms"></a> -</P> -<P> -A module body in GF is a set of <B>judgements</B>. Judgements are -definitions or declarations, sometimes combinations of the two; the -common feature is that every judgement introduces a name, which is -available in the module and whenever the module is extended or opened. -</P> -<P> -There are several different <B>forms of judgement</B>, identified by different -<B>judgement keywords</B>. Here is a list of all these forms, together -with syntax descriptions and the types of modules in which each form can occur. -The table moreover indicates whether the judgement has a default value, and -whether it contributes to the <B>name base</B>, i.e. introduces a new -name to the scope. -</P> -<TABLE class="table"> -<TR> -<TH>judgement</TH> -<TH>where</TH> -<TH>module</TH> -<TH>default</TH> -<TH COLSPAN="2">base</TH> -</TR> -<TR> -<TD><CODE>cat</CODE> C G</TD> -<TD>G context</TD> -<TD>abstract</TD> -<TD>N/A</TD> -<TD>yes</TD> -</TR> -<TR> -<TD><CODE>fun</CODE> f : A</TD> -<TD>A type</TD> -<TD>abstract</TD> -<TD>N/A</TD> -<TD>yes</TD> -</TR> -<TR> -<TD><CODE>def</CODE> f ps = t</TD> -<TD>f fun, ps patterns, t term</TD> -<TD>abstract</TD> -<TD>yes</TD> -<TD>no</TD> -</TR> -<TR> -<TD><CODE>data</CODE> C = f <CODE>|</CODE> ... <CODE>|</CODE> g</TD> -<TD>C cat, f...g fun</TD> -<TD>abstract</TD> -<TD>yes</TD> -<TD>no</TD> -</TR> -<TR> -<TD><CODE>lincat</CODE> C = T</TD> -<TD>C cat, T type</TD> -<TD>concrete*</TD> -<TD>yes</TD> -<TD>yes</TD> -</TR> -<TR> -<TD><CODE>lin</CODE> f = t</TD> -<TD>f fun, t term</TD> -<TD>concrete*</TD> -<TD>no</TD> -<TD>yes</TD> -</TR> -<TR> -<TD><CODE>lindef</CODE> C = t</TD> -<TD>C cat, t term</TD> -<TD>concrete*</TD> -<TD>yes</TD> -<TD>no</TD> -</TR> -<TR> -<TD><CODE>linref</CODE> C = t</TD> -<TD>C cat, t term</TD> -<TD>concrete*</TD> -<TD>yes</TD> -<TD>no</TD> -</TR> -<TR> -<TD><CODE>printname cat</CODE> C = t</TD> -<TD>C cat, t term</TD> -<TD>concrete*</TD> -<TD>yes</TD> -<TD>no</TD> -</TR> -<TR> -<TD><CODE>printname fun</CODE> f = t</TD> -<TD>f fun, t term</TD> -<TD>concrete*</TD> -<TD>yes</TD> -<TD>no</TD> -</TR> -<TR> -<TD><CODE>param</CODE> P = C<CODE>|</CODE> ... <CODE>|</CODE> D</TD> -<TD>C...D constructors</TD> -<TD>resource*</TD> -<TD>N/A</TD> -<TD>yes</TD> -</TR> -<TR> -<TD><CODE>oper</CODE> f : T = t</TD> -<TD>T type, t term</TD> -<TD>resource*</TD> -<TD>N/A</TD> -<TD>yes</TD> -</TR> -<TR> -<TD><CODE>flags</CODE> o = v</TD> -<TD>o flag, v value</TD> -<TD>all</TD> -<TD>yes</TD> -<TD>N/A</TD> -</TR> -</TABLE> - -<P></P> -<P> -Judgements that have default values are rarely used, except <CODE>lincat</CODE> and -<CODE>flags</CODE>, which often need values different from the defaults. -</P> -<P> -Introducing a name twice in the same module is an error. In other words, -all judgements that have a "yes" in the name base column, must -have distinct identifiers on their left-hand sides. -</P> -<P> -All judgement end with semicolons (<CODE>;</CODE>). -</P> -<P> -In addition to the syntax given in the table, many of the forms have -syntactic sugar. This sugar will be explained below in connection to -each form. There are moreover two kinds of syntactic sugar common to all forms: -</P> -<UL> -<LI>the judgement keyword is shared between consecutive judgements - until a new keyword appears: -<center> -<CODE>keyw J ; K ;</CODE> === <CODE>keyw J ; keyw K ;</CODE> -</center> -<LI>the right-hand sides of colon (<CODE>:</CODE>) and equality (<CODE>=</CODE>) - can be shared, by using comma (<CODE>,</CODE>) as separator of left-hand sides, which - must consist of identifiers -<center> -<CODE>c,d : T</CODE> === <CODE>c : T ; d : T ;</CODE> -<P></P> -<CODE>c,d = t</CODE> === <CODE>c = t ; d = t ;</CODE> -</center> -</UL> - -<P> -These conventions, like all syntactic sugar, are performed at an -early compilation phase, directly after parsing. This means that e.g. -</P> -<PRE> - lin f,g = \x -> x ; -</PRE> -<P> -can be correct even though <CODE>f</CODE> and <CODE>g</CODE> required different -function types. -</P> -<P> -Within a module, judgements can occur in any order. In particular, -a name can be used before it is introduced. -</P> -<P> -The explanations of judgement forms refer to the notions -of <B>type</B> and <B>term</B> (the latter also called <B>expression</B>). -These notions will be explained in detail <a href="#expressions">here</a>. -</P> -<A NAME="toc16"></A> -<H3>Category declarations, cat</H3> -<P> -<a name="catjudgements"></a> -</P> -<P> -Category declarations -<center> -<CODE>cat</CODE> <I>C</I> <I>G</I> -</center> -define the <B>basic types</B> of abstract syntax. -A basic type is formed from a category by giving values to all variables -in the <B>context</B> <I>G</I>. If the context is empty, the -basic type looks the same as the category itself. Otherwise, application -syntax is used: -<center> -<I>C</I> <i>a</i><sub>1</sub>...<i>a</i><sub>n</sub> -</center> -</P> -<A NAME="toc17"></A> -<H3>Hypotheses and contexts</H3> -<P> -<a name="contexts"></a> -</P> -<P> -A context is a sequence of <B>hypotheses</B>, i.e. variable-type pairs. -A hypothesis is written -<center> -<CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> -</center> -and a sequence does not have any separator symbols. As syntactic sugar, -</P> -<UL> -<LI>variables can share a type, -<center> -<CODE>(</CODE> <I>x,y</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> === <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> <CODE>(</CODE> <I>y</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> -</center> -<LI>a <B>wildcard</B> can be used for a variable not occurring in types - later in the context, -<center> -<CODE>(</CODE> <CODE>_</CODE> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> === <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> -</center> -<LI>if the variable does not occur later, it can be omitted altogether, and - parentheses are not used, -<center> - <I>T</I> === <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>T</I> <CODE>)</CODE> -</center> - But if <I>T</I> is more complex than an identifier, it needs parentheses to - be separated from the rest of the context. -</UL> - -<P> -An abstract syntax has <B>dependent types</B>, if any of its categories has -a non-empty context. -</P> -<A NAME="toc18"></A> -<H3>Function declarations, fun</H3> -<P> -Function declarations, -<center> - <CODE>fun</CODE> <I>f</I> <CODE>:</CODE> <I>T</I> -</center> -define the <B>syntactic constructors</B> of abstract -syntax. The type <I>T</I> of <I>f</I> -is built built from basic types (formed from categories) by using -the function type constructor <CODE>-></CODE>. Thus its form is -<center> - (<i>x</i><sub>1</sub> <CODE>:</CODE> <i>A</i><sub>1</sub>) <CODE>-></CODE> ... <CODE>-></CODE> (<i>x</i><sub>n</sub> <CODE>:</CODE> <i>A</i><sub>n</sub>) <CODE>-></CODE> <I>B</I> -</center> -where <I>Ai</I> are types, called the <B>argument types</B>, and <I>B</I> is a -basic type, called the <B>value type</B> of <I>f</I>. The <B>value category</B> of -<I>f</I> is the category that forms the type <I>B</I>. -</P> -<P> -A <B>syntax tree</B> is formed from <I>f</I> by applying it to a full list of -arguments, so that the result is of a basic type. -</P> -<P> -A <B>higher-order function</B> is one that has a function type as an -argument. The concrete syntax of GF does not support displaying the -bound variables of functions of higher than second order, but they are -legal in abstract syntax. -</P> -<P> -An abstract syntax is <B>context-free</B>, if it has neither dependent -types nor higher-order functions. Grammars with context-free abstract -syntax are an important subclass of GF, with more limited complexity -than full GF. Whether the <I>concrete</I> syntax is context-free in the sense -of the Chomsky hierarchy is independent of the context-freeness of -the abstract syntax. -</P> -<A NAME="toc19"></A> -<H3>Function definitions, def</H3> -<P> -Function definitions, -<center> - <CODE>def</CODE> <I>f</I> <i>p</i><sub>1</sub> ... <i>p</i><sub>n</sub> <CODE>=</CODE> <I>t</I> -</center> -where <I>f</I> is a <CODE>fun</CODE> function and <i>p</i><sub>i</sub># are patterns, -impose a relation of <B>definitional equality</B> on abstract syntax -trees. They form the basis of <B>computation</B>, which is used -when comparing whether two types are equal; this notion is relevant -only if the types are dependent. Computation can also be used for -the <B>normalization</B> of syntax trees, which applies even in -context-free abstract syntax. -</P> -<P> -The set of <CODE>def</CODE> definitions for <I>f</I> can be scattered around -the module in which <I>f</I> is introduced as a function. The compiler -builds the set of pattern equations in the order in which the -equations appear; this order is significant in the case of -overlapping patterns. All equations must appear in the same module in -which <I>f</I> itself declared. -</P> -<P> -The syntax of patterns will be specified <a href="#patternmatching">here</a>, commonly for -abstract and concrete syntax. In abstract -syntax, <B>constructor patterns</B> are those of the form -<center> - <I>C</I> <i>p</i><sub>1</sub> ... <i>p</i><sub>n</sub> -</center> -where <I>C</I> is declared as <CODE>data</CODE> for some abstract syntax category -(see next section). A <B>variable pattern</B> is either an identifier or -a wildcard. -</P> -<P> -A common pitfall is to forget to declare a constructor as data, which -causes it to be interpreted as a variable pattern in definitions. -</P> -<P> -Computation is performed by applying definitions and beta conversions, -and in general by using <B>pattern matching</B>. Computation and pattern matching -are explained commonly for abstract and concrete syntax <a href="#patternmatching">here</a>. -</P> -<P> -In contrast to concrete syntax, abstract syntax computation is -completely <B>symbolic</B>: it does not produce a value, but just another -term. Hence it is not an error to have incomplete systems of -pattern equations for a function. In addition, the definitions -can be <B>recursive</B>, which means that computation can fail to terminate; -this can never happen in concrete syntax. -</P> -<A NAME="toc20"></A> -<H3>Data constructor definitions, data</H3> -<P> -A data constructor definition, -<center> - <CODE>data</CODE> <I>C</I> <CODE>=</CODE> <i>f</i><sub>1</sub> <CODE>|</CODE> ... <CODE>|</CODE> <i>f</i><sub>n</sub> -</center> -defines the functions <I>f1</I>...<I>fn</I> to be <B>constructors</B> -of the category <I>C</I>. This means that they are recognized as constructor -patterns when used in function definitions. -</P> -<P> -In order for the data constructor definition to be correct, -<i>f</i><sub>1</sub>...<i>f</i><sub>n</sub> must be functions with <I>C</I> as their value category. -</P> -<P> -The complete set of constructors for a category <I>C</I> is the union of -all its data constructor definitions. Thus a category can be "extended" -by new constructors afterwards. However, all these constructor definitions -must appear in the same module in which the category is itself defined. -</P> -<P> -There is syntactic sugar for declaring a function as a constructor at -the same time as introducing it: -<center> -<CODE>data</CODE> <I>f</I> : <i>A</i><sub>1</sub> <CODE>-></CODE> ... <CODE>-></CODE> <i>A</i><sub>n</sub> <CODE>-></CODE> <I>C</I> <i>t</i><sub>1</sub> ... <i>t</i><sub>m</sub> -</P> -<P> - === -</P> -<P> -<CODE>fun</CODE> <I>f</I> : <i>A</i><sub>1</sub> <CODE>-></CODE> ... <CODE>-></CODE> <i>A</i><sub>n</sub> <CODE>-></CODE> <I>C</I> <i>t</i><sub>1</sub> ... <i>t</i><sub>m</sub> ; - <CODE>data</CODE> <I>C</I> = <I>f</I> -</center> -</P> -<A NAME="toc21"></A> -<H3>The semantic status of an abstract syntax function</H3> -<P> -There are three possible statuses for a function declared in a <CODE>fun</CODE> judgement: -</P> -<UL> -<LI>primitive notion: the default status -<LI>constructor: the function appears on the right-hand side in <CODE>data</CODE> judgement -<LI>defined: the function has a <CODE>def</CODE> definition -</UL> - -<P> -The "constructor" and "defined" statuses are in contradiction with each other, -whereas the primitive notion status is overridden by any of the two others. -</P> -<P> -This distinction is relevant for the semantics of abstract syntax, not -for concrete syntax. It shows in the way patterns are treated in -equations in <CODE>def</CODE> definitions: a constructor -in a pattern matches only itself, whereas -any other name is treated as a variable pattern, which matches -anything. -</P> -<A NAME="toc22"></A> -<H3>Linearization type definitions, lincat</H3> -<P> -A linearization type definition, -<center> - <CODE>lincat</CODE> <I>C</I> <CODE>=</CODE> <I>T</I> -</center> -defines the type of linearizations of trees whose type has category <I>C</I>. -Type dependences have no effect on the linearization type. -</P> -<P> -The type <I>T</I> must be a <B>legal linearization type</B>, which means that it -is a <I>record type</I> whose fields have either parameter types, the type Str -of strings, or table or record types of these. In particular, function types -may not appear in <I>T</I>. A detailed explanation of types in concrete syntax -will be given <a href="#cnctypes">here</a>. -</P> -<P> -If <I>K</I> is the concrete syntax of an abstract syntax <I>A</I>, then <I>K</I> must -define the linearization type of all categories declared in <I>A</I>. However, -the definition can be omitted from the source code, in which case the default -type <CODE>{s : Str}</CODE> is used. -</P> -<A NAME="toc23"></A> -<H3>Linearization definitions, lin</H3> -<P> -A linearization definition, -<center> - <CODE>lin</CODE> <I>f</I> <CODE>=</CODE> <I>t</I> -</center> -defines the linearizations function of function <I>f</I>, i.e. the function -used for linearizing trees formed by <I>f</I>. -</P> -<P> -The type of <I>t</I> must be the homomorphic image of the type of <I>f</I>. -In other words, if -<center> - <CODE>fun</CODE> <I>f</I> <CODE>:</CODE> <i>A</i><sub>1</sub> <CODE>-></CODE> ... <CODE>-></CODE> <i>A</i><sub>n</sub> <CODE>-></CODE> <I>A</I> -</center> -then -<center> - <CODE>lin</CODE> <I>f</I> <CODE>:</CODE> <i>A</i><sub>1</sub>* <CODE>-></CODE> ... <CODE>-></CODE> <i>A</i><sub>n</sub>* <CODE>-></CODE> <I>A</I>* -</center> -where the type <I>T</I>* is defined as follows depending on <I>T</I>: -</P> -<UL> -<LI>(<I>C</I> <i>t</i><sub>1</sub> ... <i>t</i><sub>n</sub>)* = <I>T</I>, if <CODE>lincat</CODE> <I>C</I> <CODE>=</CODE> <I>T</I> -<LI>(<i>B</i><sub>1</sub> <CODE>-></CODE> ... <CODE>-></CODE> <i>B</i><sub>m</sub> <CODE>-></CODE> <I>B</I>)* = <I>B</I>* <CODE>** {$0,...,$m : Str}</CODE> -</UL> - -<P> -The second case is relevant for higher-order functions only. It says that -the linearization type of the value type is extended by adding a string field -for each argument types; these fields store the variable symbol used for -the binding of each variable. -</P> -<P> -<a name="HOAS"></a> -</P> -<P> -Since the arguments of a function argument are treated as bare strings, -orders higher than the second are irrelevant for concrete syntax. -</P> -<P> -There is syntactic sugar for binding the variables of the linearization -of a function on the left-hand side: -<center> - <CODE>lin</CODE> <I>f</I> <I>p</I> <CODE>=</CODE> <I>t</I> === <CODE>lin</CODE> <I>f</I> <CODE>= \</CODE><I>p</I> <CODE>-></CODE> <I>t</I> -</center> -The pattern <I>p</I> must be either a variable or a wildcard (<CODE>_</CODE>); this is -what the syntax of lambda abstracts (<CODE>\p -> t</CODE>) requires. -</P> -<A NAME="toc24"></A> -<H3>Linearization default definitions, lindef</H3> -<P> -<a name="lindefjudgements"></a> -</P> -<P> -A linearization default definition, -<center> - <CODE>lindef</CODE> <I>C</I> <CODE>=</CODE> <I>t</I> -</center> -defines the default linearization of category <I>C</I>, i.e. the function -applicable to a string to make it into an object of the linearization -type of <I>C</I>. -</P> -<P> -Linearization defaults are invoked when linearizing variable bindings -in higher-order abstract syntax. A variable symbol is then presented -as a string, which must be converted to correct type in order for -the linearization not to fail with an error. -</P> -<P> -The other use of the defaults is for linearizing metavariables -and abstract functions without linearization in the concrete syntax. -In the first case the default linearization is applied to -the string <CODE>"?X"</CODE> where <CODE>X</CODE> is the unique index -of the metavariable, and in the second case the string is -<CODE>"[f]"</CODE> where <CODE>f</CODE> is the name of the abstract -function with missing linearization. - -</P> -<P> -Usually, linearization defaults are generated by using the default -rule that "uses the symbol itself for every string, and the -first value of the parameter type for every parameter". The precise -definition is by structural recursion on the type: -</P> -<UL> -<LI>default(Str,s) = s -<LI>default(P,s) = #1(P) -<LI>default(P => T,s) = <CODE>\\_ =></CODE> default(T,s) -<LI>default(<CODE>{</CODE>... ; r : R ; ...<CODE>}</CODE>,s) = <CODE>{</CODE>... ; r : default(R,s) ; ...<CODE>}</CODE> -</UL> - -<P> -The notion of the first value of a parameter type (#1(P)) is defined -<a href="#paramvalues">below</a>. -</P> -<A NAME="toc24_r"></A> -<H3>Linearization reference definitions, linref</H3> -<P> -<a name="linrefjudgements"></a> -</P> -<P> -A linearization reference definition, -<center> - <CODE>linref</CODE> <I>C</I> <CODE>=</CODE> <I>t</I> -</center> -defines the reference linearization of category <I>C</I>, i.e. the function -applicable to an object of the linearization type of <I>C</I> to make it into a string. -</P> -<P> -The reference linearization is always applied to the top-level node -of the abstract syntax tree. For example when we linearize the -tree <CODE>f x1 x2 .. xn</CODE>, then we first apply <CODE>f</CODE> -to its arguments which gives us an object of the linearization type of -its category. After that we apply the reference linearization -for the same category to get a string out of the object. This -is particularly useful when the linearization type of <I>C</I> -contains discontious constituents. In this case usually the reference -linearization glues the constituents together to produce an -intuitive linearization string. -</P> -<P> -The reference linearization is also used for linearizing metavariables -which stand in function position. For example the tree -<CODE>f (? x1 x2 .. xn)</CODE> is linearized as follows. Each -of the arguments <CODE>x1 x2 .. xn</CODE> is linearized, and after that -the reference linearization of the its category is applied -to the output of the linearization. The result is a sequence of <CODE>n</CODE> -strings which are concatenated into a single string. The final string -is the input to the default linearization of the category -for the argument of <CODE>f</CODE>. After applying the default linearization -we get an object that we could safely pass to <CODE>f</CODE>. -</P> -<P> -Usually, linearization references are generated by using the -rule that "picks the first string in the linearization type". The precise -definition is by structural recursion on the type: -</P> -<UL> -<LI>reference(Str,o) = Just o -<LI>reference(P,s) = Nothing -<LI>reference(P => T,o) = reference(T,o ! #1(P)) || reference(T,o ! #2(P)) || ... || reference(T,o ! #n(P)) -<LI>reference({r1 : R1; ... rn : Rn},o) = reference(R1, o.r1) || reference(R2, o.r2) || ... || reference(Rn, o.rn) -</UL> -Here each call to reference returns either <CODE>(Just o)</CODE> or <CODE>Nothing</CODE>. -When we compute the reference for a table or a record then we pick -the reference for the first expression for which the recursive call -gives us <CODE>Just</CODE>. If we get <CODE>Nothing</CODE> for -all of them then the final result is <CODE>Nothing</CODE> too. - -<A NAME="toc25"></A> -<H3>Printname definitions, printname cat and printname fun</H3> -<P> -A category printname definition, -<center> - <CODE>printname cat</CODE> <I>C</I> <CODE>=</CODE> <I>s</I> -</center> -defines the printname of category <I>C</I>, i.e. the name used -in some abstract syntax information shown to the user. -</P> -<P> -Likewise, a function printname definition, -<center> - <CODE>printname fun</CODE> <I>f</I> <CODE>=</CODE> <I>s</I> -</center> -defines the printname of function <I>f</I>, i.e. the name used -in some abstract syntax information shown to the user. -</P> -<P> -The most common use of printnames is in the interactive syntax -editor, where printnames are displayed in menus. It is possible -e.g. to adapt them to each language, or to embed HTML tooltips -in them (as is used in some HTML-based editor GUIs). -</P> -<P> -Usually, printnames are generated automatically from the symbol -and/or concrete syntax information. -</P> -<A NAME="toc26"></A> -<H3>Parameter type definitions, param</H3> -<P> -<a name="paramjudgements"></a> -</P> -<P> -A parameter type definition, -<center> - <CODE>param</CODE> <I>P</I> <CODE>=</CODE> <i>C</i><sub>1</sub> <i>G</i><sub>1</sub> <CODE>|</CODE> ... <CODE>|</CODE> <i>C</i><sub>n</sub> <i>G</i><sub>n</sub> -</center> -defines a parameter type <I>P</I> with the <B>parameter constructors</B> -<i>C</i><sub>1</sub>...<i>C</i><sub>n</sub>, with their respective contexts <i>G</i><sub>1</sub>...<i>G</i><sub>n</sub>. -</P> -<P> -<a name="paramtypes"></a> -</P> -<P> -Contexts have the same syntax as in <CODE>cat</CODE> judgements, explained -<a href="#catjudgements">here</a>. Since dependent types are not available in -parameter type definitions, the use of variables is never -necessary. The types in the context must themselves be <B>parameter types</B>, -which are defined as follows: -</P> -<UL> -<LI>Given the judgement <CODE>param</CODE> <I>P</I> ..., <I>P</I> is a parameter type. -<LI>A record type of parameter types is a parameter type. -<LI><CODE>Ints</CODE> <I>n</I> (an initial segment of integers) is a parameter type. -</UL> - -<P> -The names defined by a parameter type definition include both the -type name <I>P</I> and the constructor names <i>C</i><sub>i</sub>. Therefore all these -names must be distinct in a module. -</P> -<P> -A parameter type may not be recursive, i.e. <I>P</I> itself may not occur in -the contexts of its constructors. This restriction extends to mutual -recursion: we say that <I>P</I> <B>depends</B> on the types that occur -in the contexts of its constructors and on all types that those types -depend on, and state that <I>P</I> may not depend on itself. -</P> -<P> -In an <CODE>interface module</CODE>, it is possible to declare a parameter type -without defining it, -<center> - <CODE>param</CODE> <I>P</I> <CODE>;</CODE> -</center> -</P> -<A NAME="toc27"></A> -<H3>Parameter values</H3> -<P> -<a name="paramvalues"></a> -</P> -<P> -All parameter types are finite, and the GF compiler will internally -compute them to <B>lists of parameter values</B>. These lists are formed by -traversing the <CODE>param</CODE> definitions, usually respecting the -order of constructors in the source code. For records, bibliographical -sorting is applied. However, both the order of traversal of <CODE>param</CODE> -definitions and the order of fields in a record are specified -in a compiler-internal way, which means that the programmer should not -rely on any particular order. -</P> -<P> -The order of the list of parameter values can affect the program in two -cases: -</P> -<UL> -<LI>in the default <CODE>lindef</CODE> definition (<a href="#lindefjudgements">here</a>), - the first value is chosen -<LI>in course-of-value tables (<a href="#tables">here</a>), the compiler-internal order is - followed -</UL> - -<P> -The first usage implies that, if <CODE>lindef</CODE> definitions are essential for -the application, they should be given manually. The second usage implies that -course-of-value tables should be avoided in hand-written GF code. -</P> -<P> -In run-time grammar generation, all parameter values are translated to -integers denotions positions in these parameter lists. -</P> -<A NAME="toc28"></A> -<H3>Operation definitions, oper</H3> -<P> -An operation definition, -<center> - <CODE>oper</CODE> <I>h</I> <CODE>:</CODE> <I>T</I> <CODE>=</CODE> <I>t</I> -</center> -defines an <B>operation</B> <I>h</I> of type <I>T</I>, with the computation rule -<center> - <I>h</I> ==> <I>t</I> -</center> -The type <I>T</I> can be any concrete syntax type, including function -types of any order. The term <I>t</I> must have the type <I>T</I>, as -defined <a href="#expressions">here</a>. -</P> -<P> -As syntactic sugar, the type can be omitted, -<center> - <CODE>oper</CODE> <I>h</I> <CODE>=</CODE> <I>t</I> -</center> -which works in two cases -</P> -<UL> -<LI>the type can be inferred from <I>t</I> (compiler-dependent) -<LI>the definition occurs in an <CODE>instance</CODE> and the type is given in - the <CODE>interface</CODE> -</UL> - -<P> -It is also possible to give the type and the definition separately: -<center> -<CODE>oper</CODE> <I>h</I> <CODE>:</CODE> <I>T</I> ; <CODE>oper</CODE> <I>h</I> <CODE>=</CODE> <I>t</I> === - <CODE>oper</CODE> <I>h</I> <CODE>:</CODE> <I>T</I> <CODE>=</CODE> <I>t</I> -</center> -The order of the type part and the definition part is free, and there -can be other judgements in between. However, they must occur in the -same <CODE>resource</CODE> module for it to be complete (as defined <a href="#completeness">here</a>). -In an <CODE>interface</CODE> module, it is enough to give the type. -</P> -<P> -When only the definition is given, it is possible to use a shorthand -similar to <CODE>lin</CODE> judgements: -<center> -<CODE>oper</CODE> <I>h</I> <I>p</I> <CODE>=</CODE> <I>t</I> === <CODE>oper</CODE> <I>h</I> <CODE>=</CODE> <CODE>\</CODE><I>p</I> <CODE>-></CODE> <I>t</I> -</center> -The pattern <I>p</I> is either a variable or a wildcard (<CODE>_</CODE>). -</P> -<P> -Operation definitions may not be recursive, not even mutually recursive. -This condition ensures that functions can in the end be eliminated from -concrete syntax code (as explained <a href="#functionelimination">here</a>). -</P> -<A NAME="toc29"></A> -<H3>Operation overloading</H3> -<P> -<a name="overloading"></a> -</P> -<P> -One and the same operation name <I>h</I> can be used for different operations, -which have to have different types. For each call of <I>h</I>, the type checker -selects one of these operations depending on what type is expected in the -context of the call. The syntax of overloaded operation definitions is -<center> -<CODE>oper</CODE> <I>h</I> - <CODE>= overload {</CODE><I>h</I> : <i>T</i><sub>1</sub> = <i>t</i><sub>1</sub> ; ... ; <I>h</I> : <i>T</i><sub>n</sub> = <i>t</i><sub>n</sub><CODE>}</CODE> -</center> -Notice that <I>h</I> must be the same in all cases. -This format can be used to give the complete implementation; to give just -the types, e.g. in an interface, one can use the form -<center> -<CODE>oper</CODE> <I>h</I> - <CODE>: overload {</CODE><I>h</I> : <i>T</i><sub>1</sub> ; ... ; <I>h</I> : <i>T</i><sub>n</sub><CODE>}</CODE> -</center> -The implementation of this operation typing is given by a judgement of -the first form. The order of branches need not be the same. -</P> -<A NAME="toc30"></A> -<H3>Flag definitions, flags</H3> -<P> -A flag definition, -<center> - <CODE>flags</CODE> <I>o</I> <CODE>=</CODE> <I>v</I> -</center> -sets the value of the flag <I>o</I>, to be used when compiling or using -the module. -</P> -<P> -The flag <I>o</I> is an identifier, and the value <I>v</I> is either an identifier -or a quoted string. -</P> -<P> -Flags are a kind of metadata, which do not strictly belong to the GF -language. For instance, compilers do not necessarily check the -consistency of flags, or the meaningfulness of their values. -The inheritance of flags is not well-defined; the only certain rule -is that flags set in the module body override the settings from -inherited modules. -</P> -<P> -Here are some flags commonly included in grammars. -</P> -<TABLE class="table"> -<TR> -<TH>flag</TH> -<TH>value</TH> -<TH>description</TH> -<TH COLSPAN="2">module</TH> -</TR> -<TR> -<TD><CODE>coding</CODE></TD> -<TD>character encoding</TD> -<TD>encoding used in string literals</TD> -<TD>concrete</TD> -</TR> -<TR> -<TD><CODE>startcat</CODE></TD> -<TD>category</TD> -<TD>default target of parsing</TD> -<TD>abstract</TD> -</TR> -</TABLE> - -<P></P> -<P> -The possible values of these flags are -specified <a href="#flagvalues">here</a>. Note that -the <code>lexer</code> and <code>unlexer</code> flags are -deprecated. If you need their functionality, you should use supply -them to GF shell commands like so: - -<center><pre><code>put_string -lextext "страви, напої" | parse</code></pre></center> - -A summary of their possible values can be found at the <a href="http://www.grammaticalframework.org/doc/gf-shell-reference.html">GF shell - reference</a>. -</p> -</P> -<A NAME="toc31"></A> -<H2>Types and expressions</H2> -<A NAME="toc32"></A> -<H3>Overview of expression forms</H3> -<P> -<a name="expressions"></a> -</P> -<P> -Like many dependently typed languages, GF makes no syntactic distinction -between expressions and types. An illegal use of a type as an expression or -vice versa comes out as a type error. Whether a variable, for instance, -stands for a type or an expression value, can only be resolved from its -context of use. -</P> -<P> -One practical consequence of the common syntax is that global and local definitions -(<CODE>oper</CODE> judgements and <CODE>let</CODE> expressions, respectively) work in the same way -for types and expressions. Thus it is possible to abbreviate a type -occurring in a type expression: -</P> -<PRE> - let A = {s : Str ; b : Bool} in A -> A -> A -</PRE> -<P> -Type and other expressions have a system of <B>precedences</B>. The following table -summarizes all expression forms, from the highest to the lowest precedence. -Some expressions are moreover left- or right-associative. -</P> -<TABLE class="table"> -<TR> -<TH>prec</TH> -<TH>expression example</TH> -<TH COLSPAN="2">explanation</TH> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>c</CODE></TD> -<TD>constant or variable</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>Type</CODE></TD> -<TD>the type of types</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>PType</CODE></TD> -<TD>the type of parameter types</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>Str</CODE></TD> -<TD>the type of strings/token lists</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>"foo"</CODE></TD> -<TD>string literal</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>123</CODE></TD> -<TD>integer literal</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>0.123</CODE></TD> -<TD>floating point literal</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>?</CODE></TD> -<TD>metavariable</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>[]</CODE></TD> -<TD>empty token list</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>[C a b]</CODE></TD> -<TD>list category</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>["foo bar"]</CODE></TD> -<TD>token list</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>{"s : Str ; n : Num}</CODE></TD> -<TD>record type</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE>{"s = "foo" ; n = Sg}</CODE></TD> -<TD>record</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE><Sg,Fem,Gen></CODE></TD> -<TD>tuple</TD> -</TR> -<TR> -<TD>7</TD> -<TD><CODE><n : Num></CODE></TD> -<TD>type-annotated expression</TD> -</TR> -<TR> -<TD>6 left</TD> -<TD><CODE>t.r</CODE></TD> -<TD>projection or qualification</TD> -</TR> -<TR> -<TD>5 left</TD> -<TD><CODE>f a</CODE></TD> -<TD>function application</TD> -</TR> -<TR> -<TD>5</TD> -<TD><CODE>table {Sg => [] ; _ => "xs"}</CODE></TD> -<TD>table</TD> -</TR> -<TR> -<TD>5</TD> -<TD><CODE>table P [a ; b ; c]</CODE></TD> -<TD>course-of-values table</TD> -</TR> -<TR> -<TD>5</TD> -<TD><CODE>case n of {Sg => [] ; _ => "xs"}</CODE></TD> -<TD>case expression</TD> -</TR> -<TR> -<TD>5</TD> -<TD><CODE>variants {"color" ; "colour"}</CODE></TD> -<TD>free variation</TD> -</TR> -<TR> -<TD>5</TD> -<TD><CODE>pre {vowel => "an" ; _ => "a"}</CODE></TD> -<TD>prefix-dependent choice</TD> -</TR> -<TR> -<TD>4 left</TD> -<TD><CODE>t ! v</CODE></TD> -<TD>table selection</TD> -</TR> -<TR> -<TD>4 left</TD> -<TD><CODE>A * B</CODE></TD> -<TD>tuple type</TD> -</TR> -<TR> -<TD>4 left</TD> -<TD><CODE>R ** {b : Bool}</CODE></TD> -<TD>record (type) extension</TD> -</TR> -<TR> -<TD>3 left</TD> -<TD><CODE>t + s</CODE></TD> -<TD>token gluing</TD> -</TR> -<TR> -<TD>2 left</TD> -<TD><CODE>t ++ s</CODE></TD> -<TD>token list concatenation</TD> -</TR> -<TR> -<TD>1 right</TD> -<TD><CODE>\x,y -> t</CODE></TD> -<TD>function abstraction ("lambda")</TD> -</TR> -<TR> -<TD>1 right</TD> -<TD><CODE>\\x,y => t</CODE></TD> -<TD>table abstraction</TD> -</TR> -<TR> -<TD>1 right</TD> -<TD><CODE>(x : A) -> B</CODE></TD> -<TD>dependent function type</TD> -</TR> -<TR> -<TD>1 right</TD> -<TD><CODE>A -> B</CODE></TD> -<TD>function type</TD> -</TR> -<TR> -<TD>1 right</TD> -<TD><CODE>P => T</CODE></TD> -<TD>table type</TD> -</TR> -<TR> -<TD>1 right</TD> -<TD><CODE>let x = v in t</CODE></TD> -<TD>local definition</TD> -</TR> -<TR> -<TD>1</TD> -<TD><CODE>t where {x = v}</CODE></TD> -<TD>local definition</TD> -</TR> -<TR> -<TD>1</TD> -<TD><CODE>in M.C "foo"</CODE></TD> -<TD>rule by example</TD> -</TR> -</TABLE> - -<P></P> -<P> -Any expression in parentheses (<CODE>(</CODE><I>exp</I><CODE>)</CODE>) is in the highest -precedence class. -</P> -<A NAME="toc33"></A> -<H3>The functional fragment: expressions in abstract syntax</H3> -<P> -<a name="functiontype"></a> -</P> -<P> -The expression syntax is the same in abstract and concrete syntax, although -only a part of the syntax is actually usable in well-typed expressions in -abstract syntax. An abstract syntax is essentially used for defining a set -of types and a set of functions between those types. Therefore it needs -essentially the <B>functional fragment</B> -of the syntax. This fragment comprises two kinds of types: -</P> -<UL> -<LI><B>basic types</B>, of form <I>C a1...an</I> where - <UL> - <LI><CODE>cat</CODE> <I>C</I> (<i>x</i><sub>1</sub> : <i>A</i><sub>1</sub>)...(<i>x</i><sub>n</sub> : <i>A</i><sub>n</sub>), including the predefined - categories <CODE>Int</CODE>, <CODE>Float</CODE>, and <CODE>String</CODE> explained <a href="#predefabs">here</a> - <LI><i>a</i><sub>1</sub> : <i>A</i><sub>1</sub>,...,<i>a</i><sub>n</sub> : <i>A</i><sub>n</sub>{<i>x</i><sub>1</sub> = <i>a</i><sub>1</sub>,...,<i>x</i><sub>n-1</sub>=<i>a</i><sub>n-1</sub>} - </UL> -</UL> - -<UL> -<LI><B>function types</B>, of form (<I>x</I> : <I>A</I>) <CODE>-></CODE> <I>B</I>, where - <UL> - <LI><I>A</I> is a type - <LI><I>B</I> is a type possibly depending on <I>x</I> : <I>A</I> - </UL> -</UL> - -<P> -When defining basic types, we used the notation -<I>t</I>{<i>x</i><sub>1</sub> = <i>t</i><sub>1</sub>,...,<i>x</i><sub>n</sub>=<i>t</i><sub>n</sub>} -for the <B>substitution</B> of values to variables. This is a metalevel notation, -which denotes a term that is formed by replacing the free occurrences of -each variable <i>x</i><sub>i</sub> by <i>t</i><sub>i</sub>. -</P> -<P> -These types have six kinds of expressions: -</P> -<UL> -<LI><B>constants</B>, <I>f</I> : <I>A</I> where - <UL> - <LI><CODE>fun</CODE> <I>f</I> : <I>A</I> - </UL> -</UL> - -<UL> -<LI><B>literals</B> for integers, floats, and strings (defined in <a href="#predefabs">here</a>) -</UL> - -<UL> -<LI><B>variables</B>, <I>x</I> : <I>A</I> where - <UL> - <LI><I>x</I> has been introduced by a binding - </UL> -</UL> - -<UL> -<LI><B>applications</B>, <I>f a</I> : <I>B</I>{<I>x</I>=<I>a</I>}, where - <UL> - <LI><I>f</I> : (<I>x</I> : <I>A</I>) <CODE>-></CODE> <I>B</I> - <LI><I>a</I> : <I>A</I> - </UL> -</UL> - -<UL> -<LI><B>abstractions</B>, <CODE>\</CODE><I>x</I> <CODE>-></CODE> <I>b</I> : (<I>x</I> : <I>A</I>) <CODE>-></CODE> <I>B</I>, where - <UL> - <LI><I>b</I> : <I>B</I> possibly depending on <I>x</I> : <I>A</I> - </UL> -</UL> - -<UL> -<LI><B>metavariables</B>, <CODE>?</CODE>, as introduced in intermediate phases of - incremental type checking; metavariables are not permitted - in GF source code -</UL> - -<P> -<a name="variablebinding"></a> -</P> -<P> -The notion of <B>binding</B> is defined for occurrences of variables in -subexpressions as follows: -</P> -<UL> -<LI>in (<I>x</I> : <I>A</I>) <CODE>-></CODE> <I>B</I>, <I>x</I> is bound in <I>B</I> -<LI>in <CODE>\</CODE><I>x</I> <CODE>-></CODE> <I>b</I>, <I>x</I> is bound in <I>b</I> -<LI>in <CODE>def</CODE> <I>f</I> <i>p</i><sub>1</sub> ... <i>p</i><sub>n</sub> = <I>t</I>, any pattern variable introduced in - any <I>pi</I> is bound in <I>t</I> (as defined <a href="#patternmatching">here</a>) -</UL> - -<P> -As syntactic sugar, function types have sharing of types and -suppression of variables, in the same way as contexts -(defined <a href="#contexts">here</a>): -</P> -<UL> -<LI>variables can share a type, -<center> -<CODE>(</CODE> <I>x,y</I> <CODE>:</CODE> <I>A</I> <CODE>)</CODE> <CODE>-></CODE> <I>B</I> === - <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>A</I> <CODE>) -> (</CODE> <I>y</I> <CODE>:</CODE> <I>A</I> <CODE>) -></CODE> <I>B</I> -</center> -<LI>a <B>wildcard</B> can be used for a variable not occurring later in the type, -<center> -<CODE>(</CODE> <CODE>_</CODE> <CODE>:</CODE> <I>A</I> <CODE>) -></CODE> <I>B</I> === - <CODE>(</CODE> <I>x</I> <CODE>:</CODE> <I>T</I> <CODE>) -></CODE> <I>B</I> -</center> -<LI>if the variable does not occur later, it can be omitted altogether, and - parentheses are not used, -<center> - <I>A</I> <CODE>-></CODE> <I>B</I> === <CODE>(</CODE> <I>_</I> <CODE>:</CODE> <I>A</I> <CODE>) -></CODE> <I>B</I> -</center> -</UL> - -<P> -There is analogous syntactic sugar for constant functions, -<center> -<CODE>\</CODE><I>_</I> <CODE>-></CODE> <I>t</I> === <CODE>\</CODE><I>x</I> <CODE>-></CODE> <I>t</I> -</center> -where <I>x</I> does not occur in <I>t</I>, and for multiple lambda abstractions: -<center> -<CODE>\</CODE><I>p,q</I> <CODE>-></CODE> <I>t</I> === <CODE>\</CODE><I>p</I> <CODE>-></CODE> <CODE>\</CODE><I>q</I> <CODE>-></CODE> <I>t</I> -</center> -where <I>p</I> and <I>q</I> are variables or wild cards (<CODE>_</CODE>). -</P> -<A NAME="toc34"></A> -<H3>Conversions</H3> -<P> -<a name="conversions"></a> -</P> -<P> -Among expressions, there is a relation of <B>definitional equality</B> defined -by four <B>conversion rules</B>: -</P> -<UL> -<LI><B>alpha conversion</B>: - <CODE>\</CODE><I>x</I> <CODE>-></CODE> <I>b</I> = <CODE>\</CODE><I>y</I> <CODE>-></CODE> <I>b</I>{<I>x</I>=<I>y</I>} -</UL> - -<UL> -<LI><B>beta conversion</B>: (<CODE>\</CODE><I>x</I> <CODE>-></CODE> <I>b</I>) <I>a</I> = <I>b</I>{<I>x</I>=<I>a</I>} -</UL> - -<UL> -<LI><B>delta conversion</B>: <I>f</I> <i>a</i><sub>1</sub> ... <i>a</i><sub>n</sub> = <I>tg</I>, if - <UL> - <LI>there is a definition <CODE>def</CODE> <I>f</I> <i>p</i><sub>1</sub> ... <i>p</i><sub>n</sub> = <I>t</I> - <LI>this definition is the first for <I>f</I> that matches the sequence - <i>a</i><sub>1</sub> .... <i>a</i><sub>n</sub>, with the substitution <I>g</I> - </UL> -</UL> - -<UL> -<LI><B>eta conversion</B>: <I>c</I> = <CODE>\</CODE><I>x</I> <CODE>-></CODE> <I>c x</I>, - if <I>c</I> : (<I>x</I> : <I>A</I>) <CODE>-></CODE> <I>B</I> -</UL> - -<P> -Pattern matching substitution used in delta conversion -is defined <a href="#patternmatching">here</a>. -</P> -<P> -An expression is in <B>beta-eta-normal form</B> if -</P> -<UL> -<LI>it has no subexpressions to which beta conversion applies (beta normality) -<LI>each constant or variable whose type is a function type must be - <B>eta-expanded</B>, i.e. made into an abstract equal to it by eta conversion - (eta normality) -</UL> - -<P> -Notice that the iteration of eta expansion would lead to an expression not -in beta-normal form. -</P> -<A NAME="toc35"></A> -<H3>Syntax trees</H3> -<P> -<a name="syntaxtrees"></a> -</P> -<P> -The <B>syntax trees</B> defined by an abstract syntax are well-typed -expressions of basic types in beta-eta normal form. -Linearization defined in concrete -syntax applies to all and only these expressions. -</P> -<P> -There is also a direct definition of syntax trees, which does not -refer to beta and eta conversions: keeping in mind that a type always has -the form -<center> -(<i>x</i><sub>1</sub> : <i>A</i><sub>1</sub>) <CODE>-></CODE> ... <CODE>-></CODE> (<i>x</i><sub>n</sub> : <i>A</i><sub>n</sub>) <CODE>-></CODE> <I>B</I> -</center> -where <I>Ai</I> are types and <I>B</I> is a basic type, a syntax tree is an expression -<center> -<I>b</I> <i>t</i><sub>1</sub> ... <i>t</i><sub>n</sub> : <I>B'</I> -</center> -where -</P> -<UL> -<LI><I>B'</I> is the basic type <I>B</I>{<i>x</i><sub>1</sub> = <i>t</i><sub>1</sub>,...,<i>x</i><sub>n</sub> = <i>t</i><sub>n</sub>} -<LI><CODE>fun</CODE> <I>b</I> : (<i>x</i><sub>1</sub> : <i>A</i><sub>1</sub>) <CODE>-></CODE> ... <CODE>-></CODE> (<i>x</i><sub>n</sub> : <i>A</i><sub>n</sub>) <CODE>-></CODE> <I>B</I> -<LI>each <i>t</i><sub>i</sub> has the form <CODE>\</CODE><i>z</i><sub>1</sub>,...,<i>z</i><sub>m</sub> <CODE>-></CODE> <I>c</I> where <i>A</i><sub>i</sub> is -<center> -(<i>y</i><sub>1</sub> : <i>B</i><sub>1</sub>) <CODE>-></CODE> ... <CODE>-></CODE> (<i>y</i><sub>m</sub> : <i>B</i><sub>m</sub>) <CODE>-></CODE> <I>B</I> -</center> -</UL> - -<A NAME="toc36"></A> -<H3>Predefined types in abstract syntax</H3> -<P> -<a name="predefabs"></a> -</P> -<P> -GF provides three predefined categories for abstract syntax, with predefined -expressions: -</P> -<TABLE class="table"> -<TR> -<TH>category</TH> -<TH COLSPAN="2">expressions</TH> -</TR> -<TR> -<TD ALIGN="center"><CODE>Int</CODE></TD> -<TD>integer literals, e.g. <CODE>123</CODE></TD> -</TR> -<TR> -<TD ALIGN="center"><CODE>Float</CODE></TD> -<TD>floating point literals, e.g. <CODE>12.34</CODE></TD> -</TR> -<TR> -<TD ALIGN="center"><CODE>String</CODE></TD> -<TD>string literals, e.g. <CODE>"foo"</CODE></TD> -</TR> -</TABLE> - -<P></P> -<P> -These categories take no arguments, and they can be used as basic -types in the same way as if they were introduced in <CODE>cat</CODE> judgements. -However, it is not legal to define <CODE>fun</CODE> functions that have any -of these types as value type: their only well-typed expressions are -literals as defined in the above table. -</P> -<A NAME="toc37"></A> -<H3>Overview of expressions in concrete syntax</H3> -<P> -<a name="cnctypes"></a> -</P> -<P> -Concrete syntax is about defining mappings from abstract syntax trees -to <B>concrete syntax objects</B>. These objects comprise -</P> -<UL> -<LI>records -<LI>tables -<LI>strings -<LI>parameter values -</UL> - -<P> -Thus functions are not concrete syntax objects; however, the -mappings themselves are expressed as functions, and the source code -of a concrete syntax can use functions under the condition that -they can be eliminated from the final compiled grammar (which they -can; this is one of the fundamental properties of compilation, as -explained in more detail in the <I>JFP</I> article). -</P> -<P> -Concrete syntax thus has the same function types and expression forms as -abstract syntax, specified <a href="#functiontype">here</a>. The basic types defined -by categories (<CODE>cat</CODE> judgements) are available via grammar reuse -explained <a href="#reuse">here</a>; this also comprises the -predefined categories <CODE>Float</CODE> and <CODE>String</CODE>. -</P> -<A NAME="toc38"></A> -<H3>Values, canonical forms, and run-time variables</H3> -<P> -In abstract syntax, the conversion rules fiven <a href="#conversions">here</a> -define a computational relation -among expressions, but there is no separate notion of a <B>value</B> of -computation: the value (the end point) of a computation chain is -simply an expression to which no more conversions apply. In general, -we are interested in expressions that satisfy the conditions of being -syntax trees (as defined <a href="#syntaxtrees">here</a>), but there can be many computationally -equivalent syntax trees which nonetheless are distinct syntax trees -and hence have different linearizations. The main use of computation -in abstract syntax is to compare types in dependent type checking. -</P> -<P> -In concrete syntax, the notion of values is central. At run time, -we want to compute the values of linearizations; at compile time, we want -to perform <B>partial evaluation</B>, which computes expressions as far as -possible. -To specify what happens -in computation we therefore have to distinguish between <B>canonical forms</B> -and other forms of expressions. The canonical forms are defined separately -for each form of type, whereas the other forms may usually produce expressions -of any type. -</P> -<P> -<a name="linexpansion"></a> -<a name="runtimevariables"></a> -</P> -<P> -What is done at compile time is the elimination of any noncanonical forms, -except for those depending on <B>run-time variables</B>. Run-time variables are -the same as the <B>argument variables</B> of linearization rules, i.e. the -variables <i>x</i><sub>1</sub>,...,<i>x</i><sub>n</sub> in -<center> -<CODE>lin</CODE> <I>f</I> <CODE>= \</CODE> <i>x</i><sub>1</sub>,...,<i>x</i><sub>n</sub> <CODE>-></CODE> <I>t</I> -</center> -where -<center> -<CODE>fun</CODE> <I>f</I> <CODE>:</CODE> -(<i>x</i><sub>1</sub> : <i>A</i><sub>1</sub>) <CODE>-></CODE> ... <CODE>-></CODE> (<i>x</i><sub>n</sub> : <i>A</i><sub>n</sub>) <CODE>-></CODE> <I>B</I> -</center> -Notice that this definition refers to the <B>eta-expanded</B> linearization term, -which has one abstracted variable for each argument type of <I>f</I>. These variables -are not necessarily explicit in GF source code, but introduced by the compiler. -</P> -<P> -Since certain expression forms should be eliminated in compilation but -cannot be eliminated if run-time variables appear in them, errors can -appear late in compilation. This is an issue with the following -expression forms: -</P> -<UL> -<LI>gluing (<CODE>s + t</CODE>), defined <a href="#gluing">here</a> -<LI>pattern matching on strings, defined <a href="#patternmatching">here</a> -<LI>predefined string operations, defined <a href="#predefcnc">here</a> (those taking - <CODE>Str</CODE> arguments) -</UL> - -<A NAME="toc39"></A> -<H3>Token lists, tokens, and strings</H3> -<P> -<a name="strtype"></a> -</P> -<P> -The most prominent basic type is <CODE>Str</CODE>, the type of <B>token lists</B>. -This type is often sloppily referred to as the type of <B>strings</B>; -but it should be kept in mind that the objects of <CODE>Str</CODE> are -<I>lists</I> of strings rather than single strings. -</P> -<P> -Expressions of type <CODE>Str</CODE> have the following canonical forms: -</P> -<UL> -<LI><B>tokens</B>, i.e. <B>string literals</B>, in double quotes, e.g. <CODE>"foo"</CODE> -<LI><B>the empty token list</B>, <CODE>[]</CODE> -<LI><B>concatenation</B>, <I>s</I> <CODE>++</CODE> <I>t</I>, where <I>s,t</I> : <CODE>Str</CODE> -<LI><B>prefix-dependent choice</B>, - <CODE>pre {p<sub>1</sub> => s<sub>1</sub> ; ... ; p<sub>n</sub> => s<sub>n</sub> ; _ => s }</CODE>, where - <UL> - <LI><I>s</I>, <i>s</i><sub>1</sub>,...,<i>s</i><sub>n</sub>, <i>p</i><sub>1</sub>,...,<i>p</i><sub>n</sub> : <CODE>Str</CODE> - </UL> -</UL> - -<P> -For convenience, the notation is overloaded so that tokens are identified -with singleton token lists, and there is no separate type of tokens -(this is a change from the <I>JFP</I> article). -The notion of a token -is still important for compilation: all tokens introduced by -the grammar must be known at compile time. This, in turn, is -required by the parsing algorithms used for parsing with GF grammars. -</P> -<P> -In addition to string literals, tokens can be formed by a specific -non-canonical operator: -</P> -<UL> -<LI><B>gluing</B>, <I>s</I> <CODE>+</CODE> <I>t</I>, where <I>s,t</I> : <CODE>Str</CODE> -</UL> - -<P> -<a name="gluing"></a> -</P> -<P> -Being noncanonical, gluing is equipped with a computation rule: -string literals are glued by forming a new string literal, and -empty token lists can be ignored: -</P> -<UL> -<LI><CODE>"foo" + "bar"</CODE> ==> <CODE>"foobar"</CODE> -<LI><I>t</I> <CODE>+ []</CODE> ==> <I>t</I> -<LI><CODE>[] +</CODE> <I>t</I> ==> <I>t</I> -</UL> - -<P> -Since tokens must be known at compile time, -the operands of gluing may not depend on run-time variables, -as defined <a href="#runtimevariables">here</a>. -</P> -<P> -As syntactic sugar, token lists can be given as bracketed string literals, where -spaces separate tokens: -</P> -<UL> -<LI><B>token lists</B>, <CODE>["one two three"]</CODE> === <CODE>"one" ++ "two" ++ "three"</CODE> -</UL> - -<P> -Notice that there are no empty tokens, but the expression <CODE>[]</CODE> -can be used in a context requiring a token, in particular in gluing expression -below. Since <CODE>[]</CODE> denotes an empty token list, the following computation laws -are valid: -</P> -<UL> -<LI><I>t</I> <CODE>++ []</CODE> ==> <I>t</I> -<LI><CODE>[] ++</CODE> <I>t</I> ==> <I>t</I> -</UL> - -<P> -Moreover, concatenation and gluing are associative: -</P> -<UL> -<LI>s <CODE>+</CODE> (t <CODE>+</CODE> u) ==> s <CODE>+</CODE> t <CODE>+</CODE> u -<LI>s <CODE>++</CODE> (t <CODE>++</CODE> u) ==> s <CODE>++</CODE> t <CODE>++</CODE> u -</UL> - -<P> -For the programmer, associativity and the empty token laws mean -that the compiler can use them to simplify string expressions. -It also means that these laws are respected in pattern matching -on strings. -</P> -<P> -A prime example of prefix-dependent choice operation is the following -approximative expression for the English indefinite article: -</P> -<PRE> - pre { - ("a" | "e" | "i" | "o") => "an" ; - _ => "a" - } ; -</PRE> -<P> -This expression can be computed in the context of a subsequent token: -</P> -<UL> -<LI><CODE>pre {p<sub>1</sub> => s<sub>1</sub> ; ... ; p<sub>n</sub> => s<sub>n</sub> ; _ => s } ++ t</CODE> - ==> - <UL> - <LI><i>s</i><sub>i</sub> for the first <I>i</I> such that the prefix <i>p</i><sub>i</sub> - matches <I>t</I>, if it exists - <LI><I>s</I> otherwise - </UL> -</UL> - -<P> -The <B>matching prefix</B> is defined by comparing the string with the prefix of -the token. If the prefix is a variant list of strings, then it matches -the token if any of the strings in the list matches it. -</P> -<P> -The computation rule can sometimes be applied at compile time, but it general, -prefix-dependent choices need to be passed to the run-time grammar, because -they are not given a subsequent token to compare with, or because the -subsequent token depends on a run-time variable. -</P> -<P> -The prefix-dependent choice expression itself may not depend on run-time -variables. -</P> -<P> - <I>There is an older syntax for prefix-dependent choice, - namely: <code>pre { s ; s1 / p1 ; ... ; sn / pn}</code>. This syntax - will not accept strings as patterns.</I> -</P> -<P> -<I>In GF prior to 3.0, a specific type</I> <CODE>Strs</CODE> -<I>is used for defining prefixes,</I> -<I>instead of just</I> <CODE>variants</CODE> <I>of</I> <CODE>Str</CODE>. -</P> -<A NAME="toc40"></A> -<H3>Records and record types</H3> -<P> -A <B>record</B> is a collection of objects of possibly different types, -accessible by <B>projections</B> from the record with <B>labels</B> pointing -to these objects. A record is also itself an object, whose type is -a <B>record type</B>. Record types have the form -<center> - <CODE>{</CODE> <i>r</i><sub>1</sub> : <i>A</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <i>r</i><sub>n</sub> : <i>A</i><sub>n</sub> <CODE>}</CODE> -</center> -where <I>n</I> >= 0, each <i>A</i><sub>i</sub> is a type, and the labels <i>r</i><sub>i</sub> are -distinct. A record of this type has the form -<center> - <CODE>{</CODE> <i>r</i><sub>1</sub> = <i>a</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <i>r</i><sub>n</sub> = <i>a</i><sub>n</sub> <CODE>}</CODE> -</center> -where each #aii : "Aii. A limiting case is the <B>empty record type</B> -<CODE>{}</CODE>, which has the object <CODE>{}</CODE>, the <B>empty record</B>. -</P> -<P> -The <B>fields</B> of a record type are its parts of the form <I>r</I> : <I>A</I>, -also called <B>typings</B>. The <B>fields</B> of a record are of the form -<I>r</I> = <I>a</I>, also called <B>value assignments</B>. Value assignments -may optionally indicate the type, as in <I>r</I> : <I>A</I> = <I>a</I>. -</P> -<P> -The order of fields in record types and records is insignificant: two record -types (or records) are equal if they have the same fields, in any order, and a -record is an object of a record type, if it has type-correct value assignments -for all fields of the record type. -The latter definition implies the even stronger -principle of <B>record subtyping</B>: a record can have any type that has some -subset of its fields. This principle is explained further -<a href="#subtyping">here</a>. -</P> -<P> -All fields in a record must have distinct labels. Thus it is not possible -e.g. to "redefine" a field "later" in a record. -</P> -<P> -Lexically, labels are identifiers (defined <a href="#identifiers">here</a>). -This is with the exception -of the labels selecting bound variables in the linearization of higher-order -abstract syntax, which have the form <CODE>$</CODE><I>i</I> for an integer <I>i</I>, -as specified <a href="#HOAS">here</a>. -In source code, these labels should not appear in records fields, -but only in selections. -</P> -<P> -Labels occur only in syntactic positions where they cannot be confused with -constants or variables. Therefore it is safe to write, as in <CODE>Prelude</CODE>, -</P> -<PRE> - ss : Str -> {s : Str} = \s -> {s = s} ; -</PRE> -<P> -A <B>projection</B> is an expression of the form -<center> - <I>t</I>.<I>r</I> -</center> -where <I>t</I> must be a record and <I>r</I> must be a label defined in it. -The type of the projection is the type of that field. -The computation rule for projection returns the value assigned to that field: -<center> -<CODE>{</CODE> ... <CODE>;</CODE> <I>r</I> = <I>a</I> <CODE>;</CODE> ... <CODE>}.</CODE><I>r</I> ==> <I>a</I> -</center> -Notice that the dot notation <I>t</I>.<I>r</I> is also used for qualified names -as specified <a href="#qualifiednames">here</a>. -This ambiguity follows tradition and convenience. It is -resolved by the following rules (before type checking): -</P> -<OL> -<LI>if <I>t</I> is a bound variable or a constant in scope, - <I>t</I>.<I>r</I> is type-checked as a projection -<LI>otherwise, <I>t</I>.<I>r</I> is type-checked as a qualified name -</OL> - -<P> -As syntactic sugar, types and values can be shared: -</P> -<UL> -<LI><CODE>{</CODE> ... <CODE>;</CODE> <I>r,s</I> : <I>A</I> <CODE>;</CODE> ... <CODE>}</CODE> === - <CODE>{</CODE> ... <CODE>;</CODE> <I>r</I> : <I>A</I> <CODE>;</CODE> <I>s</I> : <I>A</I> <CODE>;</CODE> ... <CODE>}</CODE> -<LI><CODE>{</CODE> ... <CODE>;</CODE> <I>r,s</I> = <I>a</I> <CODE>;</CODE> ... <CODE>}</CODE> === - <CODE>{</CODE> ... <CODE>;</CODE> <I>r</I> = <I>a</I> <CODE>;</CODE> <I>s</I> = <I>a</I> <CODE>;</CODE> ... <CODE>}</CODE> -</UL> - -<P> -Another syntactic sugar are <B>tuple types</B> and <B>tuples</B>, which are translated -by endowing their unlabelled fields by the labels <CODE>p1</CODE>, <CODE>p2</CODE>,... in the -order of appearance of the fields: -</P> -<UL> -<LI><i>A</i><sub>1</sub> <CODE>*</CODE> ... <CODE>*</CODE> <i>A</i><sub>n</sub> === - <CODE>{</CODE> <CODE>p1</CODE> : <i>A</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <CODE>pn</CODE> : <i>A</i><sub>n</sub> <CODE>}</CODE> -<LI><CODE><</CODE><i>a</i><sub>1</sub> <CODE>,</CODE> ... <CODE>,</CODE> <i>a</i><sub>n</sub> <CODE>></CODE> === - <CODE>{</CODE> <CODE>p1</CODE> = <i>a</i><sub>1</sub><CODE>;</CODE> ... <CODE>;</CODE> <CODE>pn</CODE> = <i>a</i><sub>n</sub> <CODE>}</CODE> -</UL> - -<P> -A <B>record extension</B> is formed by adding fields to a record or a record type. -The general syntax involves two expressions, -<center> - <I>R</I> <CODE>**</CODE> <I>S</I> -</center> -The result is a record type or a record with a union of the fields of <I>R</I> and -<I>S</I>. It is therefore well-formed if -</P> -<UL> -<LI>both <I>R</I> and <I>S</I> are either records or record types -<LI>the labels in <I>R</I> and <I>S</I> are disjoint, if <I>R</I> and <I>S</I> are record types -</UL> -(Since GF version 3.6) If <I>R</I> and <I>S</I> are record objects, -then the labels in them need not be disjoint. Labels defined in -<I>S</I> are then given priority, so that record extensions in fact -works as <B>record update</B>. A common pattern of using this feature -is -<pre> - lin F x ... = x ** {r = ... x.r ...} -</pre> -where <tt>x</tt> is a record with many fields, just one of which is -updated. Following the normal binding conditions, <tt>x.r</tt> on the -right hand side still refers to the old value of the <tt>r</tt> field. - - -<A NAME="toc41"></A> -<H3>Subtyping</H3> -<P> -<a name="subtyping"></a> -</P> -<P> -The possibility of having superfluous fields in a record forms the basis of -the <B>subtyping</B> relation. -That <I>A</I> is a subtype of <I>B</I> means that <I>a : A</I> implies <I>a : B</I>. -This is clearly satisfied for records with superfluous fields: -</P> -<UL> -<LI>if <I>R</I> is a record type without the label <I>r</I>, - then <I>R</I> <CODE>** {</CODE> <I>r</I> : <I>A</I> <CODE>}</CODE> is a subtype of <I>R</I> -</UL> - -<P> -The GF grammar compiler extends subtyping to function types by <B>covariance</B> -and <B>contravariance</B>: -</P> -<UL> -<LI>covariance: if <I>A</I> is a subtype of <I>B</I>, - then <I>C</I> <CODE>-></CODE> <I>A</I> is a subtype of <I>C</I> <CODE>-></CODE> <I>B</I> -<LI>contravariance: if <I>A</I> is a subtype of <I>B</I>, - then <I>B</I> <CODE>-></CODE> <I>C</I> is a subtype of <I>A</I> <CODE>-></CODE> <I>C</I> -</UL> - -<P> -The logic of these rules is natural: if a function is returns a value -in a subtype, then this value is <I>a fortiori</I> in the supertype. -If a function is defined for some type, then it is <I>a fortiori</I> defined -for any subtype. -</P> -<P> -In addition to the well-known principles of record subtyping and co- and -contravariance, GF implements subtyping for initial segments of integers: -</P> -<UL> -<LI>if <I>m</I> < <I>n</I>, then <CODE>Ints</CODE> <I>m</I> is a subtype of <CODE>Ints</CODE> <I>n</I> -<LI><CODE>Ints</CODE> <I>n</I> is a subtype of <CODE>Integer</CODE> -</UL> - -<P> -As the last rule, subtyping is transitive: -</P> -<UL> -<LI>if <I>A</I> is a subtype of <I>B</I> and <I>B</I> is a subtype of <I>C</I>, then - <I>A</I> is a subtype of <I>C</I>. -</UL> - -<A NAME="toc42"></A> -<H3>Tables and table types</H3> -<P> -<a name="tables"></a> -</P> -<P> -One of the most characteristic constructs of GF is <B>tables</B>, also called -<B>finite functions</B>. That these functions are finite means that it -is possible to finitely enumerate all argument-value pairs; this, in -turn, is possible because the argument types are finite. -</P> -<P> -A <B>table type</B> has the form -<center> -<I>P</I> <CODE>=></CODE> <I>T</I> -</center> -where <I>P</I> must be a parameter type in the sense defined <a href="#paramtypes">here</a>, whereas -<I>T</I> can be any type. -</P> -<P> -Canonical expressions of table types are <B>tables</B>, of the form -<center> -<CODE>table</CODE> <CODE>{</CODE> <i>V</i><sub>1</sub> <CODE>=></CODE> <i>t</i><sub>1</sub> ; ... ; <i>V</i><sub>n</sub> <CODE>=></CODE> <i>t</i><sub>n</sub> <CODE>}</CODE> -</center> -where <i>V</i><sub>1</sub>,...,<i>V</i><sub>n</sub> is the complete list of the parameter values of -the argument type <I>P</I> (defined <a href="#paramvalues">here</a>), and each <i>t</i><sub>i</sub> is -an expression of the value type <I>T</I>. -</P> -<P> -In addition to explicit enumerations, -tables can be given by <B>pattern matching</B>, -<center> -<CODE>table</CODE> <CODE>{</CODE><i>p</i><sub>1</sub> <CODE>=></CODE> <i>t</i><sub>1</sub> ; ... ; <i>p</i><sub>m</sub> <CODE>=></CODE> <i>t</i><sub>m</sub><CODE>}</CODE> -</center> -where <i>p</i><sub>1</sub>,....,<i>p</i><sub>m</sub> is a list of patterns that covers all values of type <I>P</I>. -Each pattern <i>p</i><sub>i</sub> may bind some variables, on which the expression <i>t</i><sub>i</sub> -may depend. A complete account of patterns and pattern matching is given -<a href="#patternmatching">here</a>. -</P> -<P> -A <B>course-of-values table</B> omits the patterns and just lists all -values. It uses the enumeration of all values of the argument type <I>P</I> -to pair the values with arguments: -<center> -<CODE>table</CODE> <I>P</I> <CODE>[</CODE><i>t</i><sub>1</sub> ; ... ; <i>t</i><sub>n</sub><CODE>]</CODE> -</center> -This format is not recommended for GF source code, since the -ordering of parameter values is not specified and therefore a -compiler-internal decision. -</P> -<P> -The argument type can be indicated in ordinary tables as well, which is -sometimes helpful for type inference: -<center> -<CODE>table</CODE> <I>P</I> <CODE>{</CODE> ... <CODE>}</CODE> -</center> -</P> -<P> -The <B>selection</B> operator <CODE>!</CODE>, applied to a table <I>t</I> and to an expression -<I>v</I> of its argument type -<center> -<I>t</I> <CODE>!</CODE> <I>v</I> -</center> -returns the first pattern matching result from <I>t</I> with <I>v</I>, as defined -<a href="#patternmatching">here</a>. The order of patterns is thus significant as long as the -patterns contain variables or wildcards. When the compiler reorders the -patterns following the enumeration of all values of the argument type, -this order no longer matters, because no overlap remains between patterns. -</P> -<P> -The GF compiler performs <B>table expansion</B>, i.e. an analogue of -eta expansion defined <a href="#conversions">here</a>, where a table is applied to all -values to its argument type: -<center> -<I>t</I> : <I>P</I> <CODE>=></CODE> <I>T</I> ==> -<CODE>table</CODE> <I>P</I> <CODE>[</CODE><I>t</I> <CODE>!</CODE> <i>V</i><sub>1</sub> ; ... ; <I>t</I> <CODE>!</CODE> <i>V</i><sub>n</sub><CODE>]</CODE> -</center> -As syntactic sugar, one-branch tables can be written in a way similar to -lambda abstractions: -<center> -<CODE>\\</CODE><I>p</I> <CODE>=></CODE> <I>t</I> === <CODE>table {</CODE><I>p</I> <CODE>=></CODE> <I>t</I> <CODE>}</CODE> -</center> -where <I>p</I> is either a variable or a wildcard (<CODE>_</CODE>). Multiple bindings -can be abbreviated: -<center> -<CODE>\\</CODE><I>p,q</I> <CODE>=></CODE> <I>t</I> === <CODE>\\</CODE><I>p</I> <CODE>=></CODE> <CODE>\\</CODE><I>q</I> <CODE>=></CODE> <I>t</I> -</center> -<B>Case expressions</B> are syntactic sugar for selections: -<center> -<CODE>case</CODE> <I>e</I> <CODE>of {</CODE>...<CODE>}</CODE> === <CODE>table {</CODE>...<CODE>} !</CODE> <I>e</I> -</center> -</P> -<A NAME="toc43"></A> -<H3>Pattern matching</H3> -<P> -<a name="patternmatching"></a> -</P> -<P> -We will list all forms of patterns that can be used in table branches. -We define their <B>variable bindings</B> and <B>matching substitutions</B>. -</P> -<P> -We start with the patterns available for all parameter types, as well -as for the types <CODE>Integer</CODE> and <CODE>Str</CODE>. -</P> -<UL> -<LI>A constructor pattern <I>C</I> <i>p</i><sub>1</sub>...<i>p</i><sub>n</sub> - binds the union of all variables bound in the subpatterns - <i>p</i><sub>1</sub>,...,<i>p</i><sub>n</sub>. - It matches any value - <I>C</I> <i>V</i><sub>1</sub>...<i>V</i><sub>n</sub> where each <i>p</i><sub>i</sub># matches <i>V</i><sub>i</sub>, - and the matching substitution is the union of these substitutions. -<LI>A record pattern - <CODE>{</CODE> <i>r</i><sub>1</sub> <CODE>=</CODE> <i>p</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <i>r</i><sub>n</sub> <CODE>=</CODE> <i>p</i><sub>n</sub> <CODE>}</CODE> - binds the union of all variables bound in the subpatterns - <i>p</i><sub>1</sub>,...,<i>p</i><sub>n</sub>. - It matches any value - <CODE>{</CODE> <i>r</i><sub>1</sub> <CODE>=</CODE> <i>V</i><sub>1</sub> <CODE>;</CODE> ... <CODE>;</CODE> <i>r</i><sub>n</sub> <CODE>=</CODE> <i>V</i><sub>n</sub> <CODE>;</CODE> ...<CODE>}</CODE> - where each <i>p</i><sub>i</sub># matches <i>V</i><sub>i</sub>, - and the matching substitution is the union of these substitutions. -<LI>A variable pattern <I>x</I> - (identifier other than parameter constructor) - binds the variable <I>x</I>. - It matches any value <I>V</I>, with the substitution {<I>x</I> = <I>V</I>}. -<LI>The wild card <CODE>_</CODE> binds no variables. - It matches any value, with the empty substitution. -<LI>A disjunctive pattern <I>p</I> <CODE>|</CODE> <I>q</I> binds the intersection of - the variables bound by <I>p</I> and <I>q</I>. - It matches anything that - either <I>p</I> or <I>q</I> matches, with the first substitution starting - with <I>p</I> matches, from which those - variables that are not bound by both patterns are removed. -<LI>A negative pattern <CODE>-</CODE> <I>p</I> binds no variables. - It matches anything that <I>p</I> does <I>not</I> match, with the empty - substitution. -<LI>An alias pattern <I>x</I> <CODE>@</CODE> <I>p</I> binds <I>x</I> and all the variables - bound by <I>p</I>. It matches any value <I>V</I> that <I>p</I> matches, with - the same substition extended by {<I>x</I> = <I>V</I>}. -</UL> - -<P> -The following patterns are only available for the type <CODE>Str</CODE>: -</P> -<UL> -<LI>A string literal pattern, e.g. <CODE>"s"</CODE>, binds no variables. - It matches the same string, with the empty substitution. -<LI>A concatenation pattern, <I>p</I> <CODE>+</CODE> <I>q</I>, - binds the union of variables bound by <I>p</I> and <I>q</I>. - It matches any string that consists - of a prefix matching <I>p</I> and a suffix matching <I>q</I>, - with the union of substitutions corresponding to the first match (see below). -<LI>A repetition pattern <I>p</I><CODE>*</CODE> binds no variables. - It matches any string that can be decomposed - into strings that match <I>p</I>, with the empty substitution. -</UL> - -<P> -The following pattern is only available for the types <CODE>Integer</CODE> -and <CODE>Ints</CODE> <I>n</I>: -</P> -<UL> -<LI>An integer literal pattern, e.g. <CODE>214</CODE>, binds no variables. - It matches the same integer, with - the empty substitution. -</UL> - -<P> -All patterns must be <B>linear</B>: the same pattern variable may occur -only once in them. This is what makes it straightforward to speak -about unions of binding sets and substitutions. -</P> -<P> -Pattern matching is performed in the order in which the branches -appear in the source code: the branch of the first matching pattern is followed. -In concrete syntax, the type checker reject sets of patterns that are -not exhaustive, and warns for completely overshadowed patterns. -It also checks the type correctness of patterns with respect to the -argument type. In abstract syntax, only type correctness is checked, -no exhaustiveness or overshadowing. -</P> -<P> -It follows from the definition of record pattern matching -that it can utilize partial records: the branch -</P> -<PRE> - {g = Fem} => t -</PRE> -<P> -in a table of type <CODE>{g : Gender ; n : Number} => T</CODE> means the same as -</P> -<PRE> - {g = Fem ; n = _} => t -</PRE> -<P> -Variables in regular expression patterns -are always bound to the <B>first match</B>, which is the first -in the sequence of binding lists. For example: -</P> -<UL> -<LI><CODE>x + "e" + y</CODE> matches <CODE>"peter"</CODE> with <CODE>x = "p", y = "ter"</CODE> -<LI><CODE>x + "er"*</CODE> matches <CODE>"burgerer"</CODE> with <CODE>x = "burg"</CODE> -</UL> - -<A NAME="toc44"></A> -<H3>Free variation</H3> -<P> -An expressions of the form -<center> -<CODE>variants</CODE> <CODE>{</CODE><i>t</i><sub>1</sub> ; ... ; <i>t</i><sub>n</sub><CODE>}</CODE> -</center> -where all <i>t</i><sub>i</sub> are of the same type <I>T</I>, has itseld type <I>T</I>. -This expression presents <i>t</i><sub>i</sub>,...,<i>t</i><sub>n</sub> as being in <B>free variation</B>: -the choice between them is not determined by semantics or parameters. -A limiting case is -<center> -<CODE>variants {}</CODE> -</center> -which encodes a rule saying that there is no way to express a certain -thing, e.g. that a certain inflectional form does not exist. -</P> -<P> -A common wisdom in linguistics is that "there is no free variation", which -refers to the situation where <I>all</I> aspects are taken into account. For -instance, the English negation contraction could be expressed as free variation, -</P> -<PRE> - variants {"don't" ; "do" ++ "not"} -</PRE> -<P> -if only semantics is taken into account, but if stylistic aspects are included, -then the proper formulation might be with a parameter distinguishing between -informal and formal style: -</P> -<PRE> - case style of {Informal => "don't" ; Formal => "do" ++ "not"} -</PRE> -<P> -Since there is not way to choose a particular element from a ``variants` list, -free variants is normally not adequate in libraries, nor in grammars meant for -natural language generation. In application grammars -meant to parse user input, free variation is a way to avoid cluttering the -abstract syntax with semantically insignificant distinctions and even to -tolerate some grammatical errors. -</P> -<P> -Permitting <CODE>variants</CODE> in all types involves a major modification of the -semantics of GF expressions. All computation rules have to be lifted to -deal with lists of expressions and values. For instance, -<center> -<I>t</I> <CODE>!</CODE> <CODE>variants</CODE> <CODE>{</CODE><i>t</i><sub>1</sub> ; ... ; <i>t</i><sub>n</sub><CODE>}</CODE> ==> -<CODE>variants</CODE> <CODE>{</CODE><I>t</I> <CODE>!</CODE> <i>t</i><sub>1</sub> ; ... ; <I>t</I> <CODE>!</CODE> <i>t</i><sub>n</sub><CODE>}</CODE> -</center> -This is done in such a way that -variation does not distribute to records (or other product-like structures). -For instance, variants of records, -</P> -<PRE> - variants {{s = "Auto" ; g = Neutr} ; {s = "Wagen" ; g = Masc}} -</PRE> -<P> -is <I>not</I> the same as a record of variants, -</P> -<PRE> - {s = variants {"Auto" ; "Wagen"} ; g = variants {Neutr ; Masc}} -</PRE> -<P> -Variants of variants are flattened, -<center> -<CODE>variants</CODE> <CODE>{</CODE>...; <CODE>variants</CODE> <CODE>{</CODE><i>t</i><sub>1</sub> ;...; <i>t</i><sub>n</sub><CODE>}</CODE> ;...<CODE>}</CODE> -==> -<CODE>variants</CODE> <CODE>{</CODE>...; <i>t</i><sub>1</sub> ;...; <i>t</i><sub>n</sub> ;...<CODE>}</CODE> -</center> -and singleton variants are eliminated, -<center> -<CODE>variants</CODE> <CODE>{</CODE><I>t</I><CODE>}</CODE> ==> <I>t</I> -</center> -</P> -<A NAME="toc45"></A> -<H3>Local definitions</H3> -<P> -A <B>local definition</B>, i.e. a <B>let expression</B> has the form -<center> -<CODE>let</CODE> <I>x</I> : <I>T</I> = <I>t</I> <CODE>in</CODE> <I>e</I> -</center> -The type of <I>x</I> must be <I>T</I>, which also has to be the type of <I>t</I>. -Computation is performed by substituting <I>t</I> for <I>x</I> in <I>e</I>: -<center> -<CODE>let</CODE> <I>x</I> : <I>T</I> = <I>t</I> <CODE>in</CODE> <I>e</I> ==> <I>e</I> {<I>x</I> = <I>t</I>} -</center> -As syntactic sugar, the type can be omitted if the type checker is -able to infer it: -<center> -<CODE>let</CODE> <I>x</I> = <I>t</I> <CODE>in</CODE> <I>e</I> -</center> -It is possible to compress several local definitions into one block: -<center> -<CODE>let</CODE> <I>x</I> : <I>T</I> = <I>t</I> <CODE>;</CODE> <I>y</I> : <I>U</I> = <I>u</I> <CODE>in</CODE> <I>e</I> -=== -<CODE>let</CODE> <I>x</I> : <I>T</I> = <I>t</I> <CODE>in</CODE> <CODE>let</CODE> <I>y</I> : <I>U</I> = <I>u</I> <CODE>in</CODE> <I>e</I> -</center> -Another notational variant is a definition block appearing after the main -expression: -<center> -<I>e</I> <CODE>where</CODE> <CODE>{</CODE>...<CODE>}</CODE> === <CODE>let</CODE> <CODE>{</CODE>...<CODE>}</CODE> <CODE>in</CODE> <I>e</I> -</center> -Curly brackets are obligatory in the <CODE>where</CODE> form, and can -also be optionally used in the <CODE>let</CODE> form. -</P> -<P> -Since a block of definitions is treated as syntactic sugar -for a nested <CODE>let</CODE> expression, a constant must be defined before it -is used: the scope is not mutual, as in a module body. -Furthermore, unlike in <CODE>lin</CODE> and <CODE>oper</CODE> definitions, it is <I>not</I> possible -to bind variables on the left of the equality sign. -</P> -<A NAME="toc46"></A> -<H3>Function applications in concrete syntax</H3> -<P> -<a name="functionelimination"></a> -</P> -<P> -Fully compiled concrete syntax may not include expressions of function types -except on the outermost level of <CODE>lin</CODE> rules, as defined <a href="#linexpansion">here</a>. -However, -in the source code, and especially in <CODE>oper</CODE> definitions, functions -are the main vehicle of code reuse and abstraction. Thus function types and -functions follow the same rules as in abstract syntax, as specified -<a href="#functiontype">here</a>. In -particular, the application of a lambda abstract is computed by beta conversion. -</P> -<P> -To ensure the elimination of functions, GF uses a special computation rule -for pushing function applications inside tables, since otherwise run-time -variables could block their applications: -<center> -(<CODE>table</CODE> <CODE>{</CODE><i>p</i><sub>1</sub> <CODE>=></CODE> <i>f</i><sub>1</sub> ; ... ; - <i>p</i><sub>n</sub> <CODE>=></CODE> <i>f</i><sub>n</sub> <CODE>}</CODE> <CODE>!</CODE> <I>e</I>) <I>a</I> - ==> - <CODE>table</CODE> <CODE>{</CODE><i>p</i><sub>1</sub> <CODE>=></CODE> <i>f</i><sub>1</sub> <I>a</I> ; ... ; - <i>p</i><sub>n</sub> <CODE>=></CODE> <i>f</i><sub>n</sub> <I>a</I><CODE>}</CODE> <CODE>!</CODE> <I>e</I> -</center> -Also parameter constructors with non-empty contexts, as defined -<a href="#paramjudgements">here</a>, -result in expressions in application form. These expressions are never -a problem if their arguments are just constructors, because they can then -be translated to integers corresponding to the position of the expression -in the enumaration of the values of its type. -However, a constructor -applied to a run-time variable may need to be converted as follows: -<center> -<I>C</I>...<I>x</I>... ==> <CODE>case</CODE> <I>x</I> of <CODE>{_ =></CODE> <I>C</I>...<I>x</I><CODE>}</CODE> -</center> -The resulting expression, when processed by table expansion as explained -<a href="#tables">here</a>, -results in <I>C</I> being applied to just values of the type of <I>x</I>, and the -application thereby disappears. -</P> -<A NAME="toc47"></A> -<H3>Reusing top-level grammars as resources</H3> -<P> -<a name="reuse"></a> -</P> -<P> -<I>This section is valid for GF 3.0, which abandons the "lock field"</I> -<I>discipline of GF 2.8.</I> -</P> -<P> -As explained <a href="#openabstract">here</a>, -abstract syntax modules can be opened as interfaces -and concrete syntaxes as their instances. This means that judgements are, -as it were, translated in the following way: -</P> -<UL> -<LI><CODE>cat</CODE> <I>C</I> <I>G</I> ===> <CODE>oper</CODE> <I>C</I> : <CODE>Type</CODE> -<LI><CODE>fun</CODE> <I>f</I> : <I>T</I> ===> <CODE>oper</CODE> <I>f</I> : <I>T</I> -<LI><CODE>lincat</CODE> <I>C</I> = <I>T</I> ===> <CODE>oper</CODE> <I>C</I> : <CODE>Type</CODE> = <I>C</I> -<LI><CODE>lin</CODE> <I>f</I> = <I>t</I> ===> <CODE>oper</CODE> <I>f</I> = <I>t</I> -</UL> - -<P> -Notice that the value <I>T</I> of <CODE>lincat</CODE> definitions is not disclosed -in the translation. This means that the type <I>C</I> remains abstract: the -only ways of building an object of type <I>C</I> are the operations <I>f</I> -obtained from <I>fun</I> and <I>lin</I> rules. -</P> -<P> -The purpose of keeping linearization types abstract is to enforce -<B>grammar checking via type checking</B>. This means that any well-typed -operation application is also well-typed in the sense of the original -grammar. If the types were disclosed, then we could for instance easily -confuse all categories that have the linearization -type <CODE>{s : Str}</CODE>. Yet another reason is that revealing the types -makes it impossible for the library programmers to change their type -definitions afterwards. -</P> -<P> -Library writers may occasionally want to have access to the values of -linearization types. The way to make it possible is to add an extra -construction operation to a module in which the linearization type -is available: -</P> -<PRE> - oper MkC : T -> C = \x -> x -</PRE> -<P> -In object-oriented terms, the type <I>C</I> itself is <B>protected</B>, whereas -<I>MkC</I> is a <B>public constructor</B> of <I>C</I>. Of course, it is possible to -make these constructors overloaded (concept explained <a href="#overloading">here</a>), -to enable easy access to special cases. -</P> -<A NAME="toc48"></A> -<H3>Predefined concrete syntax types</H3> -<P> -<a name="predefcnc"></a> -</P> -<P> -The following concrete syntax types are predefined: -</P> -<UL> -<LI><CODE>Str</CODE>, the type of tokens and token lists (defined <a href="#strtype">here</a>) -<LI><CODE>Integer</CODE>, the type of nonnegative integers -<LI><CODE>Ints</CODE> <I>n</I>, the type of integers from <I>0</I> to <I>n</I> -<LI><CODE>Type</CODE>, the type of (concrete syntax) types -<LI><CODE>PType</CODE>, the type of parameter types -</UL> - -<P> -The last two types are, in a way, extended by user-written grammars, -since new parameter types can be defined in the way shown <a href="#paramjudgements">here</a>, -and every paramater type is also a type. From the point of view of the values -of expressions, however, a <CODE>param</CODE> declaration does not extend -<CODE>PType</CODE>, since all parameter types get compiled to initial -segments of integers. -</P> -<P> -Notice the difference between the concrete syntax types -<CODE>Str</CODE> and <CODE>Integer</CODE> on the one hand, and the abstract -syntax categories <CODE>String</CODE> and <CODE>Int</CODE>, on the other. -As <I>concrete syntax</I> types, the latter are treated in -the same way as any reused categories: their objects -can be formed by using syntax trees (string and integer -literals). -</P> -<P> -<I>The type name</I> <CODE>Integer</CODE> <I>replaces in GF 3.0 the name</I> <CODE>Int</CODE>, -<I>to avoid confusion with the abstract syntax type and to be analogous</I> -<I>with the</I> <CODE>Str</CODE> <I>vs.</I> <CODE>String</CODE> <I>distinction.</I> -</P> -<A NAME="toc49"></A> -<H3>Predefined concrete syntax operations</H3> -<P> -The following predefined operations are defined in the resource module -<CODE>prelude/Predef.gf</CODE>. Their implementations are defined as -a part of the GF grammar compiler. -</P> -<TABLE class="table"> -<TR> -<TH>operation</TH> -<TH>type</TH> -<TH COLSPAN="2">explanation</TH> -</TR> -<TR> -<TD><CODE>PBool</CODE></TD> -<TD><CODE>PType</CODE></TD> -<TD><CODE>PTrue | PFalse</CODE></TD> -</TR> -<TR> -<TD><CODE>Error</CODE></TD> -<TD><CODE>Type</CODE></TD> -<TD>the empty type</TD> -</TR> -<TR> -<TD><CODE>Int</CODE></TD> -<TD><CODE>Type</CODE></TD> -<TD>the type of integers</TD> -</TR> -<TR> -<TD><CODE>Ints</CODE></TD> -<TD><CODE>Integer -> Type</CODE></TD> -<TD>the type of integers from 0 to n</TD> -</TR> -<TR> -<TD><CODE>error</CODE></TD> -<TD><CODE>Str -> Error</CODE></TD> -<TD>forms error message</TD> -</TR> -<TR> -<TD><CODE>length</CODE></TD> -<TD><CODE>Str -> Int</CODE></TD> -<TD>length of string</TD> -</TR> -<TR> -<TD><CODE>drop</CODE></TD> -<TD><CODE>Integer -> Str -> Str</CODE></TD> -<TD>drop prefix of length</TD> -</TR> -<TR> -<TD><CODE>take</CODE></TD> -<TD><CODE>Integer -> Str -> Str</CODE></TD> -<TD>take prefix of length</TD> -</TR> -<TR> -<TD><CODE>tk</CODE></TD> -<TD><CODE>Integer -> Str -> Str</CODE></TD> -<TD>drop suffix of length</TD> -</TR> -<TR> -<TD><CODE>dp</CODE></TD> -<TD><CODE>Integer -> Str -> Str</CODE></TD> -<TD>take suffix of length</TD> -</TR> -<TR> -<TD><CODE>eqInt</CODE></TD> -<TD><CODE>Integer -> Integer -> PBool</CODE></TD> -<TD>test if equal integers</TD> -</TR> -<TR> -<TD><CODE>lessInt</CODE></TD> -<TD><CODE>Integer -> Integer -> PBool</CODE></TD> -<TD>test order of integers</TD> -</TR> -<TR> -<TD><CODE>plus</CODE></TD> -<TD><CODE>Integer -> Integer -> Integer</CODE></TD> -<TD>add integers</TD> -</TR> -<TR> -<TD><CODE>eqStr</CODE></TD> -<TD><CODE>Str -> Str -> PBool</CODE></TD> -<TD>test if equal strings</TD> -</TR> -<TR> -<TD><CODE>occur</CODE></TD> -<TD><CODE>Str -> Str -> PBool</CODE></TD> -<TD>test if occurs as substring</TD> -</TR> -<TR> -<TD><CODE>occurs</CODE></TD> -<TD><CODE>Str -> Str -> PBool</CODE></TD> -<TD>test if any char occurs</TD> -</TR> -<TR> -<TD><CODE>show</CODE></TD> -<TD><CODE>(P : Type) -> P -> Str</CODE></TD> -<TD>convert param to string</TD> -</TR> -<TR> -<TD><CODE>read</CODE></TD> -<TD><CODE>(P : Type) -> Str -> P</CODE></TD> -<TD>convert string to param</TD> -</TR> -<TR> -<TD><CODE>toStr</CODE></TD> -<TD><CODE>(L : Type) -> L -> Str</CODE></TD> -<TD>find the "first" string</TD> -</TR> -<TR> -<TD><CODE>nonExist</CODE></TD> -<TD><CODE>Str</CODE></TD> -<TD>a special token marking<BR/> -non-existing morphological forms</TD> -</TR> -<TR> -<TD><CODE>BIND</CODE></TD> -<TD><CODE>Str</CODE></TD> -<TD>a special token marking<BR/> -that the surrounding tokens should not<BR/> -be separated by space</TD> -</TR> -<TR> -<TD><CODE>SOFT_BIND</CODE></TD> -<TD><CODE>Str</CODE></TD> -<TD>a special token marking<BR/> -that the surrounding tokens may not<BR/> -be separated by space</TD> -</TR> -<TR> -<TD><CODE>SOFT_SPACE</CODE></TD> -<TD><CODE>Str</CODE></TD> -<TD>a special token marking<BR/> -that the space between the surrounding tokens<BR/> -is optional</TD> -</TR> -<TR> -<TD><CODE>CAPIT</CODE></TD> -<TD><CODE>Str</CODE></TD> -<TD>a special token marking<BR/> -that the first character in the next token<BR/> -should be capitalized</TD> -</TR> -<TR> -<TD><CODE>ALL_CAPIT</CODE></TD> -<TD><CODE>Str</CODE></TD> -<TD>a special token marking<BR/> -that the next word should be<BR/> -in all capital letters</TD> -</TR> -</TABLE> - -<P></P> -<P> -Compilation eliminates these operations, and they may therefore not -take arguments that depend on run-time variables. -</P> -<P> -The module <CODE>Predef</CODE> is included in the <I>opens</I> list of all -modules, and therefore does not need to be opened explicitly. -</P> -<A NAME="toc50"></A> -<H2>Flags and pragmas</H2> -<A NAME="toc51"></A> -<H3>Some flags and their values</H3> -<P> -<a name="flagvalues"></a> -</P> -<P> -The flag <CODE>coding</CODE> in concrete syntax sets the <B>character encoding</B> -used in the grammar. Internally, GF uses unicode, and <CODE>.pgf</CODE> files -are always written in UTF8 encoding. The presence of the flag -<CODE>coding=utf8</CODE> prevents GF from encoding an already encoded -file. -</P> -<P> - -<P></P> -<P> -The flag <CODE>startcat</CODE> in abstract syntax sets the default start category for -parsing, random generation, and any other grammar operation that depends -on category. Its legal values are the categories defined or inherited in -the abstract syntax. -</P> -<P> - - -<P></P> -<A NAME="toc52"></A> -<H3>Compiler pragmas</H3> -<P> -<B>Compiler pragmas</B> are a special form of comments prefixed with <CODE>--#</CODE>. -Currently GF interprets the following pragmas. -</P> -<TABLE class="table"> -<TR> -<TH>pragma</TH> -<TH COLSPAN="2">explanation</TH> -</TR> -<TR> -<TD><CODE>-path=</CODE>PATH</TD> -<TD>path list for searching modules</TD> -</TR> -</TABLE> - -<P></P> -<P> -For instance, the line -</P> -<PRE> - --# -path=.:present:prelude:/home/aarne/GF/tmp -</PRE> -<P> -in the top of <CODE>FILE.gf</CODE> causes the GF compiler, when invoked on <CODE>FILE.gf</CODE>, -to search through the current directory (<CODE>.</CODE>) and the directories -<CODE>present</CODE>, <CODE>prelude</CODE>, and <CODE>/home/aarne/GF/tmp</CODE>, in this order. -If a directory <CODE>DIR</CODE> is not found relative to the working directory, -<CODE>$(GF_LIB_PATH)/DIR</CODE> is searched. <CODE>$GF_LIB_PATH</CODE> -can be a colon-separated list of directories, in which case each directory -in the list contributes to the search path expansion. - -</P> -<A NAME="toc53"></A> -<H2>Alternative grammar input formats</H2> -<P> -While the GF language as specified in this document is the most versatile -and powerful way of writing GF grammars, there are several other formats -that a GF compiler may make available for users, either to get started -with small grammars or to semiautomatically convert grammars from other -formats to GF. Here are the ones supported by GF 2.8 and 3.0. -</P> -<A NAME="toc54"></A> -<H3>Old GF without modules</H3> -<P> -<a name="oldgf"></a> -</P> -<P> -Before GF compiler version 2.0, there was no module system, and -all kinds of judgement could be written in all files, without -any headers. This format is still available, and the compiler -(version 2.8) detects automatically if a file is in the current -or the old format. However, the old format is not recommended -because of pure modularity and missing separate compilation, -and also because libraries are not available, since the old -and the new format cannot be mixed. With version 2.8, grammars -in the old format can be converted to modular grammar with the -command -</P> -<PRE> - > import -o FILE.gf -</PRE> -<P> -which rewrites the grammar divided into three files: -an abstract, a concrete, and a resource module. -</P> -<A NAME="toc55"></A> -<H3>Context-free grammars</H3> -<P> -A quick way to write a GF grammar is to use the context-free format, -also known as BNF. Files of this form are recognized by the suffix -<CODE>.cf</CODE>. Rules in these files have the form -<center> -<I>Label</I> <CODE>.</CODE> <I>Cat</I> <CODE>::=</CODE> (<I>String</I> | <I>Cat</I>)* <CODE>;</CODE> -</center> -where <I>Label</I> and <I>Cat</I> are identifiers and <I>String</I> quoted strings. -</P> -<P> -There is a shortcut form generating labels automatically, -<center> -<I>Cat</I> <CODE>::=</CODE> (<I>String</I> | <I>Cat</I>)* <CODE>;</CODE> -</center> -In the shortcut form, vertical bars (<CODE>|</CODE>) can be used to give -several right-hand-sides at a time. An empty right-hand side -means the singleton of an empty sequence, and not an empty union. -</P> -<P> -Just like old-style GF files (previous section), contex-free grammar -files can be converted to modular GF by using the <CODE>-o</CODE> option to -the compiler in GF 2.8. -</P> -<A NAME="toc56"></A> -<H3>Extended BNF grammars</H3> -<P> -Extended BNF (<CODE>FILE.ebnf</CODE>) -goes one step further from the shortcut notation of previous section. -The rules have the form -<center> -<I>Cat</I> <CODE>::=</CODE> <I>RHS</I> <CODE>;</CODE> -</center> -where an <I>RHS</I> can be any regular expression -built from quoted strings and category symbols, in the following ways: -</P> -<TABLE class="table"> -<TR> -<TH>RHS item</TH> -<TH COLSPAN="2">explanation</TH> -</TR> -<TR> -<TD><I>Cat</I></TD> -<TD>nonterminal</TD> -</TR> -<TR> -<TD><I>String</I></TD> -<TD>terminal</TD> -</TR> -<TR> -<TD><I>RHS</I> <I>RHS</I></TD> -<TD>sequence</TD> -</TR> -<TR> -<TD><I>RHS</I> <CODE>|</CODE> <I>RHS</I></TD> -<TD>alternatives</TD> -</TR> -<TR> -<TD><I>RHS</I> <CODE>?</CODE></TD> -<TD>optional</TD> -</TR> -<TR> -<TD><I>RHS</I> <CODE>*</CODE></TD> -<TD>repetition</TD> -</TR> -<TR> -<TD><I>RHS</I> <CODE>+</CODE></TD> -<TD>non-empty repetition|</TD> -</TR> -</TABLE> - -<P></P> -<P> -Parentheses are used to override standard precedences, where -<CODE>|</CODE> binds weaker than sequencing, which binds weaker than the unary operations. -</P> -<P> -The compiler generates not only labels, but also new categories corresponding -to the regular expression combinations actually in use. -</P> -<P> -Just like <CODE>.cf</CODE> files (previous section), <CODE>.ebnf</CODE> -files can be converted to modular GF by using the <CODE>-o</CODE> option to -the compiler in GF 2.8. -</P> -<A NAME="toc57"></A> -<H3>Example-based grammars</H3> -<P> -<B>Example-based grammars</B> (<CODE>.gfe</CODE>) provide a way to use -resource grammar libraries without having to know the names -of functions in them. The compiler works as a preprocessor, -saving the result in a (<CODE>.gf</CODE>) file, which can be compiled -as usual. -</P> -<P> -If a library is implemented as an abstract and concrete syntax, -it can be used for parsing. Calls of library functions can therefore -be formed by parsing strings in the library. GF has an expression -format for this, -<center> -<CODE>in</CODE> <I>C</I> <I>String</I> -</center> -where <I>C</I> is the category in which to parse (it can be qualified by -the module name) and the string is the input to parser. Expressions -of this form are replaced by the syntax trees that result. These -trees are always type-correct. If several parses are found, all but -the first one are given in comments. -</P> -<P> -Here is an example, from <CODE>GF/examples/animal/</CODE>: -</P> -<PRE> - --# -resource=../../lib/present/LangEng.gfc - --# -path=.:present:prelude - - incomplete concrete QuestionsI of Questions = open Lang in { - lincat - Phrase = Phr ; - Entity = N ; - Action = V2 ; - lin - Who love_V2 man_N = in Phr "who loves men" ; - Whom man_N love_V2 = in Phr "whom does the man love" ; - Answer woman_N love_V2 man_N = in Phr "the woman loves men" ; - } -</PRE> -<P> -The <CODE>resource</CODE> pragma shows the grammar that is used for parsing -the examples. -</P> -<P> -Notice that the variables <CODE>love_V2</CODE>, <CODE>man_N</CODE>, etc, are -actually constants in the library. In the resulting rules, such as -</P> -<PRE> - lin Whom = \man_N -> \love_V2 -> - PhrUtt NoPConj (UttQS (UseQCl TPres ASimul PPos - (QuestSlash whoPl_IP (SlashV2 (DetCN (DetSg (SgQuant - DefArt)NoOrd)(UseN man_N)) love_V2)))) NoVoc ; -</PRE> -<P> -those constants are nonetheless treated as variables, following -the normal binding conventions, as stated <a href="#renaming">here</a>. -</P> -<A NAME="toc58"></A> -<H2>The grammar of GF</H2> -<P> -The following grammar is actually used in the parser of GF, although we have -omitted -some obsolete rules still included in the parser for backward compatibility -reasons. -</P> -<P> -This document was automatically generated by the <I>BNF-Converter</I>. It was generated together with the lexer, the parser, and the abstract syntax module, which guarantees that the document matches with the implementation of the language (provided no hand-hacking has taken place). -</P> -<A NAME="toc59"></A> -<H2>The lexical structure of GF</H2> -<A NAME="toc60"></A> -<H3>Identifiers</H3> -<P> -Identifiers <I>Ident</I> are unquoted strings beginning with a letter, -followed by any combination of letters, digits, and the characters <CODE>_ '</CODE> -reserved words excluded. -</P> -<A NAME="toc61"></A> -<H3>Literals</H3> -<P> -Integer literals <I>Integer</I> are nonempty sequences of digits. -</P> -<P> -String literals <I>String</I> have the form -<CODE>"</CODE><I>x</I><CODE>"</CODE>}, where <I>x</I> is any sequence of any characters -except <CODE>"</CODE> unless preceded by <CODE>\</CODE>. -</P> -<P> -Double-precision float literals <I>Double</I> have the structure -indicated by the regular expression <CODE>digit+ '.' digit+ ('e' ('-')? digit+)?</CODE> i.e.\ -two sequences of digits separated by a decimal point, optionally -followed by an unsigned or negative exponent. -</P> -<A NAME="toc62"></A> -<H3>Reserved words and symbols</H3> -<P> -The set of reserved words is the set of terminals appearing in the grammar. Those reserved words that consist of non-letter characters are called symbols, and they are treated in a different way from those that are similar to identifiers. The lexer follows rules familiar from languages like Haskell, C, and Java, including longest match and spacing conventions. -</P> -<P> -The reserved words used in GF are the following: -</P> -<TABLE ALIGN="center" CELLPADDING="4"> -<TR> -<TD><CODE>PType</CODE></TD> -<TD><CODE>Str</CODE></TD> -<TD><CODE>Strs</CODE></TD> -<TD><CODE>Type</CODE></TD> -</TR> -<TR> -<TD><CODE>abstract</CODE></TD> -<TD><CODE>case</CODE></TD> -<TD><CODE>cat</CODE></TD> -<TD><CODE>concrete</CODE></TD> -</TR> -<TR> -<TD><CODE>data</CODE></TD> -<TD><CODE>def</CODE></TD> -<TD><CODE>flags</CODE></TD> -<TD><CODE>fun</CODE></TD> -</TR> -<TR> -<TD><CODE>in</CODE></TD> -<TD><CODE>incomplete</CODE></TD> -<TD><CODE>instance</CODE></TD> -<TD><CODE>interface</CODE></TD> -</TR> -<TR> -<TD><CODE>let</CODE></TD> -<TD><CODE>lin</CODE></TD> -<TD><CODE>lincat</CODE></TD> -<TD><CODE>lindef</CODE></TD> -</TR> -<TR> -<TD><CODE>linref</CODE></TD> -<TD><CODE>of</CODE></TD> -<TD><CODE>open</CODE></TD> -<TD><CODE>oper</CODE></TD> -</TR> -<TR> -<TD><CODE>param</CODE></TD> -<TD><CODE>pre</CODE></TD> -<TD><CODE>printname</CODE></TD> -<TD><CODE>resource</CODE></TD> -</TR> -<TR> -<TD><CODE>strs</CODE></TD> -<TD><CODE>table</CODE></TD> -<TD><CODE>transfer</CODE></TD> -<TD><CODE>variants</CODE></TD> -</TR> -<TR> -<TD><CODE>where</CODE></TD> -<TD><CODE>with</CODE></TD> -<TD></TD> -<TD></TD> -</TR> -</TABLE> - -<P></P> -<P> -The symbols used in GF are the following: -</P> -<TABLE ALIGN="center" CELLPADDING="4"> -<TR> -<TD>;</TD> -<TD>=</TD> -<TD>:</TD> -<TD>-></TD> -</TR> -<TR> -<TD>{</TD> -<TD>}</TD> -<TD>**</TD> -<TD>,</TD> -</TR> -<TR> -<TD>(</TD> -<TD>)</TD> -<TD>[</TD> -<TD>]</TD> -</TR> -<TR> -<TD>-</TD> -<TD>.</TD> -<TD>|</TD> -<TD>?</TD> -</TR> -<TR> -<TD><</TD> -<TD>></TD> -<TD>@</TD> -<TD>!</TD> -</TR> -<TR> -<TD>*</TD> -<TD>+</TD> -<TD>++</TD> -<TD>\</TD> -</TR> -<TR> -<TD>=></TD> -<TD>_</TD> -<TD>$</TD> -<TD>/</TD> -</TR> -</TABLE> - -<P></P> -<A NAME="toc63"></A> -<H3>Comments</H3> -<P> -Single-line comments begin with --.Multiple-line comments are enclosed with {- and -}. -</P> -<A NAME="toc64"></A> -<H2>The syntactic structure of GF</H2> -<P> -Non-terminals are enclosed between < and >. -The symbols -> (production), <B>|</B> (union) -and <B>eps</B> (empty rule) belong to the BNF notation. -All other symbols are terminals. -</P> -<TABLE ALIGN="center" CELLPADDING="4"> -<TR> -<TD><I>Grammar</I></TD> -<TD>-></TD> -<TD><I>[ModDef]</I></TD> -</TR> -<TR> -<TD><I>[ModDef]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>ModDef</I> <I>[ModDef]</I></TD> -</TR> -<TR> -<TD><I>ModDef</I></TD> -<TD>-></TD> -<TD><I>ModDef</I> <CODE>;</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>ComplMod</I> <I>ModType</I> <CODE>=</CODE> <I>ModBody</I></TD> -</TR> -<TR> -<TD><I>ModType</I></TD> -<TD>-></TD> -<TD><CODE>abstract</CODE> <I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>resource</CODE> <I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>interface</CODE> <I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>concrete</CODE> <I>Ident</I> <CODE>of</CODE> <I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>instance</CODE> <I>Ident</I> <CODE>of</CODE> <I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>transfer</CODE> <I>Ident</I> <CODE>:</CODE> <I>Open</I> <CODE>-></CODE> <I>Open</I></TD> -</TR> -<TR> -<TD><I>ModBody</I></TD> -<TD>-></TD> -<TD><I>Extend</I> <I>Opens</I> <CODE>{</CODE> <I>[TopDef]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>[Included]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Included</I> <CODE>with</CODE> <I>[Open]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Included</I> <CODE>with</CODE> <I>[Open]</I> <CODE>**</CODE> <I>Opens</I> <CODE>{</CODE> <I>[TopDef]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>[Included]</I> <CODE>**</CODE> <I>Included</I> <CODE>with</CODE> <I>[Open]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>[Included]</I> <CODE>**</CODE> <I>Included</I> <CODE>with</CODE> <I>[Open]</I> <CODE>**</CODE> <I>Opens</I> <CODE>{</CODE> <I>[TopDef]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD><I>[TopDef]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>TopDef</I> <I>[TopDef]</I></TD> -</TR> -<TR> -<TD><I>Extend</I></TD> -<TD>-></TD> -<TD><I>[Included]</I> <CODE>**</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD><I>[Open]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Open</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Open</I> <CODE>,</CODE> <I>[Open]</I></TD> -</TR> -<TR> -<TD><I>Opens</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>open</CODE> <I>[Open]</I> <CODE>in</CODE></TD> -</TR> -<TR> -<TD><I>Open</I></TD> -<TD>-></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>(</CODE> <I>QualOpen</I> <I>Ident</I> <CODE>)</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>(</CODE> <I>QualOpen</I> <I>Ident</I> <CODE>=</CODE> <I>Ident</I> <CODE>)</CODE></TD> -</TR> -<TR> -<TD><I>ComplMod</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>incomplete</CODE></TD> -</TR> -<TR> -<TD><I>QualOpen</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD><I>[Included]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Included</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Included</I> <CODE>,</CODE> <I>[Included]</I></TD> -</TR> -<TR> -<TD><I>Included</I></TD> -<TD>-></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I> <CODE>[</CODE> <I>[Ident]</I> <CODE>]</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I> <CODE>-</CODE> <CODE>[</CODE> <I>[Ident]</I> <CODE>]</CODE></TD> -</TR> -<TR> -<TD><I>Def</I></TD> -<TD>-></TD> -<TD><I>[Name]</I> <CODE>:</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>[Name]</I> <CODE>=</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Name</I> <I>[Patt]</I> <CODE>=</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>[Name]</I> <CODE>:</CODE> <I>Exp</I> <CODE>=</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD><I>TopDef</I></TD> -<TD>-></TD> -<TD><CODE>cat</CODE> <I>[CatDef]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>fun</CODE> <I>[FunDef]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>data</CODE> <I>[FunDef]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>def</CODE> <I>[Def]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>data</CODE> <I>[DataDef]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>param</CODE> <I>[ParDef]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>oper</CODE> <I>[Def]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>lincat</CODE> <I>[PrintDef]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>lindef</CODE> <I>[Def]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>linref</CODE> <I>[Def]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>lin</CODE> <I>[Def]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>printname</CODE> <CODE>cat</CODE> <I>[PrintDef]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>printname</CODE> <CODE>fun</CODE> <I>[PrintDef]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>flags</CODE> <I>[FlagDef]</I></TD> -</TR> -<TR> -<TD><I>CatDef</I></TD> -<TD>-></TD> -<TD><I>Ident</I> <I>[DDecl]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>[</CODE> <I>Ident</I> <I>[DDecl]</I> <CODE>]</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>[</CODE> <I>Ident</I> <I>[DDecl]</I> <CODE>]</CODE> <CODE>{</CODE> <I>Integer</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD><I>FunDef</I></TD> -<TD>-></TD> -<TD><I>[Ident]</I> <CODE>:</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD><I>DataDef</I></TD> -<TD>-></TD> -<TD><I>Ident</I> <CODE>=</CODE> <I>[DataConstr]</I></TD> -</TR> -<TR> -<TD><I>DataConstr</I></TD> -<TD>-></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I> <CODE>.</CODE> <I>Ident</I></TD> -</TR> -<TR> -<TD><I>[DataConstr]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>DataConstr</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>DataConstr</I> <CODE>|</CODE> <I>[DataConstr]</I></TD> -</TR> -<TR> -<TD><I>ParDef</I></TD> -<TD>-></TD> -<TD><I>Ident</I> <CODE>=</CODE> <I>[ParConstr]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I> <CODE>=</CODE> <CODE>(</CODE> <CODE>in</CODE> <I>Ident</I> <CODE>)</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD><I>ParConstr</I></TD> -<TD>-></TD> -<TD><I>Ident</I> <I>[DDecl]</I></TD> -</TR> -<TR> -<TD><I>PrintDef</I></TD> -<TD>-></TD> -<TD><I>[Name]</I> <CODE>=</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD><I>FlagDef</I></TD> -<TD>-></TD> -<TD><I>Ident</I> <CODE>=</CODE> <I>Ident</I></TD> -</TR> -<TR> -<TD><I>[Def]</I></TD> -<TD>-></TD> -<TD><I>Def</I> <CODE>;</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Def</I> <CODE>;</CODE> <I>[Def]</I></TD> -</TR> -<TR> -<TD><I>[CatDef]</I></TD> -<TD>-></TD> -<TD><I>CatDef</I> <CODE>;</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>CatDef</I> <CODE>;</CODE> <I>[CatDef]</I></TD> -</TR> -<TR> -<TD><I>[FunDef]</I></TD> -<TD>-></TD> -<TD><I>FunDef</I> <CODE>;</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>FunDef</I> <CODE>;</CODE> <I>[FunDef]</I></TD> -</TR> -<TR> -<TD><I>[DataDef]</I></TD> -<TD>-></TD> -<TD><I>DataDef</I> <CODE>;</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>DataDef</I> <CODE>;</CODE> <I>[DataDef]</I></TD> -</TR> -<TR> -<TD><I>[ParDef]</I></TD> -<TD>-></TD> -<TD><I>ParDef</I> <CODE>;</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>ParDef</I> <CODE>;</CODE> <I>[ParDef]</I></TD> -</TR> -<TR> -<TD><I>[PrintDef]</I></TD> -<TD>-></TD> -<TD><I>PrintDef</I> <CODE>;</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>PrintDef</I> <CODE>;</CODE> <I>[PrintDef]</I></TD> -</TR> -<TR> -<TD><I>[FlagDef]</I></TD> -<TD>-></TD> -<TD><I>FlagDef</I> <CODE>;</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>FlagDef</I> <CODE>;</CODE> <I>[FlagDef]</I></TD> -</TR> -<TR> -<TD><I>[ParConstr]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>ParConstr</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>ParConstr</I> <CODE>|</CODE> <I>[ParConstr]</I></TD> -</TR> -<TR> -<TD><I>[Ident]</I></TD> -<TD>-></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I> <CODE>,</CODE> <I>[Ident]</I></TD> -</TR> -<TR> -<TD><I>Name</I></TD> -<TD>-></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>[</CODE> <I>Ident</I> <CODE>]</CODE></TD> -</TR> -<TR> -<TD><I>[Name]</I></TD> -<TD>-></TD> -<TD><I>Name</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Name</I> <CODE>,</CODE> <I>[Name]</I></TD> -</TR> -<TR> -<TD><I>LocDef</I></TD> -<TD>-></TD> -<TD><I>[Ident]</I> <CODE>:</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>[Ident]</I> <CODE>=</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>[Ident]</I> <CODE>:</CODE> <I>Exp</I> <CODE>=</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD><I>[LocDef]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>LocDef</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>LocDef</I> <CODE>;</CODE> <I>[LocDef]</I></TD> -</TR> -<TR> -<TD><I>Exp6</I></TD> -<TD>-></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Sort</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>String</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Integer</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Double</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>?</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>[</CODE> <CODE>]</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>data</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>[</CODE> <I>Ident</I> <I>Exps</I> <CODE>]</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>[</CODE> <I>String</I> <CODE>]</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>{</CODE> <I>[LocDef]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE><</CODE> <I>[TupleComp]</I> <CODE>></CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE><</CODE> <I>Exp</I> <CODE>:</CODE> <I>Exp</I> <CODE>></CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>(</CODE> <I>Exp</I> <CODE>)</CODE></TD> -</TR> -<TR> -<TD><I>Exp5</I></TD> -<TD>-></TD> -<TD><I>Exp5</I> <CODE>.</CODE> <I>Label</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp6</I></TD> -</TR> -<TR> -<TD><I>Exp4</I></TD> -<TD>-></TD> -<TD><I>Exp4</I> <I>Exp5</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>table</CODE> <CODE>{</CODE> <I>[Case]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>table</CODE> <I>Exp6</I> <CODE>{</CODE> <I>[Case]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>table</CODE> <I>Exp6</I> <CODE>[</CODE> <I>[Exp]</I> <CODE>]</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>case</CODE> <I>Exp</I> <CODE>of</CODE> <CODE>{</CODE> <I>[Case]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>variants</CODE> <CODE>{</CODE> <I>[Exp]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>pre</CODE> <CODE>{</CODE> <I>Exp</I> <CODE>;</CODE> <I>[Altern]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>strs</CODE> <CODE>{</CODE> <I>[Exp]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I> <CODE>@</CODE> <I>Exp6</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp5</I></TD> -</TR> -<TR> -<TD><I>Exp3</I></TD> -<TD>-></TD> -<TD><I>Exp3</I> <CODE>!</CODE> <I>Exp4</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp3</I> <CODE>*</CODE> <I>Exp4</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp3</I> <CODE>**</CODE> <I>Exp4</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp4</I></TD> -</TR> -<TR> -<TD><I>Exp1</I></TD> -<TD>-></TD> -<TD><I>Exp2</I> <CODE>+</CODE> <I>Exp1</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp2</I></TD> -</TR> -<TR> -<TD><I>Exp</I></TD> -<TD>-></TD> -<TD><I>Exp1</I> <CODE>++</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>\</CODE> <I>[Bind]</I> <CODE>-></CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>\</CODE> <CODE>\</CODE> <I>[Bind]</I> <CODE>=></CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Decl</I> <CODE>-></CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp3</I> <CODE>=></CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>let</CODE> <CODE>{</CODE> <I>[LocDef]</I> <CODE>}</CODE> <CODE>in</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>let</CODE> <I>[LocDef]</I> <CODE>in</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp3</I> <CODE>where</CODE> <CODE>{</CODE> <I>[LocDef]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>in</CODE> <I>Exp5</I> <I>String</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp1</I></TD> -</TR> -<TR> -<TD><I>Exp2</I></TD> -<TD>-></TD> -<TD><I>Exp3</I></TD> -</TR> -<TR> -<TD><I>[Exp]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp</I> <CODE>;</CODE> <I>[Exp]</I></TD> -</TR> -<TR> -<TD><I>Exps</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp6</I> <I>Exps</I></TD> -</TR> -<TR> -<TD><I>Patt2</I></TD> -<TD>-></TD> -<TD><CODE>_</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I> <CODE>.</CODE> <I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Integer</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Double</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>String</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>{</CODE> <I>[PattAss]</I> <CODE>}</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE><</CODE> <I>[PattTupleComp]</I> <CODE>></CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>(</CODE> <I>Patt</I> <CODE>)</CODE></TD> -</TR> -<TR> -<TD><I>Patt1</I></TD> -<TD>-></TD> -<TD><I>Ident</I> <I>[Patt]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I> <CODE>.</CODE> <I>Ident</I> <I>[Patt]</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Patt2</I> <CODE>*</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Ident</I> <CODE>@</CODE> <I>Patt2</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>-</CODE> <I>Patt2</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Patt2</I></TD> -</TR> -<TR> -<TD><I>Patt</I></TD> -<TD>-></TD> -<TD><I>Patt</I> <CODE>|</CODE> <I>Patt1</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Patt</I> <CODE>+</CODE> <I>Patt1</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Patt1</I></TD> -</TR> -<TR> -<TD><I>PattAss</I></TD> -<TD>-></TD> -<TD><I>[Ident]</I> <CODE>=</CODE> <I>Patt</I></TD> -</TR> -<TR> -<TD><I>Label</I></TD> -<TD>-></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>$</CODE> <I>Integer</I></TD> -</TR> -<TR> -<TD><I>Sort</I></TD> -<TD>-></TD> -<TD><CODE>Type</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>PType</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>Str</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>Strs</CODE></TD> -</TR> -<TR> -<TD><I>[PattAss]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>PattAss</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>PattAss</I> <CODE>;</CODE> <I>[PattAss]</I></TD> -</TR> -<TR> -<TD><I>[Patt]</I></TD> -<TD>-></TD> -<TD><I>Patt2</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Patt2</I> <I>[Patt]</I></TD> -</TR> -<TR> -<TD><I>Bind</I></TD> -<TD>-></TD> -<TD><I>Ident</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><CODE>_</CODE></TD> -</TR> -<TR> -<TD><I>[Bind]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Bind</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Bind</I> <CODE>,</CODE> <I>[Bind]</I></TD> -</TR> -<TR> -<TD><I>Decl</I></TD> -<TD>-></TD> -<TD><CODE>(</CODE> <I>[Bind]</I> <CODE>:</CODE> <I>Exp</I> <CODE>)</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp4</I></TD> -</TR> -<TR> -<TD><I>TupleComp</I></TD> -<TD>-></TD> -<TD><I>Exp</I></TD> -</TR> -<TR> -<TD><I>PattTupleComp</I></TD> -<TD>-></TD> -<TD><I>Patt</I></TD> -</TR> -<TR> -<TD><I>[TupleComp]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>TupleComp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>TupleComp</I> <CODE>,</CODE> <I>[TupleComp]</I></TD> -</TR> -<TR> -<TD><I>[PattTupleComp]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>PattTupleComp</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>PattTupleComp</I> <CODE>,</CODE> <I>[PattTupleComp]</I></TD> -</TR> -<TR> -<TD><I>Case</I></TD> -<TD>-></TD> -<TD><I>Patt</I> <CODE>=></CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD><I>[Case]</I></TD> -<TD>-></TD> -<TD><I>Case</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Case</I> <CODE>;</CODE> <I>[Case]</I></TD> -</TR> -<TR> -<TD><I>Altern</I></TD> -<TD>-></TD> -<TD><I>Exp</I> <CODE>/</CODE> <I>Exp</I></TD> -</TR> -<TR> -<TD><I>[Altern]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Altern</I></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Altern</I> <CODE>;</CODE> <I>[Altern]</I></TD> -</TR> -<TR> -<TD><I>DDecl</I></TD> -<TD>-></TD> -<TD><CODE>(</CODE> <I>[Bind]</I> <CODE>:</CODE> <I>Exp</I> <CODE>)</CODE></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>Exp6</I></TD> -</TR> -<TR> -<TD><I>[DDecl]</I></TD> -<TD>-></TD> -<TD><B>eps</B></TD> -</TR> -<TR> -<TD></TD> -<TD ALIGN="center"><B>|</B></TD> -<TD><I>DDecl</I> <I>[DDecl]</I></TD> -</TR> -</TABLE> - -</div> -</BODY> -</HTML> diff --git a/doc/gf-refman.md b/doc/gf-refman.md new file mode 100644 index 000000000..0462cf185 --- /dev/null +++ b/doc/gf-refman.md @@ -0,0 +1,2770 @@ +--- +title: GF Language Reference Manual +author: Aarne Ranta, Krasimir Angelov +date: June 2014, GF 3.6 +toc: true +--- + +This document is a reference manual to the GF programming language. GF, +Grammatical Framework, is a special-purpose programming language, +designed to support definitions of grammars. + +This document is not an introduction to GF; such introduction can be +found in the GF tutorial available on line on the GF web page, + +[`grammaticalframework.org`](http://grammaticalframework.org) + +This manual covers only the language, not the GF compiler or interactive +system. We will however make some references to different compiler +versions, if they involve changes of behaviour having to do with the +language specification. + +This manual is meant to be fully compatible with GF version 3.0. Main +discrepancies with version 2.8 are indicated, as well as with the +reference article on GF, + +A. Ranta, \"Grammatical Framework. A Type Theoretical Grammar +Formalism\", *The Journal of Functional Programming* 14(2), 2004, pp. +145-189. + +This article will referred to as \"the JFP article\". + +As metalinguistic notation, we will use the symbols + +- *a* === *b* to say that *a* is syntactic sugar for *b* +- *a* ==\> *b* to say that *a* is computed (or compiled) to *b* + + +Overview of GF +-------------- + +GF is a typed functional language, borrowing many of its constructs from +ML and Haskell: algebraic datatypes, higher-order functions, pattern +matching. The module system bears resemblance to ML (functors) but also +to object-oriented languages (inheritance). The type theory used in the +abstract syntax part of GF is inherited from logical frameworks, in +particular ALF (\"Another Logical Framework\"; in a sense, GF is Yet +Another ALF). From ALF comes also the use of dependent types, including +the use of explicit type variables instead of Hindley-Milner +polymorphism. + +The look and feel of GF is close to Java and C, due to the use of curly +brackets and semicolons in structuring the code; the expression syntax, +however, follows Haskell in using juxtaposition for function application +and parentheses only for grouping. + +To understand the constructs of GF, and especially their limitations in +comparison to general-purpose programming languages, it is essential to +keep in mind that GF is a special-purpose and non-turing-complete +language. Every GF program is ultimately compiled to a **multilingual +grammar**, which consists of an **abstract syntax** and a set of +**concrete syntaxes**. The abstract syntax defines a system of **syntax +trees**, and each concrete syntax defines a mapping from those syntax +trees to **nested tuples** of strings and integers. This mapping is +**compositional**, i.e. **homomorphic**, and moreover **reversible**: +given a nested tuple, there exists an effective way of finding the set +of syntax trees that map to this tuple. The procedure of applying the +mapping to a tree to produce a tuple is called **linearization**, and +the reverse search procedure is called **parsing**. It is ultimately the +requirement of reversibility that restricts GF to be less than +turing-complete. This is reflected in restrictions to recursion in +concrete syntax. Tree formation in abstract syntax, however, is fully +recursive. + +Even though run-time GF grammars manipulate just nested tuples, at +compile time these are represented by by the more fine-grained labelled +records and finite functions over algebraic datatypes. This enables the +programmer to write on a higher abstraction level, and also adds type +distinctions and hence raises the level of checking of programs. + + +The module system +----------------- + + +### Top-level and supplementary module structure + +The big picture of GF as a programming language for multilingual +grammars explains its principal module structure. Any GF grammar must +have an abstract syntax module; it can in addition have any number of +concrete syntax modules matching that abstract syntax. Before going to +details, we give a simple example: a module defining the **category** +`A` of adjectives and one adjective-forming **function**, the zero-place +function `Even`. We give the module the name `Adj`. The GF code for the +module looks as follows: + + abstract Adj = { + cat A ; + fun Even : A ; + } + +Here are two concrete syntax modules, one intended for mapping the trees +to English, the other to Swedish. The mappling is defined by `lincat` +definitions assigning a **linearization type** to each category, and +`lin` definitions assigning a **linearization** to each function. + + concrete AdjEng of Adj = { + lincat A = {s : Str} ; + lin Even = {s = "even"} ; + } + + concrete AdjSwe of Adj = { + lincat A = {s : AForm => Str} ; + lin Even = {s = table { + ASg Utr => "jämn" ; + ASg Neutr => "jämnt" ; + APl => "jämna" + } + } ; + param AForm = ASg Gender | APl ; + param Gender = Utr | Neutr ; + } + +These examples illustrate the main ideas of multilingual grammars: + +- the concrete syntax must match the abstract syntax: + - every `cat` is given a `lincat` + - every `fun` is given a `lin` + +<!-- --> + +- the concrete syntax is internally coherent: + - the `lin` rules respect the types defined by `lincat` rules + +<!-- --> + +- concrete syntaxes are independent of each other + - they can use different `lincat` and `lin` definitions + - they can define their own **parameter types** (`param`) + +The first two ideas form the core of the **static checking** of GF +grammars, eliminating the possibility of run-time errors in +linearization and parsing. The third idea gives GF the expressive power +needed to map abstract syntax to vastly different languages. + +Abstract and concrete modules are called **top-level grammar modules**, +since they are the ones that remain in grammar systems at run time. +However, in order to support **modular grammar engineering**, GF +provides much more module structure than strictly required in top-level +grammars. + +**Inheritance**, also known as **extension**, means that a module can +inherit the contents of one or more other modules to which new +judgements are added, e.g. + + abstract MoreAdj = Adj ** { + fun Odd : A ; + } + +**Resource modules** define parameter types and **operations** usable in +several concrete syntaxes, + + resource MorphoFre = { + param Number = Sg | Pl ; + param Gender = Masc | Fem ; + oper regA : Str -> {s : Gender => Number => Str} = + \fin -> { + s = table { + Masc => table {Sg => fin ; Pl => fin + "s"} ; + Fem => table {Sg => fin + "e" ; Pl => fin + "es"} + } + } ; + } + +By **opening**, a module can use the contents of a resource module +without inheriting them, e.g. + + concrete AdjFre of Adj = open MorphoFre in { + lincat A = {s : Gender => Number => Str} ; + lin Even = regA "pair" ; + } + +**Interfaces** and **instances** separate the contents of a resource +module to type signatures and definitions, in a way analogous to +abstract vs. concrete modules, e.g. + + interface Lexicon = { + oper Adjective : Type ; + oper even_A : Adjective ; + } + + instance LexiconEng of Lexicon = { + oper Adjective = {s : Str} ; + oper even_A = {s = "even"} ; + } + +**Functors** i.e. **parametrized modules** i.e. **incomplete modules**, +defining a concrete syntax in terms of an interface. + + incomplete concrete AdjI of Adj = open Lexicon in { + lincat A = Adjective ; + lin Even = even_A ; + } + +A functor can be **instantiated** by providing instances of its open +interfaces. + + concrete AdjEng of Adj = AdjI with (Lexicon = LexiconEng) ; + + +### Compilation units + +The compilation unit of GF source code is a file that contains a module. +Judgements outside modules are supported only for backward +compatibility, as explained [here](#oldgf). Every source file, suffixed +`.gf`, is compiled to a \"GF object file\", suffixed `.gfo` (as of GF +Version 3.0 and later). For runtime grammar objects used for parsing and +linearization, a set of `.gfo` files is linked to a single file suffixed +`.pgf`. While `.gf` and `.gfo` files may contain modules of any kinds, a +`.pgf` file always contains a multilingual grammar with one abstract and +a set of concrete syntaxes. + +The following diagram summarizes the files involved in the compilation +process. + +`module1.gf module2.gf ... modulen.gf` + +==\> + +`module1.gfo module2.gfo ... modulen.gfo` + +==\> + +grammar.pgf + +Both `.gf` and `.gfo` files are written in the GF source language; +`.pgf` files are written in a lower-level format. The process of +translating `.gf` to `.gfo` consists of **name resolution**, **type +annotation**, **partial evaluation**, and **optimization**. There is a +great advantage in the possibility to do this separately for GF modules +and saving the result in `.gfo` files. The partial evaluation phase, in +particular, is time and memory consuming, and GF libraries are therefore +distributed in `.gfo` to make their use less arduous. + +*In GF before version 3.0, the object files are in a format called +`.gfc`,* *and the multilingual runtime grammar is in a format called +`.gfcm`.* + +The standard compiler has a built-in **make facility**, which finds out +what other modules are needed when compiling an explicitly given module. +This facility builds a dependency graph and decides which of the +involved modules need recompilation (from `.gf` to `.gfo`), and for +which the GF object can be used directly. + + +### Names + +Each module *M* defines a set of **names**, which are visible in *M* +itself, in all modules extending *M* (unless excluded, as explained +[here](#restrictedinheritance)), and all modules opening *M*. These +names can stand for abstract syntax categories and functions, parameter +types and parameter constructors, and operations. All these names live +in the same **name space**, which means that a name entering a module +more than once due to inheritance or opening can lead to a **conflict**. +It is specified [here](#renaming) how these conflicts are resolved. + +The names of modules live in a name space separate from the other names. +Even here, all names must be distinct in a set of files compiled to a +multilingual grammar. In particular, even files residing in different +directories must have different names, since GF has no notion of +hierarchic module names. + +Lexically, names belong to the class of **identifiers**. An idenfifier +is a letter followed by any number of letters, digits, undercores (`_`) +and primes (`'`). Upper- and lower-case letters are treated as distinct. +Nothing dictates the choice of upper or lower-case initials, but the +standard libraries follow conventions similar to Haskell: + +- upper case is used for modules, abstract syntax categories and + functions, parameter types and constructors, and type synonyms +- lower case is used for non-type-valued operations and for variables + +[]{#identifiers} + +\"Letters\" as mentioned in the identifier syntax include all 7-bit +ASCII letters. Iso-latin-1 and Unicode letters are supported in varying +degrees by different tools and platforms, and are hence not recommended +in identifiers. + + +### The structure of a module + +Modules of all types have the following structure: + +*moduletype* *name* `=` *extends* *opens* *body* + +The part of the module preceding the body is its **header**. The header +defines the type of the module and tells what other modules it inherits +and opens. The body consists of the judgements that introduce all the +new names defined by the module. + +Any of the parts *extends*, *opens*, and *body* may be empty. If they +are all filled, delimiters and keywords separate the parts in the +following way: + +*moduletype* *name* `=` *extends* `**` `open` *opens* `in` `{` *body* +`}` + +The part *moduletype* *name* looks slightly different if the type is +`concrete` or `instance`: the *name* intrudes between the type keyword +and the name of the module being implemented and which really belongs to +the type of the module: + +`concrete` *name* `of` *abstractname* + +The only exception to the schema of functor syntax is functor +instantiations: the instantiation list is given in a special way between +*extends* and *opens*: + +`incomplete concrete` *name* `of` *abstractname* `=` *extends* `**` +*functorname* `with` *instantiations* `**` `open` *opens* `in` `{` +*body* `}` + +Logically, the part \"*functorname* `with` *instantiations*\" should +really be one of the *extends*. This is also shown by the fact that it +can have restricted inheritance (concept defined +[here](#restrictedinheritance)). + + +### Module types, headers, and bodies + +The *extends* and *opens* parts of a module header are lists of module +names (with possible qualifications, as defined below +[here](#qualifiednames)). The first step of type checking a module +consists of verifying that these names stand for modules of approptiate +module types. As a rule of thumb, + +- the *extends* of a module must have the same *moduletype* +- the *opens* of a module must be of type `resource` + +However, the precise rules are a little more fine-grained, because of +the presence of interfaces and their instances, and the possibility to +reuse abstract and concrete modules as resources. The following table +gives, for all module types, the possible module types of their +*extends* and *opens*, as well as the forms of judgement legal in that +module type. + + module type extends opens body + --------------------------- ------------ ------------ ---------------------------- + `abstract` abstract \- `cat, fun, def, data` + `concrete of` *abstract* concrete resource\* `lincat, cat, oper, param` + `resource` resource\* resource\* `oper, param` + `interface` resource+ resource\* `oper, param` + `instance of` *interface* resource\* resource\* `oper, param` + `incomplete` concrete concrete+ resource+ `lincat, cat, oper, param` + +The table uses the following shorthands for lists of module types: + +- resource\*: resource, instance, concrete +- resource+: resource\*, interface, abstract +- concrete+: concrete, incomplete concrete + +The legality of judgements in the body is checked before the judgements +themselves are checked. + +The forms of judgement are explained [here](#judgementforms). + + +### Digression: the logic of module types + +Why are the legality conditions of opens and extends so complicated? The +best way to grasp them is probably to consider a simplified logical +model of the module system, replacing modules by types and functions. +This model could actually be developed towards treating modules in GF as +first-class objects; so far, however, this step has not been motivated +by any practical needs. + + module object and type + ------------------------------------------ ----------------------- + abstract A = B A = B : type + concrete C of A = B C = B : A -\> S + interface I = B I = B : type + instance J of I = B J = B : I + incomplete concrete C of A = open I in B C = B : I -\> A -\> S + concrete K of A = C with (I=J) K = B(J) : A -\> S + resource R = B R = B : I + concrete C of A = open R in B C = B(R) : A -\> S + +A further step of defining modules as first-class objects would use +GADTs and record types: + +- an abstract syntax is a Generalized Algebraic Datatype (GADT) +- the target type `S` of concrete syntax is the type of nested tuples + over strings and integers +- an interface is a labelled record type +- an instance is a record of the type defined by the interface +- a functor, with a module body opening an interface, is a function on + its instances +- the instantiation of a functor is an application of the function to + some instance +- a resource is a typed labelled record, putting together an interface + and an instance of it +- the body of a module opening a resource is as a function on the + interface implicit in the resource; this function is immediately + applied to the instance defined in the resource + +Slightly unexpectedly, interfaces and instances are easier to understand +in this way than resources - a resource is, indeed, more complex, since +it fuses together an interface and an instance. + +[]{#openabstract} + +When an abstract is used as an interface and a concrete as its instance, +they are actually reinterpreted so that they match the model. Then the +abstract is no longer a GADT, but a system of *abstract* datatypes, with +a record field of type `Type` for each category, and a function among +these types for each abstract syntax function. A concrete syntax +instantiates this record with linearization types and linearizations. + + +### Inheritance + +After checking that the *extends* of a module are of appropriate module +types, the compiler adds the inherited judgements to the judgements +included in the body. The inherited judgements are not copied entirely, +but their names with links to the inherited module. Conflicts may arise +in this process: a name can have two definitions in the combined pool of +inherited and added judgements. Such a conflict is always an error: GF +provides no way to redefine an inherited constant. + +Simple as the definition of a conflict may sound, it has to take care of +the inheritance hierarchy. A very common pattern of inheritance is the +**diamond**: inheritance from two modules which themselves inherit a +common base module. Assume that the base module defines a name `f`: + + N + / \ + M1 M2 + \ / + Base {f} + +Now, `N` inherits `f` from both `M1` and `M2`, so is there a conflict? +The answer in GF is *no*, because the \"two\" `f`\'s are in the end the +same: the one defined in `Base`. The situation is thus simpler than in +**multiple inheritance** in languages like C++, because definitions in +GF are **immutable**: neither `M1` nor `M2` can possibly have changed +the definition of `f` given in `Base`. In practice, the compiler manages +inheritance through hierarchy in a very simple way, by just always +creating a link not to the immediate parent, but the original ancestor; +this ancestor can be read from the link provided by the immediate +parent. Here is how links are created from source modules by the +compiler: + + Base {f} + M1 {m1} ===> M1 {Base.f, m1} + M2 {m2} ===> M2 {Base.f, m2} + N {n} ===> N {Base.f, M1.m1, M2.m2, n} + +[]{#restrictedinheritance} + +Inheritance can be **restricted**. This means that a module can be +specified as inheriting *only* explicitly listed constants, or all +constants *except* ones explicitly listed. The syntax uses constant +names in brackets, prefixed by a minus sign in the case of an exclusion +list. In the following configuration, N inherits `a,b,c` from `M1`, and +all names but `d` from `M2` + + N = M1 {a,b,c}, M2-{d} + +Restrictions are performed as a part of inheritance linking, module by +module: the link is created for a constant if and only if it is both +included in the module and compatible with the restriction. Thus, for +instance, an inadvertent usage can exclude a constant from one module +but inherit it from another one. In the following configuration, `f` is +inherited via `M1`, if `M1` inherits it. + + N = M1 [a,b,c], M2-[f] + +Unintended inheritance may cause problems later in compilation, in the +judgement-level dependency analysis phase. For instance, suppose a +function `f` has category `C` as its type in `M`, and we only include +`f`. The exclusion has the effect of creating an ill-formed module: + + abstract M = {cat C ; fun f : C ;} + M [f] ===> {fun f : C ;} + +One might expect inheritance restriction to be transitive: if an +included constant *b* depends on some other constant *a*, then *a* +should be included automatically. However, this rule would leave to +hard-to-detect inheritances. And it could only be applied later in the +compilation phase, when the compiler has not only collected the names +defined, but also resolved the names used in definitions. + +Yet another pitfall with restricted inheritance is that it must be +stated for each module separately. For instance, a concrete syntax of an +abstract must exclude all those names that the abstract does, and a +functor instantiation must replicate all restrictions of the functor. + + +### Opening + +Opening makes constants from other modules usable in judgements, without +inheriting them. This means that, unlike inheritance, opening is not +transitive. + +[]{#qualifiednames} + +Opening cannot be restricted as inheritance can, but it can be +**qualified**. This means that the names from the opened modules cannot +be used as such, but only as prefixed by a qualifier and a dot (`.`). +The qualifier can be any identifier, including the name of the module. +Here is an example of an *opens* list: + + open A, (X = XSLTS), (Y = XSLTS), B + +If `A` defines the constant `a`, it can be accessed by the names + + a A.a + +If `XSLTS` defines the constant `x`, it can be accessed by the names + + X.x Y.x XSLTS.x + +Thus qualification by real module name is always possible, and one and +the same module can be qualified in different ways at the same time (the +latter can be useful if you want to be able to change the +implementations of some constants to a different resource later). Since +the qualification with real module name is always possible, it is not +possible to \"swap\" the names of modules locally: + + open (A=B), (B=A) -- NOT POSSIBLE! + +The list of qualifiers names and module names in a module header may +thus not contain any duplicates. + + +### Name resolution + +[]{#renaming} + +**Name resolution** is the compiler phase taking place after inheritance +linking. It qualifies all names occurring in the definition parts of +judgements (that is, just excluding the defined names themselves) with +the names of the modules they come from. If a name can come from +different modules (that is, not from their common ancestor), a conflict +is reported; this decision is hence not dependent on e.g. types, which +are known only at a later phase. + +Qualification of names is the main device for avoiding conflicts in name +resolution. No other information is used, such as priorities between +modules. However, if a name is defined in different opened modules but +never used in the module body, a conflict does not arise: conflicts +arise only when names are used. Also in this respect, opening is thus +different from inheritance, where conflicts are checked independently of +use. + +As usual, inner scope has priority in name resolution. This means that +if an identifier is in scope as a bound variable, it will not be +interpreted as a constant, unless qualified by a module name (variable +bindings are explained [here](#variablebinding)). + + +### Functor instantiations + +We have dealt with the principles of module headers, inheritance, and +names in a general way that applies to all module types. The exception +is functor instantiations, that have an extra part of the instantiating +equations, assigning an instance to every interface. Here is a typical +example, displaying the full generality: + + concrete FoodsEng of Foods = PhrasesEng ** + FoodsI-[Pizza] with + (Syntax = SyntaxEng), + (LexFoods = LexFoodsEng) ** + open SyntaxEng, ParadigmsEng in { + lin Pizza = mkCN (mkA "Italian") (mkN "pie") ; + } + +(The example is modified from Section 5.9 in the GF Tutorial.) + +The instantiation syntax is similar to qualified *opens*. The +left-hand-side names must be interfaces, the right-hand-side names their +instances. (Recall that `abstract` can be use as `interface` and +`concrete` as its `instance`.) Inheritance from the functor can be +restricted, typically in the purpose of defining some excluded functions +in language-specific ways in the module body. + + +### Completeness + +(This section refers to the forms of judgement introduced +[here](#judgementforms).) + +A `concrete` is complete with respect to an `abstract`, if it contains a +`lincat` definition for every `cat` declaration, and a `lin` definition +for every `fun` declaration. + +The same completeness criterion applies to functor instantiations. It is +not possible to use a partial functor instantiation, leading to another +functor. + +Functors do not need to be complete in the sense concrete modules need. +The missing definitions can then be provided in the body of each functor +instantiation. + +A `resource` is complete, if all its `oper` and `param` judgements have +a definition part. While a `resource` must be complete, an `interface` +need not. For an `interface`, it is the definition parts of judgements +are optional. + +An `instance` is complete with respect to an `interface`, if it gives +the definition parts of all `oper` and `param` judgements that are +omitted in the `interface`. Giving definitions to judgements that have +already been defined in the `interface` is illegal. Type signatures, on +the other hand, can be repeated if the same types are used. + +In addition to completing the definitions in an `interface`, its +instance may contain other judgements, but these must all be complete +with definitions. + +Here is an example of an instance and its interface showing the above +variations: + + interface Pos = { + param Case ; -- no definition + param Number = Sg | Pl ; -- definition given + oper Noun : Type = { -- relative definition given + s : Number => Case => Str + } ; + oper regNoun : Str -> Noun ; -- no definition + } + + instance PosEng of Pos = { + param Case = Nom | Gen ; -- definition of Case + -- Number and Noun inherited + oper regNoun = \dog -> { -- type of regNoun inherited + s = table { -- definition of regNoun + Sg => table { + Nom => dog + -- etc + } + } ; + oper house_N : Noun = -- new definition + regNoun "house" ; + } + + +Judgements +---------- + + +### Overview of the forms of judgement + +[]{#judgementforms} + +A module body in GF is a set of **judgements**. Judgements are +definitions or declarations, sometimes combinations of the two; the +common feature is that every judgement introduces a name, which is +available in the module and whenever the module is extended or opened. + +There are several different **forms of judgement**, identified by +different **judgement keywords**. Here is a list of all these forms, +together with syntax descriptions and the types of modules in which each +form can occur. The table moreover indicates whether the judgement has a +default value, and whether it contributes to the **name base**, i.e. +introduces a new name to the scope. + + judgement where module default base + ----------------------------- ---------------------------- ------------ --------- ------ + `cat` C G G context abstract N/A yes + `fun` f : A A type abstract N/A yes + `def` f ps = t f fun, ps patterns, t term abstract yes no + `data` C = f `|` \... `|` g C cat, f\...g fun abstract yes no + `lincat` C = T C cat, T type concrete\* yes yes + `lin` f = t f fun, t term concrete\* no yes + `lindef` C = t C cat, t term concrete\* yes no + `linref` C = t C cat, t term concrete\* yes no + `printname cat` C = t C cat, t term concrete\* yes no + `printname fun` f = t f fun, t term concrete\* yes no + `param` P = C`|` \... `|` D C\...D constructors resource\* N/A yes + `oper` f : T = t T type, t term resource\* N/A yes + `flags` o = v o flag, v value all yes N/A + +Judgements that have default values are rarely used, except `lincat` and +`flags`, which often need values different from the defaults. + +Introducing a name twice in the same module is an error. In other words, +all judgements that have a \"yes\" in the name base column, must have +distinct identifiers on their left-hand sides. + +All judgement end with semicolons (`;`). + +In addition to the syntax given in the table, many of the forms have +syntactic sugar. This sugar will be explained below in connection to +each form. There are moreover two kinds of syntactic sugar common to all +forms: + +- the judgement keyword is shared between consecutive judgements until + a new keyword appears: + `keyw J ; K ;` === `keyw J ; keyw K ;` +- the right-hand sides of colon (`:`) and equality (`=`) can be + shared, by using comma (`,`) as separator of left-hand sides, which + must consist of identifiers + `c,d : T` === `c : T ; d : T ;` + `c,d = t` === `c = t ; d = t ;` + +These conventions, like all syntactic sugar, are performed at an early +compilation phase, directly after parsing. This means that e.g. + + lin f,g = \x -> x ; + +can be correct even though `f` and `g` required different function +types. + +Within a module, judgements can occur in any order. In particular, a +name can be used before it is introduced. + +The explanations of judgement forms refer to the notions of **type** and +**term** (the latter also called **expression**). These notions will be +explained in detail [here](#expressions). + + +### Category declarations, cat + +[]{#catjudgements} + +Category declarations + +`cat` *C* *G* + +define the **basic types** of abstract syntax. A basic type is formed +from a category by giving values to all variables in the **context** +*G*. If the context is empty, the basic type looks the same as the +category itself. Otherwise, application syntax is used: + +*C* *a*~1~\...*a*~n~ + + +### Hypotheses and contexts + +[]{#contexts} + +A context is a sequence of **hypotheses**, i.e. variable-type pairs. A +hypothesis is written + +`(` *x* `:` *T* `)` + +and a sequence does not have any separator symbols. As syntactic sugar, + +- variables can share a type, + `(` *x,y* `:` *T* `)` === `(` *x* `:` *T* `)` `(` *y* `:` *T* `)` +- a **wildcard** can be used for a variable not occurring in types + later in the context, + `(` `_` `:` *T* `)` === `(` *x* `:` *T* `)` +- if the variable does not occur later, it can be omitted altogether, + and parentheses are not used, + *T* === `(` *x* `:` *T* `)` + But if *T* is more complex than an identifier, it needs parentheses + to be separated from the rest of the context. + +An abstract syntax has **dependent types**, if any of its categories has +a non-empty context. + + +### Function declarations, fun + +Function declarations, + +`fun` *f* `:` *T* + +define the **syntactic constructors** of abstract syntax. The type *T* +of *f* is built built from basic types (formed from categories) by using +the function type constructor `->`. Thus its form is + +(*x*~1~ `:` *A*~1~) `->` \... `->` (*x*~n~ `:` *A*~n~) `->` *B* + +where *Ai* are types, called the **argument types**, and *B* is a basic +type, called the **value type** of *f*. The **value category** of *f* is +the category that forms the type *B*. + +A **syntax tree** is formed from *f* by applying it to a full list of +arguments, so that the result is of a basic type. + +A **higher-order function** is one that has a function type as an +argument. The concrete syntax of GF does not support displaying the +bound variables of functions of higher than second order, but they are +legal in abstract syntax. + +An abstract syntax is **context-free**, if it has neither dependent +types nor higher-order functions. Grammars with context-free abstract +syntax are an important subclass of GF, with more limited complexity +than full GF. Whether the *concrete* syntax is context-free in the sense +of the Chomsky hierarchy is independent of the context-freeness of the +abstract syntax. + + +### Function definitions, def + +Function definitions, + +`def` *f* *p*~1~ \... *p*~n~ `=` *t* + +where *f* is a `fun` function and *p*~i~\# are patterns, impose a +relation of **definitional equality** on abstract syntax trees. They +form the basis of **computation**, which is used when comparing whether +two types are equal; this notion is relevant only if the types are +dependent. Computation can also be used for the **normalization** of +syntax trees, which applies even in context-free abstract syntax. + +The set of `def` definitions for *f* can be scattered around the module +in which *f* is introduced as a function. The compiler builds the set of +pattern equations in the order in which the equations appear; this order +is significant in the case of overlapping patterns. All equations must +appear in the same module in which *f* itself declared. + +The syntax of patterns will be specified [here](#patternmatching), +commonly for abstract and concrete syntax. In abstract syntax, +**constructor patterns** are those of the form + +*C* *p*~1~ \... *p*~n~ + +where *C* is declared as `data` for some abstract syntax category (see +next section). A **variable pattern** is either an identifier or a +wildcard. + +A common pitfall is to forget to declare a constructor as data, which +causes it to be interpreted as a variable pattern in definitions. + +Computation is performed by applying definitions and beta conversions, +and in general by using **pattern matching**. Computation and pattern +matching are explained commonly for abstract and concrete syntax +[here](#patternmatching). + +In contrast to concrete syntax, abstract syntax computation is +completely **symbolic**: it does not produce a value, but just another +term. Hence it is not an error to have incomplete systems of pattern +equations for a function. In addition, the definitions can be +**recursive**, which means that computation can fail to terminate; this +can never happen in concrete syntax. + + +### Data constructor definitions, data + +A data constructor definition, + +`data` *C* `=` *f*~1~ `|` \... `|` *f*~n~ + +defines the functions *f1*\...*fn* to be **constructors** of the +category *C*. This means that they are recognized as constructor +patterns when used in function definitions. + +In order for the data constructor definition to be correct, +*f*~1~\...*f*~n~ must be functions with *C* as their value category. + +The complete set of constructors for a category *C* is the union of all +its data constructor definitions. Thus a category can be \"extended\" by +new constructors afterwards. However, all these constructor definitions +must appear in the same module in which the category is itself defined. + +There is syntactic sugar for declaring a function as a constructor at +the same time as introducing it: + +`data` *f* : *A*~1~ `->` \... `->` *A*~n~ `->` *C* *t*~1~ \... *t*~m~ + +=== + +`fun` *f* : *A*~1~ `->` \... `->` *A*~n~ `->` *C* *t*~1~ \... *t*~m~ ; +`data` *C* = *f* + + +### The semantic status of an abstract syntax function + +There are three possible statuses for a function declared in a `fun` +judgement: + +- primitive notion: the default status +- constructor: the function appears on the right-hand side in `data` + judgement +- defined: the function has a `def` definition + +The \"constructor\" and \"defined\" statuses are in contradiction with +each other, whereas the primitive notion status is overridden by any of +the two others. + +This distinction is relevant for the semantics of abstract syntax, not +for concrete syntax. It shows in the way patterns are treated in +equations in `def` definitions: a constructor in a pattern matches only +itself, whereas any other name is treated as a variable pattern, which +matches anything. + + +### Linearization type definitions, lincat + +A linearization type definition, + +`lincat` *C* `=` *T* + +defines the type of linearizations of trees whose type has category *C*. +Type dependences have no effect on the linearization type. + +The type *T* must be a **legal linearization type**, which means that it +is a *record type* whose fields have either parameter types, the type +Str of strings, or table or record types of these. In particular, +function types may not appear in *T*. A detailed explanation of types in +concrete syntax will be given [here](#cnctypes). + +If *K* is the concrete syntax of an abstract syntax *A*, then *K* must +define the linearization type of all categories declared in *A*. +However, the definition can be omitted from the source code, in which +case the default type `{s : Str}` is used. + + +### Linearization definitions, lin + +A linearization definition, + +`lin` *f* `=` *t* + +defines the linearizations function of function *f*, i.e. the function +used for linearizing trees formed by *f*. + +The type of *t* must be the homomorphic image of the type of *f*. In +other words, if + +`fun` *f* `:` *A*~1~ `->` \... `->` *A*~n~ `->` *A* + +then + +`lin` *f* `:` *A*~1~\* `->` \... `->` *A*~n~\* `->` *A*\* + +where the type *T*\* is defined as follows depending on *T*: + +- (*C* *t*~1~ \... *t*~n~)\* = *T*, if `lincat` *C* `=` *T* +- (*B*~1~ `->` \... `->` *B*~m~ `->` *B*)\* = *B*\* + `** {$0,...,$m : Str}` + +The second case is relevant for higher-order functions only. It says +that the linearization type of the value type is extended by adding a +string field for each argument types; these fields store the variable +symbol used for the binding of each variable. + +[]{#HOAS} + +Since the arguments of a function argument are treated as bare strings, +orders higher than the second are irrelevant for concrete syntax. + +There is syntactic sugar for binding the variables of the linearization +of a function on the left-hand side: + +`lin` *f* *p* `=` *t* === `lin` *f* `= \`*p* `->` *t* + +The pattern *p* must be either a variable or a wildcard (`_`); this is +what the syntax of lambda abstracts (`\p -> t`) requires. + + +### Linearization default definitions, lindef + +[]{#lindefjudgements} + +A linearization default definition, + +`lindef` *C* `=` *t* + +defines the default linearization of category *C*, i.e. the function +applicable to a string to make it into an object of the linearization +type of *C*. + +Linearization defaults are invoked when linearizing variable bindings in +higher-order abstract syntax. A variable symbol is then presented as a +string, which must be converted to correct type in order for the +linearization not to fail with an error. + +The other use of the defaults is for linearizing metavariables and +abstract functions without linearization in the concrete syntax. In the +first case the default linearization is applied to the string `"?X"` +where `X` is the unique index of the metavariable, and in the second +case the string is `"[f]"` where `f` is the name of the abstract +function with missing linearization. + +Usually, linearization defaults are generated by using the default rule +that \"uses the symbol itself for every string, and the first value of +the parameter type for every parameter\". The precise definition is by +structural recursion on the type: + +- default(Str,s) = s +- default(P,s) = \#1(P) +- default(P =\> T,s) = `\\_ =>` default(T,s) +- default(`{`\... ; r : R ; \...`}`,s) = `{`\... ; r : default(R,s) ; + \...`}` + +The notion of the first value of a parameter type (\#1(P)) is defined +[below](#paramvalues). + + +### Linearization reference definitions, linref + +[]{#linrefjudgements} + +A linearization reference definition, + +`linref` *C* `=` *t* + +defines the reference linearization of category *C*, i.e. the function +applicable to an object of the linearization type of *C* to make it into +a string. + +The reference linearization is always applied to the top-level node of +the abstract syntax tree. For example when we linearize the tree +`f x1 x2 .. xn`, then we first apply `f` to its arguments which gives us +an object of the linearization type of its category. After that we apply +the reference linearization for the same category to get a string out of +the object. This is particularly useful when the linearization type of +*C* contains discontious constituents. In this case usually the +reference linearization glues the constituents together to produce an +intuitive linearization string. + +The reference linearization is also used for linearizing metavariables +which stand in function position. For example the tree +`f (? x1 x2 .. xn)` is linearized as follows. Each of the arguments +`x1 x2 .. xn` is linearized, and after that the reference linearization +of the its category is applied to the output of the linearization. The +result is a sequence of `n` strings which are concatenated into a single +string. The final string is the input to the default linearization of +the category for the argument of `f`. After applying the default +linearization we get an object that we could safely pass to `f`. + +Usually, linearization references are generated by using the rule that +\"picks the first string in the linearization type\". The precise +definition is by structural recursion on the type: + +- reference(Str,o) = Just o +- reference(P,s) = Nothing +- reference(P =\> T,o) = reference(T,o ! \#1(P)) \|\| reference(T,o ! + \#2(P)) \|\| \... \|\| reference(T,o ! \#n(P)) +- reference({r1 : R1; \... rn : Rn},o) = reference(R1, o.r1) \|\| + reference(R2, o.r2) \|\| \... \|\| reference(Rn, o.rn) + +Here each call to reference returns either `(Just o)` or `Nothing`. When +we compute the reference for a table or a record then we pick the +reference for the first expression for which the recursive call gives us +`Just`. If we get `Nothing` for all of them then the final result is +`Nothing` too. + + +### Printname definitions, printname cat and printname fun + +A category printname definition, + +`printname cat` *C* `=` *s* + +defines the printname of category *C*, i.e. the name used in some +abstract syntax information shown to the user. + +Likewise, a function printname definition, + +`printname fun` *f* `=` *s* + +defines the printname of function *f*, i.e. the name used in some +abstract syntax information shown to the user. + +The most common use of printnames is in the interactive syntax editor, +where printnames are displayed in menus. It is possible e.g. to adapt +them to each language, or to embed HTML tooltips in them (as is used in +some HTML-based editor GUIs). + +Usually, printnames are generated automatically from the symbol and/or +concrete syntax information. + + +### Parameter type definitions, param + +[]{#paramjudgements} + +A parameter type definition, + +`param` *P* `=` *C*~1~ *G*~1~ `|` \... `|` *C*~n~ *G*~n~ + +defines a parameter type *P* with the **parameter constructors** +*C*~1~\...*C*~n~, with their respective contexts *G*~1~\...*G*~n~. + +[]{#paramtypes} + +Contexts have the same syntax as in `cat` judgements, explained +[here](#catjudgements). Since dependent types are not available in +parameter type definitions, the use of variables is never necessary. The +types in the context must themselves be **parameter types**, which are +defined as follows: + +- Given the judgement `param` *P* \..., *P* is a parameter type. +- A record type of parameter types is a parameter type. +- `Ints` *n* (an initial segment of integers) is a parameter type. + +The names defined by a parameter type definition include both the type +name *P* and the constructor names *C*~i~. Therefore all these names +must be distinct in a module. + +A parameter type may not be recursive, i.e. *P* itself may not occur in +the contexts of its constructors. This restriction extends to mutual +recursion: we say that *P* **depends** on the types that occur in the +contexts of its constructors and on all types that those types depend +on, and state that *P* may not depend on itself. + +In an `interface module`, it is possible to declare a parameter type +without defining it, + +`param` *P* `;` + + +### Parameter values + +[]{#paramvalues} + +All parameter types are finite, and the GF compiler will internally +compute them to **lists of parameter values**. These lists are formed by +traversing the `param` definitions, usually respecting the order of +constructors in the source code. For records, bibliographical sorting is +applied. However, both the order of traversal of `param` definitions and +the order of fields in a record are specified in a compiler-internal +way, which means that the programmer should not rely on any particular +order. + +The order of the list of parameter values can affect the program in two +cases: + +- in the default `lindef` definition ([here](#lindefjudgements)), the + first value is chosen +- in course-of-value tables ([here](#tables)), the compiler-internal + order is followed + +The first usage implies that, if `lindef` definitions are essential for +the application, they should be given manually. The second usage implies +that course-of-value tables should be avoided in hand-written GF code. + +In run-time grammar generation, all parameter values are translated to +integers denotions positions in these parameter lists. + + +### Operation definitions, oper + +An operation definition, + +`oper` *h* `:` *T* `=` *t* + +defines an **operation** *h* of type *T*, with the computation rule + +*h* ==\> *t* + +The type *T* can be any concrete syntax type, including function types +of any order. The term *t* must have the type *T*, as defined +[here](#expressions). + +As syntactic sugar, the type can be omitted, + +`oper` *h* `=` *t* + +which works in two cases + +- the type can be inferred from *t* (compiler-dependent) +- the definition occurs in an `instance` and the type is given in the + `interface` + +It is also possible to give the type and the definition separately: + +`oper` *h* `:` *T* ; `oper` *h* `=` *t* === `oper` *h* `:` *T* `=` *t* + +The order of the type part and the definition part is free, and there +can be other judgements in between. However, they must occur in the same +`resource` module for it to be complete (as defined +[here](#completeness)). In an `interface` module, it is enough to give +the type. + +When only the definition is given, it is possible to use a shorthand +similar to `lin` judgements: + +`oper` *h* *p* `=` *t* === `oper` *h* `=` `\`*p* `->` *t* + +The pattern *p* is either a variable or a wildcard (`_`). + +Operation definitions may not be recursive, not even mutually recursive. +This condition ensures that functions can in the end be eliminated from +concrete syntax code (as explained [here](#functionelimination)). + + +### Operation overloading + +[]{#overloading} + +One and the same operation name *h* can be used for different +operations, which have to have different types. For each call of *h*, +the type checker selects one of these operations depending on what type +is expected in the context of the call. The syntax of overloaded +operation definitions is + +`oper` *h* `= overload {`*h* : *T*~1~ = *t*~1~ ; \... ; *h* : *T*~n~ = +*t*~n~`}` + +Notice that *h* must be the same in all cases. This format can be used +to give the complete implementation; to give just the types, e.g. in an +interface, one can use the form + +`oper` *h* `: overload {`*h* : *T*~1~ ; \... ; *h* : *T*~n~`}` + +The implementation of this operation typing is given by a judgement of +the first form. The order of branches need not be the same. + + +### Flag definitions, flags + +A flag definition, + +`flags` *o* `=` *v* + +sets the value of the flag *o*, to be used when compiling or using the +module. + +The flag *o* is an identifier, and the value *v* is either an identifier +or a quoted string. + +Flags are a kind of metadata, which do not strictly belong to the GF +language. For instance, compilers do not necessarily check the +consistency of flags, or the meaningfulness of their values. The +inheritance of flags is not well-defined; the only certain rule is that +flags set in the module body override the settings from inherited +modules. + +Here are some flags commonly included in grammars. + + flag value description module + ------------ -------------------- ---------------------------------- ---------- + `coding` character encoding encoding used in string literals concrete + `startcat` category default target of parsing abstract + +The possible values of these flags are specified [here](#flagvalues). +Note that the `lexer` and `unlexer` flags are deprecated. If you need +their functionality, you should use supply them to GF shell commands +like so: + + put_string -lextext "страви, напої" | parse + +A summary of their possible values can be found at the [GF shell +reference](http://www.grammaticalframework.org/doc/gf-shell-reference.html). + + +Types and expressions +--------------------- + + +### Overview of expression forms + +[]{#expressions} + +Like many dependently typed languages, GF makes no syntactic distinction +between expressions and types. An illegal use of a type as an expression +or vice versa comes out as a type error. Whether a variable, for +instance, stands for a type or an expression value, can only be resolved +from its context of use. + +One practical consequence of the common syntax is that global and local +definitions (`oper` judgements and `let` expressions, respectively) work +in the same way for types and expressions. Thus it is possible to +abbreviate a type occurring in a type expression: + + let A = {s : Str ; b : Bool} in A -> A -> A + +Type and other expressions have a system of **precedences**. The +following table summarizes all expression forms, from the highest to the +lowest precedence. Some expressions are moreover left- or +right-associative. + + prec expression example explanation + --------- ------------------------------------ ----------------------------------- + 7 `c` constant or variable + 7 `Type` the type of types + 7 `PType` the type of parameter types + 7 `Str` the type of strings/token lists + 7 `"foo"` string literal + 7 `123` integer literal + 7 `0.123` floating point literal + 7 `?` metavariable + 7 `[]` empty token list + 7 `[C a b]` list category + 7 `["foo bar"]` token list + 7 `{"s : Str ; n : Num}` record type + 7 `{"s = "foo" ; n = Sg}` record + 7 `<Sg,Fem,Gen>` tuple + 7 `<n : Num>` type-annotated expression + 6 left `t.r` projection or qualification + 5 left `f a` function application + 5 `table {Sg => [] ; _ => "xs"}` table + 5 `table P [a ; b ; c]` course-of-values table + 5 `case n of {Sg => [] ; _ => "xs"}` case expression + 5 `variants {"color" ; "colour"}` free variation + 5 `pre {vowel => "an" ; _ => "a"}` prefix-dependent choice + 4 left `t ! v` table selection + 4 left `A * B` tuple type + 4 left `R ** {b : Bool}` record (type) extension + 3 left `t + s` token gluing + 2 left `t ++ s` token list concatenation + 1 right `\x,y -> t` function abstraction (\"lambda\") + 1 right `\\x,y => t` table abstraction + 1 right `(x : A) -> B` dependent function type + 1 right `A -> B` function type + 1 right `P => T` table type + 1 right `let x = v in t` local definition + 1 `t where {x = v}` local definition + 1 `in M.C "foo"` rule by example + +Any expression in parentheses (`(`*exp*`)`) is in the highest precedence +class. + + +### The functional fragment: expressions in abstract syntax + +[]{#functiontype} + +The expression syntax is the same in abstract and concrete syntax, +although only a part of the syntax is actually usable in well-typed +expressions in abstract syntax. An abstract syntax is essentially used +for defining a set of types and a set of functions between those types. +Therefore it needs essentially the **functional fragment** of the +syntax. This fragment comprises two kinds of types: + +- **basic types**, of form *C a1\...an* where + - `cat` *C* (*x*~1~ : *A*~1~)\...(*x*~n~ : *A*~n~), including the + predefined categories `Int`, `Float`, and `String` explained + [here](#predefabs) + - *a*~1~ : *A*~1~,\...,*a*~n~ : *A*~n~{*x*~1~ = + *a*~1~,\...,*x*~n-1~=*a*~n-1~} + +<!-- --> + +- **function types**, of form (*x* : *A*) `->` *B*, where + - *A* is a type + - *B* is a type possibly depending on *x* : *A* + +When defining basic types, we used the notation *t*{*x*~1~ = +*t*~1~,\...,*x*~n~=*t*~n~} for the **substitution** of values to +variables. This is a metalevel notation, which denotes a term that is +formed by replacing the free occurrences of each variable *x*~i~ by +*t*~i~. + +These types have six kinds of expressions: + +- **constants**, *f* : *A* where + - `fun` *f* : *A* + +<!-- --> + +- **literals** for integers, floats, and strings (defined in + [here](#predefabs)) + +<!-- --> + +- **variables**, *x* : *A* where + - *x* has been introduced by a binding + +<!-- --> + +- **applications**, *f a* : *B*{*x*=*a*}, where + - *f* : (*x* : *A*) `->` *B* + - *a* : *A* + +<!-- --> + +- **abstractions**, `\`*x* `->` *b* : (*x* : *A*) `->` *B*, where + - *b* : *B* possibly depending on *x* : *A* + +<!-- --> + +- **metavariables**, `?`, as introduced in intermediate phases of + incremental type checking; metavariables are not permitted in GF + source code + +[]{#variablebinding} + +The notion of **binding** is defined for occurrences of variables in +subexpressions as follows: + +- in (*x* : *A*) `->` *B*, *x* is bound in *B* +- in `\`*x* `->` *b*, *x* is bound in *b* +- in `def` *f* *p*~1~ \... *p*~n~ = *t*, any pattern variable + introduced in any *pi* is bound in *t* (as defined + [here](#patternmatching)) + +As syntactic sugar, function types have sharing of types and suppression +of variables, in the same way as contexts (defined [here](#contexts)): + +- variables can share a type, + `(` *x,y* `:` *A* `)` `->` *B* === `(` *x* `:` *A* `) -> (` *y* `:` + *A* `) ->` *B* +- a **wildcard** can be used for a variable not occurring later in the + type, + `(` `_` `:` *A* `) ->` *B* === `(` *x* `:` *T* `) ->` *B* +- if the variable does not occur later, it can be omitted altogether, + and parentheses are not used, + *A* `->` *B* === `(` *\_* `:` *A* `) ->` *B* + +There is analogous syntactic sugar for constant functions, + +`\`*\_* `->` *t* === `\`*x* `->` *t* + +where *x* does not occur in *t*, and for multiple lambda abstractions: + +`\`*p,q* `->` *t* === `\`*p* `->` `\`*q* `->` *t* + +where *p* and *q* are variables or wild cards (`_`). + + +### Conversions + +Among expressions, there is a relation of **definitional equality** +defined by four **conversion rules**: + +- **alpha conversion**: `\`*x* `->` *b* = `\`*y* `->` *b*{*x*=*y*} + +<!-- --> + +- **beta conversion**: (`\`*x* `->` *b*) *a* = *b*{*x*=*a*} + +<!-- --> + +- **delta conversion**: *f* *a*~1~ \... *a*~n~ = *tg*, if + - there is a definition `def` *f* *p*~1~ \... *p*~n~ = *t* + - this definition is the first for *f* that matches the sequence + *a*~1~ \.... *a*~n~, with the substitution *g* + +<!-- --> + +- **eta conversion**: *c* = `\`*x* `->` *c x*, if *c* : (*x* : *A*) + `->` *B* + +Pattern matching substitution used in delta conversion is defined +[here](#patternmatching). + +An expression is in **beta-eta-normal form** if + +- it has no subexpressions to which beta conversion applies (beta + normality) +- each constant or variable whose type is a function type must be + **eta-expanded**, i.e. made into an abstract equal to it by eta + conversion (eta normality) + +Notice that the iteration of eta expansion would lead to an expression +not in beta-normal form. + + +### Syntax trees + +[]{#syntaxtrees} + +The **syntax trees** defined by an abstract syntax are well-typed +expressions of basic types in beta-eta normal form. Linearization +defined in concrete syntax applies to all and only these expressions. + +There is also a direct definition of syntax trees, which does not refer +to beta and eta conversions: keeping in mind that a type always has the +form + +(*x*~1~ : *A*~1~) `->` \... `->` (*x*~n~ : *A*~n~) `->` *B* + +where *Ai* are types and *B* is a basic type, a syntax tree is an +expression + +*b* *t*~1~ \... *t*~n~ : *B\'* + +where + +- *B\'* is the basic type *B*{*x*~1~ = *t*~1~,\...,*x*~n~ = *t*~n~} +- `fun` *b* : (*x*~1~ : *A*~1~) `->` \... `->` (*x*~n~ : *A*~n~) `->` + *B* +- each *t*~i~ has the form `\`*z*~1~,\...,*z*~m~ `->` *c* where *A*~i~ + is + (*y*~1~ : *B*~1~) `->` \... `->` (*y*~m~ : *B*~m~) `->` *B* + + +### Predefined types in abstract syntax + +[]{#predefabs} + +GF provides three predefined categories for abstract syntax, with +predefined expressions: + + category expressions + ---------- --------------------------------------- + `Int` integer literals, e.g. `123` + `Float` floating point literals, e.g. `12.34` + `String` string literals, e.g. `"foo"` + +These categories take no arguments, and they can be used as basic types +in the same way as if they were introduced in `cat` judgements. However, +it is not legal to define `fun` functions that have any of these types +as value type: their only well-typed expressions are literals as defined +in the above table. + + +### Overview of expressions in concrete syntax + +[]{#cnctypes} + +Concrete syntax is about defining mappings from abstract syntax trees to +**concrete syntax objects**. These objects comprise + +- records +- tables +- strings +- parameter values + +Thus functions are not concrete syntax objects; however, the mappings +themselves are expressed as functions, and the source code of a concrete +syntax can use functions under the condition that they can be eliminated +from the final compiled grammar (which they can; this is one of the +fundamental properties of compilation, as explained in more detail in +the *JFP* article). + +Concrete syntax thus has the same function types and expression forms as +abstract syntax, specified [here](#functiontype). The basic types +defined by categories (`cat` judgements) are available via grammar reuse +explained [here](#reuse); this also comprises the predefined categories +`Float` and `String`. + + +### Values, canonical forms, and run-time variables + +In abstract syntax, the conversion rules fiven [here](#conversions) +define a computational relation among expressions, but there is no +separate notion of a **value** of computation: the value (the end point) +of a computation chain is simply an expression to which no more +conversions apply. In general, we are interested in expressions that +satisfy the conditions of being syntax trees (as defined +[here](#syntaxtrees)), but there can be many computationally equivalent +syntax trees which nonetheless are distinct syntax trees and hence have +different linearizations. The main use of computation in abstract syntax +is to compare types in dependent type checking. + +In concrete syntax, the notion of values is central. At run time, we +want to compute the values of linearizations; at compile time, we want +to perform **partial evaluation**, which computes expressions as far as +possible. To specify what happens in computation we therefore have to +distinguish between **canonical forms** and other forms of expressions. +The canonical forms are defined separately for each form of type, +whereas the other forms may usually produce expressions of any type. + +[]{#linexpansion} []{#runtimevariables} + +What is done at compile time is the elimination of any noncanonical +forms, except for those depending on **run-time variables**. Run-time +variables are the same as the **argument variables** of linearization +rules, i.e. the variables *x*~1~,\...,*x*~n~ in + +`lin` *f* `= \` *x*~1~,\...,*x*~n~ `->` *t* + +where + +`fun` *f* `:` (*x*~1~ : *A*~1~) `->` \... `->` (*x*~n~ : *A*~n~) `->` +*B* + +Notice that this definition refers to the **eta-expanded** linearization +term, which has one abstracted variable for each argument type of *f*. +These variables are not necessarily explicit in GF source code, but +introduced by the compiler. + +Since certain expression forms should be eliminated in compilation but +cannot be eliminated if run-time variables appear in them, errors can +appear late in compilation. This is an issue with the following +expression forms: + +- gluing (`s + t`), defined [here](#gluing) +- pattern matching on strings, defined [here](#patternmatching) +- predefined string operations, defined [here](#predefcnc) (those + taking `Str` arguments) + + +### Token lists, tokens, and strings + +[]{#strtype} + +The most prominent basic type is `Str`, the type of **token lists**. +This type is often sloppily referred to as the type of **strings**; but +it should be kept in mind that the objects of `Str` are *lists* of +strings rather than single strings. + +Expressions of type `Str` have the following canonical forms: + +- **tokens**, i.e. **string literals**, in double quotes, e.g. `"foo"` +- **the empty token list**, `[]` +- **concatenation**, *s* `++` *t*, where *s,t* : `Str` +- **prefix-dependent choice**, + `pre {p1 => s1 ; ... ; pn => sn ; _ => s }`, where + - *s*, *s*~1~,\...,*s*~n~, *p*~1~,\...,*p*~n~ : `Str` + +For convenience, the notation is overloaded so that tokens are +identified with singleton token lists, and there is no separate type of +tokens (this is a change from the *JFP* article). The notion of a token +is still important for compilation: all tokens introduced by the grammar +must be known at compile time. This, in turn, is required by the parsing +algorithms used for parsing with GF grammars. + +In addition to string literals, tokens can be formed by a specific +non-canonical operator: + +- **gluing**, *s* `+` *t*, where *s,t* : `Str` + +[]{#gluing} + +Being noncanonical, gluing is equipped with a computation rule: string +literals are glued by forming a new string literal, and empty token +lists can be ignored: + +- `"foo" + "bar"` ==\> `"foobar"` +- *t* `+ []` ==\> *t* +- `[] +` *t* ==\> *t* + +Since tokens must be known at compile time, the operands of gluing may +not depend on run-time variables, as defined [here](#runtimevariables). + +As syntactic sugar, token lists can be given as bracketed string +literals, where spaces separate tokens: + +- **token lists**, `["one two three"]` === `"one" ++ "two" ++ "three"` + +Notice that there are no empty tokens, but the expression `[]` can be +used in a context requiring a token, in particular in gluing expression +below. Since `[]` denotes an empty token list, the following computation +laws are valid: + +- *t* `++ []` ==\> *t* +- `[] ++` *t* ==\> *t* + +Moreover, concatenation and gluing are associative: + +- s `+` (t `+` u) ==\> s `+` t `+` u +- s `++` (t `++` u) ==\> s `++` t `++` u + +For the programmer, associativity and the empty token laws mean that the +compiler can use them to simplify string expressions. It also means that +these laws are respected in pattern matching on strings. + +A prime example of prefix-dependent choice operation is the following +approximative expression for the English indefinite article: + + pre { + ("a" | "e" | "i" | "o") => "an" ; + _ => "a" + } ; + +This expression can be computed in the context of a subsequent token: + +- `pre {p1 => s1 ; ... ; pn => sn ; _ => s } ++ t` ==\> + - *s*~i~ for the first *i* such that the prefix *p*~i~ matches + *t*, if it exists + - *s* otherwise + +The **matching prefix** is defined by comparing the string with the +prefix of the token. If the prefix is a variant list of strings, then it +matches the token if any of the strings in the list matches it. + +The computation rule can sometimes be applied at compile time, but it +general, prefix-dependent choices need to be passed to the run-time +grammar, because they are not given a subsequent token to compare with, +or because the subsequent token depends on a run-time variable. + +The prefix-dependent choice expression itself may not depend on run-time +variables. + +*There is an older syntax for prefix-dependent choice, namely: +`pre { s ; s1 / p1 ; ... ; sn / pn}`. This syntax will not accept +strings as patterns.* + +*In GF prior to 3.0, a specific type* `Strs` *is used for defining +prefixes,* *instead of just* `variants` *of* `Str`. + + +### Records and record types + +A **record** is a collection of objects of possibly different types, +accessible by **projections** from the record with **labels** pointing +to these objects. A record is also itself an object, whose type is a +**record type**. Record types have the form + +`{` *r*~1~ : *A*~1~ `;` \... `;` *r*~n~ : *A*~n~ `}` + +where *n* \>= 0, each *A*~i~ is a type, and the labels *r*~i~ are +distinct. A record of this type has the form + +`{` *r*~1~ = *a*~1~ `;` \... `;` *r*~n~ = *a*~n~ `}` + +where each \#aii : \"Aii. A limiting case is the **empty record type** +`{}`, which has the object `{}`, the **empty record**. + +The **fields** of a record type are its parts of the form *r* : *A*, +also called **typings**. The **fields** of a record are of the form *r* += *a*, also called **value assignments**. Value assignments may +optionally indicate the type, as in *r* : *A* = *a*. + +The order of fields in record types and records is insignificant: two +record types (or records) are equal if they have the same fields, in any +order, and a record is an object of a record type, if it has +type-correct value assignments for all fields of the record type. The +latter definition implies the even stronger principle of **record +subtyping**: a record can have any type that has some subset of its +fields. This principle is explained further [here](#subtyping). + +All fields in a record must have distinct labels. Thus it is not +possible e.g. to \"redefine\" a field \"later\" in a record. + +Lexically, labels are identifiers (defined [here](#identifiers)). This +is with the exception of the labels selecting bound variables in the +linearization of higher-order abstract syntax, which have the form +`$`*i* for an integer *i*, as specified [here](#HOAS). In source code, +these labels should not appear in records fields, but only in +selections. + +Labels occur only in syntactic positions where they cannot be confused +with constants or variables. Therefore it is safe to write, as in +`Prelude`, + + ss : Str -> {s : Str} = \s -> {s = s} ; + +A **projection** is an expression of the form + +*t*.*r* + +where *t* must be a record and *r* must be a label defined in it. The +type of the projection is the type of that field. The computation rule +for projection returns the value assigned to that field: + +`{` \... `;` *r* = *a* `;` \... `}.`*r* ==\> *a* + +Notice that the dot notation *t*.*r* is also used for qualified names as +specified [here](#qualifiednames). This ambiguity follows tradition and +convenience. It is resolved by the following rules (before type +checking): + +1. if *t* is a bound variable or a constant in scope, *t*.*r* is + type-checked as a projection +2. otherwise, *t*.*r* is type-checked as a qualified name + +As syntactic sugar, types and values can be shared: + +- `{` \... `;` *r,s* : *A* `;` \... `}` === `{` \... `;` *r* : *A* `;` + *s* : *A* `;` \... `}` +- `{` \... `;` *r,s* = *a* `;` \... `}` === `{` \... `;` *r* = *a* `;` + *s* = *a* `;` \... `}` + +Another syntactic sugar are **tuple types** and **tuples**, which are +translated by endowing their unlabelled fields by the labels `p1`, +`p2`,\... in the order of appearance of the fields: + +- *A*~1~ `*` \... `*` *A*~n~ === `{` `p1` : *A*~1~ `;` \... `;` `pn` : + *A*~n~ `}` +- `<`*a*~1~ `,` \... `,` *a*~n~ `>` === `{` `p1` = *a*~1~`;` \... `;` + `pn` = *a*~n~ `}` + +A **record extension** is formed by adding fields to a record or a +record type. The general syntax involves two expressions, + +*R* `**` *S* + +The result is a record type or a record with a union of the fields of +*R* and *S*. It is therefore well-formed if + +- both *R* and *S* are either records or record types +- the labels in *R* and *S* are disjoint, if *R* and *S* are record + types + +(Since GF version 3.6) If *R* and *S* are record objects, then the +labels in them need not be disjoint. Labels defined in *S* are then +given priority, so that record extensions in fact works as **record +update**. A common pattern of using this feature is + + lin F x ... = x ** {r = ... x.r ...} + +where `x` is a record with many fields, just one of which is updated. +Following the normal binding conditions, `x.r` on the right hand side +still refers to the old value of the `r` field. + + +### Subtyping + +The possibility of having superfluous fields in a record forms the basis +of the **subtyping** relation. That *A* is a subtype of *B* means that +*a : A* implies *a : B*. This is clearly satisfied for records with +superfluous fields: + +- if *R* is a record type without the label *r*, then *R* `** {` *r* : + *A* `}` is a subtype of *R* + +The GF grammar compiler extends subtyping to function types by +**covariance** and **contravariance**: + +- covariance: if *A* is a subtype of *B*, then *C* `->` *A* is a + subtype of *C* `->` *B* +- contravariance: if *A* is a subtype of *B*, then *B* `->` *C* is a + subtype of *A* `->` *C* + +The logic of these rules is natural: if a function is returns a value in +a subtype, then this value is *a fortiori* in the supertype. If a +function is defined for some type, then it is *a fortiori* defined for +any subtype. + +In addition to the well-known principles of record subtyping and co- and +contravariance, GF implements subtyping for initial segments of +integers: + +- if *m* \< *n*, then `Ints` *m* is a subtype of `Ints` *n* +- `Ints` *n* is a subtype of `Integer` + +As the last rule, subtyping is transitive: + +- if *A* is a subtype of *B* and *B* is a subtype of *C*, then *A* is + a subtype of *C*. + + +### Tables and table types + +[]{#tables} + +One of the most characteristic constructs of GF is **tables**, also +called **finite functions**. That these functions are finite means that +it is possible to finitely enumerate all argument-value pairs; this, in +turn, is possible because the argument types are finite. + +A **table type** has the form + +*P* `=>` *T* + +where *P* must be a parameter type in the sense defined +[here](#paramtypes), whereas *T* can be any type. + +Canonical expressions of table types are **tables**, of the form + +`table` `{` *V*~1~ `=>` *t*~1~ ; \... ; *V*~n~ `=>` *t*~n~ `}` + +where *V*~1~,\...,*V*~n~ is the complete list of the parameter values of +the argument type *P* (defined [here](#paramvalues)), and each *t*~i~ is +an expression of the value type *T*. + +In addition to explicit enumerations, tables can be given by **pattern +matching**, + +`table` `{`*p*~1~ `=>` *t*~1~ ; \... ; *p*~m~ `=>` *t*~m~`}` + +where *p*~1~,\....,*p*~m~ is a list of patterns that covers all values +of type *P*. Each pattern *p*~i~ may bind some variables, on which the +expression *t*~i~ may depend. A complete account of patterns and pattern +matching is given [here](#patternmatching). + +A **course-of-values table** omits the patterns and just lists all +values. It uses the enumeration of all values of the argument type *P* +to pair the values with arguments: + +`table` *P* `[`*t*~1~ ; \... ; *t*~n~`]` + +This format is not recommended for GF source code, since the ordering of +parameter values is not specified and therefore a compiler-internal +decision. + +The argument type can be indicated in ordinary tables as well, which is +sometimes helpful for type inference: + +`table` *P* `{` \... `}` + +The **selection** operator `!`, applied to a table *t* and to an +expression *v* of its argument type + +*t* `!` *v* + +returns the first pattern matching result from *t* with *v*, as defined +[here](#patternmatching). The order of patterns is thus significant as +long as the patterns contain variables or wildcards. When the compiler +reorders the patterns following the enumeration of all values of the +argument type, this order no longer matters, because no overlap remains +between patterns. + +The GF compiler performs **table expansion**, i.e. an analogue of eta +expansion defined [here](#conversions), where a table is applied to all +values to its argument type: + +*t* : *P* `=>` *T* ==\> `table` *P* `[`*t* `!` *V*~1~ ; \... ; *t* `!` +*V*~n~`]` + +As syntactic sugar, one-branch tables can be written in a way similar to +lambda abstractions: + +`\\`*p* `=>` *t* === `table {`*p* `=>` *t* `}` + +where *p* is either a variable or a wildcard (`_`). Multiple bindings +can be abbreviated: + +`\\`*p,q* `=>` *t* === `\\`*p* `=>` `\\`*q* `=>` *t* + +**Case expressions** are syntactic sugar for selections: + +`case` *e* `of {`\...`}` === `table {`\...`} !` *e* + + +### Pattern matching + +[]{#patternmatching} + +We will list all forms of patterns that can be used in table branches. +We define their **variable bindings** and **matching substitutions**. + +We start with the patterns available for all parameter types, as well as +for the types `Integer` and `Str`. + +- A constructor pattern *C* *p*~1~\...*p*~n~ binds the union of all + variables bound in the subpatterns *p*~1~,\...,*p*~n~. It matches + any value *C* *V*~1~\...*V*~n~ where each *p*~i~\# matches *V*~i~, + and the matching substitution is the union of these substitutions. +- A record pattern `{` *r*~1~ `=` *p*~1~ `;` \... `;` *r*~n~ `=` + *p*~n~ `}` binds the union of all variables bound in the subpatterns + *p*~1~,\...,*p*~n~. It matches any value `{` *r*~1~ `=` *V*~1~ `;` + \... `;` *r*~n~ `=` *V*~n~ `;` \...`}` where each *p*~i~\# matches + *V*~i~, and the matching substitution is the union of these + substitutions. +- A variable pattern *x* (identifier other than parameter constructor) + binds the variable *x*. It matches any value *V*, with the + substitution {*x* = *V*}. +- The wild card `_` binds no variables. It matches any value, with the + empty substitution. +- A disjunctive pattern *p* `|` *q* binds the intersection of the + variables bound by *p* and *q*. It matches anything that either *p* + or *q* matches, with the first substitution starting with *p* + matches, from which those variables that are not bound by both + patterns are removed. +- A negative pattern `-` *p* binds no variables. It matches anything + that *p* does *not* match, with the empty substitution. +- An alias pattern *x* `@` *p* binds *x* and all the variables bound + by *p*. It matches any value *V* that *p* matches, with the same + substition extended by {*x* = *V*}. + +The following patterns are only available for the type `Str`: + +- A string literal pattern, e.g. `"s"`, binds no variables. It matches + the same string, with the empty substitution. +- A concatenation pattern, *p* `+` *q*, binds the union of variables + bound by *p* and *q*. It matches any string that consists of a + prefix matching *p* and a suffix matching *q*, with the union of + substitutions corresponding to the first match (see below). +- A repetition pattern *p*`*` binds no variables. It matches any + string that can be decomposed into strings that match *p*, with the + empty substitution. + +The following pattern is only available for the types `Integer` and +`Ints` *n*: + +- An integer literal pattern, e.g. `214`, binds no variables. It + matches the same integer, with the empty substitution. + +All patterns must be **linear**: the same pattern variable may occur +only once in them. This is what makes it straightforward to speak about +unions of binding sets and substitutions. + +Pattern matching is performed in the order in which the branches appear +in the source code: the branch of the first matching pattern is +followed. In concrete syntax, the type checker reject sets of patterns +that are not exhaustive, and warns for completely overshadowed patterns. +It also checks the type correctness of patterns with respect to the +argument type. In abstract syntax, only type correctness is checked, no +exhaustiveness or overshadowing. + +It follows from the definition of record pattern matching that it can +utilize partial records: the branch + + {g = Fem} => t + +in a table of type `{g : Gender ; n : Number} => T` means the same as + + {g = Fem ; n = _} => t + +Variables in regular expression patterns are always bound to the **first +match**, which is the first in the sequence of binding lists. For +example: + +- `x + "e" + y` matches `"peter"` with `x = "p", y = "ter"` +- `x + "er"*` matches `"burgerer"` with `x = "burg"` + + +### Free variation + +An expressions of the form + +`variants` `{`*t*~1~ ; \... ; *t*~n~`}` + +where all *t*~i~ are of the same type *T*, has itseld type *T*. This +expression presents *t*~i~,\...,*t*~n~ as being in **free variation**: +the choice between them is not determined by semantics or parameters. A +limiting case is + +`variants {}` + +which encodes a rule saying that there is no way to express a certain +thing, e.g. that a certain inflectional form does not exist. + +A common wisdom in linguistics is that \"there is no free variation\", +which refers to the situation where *all* aspects are taken into +account. For instance, the English negation contraction could be +expressed as free variation, + + variants {"don't" ; "do" ++ "not"} + +if only semantics is taken into account, but if stylistic aspects are +included, then the proper formulation might be with a parameter +distinguishing between informal and formal style: + + case style of {Informal => "don't" ; Formal => "do" ++ "not"} + +Since there is not way to choose a particular element from a +\`\`variants\` list, free variants is normally not adequate in +libraries, nor in grammars meant for natural language generation. In +application grammars meant to parse user input, free variation is a way +to avoid cluttering the abstract syntax with semantically insignificant +distinctions and even to tolerate some grammatical errors. + +Permitting `variants` in all types involves a major modification of the +semantics of GF expressions. All computation rules have to be lifted to +deal with lists of expressions and values. For instance, + +*t* `!` `variants` `{`*t*~1~ ; \... ; *t*~n~`}` ==\> `variants` `{`*t* +`!` *t*~1~ ; \... ; *t* `!` *t*~n~`}` + +This is done in such a way that variation does not distribute to records +(or other product-like structures). For instance, variants of records, + + variants {{s = "Auto" ; g = Neutr} ; {s = "Wagen" ; g = Masc}} + +is *not* the same as a record of variants, + + {s = variants {"Auto" ; "Wagen"} ; g = variants {Neutr ; Masc}} + +Variants of variants are flattened, + +`variants` `{`\...; `variants` `{`*t*~1~ ;\...; *t*~n~`}` ;\...`}` ==\> +`variants` `{`\...; *t*~1~ ;\...; *t*~n~ ;\...`}` + +and singleton variants are eliminated, + +`variants` `{`*t*`}` ==\> *t* + + +### Local definitions + +A **local definition**, i.e. a **let expression** has the form + +`let` *x* : *T* = *t* `in` *e* + +The type of *x* must be *T*, which also has to be the type of *t*. +Computation is performed by substituting *t* for *x* in *e*: + +`let` *x* : *T* = *t* `in` *e* ==\> *e* {*x* = *t*} + +As syntactic sugar, the type can be omitted if the type checker is able +to infer it: + +`let` *x* = *t* `in` *e* + +It is possible to compress several local definitions into one block: + +`let` *x* : *T* = *t* `;` *y* : *U* = *u* `in` *e* === `let` *x* : *T* = +*t* `in` `let` *y* : *U* = *u* `in` *e* + +Another notational variant is a definition block appearing after the +main expression: + +*e* `where` `{`\...`}` === `let` `{`\...`}` `in` *e* + +Curly brackets are obligatory in the `where` form, and can also be +optionally used in the `let` form. + +Since a block of definitions is treated as syntactic sugar for a nested +`let` expression, a constant must be defined before it is used: the +scope is not mutual, as in a module body. Furthermore, unlike in `lin` +and `oper` definitions, it is *not* possible to bind variables on the +left of the equality sign. + + +### Function applications in concrete syntax + +[]{#functionelimination} + +Fully compiled concrete syntax may not include expressions of function +types except on the outermost level of `lin` rules, as defined +[here](#linexpansion). However, in the source code, and especially in +`oper` definitions, functions are the main vehicle of code reuse and +abstraction. Thus function types and functions follow the same rules as +in abstract syntax, as specified [here](#functiontype). In particular, +the application of a lambda abstract is computed by beta conversion. + +To ensure the elimination of functions, GF uses a special computation +rule for pushing function applications inside tables, since otherwise +run-time variables could block their applications: + +(`table` `{`*p*~1~ `=>` *f*~1~ ; \... ; *p*~n~ `=>` *f*~n~ `}` `!` *e*) +*a* ==\> `table` `{`*p*~1~ `=>` *f*~1~ *a* ; \... ; *p*~n~ `=>` *f*~n~ +*a*`}` `!` *e* + +Also parameter constructors with non-empty contexts, as defined +[here](#paramjudgements), result in expressions in application form. +These expressions are never a problem if their arguments are just +constructors, because they can then be translated to integers +corresponding to the position of the expression in the enumaration of +the values of its type. However, a constructor applied to a run-time +variable may need to be converted as follows: + +*C*\...*x*\... ==\> `case` *x* of `{_ =>` *C*\...*x*`}` + +The resulting expression, when processed by table expansion as explained +[here](#tables), results in *C* being applied to just values of the type +of *x*, and the application thereby disappears. + + +### Reusing top-level grammars as resources + +[]{#reuse} + +*This section is valid for GF 3.0, which abandons the \"lock field\"* +*discipline of GF 2.8.* + +As explained [here](#openabstract), abstract syntax modules can be +opened as interfaces and concrete syntaxes as their instances. This +means that judgements are, as it were, translated in the following way: + +- `cat` *C* *G* ===\> `oper` *C* : `Type` +- `fun` *f* : *T* ===\> `oper` *f* : *T* +- `lincat` *C* = *T* ===\> `oper` *C* : `Type` = *C* +- `lin` *f* = *t* ===\> `oper` *f* = *t* + +Notice that the value *T* of `lincat` definitions is not disclosed in +the translation. This means that the type *C* remains abstract: the only +ways of building an object of type *C* are the operations *f* obtained +from *fun* and *lin* rules. + +The purpose of keeping linearization types abstract is to enforce +**grammar checking via type checking**. This means that any well-typed +operation application is also well-typed in the sense of the original +grammar. If the types were disclosed, then we could for instance easily +confuse all categories that have the linearization type `{s : Str}`. Yet +another reason is that revealing the types makes it impossible for the +library programmers to change their type definitions afterwards. + +Library writers may occasionally want to have access to the values of +linearization types. The way to make it possible is to add an extra +construction operation to a module in which the linearization type is +available: + + oper MkC : T -> C = \x -> x + +In object-oriented terms, the type *C* itself is **protected**, whereas +*MkC* is a **public constructor** of *C*. Of course, it is possible to +make these constructors overloaded (concept explained +[here](#overloading)), to enable easy access to special cases. + + +### Predefined concrete syntax types + +[]{#predefcnc} + +The following concrete syntax types are predefined: + +- `Str`, the type of tokens and token lists (defined [here](#strtype)) +- `Integer`, the type of nonnegative integers +- `Ints` *n*, the type of integers from *0* to *n* +- `Type`, the type of (concrete syntax) types +- `PType`, the type of parameter types + +The last two types are, in a way, extended by user-written grammars, +since new parameter types can be defined in the way shown +[here](#paramjudgements), and every paramater type is also a type. From +the point of view of the values of expressions, however, a `param` +declaration does not extend `PType`, since all parameter types get +compiled to initial segments of integers. + +Notice the difference between the concrete syntax types `Str` and +`Integer` on the one hand, and the abstract syntax categories `String` +and `Int`, on the other. As *concrete syntax* types, the latter are +treated in the same way as any reused categories: their objects can be +formed by using syntax trees (string and integer literals). + +*The type name* `Integer` *replaces in GF 3.0 the name* `Int`, *to avoid +confusion with the abstract syntax type and to be analogous* *with the* +`Str` *vs.* `String` *distinction.* + + +### Predefined concrete syntax operations + +The following predefined operations are defined in the resource module +`prelude/Predef.gf`. Their implementations are defined as a part of the +GF grammar compiler. + + ------------------------------------------------------------------------------------------------- + operation type explanation + -------------- --------------------------------- ------------------------------------------------ + `PBool` `PType` `PTrue | PFalse` + + `Error` `Type` the empty type + + `Int` `Type` the type of integers + + `Ints` `Integer -> Type` the type of integers from 0 to n + + `error` `Str -> Error` forms error message + + `length` `Str -> Int` length of string + + `drop` `Integer -> Str -> Str` drop prefix of length + + `take` `Integer -> Str -> Str` take prefix of length + + `tk` `Integer -> Str -> Str` drop suffix of length + + `dp` `Integer -> Str -> Str` take suffix of length + + `eqInt` `Integer -> Integer -> PBool` test if equal integers + + `lessInt` `Integer -> Integer -> PBool` test order of integers + + `plus` `Integer -> Integer -> Integer` add integers + + `eqStr` `Str -> Str -> PBool` test if equal strings + + `occur` `Str -> Str -> PBool` test if occurs as substring + + `occurs` `Str -> Str -> PBool` test if any char occurs + + `show` `(P : Type) -> P -> Str` convert param to string + + `read` `(P : Type) -> Str -> P` convert string to param + + `toStr` `(L : Type) -> L -> Str` find the \"first\" string + + `nonExist` `Str` a special token marking\ + non-existing morphological forms + + `BIND` `Str` a special token marking\ + that the surrounding tokens should not\ + be separated by space + + `SOFT_BIND` `Str` a special token marking\ + that the surrounding tokens may not\ + be separated by space + + `SOFT_SPACE` `Str` a special token marking\ + that the space between the surrounding tokens\ + is optional + + `CAPIT` `Str` a special token marking\ + that the first character in the next token\ + should be capitalized + + `ALL_CAPIT` `Str` a special token marking\ + that the next word should be\ + in all capital letters + ------------------------------------------------------------------------------------------------- + +Compilation eliminates these operations, and they may therefore not take +arguments that depend on run-time variables. + +The module `Predef` is included in the *opens* list of all modules, and +therefore does not need to be opened explicitly. + + +Flags and pragmas +----------------- + + +### Some flags and their values + +[]{#flagvalues} + +The flag `coding` in concrete syntax sets the **character encoding** +used in the grammar. Internally, GF uses unicode, and `.pgf` files are +always written in UTF8 encoding. The presence of the flag `coding=utf8` +prevents GF from encoding an already encoded file. + +The flag `startcat` in abstract syntax sets the default start category +for parsing, random generation, and any other grammar operation that +depends on category. Its legal values are the categories defined or +inherited in the abstract syntax. + + +### Compiler pragmas + +**Compiler pragmas** are a special form of comments prefixed with `--#`. +Currently GF interprets the following pragmas. + + pragma explanation + -------------- --------------------------------- + `-path=`PATH path list for searching modules + +For instance, the line + + --# -path=.:present:prelude:/home/aarne/GF/tmp + +in the top of `FILE.gf` causes the GF compiler, when invoked on +`FILE.gf`, to search through the current directory (`.`) and the +directories `present`, `prelude`, and `/home/aarne/GF/tmp`, in this +order. If a directory `DIR` is not found relative to the working +directory, `$(GF_LIB_PATH)/DIR` is searched. `$GF_LIB_PATH` can be a +colon-separated list of directories, in which case each directory in the +list contributes to the search path expansion. + + +Alternative grammar input formats +--------------------------------- + +While the GF language as specified in this document is the most +versatile and powerful way of writing GF grammars, there are several +other formats that a GF compiler may make available for users, either to +get started with small grammars or to semiautomatically convert grammars +from other formats to GF. Here are the ones supported by GF 2.8 and 3.0. + + +### Old GF without modules + +[]{#oldgf} + +Before GF compiler version 2.0, there was no module system, and all +kinds of judgement could be written in all files, without any headers. +This format is still available, and the compiler (version 2.8) detects +automatically if a file is in the current or the old format. However, +the old format is not recommended because of pure modularity and missing +separate compilation, and also because libraries are not available, +since the old and the new format cannot be mixed. With version 2.8, +grammars in the old format can be converted to modular grammar with the +command + + > import -o FILE.gf + +which rewrites the grammar divided into three files: an abstract, a +concrete, and a resource module. + + +### Context-free grammars + +A quick way to write a GF grammar is to use the context-free format, +also known as BNF. Files of this form are recognized by the suffix +`.cf`. Rules in these files have the form + +*Label* `.` *Cat* `::=` (*String* \| *Cat*)\* `;` + +where *Label* and *Cat* are identifiers and *String* quoted strings. + +There is a shortcut form generating labels automatically, + +*Cat* `::=` (*String* \| *Cat*)\* `;` + +In the shortcut form, vertical bars (`|`) can be used to give several +right-hand-sides at a time. An empty right-hand side means the singleton +of an empty sequence, and not an empty union. + +Just like old-style GF files (previous section), contex-free grammar +files can be converted to modular GF by using the `-o` option to the +compiler in GF 2.8. + + +### Extended BNF grammars + +Extended BNF (`FILE.ebnf`) goes one step further from the shortcut +notation of previous section. The rules have the form + +*Cat* `::=` *RHS* `;` + +where an *RHS* can be any regular expression built from quoted strings +and category symbols, in the following ways: + + RHS item explanation + ----------------- ------------------------ + *Cat* nonterminal + *String* terminal + *RHS* *RHS* sequence + *RHS* `|` *RHS* alternatives + *RHS* `?` optional + *RHS* `*` repetition + *RHS* `+` non-empty repetition\| + +Parentheses are used to override standard precedences, where `|` binds +weaker than sequencing, which binds weaker than the unary operations. + +The compiler generates not only labels, but also new categories +corresponding to the regular expression combinations actually in use. + +Just like `.cf` files (previous section), `.ebnf` files can be converted +to modular GF by using the `-o` option to the compiler in GF 2.8. + + +### Example-based grammars + +**Example-based grammars** (`.gfe`) provide a way to use resource +grammar libraries without having to know the names of functions in them. +The compiler works as a preprocessor, saving the result in a (`.gf`) +file, which can be compiled as usual. + +If a library is implemented as an abstract and concrete syntax, it can +be used for parsing. Calls of library functions can therefore be formed +by parsing strings in the library. GF has an expression format for this, + +`in` *C* *String* + +where *C* is the category in which to parse (it can be qualified by the +module name) and the string is the input to parser. Expressions of this +form are replaced by the syntax trees that result. These trees are +always type-correct. If several parses are found, all but the first one +are given in comments. + +Here is an example, from `GF/examples/animal/`: + + --# -resource=../../lib/present/LangEng.gfc + --# -path=.:present:prelude + + incomplete concrete QuestionsI of Questions = open Lang in { + lincat + Phrase = Phr ; + Entity = N ; + Action = V2 ; + lin + Who love_V2 man_N = in Phr "who loves men" ; + Whom man_N love_V2 = in Phr "whom does the man love" ; + Answer woman_N love_V2 man_N = in Phr "the woman loves men" ; + } + +The `resource` pragma shows the grammar that is used for parsing the +examples. + +Notice that the variables `love_V2`, `man_N`, etc, are actually +constants in the library. In the resulting rules, such as + + lin Whom = \man_N -> \love_V2 -> + PhrUtt NoPConj (UttQS (UseQCl TPres ASimul PPos + (QuestSlash whoPl_IP (SlashV2 (DetCN (DetSg (SgQuant + DefArt)NoOrd)(UseN man_N)) love_V2)))) NoVoc ; + +those constants are nonetheless treated as variables, following the +normal binding conventions, as stated [here](#renaming). + + +The grammar of GF +----------------- + +The following grammar is actually used in the parser of GF, although we +have omitted some obsolete rules still included in the parser for +backward compatibility reasons. + +This document was automatically generated by the *BNF-Converter*. It was +generated together with the lexer, the parser, and the abstract syntax +module, which guarantees that the document matches with the +implementation of the language (provided no hand-hacking has taken +place). + + +The lexical structure of GF +--------------------------- + + +### Identifiers + +Identifiers *Ident* are unquoted strings beginning with a letter, +followed by any combination of letters, digits, and the characters `_ '` +reserved words excluded. + + +### Literals + +Integer literals *Integer* are nonempty sequences of digits. + +String literals *String* have the form `"`*x*`"`}, where *x* is any +sequence of any characters except `"` unless preceded by `\`. + +Double-precision float literals *Double* have the structure indicated by +the regular expression `digit+ '.' digit+ ('e' ('-')? digit+)?` i.e. +two sequences of digits separated by a decimal point, optionally +followed by an unsigned or negative exponent. + + +### Reserved words and symbols + +The set of reserved words is the set of terminals appearing in the +grammar. Those reserved words that consist of non-letter characters are +called symbols, and they are treated in a different way from those that +are similar to identifiers. The lexer follows rules familiar from +languages like Haskell, C, and Java, including longest match and spacing +conventions. + +The reserved words used in GF are the following: + +- `PType` +- `Str` +- `Strs` +- `Type` +- `abstract` +- `case` +- `cat` +- `concrete` +- `data` +- `def` +- `flags` +- `fun` +- `in` +- `incomplete` +- `instance` +- `interface` +- `let` +- `lin` +- `lincat` +- `lindef` +- `linref` +- `of` +- `open` +- `oper` +- `param` +- `pre` +- `printname` +- `resource` +- `strs` +- `table` +- `transfer` +- `variants` +- `where` +- `with` + +The symbols used in GF are the following: + +- `;` +- `=` +- `:` +- `->` +- `{` +- `}` +- `**` +- `,` +- `(` +- `)` +- `[` +- `]` +- `-` +- `.` +- `|` +- `?` +- `<` +- `>` +- `@` +- `!` +- `*` +- `+` +- `++` +- `\` +- `=>` +- `_` +- `$` +- `/` + +### Comments + +Single-line comments begin with `--`. +Multiple-line comments are enclosed with `{-` and `-}`. + + +The syntactic structure of GF +----------------------------- + +Terminal appear as `code`. +The symbols **->** (production), **|** (union) and **eps** (empty rule) belong to the BNF notation. +All other symbols are non-terminals. + + --------------------- -------- ------------------------------------------------------------------------------------ + *Grammar* **->** *\[ModDef\]* + *\[ModDef\]* **->** **eps** + **|** *ModDef* *\[ModDef\]* + *ModDef* **->** *ModDef* `;` + **|** *ComplMod* *ModType* `=` *ModBody* + *ModType* **->** `abstract` *Ident* + **|** `resource` *Ident* + **|** `interface` *Ident* + **|** `concrete` *Ident* `of` *Ident* + **|** `instance` *Ident* `of` *Ident* + **|** `transfer` *Ident* `:` *Open* `->` *Open* + *ModBody* **->** *Extend* *Opens* `{` *\[TopDef\]* `}` + **|** *\[Included\]* + **|** *Included* `with` *\[Open\]* + **|** *Included* `with` *\[Open\]* `**` *Opens* `{` *\[TopDef\]* `}` + **|** *\[Included\]* `**` *Included* `with` *\[Open\]* + **|** *\[Included\]* `**` *Included* `with` *\[Open\]* `**` *Opens* `{` *\[TopDef\]* `}` + *\[TopDef\]* **->** **eps** + **|** *TopDef* *\[TopDef\]* + *Extend* **->** *\[Included\]* `**` + **|** **eps** + *\[Open\]* **->** **eps** + **|** *Open* + **|** *Open* `,` *\[Open\]* + *Opens* **->** **eps** + **|** `open` *\[Open\]* `in` + *Open* **->** *Ident* + **|** `(` *QualOpen* *Ident* `)` + **|** `(` *QualOpen* *Ident* `=` *Ident* `)` + *ComplMod* **->** **eps** + **|** `incomplete` + *QualOpen* **->** **eps** + *\[Included\]* **->** **eps** + **|** *Included* + **|** *Included* `,` *\[Included\]* + *Included* **->** *Ident* + **|** *Ident* `[` *\[Ident\]* `]` + **|** *Ident* `-` `[` *\[Ident\]* `]` + *Def* **->** *\[Name\]* `:` *Exp* + **|** *\[Name\]* `=` *Exp* + **|** *Name* *\[Patt\]* `=` *Exp* + **|** *\[Name\]* `:` *Exp* `=` *Exp* + *TopDef* **->** `cat` *\[CatDef\]* + **|** `fun` *\[FunDef\]* + **|** `data` *\[FunDef\]* + **|** `def` *\[Def\]* + **|** `data` *\[DataDef\]* + **|** `param` *\[ParDef\]* + **|** `oper` *\[Def\]* + **|** `lincat` *\[PrintDef\]* + **|** `lindef` *\[Def\]* + **|** `linref` *\[Def\]* + **|** `lin` *\[Def\]* + **|** `printname` `cat` *\[PrintDef\]* + **|** `printname` `fun` *\[PrintDef\]* + **|** `flags` *\[FlagDef\]* + *CatDef* **->** *Ident* *\[DDecl\]* + **|** `[` *Ident* *\[DDecl\]* `]` + **|** `[` *Ident* *\[DDecl\]* `]` `{` *Integer* `}` + *FunDef* **->** *\[Ident\]* `:` *Exp* + *DataDef* **->** *Ident* `=` *\[DataConstr\]* + *DataConstr* **->** *Ident* + **|** *Ident* `.` *Ident* + *\[DataConstr\]* **->** **eps** + **|** *DataConstr* + **|** *DataConstr* `|` *\[DataConstr\]* + *ParDef* **->** *Ident* `=` *\[ParConstr\]* + **|** *Ident* `=` `(` `in` *Ident* `)` + **|** *Ident* + *ParConstr* **->** *Ident* *\[DDecl\]* + *PrintDef* **->** *\[Name\]* `=` *Exp* + *FlagDef* **->** *Ident* `=` *Ident* + *\[Def\]* **->** *Def* `;` + **|** *Def* `;` *\[Def\]* + *\[CatDef\]* **->** *CatDef* `;` + **|** *CatDef* `;` *\[CatDef\]* + *\[FunDef\]* **->** *FunDef* `;` + **|** *FunDef* `;` *\[FunDef\]* + *\[DataDef\]* **->** *DataDef* `;` + **|** *DataDef* `;` *\[DataDef\]* + *\[ParDef\]* **->** *ParDef* `;` + **|** *ParDef* `;` *\[ParDef\]* + *\[PrintDef\]* **->** *PrintDef* `;` + **|** *PrintDef* `;` *\[PrintDef\]* + *\[FlagDef\]* **->** *FlagDef* `;` + **|** *FlagDef* `;` *\[FlagDef\]* + *\[ParConstr\]* **->** **eps** + **|** *ParConstr* + **|** *ParConstr* `|` *\[ParConstr\]* + *\[Ident\]* **->** *Ident* + **|** *Ident* `,` *\[Ident\]* + *Name* **->** *Ident* + **|** `[` *Ident* `]` + *\[Name\]* **->** *Name* + **|** *Name* `,` *\[Name\]* + *LocDef* **->** *\[Ident\]* `:` *Exp* + **|** *\[Ident\]* `=` *Exp* + **|** *\[Ident\]* `:` *Exp* `=` *Exp* + *\[LocDef\]* **->** **eps** + **|** *LocDef* + **|** *LocDef* `;` *\[LocDef\]* + *Exp6* **->** *Ident* + **|** *Sort* + **|** *String* + **|** *Integer* + **|** *Double* + **|** `?` + **|** `[` `]` + **|** `data` + **|** `[` *Ident* *Exps* `]` + **|** `[` *String* `]` + **|** `{` *\[LocDef\]* `}` + **|** `<` *\[TupleComp\]* `>` + **|** `<` *Exp* `:` *Exp* `>` + **|** `(` *Exp* `)` + *Exp5* **->** *Exp5* `.` *Label* + **|** *Exp6* + *Exp4* **->** *Exp4* *Exp5* + **|** `table` `{` *\[Case\]* `}` + **|** `table` *Exp6* `{` *\[Case\]* `}` + **|** `table` *Exp6* `[` *\[Exp\]* `]` + **|** `case` *Exp* `of` `{` *\[Case\]* `}` + **|** `variants` `{` *\[Exp\]* `}` + **|** `pre` `{` *Exp* `;` *\[Altern\]* `}` + **|** `strs` `{` *\[Exp\]* `}` + **|** *Ident* `@` *Exp6* + **|** *Exp5* + *Exp3* **->** *Exp3* `!` *Exp4* + **|** *Exp3* `*` *Exp4* + **|** *Exp3* `**` *Exp4* + **|** *Exp4* + *Exp1* **->** *Exp2* `+` *Exp1* + **|** *Exp2* + *Exp* **->** *Exp1* `++` *Exp* + **|** `\` *\[Bind\]* `->` *Exp* + **|** `\` `\` *\[Bind\]* `=>` *Exp* + **|** *Decl* `->` *Exp* + **|** *Exp3* `=>` *Exp* + **|** `let` `{` *\[LocDef\]* `}` `in` *Exp* + **|** `let` *\[LocDef\]* `in` *Exp* + **|** *Exp3* `where` `{` *\[LocDef\]* `}` + **|** `in` *Exp5* *String* + **|** *Exp1* + *Exp2* **->** *Exp3* + *\[Exp\]* **->** **eps** + **|** *Exp* + **|** *Exp* `;` *\[Exp\]* + *Exps* **->** **eps** + **|** *Exp6* *Exps* + *Patt2* **->** `_` + **|** *Ident* + **|** *Ident* `.` *Ident* + **|** *Integer* + **|** *Double* + **|** *String* + **|** `{` *\[PattAss\]* `}` + **|** `<` *\[PattTupleComp\]* `>` + **|** `(` *Patt* `)` + *Patt1* **->** *Ident* *\[Patt\]* + **|** *Ident* `.` *Ident* *\[Patt\]* + **|** *Patt2* `*` + **|** *Ident* `@` *Patt2* + **|** `-` *Patt2* + **|** *Patt2* + *Patt* **->** *Patt* `|` *Patt1* + **|** *Patt* `+` *Patt1* + **|** *Patt1* + *PattAss* **->** *\[Ident\]* `=` *Patt* + *Label* **->** *Ident* + **|** `$` *Integer* + *Sort* **->** `Type` + **|** `PType` + **|** `Str` + **|** `Strs` + *\[PattAss\]* **->** **eps** + **|** *PattAss* + **|** *PattAss* `;` *\[PattAss\]* + *\[Patt\]* **->** *Patt2* + **|** *Patt2* *\[Patt\]* + *Bind* **->** *Ident* + **|** `_` + *\[Bind\]* **->** **eps** + **|** *Bind* + **|** *Bind* `,` *\[Bind\]* + *Decl* **->** `(` *\[Bind\]* `:` *Exp* `)` + **|** *Exp4* + *TupleComp* **->** *Exp* + *PattTupleComp* **->** *Patt* + *\[TupleComp\]* **->** **eps** + **|** *TupleComp* + **|** *TupleComp* `,` *\[TupleComp\]* + *\[PattTupleComp\]* **->** **eps** + **|** *PattTupleComp* + **|** *PattTupleComp* `,` *\[PattTupleComp\]* + *Case* **->** *Patt* `=>` *Exp* + *\[Case\]* **->** *Case* + **|** *Case* `;` *\[Case\]* + *Altern* **->** *Exp* `/` *Exp* + *\[Altern\]* **->** **eps** + **|** *Altern* + **|** *Altern* `;` *\[Altern\]* + *DDecl* **->** `(` *\[Bind\]* `:` *Exp* `)` + **|** *Exp6* + *\[DDecl\]* **->** **eps** + **|** *DDecl* *\[DDecl\]* + --------------------- -------- ------------------------------------------------------------------------------------ |
