| author | aarne <unknown> | 2005-05-15 19:14:15 +0000 |
|---|---|---|
| committer | aarne <unknown> | 2005-05-15 19:14:15 +0000 |
| commit | 486eed70c57ac584f16abdd07da012aa8b1d4b0b | |
| tree | 260fabb02552430c8ef3ff5458dfd57f3117361d /doc | |
| parent | 3304438e5a1b1bda431e83efa2cd96d186ebaada | |
on resource
Diffstat (limited to 'doc')
| -rw-r--r-- | doc/tutorial/gf-tutorial2.html | 241 |
1 files changed, 239 insertions, 2 deletions
diff --git a/doc/tutorial/gf-tutorial2.html b/doc/tutorial/gf-tutorial2.html
index cc3f6f3d8..22cf7a38e 100644
--- a/doc/tutorial/gf-tutorial2.html
+++ b/doc/tutorial/gf-tutorial2.html
@@ -732,12 +732,249 @@ The graph uses


 <!-- NEW -->
-<h3>Topics still to be written</h3>
+<h3>Resource modules</h3>

-Resource modules, parameter, linearization types, operations
+Suppose we want to say, with the vocabulary included in
+<tt>Paleolithic.gf</tt>, things like
+<pre>
+  the boy eats two snakes
+  all boys sleep
+</pre>
+The new grammatical facilities we need are the plural forms
+of nouns and verbs (<i>boys, sleep</i>), as opposed to their
+singular forms.

 <p>

+The introduction of plural forms requires two things:
+<ul>
+<li> to <b>inflect</b> nouns and verbs in singular and plural number
+<li> to describe the <b>agreement</b> of the verb with the subject: the
+     rule that the verb must have the same number as the subject
+</ul>
+Different languages have different rules of inflection and agreement.
+For instance, Italian also has agreement in gender (masculine vs. feminine).
+We want to be able to ignore such differences in the abstract
+syntax.
+
+<p>
+
+To be able to do all this, we need a couple of new judgement forms,
+a new module form, and a more powerful way of expressing linearization
+rules.
+
+
+<!-- NEW -->
+<h4>Parameters and tables</h4>
+
+We define the <b>parameter type</b> of number in English by
+using a new form of judgement:
+<pre>
+  param Number = Sg | Pl ;
+</pre>
+To express that nouns in English have a linearization
+depending on number, we replace the linearization type <tt>{s : Str}</tt>
+with a type where the <tt>s</tt> field is a <b>table</b> depending on number:
+<pre>
+  lincat CN = {s : Number => Str} ;
+</pre>
+The <b>table type</b> <tt>Number => Str</tt> is in many respects similar to
+a function type (<tt>Number -> Str</tt>). The main restriction is that the
+argument type of a table type must always be a parameter type. This means
+that the argument-value pairs can be listed in a finite table. The following
+example shows such a table:
+<pre>
+  lin Boy = {s = table {
+    Sg => "boy" ;
+    Pl => "boys"
+    }
+  } ;
+</pre>
+The application of a table to a parameter is done by the <b>selection</b>
+operator <tt>!</tt>. For instance,
+<pre>
+  Boy.s ! Pl
+</pre>
+is a selection, whose value is <tt>"boys"</tt>.
+
+
+<!-- NEW -->
+<h4>Inflection tables, paradigms, and <tt>oper</tt> definitions</h4>
+
+All English common nouns are inflected in number, most of them in the
+same way: the plural form is formed from the singular form by adding the
+ending <i>s</i>. This rule is an example of
+a <b>paradigm</b> - a formula telling how the inflection
+forms of a word are formed.
+
+<p>
+
+From the GF point of view, a paradigm is a function that takes a <b>lemma</b> -
+a string also known as a <b>dictionary form</b> - and returns an inflection
+table of the desired type. Paradigms are not functions in the sense of the
+<tt>fun</tt> judgements of abstract syntax (which operate on trees and not
+on strings). Thus we call them <b>operations</b> for the sake of clarity,
+and introduce one more form of judgement, with the keyword <tt>oper</tt>. As an
+example, the following operation defines the regular noun paradigm of English:
+<pre>
+  oper regNoun : Str -> {s : Number => Str} = \x -> {
+    s = table {
+      Sg => x ;
+      Pl => x + "s"
+      }
+    } ;
+</pre>
+Thus an <tt>oper</tt> judgement includes the name of the defined operation,
+its type, and an expression defining it. As for the syntax of the defining
+expression, notice the <b>lambda abstraction</b> form <tt>\x -> t</tt> of
+the function, and the <b>glueing</b> operator <tt>+</tt> telling that
+the string held in the variable <tt>x</tt> and the ending <tt>"s"</tt>
+are written together to form one <b>token</b>.
+
+
+<!-- NEW -->
+<h4>The <tt>resource</tt> module type</h4>
+
+Parameter and operation definitions do not belong to the abstract syntax.
+They can be used when defining concrete syntax - but they are not
+tied to a particular set of linearization rules.
+The proper way to see them is as auxiliary concepts, as <b>resources</b>
+usable in many concrete syntaxes.
+
+<p>
+
+The <tt>resource</tt> module type thus consists of
+<tt>param</tt> and <tt>oper</tt> definitions. Here is an
+example.
+<pre>
+  resource MorphoEng = {
+    param
+      Number = Sg | Pl ;
+    oper
+      Noun : Type = {s : Number => Str} ;
+      regNoun : Str -> Noun = \x -> {
+        s = table {
+          Sg => x ;
+          Pl => x + "s"
+          }
+        } ;
+  }
+</pre>
+Resource modules can extend other resource modules, in the
+same way as modules of other types can extend modules of the
+same type.
+
+
+
+<!-- NEW -->
+<h3>Opening a <tt>resource</tt></h3>
+
+Any number of <tt>resource</tt> modules can be
+<b>opened</b> in a <tt>concrete</tt> syntax, which
+makes the parameter and operation definitions contained
+in the resource usable in the concrete syntax. Here is
+an example, where the resource <tt>MorphoEng</tt> is
+opened in (a fragment of) a new version of <tt>PaleolithicEng</tt>.
+<pre>
+concrete PaleolithicEng of Paleolithic = open MorphoEng in {
+  lincat
+    CN = Noun ;
+  lin
+    Boy = regNoun "boy" ;
+    Snake = regNoun "snake" ;
+    Worm = regNoun "worm" ;
+  }
+</pre>
+Notice that, just like in abstract syntax, function application
+is written by juxtaposition of the function and the argument.
+
+<p>
+
+Using operations defined in resource modules is clearly a concise
+way of giving e.g. inflection tables and other repeated patterns
+of expression. In addition, it enables a new kind of modularity
+and division of labour in grammar writing: grammarians familiar with
+the linguistic details of a language can make this knowledge
+available through resource grammars, whose users only need
+to pick the right operations, without knowing their implementation
+details.
+
+
+
+<!-- NEW -->
+<h4>Worst-case macros and data abstraction</h4>
+
+Some English nouns, such as <tt>louse</tt>, are so irregular that
+it makes little sense to see them as instances of a paradigm. Even
+then, it is useful to perform <b>data abstraction</b> from the
+definition of the type <tt>Noun</tt>, and introduce a constructor
+operation, a <b>worst-case macro</b> for nouns:
+<pre>
+  oper mkNoun : Str -> Str -> Noun = \x,y -> {
+    s = table {
+      Sg => x ;
+      Pl => y
+      }
+    } ;
+</pre>
+Thus we define
+<pre>
+  lin Louse = mkNoun "louse" "lice" ;
+</pre>
+instead of writing the inflection table explicitly.
+
+<p>
+
+The grammar engineering advantage of worst-case macros is that
+the author of the resource module may change the definitions of
+<tt>Noun</tt> and <tt>mkNoun</tt>, and still retain the
+interface (i.e. the system of type signatures) that makes it
+correct to use these functions in concrete modules. In programming
+terms, <tt>Noun</tt> is then treated as an <b>abstract datatype</b>.
+
+
+
+<!-- NEW -->
+<h4>A system of paradigms using <tt>Prelude</tt> operations</h4>
+
+The regular noun paradigm <tt>regNoun</tt> can - and should - of course be defined
+by the worst-case macro <tt>mkNoun</tt>. In addition, some more noun paradigms
+could be defined, for instance,
+<pre>
+  regNoun : Str -> Noun = \snake -> mkNoun snake (snake + "s") ;
+  sNoun : Str -> Noun = \kiss -> mkNoun kiss (kiss + "es") ;
+</pre>
+What about nouns like <i>fly</i>, with the plural <i>flies</i>? The already
+available solution is to use the so-called "technical stem" <i>fl</i> as
+argument, and define
+<pre>
+  yNoun : Str -> Noun = \fl -> mkNoun (fl + "y") (fl + "ies") ;
+</pre>
+But this paradigm would be very unintuitive to use, because the "technical stem"
+is not even an existing form of the word. A better solution is to use
+the string operator <tt>init</tt>, which returns the initial segment (i.e.
+all characters but the last) of a string:
+<pre>
+  yNoun : Str -> Noun = \fly -> mkNoun fly (init fly + "ies") ;
+</pre>
+The operator <tt>init</tt> belongs to a set of operations in the
+resource module <tt>Prelude</tt>, which therefore has to be
+<tt>open</tt>ed so that <tt>init</tt> can be used.
+
+
+
+<!-- NEW -->
+<h4>An intelligent noun paradigm using <tt>case</tt> expressions</h4>
+
+
+
+
+
+
+<!-- NEW -->
+<h2>Topics still to be written</h2>
+
+
 Morpho and translation quiz

 <p>
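The patch introduces `Number`, `Noun`, `mkNoun`, and the paradigms `regNoun`, `sNoun`, and `yNoun` in separate fragments. The following is a minimal sketch of how they might fit together in one resource module; it is not part of the commit. Collecting everything in a single `MorphoEng` module is an assumption made here for illustration, and the sketch relies on `Prelude` providing `init`, as the tutorial text states.

```gf
-- Sketch only: a possible consolidated version of the fragments in the patch.
-- Prelude is opened so that init (all characters but the last) is in scope.
resource MorphoEng = open Prelude in {
  param
    Number = Sg | Pl ;
  oper
    -- The linearization type of nouns: a table from number to string.
    Noun : Type = {s : Number => Str} ;

    -- Worst-case macro: both forms are given explicitly.
    mkNoun : Str -> Str -> Noun = \x,y -> {
      s = table {
        Sg => x ;
        Pl => y
        }
      } ;

    -- Paradigms defined in terms of the worst-case macro.
    regNoun : Str -> Noun = \snake -> mkNoun snake (snake + "s") ;
    sNoun   : Str -> Noun = \kiss  -> mkNoun kiss (kiss + "es") ;
    yNoun   : Str -> Noun = \fly   -> mkNoun fly (init fly + "ies") ;
}
```

Under these definitions, the selection `(yNoun "fly").s ! Pl` should evaluate to `"flies"`, and `(mkNoun "louse" "lice").s ! Pl` to `"lice"`. Because the paradigms go through `mkNoun`, a later change to the representation of `Noun` would only require updating `mkNoun`.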
