working on resource.txt

author: aarne <aarne@cs.chalmers.se> 2006-06-12 09:02:59 +0000
committer: aarne <aarne@cs.chalmers.se> 2006-06-12 09:02:59 +0000
commit: be4e21f4b59d0cb1cb5d8bf93d5aaae78070ef57 (patch)
tree: a592f851cacb89b24df5271c5f9dacfc571cae01
parent: d44e420b6edb0f8d48bf92503372815491d11fa3 (diff)
1 files changed, 157 insertions, 3 deletions
diff --git a/doc/resource.txt b/doc/resource.txt
index 7d5640203..795572af0 100644
--- a/doc/resource.txt
+++ b/doc/resource.txt
@@ -1,5 +1,158 @@
 The GF Resource Grammar Library
 
+
+The GF Resource Grammar Library contains grammar rules for
+10 languages (some more are under construction). Its purpose
+is to make these rules available for application programmers,
+who can thereby concentrate on the semantic and stylistic
+aspects of their grammars, without having to think about 
+grammaticality. 
+
+To give an example, an application dealing with
+music players may have a semantic category ``Kind``, examples
+of Kinds being Song and Artist. In German, for instance, Song 
+is linearized into the noun "Lied", but knowing this is not
+enough to make the application work, because the noun must be 
+produced in both singular and plural, and in four different
+cases. By using the resource grammar library, it is enough to
+write
+
+  lin Song = reg2N "Lied" "Lieder" neuter
+
+and the eight forms are correctly generated. The use of the resource
+grammar extends from lexical items to syntax rules. The application
+might also want to modify songs with properties, such as "American",
+"old", "good". The German grammar for adjectival modifications is
+particularly complex, because the adjectives have to agree in gender,
+number, and case, also depending on what determiner is used
+("ein amerikanisches Lied" vs. "das amerikanische Lied"). All this
+variation is taken care of by the resource grammar function
+
+  fun AdjCN : AP -> CN -> CN
+
+and the resource grammar implementation of the rule adding properties
+to kinds is
+
+  lin PropKind kind prop = AdjCN prop kind
+
+given that 
+
+  lincat Prop = AP
+  lincat Kind = CN
+
+The resource library API is divided into language-specific and language-independent
+parts. To put it roughly,
+- syntax is language-independent
+- lexicon is language-specific
+
+
+Thus, to render the above example in French instead of German, we need to
+pick a different linearization of Song,
+
+  lin Song = regGenN "chanson" feminine
+
+But to linearize PropKind, we can use the very same rule as in German.
+The resource function AdjCN has different implementations in the two
+languages, but the application programmer need not care about the difference.
+
+
+
+==To use a resource grammar==
+
+===Parsing===
+
+The intended use of the resource grammar is as a library for writing
+application grammars. It is not designed for e.g. parsing text. There
+are several reasons why this is not so practical:
+- efficiency: the resource grammar uses complex data structures, in
+particular, discontinuous constituents, which make parsing slow and the
+parser size huge
+- completeness: the resource grammar does not necessarily cover all rules
+of the language - only enough of them that it is possible to express everything
+in one way or another
+- lexicon: the resource grammar has a very small lexicon, only meant for test
+purposes
+- semantics: the resource grammar has very little semantic control, and may
+accept strange input or deliver strange interpretations
+- ambiguity: parsing in the resource grammar may return lots of results, many
+of which are implausible
+
+
+All of these problems should be settled in application grammars - the very point
+of resource grammars is to isolate the low-level linguistic details such as
+inflection, agreement, and word order, from semantic questions, which is what
+the application grammarians should solve.
+
+
+===Inflection paradigms===
+
+The inflection paradigms are defined separately for each language L
+in the module ParadigmsL. To test them, the command cc (= compute_concrete)
+can be used:
+
+  > i -retain german/ParadigmsGer.gf
+
+  > cc regN "Schlange"
+  {
+    s : Number => Case => Str = table Number {
+      Sg => table Case {
+        Nom => "Schlange" ;
+        Acc => "Schlange" ;
+        Dat => "Schlange" ;
+        Gen => "Schlange"
+        } ;
+      Pl => table Case {
+        Nom => "Schlangen" ;
+        Acc => "Schlangen" ;
+        Dat => "Schlangen" ;
+        Gen => "Schlangen"
+        }
+      } ;
+    g : Gender = Fem
+  }
+
+
+
+===Syntax rules===
+
+Syntax rules should be looked for in the abstract modules defining the
+API. There are around 10 such modules, each defining constructors for
+a group of one or more related categories. For instance, the module
+Noun defines how to construct common nouns, noun phrases, and determiners.
+Thus the proper place to find out how nouns are modified with adjectives
+is Noun, because the result of the construction is again a common noun.
+
+Browsing the libraries is helped by the gfdoc-generated HTML pages. 
+However, this is still not easy, and the most efficient way is 
+probably to use the parser.
+Even though parsing is not an intended end-user application 
+of resource grammars, it is a useful technique for application grammarians
+to browse the library. To find out what resource function does some
+particular job, you can just parse a string that exemplifies this job. For
+instance, to find out how sentences are built using transitive verbs, write
+
+  > i english/LangEng.gf
+ 
+  > p -cat=Cl -fcfg "she loves him"
+
+  PredVP (UsePron she_Pron) (ComplV2 love_V2 (UsePron he_Pron))
+
+Parsing with the English resource grammar has an acceptable speed, but
+with most languages it takes too many resources even to build the
+parser. However, examples parsed in one language can always be linearized in
+other languages:
+
+  > i italian/LangIta.gf
+
+  > l PredVP (UsePron she_Pron) (ComplV2 love_V2 (UsePron he_Pron))
+
+  lo ama
+
+
+
+
+
+
 ==Overview of linguistic structures==
 
 The outermost linguistic structure is Text. Texts are composed
@@ -57,7 +210,7 @@ the same tree.
 The following syntax tree of the Text "John walks." gives an overview
 of the structural levels.
 
-     Node                     Type of subtree   Alternative constructors
+Node Constructor              Type of subtree   Alternative constructors
 
  1.  TFullStop                : Text            TQuestMark
  2.    (PhrUtt                : Phr             
@@ -134,7 +287,8 @@ Verb: How to construct VPs. The main mechanism is verbs with their arguments:
 - sentence-complement verbs: says that it is cold
 - VP-complement verbs: wants to give her a kiss
 
-A special verb is the copula, "be" in English but not even realized by a verb in all languages.
+A special verb is the copula, "be" in English but not even realized 
+by a verb in all languages.
 A copula can take different kinds of complement: 
 
 - an adjectival phrase: (John is) old
@@ -150,7 +304,7 @@ formed in them:
 - Parts of sentence: Adjective, Adverb, Noun, Verb
 - Cross-cut: Conjunction
 
-Because of mutual recursion such as  embedded sentences, this classification is
+Because of mutual recursion such as embedded sentences, this classification is
 not a complete order. However, no mutual dependence is needed between the 
 modules in a formal sense, but they can all be compiled separately. This is due
 to the module Cat, which defines the type system common to the other modules.