author aarne <aarne@cs.chalmers.se> 2006-01-27 14:52:20 +0000
committer aarne <aarne@cs.chalmers.se> 2006-01-27 14:52:20 +0000
commit 0839a2ce18aec46862ce741fe40f50fe5c60945e
tree 672ee074a2baf4f619c2b2e4d4d0911def093946 /doc
parent 443f86ba4d5f156472d704ffc83c5cd92a4bd835
gslt talk
Diffstat (limited to 'doc')
-rw-r--r-- doc/Makefile 4
-rw-r--r-- doc/gf-resource.txt 1048
-rw-r--r-- doc/gslt-sem-2006.txt 312
3 files changed, 1364 insertions, 0 deletions
diff --git a/doc/Makefile b/doc/Makefile
new file mode 100644
index 000000000..771212515
--- /dev/null
+++ b/doc/Makefile
@@ -0,0 +1,4 @@
+all:
+ txt2tags gslt-sem-2006.txt
+ htmls gslt-sem-2006.html
+
diff --git a/doc/gf-resource.txt b/doc/gf-resource.txt
new file mode 100644
index 000000000..1b277691a
--- /dev/null
+++ b/doc/gf-resource.txt
@@ -0,0 +1,1048 @@
+GF Resource Grammar Library
+Author: Aarne Ranta <aarne (at) cs.chalmers.se>
+Last update: %%date(%c)
+
+% NOTE: this is a txt2tags file.
+% Create an html file from this file using:
+% txt2tags --toc gf-resource.txt
+
+%!target:html
+
+%!postproc(html): #NEW <!-- NEW -->
+
+
+#NEW
+==GF = Grammatical Framework==
+
+GF is a grammar formalism based on functional programming and type theory.
+
+
+
+GF was designed to be nice for //ordinary programmers// to use: by this
+we mean programmers without training in linguistics.
+
+
+
+The mission of GF is to make natural-language applications available for
+ordinary programmers, in tasks like
+
+- software documentation
+- domain-specific translation
+- human-computer interaction
+- dialogue systems
+
+Thus GF is //not// primarily another theoretical framework for
+linguists.
+
+
+
+#NEW
+==Multilingual grammars==
+
+A GF grammar consists of an abstract syntax and a set
+of concrete syntaxes.
+
+
+
+**Abstract syntax**: language-independent representation
+```
+ cat Prop ; Nat ;
+ fun Even : Nat -> Prop ;
+ fun NInt : Int -> Nat ;
+```
+**Concrete syntax**: mapping from abstract syntax trees to strings in a language
+(English, French, German, Swedish,...)
+```
+ lin Even x = {s = x.s ++ "is" ++ "even"} ;
+
+ lin Even x = {s = x.s ++ "est" ++ "pair"} ;
+
+ lin Even x = {s = x.s ++ "ist" ++ "gerade"} ;
+
+ lin Even x = {s = x.s ++ "är" ++ "jämnt"} ;
+```
+We can **translate** between languages via the abstract syntax:
+```
+ 4 is even 4 ist gerade
+ \ /
+ Even (NInt 4)
+ / \
+ 4 est pair 4 är jämnt
+```
+
+
+
+But is it really so simple?
+
+
+#NEW
+==Difficulties with concrete syntax==
+
+Most languages have rules of **inflection**, **agreement**,
+and **word order**, which have to be obeyed when putting together
+expressions.
+
+
+
+The previous multilingual grammar breaks these rules in many situations:
+//
+2 and 3 is even
+la somme de 3 et de 5 est pair
+wenn 2 ist gerade, dann 2+2 ist gerade
+om 2 är jämnt, 2+2 är jämnt
+//
+All these sentences are grammatically incorrect.
+
+
+
+#NEW
+==Solving the difficulties==
+
+GF has tools for expressing the linguistic rules that are needed to
+produce correct translations in different languages.
+
+
+
+Instead of just strings, we need **parameters**, **tables**,
+and **record types**. For instance, French:
+```
+ param Mod = Ind | Subj ;
+ param Gen = Masc | Fem ;
+
+ lincat Nat = {s : Str ; g : Gen} ;
+ lincat Prop = {s : Mod => Str} ;
+
+ lin Even x = {s =
+ table {
+ m => x.s ++
+ case m of {Ind => "est" ; Subj => "soit"} ++
+ case x.g of {Masc => "pair" ; Fem => "paire"}
+ }
+ } ;
+```
+To learn more about these constructs, consult GF documentation, e.g. the
+[../../../doc/tutorial/gf-tutorial2.html New Grammarian's Tutorial].
+However, in what follows we will show how to avoid learning them and
+still write linguistically correct grammars.
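+
+The French rule above can also be read as ordinary functional code. The
+following Haskell sketch is only a model of the idea, not GF itself: a table
+becomes a function from a parameter to a string, and a record becomes a
+Haskell record. The names ``evenFr``, ``laSomme``, ``natS``, ``natG`` and
+``propS`` are assumptions made for this illustration.
+```
+ data Mod = Ind | Subj deriving (Show, Eq)
+ data Gen = Masc | Fem deriving (Show, Eq)
+
+ -- Models: lincat Nat = {s : Str ; g : Gen}
+ data Nat = Nat { natS :: String, natG :: Gen }
+
+ -- Models: lincat Prop = {s : Mod => Str}; the table is a function
+ newtype Prop = Prop { propS :: Mod -> String }
+
+ -- Models: lin Even; the copula depends on the mood,
+ -- the adjective on the gender of the argument
+ evenFr :: Nat -> Prop
+ evenFr x = Prop (\m -> unwords [natS x, copula m, adjective (natG x)])
+   where
+     copula Ind  = "est"
+     copula Subj = "soit"
+     adjective Masc = "pair"
+     adjective Fem  = "paire"
+
+ -- Hypothetical example argument (not part of the original grammar)
+ laSomme :: Nat
+ laSomme = Nat "la somme de 3 et de 5" Fem
+```
+With these definitions, ``propS (evenFr laSomme) Ind`` selects the indicative
+form and produces the feminine adjective //paire//, which is the agreement
+that the string-only grammar got wrong.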
+
+
+#NEW
+==Language + Libraries==
+
+Writing natural language grammars still requires
+theoretical knowledge about the language.
+
+
+
+Which kind of a programmer is it easier to find?
+
+- one who can write a sorting algorithm
+- one who can write a grammar for Swedish determiners
+
+
+
+
+In main-stream programming, sorting algorithms are not
+written by hand but taken from **libraries**.
+
+
+
+In the same way, we want to create grammar libraries that encapsulate
+basic linguistic facts.
+
+
+
+Cf. the Java success story: the language is only half of the
+success; the libraries are the other half.
+
+
+
+#NEW
+==Example of library-based grammar writing==
+
+To define a Swedish expression of a mathematical predicate from scratch:
+```
+ Even x =
+ let jämn = case <x.n,x.g> of {
+ <Sg,Utr> => "jämn" ;
+ <Sg,Neutr> => "jämnt" ;
+ <Pl,_> => "jämna"
+ }
+ in
+ {s = table {
+ Main => x.s ! Nom ++ "är" ++ jämn ;
+ Inv => "är" ++ x.s ! Nom ++ jämn ;
+ Sub => x.s ! Nom ++ "är" ++ jämn
+ }
+ }
+```
+To use library functions for syntax and morphology:
+```
+ Even = predA (regA "jämn") ;
+```
+For the French version, we write
+```
+ Even = predA (regA "pair") ;
+```
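+
+To see what the library call hides, here is a toy Haskell model of the two
+functions. It is a sketch under simplifying assumptions: this ``regA`` knows
+only the pattern //jämn, jämnt, jämna//, case inflection of the subject is
+left out, and the type names (``Adjective``, ``NP``, ``Order``) and
+``evenSwe`` are invented for the illustration. The real resource library is
+written in GF and covers far more.
+```
+ data Number = Sg | Pl
+ data Gender = Utr | Neutr
+ data Order  = Main | Inv | Sub
+
+ -- An adjective is a table from number and gender to a word form
+ type Adjective = Number -> Gender -> String
+
+ -- Toy regular paradigm: stem, stem + "t", stem + "a"
+ regA :: String -> Adjective
+ regA stem Sg Utr   = stem
+ regA stem Sg Neutr = stem ++ "t"
+ regA stem Pl _     = stem ++ "a"
+
+ -- A noun phrase carries its string and its agreement features
+ data NP = NP { npS :: String, npN :: Number, npG :: Gender }
+
+ -- Predication: agreement and word order are decided once, in the library
+ predA :: Adjective -> NP -> Order -> String
+ predA adj x order = case order of
+     Main -> unwords [npS x, "är", a]
+     Inv  -> unwords ["är", npS x, a]
+     Sub  -> unwords [npS x, "är", a]
+   where
+     a = adj (npN x) (npG x)
+
+ -- The only line the application programmer has to write
+ evenSwe :: NP -> Order -> String
+ evenSwe = predA (regA "jämn")
+```
+For a neuter singular subject such as ``NP "talet" Sg Neutr``, the ``Main``
+order gives //talet är jämnt// and the ``Inv`` order gives //är talet jämnt//.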
+
+
+
+#NEW
+==Questions in grammar library design==
+
+What should there be in the library?
+
+- morphology, lexicon, syntax, semantics,...
+
+
+
+How do we organize and present the library?
+
+- division into modules, level of granularity
+
+- "school grammar" vs. sophisticated linguistic concepts
+
+
+
+Where do we get the data from?
+
+- automatic extraction or hand-writing?
+
+- reuse of existing resources?
+
+Extra constraint: we want open-source free software and
+hence cannot use existing proprietary resources.
+
+
+#NEW
+==Answers to questions in grammar library design==
+
+The current GF resource grammar library has
+made the following decisions:
+
+The library has, for each language
+
+- complete morphology, some lexicon (500 words), representative fragment of syntax,
+very little semantics,
+
+
+
+Organization and presentation:
+
+- division into top-level (API) modules, and internal modules (only
+interesting for resource implementors)
+
+- the API is, as much as possible, common in different languages
+
+- we favour "school grammar" concepts rather than innovative linguistic theory
+
+
+
+Where do we get the data from?
+
+- morphology and syntax are hand-written
+
+- the 500-word lexicon is hand-written, but a tool is provided
+ for automatic lexicon extraction
+
+- we have not reused existing resources
+
+The resource grammar library is entirely
+open-source free software (under GNU GPL license).
+
+
+
+
+
+#NEW
+==The scope of a resource grammar library for a language==
+
+All morphological paradigms
+
+
+
+Basic lexicon of structural, common, and irregular words
+
+
+
+Basic syntactic structures
+
+
+
+Currently,
+- //no// semantics,
+- //no// language-specific structures if not necessary for expressivity.
+
+
+
+
+
+#NEW
+==Success criteria==
+
+Grammatical correctness
+
+
+
+Semantic coverage: you can express whatever you want.
+
+
+
+Usability as library for non-linguists.
+
+
+
+(Bonus for linguists:) nice generalizations w.r.t. language
+families, using the module system of GF.
+
+
+
+#NEW
+==These are not our success criteria==
+
+Language coverage: to be able to parse all expressions.
+
+Example:
+the French //passé simple// tense, although covered by the
+morphology, is not used in the language-independent API, but
+only the //passé composé// is. However, an application
+accessing the French-specific (or Romance-specific)
+modules can use the passé simple.
+
+
+
+Semantic correctness: only to produce meaningful expressions.
+
+Example: the following sentences can be generated
+```
+ colourless green ideas sleep furiously
+
+ the time is seventy past forty-two
+```
+However, an application grammar can use a domain-specific
+semantics to guarantee semantic well-formedness.
+
+
+
+(Warning for linguists:) theoretical innovation in
+syntax is not among the goals
+(and it would be hidden from users anyway!).
+
+
+
+#NEW
+==So where is semantics?==
+
+GF incorporates a **Logical Framework** and is therefore
+capable of expressing logical semantics //à la// Montague
+or any other flavour, including anaphora and discourse.
+
+
+
+But we do //not// try to give semantics once and
+for all for the whole language.
+
+
+
+Instead, we expect semantics to be given in
+**application grammars** built on semantic models
+of different domains.
+
+
+
+Example application: number theory
+```
+ fun Even : Nat -> Prop ; -- a mathematical predicate
+
+ lin Even = predA (regA "even") ; -- English translation
+ lin Even = predA (regA "pair") ; -- French translation
+ lin Even = predA (regA "jämn") ; -- Swedish translation
+```
+How could the resource predict that just //these//
+translations are correct in this domain?
+
+
+
+Application grammars are built by experts in these domains
+who, thanks to resource grammars, no longer need to be
+experts in linguistics.
+
+
+
+
+
+
+
+#NEW
+==Languages==
+
+The current GF Resource Project covers ten languages:
+
+-``Dan``ish
+-``Eng``lish
+-``Fin``nish
+-``Fre``nch
+-``Ger``man
+-``Ita``lian
+-``Nor``wegian
+-``Rus``sian
+-``Spa``nish
+-``Swe``dish
+
+The first three letters (``Dan`` etc.) are used in grammar module names.
+
+
+
+#NEW
+==Library structure 1: language-independent API==
+
+
+- ``Lang`` is the top module collecting all of the following.
+
+
+
+- syntactic ``Categories`` (parts of speech, word classes), e.g.
+```
+ V ; NP ; CN ; Det ; -- verb, noun phrase, common noun, determiner
+```
+- ``Rules`` for combining words and phrases, e.g.
+```
+ DetNP : Det -> CN -> NP ; -- combine Det and CN into NP
+```
+- the most common ``Structural`` words (determiners,
+conjunctions, pronouns) (now 83), e.g.
+```
+ and_Conj : Conj ;
+```
+- ``Numerals``, number words from 1 to 999,999 with their
+inflections, e.g.
+```
+ n8 : Digit ;
+```
+- ``Basic`` lexicon of (now 218) frequent everyday words
+```
+ man_N : N ;
+```
+
+
+
+In addition, and not included in ``Lang``, there is
+- ``SwadeshLex``, lexicon of (now 206) words from the
+[http://en.wiktionary.org/wiki/Swadesh_List Swadesh list], e.g.
+```
+ squeeze_V : V ;
+```
+Of course, there is some overlap between ``SwadeshLex`` and the other modules.
+
+
+#NEW
+==Library structure 2: language-dependent modules==
+
+- morphological ``Paradigms``, e.g. Swedish
+```
+ mkN : Str -> Str -> Str -> Str -> Gender -> N ; -- worst-case nouns
+ mkN : Str -> N ; -- regular nouns
+```
+- (in some languages) irregular ``Verbs``, e.g.
+```
+ angripa_V = irregV "angripa" "angrep" "angripit" ;
+```
+- (not yet available) ``Ext``ended syntax with language-specific rules
+```
+ PassBli : V2 -> NP -> VP ; -- bli överkörd av ngn
+```
+
+
+
+#NEW
+==How much can be language-independent?==
+
+For the ten languages we have considered, it //is// possible
+to implement the current API.
+
+
+
+Reservations:
+
+- this does not necessarily extend to all other languages
+- this does not necessarily cover the most idiomatic expressions
+ of each language
+- this may not be the easiest API to implement (e.g. negation and
+inversion with //do// in English suggest that some other
+structure would be more natural)
+- it is not guaranteed that the same structure has the same semantics
+in all languages
+
+
+
+#NEW
+==Library structure: language-independent API==
+
+%#center
+ [Lang.gif]
+%#center
+
+
+#NEW
+==API documentation==
+
+[Categories.html Categories]
+
+
+[Rules.html Rules]
+
+
+Two alternative views on sentence formation by predication:
+[Clause.html Clause],
+[Verbphrase.html Verbphrase]
+
+
+[Structural.html Structural]
+
+
+
+[Time.html Time]
+
+
+[Basic.html Basic]
+
+
+
+[Lang.html Lang]
+
+
+
+See also [../../resource-1.0/doc/gfdoc resource v 1.0 documentation],
+now implemented for English, German, and Swedish.
+
+
+
+#NEW
+==Paradigms documentation==
+
+[ParadigmsEng.html English paradigms]
+
+[BasicEng.html example use of English paradigms]
+
+[VerbsEng.html English verbs]
+
+
+
+[ParadigmsFin.html Finnish paradigms]
+
+[BasicFin.html example use of Finnish paradigms]
+
+
+
+[ParadigmsFre.html French paradigms]
+
+[BasicFre.html example use of French paradigms]
+
+[VerbsFre.html French verbs]
+
+
+
+[ParadigmsIta.html Italian paradigms]
+
+[BasicIta.html example use of Italian paradigms]
+
+[BeschIta.html Italian verb conjugations]
+
+
+
+[ParadigmsNor.html Norwegian paradigms]
+
+[BasicNor.html example use of Norwegian paradigms]
+
+[VerbsNor.html Norwegian verbs]
+
+
+[ParadigmsSpa.html Spanish paradigms]
+
+[BasicSpa.html example use of Spanish paradigms]
+
+[BeschSpa.html Spanish verb conjugations]
+
+
+[ParadigmsSwe.html Swedish paradigms]
+
+[BasicSwe.html example use of Swedish paradigms]
+
+[VerbsSwe.html Swedish verbs]
+
+
+
+#NEW
+==Use as top-level grammar: testing==
+
+Import a set of ``LangX`` grammars:
+```
+ i english/LangEng.gf
+ i swedish/LangSwe.gf
+```
+Alternatively, you can ``make`` a precompiled package of
+all the languages by using ``lib/resource/Makefile``:
+```
+ make
+ gf langs.gfcm
+```
+Then you can test with translation, random generation, morphological analysis...
+```
+ > p -lang=LangEng "I have loved her." | l -lang=LangFre
+ Je l' ai aimée.
+
+ > gr -cat=NP | l -multi
+ The sock
+ Strumpan
+ Strømpen
+ La media
+ La calza
+ La chaussette
+ Sukka
+```
+
+
+#NEW
+==Use as top-level grammar: language learning quizzes==
+
+Morpho quiz with words (e.g. French verbs):
+```
+ i french/VerbsFre.gf
+ mq -cat=V
+```
+Morpho quiz with phrases (e.g. Swedish clauses):
+```
+ i swedish/LangSwe.gf
+ mq -cat=Cl
+```
+Translation quiz with sentences (e.g. sentences from English to Swedish):
+```
+ i english/LangEng.gf
+ i swedish/LangSwe.gf
+ tq -cat=S LangEng LangSwe
+```
+
+
+
+
+#NEW
+==Use as library==
+
+Import directly by ``open``:
+```
+ concrete AppNor of App = open LangNor, ParadigmsNor in {...}
+```
+(Note for the users of GF 2.1 and older:
+the dummy ``reuse`` modules and their bulky ``.gfr`` versions
+are no longer needed!)
+
+
+
+If you need to convert resource records to strings, and don't want to know
+the concrete type (as you never should), you can use
+```
+ Predef.toStr : (L : Type) -> L -> Str ;
+```
+``L`` must be a linearization type. For instance,
+```
+ toStr LangNor.CN (ModAP (PositADeg old_ADeg) (UseN car_N))
+ ---> "gammel bil"
+```
+
+
+
+
+#NEW
+==Use as library through parser==
+
+You can use the parser with a ``LangX`` grammar
+when developing a resource.
+
+
+
+Using the ``-v`` option shows if the parser fails because
+of unknown words.
+```
+ > p -cat=S -v -lexer=words "jag ska åka till Chalmers"
+ unknown tokens [TS "åka",TS "Chalmers"]
+```
+Then try to select words that ``LangX`` recognizes:
+```
+ > p -cat=S "jag ska springa till Danmark"
+ UseCl (PosTP TFuture ASimul)
+ (AdvCl (SPredV i_NP run_V)
+ (AdvPP (PrepNP to_Prep (UsePN (PNCountry Denmark)))))
+```
+Use these API structures and extend vocabulary to match your need.
+```
+ åka_V = lexV "åker" ;
+ Chalmers = regPN "Chalmers" neutrum ;
+```
+
+#NEW
+==Syntax editor as library browser==
+
+You can run the syntax editor on ``LangX`` to
+find resource API functions through context-sensitive menus.
+For instance, the shell command
+```
+ gfeditor LangEng.gf LangFre.gf
+```
+opens the editor with English and French views. The
+[http://www.cs.chalmers.se/%7Eaarne/GF2.0/doc/javaGUImanual/javaGUImanual.htm
+Editor User Manual] gives more information on the use of the editor.
+
+
+
+A restriction of the editor is that it does not give access to
+``ParadigmsX`` modules. An IDE environment extending the editor
+to a grammar programming tool is work in progress.
+
+
+
+
+#NEW
+==Example application: a small translation system==
+
+In this system, you can express questions and answers of
+the following forms:
+```
+ Who chases mice ?
+ Whom does the lion chase ?
+ The dog chases cats.
+```
+We build the abstract syntax in two phases:
+
+- [example/Questions.gf Questions] defines question and
+ answer forms independently of domain
+- [example/Animals.gf Animals] defines a lexicon with
+ animals and things that animals do.
+
+
+
+
+The concrete syntax of English is built in three phases:
+
+- [example/HandQuestionsI.gf QuestionsI] is a parametrized module
+ using the API module ``Resource``.
+- [example/QuestionsEng.gf QuestionsEng] is an instantiation
+ of the API with ``ResourceEng``.
+- [example/AnimalsEng.gf AnimalsEng] is a concrete syntax
+ of ``Animals`` using ``ParadigmsEng`` and ``VerbsEng``.
+
+
+
+
+The concrete syntax of Swedish is built upon ``QuestionsI``
+in a similar way, with the modules
+[example/QuestionsSwe.gf QuestionsSwe] and
+[example/AnimalsSwe.gf AnimalsSwe].
+
+
+
+The concrete syntax of French consists similarly of the modules
+[example/QuestionsFre.gf QuestionsFre] and
+[example/AnimalsFre.gf AnimalsFre].
+
+
+
+
+#NEW
+==Compiling the example application==
+
+The resources are bulky, and it therefore takes a lot of
+time and memory to load the grammars. However, they can be
+compiled into the ``gfcm``
+(**GF canonical multilingual**) format,
+which is almost one thousand times smaller and faster to load
+for this set of grammars.
+
+
+
+To produce an end-user multilingual grammar ``animals.gfcm``,
+write the sequence of compilation commands in a ``gfs`` (**GF script**)
+file, say
+[example/mkAnimals.gfs ``mkAnimals.gfs``],
+and then call GF with
+```
+ gf <mkAnimals.gfs
+```
+To try out the grammar,
+```
+ > i animals.gfcm
+
+ > gr | l -multi
+ vem jagar hundar ?
+ qui chasse des chiens ?
+ who chases dogs ?
+```
+
+
+#NEW
+
+==Grammar writing by examples==
+
+(New in GF 2.3)
+
+
+
+You can use the resource grammar as a parser on a special file format,
+``.gfe`` ("GF examples"). Here is the real source,
+[example/QuestionsI.gfe QuestionsI.gfe], which
+generated
+[example/QuestionsI.gf QuestionsI.gf]
+when you executed the GF command
+```
+ i -ex AnimalsEng.gf
+```
+Since ``QuestionsI`` is an incomplete module ("functor"),
+it need only be built once. This is why only the first
+command in ``mkAnimals.gfs`` needs the flag ``-ex``.
+
+
+
+Of course, the grammar of any language can be created by
+parsing any language, as long as they have a common resource API.
+The use of the English resource is generally recommended, because it
+is smaller and faster to parse with than the other languages.
+
+
+#NEW
+==Constants and variables in examples==
+
+The file [example/QuestionsI.gfe QuestionsI.gfe] uses
+as resource ``LangEng``, which contains all resource syntax and
+a lexicon of ca. 300 words. A linearization rule, such as
+```
+ lin Who love_V2 man_N = in Phr "who loves men ?" ;
+```
+uses, as its argument variables, constants for words that can be found in
+the lexicon. This is what makes it possible to parse the example.
+When the resulting rule,
+```
+ lin Who love_V2 man_N =
+ QuestPhrase (UseQCl (PosTP TPresent ASimul)
+ (QPredV2 who8one_IP love_V2 (IndefNumNP NoNum (UseN man_N)))) ;
+```
+is read by the GF compiler, the identifiers ``love_V2`` and
+``man_N`` are not treated as constants, but, following
+the normal binding rules of functional languages, as bound variables.
+This is what gives the example method the generality that is needed.
+
+
+
+To write linearization rules by examples one thus has to know at
+least one abstract syntax constant for each category for which
+one needs a variable.
+
+
+
+#NEW
+==Extending the lexicon on the fly==
+
+The greatest limitation of the example method is that the lexicon
+may lack many of the words that are needed in examples. If parsing
+fails because of this, the compiler gives a list of unknown words
+in its error message. An obvious solution is,
+of course, to extend the resource lexicon and try again.
+A more light-weight solution is to add a **substitution** to
+the example. For instance, if you want the example "the pope"
+but the lexicon does not have the word "pope", you can write
+```
+ lin Pope = in NP "the man" {man_N = regN "pope"} ;
+```
+The resulting linearization rule is initially
+```
+ lin Pope = DefOneNP (UseN man_N) ;
+```
+but the substitution changes this to
+```
+ lin Pope = DefOneNP (UseN (regN "pope")) ;
+```
+In this way, you do not have to extend the resource lexicon, but you
+need to open the Paradigms module to compile the resulting term.
+
+
+
+Of course, the substituted expressions may come from another language
+than the main language of the example:
+```
+ lin Pope = in NP "the man" {man_N = regN "pape" masculine} ;
+```
+If many substitutions are needed, semicolons are used as separators:
+```
+ {man_N = regN "pope" ; walk_V = regV "pray"} ;
+```
+
+
+#NEW
+==Implementation details: low-level files==
+
+**For developers of resource grammars.**
+The modules listed in this section should never be imported in application
+grammars.
+
+
+
+Each of the API implementations uses the following auxiliary resource modules:
+
+- ``Types``, the morphological paradigms and word classes
+- ``Morpho``, inflection machinery
+- ``Syntax``, complex categories and their combinations
+
+In addition, the following language-independent modules from ``lib/prelude``
+are used.
+
+- ``Predef``, operations whose definitions are hard-coded in GF
+- ``Prelude``, generic string and boolean operations
+- ``Coordination``, coordination structures for arbitrary categories
+
+
+
+#NEW
+==Implementation details: the structure of low-level files==
+
+%#center
+ [Low.gif]
+%#center
+
+
+#NEW
+==How to change a resource grammar?==
+
+In many cases, the source of a bug is in one of
+the low-level modules. Try to trace it back there
+by starting from the high-level module.
+
+
+
+(Much more to be written...)
+
+
+#NEW
+==How to write a resource grammar?==
+
+Start with a more limited goal, e.g. to implement
+the ``stoneage`` grammar (``examples/stoneage``)
+for your language.
+
+
+
+For this, you need
+
+- most of ``Types``
+- most of ``Morpho``
+- some of ``Syntax``
+- most of ``Paradigms``
+
+
+
+
+A useful command to test ``oper``s:
+```
+ i -retain MorphoRot.gf
+ cc regNoun "foo"
+```
+
+
+
+See also [../../resource-1.0/doc/Resource-HOWTO.html Resource-HOWTO]
+(under construction).
+
+
+#NEW
+==The use of parametrized modules==
+
+In two language families, a lot of code is shared.
+- Romance: French, Italian, Spanish
+- Scandinavian: Danish, Norwegian, Swedish
+
+
+The structure looks like this.
+
+ []
+
+
+#NEW
+==Current status==
+
+ | Language | v0.6 | v0.9 | v1.0 | Paradigms | Lexicon | Verbs |
+ | Danish | - | X | - | - | - | -
+ | English | X | X | X | X | X | X
+ | Finnish | X | + | - | X | X | 0
+ | French | X | X | X | X | X | X
+ | German | X | - | X | X | - | -
+ | Italian | X | X | - | X | X | X
+ | Norwegian | - | X | X | X | X | X
+ | Russian | X | X | - | * | - | -
+ | Spanish | - | X | - | X | X | X
+ | Swedish | X | X | X | X | X | X
+
+X = implemented (few exceptions may occur)
+
++ = implemented for a large part
+
+* = linguistic material ready for implementation
+
+- = not implemented
+
+0 = not applicable
+
+
+#NEW
+==Known bugs and limitations==
+
+(//The listed limitations are ones that do not follow from the table on
+the previous page//.)
+
+Danish
+
+English:
+missing uncontracted negations.
+
+Finnish:
+compiling the heuristic paradigms is slow;
+possessive and interrogative suffixes have no proper lexer.
+
+French:
+no inverted questions;
+some verbs in Basic should be reflexive
+
+German
+
+Italian:
+no omission of unstressed subject pronouns;
+some verbs in Basic should be reflexive;
+bad forms of reflexive infinitives
+
+Norwegian:
+possessives of type //bilen min// not included
+
+Russian
+
+Spanish:
+no omission of unstressed subject pronouns;
+no switch to dative case for human objects;
+some verbs in Basic should be reflexive;
+bad forms of reflexive infinitives;
+spurious parameter for verb auxiliary inherited from Romance
+
+
+Swedish:
+
+
+
+#NEW
+==Obtaining it==
+
+Get the grammar package from
+[http://sourceforge.net/project/showfiles.php?group_id=132285
+GF Download Page]. The current libraries are in
+``lib/resource``. Version 0.6 is in
+``lib/resource-0.6``.
+
+
+
+The very latest version of GF and its libraries is in
+[http://www.cs.chalmers.se/~bringert/gf/downloads/snapshots/ Snapshots].
+
diff --git a/doc/gslt-sem-2006.txt b/doc/gslt-sem-2006.txt
new file mode 100644
index 000000000..a98959b18
--- /dev/null
+++ b/doc/gslt-sem-2006.txt
@@ -0,0 +1,312 @@
+Grammars as Software Libraries
+Author: Aarne Ranta <aarne (at) cs.chalmers.se>
+Last update: %%date(%c)
+
+% NOTE: this is a txt2tags file.
+% Create an html file from this file using:
+% txt2tags --toc gslt-sem-2006.txt
+
+%!target:html
+
+%!postproc(html): #NEW <!-- NEW -->
+
+#NEW
+
+==Software Libraries==
+
+The main device of **division of labour** in programming.
+
+Instead of writing a sorting algorithm over and over again,
+the programmers take it from a library. You write (in Haskell),
+```
+ Data.List.sort xs
+```
+instead of a lot of code actually implementing sorting.
+
+Practical advantages:
+- division of labour
+- faster development of new software
+
+
+#NEW
+
+==Abstraction==
+
+Libraries promote **abstraction**: you abstract away from details.
+
+The use of libraries is therefore a good programming style.
+
+It is also **scientifically interesting** to create libraries:
+you have to think about abstractions on your domain of expertise.
+
+Notice: libraries can bring abstraction to almost any language,
+as long as it supports functions or macros.
+
+
+#NEW
+
+==Grammars as libraries?==
+
+Example: we want to create a GUI (Graphical User Interface) button
+that says //yes//, and **localize** it to different languages:
+```
+ Yes Ja Kyllä Oui Ja Sì
+```
+Possible ways to do this:
++ Go around dictionaries to find the word in different languages
+```
+ yesButton english = button "Yes"
+ yesButton swedish = button "Ja"
+ yesButton finnish = button "Kyllä"
+```
++ Hire more programmers to perform localization in different languages
++ Use a library ``GUIText`` such that you can write
+```
+ yesButton lang = button (render lang GUIText.Yes)
+```
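+
+
+The third option can be sketched in a few lines of Haskell. This is a
+hypothetical miniature of a ``GUIText`` library, written as a single module,
+so ``Yes`` appears unqualified. The types ``Language`` and ``Button`` and the
+assignment of the six words above to languages are assumptions made for the
+illustration.
+```
+ -- Languages supported by the hypothetical library
+ data Language = English | Swedish | Finnish | French | German | Italian
+   deriving (Show, Eq)
+
+ -- Language-independent GUI texts (only one in this sketch)
+ data GUIText = Yes
+   deriving (Show, Eq)
+
+ -- The library author, not the application programmer, knows the words
+ render :: Language -> GUIText -> String
+ render English Yes = "Yes"
+ render Swedish Yes = "Ja"
+ render Finnish Yes = "Kyllä"
+ render French  Yes = "Oui"
+ render German  Yes = "Ja"
+ render Italian Yes = "Sì"
+
+ -- A button is modelled as nothing more than its label
+ newtype Button = Button String
+   deriving (Show, Eq)
+
+ button :: String -> Button
+ button = Button
+
+ -- Application code: no word of any language appears here
+ yesButton :: Language -> Button
+ yesButton lang = button (render lang Yes)
+```
+Adding a language then means extending ``render`` in the library, while
+``yesButton`` stays unchanged.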
+
+
+
+#NEW
+
+==A slightly more advanced example==
+
+This is what you often see as feedback from a program:
+```
+ You have 1 messages.
+```
+Or perhaps with a little more thought:
+```
+ You have 1 message(s).
+```
+The code that should be written is of course
+```
+ mess n = "You have" +++ show n +++ messages ++ "."
+ where
+ messages = if n==1 then "message" else "messages"
+```
+(E.g. VoiceXML gives good support for this.)
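+
+The operator ``+++`` is not part of the Haskell Prelude. A self-contained
+version of the code, assuming that ``+++`` means "concatenate with a single
+space in between", could look like this:
+```
+ -- Join two strings with one space; binds tighter than (++)
+ infixr 6 +++
+ (+++) :: String -> String -> String
+ a +++ b = a ++ " " ++ b
+
+ -- Choose the noun form according to the number
+ mess :: Int -> String
+ mess n = "You have" +++ show n +++ messages ++ "."
+   where
+     messages = if n == 1 then "message" else "messages"
+```
+With this definition, ``mess 1`` yields "You have 1 message." and ``mess 3``
+yields "You have 3 messages.".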
+
+
+#NEW
+
+==Problems with the more advanced example==
+
+The same as with "Yes": you have to know the words "you",
+"have", "message".
+
+//Moreover//, you have to know the inflection of the equivalent
+of "message":
+```
+ if n==1 then "meddelande" else "meddelanden"
+```
+//Moreover//, you have to know the congruence with different numbers
+(e.g. Russian, Arabic):
+```
+ if n==1 then "m" else
+ if n==2 then "mein" else "moun"
+```
+You also have to know the case required by the verb "have"
+(e.g. Finnish: nominative in singular, partitive in plural).
+
+//Moreover//, you have to know what is the proper way to politely
+address the user:
+```
+ Du har 3 meddelanden / Ni har 3 meddelanden
+ Vous avez 3 messages / Tu as 3 messages
+```
+(This can also depend on country and the kind of program.)
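+
+A small Haskell sketch shows how this knowledge leaks into application code
+when there is no library. It covers only English and Swedish, uses the
+familiar //du// form, and all names (``Language``, ``messageWord``, ``mess``)
+are assumptions made for the illustration.
+```
+ data Language = English | Swedish
+
+ -- Inflection: the programmer must know the plural in every language
+ messageWord :: Language -> Int -> String
+ messageWord English n = if n == 1 then "message" else "messages"
+ messageWord Swedish n = if n == 1 then "meddelande" else "meddelanden"
+
+ -- Words and form of address: again one hand-written case per language
+ mess :: Language -> Int -> String
+ mess English n = "You have " ++ show n ++ " " ++ messageWord English n ++ "."
+ mess Swedish n = "Du har " ++ show n ++ " " ++ messageWord Swedish n ++ "."
+```
+Every further language adds a clause to each function, and languages with
+more number distinctions or with case government need more than a two-way
+``if``.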
+
+
+#NEW
+
+==A library-based solution==
+
+In analogy with the "Yes" case, you write
+```
+ mess lang n = render lang (MailText.YouHaveMessages n)
+```
+Hmm, is this so smart? What if you want to say
+```
+ You have 4 documents.
+ You have 5 jewels.
+ I have 7 surprises.
+```
+It is time to move from **canned text** to a **grammar**.
+
+
+
+#NEW
+
+==An improved library-based solution==
+
+You may want to write
+```
+ mess lang n = render lang (Have PolYou (Num n Message))
+ sword lang n = render lang (Have FamYou (Num n Sword))
+ surpr lang n = render lang (Have I (Num n Surprise))
+```
+For this purpose, you need a library with the following API
+(Application Programmer's Interface):
+```
+ Have : NounPhrase -> NounPhrase -> Sentence
+
+ PolYou, FamYou, I : NounPhrase
+
+ Num : Int -> Noun -> NounPhrase
+
+ Message, Sword, Surprise : Noun
+```
+You also need a top-level rendering function
+```
+ render : Language -> Sentence -> String
+```
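+
+As a rough sketch, this API can be modelled with Haskell data types and an
+English-only ``render``. The helper names (``nounForm``, ``renderNP``,
+``verbHave``, ``capitalize``) and the naive plural rule (add //s//) are
+assumptions made for the illustration. A real grammar library has to do much
+more.
+```
+ import Data.Char (toUpper)
+
+ data Language   = English deriving (Show, Eq)
+ data Noun       = Message | Sword | Surprise deriving (Show, Eq)
+ data NounPhrase = PolYou | FamYou | I | Num Int Noun deriving (Show, Eq)
+ data Sentence   = Have NounPhrase NounPhrase deriving (Show, Eq)
+
+ -- Noun inflection: singular for 1, naive plural otherwise
+ nounForm :: Noun -> Int -> String
+ nounForm noun n
+   | n == 1    = singular
+   | otherwise = singular ++ "s"
+   where
+     singular = case noun of
+       Message  -> "message"
+       Sword    -> "sword"
+       Surprise -> "surprise"
+
+ -- English has one word for both the polite and the familiar "you"
+ renderNP :: NounPhrase -> String
+ renderNP PolYou       = "you"
+ renderNP FamYou       = "you"
+ renderNP I            = "I"
+ renderNP (Num n noun) = show n ++ " " ++ nounForm noun n
+
+ -- Subject-verb agreement: "has" only for a singular counted subject
+ verbHave :: NounPhrase -> String
+ verbHave (Num 1 _) = "has"
+ verbHave _         = "have"
+
+ capitalize :: String -> String
+ capitalize []       = []
+ capitalize (c : cs) = toUpper c : cs
+
+ render :: Language -> Sentence -> String
+ render English (Have subj obj) =
+   capitalize (renderNP subj) ++ " " ++ verbHave subj ++ " "
+     ++ renderNP obj ++ "."
+
+ -- Application code, as on the slide
+ mess :: Language -> Int -> String
+ mess lang n = render lang (Have PolYou (Num n Message))
+```
+The application function ``mess`` mentions no English word and no inflection
+rule. All of that sits behind ``render``.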
+
+
+#NEW
+
+==An optimal solution?==
+
+The library API for language will certainly grow big and become
+difficult to use. Why couldn't I just write
+```
+ mess lang n = render lang (parse english "you have n messages")
+```
+To this end, the API should provide the top-level function
+```
+ parse : Language -> String -> Sentence
+```
+The library that we will present actually has this as well!
+
+The only complication is that ``parse`` does not always return
+just one sentence. There may be zero:
+```
+ you have n mesaggse
+```
+or many:
+```
+ Have PolYou (Num n Message)
+ Have FamYou (Num n Message)
+ Have PlurYou (Num n Message)
+```
+
+
+#NEW
+
+==The components of a grammar library==
+
+The library has **construction functions** like
+```
+ Have : NounPhrase -> NounPhrase -> Sentence
+ PolYou : NounPhrase
+```
+These functions build **grammatical structures**, which
+can have different realizations in different languages.
+
+Therefore we also need **realization functions**,
+```
+ render : Language -> Sentence -> String
+ parse : Language -> String -> [Sentence]
+```
+Both of them require major linguistic expertise to write - but,
+once this is done, they can be used with very little linguistic
+knowledge by application programmers!
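+
+The relation between the two realization functions can be illustrated with a
+deliberately naive Haskell sketch: ``parse`` enumerates a small, finite set
+of candidate structures and keeps those that ``render`` to the input string.
+This generate-and-test strategy is an assumption made for the illustration
+and is not how GF derives its parsers. The candidate range (numbers 0 to 20)
+and the helper names are likewise invented.
+```
+ data Language   = English deriving (Show, Eq)
+ data Noun       = Message | Sword | Surprise
+   deriving (Show, Eq, Enum, Bounded)
+ data NounPhrase = PolYou | FamYou | PlurYou | I | Num Int Noun
+   deriving (Show, Eq)
+ data Sentence   = Have NounPhrase NounPhrase deriving (Show, Eq)
+
+ renderNoun :: Int -> Noun -> String
+ renderNoun n noun = if n == 1 then singular else singular ++ "s"
+   where
+     singular = case noun of
+       Message  -> "message"
+       Sword    -> "sword"
+       Surprise -> "surprise"
+
+ renderNP :: NounPhrase -> String
+ renderNP (Num n noun) = show n ++ " " ++ renderNoun n noun
+ renderNP I            = "I"
+ renderNP _            = "you"  -- PolYou, FamYou and PlurYou coincide
+
+ render :: Language -> Sentence -> String
+ render English (Have subj obj) =
+   unwords [renderNP subj, "have", renderNP obj]
+
+ -- Keep every candidate structure whose rendering equals the input
+ parse :: Language -> String -> [Sentence]
+ parse lang s = [t | t <- candidates, render lang t == s]
+   where
+     candidates =
+       [ Have subj (Num n noun)
+       | subj <- [PolYou, FamYou, PlurYou, I]
+       , n    <- [0 .. 20]
+       , noun <- [minBound .. maxBound]
+       ]
+```
+In this sketch, ``parse English "you have 3 messages"`` returns three
+structures, one for each kind of //you//, while
+``parse English "you have 3 mesaggse"`` returns the empty list.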
+
+
+#NEW
+
+==Implementing a grammar library in GF==
+
+GF = Grammatical Framework
+
+Those who know GF have already seen the introduction as a
+seduction argument for GF.
+
+In GF,
+- construction functions = **abstract syntax**
+- realization functions = **concrete syntax**
+
+
+Example:
+```
+ abstract GUIText = {
+ cat Text ;
+ fun Yes : Text ;
+ }
+ concrete GUITextEng of GUIText = {
+ lin Yes = ss "yes" ;
+ }
+ concrete GUITextFin of GUIText = {
+ lin Yes = ss "kyllä" ;
+ }
+```
+
+
+#NEW
+
+==Linearization and parsing==
+
+The realization function is, for each language, implemented by
+**linearization rules** (``lin``).
+
+The linearization rules directly give the ``render`` method:
+```
+ render english x = GUITextEng.lin x
+```
+The GF formalism moreover has the property of **reversibility**:
+a set of linearization rules automatically generates a parser as
+well.
+
+While reversibility has a minor importance for the applications
+shown above, it is crucial for other applications of GF grammars.
+
+
+#NEW
+
+==Applying GF==
+
+**multilingual grammar** = abstract syntax + concrete syntaxes
+
+Early instances of the idea (from 1998) - **application grammars**:
+- multilingual authoring
+- domain-specific translation
+- dialogue systems
+
+
+Later development (from 2001) - **resource grammars**:
+- grammar libraries with language-independent APIs
+
+
+Of course, one important use of resource grammars is
+to help writing application grammars in GF.
+
+In addition to GF itself, GF grammars can be accessed in
+Haskell, Prolog, and Java programs.
+
+
+#NEW
+
+==Domain, ontology, idiom==
+
+An abstract syntax can represent
+- a **semantic model**
+- an **ontology**
+
+
+The concrete syntax defines how the **concepts** of the ontology
+are represented in natural language (or in a formal language).
+
+The following requirements are made:
+- linguistic correctness (inflection, agreement, word order,...)
+- semantic correctness (express the intended concepts)
+- conformance to the domain idiom (use natural phrasing)
+
+
+Benefit: translation via semantic model of domain can reach high quality.
+
+Problem: the expertise of both a linguist and a domain expert is required.
+
+
+
+
+%http://www.boost.org/
\ No newline at end of file