From adad6f44b779ab5cfaaa8cc778dfad6e30648529 Mon Sep 17 00:00:00 2001 From: aarne Date: Thu, 2 Feb 2006 13:17:01 +0000 Subject: gf-formalism.html --- doc/gf-formalism.html | 350 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 350 insertions(+) create mode 100644 doc/gf-formalism.html (limited to 'doc/gf-formalism.html') diff --git a/doc/gf-formalism.html b/doc/gf-formalism.html new file mode 100644 index 000000000..e52929c03 --- /dev/null +++ b/doc/gf-formalism.html @@ -0,0 +1,350 @@ + + + + +A Birds-Eye View of GF as a Grammar Formalism + +

A Birds-Eye View of GF as a Grammar Formalism

+ +Author: Aarne Ranta
+Last update: Thu Feb 2 14:16:01 2006 +
+ +

+
+

+ + +

+
+

+

+ +

+

+Abstract. This document gives a general description of the +Grammatical Framework (GF), with comparisons to other grammar +formalisms such as CG, ACG, HPSG, and LFG. +

+

+ +

+ +

GF in a few words

+

+Grammatical Framework (GF) is a grammar formalism +based on constructive type theory. +

+

+GF makes a distinction between abstract syntax and concrete syntax. +

+

+The abstract syntax part of GF is a logical framework, with +dependent types and higher-order functions. +

+

+The concrete syntax is a system of records containing strings and features. +

+

+A GF grammar defines a reversible homomorphism from an abstract syntax to a +concrete syntax. +

+

+A multilingual GF grammar is a set of concrete syntaxes associated with +one abstract syntax. +

+

+GF grammars are written in a high-level functional programming language, +which is compiled into a core language (GFC). +

+

+GF grammars can be used as resources, i.e. as libraries for writing +new grammars; these are compiled and optimized by the method of +grammar composition. +

+

+GF has a module system that supports grammar engineering and separate +compilation. +

+

+ +

+ +

History of GF

+

+1988. Intuitionistic Categorial Grammar; type theory as abstract syntax, +playing the role of Montague's analysis trees. Grammars implemented in Prolog. +

+

+1994. Type-Theoretical Grammar. Abstract syntax organized as a system of +combinators. Grammars implemented in ALF. +

+

+1996. Multilingual Type-Theoretical Grammar. Rules for generating six +languages from the same abstract syntax. Grammars implemented in ALF, ML, and +Haskell. +

+

+1998. The first implementation of GF as a language of its own. +

+

+2000. New version of GF: high-level functional source language, records used +for concrete syntax. +

+

+2003. The module system. +

+

+2004. Ljunglöf's thesis Expressivity and Complexity of GF. +

+

+ +

+ +

Some key ingredients of GF in other grammar formalisms

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
/GFACGLFGHPSGCG
abstract vs concrete syntaxXX?--
type theoryXX--X
records and featuresX-XX-
+ +

+

+ +

+ +

Examples of descriptions in each formalism

+

+To be written... +

+

+ +

+ +

Lambda terms and records

+

+In CS, abstract syntax is trees and concrete syntax is strings. +This works more or less for programming languages. +

+

+In CG, all syntax is lambda terms. +

+

+In Montague grammar, abstract syntax is lambda terms and +concrete syntax is trees. Abstract syntax as lambda terms +can be considered well-established. +

+

+In PATR and HPSG, concrete syntax it records. This can be considered +well-established for natural languages. +

+

+In ACG, both are lambda terms. This is more general than GF, +but reversibility requires linearity restriction, which can be +unnatural for grammar writing. +

+

+In GF, linearization from lambda terms to records is reversible, +and grammar writing is not restricted to linear terms. +

+

+Grammar composition in ACG is just function composition. In GF, +it is more restricted... +

+

+ +

+ +

The structure of GF formalisms

+

+The following diagram (to be drawn properly!) describes the +levels. +

+
+         |   programming language design
+         V
+    GF source language
+         |
+         |   type-directed partial evaluation
+         V
+    GFC assembly language
+         |
+         |   Ljunglöf's translation
+         V
+    MCFG parser
+
+

+The last two phases are nontrivial mathematica properties. +

+

+In most grammar formalisms, grammarians have to work on the GFC +(or MCFG) level. +

+

+Maybe they use macros - they are therefore like macro assemblers. But there +are no separately compiled library modules, no type checking, etc. +

+

+ +

+ +

The expressivity of GF

+

+Parsing complexity is the same as MCFG: polynomial, with +unrestricted exponent depending on grammar. +This is between TAG and HPSG. +

+

+If semantic well-formedness (type theory) is taken into account, +then arbitrary logic can be expressed. The well-formedness of +abstract syntax is decidable, but the well-formedness of a +concrete-syntax string can require an arbitrary proof construction +and is therefore undecidable. +

+

+Separability between AS and CS: like TAG (Tree Adjoining Grammar), GF +has the goal of assigning intended trees for strings. This is +generalized to shared trees for different languages. +

+

+The high-level language strives after the properties of +writability and readability (programming language notions). +

+

+ +

+ +

Grammars and parsing

+

+In many projects, a grammar is just seen as a declarative parsing program. +

+

+For GF, a grammar is primarily the definition of a language. +

+

+Detaching grammars from parsers is a good idea, giving +

+ + +

+Separating abstract from concrete syntax is a prerequisite for this: +we want parsers to return abstract syntax objects, and these must exist +independently of parse trees. +

+

+A possible radical approach to parsing: +use a grammar to generate a treebank and machine-learn +a statistical parser from this. +

+

+Comparison: Steedman in CCG has done something like this. +

+

+ +

+ +

Grammars as software libraries

+

+Reuse for different purposes. +

+

+Grammar composition. +

+

+ +

+ +

Multilinguality

+

+In application grammars, the AS is a semantic +model, and a CS covers domain terminology and idioms. +

+

+This can give publication-quality translation on +limited domains (e.g. the WebALT project). +

+

+Resource grammars with grammar composition lead to +compile-time transfer. +

+

+When is run-time transfer necessary? +

+

+Cf. CLE (Core Language Engine). +

+

+ +

+ +

Parametrized modules

+

+This notion comes from the ML language in the 1980's. +

+

+It can be used for sharing even more code between languages +than their AS. +

+

+Especially, for related languages (Scandinavian, Romance). +

+

+Cf. grammar porting in CLE: what they do with untyped +macro packages GF does with typable interfaces. +

+ + + + -- cgit v1.2.3