summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authoraarne <aarne@cs.chalmers.se>2006-10-30 16:43:13 +0000
committeraarne <aarne@cs.chalmers.se>2006-10-30 16:43:13 +0000
commit8b753833b5858977138c9a622d187ddeec85135c (patch)
treeb1b8d3a93472b3c57157bcc5addf40811c7fe7f6
parentf240612ef4d33640eae8a04b8fe3e369465799c7 (diff)
compilation doc close to final for presentation
-rw-r--r--doc/compiling-gf.txt91
1 files changed, 62 insertions, 29 deletions
diff --git a/doc/compiling-gf.txt b/doc/compiling-gf.txt
index 997b6647a..6f7b60eb8 100644
--- a/doc/compiling-gf.txt
+++ b/doc/compiling-gf.txt
@@ -28,6 +28,10 @@ The grammar gives a declarative description of these functionalities,
on a high abstraction level that improves grammar writing
productivity.
+For efficiency, the grammar is compiled to lower-level formats.
+
+Type checking is another essential compilation phase.
+
#NEW
@@ -53,11 +57,41 @@ Pattern matching.
Higher-order functions.
Dependent types.
+```
+ flip : (a, b, c : Type) -> (a -> b -> c) -> b -> a -> c =
+ \_,_,_,f,y,x -> f x y ;
+```
+
+
+#NEW
+
+==The module system of GF==
-Module system reminiscent of ML (signatures, structures, functors).
+Main division: abstract syntax and concrete syntax
```
+ abstract Greeting = {
+ cat Greet ;
+ fun Hello : Greet ;
+ }
+
+ concrete GreetingEng of Greeting = {
+ lincat Greet = {s : Str} ;
+ lin Hello = {s = "hello"} ;
+ }
+ concrete GreetingIta of Greeting = {
+ param Politeness = Familiar | Polite ;
+ lincat Greet = {s : Politeness => Str} ;
+ lin Hello = {s = table {
+ Familiar => "ciao" ;
+ Polite => "buongiorno"
+ } ;
+ }
```
+Other features of the module system:
+- extension and opening
+- parametrized modules (cf. ML: signatures, structures, functors)
+
@@ -110,7 +144,7 @@ GF includes a complete Logical Framework).
A concrete syntax defines a homomorphism (compositional mapping)
from the abstract syntax to a system of tuples of strings.
-The homomorphism can as such be used as linearization algorithm.
+The homomorphism can as such be used as linearization function.
It is a functional program, but a restricted one, since it works
in the end on finite data structures only.
@@ -136,7 +170,7 @@ The MPCFG grammar can be used for parsing, with (unbounded)
polynomial time complexity (w.r.t. the size of the string).
For these target formats, we have also built interpreters in
-different programming languages (C++, Haskell, Java, Prolog).
+different programming languages (C, C++, Haskell, Java, Prolog).
Moreover, we generate supplementary formats such as grammars
required by various speech recognition systems.
@@ -183,9 +217,7 @@ built for sets of modules: GFCC and the parser formats.)
When the grammar compiler is called, it has a main module as its
argument. It then builds recursively a dependency graph with all
the other modules, and decides which ones must be recompiled.
-The behaviour is rather similar to GHC, and we don't go into
-details (although it would be beneficial to make explicit the
-rules that are right now just in the implementation...)
+The behaviour is rather similar to GHC.
Separate compilation is //extremely important// when developing
big grammars, especially when using grammar libraries. Example: compiling
@@ -271,7 +303,7 @@ This has helped to make the formats portable.
"Almost compositional functions" (``composOp``) are used in
many compiler passes, making them easier to write and understand.
-A grep on the sources reveals 40 uses (outside the definition
+A ``grep`` on the sources reveals 40 uses (outside the definition
of ``composOp`` itself).
The key algorithmic ideas are
@@ -361,8 +393,8 @@ the application inside the table,
```
table {P => f a ; Q = g a} ! x
```
-The transformation is the same as Prawitz's (1965) elimination
-of maximal segments:
+This transformation is the same as Prawitz's (1965) elimination
+of maximal segments in natural deduction:
```
A B
C -> D C C -> D C
@@ -397,7 +429,7 @@ argument by a variable:
```
table { table {
P => t ; ---> x => t[P->x] ;
- Q => t x => t[Q->x]
+ Q => u x => u[Q->x]
} }
```
If the resulting branches are all equal, you can replace the table
@@ -412,6 +444,24 @@ with the lambda abstract is efficient.
#NEW
+==Course-of-values tables==
+
+By maintaining a canonical order of parameters in a type, we can
+eliminate the left hand sides of branches.
+```
+ table { table T [
+ P => t ; ---> P ;
+ Q => u Q
+ } ]
+```
+The treatment is similar to ``Enum`` instances in Haskell.
+
+Actually, all parameter types could be translated to
+initial segments of integers. This is done in the GFCC format.
+
+
+#NEW
+
==Common subexpression elimination==
Algorithm:
@@ -430,23 +480,6 @@ But since all variables (and constants) are unique, they can
be computed by simple replacement.
-#NEW
-
-==Course-of-values tables==
-
-By maintaining a canonical order of parameters in a type, we can
-eliminate the left hand sides of branches.
-```
- table { table T [
- P => t ; ---> P ;
- Q => u Q
- } ]
-```
-The treatment is similar to ``Enum`` instances in Haskell.
-
-Actually, all parameter types could be translated to
-initial segments of integers. This is done in the GFCC format.
-
#NEW
@@ -467,7 +500,7 @@ Example: the German resource grammar
Optimization "all" means parametrization + course-of-values.
The source code size is an estimate, since it includes
-potentially irrelevant library modules.
+potentially irrelevant library modules, and comments.
The GFCC format is not reusable in separate compilation.
@@ -568,4 +601,4 @@ Compilation-related modules that need rewriting
- ``Compile``: clarify the logic of what to do with each module
- ``Compute``: make the evaluation more efficient
- ``Parsing/*``, ``OldParsing/*``, ``Conversion/*``: reduce the number
- of parser formats and algorithms \ No newline at end of file
+ of parser formats and algorithms