summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authoraarne <aarne@cs.chalmers.se>2006-11-01 09:47:28 +0000
committeraarne <aarne@cs.chalmers.se>2006-11-01 09:47:28 +0000
commit81b973e197303f491de55294697314b63ae46c93 (patch)
treeb6a535d50b09887e426f44345e329ca33646a1e4
parent264bcfb28eeda36402411c3aaf1bf167887404c8 (diff)
formal rules in the compilation doc
-rw-r--r--doc/compiling-gf.txt166
1 files changed, 156 insertions, 10 deletions
diff --git a/doc/compiling-gf.txt b/doc/compiling-gf.txt
index 6f7b60eb8..9e438f40f 100644
--- a/doc/compiling-gf.txt
+++ b/doc/compiling-gf.txt
@@ -30,7 +30,10 @@ productivity.
For efficiency, the grammar is compiled to lower-level formats.
-Type checking is another essential compilation phase.
+Type checking is another essential compilation phase. Its purpose is
+twofold, as usual:
+- checking the correctness of the grammar
+- type-annotating expressions for code generation
#NEW
@@ -133,28 +136,171 @@ weakly-typed interfaces or dynamically-checked interfaces."
(B. Stroustrup, 1994, p. 107)
+
#NEW
-==The computation model==
+==The computation model: abstract syntax==
An abstract syntax defines a free algebra of trees (using
dependent types, recursion, higher-order abstract syntax:
GF includes a complete Logical Framework).
+```
+ cat C (x_1 : A_1)...(x_n : A_n)
+ a_1 : A_1
+ ...
+ a_n : A_n{x_1 : A_1,...,x_n-1 : A_n-1}
+ ----------------------------------------------------
+ (C a_1 ... a_n) : Type
+
+
+ fun f : (x_1 : A_1) -> ... -> (x_n : A_n) -> A
+ a_1 : A_1
+ ...
+ a_n : A_n{x_1 : A_1,...,x_n-1 : A_n-1}
+ ----------------------------------------------------
+ (f a_1 ... a_n) : A{x_1 : A_1,...,x_n : A_n}
+
+
+ A : Type x : A |- B : Type x : A |- b : B f : (x : A) -> B a : A
+ ---------------------------- ---------------------- ------------------------
+ (x : A) -> B : Type \x -> b : (x : A) -> B f a : B{x := A}
+```
+Notice that all syntax trees are in eta-long form.
+
+
+#NEW
+
+==The computation model: concrete syntax==
A concrete syntax defines a homomorphism (compositional mapping)
-from the abstract syntax to a system of tuples of strings.
+from the abstract syntax to a system of concrete syntax objects.
+```
+ cat C _
+ --------------------
+ lincat C = C* : Type
+ fun f : (x_1 : A_1) -> ... -> (x_n : A_n) -> A
+ -----------------------------------------------
+ lin f = f* : A_1* -> ... -> A_n* -> A*
+
+ (f a_1 ... a_n)* = f* a_1* ... a_n*
+```
The homomorphism can as such be used as linearization function.
It is a functional program, but a restricted one, since it works
in the end on finite data structures only.
+But a more efficient program is obtained via compilation to
GFC = Canonical GF: the "machine code" of GF.
The parsing problem of GFC can be reduced to that of MPCFG (Multiple
Parallel Context Free Grammars), see P. Ljunglöf's thesis (2004).
+
+#NEW
+
+==The core type system of concrete syntax: basic types==
+
+```
+ param P P : PType
+ PType : Type --------- ---------
+ P : PType P : Type
+
+ s : Str t : Str
+ Str : type "foo" : Str [] : Str ----------------
+ s ++ t : Str
+```
+
+
+#NEW
+
+==The core type system of concrete syntax: functions and tables==
+
+```
+ A : Type x : A |- B : Type x : A |- b : B f : (x : A) -> B a : A
+ ---------------------------- ---------------------- ------------------------
+ (x : A) -> B : Type \x -> b : (x : A) -> B f a : B{x := A}
+
+
+ P : PType A : Type t : P => A p : p
+ -------------------- -----------------
+ P => A : Type t ! p : A
+
+ v_1,...,v_n : A
+ ---------------------------------------------- P = {C_1,...,C_n}
+ table {C_1 => v_1 ; ... ; C_n => v_n} : P => A
+```
+Pattern matching is treated as an abbreviation for tables. Notice that
+```
+ case e of {...} == table {...} ! e
+```
+
+
+#NEW
+
+==The core type system of concrete syntax: records==
+
+```
+ A_1,...,A_n : Type
+ ------------------------------------ n >= 0
+ {r_1 : A_1 ; ... ; r_n : A_n} : Type
+
+
+ a_1 : A_1 ... a_n : A_n
+ ------------------------------------------------------------
+ {r_1 = a_1 ; ... ; r_n = a_n} : {r_1 : A_1 ; ... ; r_n : A_n}
+
+
+ r : {r_1 : A_1 ; ... ; r_n : A_n}
+ ----------------------------------- i = 1,...,n
+ r.r_1 : A_1
+```
+Subtyping: if ``r : R`` then ``r : R ** {r : A}``
+
+
+
+#NEW
+
+==Computation rules==
+
+```
+ (\x -> b) a = b{x := a}
+
+ (table {C_1 => v_1 ; ... ; C_n => v_n} : P => A) ! C_i = v_i
+
+ {r_1 = a_1 ; ... ; r_n = a_n}.r_i = a_i
+```
+
+
+
+#NEW
+
+==Canonical GF==
+
+Concrete syntax type system:
+```
+ A_1 : Type ... A_n : Type
+ Str : Type Int : Type ------------------------- $i : A
+ [A_1, ..., A_n] : Type
+
+
+ a_1 : A_1 ... a_n : A_n t : [A_1, ..., A_n]
+ --------------------------------- ------------------- i = 1,..,n
+ [a_1, ..., a_n] : [A_1, ..., A_n] t ! i : A_i
+```
+Tuples represent both records and tables.
+
+There are no functions.
+
+Linearization:
+```
+ lin f = f*
+
+ (f a_1 ... a_n)* = f*{$1 = a_1*, ..., $n = a_n*}
+```
+
+
#NEW
==The compilation task, again==
@@ -295,8 +441,8 @@ The additional rule now is:
==Techniques used==
-The compiler is written in Haskell, using some C foreign function calls
-(readline, killing threads).
+The compiler is written in Haskell, with some C foreign function calls
+in the interactive version (readline, killing threads).
BNFC is used for generating both the parsers and printers.
This has helped to make the formats portable.
@@ -417,7 +563,7 @@ Irrelevant table branches are thrown away, which can reduce the size.
But, since tables are expanded and auxiliary functions are inlined,
the size can grow exponentially.
-How can we keep the first and eliminate the second?
+How can we keep the first property and eliminate the second?
#NEW
@@ -450,14 +596,14 @@ By maintaining a canonical order of parameters in a type, we can
eliminate the left hand sides of branches.
```
table { table T [
- P => t ; ---> P ;
- Q => u Q
+ P => t ; ---> t ;
+ Q => u u
} ]
```
The treatment is similar to ``Enum`` instances in Haskell.
-Actually, all parameter types could be translated to
-initial segments of integers. This is done in the GFCC format.
+In the end, all parameter types can be translated to
+initial segments of integers.
#NEW