From a1ff75d208fa0e4f6ead832a8785ed749bfd0fc4 Mon Sep 17 00:00:00 2001 From: Krasimir Angelov Date: Thu, 24 Aug 2017 18:10:21 +0200 Subject: The documentation for the Python API is now partly ported for Haskell and Java --- doc/python-api.html | 437 ------------------------------------ doc/runtime-api.html | 614 +++++++++++++++++++++++++++++++++++++++++++++++++++ index.html | 4 +- 3 files changed, 615 insertions(+), 440 deletions(-) delete mode 100644 doc/python-api.html create mode 100644 doc/runtime-api.html diff --git a/doc/python-api.html b/doc/python-api.html deleted file mode 100644 index 0cc4f2701..000000000 --- a/doc/python-api.html +++ /dev/null @@ -1,437 +0,0 @@ - - - - - -

Using the Python binding to the C runtime

-

Krasimir Angelov, July 2015

- -

Loading the Grammar

- -Before you use the Python binding you need to import the pgf module. -
->>> import pgf
-
- -Once you have the module imported, you can use the dir and -help functions to see what kind of functionality is available. -dir takes an object and returns a list of methods available -in the object: -
->>> dir(pgf)
-
-help is a little more advanced: it tries -to produce more human-readable documentation, which moreover -contains comments: -
->>> help(pgf)
-
- -A grammar is loaded by calling the method readPGF: -
->>> gr = pgf.readPGF("App12.pgf")
-
- -From the grammar you can query the set of available languages. -It is accessible through the property languages which -is a map from language name to an object of class pgf.Concr -which represents the language. -For example the following will extract the English language: -
->>> eng = gr.languages["AppEng"]
->>> print(eng)
-<pgf.Concr object at 0x7f7dfa4471d0>
-
- -

Parsing

- -All language specific services are available as methods of the -class pgf.Concr. For example to invoke the parser, you -can call: -
->>> i = eng.parse("this is a small theatre")
-
-This gives you an iterator which can enumerate all possible -abstract trees. You can get the next tree by calling next: -
->>> p,e = i.next()
-
-or by calling __next__ if you are using Python 3: -
->>> p,e = i.__next__()
-
-The results are always pairs of probability and tree. The probabilities -are negated logarithmic probabilities, which means that the lowest -number encodes the most probable result. The possible trees are -returned in decreasing probability order (i.e. increasing negated logarithm). -The first tree should have the smallest p: -
->>> print(p)
-35.9166526794
-
-and this is the corresponding abstract tree: -
->>> print(e)
-PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetNP (DetQuant this_Quant NumSg)) (UseComp (CompNP (DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA small_A) (UseN theatre_N)))))))) NoVoc
-
- -The parse method also has the following optional parameters: - - - - - -
cat: start category
n: maximum number of trees
heuristics: a real number from 0 to 1
callbacks: a list of category and callback function
- -By using these parameters it is possible for instance to change the start category for -the parser or to limit the number of trees returned from the parser. For example -parsing with a different start category can be done as follows: -
->>> i = eng.parse("a small theatre", cat="NP")
-
- -

The heuristics factor can be used to trade parsing speed for quality. -By default the list of trees is sorted by probability and this corresponds -to factor 0.0. When we increase the factor then parsing becomes faster -but at the same time the sorting becomes imprecise. The worst -factor is 1.0. In any case the parser always returns the same set of -trees but in a different order. Our experience is that even a factor -of about 0.6-0.8 with the translation grammar still orders -the most probable tree on top of the list, but further down the list -the trees become shuffled. -

- -

-The callbacks parameter is a list of functions that can be used for recognizing -literals. For example we use those for recognizing names and unknown -words in the translator. -

- -

Linearization

- -You can either linearize the result from the parser back to another -language, or you can explicitly construct a tree and then -linearize it in any language. For example, we can create -a new expression like this: -
->>> e = pgf.readExpr("AdjCN (PositA red_A) (UseN theatre_N)")
-
-and then we can linearize it: -
->>> print(eng.linearize(e))
-red theatre
-
-This method produces only a single linearization. If you use variants -in the grammar then you might want to see all possible linearizations. -For that purpose you should use linearizeAll: -
->>> for s in eng.linearizeAll(e):
-       print(s)
-red theatre
-red theater
-
-If, instead, you need an inflection table with all possible forms -then the right method to use is tabularLinearize: -
->>> eng.tabularLinearize(e)
-{'s Sg Nom': 'red theatre', 's Pl Nom': 'red theatres', 's Pl Gen': "red theatres'", 's Sg Gen': "red theatre's"}
-
- -

-Finally, you could also get a linearization which is bracketed into -a list of phrases: -

->>> [b] = eng.bracketedLinearize(e)
->>> print(b)
-(CN:4 (AP:1 (A:0 red)) (CN:3 (N:2 theatre)))
-
-Each bracket is actually an object of type pgf.Bracket. The property -cat of the object gives you the name of the category and -the property children gives you a list of nested brackets. -If a phrase is discontinuous then it is represented as more than -one bracket with the same category name. In that case, the index -that you see in the example above will have the same value for all -brackets of the same phrase. -

- -The linearization works even if there are functions in the tree -that don't have linearization definitions. In that case you -will just see the name of the function in the generated string. -It is sometimes helpful to be able to see whether a function -is linearizable or not. This can be done in this way: -
->>> print(eng.hasLinearization("apple_N"))
-
- -

Analysing and Constructing Expressions

- -

-An already constructed tree can be analyzed and transformed -in the host application. For example you can deconstruct -a tree into a function name and a list of arguments: -

->>> e.unpack()
-('AdjCN', [<pgf.Expr object at 0x7f7df6db78c8>, <pgf.Expr object at 0x7f7df6db7878>])
-
- -The result from unpack can be different depending on the form of the -tree. If the tree is a function application then you always get -a tuple of function name and a list of arguments. If instead the -tree is just a literal string then the return value is the actual -literal. For example the result from: -
->>> pgf.readExpr('"literal"').unpack()
-'literal'
-
-is just the string 'literal'. Situations like this can be detected -in Python by checking the type of the result from unpack. -

- -

-For more complex analyses you can use the visitor pattern. -In object oriented languages this is just a clumsy way to do -what is called pattern matching in most functional languages. -You need to define a class which has one method for each function -in the abstract syntax of the grammar. If the function is called -f then you need a method called on_f. The method -will be called each time the corresponding function is encountered, -and its arguments will be the arguments from the original tree. -If there is no matching method name then the runtime will -call the method default. The following is an example: -

->>> class ExampleVisitor:
-		def on_DetCN(self,quant,cn):
-			print("Found DetCN")
-			cn.visit(self)
-			
-		def on_AdjCN(self,adj,cn):
-			print("Found AdjCN")
-			cn.visit(self)
-			
-		def default(self,e):
-			pass
->>> e2.visit(ExampleVisitor())
-Found DetCN
-Found AdjCN
-
-Here we call the method visit from the tree e2 and we give -it, as parameter, an instance of class ExampleVisitor. -ExampleVisitor has two methods on_DetCN -and on_AdjCN which are called when the top function of -the current tree is DetCN or AdjCN -correspondingly. In this example we just print a message and -we call visit recursively to go deeper into the tree. -

- -Constructing new trees is also easy. You can either use -readExpr to read trees from strings, or you can -construct new trees from existing pieces. This is possible by -using the constructor for pgf.Expr: -
->>> quant = pgf.readExpr("DetQuant IndefArt NumSg")
->>> e2 = pgf.Expr("DetCN", [quant, e])
->>> print(e2)
-DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA red_A) (UseN theatre_N))
-
- -

Embedded GF Grammars

- -The GF compiler allows for easy integration of grammars in Haskell -applications. For that purpose the compiler generates Haskell code -that makes the integration of grammars easier. Since Python is a -dynamic language the same can be done at runtime. Once you load -the grammar you can call the method embed, which will -dynamically create a Python module with one Python function -for every function in the abstract syntax of the grammar. -After that you can simply import the module: -
->>> gr.embed("App")
-<module 'App' (built-in)>
->>> import App
-
-Now creating new trees is just a matter of calling ordinary Python -functions: -
->>> print(App.DetCN(quant,e))
-DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA red_A) (UseN theatre_N))
-
- -

Access the Morphological Lexicon

- -There are two methods that give you direct access to the morphological -lexicon. The first makes it possible to dump the full form lexicon. -The following code just iterates over the lexicon and prints each -word form with its possible analyses: -
-for entry in eng.fullFormLexicon():
-	print(entry)
-
-The second one implements a simple lookup. The argument is a word -form and the result is a list of analyses: -
-print(eng.lookupMorpho("letter"))
-[('letter_1_N', 's Sg Nom', inf), ('letter_2_N', 's Sg Nom', inf)]
-
- -

Access the Abstract Syntax

- -There is a simple API for accessing the abstract syntax. For example, -you can get a list of abstract functions: -
->>> gr.functions
-....
-
-or a list of categories: -
->>> gr.categories
-....
-
-You can also access all functions with the same result category: -
->>> gr.functionsByCat("Weekday")
-['friday_Weekday', 'monday_Weekday', 'saturday_Weekday', 'sunday_Weekday', 'thursday_Weekday', 'tuesday_Weekday', 'wednesday_Weekday']
-
-The full type of a function can be retrieved as: -
->>> print(gr.functionType("DetCN"))
-Det -> CN -> NP
-
- -

Type Checking Abstract Trees

- -

The runtime type checker can do type checking and type inference -for simple types. Dependent types are still not fully implemented -in the current runtime. The inference is done with method inferExpr: -

->>> e,ty = gr.inferExpr(e)
->>> print(e)
-AdjCN (PositA red_A) (UseN theatre_N)
->>> print(ty)
-CN
-
-The result is a potentially updated expression and its type. In this -case we always deal with simple types, which means that the new -expression will always be equal to the original expression. However, this -wouldn't be true when dependent types are added. -

- -

Type checking is also trivial: -

->>> e = gr.checkExpr(e,pgf.readType("CN"))
->>> print(e)
-AdjCN (PositA red_A) (UseN theatre_N)
-
-In case of type error you will get an exception: -
->>> e = gr.checkExpr(e,pgf.readType("A"))
-pgf.TypeError: The expected type of the expression AdjCN (PositA red_A) (UseN theatre_N) is A but CN is infered
-
-

- -

Partial Grammar Loading

- -By default the whole grammar is compiled into a single file -which consists of an abstract syntax together with all concrete -languages. For large grammars with many languages this might be -inconvenient because loading becomes slower and the grammar takes -more memory. In that case you can split the grammar into -one file for the abstract syntax and one file for every concrete syntax. -This is done by using the option -split-pgf in the compiler: -
-$ gf -make -split-pgf App12.pgf
-
- -Now you can load the grammar as usual but this time only the -abstract syntax will be loaded. You can still use the languages -property to get the list of languages and the corresponding -concrete syntax objects: -
->>> gr = pgf.readPGF("App.pgf")
->>> eng = gr.languages["AppEng"]
-
-However, if you now try to use the concrete syntax then you will -get an exception: -
->>> gr.languages["AppEng"].lookupMorpho("letter")
-Traceback (most recent call last):
-  File "", line 1, in 
-pgf.PGFError: The concrete syntax is not loaded
-
- -Before using the concrete syntax, you need to explicitly load it: -
->>> eng.load("AppEng.pgf_c")
->>> print(eng.lookupMorpho("letter"))
-[('letter_1_N', 's Sg Nom', inf), ('letter_2_N', 's Sg Nom', inf)]
-
- -When you don't need the language anymore then you can simply -unload it: -
->>> eng.unload()
-
- -

GraphViz

- -GraphViz is used for visualizing abstract syntax trees and parse trees. -In both cases the result is GraphViz code that can be used for -rendering the trees. See the examples below. - -
->>> print(gr.graphvizAbstractTree(e))
-graph {
-n0[label = "AdjCN", style = "solid", shape = "plaintext"]
-n1[label = "PositA", style = "solid", shape = "plaintext"]
-n2[label = "red_A", style = "solid", shape = "plaintext"]
-n1 -- n2 [style = "solid"]
-n0 -- n1 [style = "solid"]
-n3[label = "UseN", style = "solid", shape = "plaintext"]
-n4[label = "theatre_N", style = "solid", shape = "plaintext"]
-n3 -- n4 [style = "solid"]
-n0 -- n3 [style = "solid"]
-}
-
- -
->>> print(eng.graphvizParseTree(e))
-graph {
-  node[shape=plaintext]
-
-  subgraph {rank=same;
-    n4[label="CN"]
-  }
-
-  subgraph {rank=same;
-    edge[style=invis]
-    n1[label="AP"]
-    n3[label="CN"]
-    n1 -- n3
-  }
-  n4 -- n1
-  n4 -- n3
-
-  subgraph {rank=same;
-    edge[style=invis]
-    n0[label="A"]
-    n2[label="N"]
-    n0 -- n2
-  }
-  n1 -- n0
-  n3 -- n2
-
-  subgraph {rank=same;
-    edge[style=invis]
-    n100000[label="red"]
-    n100001[label="theatre"]
-    n100000 -- n100001
-  }
-  n0 -- n100000
-  n2 -- n100001
-}
-
- - - - diff --git a/doc/runtime-api.html b/doc/runtime-api.html new file mode 100644 index 000000000..015c3d372 --- /dev/null +++ b/doc/runtime-api.html @@ -0,0 +1,614 @@ + + + + + + + +

Using the Python, Haskell, Java or C# binding to the C runtime

+

Krasimir Angelov, July 2015

+ + Choose a language: Haskell Python Java C# + +

Loading the Grammar

+ +Before you use the binding you need to import the PGF2 module (Haskell), the pgf module (Python), or the pgf package (Java). +
+>>> import pgf
+
+
+Prelude> import PGF2
+
+
+import org.grammaticalframework.pgf.*;
+
+ +Once you have the module imported, you can use the dir and +help functions to see what kind of functionality is available. +dir takes an object and returns a list of methods available +in the object: +
+>>> dir(pgf)
+
+help is a little more advanced: it tries +to produce more human-readable documentation, which moreover +contains comments: +
+>>> help(pgf)
+
+
+ +A grammar is loaded by calling the method pgf.readPGF (Python), the function readPGF (Haskell), the method PGF.readPGF (Java), or the method PGF.ReadPGF (C#): +
+>>> gr = pgf.readPGF("App12.pgf")
+
+
+Prelude PGF2> gr <- readPGF "App12.pgf"
+
+
+PGF gr = PGF.readPGF("App12.pgf");
+
+ +From the grammar you can query the set of available languages. +It is accessible through the property languages which +is a map from language name to an object of class pgf.Concr (Python), type Concr (Haskell), or class Concr (Java) +which represents the language. +For example the following will extract the English language: +
+>>> eng = gr.languages["AppEng"]
+>>> print(eng)
+<pgf.Concr object at 0x7f7dfa4471d0>
+
+
+Prelude PGF2> let Just eng = Data.Map.lookup "AppEng" (languages gr)
+Prelude PGF2> :t eng
+eng :: Concr
+
+
+Concr eng = gr.getLanguages().get("AppEng");
+
+ +

Parsing

+ +All language-specific services are available as +methods of the class pgf.Concr (Python), functions that take as an argument an object of type Concr (Haskell), or methods of the class Concr (Java). +For example, to invoke the parser, you can call: +
+>>> i = eng.parse("this is a small theatre")
+
+
+Prelude PGF2> let res = parse eng (startCat gr) "this is a small theatre"
+
+
+Iterable<ExprProb> iterable = eng.parse(gr.startCat(), "this is a small theatre");
+
+ +This gives you an iterator which can enumerate all possible +abstract trees. You can get the next tree by calling next: +
+>>> p,e = i.next()
+
+or by calling __next__ if you are using Python 3: +
+>>> p,e = i.__next__()
+
+
+ +This gives you a result of type Either String [(Expr, Float)]. +If the result is Left then the parser has failed and you will +get the token where the parser got stuck. If the parsing was successful +then you get a potentially infinite list of parse results: +
+Prelude PGF2> let Right ((p,e):rest) = res
+
+
+ +This gives you an iterable which can enumerate all possible +abstract trees. You can get the next tree by calling next: +
+Iterator<ExprProb> iter = iterable.iterator();
+ExprProb ep = iter.next();
+
+
+ +

The results are pairs of probability and tree. The probabilities +are negated logarithmic probabilities and this means that the lowest +number encodes the most probable result. The possible trees are +returned in decreasing probability order (i.e. increasing negated logarithm). +The first tree should have the smallest p: +

+
+>>> print(p)
+35.9166526794
+
+
+Prelude PGF2> print p
+35.9166526794
+
+
+System.out.println(ep.getProb());
+35.9166526794
+
+and this is the corresponding abstract tree: +
+>>> print(e)
+PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetNP (DetQuant this_Quant NumSg)) (UseComp (CompNP (DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA small_A) (UseN theatre_N)))))))) NoVoc
+
+
+Prelude PGF2> print e
+PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetNP (DetQuant this_Quant NumSg)) (UseComp (CompNP (DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA small_A) (UseN theatre_N)))))))) NoVoc
+
+
+System.out.println(ep.getExpr());
+PhrUtt NoPConj (UttS (UseCl (TTAnt TPres ASimul) PPos (PredVP (DetNP (DetQuant this_Quant NumSg)) (UseComp (CompNP (DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA small_A) (UseN theatre_N)))))))) NoVoc
+
+ +

Note that, depending on the grammar, it is entirely possible that
a single sentence yields infinitely many trees.
In other cases the number of trees might be finite but still enormous.
The parser is specifically designed to be lazy, which means that
each tree is returned as soon as it is found, before exhausting
the full search space. For grammars with a pathological number of
trees it is advisable to pick only the top N trees
and to ignore the rest.
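The advice above can be sketched in plain Python. This is only an illustration: fake_parse is a hypothetical stand-in for the iterator returned by eng.parse, and converting the score with math.exp assumes that the negated logarithm is a natural logarithm, which the text does not state.

```python
import itertools
import math

def top_n(results, n):
    """Take at most n (prob, tree) pairs from a possibly infinite iterator."""
    return list(itertools.islice(results, n))

def fake_parse():
    # Hypothetical stand-in for eng.parse(...): yields (negated log prob, tree)
    # pairs forever, in increasing order of the negated logarithm.
    cost = 1.0
    while True:
        yield (cost, "Tree%d" % int(cost))
        cost += 1.0

best = top_n(fake_parse(), 3)
for p, e in best:
    # Assumption: natural logarithm, so exp(-p) recovers the probability.
    print(e, math.exp(-p))
```

Because islice stops pulling from the iterator after n items, the remaining (possibly infinite) trees are never computed.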

+ + +The parse method also has the following optional parameters: + + + + + +
cat: start category
n: maximum number of trees
heuristics: a real number from 0 to 1
callbacks: a list of category and callback function
+ +

By using these parameters it is possible for instance to change the start category for +the parser or to limit the number of trees returned from the parser. For example +parsing with a different start category can be done as follows:

+
+>>> i = eng.parse("a small theatre", cat=pgf.readType("NP"))
+
+
+ +There is also the function parseWithHeuristics which +takes two more parameters which give you better control +over the parser's behaviour: +
+let res = parseWithHeuristics eng (startCat gr) heuristic_factor callbacks
+
+
+ +There is also the method parseWithHeuristics which +takes two more parameters which give you better control +over the parser's behaviour: +
+Iterable<ExprProb> iterable = eng.parseWithHeuristics(gr.startCat(), heuristic_factor, callbacks);
+
+
+ +

The heuristics factor can be used to trade parsing speed for quality. +By default the list of trees is sorted by probability and this corresponds +to factor 0.0. When we increase the factor then parsing becomes faster +but at the same time the sorting becomes imprecise. The worst +factor is 1.0. In any case the parser always returns the same set of +trees but in different order. Our experience is that even a factor +of about 0.6-0.8 with the translation grammar still orders +the most probable tree on top of the list but further down the list, +the trees become shuffled. +

+ +

+The callbacks parameter is a list of functions that can be used for recognizing +literals. For example we use those for recognizing names and unknown +words in the translator. +

+ +

Linearization

+ +You can either linearize the result from the parser back to another +language, or you can explicitly construct a tree and then +linearize it in any language. For example, we can create +a new expression like this: +
+>>> e = pgf.readExpr("AdjCN (PositA red_A) (UseN theatre_N)")
+
+
+Prelude PGF2> let Just e = readExpr "AdjCN (PositA red_A) (UseN theatre_N)"
+
+
+Expr e = Expr.readExpr("AdjCN (PositA red_A) (UseN theatre_N)")
+
+and then we can linearize it: +
+>>> print(eng.linearize(e))
+red theatre
+
+
+Prelude PGF2> putStrLn (linearize eng e)
+red theatre
+
+
+System.out.println(eng.linearize(e));
+red theatre
+
+This method produces only a single linearization. If you use variants +in the grammar then you might want to see all possible linearizations. +For that purpose you should use linearizeAll: +
+>>> for s in eng.linearizeAll(e):
+       print(s)
+red theatre
+red theater
+
+
+Prelude PGF2> mapM_ putStrLn (linearizeAll eng e)
+red theatre
+red theater
+
+
+for (String s : eng.linearizeAll(e)) {
+    System.out.println(s);
+}
+red theatre
+red theater
+
+If, instead, you need an inflection table with all possible forms +then the right method to use is tabularLinearize: +
+>>> eng.tabularLinearize(e)
+{'s Sg Nom': 'red theatre', 's Pl Nom': 'red theatres', 's Pl Gen': "red theatres'", 's Sg Gen': "red theatre's"}
+
+
+Prelude PGF2> tabularLinearize eng e
+{'s Sg Nom': 'red theatre', 's Pl Nom': 'red theatres', 's Pl Gen': "red theatres'", 's Sg Gen': "red theatre's"}
+
+
+for (Map.Entry<String,String> entry : eng.tabularLinearize(e).entrySet()) {
+    System.out.println(entry.getKey() + ": " + entry.getValue());
+}
+s Sg Nom: red theatre
+s Pl Nom: red theatres
+s Pl Gen: red theatres'
+s Sg Gen: red theatre's
+
+ +

+Finally, you could also get a linearization which is bracketed into +a list of phrases: +

+>>> [b] = eng.bracketedLinearize(e)
+>>> print(b)
+(CN:4 (AP:1 (A:0 red)) (CN:3 (N:2 theatre)))
+
+
+Prelude PGF2> let [b] = bracketedLinearize eng e
+Prelude PGF2> print b
+(CN:4 (AP:1 (A:0 red)) (CN:3 (N:2 theatre)))
+
+
+Object[] bs = eng.bracketedLinearize(e);
+
+Each bracket is actually an object of type pgf.Bracket. The property +cat of the object gives you the name of the category and +the property children gives you a list of nested brackets. +If a phrase is discontinuous then it is represented as more than +one bracket with the same category name. In that case, the index +that you see in the example above will have the same value for all +brackets of the same phrase. +
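As a rough sketch of how the cat and children properties can be used, the following walks a bracket structure recursively. The Bracket class here is a hypothetical stand-in for pgf.Bracket so that the example runs without the binding, and it assumes that the leaf tokens appear in children as plain strings.

```python
class Bracket:
    """Hypothetical stand-in for pgf.Bracket with the two properties named above."""
    def __init__(self, cat, children):
        self.cat = cat
        self.children = children

def categories(bracket):
    """Return the category names in depth-first, left-to-right order."""
    names = [bracket.cat]
    for child in bracket.children:
        if isinstance(child, Bracket):
            names.extend(categories(child))
    return names

def words(bracket):
    """Collect the leaf tokens (assumed to be plain strings) in order."""
    out = []
    for child in bracket.children:
        if isinstance(child, Bracket):
            out.extend(words(child))
        else:
            out.append(child)
    return out

# Mirrors (CN:4 (AP:1 (A:0 red)) (CN:3 (N:2 theatre))) without the indices.
b = Bracket("CN", [Bracket("AP", [Bracket("A", ["red"])]),
                   Bracket("CN", [Bracket("N", ["theatre"])])])
print(categories(b))
print(" ".join(words(b)))
```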

+ +The linearization works even if there are functions in the tree +that don't have linearization definitions. In that case you +will just see the name of the function in the generated string. +It is sometimes helpful to be able to see whether a function +is linearizable or not. This can be done in this way: +
+>>> print(eng.hasLinearization("apple_N"))
+
+
+Prelude PGF2> print (hasLinearization eng "apple_N")
+
+
+System.out.println(eng.hasLinearization("apple_N"));
+
+ +

Analysing and Constructing Expressions

+ +

+An already constructed tree can be analyzed and transformed +in the host application. For example you can deconstruct +a tree into a function name and a list of arguments: +

+>>> e.unpack()
+('AdjCN', [<pgf.Expr object at 0x7f7df6db78c8>, <pgf.Expr object at 0x7f7df6db7878>])
+
+ +The result from unpack can be different depending on the form of the +tree. If the tree is a function application then you always get +a tuple of function name and a list of arguments. If instead the +tree is just a literal string then the return value is the actual +literal. For example the result from: +
+>>> pgf.readExpr('"literal"').unpack()
+'literal'
+
+is just the string 'literal'. Situations like this can be detected +in Python by checking the type of the result from unpack. +
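The type check mentioned above might look like the following sketch. The function describe is a hypothetical helper, and the values passed to it are hand-written stand-ins for results of unpack, so the example runs without the binding.

```python
def describe(unpacked):
    """Classify a value of the shape returned by unpack()."""
    if isinstance(unpacked, tuple):
        # Function application: (function name, list of arguments).
        fun, args = unpacked
        return "application of %s to %d argument(s)" % (fun, len(args))
    if isinstance(unpacked, str):
        # String literal: the literal itself is returned.
        return "string literal %r" % unpacked
    # Any other kind of result is reported as-is.
    return "other value %r" % (unpacked,)

print(describe(("AdjCN", ["<arg1>", "<arg2>"])))
print(describe("literal"))
```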

+ +

+For more complex analyses you can use the visitor pattern. +In object oriented languages this is just a clumsy way to do +what is called pattern matching in most functional languages. +You need to define a class which has one method for each function +in the abstract syntax of the grammar. If the function is called +f then you need a method called on_f. The method +will be called each time the corresponding function is encountered, +and its arguments will be the arguments from the original tree. +If there is no matching method name then the runtime will +call the method default. The following is an example: +

+>>> class ExampleVisitor:
+		def on_DetCN(self,quant,cn):
+			print("Found DetCN")
+			cn.visit(self)
+			
+		def on_AdjCN(self,adj,cn):
+			print("Found AdjCN")
+			cn.visit(self)
+			
+		def default(self,e):
+			pass
+>>> e2.visit(ExampleVisitor())
+Found DetCN
+Found AdjCN
+
+Here we call the method visit from the tree e2 and we give +it, as parameter, an instance of class ExampleVisitor. +ExampleVisitor has two methods on_DetCN +and on_AdjCN which are called when the top function of +the current tree is DetCN or AdjCN +correspondingly. In this example we just print a message and +we call visit recursively to go deeper into the tree. +
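To make the dispatch rule concrete, here is a self-contained sketch of how an on_f / default lookup could work. The Tree class is a hypothetical stand-in for pgf.Expr written only for this illustration; it is not how the runtime itself implements visit.

```python
class Tree:
    """Hypothetical stand-in for pgf.Expr: a function name plus argument trees."""
    def __init__(self, fun, args=()):
        self.fun = fun
        self.args = list(args)

    def visit(self, visitor):
        # Look for a method named on_<fun>; fall back to default(tree).
        method = getattr(visitor, "on_" + self.fun, None)
        if method is not None:
            return method(*self.args)
        return visitor.default(self)

class Collector:
    """Records which of the two functions were encountered, in order."""
    def __init__(self):
        self.seen = []

    def on_DetCN(self, quant, cn):
        self.seen.append("DetCN")
        cn.visit(self)

    def on_AdjCN(self, adj, cn):
        self.seen.append("AdjCN")
        cn.visit(self)

    def default(self, e):
        pass

e = Tree("AdjCN", [Tree("PositA", [Tree("red_A")]),
                   Tree("UseN", [Tree("theatre_N")])])
e2 = Tree("DetCN", [Tree("DetQuant", [Tree("IndefArt"), Tree("NumSg")]), e])
collector = Collector()
e2.visit(collector)
print(collector.seen)
```

Collecting the names in a list instead of printing them makes the traversal order easy to inspect afterwards.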

+ +Constructing new trees is also easy. You can either use +readExpr to read trees from strings, or you can +construct new trees from existing pieces. This is possible by +using the constructor for pgf.Expr: +
+>>> quant = pgf.readExpr("DetQuant IndefArt NumSg")
+>>> e2 = pgf.Expr("DetCN", [quant, e])
+>>> print(e2)
+DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA red_A) (UseN theatre_N))
+
+ +

Embedded GF Grammars

+ +The GF compiler allows for easy integration of grammars in Haskell +applications. For that purpose the compiler generates Haskell code +that makes the integration of grammars easier. Since Python is a +dynamic language the same can be done at runtime. Once you load +the grammar you can call the method embed, which will +dynamically create a Python module with one Python function +for every function in the abstract syntax of the grammar. +After that you can simply import the module: +
+>>> gr.embed("App")
+<module 'App' (built-in)>
+>>> import App
+
+Now creating new trees is just a matter of calling ordinary Python +functions: +
+>>> print(App.DetCN(quant,e))
+DetCN (DetQuant IndefArt NumSg) (AdjCN (PositA red_A) (UseN theatre_N))
+
+ +

Access the Morphological Lexicon

+ +There are two methods that give you direct access to the morphological +lexicon. The first makes it possible to dump the full form lexicon. +The following code just iterates over the lexicon and prints each +word form with its possible analyses: +
+for entry in eng.fullFormLexicon():
+	print(entry)
+
+The second one implements a simple lookup. The argument is a word +form and the result is a list of analyses: +
+print(eng.lookupMorpho("letter"))
+[('letter_1_N', 's Sg Nom', inf), ('letter_2_N', 's Sg Nom', inf)]
+
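A result of this shape can be post-processed with ordinary list handling. The sketch below hard-codes the list printed above instead of calling lookupMorpho, and it assumes that each analysis is a triple of lemma, form description and a numeric score (shown as inf above); the helper lemmas is hypothetical.

```python
import math

# Stand-in for the value printed by eng.lookupMorpho("letter") above.
analyses = [('letter_1_N', 's Sg Nom', math.inf),
            ('letter_2_N', 's Sg Nom', math.inf)]

def lemmas(analyses):
    """Return the distinct lemmas in the order they first appear."""
    seen = []
    for lemma, form, score in analyses:
        if lemma not in seen:
            seen.append(lemma)
    return seen

print(lemmas(analyses))
```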
+ +

Access the Abstract Syntax

+ +There is a simple API for accessing the abstract syntax. For example, +you can get a list of abstract functions: +
+>>> gr.functions
+....
+
+or a list of categories: +
+>>> gr.categories
+....
+
+You can also access all functions with the same result category: +
+>>> gr.functionsByCat("Weekday")
+['friday_Weekday', 'monday_Weekday', 'saturday_Weekday', 'sunday_Weekday', 'thursday_Weekday', 'tuesday_Weekday', 'wednesday_Weekday']
+
+The full type of a function can be retrieved as: +
+>>> print(gr.functionType("DetCN"))
+Det -> CN -> NP
+
+ +

Type Checking Abstract Trees

+ +

The runtime type checker can do type checking and type inference +for simple types. Dependent types are still not fully implemented +in the current runtime. The inference is done with method inferExpr: +

+>>> e,ty = gr.inferExpr(e)
+>>> print(e)
+AdjCN (PositA red_A) (UseN theatre_N)
+>>> print(ty)
+CN
+
+The result is a potentially updated expression and its type. In this +case we always deal with simple types, which means that the new +expression will always be equal to the original expression. However, this +wouldn't be true when dependent types are added. +

+ +

Type checking is also trivial: +

+>>> e = gr.checkExpr(e,pgf.readType("CN"))
+>>> print(e)
+AdjCN (PositA red_A) (UseN theatre_N)
+
+In case of type error you will get an exception: +
+>>> e = gr.checkExpr(e,pgf.readType("A"))
+pgf.TypeError: The expected type of the expression AdjCN (PositA red_A) (UseN theatre_N) is A but CN is infered
+
+

+ +

Partial Grammar Loading

+ +By default the whole grammar is compiled into a single file +which consists of an abstract syntax together with all concrete +languages. For large grammars with many languages this might be +inconvenient because loading becomes slower and the grammar takes +more memory. In that case you can split the grammar into +one file for the abstract syntax and one file for every concrete syntax. +This is done by using the option -split-pgf in the compiler: +
+$ gf -make -split-pgf App12.pgf
+
+ +Now you can load the grammar as usual but this time only the +abstract syntax will be loaded. You can still use the languages +property to get the list of languages and the corresponding +concrete syntax objects: +
+>>> gr = pgf.readPGF("App.pgf")
+>>> eng = gr.languages["AppEng"]
+
+However, if you now try to use the concrete syntax then you will +get an exception: +
+>>> gr.languages["AppEng"].lookupMorpho("letter")
+Traceback (most recent call last):
+  File "", line 1, in 
+pgf.PGFError: The concrete syntax is not loaded
+
+ +Before using the concrete syntax, you need to explicitly load it: +
+>>> eng.load("AppEng.pgf_c")
+>>> print(eng.lookupMorpho("letter"))
+[('letter_1_N', 's Sg Nom', inf), ('letter_2_N', 's Sg Nom', inf)]
+
+ +When you don't need the language anymore then you can simply +unload it: +
+>>> eng.unload()
+
+ +

GraphViz

+ +GraphViz is used for visualizing abstract syntax trees and parse trees. +In both cases the result is GraphViz code that can be used for +rendering the trees. See the examples below. + +
+>>> print(gr.graphvizAbstractTree(e))
+graph {
+n0[label = "AdjCN", style = "solid", shape = "plaintext"]
+n1[label = "PositA", style = "solid", shape = "plaintext"]
+n2[label = "red_A", style = "solid", shape = "plaintext"]
+n1 -- n2 [style = "solid"]
+n0 -- n1 [style = "solid"]
+n3[label = "UseN", style = "solid", shape = "plaintext"]
+n4[label = "theatre_N", style = "solid", shape = "plaintext"]
+n3 -- n4 [style = "solid"]
+n0 -- n3 [style = "solid"]
+}
+
+ +
+>>> print(eng.graphvizParseTree(e))
+graph {
+  node[shape=plaintext]
+
+  subgraph {rank=same;
+    n4[label="CN"]
+  }
+
+  subgraph {rank=same;
+    edge[style=invis]
+    n1[label="AP"]
+    n3[label="CN"]
+    n1 -- n3
+  }
+  n4 -- n1
+  n4 -- n3
+
+  subgraph {rank=same;
+    edge[style=invis]
+    n0[label="A"]
+    n2[label="N"]
+    n0 -- n2
+  }
+  n1 -- n0
+  n3 -- n2
+
+  subgraph {rank=same;
+    edge[style=invis]
+    n100000[label="red"]
+    n100001[label="theatre"]
+    n100000 -- n100001
+  }
+  n0 -- n100000
+  n2 -- n100001
+}
+
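One way to use the generated code is to save it to a file and render it with the dot tool from GraphViz. The sketch below only does the saving step; dot_source is a shortened, hand-copied stand-in for the string returned by graphvizAbstractTree, and save_dot is a hypothetical helper.

```python
import os
import tempfile

# Shortened stand-in for the string returned by gr.graphvizAbstractTree(e).
dot_source = '''graph {
n0[label = "AdjCN", style = "solid", shape = "plaintext"]
n1[label = "PositA", style = "solid", shape = "plaintext"]
n0 -- n1 [style = "solid"]
}
'''

def save_dot(source, directory, name="tree.dot"):
    """Write GraphViz source to directory/name and return the file path."""
    path = os.path.join(directory, name)
    with open(path, "w", encoding="utf-8") as f:
        f.write(source)
    return path

out_dir = tempfile.mkdtemp()
dot_path = save_dot(dot_source, out_dir)
print(dot_path)
# With GraphViz installed, the file can then be rendered from a shell:
#   dot -Tpng tree.dot -o tree.png
```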
+ + + + diff --git a/index.html b/index.html index d38227c1a..18f8c8bce 100644 --- a/index.html +++ b/index.html @@ -90,9 +90,7 @@ function sitesearch() {

Develop Applications

-- cgit v1.2.3