From b248e6e25e5b58163cc9b897be7eb0b4bf6dbdc6 Mon Sep 17 00:00:00 2001
From: aarne
Date: Mon, 21 Jun 2004 08:53:58 +0000
Subject: for release meeting

---
 doc/release2.html | 546 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 546 insertions(+)
 create mode 100644 doc/release2.html

(limited to 'doc')

diff --git a/doc/release2.html b/doc/release2.html
new file mode 100644
index 000000000..d34b49cc1
--- /dev/null
+++ b/doc/release2.html
@@ -0,0 +1,546 @@
+ + + + +
+ +

Grammatical Framework Version 2

+ +Release of Version 2.0 + +

+ +Planned: 24 June 2004 + +

+ +Aarne Ranta + +

+ + + + +

Highlights

+ +Module system. + +

+ +Separate compilation to canonical GF. + +

+ +Improved GUI. + +

+ +Improved parser generation. + +

+ +Improved shell (new commands and options, help, error messages). + +

+ +Accurate language specification +(also of GFC). + +

+ +Extended resource library. + +

+ +Extended Numerals library. + + + + + + +

Module system

+ +
  • Separate modules for abstract, + concrete, and resource. +
  • Replaces the file-based include system +
  • Name space handling with qualified names +
  • Hierarchic structure (single inheritance **) + + cross-cutting reuse (open) +
  • Separate compilation, one module per file +
  • Reuse of abstract+concrete as resource +
  • Parametrized modules: + interface, instance, incomplete. +
  • New experimental module types: transfer, + union. + + + + +

    Canonical format GFC

    + +
  • The target of the GF compiler; to reuse a compiled module, just read it in. + +
  • Readable by Haskell/Java/C++/C applications (by BNFC generated parsers). + + + + + +

    New features in expression language

    + +In addition to the module system: + +

    + +

  • Disjunctive patterns P | ... | Q. +
  • String patterns "foo". +
  • (?) Integer patterns 74. +
  • Binding token &+ to glue separate tokens at the unlexing phase, + and an unlexer to resolve this. +
  • New syntax alternatives for local definitions: let without + braces and where. +
  • Pattern variables can be used on lhs's of oper definitions. +
  • New Unicode transliterations (by Harald Hammarström). + + + + +

    New shell commands and command functionalities

    + +
  • pi = print_info: information on an identifier in scope. +
  • h = help now in long or short form, + and on individual commands. +
  • gt = generate_trees: all trees of a given + category or instantiations of a given incomplete term, up to a + given depth. +
  • gr = generate_random can now be given + an incomplete term as an argument, to constrain generation. +
  • so = show_opers shows all oper + operations with a given value type. +
  • pm = print_multi prints the multilingual + grammar resident in the current state to a ready-compiled + .gfcm file. +
  • All commands have both long and short names (see help). Short + names are easier to type, whereas long names + make scripts more readable. +
  • Meaningless command options generate warnings. + + + + +

    New editor features

    + +
  • Active text field: click the middle button in the focus to send + in a refinement through the parser. +
  • Clipboard: copy complex terms into the refine menu. +
  • Two-step refinements generated by the "Generate" operation. + + + +

    Improved implementation

    + +
  • Haskell source code is organized into subdirectories. +
  • BNF Converter is used for defining the languages GF and GFC, which also + give reliable LaTeX documentation. +
  • Lexical rules sorted out by option -cflexer for efficient + parsing with large lexica. +
  • GHC optimizations and strictness flags are used for improving performance. + + + + +

    New parser (work in progress)

    + +
  • By Peter Ljunglöf, based on MCFG. +
  • Much more efficient for morphology and discontinuous constituents. +
  • Treatment of cyclic rules. +
  • Currently lots of alternative parsers via flags -parser=newX. + + + + +

    Status (21/6/2004)

    + +Grammar compiler, editor GUIs, and shell work for all platforms +(with restrictions for Solaris). + +

    + +The updated HelpFile (accessible through h command) +marks unsupported features present in GF 1.2 with *. +They will be supported again if interested users appear. + +

    + +GF1 grammars can be automatically translated to GF2 (although the +result is not as good +as a manual translation, since indentation and comments are destroyed). The results can be +saved in GF2 files, but this is not necessary. +Some rarely used GF1 features are no longer supported (see next section). + +

    + +It is also possible to write a GF2 grammar back to GF1, with the +command pg -printer=old. + + + + +Resource libraries +and some example grammars have been +converted. Most old example grammars work without any changes. +There is a new resource API with +many new constructions. + +

    + +A make facility works, finding out which modules have to be recompiled. + +

    + +Soundness checking of module dependencies and completeness is not +complete. This means that some errors may show up too late. + +

    + +The environment variable GF_LIB_PATH needs some more work. + +

    + +Latex and XML printing of grammars do not work yet. + + + + + +

    How to use GF 1.* files

    + +Backward compatibility with respect to old GF grammars has been +a central goal. All GF grammars, from version 0.9, should work in +the old way in GF2. The main exceptions are some features that +are rarely used. + + +

    + +Very old GF grammars (from versions before 0.9), with their completely +different notation, do not work. They should first be converted to +GF1 by using GF version 1.2. + + + + + +The import command i can be given the option -old. E.g. 

    +  i -old tut1.Eng.gf
    +
    +But this is no longer necessary: GF2 detects automatically if a grammar +is in the GF1 format. + +

    + +Importing a set of GF1 files generates, internally, three modules: 

    +  abstract tut1 = ...
    +  resource ResEng = ...
    +  concrete Eng of tut1 = open ResEng in ...
    +
    +(The names are different if the file name has fewer parts.) + + +

    + +The option -o causes GF2 to write these modules into files. + + + + +The flags -abs, -cnc, and -res can be used +to give custom names to the modules. In particular, it is good to use +the -abs flag to guarantee that the abstract syntax module +has the same name for all grammars in a multilingual environment: 

    +  i -old -abs=Numerals hungarian.gf
    +  i -old -abs=Numerals tamil.gf
    +  i -old -abs=Numerals sanskrit.gf
    +
    + +

    + +The same flags as in the import command can be used when invoking +GF2 from the system shell. Many grammars can be imported on the same command +line, e.g. +

    +  % gf2 -old -abs=Tutorial tut1.Eng.gf tut1.Fin.gf tut1.Fra.gf
    +
    + +

    + +To write a GF2 grammar back to GF1 (as one big file), use the command +

    +  > pg -printer=old
    +
    + + + + + + + +GF2 has more reserved words than GF 1.2. When old files are read, a preprocessor +replaces every identifier that has the shape of a new reserved word +with a variant where the last letter is replaced by Z, e.g. +instance is replaced by instancZ. This method is of course +unsafe and should be replaced by something better. + + + + + + +
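The renaming rule described above can be sketched in a few lines of Python. This is only an illustration, not the actual GF2 preprocessor: the set `NEW_RESERVED` is a hypothetical subset of the new reserved words, the identifier pattern is an approximation, and the function names are made up for this example.

```python
import re

# Hypothetical subset of the reserved words that are new in GF2.
# The real list is fixed by the GF2 lexer and is not reproduced here.
NEW_RESERVED = {"interface", "instance", "incomplete", "transfer", "union"}

# Approximate shape of an identifier (an assumption for this sketch).
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_']*")


def rename_identifier(name):
    """Replace the last letter by Z if the name has the shape of a new reserved word."""
    if name in NEW_RESERVED:
        return name[:-1] + "Z"
    return name


def preprocess(source):
    """Rename clashing identifiers in GF1 source text.

    Simplification: string literals and comments are not skipped.
    """
    return IDENT.sub(lambda m: rename_identifier(m.group(0)), source)
```

The sketch also makes the weakness visible: a GF1 grammar that already uses the identifier instancZ would end up with two different things sharing one name.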

    Abstract, concrete, and resource modules

    + +Judgement forms are sorted as follows: + + + + + +Example: +
    +  abstract Sums = {
    +    cat 
    +      Exp ;
    +    fun 
    +      One : Exp ;
    +      plus : Exp -> Exp -> Exp ;
    +  }
    +
    +  concrete EnglishSums of Sums = open ResEng in {
    +    lincat 
    +      Exp = {s : Str ; n : Number} ;
    +    lin
    +      One = expSg "one" ;
    +      plus x y = expSg ("the" ++ "sum" ++ "of" ++ x.s ++ "and" ++ y.s) ;
    +  }
    +
    +  resource ResEng = {
    +    param 
    +      Number = Sg | Pl ;
    +    oper 
    +      expSg : Str -> {s : Str ; n : Number} = \s -> {s = s ; n = Sg} ;
    +  }
    +
    + + + + + +

    Opening and extending modules

    + +A concrete or resource can open a +resource. This means that + +A module of any type can moreover extend a module of the same type. +This means that + +Examples of extension: +
    +  abstract Products = Sums ** {
    +    fun times : Exp -> Exp -> Exp ;
    +  }
    +  -- names exported: Exp, One, plus, times
    +
    +  concrete English of Products = EnglishSums ** open ResEng in {
    +    lin times x y = expSg ("the" ++ "product" ++ "of" ++ x.s ++ "and" ++ y.s) ;
    +  }
    +
    +Another important difference: +
  • extension is single +
  • opening can be multiple: open Foo, Bar, Baz in {...} + + + +Moreover: +
  • opening can be qualified +

    +Example of qualified opening: +

    +  concrete NumberSystems of Systems = open (Bin = Binary), (Dec = Decimal) in {
    +    lin 
    +      BZero = Bin.Zero ;
    +      DZero = Dec.Zero ;
    +  }
    +
    + + + + +

    Compiling modules

    + +Separate compilation assumes there is one module per file. + +

    + +The module header is the beginning of the module code up to the +first left bracket ({). The header gives +

    + + + + +filename = modulename . extension + +

    + +File name extensions: +

    +Only gf files should ever be written/edited manually! + + + + + + +What the make facility does when compiling Foo.gf +
      +
    1. read the module header of Foo.gf, and recursively all headers from +the modules it depends on (i.e. extends or opens) +
    2. build a dependency graph of these modules, and do topological sorting +
    3. starting from the first module in topological order, +compare the modification times of each gf and gfc file: +
        +
      • if gf is later, compile the module and all modules depending on it +
      • if gfc is later, just read in the module +
      +
    +Inside the GF shell, also time stamps of modules read into memory are +taken into account. Thus a module need not be read from a file if the +module is in the memory and the file has not been modified. + + + + +If the compilation of a grammar fails at some module, the state of the +GF shell contains all modules read up to that point. This makes it +faster to compile the faulty module again after fixing it. + +
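The make steps listed above can be sketched as follows in Python. This is a rough illustration under stated assumptions, not the GF implementation: headers are assumed to be already parsed into the `deps` mapping, modification times are passed in as plain numbers, a missing gfc file is treated as out of date, and equal time stamps count as up to date. The function name `plan_build` is hypothetical.

```python
from graphlib import TopologicalSorter


def plan_build(deps, gf_mtime, gfc_mtime):
    """Return (module, action) pairs in dependency order.

    deps      maps each module to the modules it extends or opens.
    gf_mtime  maps each module to the modification time of its gf file.
    gfc_mtime maps a module to the modification time of its gfc file;
              a module that was never compiled has no entry (assumption).
    """
    # static_order() yields every module after the modules it depends on.
    order = TopologicalSorter(deps).static_order()
    recompiled = set()
    plan = []
    for mod in order:
        stale = mod not in gfc_mtime or gf_mtime[mod] > gfc_mtime[mod]
        # A module is also recompiled when something it depends on was.
        if stale or any(d in recompiled for d in deps.get(mod, ())):
            recompiled.add(mod)
            plan.append((mod, "compile"))
        else:
            plan.append((mod, "read gfc"))
    return plan
```

For the example modules of the previous sections, editing Sums.gf would make Sums, EnglishSums, Products and English candidates for compilation, while ResEng would just be read in from its gfc file.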

    + +Use the command po = print_options to see what +modules are in the state. + +

    + +To force compilation: +

    + + + +

    Module search paths

    + +Modules can reside in different directories. Use the path +flag to extend the directory search path. For instance, +
    +  -path=.:../resource/russian:../prelude
    +
    +enables files to be found in three different directories. +By default, only the current directory is included. +If a path flag is given, the current directory +. must be explicitly included if it is wanted. + +

    + +The path flag can be set in any of the following +places: +

    +A flag set on a command line overrides ones set in files. + +

    + +The value of the environment variable GF_LIB_PATH is +appended to the user-given path. + + + + +
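The search rules above can be sketched in Python. This is a minimal sketch under several assumptions that the text does not settle: GF_LIB_PATH is treated as a single directory, it is appended even when no path flag is given, and the first directory containing the file wins. The function names and the `exists` parameter are hypothetical.

```python
import os


def search_dirs(path_flag=None, gf_lib_path=None):
    """Build the list of directories to search, in order.

    path_flag   is the value of the path flag, e.g. ".:../prelude", or None.
    gf_lib_path is the value of GF_LIB_PATH, or None if it is unset.
    """
    # Without a path flag only the current directory is searched; with one,
    # "." is searched only if the flag lists it explicitly.
    dirs = path_flag.split(":") if path_flag else ["."]
    if gf_lib_path:
        dirs.append(gf_lib_path)  # assumption: a single directory
    return dirs


def find_module(name, dirs, ext="gf", exists=os.path.isfile):
    """Return the first existing file name.ext in dirs, or None (assumed policy)."""
    for d in dirs:
        candidate = os.path.join(d, name + "." + ext)
        if exists(candidate):
            return candidate
    return None
```

With the flag value from the example above, `search_dirs(".:../resource/russian:../prelude")` returns the three directories in the order given, so a module in the current directory would be found before one of the same name in `../prelude`.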

    To do

    + +Testing + +

    + +Documentation + +

    + +Packaging + + + + + +

    Nasty details

    + + +
  • Readline in Solaris + +
  • Proper treatment of file search paths + +
  • Unicode fonts in GUIs + +
  • Directionality of Semitic alphabets + + + + + -- cgit v1.2.3