<html>
<body bgcolor="#FFFFFF" text="#000000" >
<center>
<IMG SRC="gf-logo.gif">
<h1>Grammatical Framework History of Changes</h1>
Changes in functionality since May 17, 2005, release of GF Version 2.2
</center>
<p>
25/6 (BB)
Added new speech recognition grammar printers for non-recursive SRGS grammars,
as used by Nuance Recognizer 9.0. Try <tt>pg -printer=srgs_xml_non_rec</tt>
or <tt>pg -printer=srgs_abnf_non_rec</tt>.
<p>
19/6 (AR)
Extended the functor syntax (<tt>with</tt> modules) so that the functor can have
restricted import and a module body (whose function is normally to complete restricted
import). Thus the following format is now possible:
<pre>
concrete C of A = E ** CI - [f,g] with (...) ** open R in {...}
</pre>
At the same time, the possibility of an empty module body was added to other modules
for symmetry. This can be useful for "proxy modules" that just collect other modules
without adding anything, e.g.
<pre>
abstract Math = Arithmetic, Geometry ;
</pre>
<p>
18/6 (AR)
Added a warning for clashing constants. A constant coming from multiple opened modules
was interpreted as "the first" found by the compiler, which was a source of difficult
errors. Clashing is officially forbidden, but we chose to give a warning instead of
raising an error to begin with (in version 2.8).
<p>
30/1/2007 (AR)
Semantics of variants fixed for complex types. Officially, it was only
defined for basic types (Str and parameters). When used for records, results were
multiplicative, which was not usable. But now variants should work for any type.
<p>
<hr>
<p>
22/12 (AR) <b>Release of GF version 2.7</b>.
<p>
21/12 (AR)
Overloading rules for GF version 2.7:
<ol>
<li> If a unique instance is found by exact match with argument types,
that instance is used.
<li> Otherwise, if exact match with the expected value type gives a
     unique instance, that instance is used.
<li> Otherwise, if among possible instances only one returns a non-function
type, that instance is used, but a warning is issued.
<li> Otherwise, an error results, and the list of possible instances is shown.
</ol>
These rules are still experimental, but all future developments will guarantee
that their type-correct use will work. Rule (3) is only needed because the
current type checker does not always know an expected type. It can give
an incorrect result, which is caught later in the compilation. Note,
in particular, that exact match is required. Match by subtyping will be
investigated later.
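<p>
The four rules can be read as a small decision procedure. The following Python sketch is only an illustration, not the type checker's code: the encoding of an instance as a pair of argument types and result type, the function name <tt>resolve</tt>, and the reading of "possible instances" as those whose leading argument types equal the given ones are all assumptions.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: an instance type is (argument_types, result_type).
Instance = Tuple[Tuple[str, ...], str]


def resolve(instances: List[Instance],
            arg_types: Tuple[str, ...],
            expected: Optional[Instance] = None):
    """Return (chosen_instance, warning_or_None), or raise TypeError."""
    n = len(arg_types)
    # Possible instances: the given argument types match exactly, position
    # by position. Each one is paired with the type left after application.
    possible = [(inst, (inst[0][n:], inst[1]))
                for inst in instances if inst[0][:n] == arg_types]
    # Rule 1: unique exact match with the argument types.
    if len(possible) == 1:
        return possible[0][0], None
    # Rule 2: unique exact match with the expected value type, when known.
    if expected is not None:
        by_value = [inst for inst, value in possible if value == expected]
        if len(by_value) == 1:
            return by_value[0], None
    # Rule 3: exactly one possible instance returns a non-function type.
    non_function = [inst for inst, value in possible if value[0] == ()]
    if len(non_function) == 1:
        return non_function[0], "warning: instance chosen by rule 3"
    # Rule 4: error, showing the possible instances.
    raise TypeError("cannot resolve overloading; possible instances: %r"
                    % [inst for inst, _ in possible])
```

With the made-up instances <tt>NP -> VP -> Cl</tt>, <tt>NP -> V2 -> NP -> Cl</tt> and <tt>NP -> Utt</tt>, a call with the single argument type <tt>NP</tt> and no expected type falls through to rule 3 and picks <tt>NP -> Utt</tt> with a warning.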
<p>
21/12 (BB) Java Speech Grammar Format with SISR tags can now be generated.
Use <tt>pg -printer=jsgf_sisr_old</tt>. The SISR tags are in Working Draft
20030401 format, which is supported by the OptimTALK VoiceXML interpreter
and the IBM XHTML+Voice implementation used by the Opera web browser.
<p>
21/12 (BB) <a name="voicexml"></a>
VoiceXML 2.0 dialog systems can now be generated from GF grammars.
Use <tt>pg -printer=vxml</tt>.
<p>
21/12 (BB) <a name="javascript"></a>
JavaScript code for linearization and type annotation can now be
generated from a multilingual GF grammar. Use <tt>pm -printer=js</tt>.
<p>
5/12 (BB) <a name="gfcc2c"></a>
A new tool for generating C linearization libraries
from a GFCC file. <tt>make gfcc2c</tt> in <tt>src</tt>
compiles the tool. The generated
code includes header files in <tt>lib/c</tt> and should be linked
against <tt>libgfcc.a</tt> in <tt>lib/c</tt>. For an example of
using the generated code, see <tt>src/tools/c/examples/bronzeage</tt>.
<tt>make</tt> in that directory generates a GFCC file, then generates
C code from that, and then compiles a program <tt>bronzeage-test</tt>.
The <tt>main</tt> function for that program is defined in
<tt>bronzeage-test.c</tt>.
<p>
20/11 (AR) Type error messages in concrete syntax are printed with a
heuristic where a type of the form <tt>{... ; lock_C : {} ; ...}</tt>
is printed as <tt>C</tt>. This gives more readable error messages, but
can produce wrong results if lock fields are hand-written or if subtypes
of lock-fielded categories are used.
<p>
17/11 (AR) <a name="overloading"></a>
Operation overloading: an <tt>oper</tt> can have many types,
from which one is picked at compile time. The types must have different
argument lists. Exact match with the arguments given to the <tt>oper</tt>
is required. An example is given in
<a href="../lib/resource-1.0/doc/gfdoc/Constructors.gf"><tt>Constructors.gf</tt></a>.
The purpose of overloading is to make libraries easier to use, since
only one name for each grammatical operation is needed: predication, modification,
coordination, etc. The concrete syntax is, at this experimental level, not
extended but relies on using a record with the function name repeated
as label name (see the example). The treatment of overloading is inspired
by C++, and was first suggested by Björn Bringert.
<p>
3/10 (AR) A new low-level format <tt>gfcc</tt> ("Canonical Canonical GF").
It is going to replace the <tt>gfc</tt> format later, but is already now
an efficient format for multilingual generation.
See <a href="../src/GF/Canon/GFCC/doc/gfcc.html">GFCC document</a>
for more information.
<p>
1/9 (AR) New way for managing errors in grammar compilation:
<pre>
Predef.Error : Type ;
Predef.error : Str -> Predef.Error ;
</pre>
Denotationally, <tt>Error</tt> is the empty type and thus a
subtype of any other type: it can be used anywhere. But the
<tt>error</tt> function is not canonical. Hence the compilation
is interrupted when <tt>(error s)</tt> is translated to GFC, and
the message <tt>s</tt> is emitted. An example use is given in
<tt>english/ParadigmsEng.gf</tt>:
<pre>
regDuplV : Str -> V ;
regDuplV fit =
case last fit of {
("a" | "e" | "i" | "o" | "u" | "y") =>
Predef.error (["final duplication makes no sense for"] ++ fit) ;
t =>
let fitt = fit + t in
mkV fit (fit + "s") (fitt + "ed") (fitt + "ed") (fitt + "ing")
} ;
</pre>
This function thus cannot be applied to a stem ending with a vowel,
which is exactly what we want. In future, it may be good to add similar
checks to all morphological paradigms in the resource.
<p>
16/8 (AR) New generation algorithm: slower but works with less
memory. Default of <tt>gt</tt>; use <tt>gt -mem</tt> for the old
algorithm. The new option <tt>gt -all</tt> lazily generates all
trees until interrupted. It cannot be piped to other GF commands,
hence use <tt>gt -all -lin</tt> to print out linearized strings
rather than trees.
<hr>
22/6 (AR) <b>Release of GF version 2.6</b>.
<p>
20/6 (AR) The FCFG parser is now the default, as it even handles literals.
The old default can be selected by <tt>p -old</tt>. Since
FCFG does not support variable bindings, <tt>-old</tt> is automatically
selected if the grammar has bindings - and unless the <tt>-fcfg</tt> flag
is used.
<p>
17/6 (AR) The FCFG parser is now the recommended method for parsing
heavy grammars such as the resource grammars. It does not yet support
literals and variable bindings.
<p>
1/6 (AR) Added the FCFG parser written by Krasimir Angelov. Invoked by
<tt>p -fcfg</tt>. This parser is as general as MCFG but faster.
It needs more testing and debugging.
<p>
1/6 (AR) The command <tt>r = reload</tt> repeats the latest
<tt>i = import</tt> command.
<p>
30/5 (AR) It is now possible to use the flags <tt>-all, -table, -record</tt>
in combination with <tt>l -multi</tt>, and also with <tt>tb</tt>.
<p>
18/5 (AR) Introduced a wordlist format <tt>gfwl</tt> for
quick creation of language exercises and (in future) multilingual lexica.
The format is now very simple:
<pre>
# Svenska - Franska - Finska
berg - montagne - vuori
klättra - grimper / escalader - kiivetä / kiipeillä
</pre>
but can be extended to cover paradigm functions in addition to just
words.
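<p>
A reader for this format fits in a few lines. The Python sketch below is an illustration only: the name <tt>parse_gfwl</tt> is made up, and it assumes, from the example alone, that <tt>" - "</tt> separates language columns, <tt>" / "</tt> separates alternative words, and a line starting with <tt>#</tt> names the languages.

```python
def parse_gfwl(text):
    """Parse a word list into (languages, entries).

    Each entry is a list of columns, one per language; each column is a
    list of alternative words.
    """
    languages, entries = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line: language names, assumed to be " - " separated.
            languages = [name.strip() for name in line[1:].split(" - ")]
            continue
        columns = line.split(" - ")
        entries.append([[alt.strip() for alt in col.split(" / ")]
                        for col in columns])
    return languages, entries
```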
<p>
3/4 (AR) The predefined abstract syntax type <tt>Int</tt> now has two
inherent parameters indicating its last digit and its size. The (hard-coded)
linearization type is
<pre>
{s : Str ; size : Predef.Ints 1 ; last : Predef.Ints 9}
</pre>
The <tt>size</tt> field has value <tt>1</tt> for integers greater than 9, and
value <tt>0</tt> for other integers (which are never negative). This parameter can
be used e.g. in calculating number agreement,
<pre>
Risala i = {s = i.s ++ table (Predef.Ints 1 * Predef.Ints 9) {
<0,1> => "risalah" ;
<0,2> => "risalatan" ;
<0,_> | <1,0> => "rasail" ;
_ => "risalah"
} ! <i.size,i.last>
} ;
</pre>
Notice that the table has to be typed explicitly for <tt>Ints k</tt>,
because type inference would otherwise return <tt>Int</tt> and therefore
fail to expand the table.
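<p>
To make the two parameters concrete, here is a minimal Python sketch of how <tt>size</tt> and <tt>last</tt> follow from an integer, and of how the <tt>Risala</tt> table above selects a form. The function names are hypothetical; only the parameter definitions and the table rows come from the entry above.

```python
def int_params(n):
    """Return the (size, last) parameters of a non-negative integer."""
    size = 1 if n > 9 else 0   # value in Predef.Ints 1
    last = n % 10              # value in Predef.Ints 9
    return size, last


def risala(n):
    """Mirror the Risala table: select the noun form from (size, last)."""
    size, last = int_params(n)
    if (size, last) == (0, 1):
        form = "risalah"
    elif (size, last) == (0, 2):
        form = "risalatan"
    elif size == 0 or (size, last) == (1, 0):
        form = "rasail"
    else:
        form = "risalah"
    return "%d %s" % (n, form)
```

The rows are tried in order, as in the GF table, so <tt>&lt;0,_&gt;</tt> only applies to single-digit numbers other than 1 and 2.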
<p>
31/3 (AR) Added flags and options to some commands, to help generation:
<ul>
<li> <tt>gt -noexpand=NP,V,TV</tt> does not expand these categories,
but only generates metavariables for them.
<li> <tt>gt -doexpand=NP,V,TV</tt> only expands these categories,
and generates metavariables for others.
<li> <tt>gr -cf</tt> has the same flags.
<li> <tt>l -mark=metacat</tt> marks the metavariables with their categories.
<li> <tt>p -fail</tt> marks with <tt>#FAIL</tt> strings that have no parse.
<li> <tt>p -ambiguous</tt> marks as <tt>#AMBIGUOUS</tt>
strings that have more than one parse.
</ul>
<p>
<hr>
21/3/2006 <b>Release of GF 2.5</b>.
<p>
16/3 (AR) Added two flag values to <tt>pt -transform=X</tt>:
<tt>nodup</tt> which excludes terms where a constant is duplicated,
and
<tt>nodupatom</tt> which excludes terms where an atomic constant is duplicated.
The latter, in particular, is useful as a filter in generation:
<pre>
gt -cat=Cl | pt -transform=nodupatom
</pre>
This gives a corpus where words don't (usually) occur twice in the same clause.
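<p>
The difference between the two filters can be sketched in Python as follows. This is an illustration under assumptions: trees are encoded as <tt>(name, [subtrees])</tt> pairs, an atomic constant is taken to be a function without arguments, and the helper names are made up.

```python
from collections import Counter


def constants(tree, atoms_only=False):
    """List the constants in a tree encoded as (name, [subtrees])."""
    name, args = tree
    found = [] if (atoms_only and args) else [name]
    for sub in args:
        found.extend(constants(sub, atoms_only))
    return found


def nodup(tree):
    """True if no constant occurs twice in the tree."""
    return all(c == 1 for c in Counter(constants(tree)).values())


def nodupatom(tree):
    """True if no atomic (zero-argument) constant occurs twice."""
    counts = Counter(constants(tree, atoms_only=True))
    return all(c == 1 for c in counts.values())
```

A tree that uses the same syntactic function twice with different words under it is rejected by <tt>nodup</tt> but kept by <tt>nodupatom</tt>, which is why the latter is the more useful corpus filter.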
<p>
6/3 (AR) Generalized the <tt>gfe</tt> file format in two ways:
<ol>
<li> Use the real grammar parser, hence <tt>(in M.C "foo")</tt> expressions
may occur anywhere. But the <i>ad hoc</i> word substitution syntax is
abandoned: ordinary <tt>let</tt> (and <tt>where</tt>) expressions
can now be used instead.
<li> The resource may now be a treebank, not just a grammar. Parsing
is thus replaced by treebank lookup, which in most cases is faster.
</ol>
A minor novelty is that the <tt>--# -resource=FILE</tt> flag can now be
relative to <tt>GF_LIB_PATH</tt>, both for grammars and treebanks.
The flag <tt> --# -treebank=IDENT</tt> gives the language whose treebank
entries are used, in case of a multilingual treebank.
<p>
4/3 (AR) Added command <tt>use_treebank = ut</tt> for lookup in a treebank.
This command can be used as a fast substitute for parsing, but also as a
way to browse treebanks.
<pre>
ut "He adds this to that" | l -multi -- use treebank lookup as parser in translation
ut -assocs | grep "ComplV2" -- show all associations with ComplV2
</pre>
<p>
3/3 (AR) Added option <tt>-treebank</tt> to the <tt>i</tt> command. This adds treebanks to
the shell state. The possible file formats are
<ol>
<li> XML file with a multilingual treebank, produced by <tt>tb -xml</tt>
<li> tab-organized text file with a unilingual treebank, produced by <tt>ut -assocs</tt>
</ol>
Notice that the treebanks in shell state are unilingual, and have strings as keys.
Multilingual treebanks have trees as keys. In case 1, one unilingual treebank per
language is built in the shell state.
<p>
1/3 (AR) Added option <tt>-trees</tt> to the command <tt>tree_bank = tb</tt>.
By this option, the command just returns the trees in the treebank. It can be
used for producing new treebanks with the same trees:
<pre>
rf old.xml | tb -trees | tb -xml | wf new.xml
</pre>
Recall that only treebanks in the XML format can be read with the <tt>-trees</tt>
and <tt>-c</tt> flags.
<p>
1/3 (AR) A <tt>.gfe</tt> file can have a <tt>--# -path=PATH</tt> on its
second line. The file given on the first line (<tt>--# -resource=FILE</tt>)
is then read w.r.t. this path. This is useful if the resource file has
no path itself, which happens when it is gfc-only.
<p>
25/2 (AR) The flag <tt>preproc</tt> of the <tt>i</tt> command (and thereby
of <tt>gf</tt> itself) causes GF to apply a preprocessor to each source file
it reads.
<p>
8/2 (AR) The command <tt>tb = tree_bank</tt> for creating and testing against
multilingual treebanks. Example uses:
<pre>
gr -cat=S -number=100 | tb -xml | wf tb.xml -- random treebank into file
rf tb.txt | tb -c -- read comparison treebank from file
</pre>
<p>
10/1 (AR) Forbade variable binding inside negation and Kleene star
patterns.
<p>
7/1 (AR) Full set of regular expression patterns, with
as-patterns to enable variable bindings to matched expressions:
<ul>
<li> <i>p</i> <tt>+</tt> <i>q</i> : token consisting of <i>p</i> followed by <i>q</i>
<li> <i>p</i> <tt>*</tt> : token <i>p</i> repeated 0 or more times
     (max the length of the string to be matched)
<li> <tt>-</tt> <i>p</i> : matches anything that <i>p</i> does not match
<li> <i>x</i> <tt>@</tt> <i>p</i> : bind to <i>x</i> what <i>p</i> matches
<li> <i>p</i> <tt>|</tt> <i>q</i> : matches what either <i>p</i> or <i>q</i> matches
</ul>
The last three apply to all types of patterns, the first two only to token strings.
Example: plural formation in Swedish 2nd declension
(<i>pojke-pojkar, nyckel-nycklar, seger-segrar, bil-bilar</i>):
<pre>
plural2 : Str -> Str = \w -> case w of {
pojk + "e" => pojk + "ar" ;
nyck + "e" + l@("l" | "r" | "n") => nyck + l + "ar" ;
bil => bil + "ar"
} ;
</pre>
Semantics: variables are always bound to the <b>first match</b>, in the sequence defined
as the list <tt>Match p v</tt> as follows:
<pre>
Match (p1|p2) v = Match p1 v ++ Match p2 v
Match (p1+p2) s = [Match p1 s1 ++ Match p2 s2 | i <- [0..length s], (s1,s2) = splitAt i s]
Match p* s = Match "" s ++ Match p s ++ Match (p + p) s ++ ...
Match c v = [[]] if c == v -- for constant patterns c
Match x v = [[(x,v)]] -- for variable patterns x
Match x@p v = [[(x,v)]] + M if M = Match p v /= []
Match p v = [] otherwise -- failure
</pre>
Examples:
<ul>
<li> <tt>x + "e" + y</tt> matches <tt>"peter"</tt> with <tt>x = "p", y = "ter"</tt>
<li> <tt>x@("foo"*)</tt> matches any token with <tt>x = ""</tt>
<li> <tt>x + y@("er"*)</tt> matches <tt>"burgerer"</tt> with <tt>x = "burg", y = "erer"</tt>
</ul>
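<p>
The <tt>Match</tt> equations can be transcribed almost directly into executable form. The Python sketch below is such a transcription, not the compiler's implementation: patterns are encoded as tagged tuples (an assumption made for the example), negation patterns are left out, and <tt>plural2</tt> re-creates the Swedish example on top of it.

```python
def match(p, s):
    """Return every list of (variable, value) bindings with which p matches s."""
    kind = p[0]
    if kind == "const":                      # Match c v
        return [[]] if p[1] == s else []
    if kind == "var":                        # Match x v
        return [[(p[1], s)]]
    if kind == "alt":                        # Match (p1|p2) v
        return match(p[1], s) + match(p[2], s)
    if kind == "seq":                        # Match (p1+p2) s
        return [b1 + b2
                for i in range(len(s) + 1)
                for b1 in match(p[1], s[:i])
                for b2 in match(p[2], s[i:])]
    if kind == "star":                       # Match p* s, at most len(s) repetitions
        results, repeated = [], ("const", "")
        for _ in range(len(s) + 1):
            results += match(repeated, s)
            repeated = ("seq", repeated, p[1])
        return results
    if kind == "as":                         # Match x@p v
        return [[(p[1], s)] + b for b in match(p[2], s)]
    raise ValueError("unknown pattern: %r" % (p,))


def first_match(p, s):
    """Bindings of the first match as a dict, or None on failure."""
    found = match(p, s)
    return dict(found[0]) if found else None


def plural2(w):
    """Swedish 2nd declension plural, following the case expression above."""
    e_final = ("seq", ("var", "stem"), ("const", "e"))
    sonorant = ("alt", ("const", "l"), ("alt", ("const", "r"), ("const", "n")))
    e_sonorant = ("seq", ("var", "stem"),
                  ("seq", ("const", "e"), ("as", "l", sonorant)))
    b = first_match(e_final, w)
    if b is not None:
        return b["stem"] + "ar"
    b = first_match(e_sonorant, w)
    if b is not None:
        return b["stem"] + b["l"] + "ar"
    return w + "ar"
```

Because splits are tried from the shortest prefix upwards, <tt>x + "e" + y</tt> binds <tt>x</tt> to the shortest prefix that lets the rest match, which is what the first and third examples above show.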
<p>
6/1 (AR) Concatenative string patterns to help morphology definitions...
This can be seen as a step towards regular expression string patterns.
The natural notation <tt>p1 + p2</tt> will be considered later.
<b>Note</b>. This was done on 7/1.
<p>
5/1/2006 (BB) New grammar printers <tt>slf_sub</tt> and <tt>slf_sub_graphviz</tt>
for creating SLF networks with sub-automata.
<hr>
22/12 <b>Release of GF 2.4</b>.
<p>
21/12 (AR) It now works to parse escaped string literals from command
line, and also string literals with spaces:
<pre>
gf examples/tram0/TramEng.gf
> p -lexer=literals "I want to go to \"Gustaf Adolfs torg\" ;"
QInput (GoTo (DestNamed "Gustaf Adolfs torg"))
</pre>
<p>
20/12 (AR) Support for full disjunctive patterns (<tt>P|Q</tt>) i.e.
not just on top level.
<p>
14/12 (BB) The command <tt>si</tt> (<tt>speech_input</tt>) which creates
a speech recognizer from a grammar for English and admits speech input
of strings has been added. The command uses an
<a href="http://htk.eng.cam.ac.uk/develop/atk.shtml">ATK</a> recognizer and
creates a recognition
network which accepts strings in the currently active grammar.
In order to use the <tt>si</tt> command,
you need to install the
<a href="http://www.cs.chalmers.se/~bringert/darcs/atkrec/">atkrec library</a>
and configure GF with <tt>./configure --with-atk</tt> before compiling.
You need to set two environment variables for the <tt>si</tt> command to
work. <tt>ATK_HOME</tt> should contain the path to your copy of ATK
and <tt>GF_ATK_CFG</tt> should contain the path to your GF ATK configuration
file. A default version of this file can be found in
<tt>GF/src/gf_atk.cfg</tt>.
<p>
11/12 (AR) Parsing of float literals now possible in object language.
Use the flag <tt>lexer=literals</tt>.
<p>
6/12 (AR) Accept <tt>param</tt> and <tt>oper</tt> definitions in
<tt>concrete</tt> modules. The definitions are just inlined in the
current module and not inherited. The purpose is to support rapid
prototyping of grammars.
<p>
2/12 (AR) The built-in type <tt>Float</tt> added to abstract syntax (and
resource). Values are stored as Haskell's <tt>Double</tt> precision
floats. For the syntax of float literals, see BNFC document.
NB: some bug still prevents parsing float literals in object
languages. <b>Bug fixed 11/12.</b>
<p>
1/12 (BB,AR) The command <tt>at = apply_transfer</tt>, which applies
a transfer function to a term. This is used for noncompositional
translation. Transfer functions are defined in a special transfer
language (file suffix <tt>.tr</tt>), which is compiled into a
run-time transfer core language (file suffix <tt>.trc</tt>).
The compiler is included in <tt>GF/transfer</tt>. The following is
a complete example of how to try out transfer:
<pre>
% cd GF/transfer
% make -- compile the trc compiler
% cd examples -- GF/transfer/examples
% ../compile_to_core -i../lib numerals.tr
% mv numerals.trc ../../examples/numerals
% cd ../../examples/numerals -- GF/examples/numerals
% gf
> i decimal.gf
> i BinaryDigits.gf
> i numerals.trc
> p -lang=Cncdecimal "123" | at num2bin | l
1 0 0 1 1 0 0 1 1 1 0
</pre>
Other relevant commands are:
<ul>
<li> <tt>i file.trc</tt>: import a transfer module
<li> <tt>pg -printer=transfer</tt>: create a syntax datatype in <tt>.tr</tt> format
</ul>
For more information on the commands, see <tt>help</tt>. Documentation on
the transfer language: to appear.
<p>
17/11 (AR) Made it possible for lexers to be nondeterministic.
The current implementation is simple-minded: the parser is sent
each lexing result in turn. The option <tt>-cut</tt> is used for
breaking after the first lexing that leads to a successful parse. The only
nondeterministic lexer right now is <tt>-lexer=subseqs</tt>, which
first filters with <tt>-lexer=ignore</tt> (dropping words neither in
the grammar nor literals) and then starts ignoring other words from
longest to shortest subsequence. This is usable for parser tasks
of keyword spotting type, but expensive (2<sup>n</sup>) in long input.
A smarter implementation is therefore desirable.
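<p>
The cost is easy to see in a sketch of the second stage (dropping words from the already filtered input). The Python code below is an illustration only: <tt>parses</tt> is a hypothetical stand-in for the parser, and including the empty subsequence is an assumption.

```python
from itertools import combinations


def subsequences(words):
    """Yield subsequences of words from longest to shortest, keeping order."""
    for size in range(len(words), -1, -1):
        for positions in combinations(range(len(words)), size):
            yield [words[i] for i in positions]


def parse_with_subseqs(words, parses, cut=True):
    """Send each lexing result to the parser in turn.

    `parses` is any function from a word list to a list of trees.
    With cut=True, stop after the first lexing that yields a parse.
    """
    results = []
    for candidate in subsequences(words):
        trees = parses(candidate)
        if trees:
            results.extend(trees)
            if cut:
                break
    return results
```

For an input of n words the generator produces 2<sup>n</sup> candidates, which is the exponential behaviour noted above.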
<p>
14/11 (AR) Functions can be made unparsable (or "internal" as
in BNFC). This is done by <tt>i -noparse=file</tt>, where
the nonparsable functions are given in <tt>file</tt> using the
line format <tt>--# noparse Funs</tt>. This can be used e.g. to
rule out expensive parsing rules. It is used in
<tt>lib/resource/abstract/LangVP.gf</tt> to get parse values
structured with <tt>VP</tt>, which is obtained via transfer.
So far only the default (= old) parser generator supports this.
<p>
14/11 (AR) Removed the restrictions on what a lincat may look like.
Now any record type that has a value in GFC (i.e. without any
functions in it) can be used, e.g. <tt>{np : NP ; cn : Bool => CN}</tt>.
To display linearization values, only <tt>l -record</tt> shows
nice results.
<p>
9/11 (AR) GF shell state can now have several abstract syntaxes with
their associated concrete syntaxes. This allows e.g. parsing with
resource while testing an application. One can also have a
parse-transfer-lin chain from one abstract syntax to another.
<p>
7/11 (BB) Running commands can now be interrupted with Ctrl-C, without
killing the GF process. This feature is not supported on Windows.
<p>
1/11 (AR) Yet another method for adding probabilities: append
<tt> --# prob Double</tt> to the end of a line defining a function.
This can be (1) a <tt>.cf</tt> rule (2) a <tt>fun</tt> rule, or
(3) a <tt>lin</tt> rule. The probability is attached to the
first identifier on the line.
<p>
1/11 (BB) Added generation of weighted SRGS grammars. The weights
are calculated from the function probabilities. The algorithm
for calculating the weights is not yet very good.
Use <tt>pg -printer=srgs_xml_prob</tt>.
<p>
31/10 (BB) Added option for converting grammars to SRGS grammars in XML format.
Use <tt>pg -printer=srgs_xml</tt>.
<p>
31/10 (AR) Probabilistic grammars. Probabilities can be used to
weight random generation (<tt>gr -prob</tt>) and to rank parse
results (<tt>p -prob</tt>). They are read from a separate file
(flag <tt>i -probs=File</tt>, format <tt>--# prob Fun Double</tt>)
or from the top-level grammar file itself (option <tt>i -prob</tt>).
To see the probabilities, use <tt>pg -printer=probs</tt>.
<br>
As a by-product, the probabilistic random generation algorithm is
available for any context-free abstract syntax. Use the flag
<tt>gr -cf</tt>. This algorithm is much faster than the
old (more general) one, but it may sometimes loop.
<p>
12/10 (AR) Flag <tt>-atoms=Int</tt> to the command <tt>gt = generate_trees</tt>
removes all zero-argument functions except <tt>Int</tt> of them per category. In
this way, it is possible to generate a corpus illustrating each
syntactic structure even when the lexicon (which consists of
zero-argument functions) is large.
<p>
6/10 (AR) New commands <tt>dc = define_command</tt> and
<tt>dt = define_tree</tt> to define macros in a GF session.
See <tt>help</tt> for details and examples.
<p>
5/10 (AR) Printing missing linearization rules:
<tt>pm -printer=missing</tt>. Command <tt>g = grep</tt>,
which works in a way similar to Unix grep.
<p>
5/10 (PL) Printing graphs with function and category dependencies:
<tt>pg -printer=functiongraph</tt>, <tt>pg -printer=typegraph</tt>.
<p>
20/9 (AR) Added optimization by <b>common subexpression elimination</b>.
It works on GFC modules and creates <tt>oper</tt> definitions for
subterms that occur more than once in <tt>lin</tt> definitions. These
<tt>oper</tt> definitions are automatically reinlined in functionalities
that don't support <tt>oper</tt>s in GFC. This conversion is done by
module and the <tt>oper</tt>s are not inherited. Moreover, the subterms
can contain free variables which means that the <tt>oper</tt>s are not
always well typed. However, since all variables in GFC are type-specific
(and local variables are <tt>lin</tt>-specific), this does not destroy
subject reduction or cause illegal captures.
<br>
The optimization is triggered by the flag <tt>optimize=OPT_subs</tt>,
where <tt>OPT</tt> is any of the other optimizations (see <tt>h -optimize</tt>).
The most aggressive value of the flag is <tt>all_subs</tt>. In experiments,
the size of a GFC module can shrink by 85% compared to plain <tt>all</tt>.
<p>
18/9 (AR) Removed superfluous spaces from GFC printing. This shrinks
the GFC size by 5-10%.
<p>
15/9 (AR) Fixed some bugs in dependent-type type checking of abstract
modules at compile time. The type checker is more severe now, which means
that some old grammars may fail to compile - but this is usually the
right result. However, the type checker of <tt>def</tt> judgements still
needs work.
<p>
14/9 (AR) Added printing of grammars to a format without parameters, in
the spirit of Peano's "Latino sine flexione". The command <tt>pg -unpar</tt>
does the trick, and the result can be saved in a <tt>gfcm</tt> file. The generated
concrete syntax modules get the prefix <tt>UP_</tt>. The translation is briefly:
<pre>
(P => T)* = T*
(t ! p)* = t*
(table {p => t ; ...})* = t*
</pre>
In order for this to be maximally useful, the grammar should be written in such
a way that the first value of every parameter type is the desired one. For
instance, in Peano's case it would be the ablative for noun cases, the singular for
numbers, and the 2nd person singular imperative for verb forms.
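<p>
The three equations amount to "always take the first branch". A minimal Python sketch, assuming a made-up tuple encoding of terms and assuming that the translation is applied recursively inside records (which the equations above do not state):

```python
def unpar(term):
    """Remove parameters from a term, keeping the first branch of each table."""
    kind = term[0]
    if kind == "tabletype":   # (P => T)* = T*
        return unpar(term[2])
    if kind == "select":      # (t ! p)* = t*
        return unpar(term[1])
    if kind == "table":       # (table {p => t ; ...})* = t*
        first_pattern, first_value = term[1][0]
        return unpar(first_value)
    if kind == "record":      # assumption: translate field by field
        return ("record", {label: unpar(value)
                           for label, value in term[1].items()})
    return term               # strings, variables, parameter-free types
```

This is why the order of parameter values matters: whatever is listed first is the form that survives.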
<p>
14/9 (BB) Added finite state approximation of grammars.
Internally the conversion is done <tt>cfg -> regular -> fa -> slf</tt>, so the
different printers can be used to check the output of each stage.
The new options are:
<dl>
<dt><tt>pg -printer=slf</tt></dt>
<dd>A finite automaton in the HTK SLF format.</dd>
<dt><tt>pg -printer=slf_graphviz</tt></dt>
<dd>The same FA as in SLF, but in Graphviz format.</dd>
<dt><tt>pg -printer=fa_graphviz</tt></dt>
<dd>A finite automaton with labelled edges, instead of labelled nodes which SLF has.</dd>
<dt><tt>pg -printer=regular</tt></dt>
<dd>A regular grammar in a simple BNF.</dd>
</dl>
<p>
4/9 (AR) Added the option <tt>pg -printer=stat</tt> to show
statistics of gfc compilation result. To be extended with new information.
The most important stats now are the top-40 sized definitions.
<p>
<hr>
1/7 <b>Release of GF 2.3</b>.
<p>
1/7 (AR) Added the flag <tt>-o</tt> to the <tt>vt</tt> command
to just write the <tt>.dot</tt> file without going to <tt>.ps</tt>
(cf. 20/6).
<p>
29/6 (AR) The printer used by Embedded Java GF Interpreter
(<tt>pm -header</tt>) now produces
working code from all optimized grammars - hence you need not select a
weaker optimization just to use the interpreter. However, the
optimization <tt>-optimize=share</tt> usually produces smaller object
grammars because the "unoptimizer" just undoes all optimizations.
(This is to be considered a temporary solution until the interpreter
knows how to handle stronger optimizations.)
<p>
27/6 (AR) The flag <tt>flags optimize=noexpand</tt> placed in a
resource module prevents the optimization phase of the compiler when
the <tt>.gfr</tt> file is created. This can prevent serious code
explosion, but it will also make the processing of modules using the
resource slower. A favourable example is <tt>lib/resource/finnish/ParadigmsFin</tt>.
<p>
23/6 (HD,AR) The new editor GUI <tt>gfeditor</tt> by Hans-Joachim
Daniels can now be used. It is based on Janna Khegai's <tt>jgf</tt>.
New functionality includes HTML display (<tt>gfeditor -h</tt>) and
programmable refinement tooltips.
<p>
23/6 (AR) The flag <tt>unlexer=finnish</tt> can be used to bind
Finnish suffixes (e.g. possessives) to preceding words. The GF source
notation is e.g. <tt>"isä" ++ "&*" ++ "nsa" ++ "&*" ++ "ko"</tt>,
which unlexes to <tt>"isänsäkö"</tt>. There is no corresponding lexer
support yet.
<p>
22/6 (PL,AR) The MCFG parser (<tt>p -mcfg</tt>) now works on all
optimized grammars - hence you need not select a weaker optimization
to use this parser. The same concerns the CFGM printer (<tt>pm -printer=cfgm</tt>).
<p>
20/6 (AR) Added the command <tt>visualize_tree</tt> = <tt>vt</tt>, to
display syntax trees graphically. Like <tt>vg</tt>, this command uses
GraphViz and Ghostview. The foremost use is to pipe the parser to this
command.
<p>
17/6 (BB) There is now support for lists in GF abstract syntax.
A list category is declared as:
<pre>
cat [C]
</pre>
or
<pre>
cat [C]{n}
</pre>
where <tt>C</tt> is a category and <tt>n</tt> is a non-negative integer.
<tt>cat [C]</tt> is equivalent to <tt>cat [C]{0}</tt>. List category
syntax can be used wherever categories are used.
<p>
<tt>cat [C]{n}</tt> is equivalent to the declarations:
<pre>
cat ListC
fun BaseC : C^n -> ListC
fun ConsC : C -> ListC -> ListC
</pre>
where <tt>C^0 -> X</tt> means <tt>X</tt>, and <tt>C^m</tt> (where
m > 0) means <tt>C -> C^(m-1)</tt>.
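<p>
The expansion can be spelled out mechanically. The following Python sketch (the function name is hypothetical) prints the three declarations that <tt>cat [C]{n}</tt> stands for:

```python
def expand_list_category(cat, n=0):
    """Return the declarations equivalent to `cat [C]{n}`."""
    list_cat = "List" + cat
    # C^n -> ListC: n argument types followed by the list category.
    base_type = " -> ".join([cat] * n + [list_cat])
    return [
        "cat " + list_cat,
        "fun Base%s : %s" % (cat, base_type),
        "fun Cons%s : %s -> %s -> %s" % (cat, cat, list_cat, list_cat),
    ]
```

For <tt>n = 0</tt> the base constructor takes no arguments, so <tt>BaseC</tt> is simply the empty list.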
<p>
A lincat declaration of the form:
<pre>
lincat [C] = T
</pre>
is equivalent to
<pre>
lincat ListC = T
</pre>
The linearizations of the list constructors are written
just like they would be if the function declarations above
had been made manually, e.g.:
<pre>
lin BaseC x_1 ... x_n = t
lin ConsC x xs = t'
</pre>
<p>
10/6 (AR) Preprocessing of <tt>.gfe</tt> files can now be performed as part of
any grammar compilation. The flag <tt>-ex</tt> causes GF to look for
the <tt>.gfe</tt> files and preprocess those that are younger
than the corresponding <tt>.gf</tt> files. The files are first sorted
and grouped by the resource, so that each resource only need be compiled once.
<p>
10/6 (AR) Editor GUI can now be alternatively invoked by the shell
command <tt>gf -edit</tt> (equivalent to <tt>jgf</tt>).
<p>
10/6 (AR) Editor GUI command <tt>pc Int</tt> to pop <tt>Int</tt>
items from the clip board.
<p>
4/6 (AR) Sequences of commands are now possible in the Java editor GUI.
The commands are separated by <tt> ;; </tt> (notice the space on
both sides of the two semicolons). Such a sequence can be sent
from the "GF Command" pop-up field, but is mostly intended
for external processes that communicate with GF.
<p>
3/6 (AR) The format <tt>.gfe</tt> defined to support
<b>grammar writing by examples</b>. Files of this format are first
converted to <tt>.gf</tt> files by the command
<pre>
gf -examples File.gfe
</pre>
See <a href="../lib/resource/doc/example/QuestionsI.gfe">
<tt>../lib/resource/doc/examples/QuestionsI.gfe</tt></a>
for an example.
<p>
31/5 (AR) Default of <tt>p -rawtrees=k</tt> changed to 999999.
<p>
31/5 (AR) Support for restricted inheritance. Syntax:
<pre>
M -- inherit everything from M, as before
M [a,b,c] -- only inherit constants a,b,c
M-[a,b,c] -- inherit everything except a,b,c
</pre>
Caution: there is no check yet for completeness and
consistency, so restricted inheritance can cause
run-time failures.
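<p>
The three forms differ only in which constants pass through. A minimal Python sketch, with a hypothetical function name and constants represented as plain strings:

```python
def inherited(constants, only=None, excluding=None):
    """Return the constants inherited from a module.

    only      models  M [a,b,c]   (inherit just these)
    excluding models  M-[a,b,c]   (inherit everything except these)
    Neither   models  M           (inherit everything)
    """
    if only is not None:
        return [c for c in constants if c in only]
    if excluding is not None:
        return [c for c in constants if c not in excluding]
    return list(constants)
```

Like the feature as described, the sketch does not check that the listed names exist in the module or that what remains is complete.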
<p>
29/5 (AR) Parser support for reading GFC files line per line.
The category <tt>Line</tt> in <tt>GFC.cf</tt> can be used
as entrypoint instead of <tt>Grammar</tt> to achieve this.
<p>
28/5 (AR) Environment variables and path wild cards.
<ul>
<li> <tt>GF_LIB_PATH</tt> gives the location of <tt>GF/lib</tt>
<li> <tt>GF_GRAMMAR_PATH</tt> gives a list of directories appended
to the explicitly given path
<li> <tt>DIR/*</tt> is expanded to the union of all subdirectories
of <tt>DIR</tt>
</ul>
<p>
26/5/2005 (BB) Notation for list categories.
</body>
</html>