<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META NAME="generator" CONTENT="http://txt2tags.sf.net">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<TITLE>MOLTO Multilingual Phrasebook</TITLE>
</HEAD><BODY BGCOLOR="white" TEXT="black">
<P ALIGN="center"><CENTER><H1>MOLTO Multilingual Phrasebook</H1>
<FONT SIZE="4">
<I>Krasimir Angelov, Olga Caprotti, Ramona Enache, Thomas Hallgren, Inari Listenmaa, Aarne Ranta, Jordi Saludes, Adam Slaski</I><BR>
Showcase for project FP7-ICT-247914, Deliverable D10.2.
</FONT></CENTER>
<P></P>
<HR NOSHADE SIZE=1>
<P></P>
<UL>
<LI><A HREF="#toc1">Purpose</A>
<LI><A HREF="#toc2">Points illustrated</A>
<UL>
<LI><A HREF="#toc3">From the user perspective</A>
<LI><A HREF="#toc4">From the programmer's perspective</A>
</UL>
<LI><A HREF="#toc5">Files</A>
<UL>
<LI><A HREF="#toc6">Grammars</A>
<LI><A HREF="#toc7">Ontology</A>
<LI><A HREF="#toc8">Run-time system and user interface</A>
</UL>
<LI><A HREF="#toc9">Effort and cost</A>
<LI><A HREF="#toc10">Example-based grammar writing prototype</A>
<LI><A HREF="#toc11">To Do</A>
<LI><A HREF="#toc12">How to contribute</A>
<LI><A HREF="#toc13">Conclusions (tentative)</A>
<LI><A HREF="#toc14">Acknowledgements</A>
</UL>
<P></P>
<HR NOSHADE SIZE=1>
<P></P>
<P>
<HR>
<font size=-1>
</P>
<P>
History
</P>
<UL>
<LI>1 September. Version 1.1: bug fixes, some new constructions.
<LI>2 June. Version 1.0 released!
<LI>29 May. Link to Google translate with the current language pair and phrase.
<LI>27 May. Polish added.
<LI>26 May. Version 0.9:
Catalan added, mass/count noun distinction to reduce overgeneration,
improved web interface.
<LI>20 May. Version 0.8:
Spanish added, Bulgarian complete.
<LI>9 May. Version 0.7:
Danish and Norwegian added (preliminary versions induced from statistical models
and resource grammars).
<LI>3 May. Version 0.6:
Extended API (now final for release), Dutch added; new user interface with text
input enabled.
<LI>10 April. Some additions in API, comments in implementation; regenerated clones.
<LI>8 April. Added German.
<LI>7 April. Added the Clone script, applied to initiate the rest of MOLTO languages.
<LI>6 April. Version 0.4: weekdays, nationalities
<LI>30 March. Version 0.3: disambiguation grammar for English
<LI>28 March. Version 0.2: Swe, Ita; cat Action; small phrases.
<LI>26 March 2010. Version 0.1: Eng, Fin, Fre, Ron; dedicated minibar UI.
</UL>
<P>
<A HREF="missing.txt">Missing constructs</A>
</P>
<P>
<A HREF="http://www.grammaticalframework.org/demos/phrasebook/">Back to the phrasebook</A>
</P>
<P>
</font>
<HR>
</P>
<A NAME="toc1"></A>
<H1>Purpose</H1>
<P>
This phrasebook is a program for translating touristic phrases
between 14 European languages included in the
<A HREF="http://www.molto-project.eu">MOLTO</A> project
(Multilingual On-Line Translation):
</P>
<UL>
<LI>Bulgarian, Catalan, Danish, Dutch, English,
Finnish, French, German, Italian, Norwegian,
Polish, Romanian, Spanish, Swedish
</UL>
<P>
A Russian version is planned but not yet finished. Other languages may be added as well.
</P>
<P>
The phrasebook is implemented by using the GF programming language
(<A HREF="http://grammaticalframework.org">Grammatical Framework</A>).
It is the first demo for the MOLTO project, released in the third month (by June 2010).
The first version is a very small system, but it will be extended in the course of the project.
</P>
<P>
The phrasebook has the following requirement specification:
</P>
<UL>
<LI>high quality: reliable translations to express yourself in any of the languages
<LI>translation between all pairs of languages
<LI>runnable in web browsers
<LI>runnable on mobile phones (via web browser; Android stand-alone forthcoming)
<LI>easily extensible by new words (forthcoming: semi-automatic extensions by users)
</UL>
<P>
The phrasebook is available as open-source software, licensed under GNU LGPL.
The source code resides in
<A HREF="http://www.grammaticalframework.org/examples/phrasebook/"><CODE>www.grammaticalframework.org/examples/phrasebook/</CODE></A>
</P>
<A NAME="toc2"></A>
<H1>Points illustrated</H1>
<A NAME="toc3"></A>
<H2>From the user perspective</H2>
<P>
Interlingua-based translation
</P>
<UL>
<LI>we translate meanings, rather than words
</UL>
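<P>
As a rough illustration of the interlingua idea, the following Python sketch translates
by first mapping a phrase to an abstract meaning and then rendering that meaning in the
target language. It is a toy lookup table, not the GF machinery: the function names
(<CODE>GThanks</CODE>, <CODE>GGoodMorning</CODE>) and all helper names are assumptions made
for the example. It also shows why one concrete syntax per language is enough to cover
all ordered language pairs.
</P>
<PRE>
# Toy interlingua: each abstract function (hypothetical names) maps to one
# linearization per language. Real GF grammars are far richer than a lookup table.
LINEARIZATIONS = {
    "GThanks": {"Eng": "thank you", "Fre": "merci", "Ita": "grazie", "Swe": "tack"},
    "GGoodMorning": {"Eng": "good morning", "Fre": "bonjour", "Ita": "buongiorno", "Swe": "god morgon"},
}

def parse(phrase, lang):
    """Return every abstract function whose linearization in lang equals phrase."""
    return [f for f, lins in LINEARIZATIONS.items() if lins.get(lang) == phrase]

def translate(phrase, source, target):
    """Translate by going through the abstract meaning, never word by word."""
    return [LINEARIZATIONS[f][target] for f in parse(phrase, source)]

def directed_pairs(n_languages):
    """Number of ordered (source, target) pairs among n languages."""
    return n_languages * (n_languages - 1)

print(translate("thank you", "Eng", "Ita"))   # ['grazie']
print(directed_pairs(14))                     # 182
</PRE>
<P>
With 14 languages, 14 concrete syntaxes serve 182 ordered language pairs, because every
pair shares the same abstract syntax.
</P>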
<P>
Incremental parsing
</P>
<UL>
<LI>the user is at every point guided by the list of possible next words
</UL>
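<P>
The guidance by possible next words can be pictured with the following Python sketch.
It is only a stand-in for incremental parsing: the "grammar" is assumed to be a small
finite list of sentences, and the sentence list and the function name
<CODE>next_words</CODE> are invented for the example.
</P>
<PRE>
# Toy stand-in for incremental parsing: the "grammar" is just a finite list of
# sentences, so the possible next words are read off the matching sentences.
SENTENCES = [
    "this pizza is good",
    "this pizza is expensive",
    "this wine is good",
    "that wine is cold",
]

def next_words(prefix, sentences=SENTENCES):
    """Return the sorted set of words that can follow the given word prefix."""
    done = prefix.split()
    options = set()
    for sentence in sentences:
        words = sentence.split()
        # Only sentences that start with the prefix and continue beyond it count.
        if words[:len(done)] == done and len(words) > len(done):
            options.add(words[len(done)])
    return sorted(options)

print(next_words(""))               # ['that', 'this']
print(next_words("this"))           # ['pizza', 'wine']
print(next_words("this pizza is"))  # ['expensive', 'good']
</PRE>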
<P>
Mixed modalities
</P>
<UL>
<LI>selection of words ("fridge magnets") combined with text input
</UL>
<P>
Quasi-incremental translation: many basic types are also used as phrases
</P>
<UL>
<LI>one can translate both words and complete sentences, and get intermediate results
</UL>
<P>
Disambiguation, esp. of politeness distinctions
</P>
<UL>
<LI>if a phrase has many translations, each of them is shown and given an explanation
(currently just in English, later in any source language)
</UL>
<P>
Fall-back to statistical translation
</P>
<UL>
<LI>currently just a link to Google translate (forthcoming: tailor-made statistical models)
</UL>
<P>
Feed-back from users
</P>
<UL>
<LI>users are welcome to send comments, bug reports, and better translation suggestions
</UL>
<A NAME="toc4"></A>
<H2>From the programmer's perspective</H2>
<P>
The use of resource grammars and functors
</P>
<UL>
<LI>the translator was implemented on top of an earlier linguistic knowledge base,
the <A HREF="http://www.grammaticalframework.org/lib">GF Resource Grammar Library</A>
</UL>
<P>
Example-based grammar writing and grammar induction from statistical models
(<A HREF="http://translate.google.com">Google translate</A>)
</P>
<UL>
<LI>many of the grammars were created semi-automatically by generalization from
examples
</UL>
<P>
Compile-time transfer, especially for <CODE>Action</CODE> in <CODE>Words</CODE>
</P>
<UL>
<LI>the structural differences between languages are treated at compile time,
for maximal run-time efficiency
</UL>
<P>
The level of skills involved in grammar development
</P>
<UL>
<LI>testing different configurations (see table below)
</UL>
<P>
Grammar testing
</P>
<UL>
<LI>use of treebanks with guided random generation for initial evaluation and regression testing
</UL>
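<P>
The regression-testing half of this can be sketched in Python as follows. The sketch
assumes a treebank stored as a mapping from abstract syntax trees to previously approved
linearizations; the tree names and the stand-in function <CODE>current_grammar</CODE> are
hypothetical, and the random generation step is not shown.
</P>
<PRE>
# A treebank here is a mapping from abstract syntax trees (as strings) to the
# linearization that was approved earlier. Tree names are illustrative only.
GOLD = {
    "PGreeting GHello": "hello",
    "PGreeting GThanks": "thank you",
}

def regressions(gold, linearize):
    """Return (tree, expected, actual) for every tree whose output changed."""
    changed = []
    for tree, expected in sorted(gold.items()):
        actual = linearize(tree)
        if actual != expected:
            changed.append((tree, expected, actual))
    return changed

# Stand-in for the grammar under test; a real run would call the GF linearizer.
def current_grammar(tree):
    return {"PGreeting GHello": "hello", "PGreeting GThanks": "thanks you"}[tree]

print(regressions(GOLD, current_grammar))
</PRE>
<P>
An empty result means that the grammar still produces every approved linearization.
</P>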
<A NAME="toc5"></A>
<H1>Files</H1>
<A NAME="toc6"></A>
<H2>Grammars</H2>
<P>
<CODE>Sentences</CODE>: general syntactic structures implementable in a uniform way.
Concrete syntax via the functor <CODE>SentencesI</CODE>.
</P>
<P>
<CODE>Words</CODE>: words and predicates, typically language-dependent.
Separate concrete syntaxes.
</P>
<P>
<CODE>Greetings</CODE>: idiomatic phrases, string-based.
Separate concrete syntaxes.
</P>
<P>
<CODE>Phrasebook</CODE>: the top module putting everything together.
Separate concrete syntaxes.
</P>
<P>
<CODE>DisambPhrasebook</CODE>: disambiguation grammars generating feedback phrases if
the input language is ambiguous.
</P>
<P>
<CODE>Numeral</CODE>: resource grammar module directly inherited from the library.
</P>
<P>
Here is the module structure as produced in GF by
</P>
<PRE>
> i -retain DisambPhrasebookEng.gf
> dg -only=Phrasebook*,Sentences*,Words*,Greetings*,Numeral,NumeralEng,DisambPhrasebookEng
> ! dot -Tpng _gfdepgraph.dot >pgraph.png
</PRE>
<P></P>
<P>
<IMG ALIGN="middle" SRC="pgraph.png" BORDER="0" ALT="">
</P>
<A NAME="toc7"></A>
<H2>Ontology</H2>
<P>
The abstract syntax defines the <B>ontology</B> behind the phrasebook.
Some explanations can be found in the
<A HREF="Ontology.html">ontology document</A>, which is produced from the
abstract syntax files
<A HREF="http://www.grammaticalframework.org/examples/phrasebook/Sentences.gf"><CODE>Sentences.gf</CODE></A>
and
<A HREF="http://www.grammaticalframework.org/examples/phrasebook/Words.gf"><CODE>Words.gf</CODE></A>
by <CODE>make doc</CODE>.
</P>
<A NAME="toc8"></A>
<H2>Run-time system and user interface</H2>
<P>
The phrasebook uses
the
<A HREF="http://code.google.com/p/grammatical-framework/wiki/LaunchWebDemos">PGF server</A>
written in Haskell and the
<A HREF="http://www.grammaticalframework.org/demos/minibar/about.html">minibar library</A>
written in JavaScript. Since the sources of these systems are available, anyone can build the phrasebook
locally on her own computer.
</P>
<A NAME="toc9"></A>
<H1>Effort and cost</H1>
<TABLE BORDER="1" CELLPADDING="4">
<TR>
<TH>Language</TH>
<TH>Grammarian's language skills</TH>
<TH>Grammarian's GF skills</TH>
<TH>Informant used for development</TH>
<TH>Informant used for testing</TH>
<TH>Use of external tools</TH>
<TH>Impact of external tools</TH>
<TH>Changes on the resource grammar</TH>
<TH>Development time</TH>
</TR>
<TR>
<TD>Bulgarian</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">?</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">##</TD>
</TR>
<TR>
<TD>Catalan</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">?</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">#</TD>
</TR>
<TR>
<TD>Danish</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">##</TD>
</TR>
<TR>
<TD>Dutch</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">##</TD>
</TR>
<TR>
<TD>English</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">#</TD>
</TR>
<TR>
<TD>Finnish</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">?</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">##</TD>
</TR>
<TR>
<TD>French</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">?</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">#</TD>
</TR>
<TR>
<TD>German</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">###</TD>
</TR>
<TR>
<TD>Italian</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">?</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">##</TD>
</TR>
<TR>
<TD>Norwegian</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">##</TD>
</TR>
<TR>
<TD>Polish</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">##</TD>
</TR>
<TR>
<TD>Romanian</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">###</TD>
</TR>
<TR>
<TD>Spanish</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">#</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">?</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">##</TD>
</TR>
<TR>
<TD>Swedish</TD>
<TD ALIGN="center">##</TD>
<TD ALIGN="center">###</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">+</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">?</TD>
<TD ALIGN="center">-</TD>
<TD ALIGN="center">##</TD>
</TR>
</TABLE>
<P>
Explanation of scores
</P>
<UL>
<LI>Grammarian's language skills
<UL>
<LI>- : no skills
<LI># : passive knowledge
<LI>## : fluent non-native
<LI>### : native speaker
</UL>
</UL>
<UL>
<LI>Grammarian's GF skills
<UL>
<LI>- : no skills
<LI># : basic skills (2-day GF tutorial)
<LI>## : medium skills (previous experience of similar task)
<LI>### : advanced skills (resource grammar writer/substantial contributor)
</UL>
</UL>
<UL>
<LI>Informant used for development/Informant used for testing/Use of external tools
<UL>
<LI>- : no
<LI>+ : yes
</UL>
</UL>
<UL>
<LI>Impact of external tools
<UL>
<LI>? : not investigated
<LI>- : no effect on the Phrasebook
<LI># : small impact (literal translation, simple idioms)
<LI>## : medium effect (translation of more forms of words, contextual preposition)
<LI>### : great effect (no extra work needed, translations are correct)
</UL>
</UL>
<UL>
<LI>Changes on the resource grammar
<UL>
<LI>- : no changes
<LI># : 1-3 minor changes
<LI>## : 4-10 minor changes, 1-3 medium changes
<LI>### : >10 changes of any kind
</UL>
</UL>
<UL>
<LI>Development time: overall effort (including extra work on resource grammars)
<UL>
<LI># : less than 8 person hours
<LI>## : 8-24 person hours
<LI>### : >24 person hours
</UL>
</UL>
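<P>
The last column of the table can be summarised by counting how many languages fall into
each band. The following Python sketch does this; the scores are copied from the table and
the band descriptions from the legend, while the variable names are only illustrative.
</P>
<PRE>
from collections import Counter

# Development-time scores copied from the last column of the table above.
DEVELOPMENT_TIME = {
    "Bulgarian": "##", "Catalan": "#", "Danish": "##", "Dutch": "##",
    "English": "#", "Finnish": "##", "French": "#", "German": "###",
    "Italian": "##", "Norwegian": "##", "Polish": "##", "Romanian": "###",
    "Spanish": "##", "Swedish": "##",
}

# Meaning of each score, as given in the legend.
LEGEND = {
    "#": "less than 8 person hours",
    "##": "8-24 person hours",
    "###": "more than 24 person hours",
}

tally = Counter(DEVELOPMENT_TIME.values())
for score in ("#", "##", "###"):
    print(score, LEGEND[score], tally[score])
</PRE>
<P>
With these scores, nine of the fourteen languages fall into the 8-24 person hours band,
three below it and two above it.
</P>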
<A NAME="toc10"></A>
<H1>Example-based grammar writing prototype</H1>
<P>
The figure presents the process of creating a Phrasebook using an example-based
approach for a language X, where X is one of Danish, Dutch, German and Norwegian.
</P>
<P>
<IMG ALIGN="middle" SRC="picpic.jpg" BORDER="0" ALT="">
</P>
<UL>
<LI>The first step analyses the resource grammar and extracts the information
needed by the functions that build new lexical entries.
A model is built so that the proper forms of each word can be rendered
and additional information, such as gender, can be inferred. The script applies
these rules to each entry that is to be translated into the target language,
producing a set of constructions.
<LI>The constructions are then given to an external translation tool (Google translate)
or to a native speaker for translation. The configuration file is needed even if the
translator is human, because formal knowledge of grammar is not assumed.
<LI>The translations into the target language are processed further: first the
linearizations of the categories are built by decoding the information received.
Then, with the words in the lexicon, the translations of
functions can be parsed with the GF parser and generalized.
<LI>The resulting grammar is tested with a script that generates
constructions covering all the functions and categories of the grammar, along
with some other constructions that proved to be problematic in some language.
For each construction in the target language, the output of the script contains
its English counterpart and the abstract syntax tree. A native speaker
evaluates the results, and if corrections are needed, the algorithm runs again
with the new examples. Depending on the language skills of the grammar writer,
the changes can instead be made directly in the GF files, in which case the correct
examples given by the native informant are kept only for validating the results.
The algorithm is repeated as long as corrections are needed.
</UL>
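<P>
The control flow of the steps above can be sketched in Python as follows. This is a
minimal sketch under stated assumptions, not the script used in the project: the function
<CODE>develop_grammar</CODE> and its callable parameters are hypothetical, and a "grammar"
is reduced to a dictionary of example/translation pairs.
</P>
<PRE>
def develop_grammar(examples, translate, build, evaluate, max_rounds=10):
    """Repeat build and evaluate until the informant has no more corrections.

    The callables are caller-supplied stand-ins: translate(example) plays the
    oracle (Google translate or a native speaker), build(pairs) induces a
    grammar from example/translation pairs, and evaluate(grammar) returns the
    informant's corrections as a dict (empty when everything is accepted).
    """
    pairs = {example: translate(example) for example in examples}
    grammar, rounds = None, 0
    for rounds in range(1, max_rounds + 1):
        grammar = build(pairs)
        corrections = evaluate(grammar)
        if not corrections:
            break
        pairs.update(corrections)   # corrected examples feed the next round
    return grammar, rounds

# Toy run: the oracle gets one phrase wrong and the informant corrects it once.
oracle = {"thank you": "danke", "good morning": "guten Abend"}.get
informant = {"good morning": "guten Morgen"}

def evaluate(grammar):
    return {e: t for e, t in informant.items() if grammar.get(e) != t}

grammar, rounds = develop_grammar(["thank you", "good morning"], oracle, dict, evaluate)
print(rounds, grammar["good morning"])   # 2 guten Morgen
</PRE>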
<P>
The time spent preparing the configuration files for a grammar is a one-time cost,
since the files are reusable for other applications.
The time for the second step can be saved if automatic tools, like Google translate,
are used. This is only feasible for languages with simpler morphology and syntax
and with large corpora available.
Good results were obtained for German and Dutch with Google translate, but for
languages like Romanian or Polish, which are both complex and lack enough resources,
the results are discouraging.
</P>
<P>
If the statistical oracle works well, the only step where a human
translator is needed is the evaluation and feedback step. On average, 2 rounds of
about 4 hours each were needed for the languages for which we performed
the experiment. More effort may be needed for more complex languages.
</P>
<A NAME="toc11"></A>
<H1>To Do</H1>
<P>
Disambiguation grammars for languages other than English
</P>
<P>
Extend the abstract lexicon in <CODE>Words</CODE> by hand or (semi)automatically for
</P>
<UL>
<LI>food stuff
<LI>places
<LI>actions
</UL>
<P>
Customizable phone distribution: make your own selection of the 2^15 language subsets
when downloading the phrasebook to a phone
</P>
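<P>
The size of that choice can be checked with a short Python sketch. The helper names are
invented for the example, and the three-language list is only there to keep the
enumeration small.
</P>
<PRE>
from itertools import combinations

def subset_count(n_languages):
    """Number of subsets (including the empty one) of n languages."""
    return 2 ** n_languages

def selections(languages):
    """Yield every possible selection of languages, smallest first."""
    for size in range(len(languages) + 1):
        for chosen in combinations(languages, size):
            yield chosen

print(subset_count(15))                              # 32768
print(len(list(selections(["Eng", "Fin", "Swe"]))))  # 8
</PRE>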
<A NAME="toc12"></A>
<H1>How to contribute</H1>
<P>
The basic things "everyone" can do are
</P>
<UL>
<LI>complete <A HREF="missing.txt">missing words</A> in concrete syntaxes
<LI>add new abstract words in <CODE>Words</CODE> and greetings in <CODE>Greetings</CODE>
</UL>
<P>
The missing concrete syntax entries are added to the <CODE>Words</CODE><I>L</I><CODE>.gf</CODE>
files for each language <I>L</I>. The
<A HREF="http://www.grammaticalframework.org/lib/doc/synopsis.html#toc78">morphological paradigms</A>
of the GF resource library should be used. Actions (prefixed with <CODE>A</CODE>, as <CODE>AWant</CODE>) are
a little more demanding, since they also require syntax constructors. Greetings (prefixed
with <CODE>G</CODE>) are pure strings.
</P>
<P>
Some explanations can be found in the
<A HREF="Implementation.html">implementation document</A>, which is produced from the
concrete syntax files
<A HREF="http://www.grammaticalframework.org/examples/phrasebook/SentencesI.gf"><CODE>SentencesI.gf</CODE></A>
and
<A HREF="http://www.grammaticalframework.org/examples/phrasebook/WordsEng.gf"><CODE>WordsEng.gf</CODE></A>
by <CODE>make doc</CODE>.
</P>
<P>
Here are the steps to follow for contributors:
</P>
<OL>
<LI>Make sure you have the latest sources
from <A HREF="http://www.grammaticalframework.org/doc/gf-developers.html">GF Darcs</A>,
using <CODE>darcs pull</CODE>.
<LI>Also make sure that you have compiled the library by <CODE>make present</CODE> in <CODE>gf/lib/src/</CODE>.
<LI>Work in the directory
<A HREF="http://www.grammaticalframework.org/examples/phrasebook/"><CODE>gf/examples/phrasebook/</CODE></A>.
<LI>After you've finished your contribution, recompile the phrasebook by <CODE>make pgf</CODE>.
<LI>Record your changes with <CODE>darcs record .</CODE> (in the <CODE>phrasebook</CODE> subdirectory).
<LI>Make a patch file with <CODE>darcs send -o my_phrasebook_patch</CODE>, which you can
send to GF maintainers.
<LI>(Recommended:) Test the phrasebook on your local server:
<OL>
<LI>Go to <CODE>gf/src/server/</CODE> and follow the instructions in the
<A HREF="http://code.google.com/p/grammatical-framework/wiki/LaunchWebDemos">project Wiki</A>.
<LI>Make sure that <CODE>Phrasebook.pgf</CODE> is available to your GF server (see project wiki).
<LI>Launch <CODE>lighttpd</CODE> (see project wiki).
<LI>Now you can open <CODE>gf/examples/phrasebook/www/phrasebook.html</CODE> and use your phrasebook!
</OL>
</OL>
<UL>
<LI>Don't delete anything! But you are free to correct incorrect forms.
<LI>Don't change the module structure!
<LI>Don't compromise quality to gain coverage: <I>non multa sed multum!</I>
</UL>
<A NAME="toc13"></A>
<H1>Conclusions (tentative)</H1>
<P>
The grammarian need not be a native speaker of the language.
</P>
<P>
For many languages, the grammarian need not even know the language - native informants are
enough.
</P>
<P>
However, evaluation by native speakers is necessary.
</P>
<P>
Correct and idiomatic translations are possible.
</P>
<P>
A typical development time was 2-3 person working days per language.
</P>
<P>
Google translate helps in bootstrapping grammars, but must be checked.
</P>
<UL>
<LI>in particular, unreliable for morphologically rich languages
</UL>
<P>
Resource grammars should give some more support
</P>
<UL>
<LI>higher-level access to constructions like negative expressions
<LI>large-scale morphological lexica
</UL>
<A NAME="toc14"></A>
<H1>Acknowledgements</H1>
<P>
The Phrasebook has been built in the MOLTO project funded by the European Commission.
</P>
<P>
The authors are grateful to their native speaker informants, who helped to bootstrap and evaluate
the grammars:
Richard Bubel,
Grégoire Détrez,
Rise Eilert,
Karin Keijzer,
Michał Pałka,
Willard Rafnsson,
Nick Smallbone.
</P>
<!-- html code generated by txt2tags 2.5 (http://txt2tags.sf.net) -->
<!-- cmdline: txt2tags -thtml -\-toc doc-phrasebook.txt -->
</BODY></HTML>