<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<META NAME="generator" CONTENT="http://txt2tags.sf.net">
<TITLE>MOLTO Multilingual Phrasebook</TITLE>
</HEAD><BODY BGCOLOR="white" TEXT="black">
<P ALIGN="center"><CENTER><H1>MOLTO Multilingual Phrasebook</H1>
<FONT SIZE="4">
<I>Krasimir Angelov, Olga Caprotti, Ramona Enache, Thomas Hallgren, Inari Listenmaa, Aarne Ranta </I><BR>
</FONT></CENTER>
<P>
<HR>
<font size=-1>
</P>
<P>
History
</P>
<UL>
<LI>20 May. Version 0.9:
Spanish added, Bulgarian complete.
<P></P>
<LI>9 May. Version 0.7:
Danish and Norwegian added (preliminary versions induced from statistical models
and resource grammars).
<LI>3 May. Version 0.6:
Extended API (now final for release), Dutch added; new user interface with text
input enabled.
<LI>10 April. Some additions in API, comments in implementation; regenerated clones.
<LI>8 April. Added German.
<LI>7 April. Added the Clone script, applied to initiate the rest of MOLTO languages.
<LI>6 April. Version 0.4: weekdays, nationalities
<LI>30 March. Version 0.3: disambiguation grammar for English
<LI>28 March. Version 0.2: Swe, Ita; cat Action; small phrases.
<LI>26 March 2010. Version 0.1: Eng, Fin, Fre, Ron; dedicated minibar UI.
</UL>
<P>
<A HREF="missing.txt">Missing constructs</A>
</P>
<P>
<A HREF="http://tournesol.cs.chalmers.se/~aarne/phrasebook/phrasebook.html">Back to phrasebook</A>
</P>
<P>
</font>
<HR>
</P>
<H1>Purpose</H1>
<P>
This phrasebook is a program for translating touristic phrases
between the 15 European languages included in the
<A HREF="http://www.molto-project.eu">MOLTO</A> project
(Multilingual On-Line Translation):
</P>
<UL>
<LI>Bulgarian, Catalan, Danish, Dutch, English,
Finnish, French, German, Italian, Norwegian,
Polish, Romanian, Russian, Spanish, Swedish
</UL>
<P>
It is implemented by using the GF programming language
(<A HREF="http://grammaticalframework.org">Grammatical Framework</A>).
It is the first demo for the MOLTO project, released in the third month of the project (by June 2010)
and to be updated as the project proceeds.
</P>
<P>
The phrasebook has the following requirements:
</P>
<UL>
<LI>high quality: reliable translations to express yourself in any language
<LI>translation between all pairs of languages
<LI>runnable in web browsers
<LI>runnable on mobile phones (also off-line: forthcoming for Android phones)
<LI>easily extensible by new words (forthcoming: semi-automatic extensions by users)
</UL>
<P>
The phrasebook is available as open-source software, licensed under GNU LGPL.
The source code resides in
<A HREF="http://code.haskell.org/gf/examples/phrasebook/"><CODE>code.haskell.org/gf/examples/phrasebook/</CODE></A>
</P>
<P>
Current status (20 May 2010):
</P>
<UL>
<LI>small but useful coverage in abstract syntax
<LI>reasonable implementations for
Bulgarian, Danish, Dutch, English, Finnish, French, German,
Italian, Norwegian, Romanian, Spanish, Swedish
<LI>mostly just cloned for the rest of MOLTO languages
<LI>temporary user interface
<LI>works on web browsers calling a server
<LI>web service not yet released, but preliminarily available in
<A HREF="http://tournesol.cs.chalmers.se/~aarne/phrasebook/phrasebook.html"><CODE>http://tournesol.cs.chalmers.se/~aarne/phrasebook/phrasebook.html</CODE></A>
</UL>
<H1>Points illustrated</H1>
<P>
Interlingua-based translation.
</P>
<P>
Incremental parsing.
</P>
<P>
The use of resource grammars and functors.
</P>
<P>
Example-based grammar writing and grammar induction from statistical models (Google).
</P>
<P>
Compile-time transfer: especially, in Action in Words.
</P>
<P>
Quasi-incremental translation: many basic types are also used as phrases.
</P>
<P>
Disambiguation, esp. of politeness distinctions.
</P>
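<P>
The first point above, interlingua-based translation, can be pictured with a minimal Python sketch.
It is not the GF implementation: the abstract function names <CODE>GHello</CODE> and <CODE>GThanks</CODE>
and the three-language lexicon are assumptions chosen for illustration. Translation goes from the source
string to an abstract function (parsing) and from there to the target string (linearization).
</P>

```python
# Toy interlingua: every abstract function has one linearization per language.
# The function names and the tiny lexicon are illustrative assumptions,
# not the actual phrasebook grammar.
LINEARIZATIONS = {
    "GHello": {"Eng": "hello", "Fre": "bonjour", "Swe": "hej"},
    "GThanks": {"Eng": "thank you", "Fre": "merci", "Swe": "tack"},
}


def parse(lang, text):
    """Return all abstract functions whose linearization in `lang` equals `text`."""
    return [fun for fun, table in LINEARIZATIONS.items() if table.get(lang) == text]


def linearize(lang, fun):
    """Render an abstract function as a string of the target language."""
    return LINEARIZATIONS[fun][lang]


def translate(source, target, text):
    """Translate by parsing into the interlingua and linearizing out of it."""
    return [linearize(target, fun) for fun in parse(source, text)]


print(translate("Eng", "Swe", "hello"))
print(translate("Fre", "Eng", "merci"))
```

<P>
With this design each language needs only one mapping to and from the abstract syntax, so 15 languages
need 15 concrete syntaxes rather than 15 * 14 = 210 directed translators.
</P>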
<H1>Ontology</H1>
<P>
The abstract syntax defines the <B>ontology</B> behind the phrasebook.
Some explanations can be found in the
<A HREF="Ontology.html">ontology document</A>, which is produced from the
abstract syntax files
<A HREF="http://code.haskell.org/gf/examples/phrasebook/Sentences.gf"><CODE>Sentences.gf</CODE></A>
and
<A HREF="http://code.haskell.org/gf/examples/phrasebook/Words.gf"><CODE>Words.gf</CODE></A>
by <CODE>make doc</CODE>.
</P>
<H1>Files</H1>
<P>
<CODE>Sentences</CODE>: general syntactic structures implementable in a uniform way.
Concrete syntax via the functor <CODE>SentencesI</CODE>.
</P>
<P>
<CODE>Words</CODE>: words and predicates, typically language-dependent.
Separate concrete syntaxes.
</P>
<P>
<CODE>Greetings</CODE>: idiomatic phrases, string-based.
Separate concrete syntaxes.
</P>
<P>
<CODE>Phrasebook</CODE>: the top module putting everything together.
Separate concrete syntaxes.
</P>
<P>
<CODE>DisambPhrasebook</CODE>: disambiguation grammars generating feedback phrases if
the input language is ambiguous.
</P>
<P>
Here is the module structure as produced in GF by
</P>
<PRE>
> i -retain DisambPhrasebookEng.gf
> dg -only=Phrasebook*,Sentences*,Words*,Greetings*,DisambPhrasebookEng
> ! dot -Tpng _gfdepgraph.dot >pgraph.png
</PRE>
<P></P>
<P>
<IMG ALIGN="middle" SRC="pgraph.png" BORDER="0" ALT="">
</P>
<H1>To Do</H1>
<P>
Improved translation interface
</P>
<UL>
<LI>a nicer way to show disambiguation (maybe hidden by default)
</UL>
<P>
Complete the missing words and phrases
</P>
<P>
Disambiguation grammars for languages other than English
</P>
<P>
Extend the abstract lexicon in <CODE>Words</CODE> by hand or (semi)automatically for
</P>
<UL>
<LI>food stuff
<LI>languages
<LI>places
</UL>
<P>
Link to Google Translate, for fall-back and for comparison
</P>
<P>
Feedback facility in the UI
</P>
<P>
Customizable distribution: make your own selection of the 2^15 language subsets
when downloading the phrasebook to a phone
</P>
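<P>
The figure 2^15 in the last item follows from each of the 15 languages being either included in a download or not.
The short Python sketch below checks the arithmetic. The helper names are hypothetical, and the count
includes the empty selection.
</P>

```python
from itertools import combinations

# The 15 MOLTO languages listed in the Purpose section.
LANGUAGES = [
    "Bulgarian", "Catalan", "Danish", "Dutch", "English",
    "Finnish", "French", "German", "Italian", "Norwegian",
    "Polish", "Romanian", "Russian", "Spanish", "Swedish",
]


def count_subsets(n):
    """Number of subsets of an n-element set, including the empty one."""
    return 2 ** n


def selections(languages, size):
    """All selections of exactly `size` languages (hypothetical helper)."""
    return list(combinations(languages, size))


print(count_subsets(len(LANGUAGES)))    # 32768
print(len(selections(LANGUAGES, 2)))    # 105 two-language phrasebooks
```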
<H1>How to contribute</H1>
<P>
The basic things "everyone" can do are
</P>
<UL>
<LI>complete <A HREF="missing.txt">missing words</A> in concrete syntaxes
<LI>add new abstract words in <CODE>Words</CODE> and greetings in <CODE>Greetings</CODE>
</UL>
<P>
The missing concrete syntax entries are added to the <CODE>Words</CODE><I>L</I><CODE>.gf</CODE>
files for each language <I>L</I>. The
<A HREF="http://code.haskell.org/gf/lib/doc/synopsis.html#toc78">morphological paradigms</A>
of the GF resource library should be used. Actions (prefixed with <CODE>A</CODE>, as <CODE>AWant</CODE>) are
a little more demanding, since they also require syntax constructors. Greetings (prefixed
with <CODE>G</CODE>) are pure strings.
</P>
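<P>
The naming conventions just described can be summarised in a small Python sketch. It only restates the
conventions and is not part of the phrasebook tooling: the helper names are hypothetical, and the rule that a
prefix is a single letter followed by an uppercase letter is an assumption based on the example <CODE>AWant</CODE>.
</P>

```python
def words_file(lang):
    """Concrete lexicon file for a language code such as 'Eng' or 'Fin'."""
    return f"Words{lang}.gf"


def entry_kind(name):
    """Classify an abstract name by its prefix (assumed: one letter + uppercase)."""
    if len(name) > 1 and name[1].isupper():
        if name[0] == "A":
            return "action"    # needs syntax constructors as well as paradigms
        if name[0] == "G":
            return "greeting"  # a pure string
    return "other"


print(words_file("Eng"))
print(entry_kind("AWant"))
```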
<P>
Some explanations can be found in the
<A HREF="Implementation.html">implementation document</A>, which is produced from the
concrete syntax files
<A HREF="http://code.haskell.org/gf/examples/phrasebook/SentencesI.gf"><CODE>SentencesI.gf</CODE></A>
and
<A HREF="http://code.haskell.org/gf/examples/phrasebook/WordsEng.gf"><CODE>WordsEng.gf</CODE></A>
by <CODE>make doc</CODE>.
</P>
<P>
Here are the steps to follow for contributors:
</P>
<OL>
<LI>Make sure you have the latest sources
from <A HREF="http://www.grammaticalframework.org/doc/gf-developers.html">GF Darcs</A>,
using <CODE>darcs pull</CODE>.
<LI>Also make sure that you have compiled the library by <CODE>make present</CODE> in <CODE>gf/lib/src/</CODE>.
<LI>Work in the directory
<A HREF="http://code.haskell.org/gf/examples/phrasebook/"><CODE>gf/examples/phrasebook/</CODE></A>.
<LI>After you've finished your contribution, recompile the phrasebook by <CODE>make pgf</CODE>.
<LI>Save your changes with <CODE>darcs record .</CODE> (in the <CODE>phrasebook</CODE> subdirectory).
<LI>Make a patch file with <CODE>darcs send -o my_phrasebook_patch</CODE>, which you can
send to GF maintainers.
<LI>(Recommended:) Test the phrasebook on your local server:
<OL>
<LI>Go to <CODE>gf/src/server/</CODE> and follow the instructions in the
<A HREF="http://code.google.com/p/grammatical-framework/wiki/LaunchWebDemos">project Wiki</A>.
<LI>Make sure that <CODE>Phrasebook.pgf</CODE> is available to your GF server (see project wiki).
<LI>Launch <CODE>lighttpd</CODE> (see project wiki).
<LI>Now you can open <CODE>gf/examples/phrasebook/www/phrasebook.html</CODE> and use your phrasebook!
</OL>
</OL>
<UL>
<LI>Don't delete anything! But you are free to correct incorrect forms.
<LI>Don't change the module structure!
<LI>Don't compromise quality to gain coverage: <I>non multa sed multum!</I>
<P></P>
</UL>
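<P>
The command-line part of the contributor steps above can be collected into one Python sketch.
The commands are the ones named in the steps. The checkout root <CODE>gf</CODE> and the directory in which
<CODE>darcs pull</CODE> and <CODE>darcs send</CODE> are run are assumptions. By default the script only prints
what it would run. In practice your edits happen between the second and third commands.
</P>

```python
import subprocess

# (working directory, command) pairs taken from the contributor steps.
# The checkout root "gf" is an assumption about the local layout.
STEPS = [
    ("gf", ["darcs", "pull"]),
    ("gf/lib/src", ["make", "present"]),
    ("gf/examples/phrasebook", ["make", "pgf"]),
    ("gf/examples/phrasebook", ["darcs", "record", "."]),
    ("gf/examples/phrasebook", ["darcs", "send", "-o", "my_phrasebook_patch"]),
]


def run_steps(steps, dry_run=True):
    """Describe each step and, unless dry_run is set, execute it in order."""
    lines = []
    for cwd, command in steps:
        lines.append(f"(cd {cwd} && {' '.join(command)})")
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(command, cwd=cwd, check=True)
    return lines


for line in run_steps(STEPS):
    print(line)
```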
<!-- html code generated by txt2tags 2.4 (http://txt2tags.sf.net) -->
<!-- cmdline: txt2tags -thtml phrasebook.txt -->
</BODY></HTML>