From c6f4edaea5f1074ba682fac5d711016f0136998f Mon Sep 17 00:00:00 2001
From: "John J. Camilleri"
Date: Wed, 4 Jul 2018 10:09:58 +0200
Subject: Remove examples directory; these now live in gf-contrib

All changes have been reflected in the gf-contrib repository:
https://github.com/GrammaticalFramework/gf-contrib

Now, for WebSetup to build the example grammars, one must have gf-contrib
cloned in the same top-level directory as GF. When this isn't the case,
WebSetup displays a notice without failing.
---
 examples/phrasebook/doc-phrasebook.html | 688 --------------------------------
 1 file changed, 688 deletions(-)
 delete mode 100644 examples/phrasebook/doc-phrasebook.html

(limited to 'examples/phrasebook/doc-phrasebook.html')

diff --git a/examples/phrasebook/doc-phrasebook.html b/examples/phrasebook/doc-phrasebook.html
deleted file mode 100644
index a6b42a255..000000000
--- a/examples/phrasebook/doc-phrasebook.html
+++ /dev/null
@@ -1,688 +0,0 @@
- - - - - -MOLTO Multilingual Phrasebook - -

MOLTO Multilingual Phrasebook

- -Krasimir Angelov, Olga Caprotti, Ramona Enache, Thomas Hallgren, Inari Listenmaa, Aarne Ranta, Jordi Saludes, Adam Slaski
-Showcase for project FP7-ICT-247914, Deliverable D10.2. -
- -


Purpose

-

-This phrasebook is a program for translating touristic phrases -between 14 European languages included in the -MOLTO project -(Multilingual On-Line Translation): -

- - -

-A Russian version is not yet finished but is planned. Other languages may also be added. -

-

-The phrasebook is implemented by using the GF programming language -(Grammatical Framework). -It is the first demo for the MOLTO project, released in the third month (by June 2010). -The first version is a very small system, but it will be extended in the course of the project. -

-

-The phrasebook has the following requirement specification: -

- - -

-The phrasebook is available as open-source software, licensed under GNU LGPL. -The source code resides in -www.grammaticalframework.org/examples/phrasebook/ -

- -

Points illustrated

- -

From the user perspective

-

-Interlingua-based translation -

- - -

-Incremental parsing -

- - -

-Mixed modalities -

- - -

-Quasi-incremental translation: many basic types are also used as phrases -

- - -

-Disambiguation, esp. of politeness distinctions -

- - -

-Fall-back to statistical translation -

- - -

-Feed-back from users -

- - - -

From the programmer's perspective

-

-The use of resource grammars and functors -

- - -

-Example-based grammar writing and grammar induction from statistical models -(Google Translate) -

- - -

-Compile-time transfer: especially, in Action in Words -

- - -

-The level of skills involved in grammar development -

- - -

-Grammar testing -

- - - -

Files

- -

Grammars

-

-Sentences: general syntactic structures implementable in a uniform way. -Concrete syntax via the functor SentencesI. -

-

-Words: words and predicates, typically language-dependent. -Separate concrete syntaxes. -

-

-Greetings: idiomatic phrases, string-based. -Separate concrete syntaxes. -

-

-Phrasebook: the top module putting everything together. -Separate concrete syntaxes. -

-

-DisambPhrasebook: disambiguation grammars generating feedback phrases if -the input language is ambiguous. -

-

-Numeral: resource grammar module directly inherited from the library. -

-

-Here is the module structure as produced in GF by -

-
-    > i -retain DisambPhrasebookEng.gf
-    > dg -only=Phrasebook*,Sentences*,Words*,Greetings*,Numeral,NumeralEng,DisambPhrasebookEng
-    > ! dot -Tpng _gfdepgraph.dot >pgraph.png
-
-

-

- -

- -

Ontology

-

-The abstract syntax defines the ontology behind the phrasebook. -Some explanations can be found in the -ontology document, which is produced from the -abstract syntax files -Sentences.gf -and -Words.gf -by make doc. -

- -

Run-time system and user interface

-

-The phrasebook uses -the -PGF server -written in Haskell and the -minibar library -written in JavaScript. Since the sources of these systems are available, anyone can build the phrasebook -locally on her own computer. -
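As a rough illustration of how a client such as minibar might ask a PGF server for a translation, the sketch below only builds a request URL and sends nothing over the network. The host, port, and `/grammars/` path, the concrete syntax names `PhrasebookEng` and `PhrasebookFin`, and the query parameter names (`command`, `input`, `from`, `to`) are all assumptions made for illustration; check the documentation of the PGF server version you actually run.

```python
from urllib.parse import urlencode


def build_translate_url(base_url, grammar, text, source, target):
    """Build a GET URL asking a PGF server to translate `text`.

    The path layout and parameter names are assumptions, not taken
    from the phrasebook document itself.
    """
    query = urlencode({
        "command": "translate",  # assumed name of the translation command
        "input": text,           # phrase to translate
        "from": source,          # assumed source concrete syntax name
        "to": target,            # assumed target concrete syntax name
    })
    return f"{base_url}/grammars/{grammar}?{query}"


if __name__ == "__main__":
    # Hypothetical local server address and language pair.
    print(build_translate_url(
        "http://localhost:41296", "Phrasebook.pgf",
        "where is the bar", "PhrasebookEng", "PhrasebookFin"))
```

Keeping URL construction separate from the HTTP call makes the request format easy to inspect before pointing it at a real server.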

- -

Effort and cost

Language | Grammarian's language skills | Grammarian's GF skills | Informant used for development | Informant used for testing | Use of external tools | Impact of external tools | Changes on the resource grammar | Development time
Bulgarian######---?###
Catalan######---?##
Danish-###+++#####
Dutch-###+++#####
English#####-+--_#
Finnish######---?###
French#####-+-?##
German####+++#######
Italian####---?####
Norwegian####+-+#####
Polish######+++####
Romanian######--+#######
Spanish###---?_##
Swedish#####-+-?-##
- -

-Explanation on scores -

- - - - - - - - - - - - - -

Example-based grammar writing prototype

-

-The figure presents the process of creating a Phrasebook using an example-based -approach for the language X, where X is one of Danish, Dutch, German, or Norwegian. -

-

- -

- - -

-Preparing the configuration files for a grammar is a one-time cost, -since the files are reusable for other applications. -The time for the second step can be saved if automatic tools such as Google Translate -are used. This is only possible for languages with simpler morphology and syntax -and with large corpora available. -Good results were obtained for German and Dutch with Google Translate, but for -languages like Romanian or Polish, which are both complex and lack sufficient resources, -the results are discouraging. -

-

-If the statistical oracle works well, the only step that requires a human -translator is the evaluation and feedback step. On average, 2 rounds of about 4 hours -each were needed for the languages on which we performed -the experiment. More complex languages may require more effort. -

- -

To Do

-

-Disambiguation grammars for languages other than English -

-

-Extend the abstract lexicon in Words by hand or (semi)automatically for -

- - -

-Customizable phone distribution: make your own selection of the 2^15 language subsets -when downloading the phrasebook to a phone -
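The figure 2^15 above is the number of subsets of a set of 15 items. A minimal sketch of that arithmetic follows; the exponent 15 is taken directly from the text, while the three language codes in the small enumeration are only illustrative names.

```python
from itertools import combinations


def count_subsets(n):
    """Number of subsets (including the empty one) of a set of n items."""
    return 2 ** n


def subsets_of_size(items, k):
    """List every k-element selection from `items` as a set."""
    return [set(c) for c in combinations(items, k)]


# 15 is the exponent used in the text.
TOTAL_SELECTIONS = count_subsets(15)

# A tiny enumeration with three illustrative language codes.
PAIRS = subsets_of_size(["Eng", "Fin", "Swe"], 2)
```

Here `TOTAL_SELECTIONS` is 32768, which is why a customizable distribution would have to be built on demand rather than precompiled for every selection.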

- -

How to contribute

-

-The basic things "everyone" can do are -

- - -

-The missing concrete syntax entries are added to the WordsL.gf -files for each language L. The -morphological paradigms -of the GF resource library should be used. Actions (prefixed with A, such as AWant) are -a little more demanding, since they also require syntax constructors. Greetings (prefixed -with G) are pure strings. -
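Before adding entries, it can help to see which WordsL.gf files are present in a checkout. The helper below is a hypothetical sketch, not part of the phrasebook sources: only `WordsEng.gf` is named in this document, and the other three-letter codes are assumed to follow the GF resource library's usual naming. It checks for the files only and does not look for missing entries inside them.

```python
from pathlib import Path

# Assumed language codes; only "Eng" is confirmed by the document.
LANGUAGE_CODES = [
    "Bul", "Cat", "Dan", "Dut", "Eng", "Fin", "Fre",
    "Ger", "Ita", "Nor", "Pol", "Ron", "Spa", "Swe",
]


def missing_words_files(directory, codes=LANGUAGE_CODES):
    """Return the names of WordsL.gf files not found in `directory`."""
    directory = Path(directory)
    return [
        f"Words{code}.gf"
        for code in codes
        if not (directory / f"Words{code}.gf").is_file()
    ]


if __name__ == "__main__":
    # Hypothetical path to a local checkout.
    print(missing_words_files("gf/examples/phrasebook"))
```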

-

-Some explanations can be found in the -implementation document, which is produced from the -concrete syntax files -SentencesI.gf -and -WordsEng.gf -by make doc. -

-

-Here are the steps to follow for contributors: -

-
    -
  1. Make sure you have the latest sources from GF Darcs, using darcs pull.
  2. Also make sure that you have compiled the library with make present in gf/lib/src/.
  3. Work in the directory gf/examples/phrasebook/.
  4. After you have finished your contribution, recompile the phrasebook with make pgf.
  5. Record your changes with darcs record . (in the phrasebook subdirectory).
  6. Make a patch file with darcs send -o my_phrasebook_patch, which you can send to the GF maintainers.
  7. (Recommended:) Test the phrasebook on your local server:
      -
    1. Go to gf/src/server/ and follow the instructions in the project Wiki.
    2. Make sure that Phrasebook.pgf is available to your GF server (see project wiki).
    3. Launch lighttpd (see project wiki).
    4. Now you can open gf/examples/phrasebook/www/phrasebook.html and use your phrasebook!
    -
- - - - -
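The command-line part of the contributor steps above can be sketched as a small driver script. The commands and directories are the ones named in the steps; the assumptions are that the checkout root is a directory called `gf` and that `darcs` and `make` are on the PATH. By default the script only reports what it would run, since `darcs record` normally prompts interactively.

```python
import subprocess

# (working directory, command) pairs, in the order given in the steps.
# The checkout root "gf" is an assumption.
STEPS = [
    ("gf", ["darcs", "pull"]),
    ("gf/lib/src", ["make", "present"]),
    ("gf/examples/phrasebook", ["make", "pgf"]),
    ("gf/examples/phrasebook", ["darcs", "record", "."]),
    ("gf/examples/phrasebook", ["darcs", "send", "-o", "my_phrasebook_patch"]),
]


def run_steps(steps=STEPS, dry_run=True):
    """Return the planned commands; execute them too if dry_run is False."""
    planned = []
    for workdir, command in steps:
        planned.append(f"(cd {workdir} && {' '.join(command)})")
        if not dry_run:
            # Stop at the first failing command.
            subprocess.run(command, cwd=workdir, check=True)
    return planned


if __name__ == "__main__":
    for line in run_steps():
        print(line)
```

Calling `run_steps(dry_run=False)` would execute the commands for real; the local server test in step 7 is left out because it depends on the server setup described in the project wiki.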

Conclusions (tentative)

-

-The grammarian need not be a native speaker of the language. -

-

-For many languages, the grammarian need not even know the language - native informants are -enough. -

-

-However, evaluation by native speakers is necessary. -

-

-Correct and idiomatic translations are possible. -

-

-A typical development time was 2-3 person working days per language. -

-

-Google Translate helps in bootstrapping grammars, but its output must be checked. -

- - -

-Resource grammars should give some more support -

- - - -

Acknowledgements

-

-The Phrasebook has been built in the MOLTO project funded by the European Commission. -

-

-The authors are grateful to their native speaker informants helping to bootstrap and evaluate -the grammars: -Richard Bubel, -Grégoire Détrez, -Rise Eilert, -Karin Keijzer, -Michał Pałka, -Willard Rafnsson, -Nick Smallbone. -
