| author | aarne <aarne@cs.chalmers.se> | 2006-01-08 20:51:32 +0000 |
|---|---|---|
| committer | aarne <aarne@cs.chalmers.se> | 2006-01-08 20:51:32 +0000 |
| commit | 316802e52c0b4ba3d74cbea66663c79280804316 (patch) | |
| tree | 2846d099ed72a9b4d48804b4292202f80fda5a86 | |
| parent | cba2b83ded437395e4c45415d90b96f61dbee6e7 (diff) | |
html version of multimodal doc
| -rw-r--r-- | doc/multimodal.html | 849 |
1 files changed, 849 insertions, 0 deletions
diff --git a/doc/multimodal.html b/doc/multimodal.html new file mode 100644 index 000000000..b1f202a9d --- /dev/null +++ b/doc/multimodal.html @@ -0,0 +1,849 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> +<HTML> +<HEAD> +<META NAME="generator" CONTENT="http://txt2tags.sf.net"> +<TITLE>Demonstrative Expressions and Multimodal Grammars</TITLE> +</HEAD><BODY BGCOLOR="white" TEXT="black"> +<P ALIGN="center"><CENTER><H1>Demonstrative Expressions and Multimodal Grammars</H1> +<FONT SIZE="4"> +<I>Author: Aarne Ranta <aarne (at) cs.chalmers.se></I><BR> +Last update: Sun Jan 8 21:50:32 2006 +</FONT></CENTER> + +<P></P> +<HR NOSHADE SIZE=1> +<P></P> + <UL> + <LI><A HREF="#toc1">Abstract</A> + <LI><A HREF="#toc2">Multimodal grammars</A> + <UL> + <LI><A HREF="#toc3">Representing demonstratives in semantics and grammar</A> + <LI><A HREF="#toc4">Asynchronous syntax in GF</A> + <LI><A HREF="#toc5">Example multimodal grammar: abstract syntax</A> + <LI><A HREF="#toc6">Digression: discontinuous constituents</A> + <LI><A HREF="#toc7">From grammars to dialogue systems</A> + </UL> + <LI><A HREF="#toc8">Adding multimodality to a unimodal grammar</A> + <UL> + <LI><A HREF="#toc9">The multimodal conversion</A> + <LI><A HREF="#toc10">An example of the conversion</A> + <LI><A HREF="#toc11">Multimodal conversion combinators</A> + </UL> + <LI><A HREF="#toc12">Multimodal resource grammars</A> + <UL> + <LI><A HREF="#toc13">Resource grammar API</A> + <LI><A HREF="#toc14">Multimodal API: functions for building demonstratives</A> + <LI><A HREF="#toc15">Multimodal API: functions for building sentences and phrases</A> + <LI><A HREF="#toc16">Language-independent implementation: examples</A> + <LI><A HREF="#toc17">Multimodal API: interface to unimodal expressions</A> + <LI><A HREF="#toc18">Instantiating multimodality to different languages</A> + <LI><A HREF="#toc19">Language-independent reimplementation of TramDemo</A> + <LI><A HREF="#toc20">The order problem</A> + <LI><A 
HREF="#toc21">A recipe for using a resource library</A> + </UL> + </UL> + +<P></P> +<HR NOSHADE SIZE=1> +<P></P> +<A NAME="toc1"></A> +<H2>Abstract</H2> +<P> +This document presents a method for writing grammars +in which spoken utterances are accompanied by +pointing gestures. One computer application of such +grammars is <B>multimodal dialogue systems</B>, in +which the pointing gestures are performed by +mouse clicks and movements. +</P> +<P> +After an introduction to the notions of +<B>demonstratives</B> and <B>integrated multimodality</B>, +we will show by a concrete example +how multimodal grammars can be written in GF +and how they can be used in dialogue systems. +The explanation is given in three stages: +</P> +<OL> +<LI>How to write a multimodal grammar by hand. +<LI>How to add multimodality to a unimodal grammar. +<LI>How to use a multimodal resource grammar. +</OL> + +<A NAME="toc2"></A> +<H2>Multimodal grammars</H2> +<P> +<B>Demonstrative expressions</B> are an old idea. Such +expressions get their meaning from the context. +</P> + <BLOCKQUOTE> + <I>This train</I> is faster than <I>that airplane</I>. + </BLOCKQUOTE> +<P></P> + <BLOCKQUOTE> + I want to go from <I>this place</I> to <I>this place</I>. + </BLOCKQUOTE> +<P></P> +<P> +In particular, as in these examples, the meaning +can be obtained from accompanying pointing gestures. +</P> +<P> +Thus the meaning-bearing unit is neither the words nor the +gestures alone, but their combination. Demonstratives +thus provide an example of <B>integrated multimodality</B>, +as opposed to parallel multimodality. In parallel +multimodality, speech and other modes of communication +are just alternative ways to convey the same information. 
+</P> +<A NAME="toc3"></A> +<H3>Representing demonstratives in semantics and grammar</H3> +<P> +When formalizing the semantics of demonstratives, we can combine syntax with coordinates: +</P> + <BLOCKQUOTE> + I want to go from this place to this place + </BLOCKQUOTE> +<P></P> +<P> +is interpreted as something like +</P> +<PRE> + want(I, go, this(place,(123,45)), this(place,(98,10))) +</PRE> +<P> +Now, the same semantic value can be given in many ways, by performing +the clicks at different points of time in relation to the speech: +</P> + <BLOCKQUOTE> + I want to go from this place CLICK(123,45) to this place CLICK(98,10) + </BLOCKQUOTE> +<P></P> + <BLOCKQUOTE> + I want to go from this place to this place CLICK(123,45) CLICK(98,10) + </BLOCKQUOTE> +<P></P> + <BLOCKQUOTE> + CLICK(123,45) CLICK(98,10) I want to go from this place to this place + </BLOCKQUOTE> +<P></P> +<P> +How do we build the value compositionally in parsing? +Traditional parsing is sequential: its input is a string of tokens. +It works for demonstratives only if the pointing is adjacent to +the spoken expression. In the actual input, the demonstrative word +can be separated from the accompanying click by other words. The two +can also be simultaneous. +</P> +<A NAME="toc4"></A> +<H3>Asynchronous syntax in GF</H3> +<P> +What we need is a notion of <B>asynchronous parsing</B>, as opposed to +sequential parsing (where demonstrative words and clicks must be +adjacent). +</P> +<P> +We can implement asynchronous parsing in GF by exploiting the generality +of <B>linearization types</B>. A linearization type is the type of +the <B>concrete syntax objects</B> assigned to semantic values. +What a GF grammar defines is a relation +</P> +<PRE> + abstract syntax trees <---> concrete syntax objects +</PRE> +<P> +When modelling context-free grammar in GF, +the concrete syntax objects are just strings. 
+But they can be more structured objects as well - in general, they are +<B>records</B> of different kinds of objects. For example, +a demonstrative expression can be linearized into a record of two strings. +</P> +<PRE> + {s = "this place" ; + this place (coord 123 45) <---> p = "(123,45)" + } +</PRE> +<P> +The record +</P> +<PRE> + {s = "I want to go from this place to this place" ; + p = "(123,45) (98,10)" + } +</PRE> +<P> +represents any combination of the sentence and the clicks, as long +as the clicks appear in this order. +</P> +<A NAME="toc5"></A> +<H3>Example multimodal grammar: abstract syntax</H3> +<P> +A simple example of a multimodal GF grammar is the one called +the Tram Demo grammar. It was written by Björn Bringert within +the TALK project as a part of a dialogue system that +deals with queries about tram timetables. The system interprets +a speech input in combination with mouse clicks on a digital map. +</P> +<P> +The abstract syntax of (a minimal fragment of) the Tram Demo +grammar is +</P> +<PRE> + cat + Input, Dep, Dest, Click ; + fun + GoFromTo : Dep -> Dest -> Input ; -- "I want to go from x to y" + DepHere : Click -> Dep ; -- "from here" with click + DestHere : Click -> Dest ; -- "to here" with click + + CCoord : Int -> Int -> Click ; -- click coordinates +</PRE> +<P> +An English concrete syntax of the grammar is +</P> +<PRE> + lincat + Input, Dep, Dest = {s : Str ; p : Str} ; + Click = {p : Str} ; + + lin + GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s ; p = x.p ++ y.p} ; + DepHere c = {s = ["from here"] ; p = c.p} ; + DestHere c = {s = ["to here"] ; p = c.p} ; + + CCoord x y = {p = "(" ++ x.s ++ "," ++ y.s ++ ")"} ; +</PRE> +<P> +When the grammar is used in the actual system, standard parsing methods +are used for interpreting the integrated speech and click input. 
+Parsing appears on two levels: the speech input parsing +performed by the Nuance speech recognition program (without the clicks), +and the semantics-yielding parser sending input to the dialogue manager. +The latter parser just attaches the clicks to the speech input. The order +of the clicks is preserved, and the parser can hence associate each of +the clicks with the proper demonstrative. Here is the grammar used in the +two parsing phases. +</P> +<PRE> + cat + Query, -- whole content + Speech ; -- speech only + fun + QueryInput : Input -> Query ; -- the whole content shown + SpeechInput : Input -> Speech ; -- only the speech shown + + lincat + Query, Speech = {s : Str} ; + lin + QueryInput i = {s = i.s ++ ";" ++ i.p} ; + SpeechInput i = {s = i.s} ; +</PRE> +<P></P> +<A NAME="toc6"></A> +<H3>Digression: discontinuous constituents</H3> +<P> +The GF representation of integrated multimodality is +similar to the representation of <B>discontinuous constituents</B>. +For instance, assume <I>has arrived</I> is a verb phrase in English, +which can be used both in declarative sentences and questions, +</P> + <BLOCKQUOTE> + she <I>has arrived</I> + </BLOCKQUOTE> +<P></P> + <BLOCKQUOTE> + <I>has</I> she <I>arrived</I> + </BLOCKQUOTE> +<P></P> +<P> +In the question, the two words are separated from each other. If +<I>has arrived</I> is a constituent of the question, it is thus discontinuous. +To represent such constituents in GF, records can be used: +we split verb phrases (<CODE>VP</CODE>) into a finite and an infinitive part. 
+</P> +<PRE> + lincat VP = {fin, inf : Str} ; + + lin Indic np vp = {s = np.s ++ vp.fin ++ vp.inf} ; + lin Quest np vp = {s = vp.fin ++ np.s ++ vp.inf} ; +</PRE> +<P></P> +<A NAME="toc7"></A> +<H3>From grammars to dialogue systems</H3> +<P> +The general recipe for using GF when building dialogue systems +is to write a grammar with the following components: +</P> +<UL> +<LI>The abstract syntax defines the semantics (the "ontology") + of the domain of the system. +<LI>The concrete syntaxes define alternative modes of input and output. +</UL> + +<P> +The engineering advantages of this approach have to do partly with +the declarativity of the description, partly with the tools provided +by GF to derive different components of the system: +</P> +<UL> +<LI>The type checker guarantees that all the input and output + modes match with the ontology. +<LI>The grammar compiler generates parsers for each input grammar + and generators for each output grammar. +<LI>Translators between GF's abstract syntax and other ontology + description languages enable communication with different + kinds of dialogue managers and cover e.g. Prolog terms and XML objects. +<LI>Translators from GF's concrete syntax to speech recognition formats + make it possible to generate e.g. Nuance grammars and ATK language + models. +</UL> + +<P> +An example of this process is Björn Bringert's TramDemo. +More recently, grammars have been integrated to the GoDiS dialogue +manager by Prolog representations of abstract syntax. +</P> +<A NAME="toc8"></A> +<H2>Adding multimodality to a unimodal grammar</H2> +<P> +This section gives a recipe for making any unimodal grammar +multimodal, by adding pointing gestures to chosen expressions. The recipe +guarantees that the resulting grammar remains semantically well-formed, +i.e. type correct. 
+</P> +<A NAME="toc9"></A> +<H3>The multimodal conversion</H3> +<P> +The <B>multimodal conversion</B> of a grammar consists of seven +steps, of which the first is always the same, the second +involves a decision, and the rest are derivative: +</P> +<OL> +<LI>Add the category <CODE>Point</CODE> with a standard linearization type. +<PRE> + cat Point ; + lincat Point = {point : Str} ; +</PRE> +<LI>(Decision) Decide which constructors are demonstrative, i.e. take + a pointing gesture as an argument. Add a <CODE>Point</CODE> as their last argument. + The new type signatures for such constructors <I>d</I> have the form +<PRE> + fun d : ... -> Point -> D +</PRE> +<LI>(Derivative) Add a <CODE>point</CODE> field to the linearization type <I>L</I> of any + demonstrative category <I>D</I>, i.e. a category that has at least one demonstrative + constructor: +<PRE> + lincat D = L ** {point : Str} ; +</PRE> +<LI>(Derivative) If some other category <I>C</I> has a constructor <I>d</I> that takes + demonstratives as arguments, make it demonstrative by adding a <I>point</I> field + to its linearization type. +<LI>(Derivative) Store the <CODE>point</CODE> field in the linearization <I>t</I> of any + constructor <I>d</I> that has been made demonstrative: +<PRE> + lin d x1 ... xn p = t x1 ... xn ** {point = p.point} ; +</PRE> +<LI>(Derivative) For each constructor <I>f</I> that takes demonstratives <I>D_1,...,D_n</I> + as arguments, collect the <I>point</I> fields of the arguments in the <I>point</I> + field of the value: +<PRE> + lin f x_1 ... x_m = + t x_1 ... x_m ** {point = x_d1.point ++ ... ++ x_dn.point} ; +</PRE> + Make sure that the pointings <CODE>x_d1.point ... x_dn.point</CODE> are concatenated + in the same order as the arguments appear in the <I>linearization</I> <I>t</I>, + which is not necessarily the same as the abstract argument order. 
+<LI>(Derivative) To preserve type correctness, add an empty + <CODE>point</CODE> field to the linearization <I>t</I> of any + constructor <I>c</I> of a demonstrative category: +<PRE> + lin c x1 ... xn = t x1 ... xn ** {point = []} ; +</PRE> +</OL> + +<A NAME="toc10"></A> +<H3>An example of the conversion</H3> +<P> +Start with a Tram Demo grammar with no demonstratives, but just +tram stop names and the indexical <I>here</I> (interpreted as e.g. the user's +standing place). +</P> +<PRE> + cat + Input, Dep, Dest, Name ; + fun + GoFromTo : Dep -> Dest -> Input ; + DepHere : Dep ; + DestHere : Dest ; + DepName : Name -> Dep ; + DestName : Name -> Dest ; + + Almedal : Name ; +</PRE> +<P> +A unimodal English concrete syntax of the grammar is +</P> +<PRE> + lincat + Input, Dep, Dest, Name = {s : Str} ; + + lin + GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s} ; + DepHere = {s = ["from here"]} ; + DestHere = {s = ["to here"]} ; + DepName n = {s = ["from"] ++ n.s} ; + DestName n = {s = ["to"] ++ n.s} ; + + Almedal = {s = "Almedal"} ; +</PRE> +<P> +Let us follow the steps of the recipe. +</P> +<OL> +<LI>We add the category <CODE>Point</CODE> and its linearization type. +<LI>We decide that <CODE>DepHere</CODE> and <CODE>DestHere</CODE> involve a pointing gesture. +<LI>We add <CODE>point</CODE> to the linearization types of <CODE>Dep</CODE> and <CODE>Dest</CODE>. +<LI>Therefore, also add <CODE>point</CODE> to <CODE>Input</CODE>. (But <CODE>Name</CODE> remains unimodal.) +<LI>Add <CODE>p.point</CODE> to the linearizations of <CODE>DepHere</CODE> and <CODE>DestHere</CODE>. +<LI>Concatenate the points of the arguments of <CODE>GoFromTo</CODE>. +<LI>Add an empty <CODE>point</CODE> to <CODE>DepName</CODE> and <CODE>DestName</CODE>. 
+</OL> + +<P> +In the resulting grammar, one category is added and +two functions are changed in the abstract syntax (annotated by the step numbers): +</P> +<PRE> + cat + Point ; -- 1 + fun + DepHere : Point -> Dep ; -- 2 + DestHere : Point -> Dest ; -- 2 + +</PRE> +<P> +The concrete syntax in its entirety looks as follows +</P> +<PRE> + lincat + Dep, Dest = {s : Str ; point : Str} ; -- 3 + Input = {s : Str ; point : Str} ; -- 4 + Name = {s : Str} ; + Point = {point : Str} ; -- 1 + lin + GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s ; -- 6 + point = x.point ++ y.point + } ; + DepHere p = {s = ["from here"] ; -- 5 + point = p.point + } ; + DestHere p = {s = ["to here"] ; -- 5 + point = p.point + } ; + DepName n = {s = ["from"] ++ n.s ; -- 7 + point = [] + } ; + DestName n = {s = ["to"] ++ n.s ; -- 7 + point = [] + } ; + Almedal = {s = "Almedal"} ; +</PRE> +<P> +What we need in addition, to use the grammar in applications, are +</P> +<OL> +<LI>Constructors for <CODE>Point</CODE>, e.g. coordinate pairs. +<LI>Top-level categories, like <CODE>Query</CODE> and <CODE>Speech</CODE> in the original. +</OL> + +<P> +But their proper place is probably in another grammar module, so that +the core Tram Demo grammar can be used in different systems e.g. +encoding clicks in different ways. +</P> +<A NAME="toc11"></A> +<H3>Multimodal conversion combinators</H3> +<P> +GF is a functional programming language, and we exploit this +by providing a set of combinators that makes the multimodal conversion easier +and clearer. We start with the type of sequences of pointing gestures. +</P> +<PRE> + Point : Type = {point : Str} ; +</PRE> +<P> +To make a record type multimodal is to extend it with <CODE>Point</CODE>. +The record extension operator <CODE>**</CODE> is needed here. 
+</P> +<PRE> + Dem : Type -> Type = \t -> t ** Point ; +</PRE> +<P> +To construct, use, and concatenate pointings: +</P> +<PRE> + mkPoint : Str -> Point = \s -> {point = s} ; + + noPoint : Point = mkPoint [] ; + + point : Point -> Str = \p -> p.point ; + + concatPoint : (x,y : Point) -> Point = \x,y -> + mkPoint (point x ++ point y) ; +</PRE> +<P> +Finally, to add pointing to a record, with the limiting case where no demonstrative is needed: +</P> +<PRE> + mkDem : (t : Type) -> t -> Point -> Dem t = \_,x,s -> x ** s ; + + nonDem : (t : Type) -> t -> Dem t = \t,x -> mkDem t x noPoint ; +</PRE> +<P> +Let us rewrite the Tram Demo grammar by using these combinators: +</P> +<PRE> + oper + SS : Type = {s : Str} ; + lincat + Input, Dep, Dest = Dem SS ; + Name = SS ; + + lin + GoFromTo x y = {s = ["I want to go"] ++ x.s ++ y.s} ** + concatPoint x y ; + DepHere = mkDem SS {s = ["from here"]} ; + DestHere = mkDem SS {s = ["to here"]} ; + DepName n = nonDem SS {s = ["from"] ++ n.s} ; + DestName n = nonDem SS {s = ["to"] ++ n.s} ; + + Almedal = {s = "Almedal"} ; +</PRE> +<P> +The type synonym <CODE>SS</CODE> is introduced to make the combinator applications +concise. Notice the use of partial application in <CODE>DepHere</CODE> and +<CODE>DestHere</CODE>; an equivalent way to write this is +</P> +<PRE> + DepHere p = mkDem SS {s = ["from here"]} p ; +</PRE> +<P></P> +<A NAME="toc12"></A> +<H2>Multimodal resource grammars</H2> +<P> +The main advantage of using GF when building dialogue systems is +that various components of the system +can be automatically generated from GF grammars. +Writing these grammars, however, can still be a considerable +task. A case in point is multilingual systems: +how to localize e.g. a system built in a car to +the languages of all those customers to whom the +car is sold? This problem has been the main focus of +GF for some years, and the solution on which most work has been +done is the development of <B>resource grammar libraries</B>. 
+These libraries work in the same way as program libraries +in software engineering, enabling a division of labour +between linguists and domain experts. +</P> +<P> +One of the goals in the resource grammars of different +languages has been to provide a <B>language-independent API</B>, +which makes the same resource grammar functions available for +different languages. For instance, the categories +<CODE>S</CODE>, <CODE>NP</CODE>, and <CODE>VP</CODE> are available in all of the +10 languages currently supported, and so is the function +</P> +<PRE> + PredVP : NP -> VP -> S +</PRE> +<P> +which corresponds to the rule <CODE>S -> NP VP</CODE> in phrase +structure grammar. However, there are several levels of abstraction +between the function <CODE>PredVP</CODE> and the phrase structure rule, +because the rule is implemented in such different ways in different +languages. In particular, discontinuous constituents are needed in +various degrees to make the rule work in different languages. +</P> +<P> +Now, dealing with discontinuous constituents is one of the demanding +aspects of multilingual grammar writing that the resource grammar +API is designed to hide. But the proposed treatment of integrated +multimodality is heavily dependent on similar things. What can we +do to make multimodal grammars easier to write (for different languages)? +There are two orthogonal answers: +</P> +<OL> +<LI>Use resource grammars to write a unimodal dialogue grammar and + then apply the multimodal + conversion to manually chosen parts. +<LI>Use <B>multimodal resource grammars</B> to derive multimodal + dialogue system grammars directly. +</OL> + +<P> +The multimodal resource grammar library has been obtained from +the unimodal one by applying the multimodal conversion manually. +In addition, the API has been simplified +by leaving out structures needed in written technical documents +(the original application area of GF) but not in spoken dialogue. 
+</P> +<P> +In the following subsections, we will show a part of the +multimodal resource grammar API, limited to a fragment that +is needed to get the main ideas and to reimplement the +Tram Demo grammar. The reimplementation shows one more advantage +of the resource grammar approach: dialogue systems can be +automatically instantiated to different languages. +</P> +<A NAME="toc13"></A> +<H3>Resource grammar API</H3> +<P> +The resource grammar API has three main kinds of entries: +</P> +<OL> +<LI>Language-independent linguistic structures (``linguistic ontology''), e.g. +<PRE> + PredVP : NP -> VP -> S ; -- "Mary helps him" +</PRE> +<LI>Language-specific syntax extensions, e.g. Swedish and German fronting +topicalization +<PRE> + TopicObj : NP -> VP -> S ; -- "honom hjälper Mary" +</PRE> +<LI>Language-specific lexical constructors, e.g. Germanic <I>Ablaut</I> patterns +<PRE> + irregV : (sing,sang,sung : Str) -> V ; +</PRE> +</OL> + +<P> +The first two kinds of entries are <CODE>cat</CODE> and <CODE>fun</CODE> definitions +in an abstract syntax. The multimodal, restricted API has +e.g. the following categories. Their names are obtained from +the corresponding unimodal categories by prefixing <CODE>M</CODE>. +</P> +<PRE> + MS ; -- multimodal sentence or question + MQS ; -- multimodal wh question + MImp ; -- multimodal imperative + MVP ; -- multimodal verb phrase + MNP ; -- multimodal (demonstrative) noun phrase + MAdv ; -- multimodal (demonstrative) adverbial + + Point ; -- pointing gesture +</PRE> +<P></P> +<A NAME="toc14"></A> +<H3>Multimodal API: functions for building demonstratives</H3> +<P> +Demonstrative pronouns can be used both as noun phrases and +as determiners. +</P> +<PRE> + this_MNP : Point -> MNP ; -- this + thisDet_MNP : CN -> Point -> MNP ; -- this car +</PRE> +<P> +There are also demonstrative adverbs, and prepositions give +a productive way to build more adverbs. 
+</P> +<PRE> + here_MAdv : Point -> MAdv ; -- here + here7from_MAdv : Point -> MAdv ; -- from here + + MPrepNP : Prep -> MNP -> MAdv ; -- in this car +</PRE> +<P></P> +<A NAME="toc15"></A> +<H3>Multimodal API: functions for building sentences and phrases</H3> +<P> +A handful of predication rules construct sentences, questions, and imperatives. +</P> +<PRE> + MPredVP : MNP -> MVP -> MS ; -- this plane flies here + MQPredVP : MNP -> MVP -> MQS ; -- does this plane fly here + MQuestVP : IP -> MVP -> MQS ; -- who flies here + MImpVP : MVP -> MImp ; -- fly here! +</PRE> +<P> +Verb phrases are constructed from verbs (inherited as such from +the unimodal API) by providing their complements. +</P> +<PRE> + MUseV : V -> MVP ; -- flies + MComplV2 : V2 -> MNP -> MVP ; -- takes this + MComplVV : VV -> MVP -> MVP ; -- wants to take this +</PRE> +<P> +A multimodal adverb can be attached to a verb phrase. +</P> +<PRE> + MAdvVP : MVP -> MAdv -> MVP ; -- flies here +</PRE> +<P></P> +<A NAME="toc16"></A> +<H3>Language-independent implementation: examples</H3> +<P> +The implementation makes heavy use of the multimodal conversion +combinators. It adds a <CODE>point</CODE> field to whatever the implementation of the unimodal +category is in any language. Thus, for example +</P> +<PRE> + lincat + MVP = Dem VP ; + MNP = Dem NP ; + MAdv = Dem Adv ; + + lin + this_MNP = mkDem NP this_NP ; + -- i.e. 
 this_MNP p = this_NP ** {point = p.point} ; + + MComplV2 verb obj = mkDem VP (ComplV2 verb obj) obj ; + + MAdvVP vp adv = mkDem VP (AdvVP vp adv) (concatPoint vp adv) ; +</PRE> +<P></P> +<A NAME="toc17"></A> +<H3>Multimodal API: interface to unimodal expressions</H3> +<P> +Using nondemonstrative expressions as demonstratives: +</P> +<PRE> + DemNP : NP -> MNP ; + DemAdv : Adv -> MAdv ; +</PRE> +<P> +Building top-level phrases: +</P> +<PRE> + PhrMS : Pol -> MS -> Phr ; + PhrMQS : Pol -> MQS -> Phr ; + PhrMImp : Pol -> MImp -> Phr ; +</PRE> +<P></P> +<A NAME="toc18"></A> +<H3>Instantiating multimodality to different languages</H3> +<P> +The implementation above has only used the resource grammar API, +not the concrete implementations. The library <CODE>Demonstrative</CODE> +is a <B>parametrized module</B>, also called a <B>functor</B>, which +has the following structure +</P> +<PRE> + incomplete concrete DemonstrativeI of Demonstrative = + Cat, TenseX ** open Test, Structural in { + + -- lincat and lin rules + + } +</PRE> +<P> +It can be <B>instantiated</B> to different languages as follows. 
+</P> +<PRE> + concrete DemonstrativeEng of Demonstrative = + CatEng, TenseX ** DemonstrativeI with + (Test = TestEng), + (Structural = StructuralEng) ; + + concrete DemonstrativeSwe of Demonstrative = + CatSwe, TenseX ** DemonstrativeI with + (Test = TestSwe), + (Structural = StructuralSwe) ; +</PRE> +<P></P> +<A NAME="toc19"></A> +<H3>Language-independent reimplementation of TramDemo</H3> +<P> +Again using the functor idea, we reimplement <CODE>TramDemo</CODE> +as follows: +</P> +<PRE> + incomplete concrete TramI of Tram = open Multimodal in { + + lincat + Query = Phr ; Input = MS ; + Dep, Dest = MAdv ; Click = Point ; + lin + QInput = PhrMS PPos ; + + GoFromTo x y = + MPredVP (DemNP (UsePron i_Pron)) + (MAdvVP (MAdvVP (MComplVV want_VV (MUseV go_V)) x) y) ; + + DepHere = here7from_MAdv ; + DestHere = here7to_MAdv ; + DepName s = MPrepNP from_Prep (DemNP (UsePN (SymbPN (MkSymb s)))) ; + DestName s = MPrepNP to_Prep (DemNP (UsePN (SymbPN (MkSymb s)))) ; + } +</PRE> +<P> +Then we can instantiate this to all languages for which +the <CODE>Multimodal</CODE> API has been implemented: +</P> +<PRE> + concrete TramEng of Tram = TramI with + (Multimodal = MultimodalEng) ; + + concrete TramSwe of Tram = TramI with + (Multimodal = MultimodalSwe) ; + + concrete TramFre of Tram = TramI with + (Multimodal = MultimodalFre) ; +</PRE> +<P></P> +<A NAME="toc20"></A> +<H3>The order problem</H3> +<P> +It was pointed out in the section on the multimodal conversion that +the concrete word order may be different from the abstract one, +and vary between different languages. For instance, Swedish +topicalization +</P> + <BLOCKQUOTE> + Det här tåget vill den här kunden inte ta. + </BLOCKQUOTE> +<P></P> +<P> +(``this train, this customer doesn't want to take'') may well have +an abstract syntax of a form in which the customer appears +before the train. +</P> +<P> +This is a problem for the implementor of the resource grammar. 
+It means that some parts of the resource must be written manually +and not as a functor. +However, the <I>user</I> of the resource can safely +ignore the word order problem, if it is correctly dealt with in +the resource. +</P> +<A NAME="toc21"></A> +<H3>A recipe for using a resource library</H3> +<P> +In the beginning, we believed that resource grammars were all that +an application grammarian needs to write a concrete syntax. +However, experience has shown that it can be heavy to start +the grammar development in this way: selecting functions from +a resource API requires more abstract thinking than just +writing things (maybe even in a context-free grammar notation, +also supported by GF). This experience has led to the following +steps for grammar development, which permit +a quick start of the work and then increase abstraction towards the end +in order to localize the grammar in different languages. +</P> +<OL> +<LI>Encode the domain ontology in an abstract syntax, <CODE>Domain</CODE>. +<LI>Write a rough concrete syntax in English, <CODE>DomainRough</CODE>. + This can be oversimplified and overgenerating. +<LI>Reimplement it using the resource, and build a functor <CODE>DomainI</CODE>. +<LI>Instantiate this functor to different languages, and test. +<LI>If a rule does not give a satisfactory result in a language, use its resource in + a different way (<B>compile-time transfer</B>). +</OL> + + +<!-- html code generated by txt2tags 2.3 (http://txt2tags.sf.net) --> +<!-- cmdline: txt2tags -\-toc multimodal.txt --> +</BODY></HTML>
