author aarne <aarne@cs.chalmers.se> 2006-06-10 23:00:54 +0000
committer aarne <aarne@cs.chalmers.se> 2006-06-10 23:00:54 +0000
commit d44e420b6edb0f8d48bf92503372815491d11fa3
tree cdd3c416aa2ed3acd874ff2b6bc9afae3e8293c0
parent 1f2310da94aa0d56214329ff15ada03ca6d65297
more overview of linguistic structures
-rw-r--r-- doc/resource.txt 136
1 files changed, 135 insertions, 1 deletions
diff --git a/doc/resource.txt b/doc/resource.txt
index c93e7f2d2..7d5640203 100644
--- a/doc/resource.txt
+++ b/doc/resource.txt
@@ -1,9 +1,11 @@
The GF Resource Grammar Library
+==Overview of linguistic structures==
+
The outermost linguistic structure is Text. Texts are composed
from Phrases followed by punctuation marks - one of ".", "?" or
"!" (with their proper variants in Spanish and Arabic). Here is an
-example of a text.
+example of a Text.
John walks. Why? He doesn't want to sleep!
@@ -29,3 +31,135 @@ text is thus well-formed.
John walks. John walks? John walks!
+What is the difference between Phrase and Utterance? Just technical:
+a Phrase is an Utterance with an optional leading conjunction ("but")
+and an optional trailing vocative ("John", "please").
+
+The richest of the categories below Utterance is S, Sentence. A Sentence
+is formed from a Clause, by fixing its Tense, Anteriority, and Polarity.
+The difference between Sentence and Clause is thus also rather technical.
+For example, each of the following strings has a distinct syntax tree
+of category Sentence:
+
+ John walks
+ John doesn't walk
+ John walked
+ John didn't walk
+ John has walked
+ John hasn't walked
+ John will walk
+ John won't walk
+ ...
+
+whereas in the category Clause all of them are just different forms of
+the same tree.
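
The relation between one Clause and its many Sentences can be sketched
outside GF. The Python fragment below is only an illustration, not library
code: it hard-wires the clause "John walk" and enumerates its Sentence forms
by fixing Tense, Anteriority, and Polarity. The names TPres, TPast, ASimul,
AAnter, PPos and PNeg come from the text; "TFut" is an assumed label for the
"will walk" forms, and use_cl is a hypothetical helper.

```python
from itertools import product

# Parameter values named in the text, plus "TFut", an assumed label for the
# "will walk" forms (the text does not name that constructor).
TENSES = ["TPres", "TPast", "TFut"]
ANTERS = ["ASimul", "AAnter"]
POLS = ["PPos", "PNeg"]


def use_cl(tense, anter, pol):
    """Toy English linearization of the clause 'John walk' as a Sentence."""
    neg = pol == "PNeg"
    if tense == "TFut":
        aux = "won't" if neg else "will"
        rest = "have walked" if anter == "AAnter" else "walk"
        return f"John {aux} {rest}"
    if anter == "AAnter":
        aux = {"TPres": "has", "TPast": "had"}[tense]
        return f"John {aux}n't walked" if neg else f"John {aux} walked"
    if neg:
        aux = {"TPres": "doesn't", "TPast": "didn't"}[tense]
        return f"John {aux} walk"
    return "John walks" if tense == "TPres" else "John walked"


# One clause, many sentences: each parameter combination is a distinct tree.
SENTENCES = {
    (t, a, p): use_cl(t, a, p) for t, a, p in product(TENSES, ANTERS, POLS)
}

for params, sentence in SENTENCES.items():
    print(params, sentence)
```

With three assumed tenses the sketch yields twelve distinct strings; the
eight listed above are among them.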
+
+The following syntax tree of the Text "John walks." gives an overview
+of the structural levels.
+
+ Node                       Type of subtree   Alternative constructors
+
+ 1. TFullStop               : Text            TQuestMark
+ 2.   (PhrUtt               : Phr
+ 3.     NoPConj             : PConj           but_PConj
+ 4.     (UttS               : Utt             UttQS
+ 5.       (UseCl            : S               UseQCl
+ 6.         TPres           : Tense           TPast
+ 7.         ASimul          : Anter           AAnter
+ 8.         PPos            : Pol             PNeg
+ 9.         (PredVP         : Cl
+10.           (UsePN        : NP              UsePron, DetCN
+11.             john_PN)    : PN              mary_PN
+12.           (UseV         : VP              ComplV2, ComplV3
+13.             walk_V))))  : V               sleep_V
+14.     NoVoc)              : Voc             please_Voc
+15.   TEmpty                : Text
+
+Here are some examples of the results of changing constructors.
+
+ 1. TFullStop -> TQuestMark    John walks?
+ 3. NoPConj   -> but_PConj     But John walks.
+ 6. TPres     -> TPast         John walked.
+ 7. ASimul    -> AAnter        John has walked.
+ 8. PPos      -> PNeg          John doesn't walk.
+11. john_PN   -> mary_PN       Mary walks.
+13. walk_V    -> sleep_V       John sleeps.
+14. NoVoc     -> please_Voc    John walks please.
+
+Of course, not all constructors can be changed so freely, because the
+resulting tree would no longer be well-typed. Here are some changes that
+involve several constructors at once:
+
+ 4- 5. UttS (UseCl ...)  ->  UttQS (UseQCl (... QuestCl ...))   Does John walk?
+10-11. UsePN john_PN     ->  UsePron we_Pron                    We walk.
+12-13. UseV walk_V       ->  ComplV2 love_V2 this_NP            John loves this.
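
Whether a constructor swap keeps the tree well-typed can be checked
mechanically. The Python sketch below is not GF code: it assumes constructor
signatures read off the tree of "John walks." above, encodes trees as nested
tuples, and uses the hypothetical helpers type_of and swap.

```python
# Signatures read off the tree above: name -> (argument types, result type).
SIG = {
    "TFullStop": (["Phr", "Text"], "Text"),
    "TQuestMark": (["Phr", "Text"], "Text"),
    "TEmpty": ([], "Text"),
    "PhrUtt": (["PConj", "Utt", "Voc"], "Phr"),
    "NoPConj": ([], "PConj"),
    "but_PConj": ([], "PConj"),
    "UttS": (["S"], "Utt"),
    "UseCl": (["Tense", "Anter", "Pol", "Cl"], "S"),
    "TPres": ([], "Tense"),
    "TPast": ([], "Tense"),
    "ASimul": ([], "Anter"),
    "AAnter": ([], "Anter"),
    "PPos": ([], "Pol"),
    "PNeg": ([], "Pol"),
    "PredVP": (["NP", "VP"], "Cl"),
    "UsePN": (["PN"], "NP"),
    "john_PN": ([], "PN"),
    "mary_PN": ([], "PN"),
    "UseV": (["V"], "VP"),
    "walk_V": ([], "V"),
    "sleep_V": ([], "V"),
    "NoVoc": ([], "Voc"),
    "please_Voc": ([], "Voc"),
}


def type_of(tree):
    """Return the result type of a tree, or None if it is ill-typed."""
    head, *args = tree
    arg_types, result = SIG[head]
    if [type_of(a) for a in args] != arg_types:
        return None
    return result


def swap(tree, old, new):
    """Replace every occurrence of constructor `old` by `new`."""
    head, *args = tree
    return (new if head == old else head, *[swap(a, old, new) for a in args])


JOHN_WALKS = (
    "TFullStop",
    ("PhrUtt",
     ("NoPConj",),
     ("UttS",
      ("UseCl", ("TPres",), ("ASimul",), ("PPos",),
       ("PredVP", ("UsePN", ("john_PN",)), ("UseV", ("walk_V",))))),
     ("NoVoc",)),
    ("TEmpty",),
)

print(type_of(JOHN_WALKS))                              # well-typed Text
print(type_of(swap(JOHN_WALKS, "TPres", "TPast")))      # still a Text
print(type_of(swap(JOHN_WALKS, "john_PN", "walk_V")))   # ill-typed
```

A swap within one type (TPres for TPast) leaves the tree a Text, whereas
putting walk_V where a PN is expected makes type_of return None.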
+
+The linguistic phenomena most discussed in traditional grammars and modern
+syntax belong to the level of Clauses, that is, lines 9-13, and occasionally
+to Sentences, lines 5-13. At this level, the major categories are
+NP (Noun Phrase) and VP (Verb Phrase). A Clause typically consists of an
+NP and a VP. The internal structure of both NP and VP can be very complex,
+and these categories are mutually recursive: not only can a VP contain an NP,
+
+ [VP loves [NP Mary]]
+
+but an NP can also contain a VP
+
+ [NP every man [RS who [VP walks]]]
+
+(a labelled bracketing like this is of course just a rough approximation of
+a GF syntax tree, but still a useful device of exposition).
+
+Most of the resource modules thus define functions that are used inside
+NPs and VPs. Here is a brief overview:
+
+Noun: How to construct NPs. The three main mechanisms
+for constructing NPs are
+
+- from proper names: John
+- from pronouns: we
+- from common nouns by determiners: this man
+
+The Noun module also defines the construction of common nouns. The most frequent ways are
+
+- lexical noun items: man
+- adjectival modification: old man
+- relative clause modification: man who sleeps
+
+Verb: How to construct VPs. The main mechanism is verbs with their arguments:
+
+- one-place verbs: walks
+- two-place verbs: loves Mary
+- three-place verbs: gives her a kiss
+- sentence-complement verbs: says that it is cold
+- VP-complement verbs: wants to give her a kiss
+
+A special verb is the copula, "be" in English, which in some languages is not realized by a verb at all.
+A copula can take different kinds of complement:
+
+- an adjectival phrase: (John is) old
+- an adverb: (John is) here
+- a noun phrase: (John is) a man
+
+The resource modules are named after the kind of phrases that are constructed in them,
+and they can be roughly classified by the "level" or "size" of expressions that are
+formed in them:
+
+- Larger than sentence: Text, Phrase
+- Same level as sentence: Sentence, Question, Relative
+- Parts of sentence: Adjective, Adverb, Noun, Verb
+- Cross-cut: Conjunction
+
+Because of mutual recursion such as embedded sentences, this classification is
+not a complete order. However, no mutual dependence between the modules is
+needed in a formal sense: they can all be compiled separately. This is due
+to the module Cat, which defines the type system common to the other modules.
+For instance, the types NP and VP are defined in Cat, and the module Verb only
+needs to know what is given in Cat, not what is given in Noun. To implement
+a rule such as
+
+ Verb.ComplV2 : V2 -> NP -> VP
+
+it is enough to know the linearization type of NP (given in Cat), not what
+ways there are to build NPs (given in Noun), since all these ways must
+conform to the linearization type defined in Cat.
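
The role of Cat can be mimicked in a general-purpose language. The Python
sketch below is only an analogy under strong simplifications: the
linearization types NP, V2 and VP are assumed to be records with a single
string field (real linearization types are richer), and the functions
compl_v2, use_pn and det_cn are hypothetical stand-ins for rules in Verb and
Noun.

```python
from dataclasses import dataclass


# --- "Cat": the shared linearization types (toy versions) ---
@dataclass
class NP:
    s: str


@dataclass
class V2:
    s: str


@dataclass
class VP:
    s: str


# --- "Verb": needs only the types above, nothing from "Noun" ---
def compl_v2(verb: V2, obj: NP) -> VP:
    """Toy counterpart of ComplV2 : V2 -> NP -> VP."""
    return VP(f"{verb.s} {obj.s}")


# --- "Noun": also needs only the types above ---
def use_pn(name: str) -> NP:
    """Build an NP from a proper name."""
    return NP(name)


def det_cn(det: str, noun: str) -> NP:
    """Build an NP from a determiner and a common noun."""
    return NP(f"{det} {noun}")


# compl_v2 accepts any NP, however it was built.
print(compl_v2(V2("loves"), use_pn("Mary")).s)
print(compl_v2(V2("loves"), det_cn("this", "man")).s)
```

The "Verb" part never refers to use_pn or det_cn; it relies only on the
shape of NP, which is the point made about Cat above.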