author aarne <aarne@cs.chalmers.se> 2008-06-27 11:32:49 +0000
committer aarne <aarne@cs.chalmers.se> 2008-06-27 11:32:49 +0000
commit 64d2a981a99c8f48f85c4efd0cecd1db1e5ce93a
tree 8ec777785ae6b99e4ade6ab7c97a7653317b82ad /doc/overview-resource.txt
parent 032531c6a690edbb377ff11ee2a743a30c5bf500
more rm in doc
Diffstat (limited to 'doc/overview-resource.txt')
-rw-r--r-- doc/overview-resource.txt 300
1 files changed, 0 insertions, 300 deletions
diff --git a/doc/overview-resource.txt b/doc/overview-resource.txt
deleted file mode 100644
index 2f9b2cd04..000000000
--- a/doc/overview-resource.txt
+++ /dev/null
@@ -1,300 +0,0 @@
-==Texts, phrases, and utterances==
-
-The outermost linguistic structure is ``Text``. A ``Text`` is composed
-of Phrases (``Phr``), each followed by a punctuation mark: one of ".", "?", or
-"!" (with their proper variants in Spanish and Arabic). Here is an
-example of a ``Text`` string.
-```
- John walks. Why? He doesn't want to sleep!
-```
-Phrases are mostly built from Utterances (``Utt``), which in turn are
-declarative sentences, questions, or imperatives - but there
-are also "one-word utterances" consisting of noun phrases
-or other subsentential phrases. Some Phrases are atomic,
-for instance "yes" and "no". Here are some examples of Phrases.
-```
- yes
- come on, John
- but John walks
- give me the stick please
- don't you know that he is sleeping
- a glass of wine
- a glass of wine please
-```
-There is no connection between the punctuation marks and the
-types of utterances. This reflects the fact that the punctuation
-mark in a real text is selected as a function of the speech act
-rather than the grammatical form of an utterance. The following
-text is thus well-formed.
-```
- John walks. John walks? John walks!
-```
-What is the difference between Phrase and Utterance? It is purely technical:
-a Phrase is an Utterance with an optional leading conjunction ("but")
-and an optional trailing vocative ("John", "please").
-
-
-==Sentences and clauses==
-
-TODO: use overloaded operations in the examples.
-
-The richest of the categories below Utterance is ``S``, Sentence. A Sentence
-is formed from a Clause (``Cl``), by fixing its Tense, Anteriority, and Polarity.
-For example, each of the following strings has a distinct syntax tree
-in the category Sentence:
-```
- John walks
- John doesn't walk
- John walked
- John didn't walk
- John has walked
- John hasn't walked
- John will walk
- John won't walk
- ...
-```
-whereas in the category Clause all of them are just different forms of
-the same tree.
-The difference between Sentence and Clause is thus also rather technical.
-It may not correspond exactly to any standard usage of the terms
-"clause" and "sentence".
-
-Figure 1 shows a type-annotated syntax tree of the Text "John walks."
-and gives an overview of the structural levels.
-
-#BFIG
-
-```
-Node Constructor              Value type  Other constructors
-------------------------------------------------------------
- 1.  TFullStop                Text        TQuestMark
- 2.    (PhrUtt                Phr
- 3.      NoPConj              PConj       but_PConj
- 4.      (UttS                Utt         UttQS
- 5.        (UseCl             S           UseQCl
- 6.          TPres            Tense       TPast
- 7.          ASimul           Anter       AAnter
- 8.          PPos             Pol         PNeg
- 9.          (PredVP          Cl
-10.            (UsePN         NP          UsePron, DetCN
-11.              john_PN)     PN          mary_PN
-12.            (UseV          VP          ComplV2, ComplV3
-13.              walk_V))))   V           sleep_V
-14.      NoVoc)               Voc         please_Voc
-15.    TEmpty                 Text
-```
-
-#BCENTER
-Figure 1. Type-annotated syntax tree of the Text "John walks."
-#ECENTER
-
-#EFIG
-
-Here are some examples of the results of changing constructors.
-```
- 1. TFullStop -> TQuestMark    John walks?
- 3. NoPConj   -> but_PConj     But John walks.
- 6. TPres     -> TPast         John walked.
- 7. ASimul    -> AAnter        John has walked.
- 8. PPos      -> PNeg          John doesn't walk.
-11. john_PN   -> mary_PN       Mary walks.
-13. walk_V    -> sleep_V       John sleeps.
-14. NoVoc     -> please_Voc    John walks please.
-```
-Of course, not every constructor can be changed so freely, because the
-resulting tree would no longer be well-typed. Here are some changes that involve
-several constructors at once:
-```
- 4- 5. UttS (UseCl ...) ->
-         UttQS (UseQCl (... QuestCl ...))   Does John walk?
-10-11. UsePN john_PN ->
-         UsePron we_Pron                    We walk.
-12-13. UseV walk_V ->
-         ComplV2 love_V2 this_NP            John loves this.
-```
-
-
-==Parts of sentences==
-
-The linguistic phenomena most often discussed in both traditional grammars and modern
-syntax belong to the level of Clauses, that is, lines 9-13 of Figure 1, and occasionally
-to Sentences, lines 5-13. At this level, the major categories are
-``NP`` (Noun Phrase) and ``VP`` (Verb Phrase). A Clause typically
-consists of just an ``NP`` and a ``VP``.
-The internal structure of both ``NP`` and ``VP`` can be very complex,
-and these categories are mutually recursive: not only can a ``VP``
-contain an ``NP``,
-```
- [VP loves [NP Mary]]
-```
-but an ``NP`` can also contain a ``VP``,
-```
- [NP every man [RS who [VP walks]]]
-```
-(a labelled bracketing like this is of course just a rough approximation of
-a GF syntax tree, but still a useful device of exposition).
-
-Most of the resource modules thus define functions that are used inside
-NPs and VPs. Here is a brief overview:
-
-**Noun**. How to construct NPs. The three main mechanisms
-for constructing NPs are
-- from proper names: "John"
-- from pronouns: "we"
-- from common nouns by determiners: "this man"
-
-
-The ``Noun`` module also defines the construction of common nouns.
-The most frequent ways are
-- lexical noun items: "man"
-- adjectival modification: "old man"
-- relative clause modification: "man who sleeps"
-- application of relational nouns: "successor of the number"
-
-
-**Verb**.
-How to construct VPs. The main mechanism is verbs with their arguments,
-for instance,
-- one-place verbs: "walks"
-- two-place verbs: "loves Mary"
-- three-place verbs: "gives her a kiss"
-- sentence-complement verbs: "says that it is cold"
-- VP-complement verbs: "wants to give her a kiss"
-
-
-A special verb is the copula: "be" in English, but in some languages
-not realized by a verb at all.
-A copula can take different kinds of complement:
-- an adjectival phrase: "(John is) old"
-- an adverb: "(John is) here"
-- a noun phrase: "(John is) a man"
-
-
-**Adjective**.
-How to construct ``AP``s. The main ways are
-- positive forms of adjectives: "old"
-- comparative forms with object of comparison: "older than John"
-
-
-**Adverb**.
-How to construct ``Adv``s. The main ways are
-- from adjectives: "slowly"
-- as prepositional phrases: "in the car"
-
-
-==Modules and their names==
-
-This section is not necessary for users of the library.
-
-TODO: explain the overloaded API.
-
-The resource modules are named after the kind of
-phrases that are constructed in them,
-and they can be roughly classified by the "level" or "size" of expressions that are
-formed in them:
-- Larger than sentence: ``Text``, ``Phrase``
-- Same level as sentence: ``Sentence``, ``Question``, ``Relative``
-- Parts of sentence: ``Adjective``, ``Adverb``, ``Noun``, ``Verb``
-- Cross-cut (coordination): ``Conjunction``
-
-
-Because of mutual recursion such as in embedded sentences, this classification is
-not a complete order. However, no mutual dependence is needed between the
-modules themselves - they can all be compiled separately. This is due
-to the module ``Cat``, which defines the type system common to the other modules.
-For instance, the types ``NP`` and ``VP`` are defined in ``Cat``,
-and the module ``Verb`` only
-needs to know what is given in ``Cat``, not what is given in ``Noun``. To implement
-a rule such as
-```
- Verb.ComplV2 : V2 -> NP -> VP
-```
-it is enough to know the linearization type of ``NP``
-(as well as those of ``V2`` and ``VP``, all
-given in ``Cat``). It is not necessary to know what
-ways there are to build ``NP``s (given in ``Noun``), since all these ways must
-conform to the linearization type defined in ``Cat``. Thus the format of
-category-specific modules is as follows:
-```
- abstract Adjective = Cat ** {...}
- abstract Noun = Cat ** {...}
- abstract Verb = Cat ** {...}
-```
-
-
-==Top-level grammar and lexicon==
-
-The module ``Grammar`` collects all the category-specific modules into
-a complete grammar:
-```
- abstract Grammar =
- Adjective, Noun, Verb, ..., Structural, Idiom
-```
-The module ``Structural`` is a lexicon of structural words (function words),
-such as determiners.
-
-The module ``Idiom`` is a collection of idiomatic structures whose
-implementation is very language-dependent. An example is existential
-structures ("there is", "es gibt", "il y a", etc).
-
-The module ``Lang`` combines ``Grammar`` with a ``Lexicon`` of
-ca. 350 content words:
-```
- abstract Lang = Grammar, Lexicon
-```
-Using ``Lang`` instead of ``Grammar`` as a library may provide
-some of the words needed in an application for free. Its main purpose, however, is to
-help test the resource library rather than to serve as a resource itself.
-It does not even seem realistic to develop
-a general-purpose multilingual resource lexicon.
-
-The diagram in Figure 2 shows the structure of the API.
-
-#BFIG
-
-#GRAMMAR
-
-#BCENTER
-Figure 2. The resource syntax API.
-#ECENTER
-
-#EFIG
-
-==Language-specific syntactic structures==
-
-The API collected in ``Grammar`` has been designed to be implementable for
-all languages in the resource package. It does contain some rules that
-are strange or superfluous in some languages; for instance, the distinction
-between definite and indefinite articles does not apply to Finnish and Russian.
-But such rules are still easy to implement: they only create some superfluous
-ambiguity in the languages in question.
-
-But the library makes no claim that all languages should have exactly the same
-abstract syntax. The common API is therefore extended by language-dependent
-rules. The top level of each language looks as follows (with English as example):
-```
- abstract English = Grammar, ExtraEngAbs, DictEngAbs
-```
-where ``ExtraEngAbs`` is a collection of syntactic structures specific to English,
-and ``DictEngAbs`` is an English dictionary
-(at the moment, it consists of ``IrregEngAbs``,
-the irregular verbs of English). Each of these language-specific grammars has
-the potential to grow into a full-scale grammar of the language. These grammars
-can also be used as libraries, but the possibility of using functors is lost.
-
-To give a better overview of language-specific structures,
-modules like ``ExtraEngAbs``
-are built from a language-independent module ``Extra``
-by restricted inheritance:
-```
- abstract ExtraEngAbs = Extra [f,g,...]
-```
-Thus any category and function in ``Extra`` may be shared by a subset of all
-languages. One can see this set-up as a matrix, which tells
-what ``Extra`` structures
-are implemented in what languages. For the common API in ``Grammar``, the matrix
-is filled with 1's (everything is implemented in every language).
-
-Language-specific extensions and the use of restricted
-inheritance are recent additions to the resource grammar library, and
-have only been exploited on a very small scale so far.
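
Note on the removed text: the "Sentences and clauses" section says that a Sentence is formed from a Clause by fixing its Tense, Anteriority, and Polarity, so that one Clause tree yields many distinct Sentence trees. The following is a minimal Python sketch of that idea. It is not GF code and not the library's implementation. The names `Verb`, `Clause`, and `use_cl` are hypothetical, the future tense value `TFut` is an assumption (the text only names `TPres` and `TPast`), and the English rules cover only a third-person singular subject.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Tense(Enum):
    PRES = "TPres"
    PAST = "TPast"
    FUT = "TFut"  # assumed name; the text only lists TPres and TPast


class Anter(Enum):
    SIMUL = "ASimul"
    ANTER = "AAnter"


class Pol(Enum):
    POS = "PPos"
    NEG = "PNeg"


@dataclass(frozen=True)
class Verb:
    """Hypothetical verb record holding the forms this sketch needs."""
    inf: str
    pres3sg: str
    past: str
    pastpart: str


@dataclass(frozen=True)
class Clause:
    """A clause leaves tense, anteriority, and polarity open."""
    subject: str  # assumed to be third person singular
    verb: Verb


def use_cl(tense: Tense, anter: Anter, pol: Pol, cl: Clause) -> str:
    """Fix the three parameters of a clause and return an English string."""
    v = cl.verb
    neg = pol is Pol.NEG
    if anter is Anter.ANTER:
        # Anterior forms use "have" plus the past participle.
        pos_aux, neg_aux = {
            Tense.PRES: ("has", "hasn't"),
            Tense.PAST: ("had", "hadn't"),
            Tense.FUT: ("will have", "won't have"),
        }[tense]
        words = [neg_aux if neg else pos_aux, v.pastpart]
    elif tense is Tense.FUT:
        words = ["won't" if neg else "will", v.inf]
    elif neg:
        # Simple negative forms need do-support.
        words = ["doesn't" if tense is Tense.PRES else "didn't", v.inf]
    else:
        words = [v.pres3sg if tense is Tense.PRES else v.past]
    return " ".join([cl.subject, *words])


john_walks = Clause("John", Verb("walk", "walks", "walked", "walked"))

# One clause, every combination of the three parameters.
sentences = [use_cl(t, a, p, john_walks) for t, a, p in product(Tense, Anter, Pol)]
for s in sentences:
    print(s)
```

In this sketch the clause value stays the same while each parameter combination yields a different string, which mirrors the statement that the listed sentences are "just different forms of the same tree" at the Clause level.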