How to use:
1) Sort the wordlist so it can be split into sublists. This is necessary because
the converter is quite memory-hungry, and you might not have enough RAM to
process the whole wordlist at once.
./CollectLemmas dicc.src | sort > lemmas.src
2) Split the sorted wordlist into sublists of 500000 lines each. By default,
split names the output files xaa, xab, and so on.
split -l 500000 lemmas.src
3) Splitting has probably left the forms of some lemmas spread across two
sublists. Manually edit the sublists so that all forms of a lemma are present
in just one sublist.
4) Run the converter.
./run_conv.sh xa*
5) The converter has produced abstract and concrete syntaxes for the
dictionary. You can try them out with GF:
gf DictRus.gf
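
Finding the lemmas that step 3 asks you to fix by hand can be tedious. The
following is a minimal sketch of a helper that reports lemmas whose forms
straddle the boundary between two consecutive sublists. It is not part of the
converter. The function name check_boundaries is hypothetical, and the sketch
assumes that the lemma is the first whitespace-separated field of each line in
lemmas.src; adjust the awk field if your wordlist is laid out differently.

```shell
#!/bin/sh
# Sketch only: report lemmas split across two consecutive sublists.
# ASSUMPTION: the lemma is the first whitespace-separated field of a line.
# ASSUMPTION: the files are given in sorted order (xaa, xab, ...), so the
# only place a lemma can be split is between the last line of one file
# and the first line of the next.
check_boundaries() {
    prev_file=""
    prev_lemma=""
    for f in "$@"; do
        # Lemma on the first line of the current sublist.
        first_lemma=$(head -n 1 "$f" | awk '{print $1}')
        if [ -n "$prev_file" ] && [ -n "$first_lemma" ] \
            && [ "$first_lemma" = "$prev_lemma" ]; then
            echo "$first_lemma: split across $prev_file and $f"
        fi
        prev_file=$f
        # Lemma on the last line, compared against the next sublist.
        prev_lemma=$(tail -n 1 "$f" | awk '{print $1}')
    done
}
```

After sourcing the function, it could be run between steps 2 and 4 as
"check_boundaries xa*". It prints one line per affected boundary and prints
nothing if no lemma is split. It only locates the problem; moving the lines
from one sublist to the other is still up to you.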