| Age | Commit message (Collapse) | Author |
|
of values w.r.t the type P. This was previously not checked, and caused hard-to-find run-time errors.
|
|
|
|
rules, since they are easily compilable to byte code. This fails in the Haskell runtime since let expressions are not allowed as abstract syntax expressions.
|
|
that now we can compute with lambda functions and with true tail recursion
|
|
|
|
|
|
|
|
On my laptop these changes speed up the full build of the RGL and example
grammars with 'cabal build' from ~95s to ~43s and the zero build from ~18s
to ~5s.
The main change is the introduction of the module GF.CompileInParallel that
replaces GF.Compile and the function GF.Compile.ReadFiles.getAllFiles. At
present, it is activated with the new -j flag, and it is only used when
combined with --make or --batch. In addition, to get parallel computations,
you need to add GHC run-time flags, e.g., +RTS -N -A20M -RTS, to the command
line.
The Setup.hs script has been modified to pass the appropriate flags to GF
for parallel compilation when compiling the RGL and example grammars, but you
need a recent version of Cabal for this to work (probably >=1.20).
Some additonal refactoring were made during this work. A new monad is used to
avoid warnings/error messages from different modules to be intertwined when
compiling in parallel, so some functios that were hardiwred to the IO or IOE
monads have been lifted to work in arbitrary monads that are instances in
the appropriate classes.
|
|
These flags now do what the say.
|
|
for liftIO in various places
|
|
|
|
|
|
In particular, the function compileOne has been moved to the new module
GF.CompileOne and its type has been changed from
compileOne :: ... -> CompileEnv -> FilePath -> IOE CompileEnv
to
compileOne :: ... -> SourceGrammar -> FilePath -> IOE OneCompiledModule
making it more suitable for use in a parallel compiler.
|
|
|
|
The def rules are now compiled to byte code by the compiler and then to
native code by the JIT compiler in the runtime. Not all constructions
are implemented yet. The partial implementation is now in the repository
but it is not activated by default since this requires changes in the
PGF format. I will enable it only after it is complete.
|
|
All compiler modules now use GF.Text.Pretty instead of Text.PrettyPrint
|
|
GF.Infra.Location for modularity
GF.Text.Pretty provides the class Pretty and overloaded versions of the pretty
printing combinators in Text.PrettyPrint, allowing pretty printable values to
be used directly instead of first having to convert them to Doc with functions
like text, int, char and ppIdent. Some modules have been converted to use
GF.Text.Pretty, but not all. Precedences could be added to simplify the pretty
printers for terms and patterns.
GF.Infra.Location contains the types Location and L, factored out from
GF.Grammar.Grammar, and the class HasSourcePath. This allowed the import
of GF.Grammar.Grammar to be removed from GF.Infra.CheckM, making it more
like a pure library module.
|
|
This patch also includes some commented out code that was used to search for
the source of code size explosions and an eta expansion bug.
|
|
PGF exports the public, stable API.
PGF.Internal exports additional things needed in the GF compiler & shell,
including the nonstardard version of Data.Binary.
|
|
source code. This makes it quick and lightweight to compile big grammars such as the Berkley grammar
|
|
(table { p_i => t_i } ! x).l ==> table { p_i => t_i.l } ! x
This was used in the old partial evaluator and can significantly reduce term
sizes in some cases.
|
|
Eta expansion is applied between partial evaluation and PMCFG generation.
The buggy version generated type incorrect terms, but PMCFG generation
apparently worked anyway.
|
|
|
|
+ The current type checker for concrete syntax is in
GF.Compile.TypeCheck.RConcrete, but GF.Compile.TypeCheck.Concrete was
still imported in GFI.
+ Fixed a bug that allowed Ints n as a subtype of Ints m, regardless of
m and n. It now requires n<=m. Note: the type checker still allows Int
as a subtype of Ints m, regardless of m.
+ Fixed a potential efficiency problem with large record types, by reducing
the number of recursive calls from |R|*|S| to |R| when checking if R<=S.
+ Fixed a misleading comment: "alpha g t u" checks that u is a subtype of t,
the other way around. Similarly, "checkIfEqLType gr g t u trm" checks that
u is a subtype of t, not the other way around, and not that t is equal to u.
|
|
This bug was introduced sometime between 2013-08-21 and 2013-11-01 and caused
the function convertTerm in GF.Compile.GeneratePMCFG to encounter a EPatt where
it expected Strs. I fixed it by applying the function getPatts (from the old
partial evaluator) to the pattern.
|
|
using GF.Grammar.Parser just like the ordinary GF grammars. Furthermore now GF.Speech.CFG is moved to GF.Grammar.CFG. The new module is used by both the speech conversion utils and by the compiler for CFG grammars. The parser for CFG now consumes a lot less memory and can be used with grammars with more than 4 000 000 productions.
|
|
|
|
|
|
DocumentationBul.gf
|
|
just a few lines in Rename.hs
|
|
|
|
Previously the renamer warned if there was e.g. an unqualified reference to mkAdv, which could come from either Syntax or Paradigms. The renamer picked randomly one of the alternatives, which then often failed in type checking. Now, all candidates are collected into a new structure AdHocOverload [Term], which is accessed by the type checker to make the choice based on the type of the constant. This eliminates some of the warnings and some of the error due to wrong choices. In some rare cases, the inherited constants have the same type, which cannot be resolved by overloading. In such cases, the type checker does the same as the renamer did before: pick the "first" option (i.e. the one that happens to be the first in the list returned by the renamer) and issues a warning. In this patch, only a couple of lines are changed. The typechecker (RConcrete) has more substantial changes, and will be recorded as the next patch.
|
|
e and not e itself. Fixed in RConcrete, which should soon replace Concrete; and hopefully will be replaced by some cleaner code soon, such as ConcreteNew which has been under construction for quite some time.
|
|
record updates enabled by ** expressions. Can be changed back to Concrete.hs by just changing one import command in GF.Compile.CheckGrammar.hs. The worst thing that *should* happen with the new type checker is that some old code is detected to be invalid, which happens if it contains a type-incompatible record extension, e.g. {x = "foo"} ** {x = 1}. Previously such errors were silently ignored. A set of test runs detected one such error in the RGL, which was corrected. On the positive side, the new type checker now enables record updates in the natural way: R ** {x = 1} will give value x = 1 overshadowing any value of x in R (provided the expected type of x is Int). lib/src/experimental/PredicationSwe.gf illustrates this, as opposed to PredicationSwO.gf which has to use old-style copying of even the unchanged record fields.
|
|
Someone who is familiar with the Python translation should check this.
|
|
|
|
This is to avoid one trivial reason for failures in the test suite.
|
|
definitions
|
|
This means that the -old-comp and -new-comp flags are not recognized anymore.
The only functional difference is that printnames were still normalized with
the old partial evaluator. Now that is done with the new partial evaluator.
|
|
GF.Compile.TypeCheck.Primitives
Also move the list of primitives
|
|
Also simplified its type.
|
|
GF.Compile.Coding is not used any more.
|
|
This was a fairly simple change thanks to previous work on making the Ident
type abstract and the fact that PGF.CId already uses UTF-8-encoded
ByteStrings.
One potential pitfall is that Data.ByteString.UTF8 uses the same type for
ByteStrings as Data.ByteString. I renamed ident2bs to ident2utf8 and
bsCId to utf8CId, to make it clearer that they work with UTF-8-encoded
ByteStrings.
Since both the compiler input and identifiers are now UTF-8-encoded
ByteStrings, the lexer now creates identifiers without copying any characters.
**END OF DESCRIPTION***
Place the long patch description above the ***END OF DESCRIPTION*** marker.
The first line of this file will be the patch name.
This patch contains the following changes:
M ./src/compiler/GF/Compile/CheckGrammar.hs -3 +3
M ./src/compiler/GF/Compile/GrammarToPGF.hs -2 +2
M ./src/compiler/GF/Grammar/Binary.hs -5 +1
M ./src/compiler/GF/Grammar/Lexer.x -11 +13
M ./src/compiler/GF/Infra/Ident.hs -19 +36
M ./src/runtime/haskell/PGF.hs -1 +1
M ./src/runtime/haskell/PGF/CId.hs -2 +3
|
|
1. The default encoding is changed from Latin-1 to UTF-8.
2. Alternate encodings should be specified as "--# -coding=enc", the old
"flags coding=enc" declarations have no effect but are still checked for
consistency.
3. A transitional warning is generated for files that contain non-ASCII
characters without specifying a character encoding:
"Warning: default encoding has changed from Latin-1 to UTF-8"
4. Conversion to Unicode is now done *before* lexing. This makes it possible
to allow arbitrary Unicode characters in identifiers. But identifiers are
still stored as ByteStrings, so they are limited to Latin-1 characters
for now.
5. Lexer.hs is no longer part of the repository. We now generate the lexer
from Lexer.x with alex>=3. Some workarounds for bugs in alex-3.0 were
needed. These bugs might already be fixed in newer versions of alex, but
we should be compatible with what is shipped in the Haskell Platform.
|
|
Move source transcoding function GF.Compile to GF.Compile.GetGrammar, in
preparation for doing transcoding before lexing.
|
|
|
|
|
|
+ Eliminated vairous ad-hoc coersion functions between specific monads
(IO, Err, IOE, Check) in favor of more general lifting functions
(liftIO, liftErr).
+ Generalized many basic monadic operations from specific monads to
arbitrary monads in the appropriate class (MonadIO and/or ErrorMonad),
thereby completely eliminating the need for lifting functions in lots
of places.
This can be considered a small step forward towards a cleaner
compiler API and more malleable compiler code in general.
|
|
square brackets
Add proper type checking of course-of-values tables:
+ Make sure that all subterms have the same type.
+ Resolve overloaded operators.
Note though that the GF book states in C.4.12 that the "course-of-values
table [...] format is not recommended for GF source code, since the
ordering of parameter values is not specified and therefore a
compiler-internal decision."
|
|
between ordinary tokens. It is also used in the English RGL to attach the commas to the previous word.
|