| Age | Commit message | Author |
|
Now you will get error messages like these:
example.gf:1:21:
Syntax error:
Unexpected token '}'.
Expected one of:
- '{'
- 'open'
- an identifier
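
For illustration, a message in this shape could be assembled from a position, the offending token and the list of expected tokens. This is only a sketch: the record `SyntaxError` and the function `renderSyntaxError` are hypothetical names, not the compiler's actual types.

```haskell
-- Hypothetical record describing a syntax error; not GF's real error type.
data SyntaxError = SyntaxError
  { errFile     :: FilePath
  , errLine     :: Int
  , errColumn   :: Int
  , errFound    :: String    -- the unexpected token, already quoted
  , errExpected :: [String]  -- descriptions of the acceptable tokens
  }

-- Render the error in the multi-line layout shown in the commit message.
renderSyntaxError :: SyntaxError -> String
renderSyntaxError e = unlines $
  [ errFile e ++ ":" ++ show (errLine e) ++ ":" ++ show (errColumn e) ++ ":"
  , "Syntax error:"
  , "Unexpected token " ++ errFound e ++ "."
  , "Expected one of:"
  ] ++ map ("- " ++) (errExpected e)

-- The example from the commit message, as a value.
exampleError :: SyntaxError
exampleError = SyntaxError "example.gf" 1 21 "'}'" ["'{'", "'open'", "an identifier"]
```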
|
|
extension to avoid captures; captures with iterated table extensions might still be possible, which needs further analysis
|
|
table {cases ; vvv => t \! vvv}.t
|
|
Lexer.x: Change the parser monad type P to allow the remaining input to
be returned after a partial parse. Add function
runPartial :: P t -> String -> Either (Posn, String) (String, t)
Parser.y: Add a partial parser pTerm for nonterminal Exp1.
Re-export runPartial.
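
To show what the signature of runPartial implies, here is a toy parser monad that returns the unconsumed input together with the result. Only the type of `runPartial` comes from the commit message; the representation of `P` and `Posn`, and the helpers `satisfy`, `many0`, `many1` and `identifier`, are assumptions made for this sketch and are not the code in Lexer.x.

```haskell
import Control.Monad (ap, liftM)
import Data.Char (isAlpha)

-- Hypothetical position type (line, column); the real Posn is defined in Lexer.x.
type Posn = (Int, Int)

-- Toy parser monad: consumes a prefix of the input and threads the rest along.
newtype P t = P { unP :: Posn -> String -> Either (Posn, String) (Posn, String, t) }

instance Functor P where
  fmap = liftM

instance Applicative P where
  pure x = P $ \pos s -> Right (pos, s, x)
  (<*>)  = ap

instance Monad P where
  P p >>= k = P $ \pos s -> case p pos s of
    Left err            -> Left err
    Right (pos', s', x) -> unP (k x) pos' s'

-- Consume one character that satisfies the predicate, updating the position.
satisfy :: (Char -> Bool) -> P Char
satisfy ok = P $ \pos@(l, c) s -> case s of
  (x:xs) | ok x -> Right (if x == '\n' then (l + 1, 1) else (l, c + 1), xs, x)
  _             -> Left (pos, "unexpected input")

-- Zero or more repetitions; stops at the first failure without consuming it.
many0 :: P a -> P [a]
many0 p = P $ \pos s -> case unP p pos s of
  Left _              -> Right (pos, s, [])
  Right (pos', s', x) -> unP (fmap (x :) (many0 p)) pos' s'

-- One or more repetitions.
many1 :: P a -> P [a]
many1 p = do
  x  <- p
  xs <- many0 p
  return (x : xs)

-- Run a parser and hand back the remaining input next to the result.
runPartial :: P t -> String -> Either (Posn, String) (String, t)
runPartial p s = case unP p (1, 1) s of
  Left err           -> Left err
  Right (_, rest, x) -> Right (rest, x)

-- Example parser: a run of letters.
identifier :: P String
identifier = many1 (satisfy isAlpha)
```

With these definitions, `runPartial identifier "foo bar"` yields `Right (" bar", "foo")`, so a caller can continue processing the rest of the input after a partial parse.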
|
|
features are supported yet. Based on contribution from Gleb Lobanov
|
|
This makes the documentation clearer, and can potentially catch more
programming mistakes.
|
|
using GF.Grammar.Parser just like the ordinary GF grammars. Furthermore, GF.Speech.CFG has been moved to GF.Grammar.CFG. The new module is used by both the speech conversion utilities and the compiler for CFG grammars. The parser for CFG now consumes a lot less memory and can be used with grammars with more than 4 000 000 productions.
|
|
This is to avoid one trivial reason for failures in the test suite.
|
|
1. The default encoding is changed from Latin-1 to UTF-8.
2. Alternate encodings should be specified as "--# -coding=enc"; the old
"flags coding=enc" declarations have no effect but are still checked for
consistency.
3. A transitional warning is generated for files that contain non-ASCII
characters without specifying a character encoding:
"Warning: default encoding has changed from Latin-1 to UTF-8"
4. Conversion to Unicode is now done *before* lexing. This makes it possible
to allow arbitrary Unicode characters in identifiers. But identifiers are
still stored as ByteStrings, so they are limited to Latin-1 characters
for now.
5. Lexer.hs is no longer part of the repository. We now generate the lexer
from Lexer.x with alex>=3. Some workarounds for bugs in alex-3.0 were
needed. These bugs might already be fixed in newer versions of alex, but
we should be compatible with what is shipped in the Haskell Platform.
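
Points 2 and 3 can be sketched as follows. This is a simplified illustration, not the compiler's code: the names `codingPragma` and `needsEncodingWarning` are hypothetical, and the sketch assumes the pragma may appear on any line, whereas the real compiler may restrict where it looks.

```haskell
import Data.Char (isAscii)
import Data.List (stripPrefix)
import Data.Maybe (isNothing, listToMaybe, mapMaybe)

-- Hypothetical helper: find the encoding named by a "--# -coding=enc" line.
-- Assumption: any line of the file is searched.
codingPragma :: String -> Maybe String
codingPragma src = listToMaybe (mapMaybe fromLine (lines src))
  where
    fromLine l = case words l of
      ("--#" : opts) -> listToMaybe (mapMaybe (stripPrefix "-coding=") opts)
      _              -> Nothing

-- Hypothetical check for the transitional warning: the file contains
-- non-ASCII characters but does not name an encoding.
needsEncodingWarning :: String -> Bool
needsEncodingWarning src = isNothing (codingPragma src) && any (not . isAscii) src
```

A file that starts with `--# -coding=latin1` would give `Just "latin1"`, and a file with non-ASCII characters and no pragma would trigger the warning.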
|
|
write for instance 'ab.c' and then everything between the quotes is treated as an identifier. This includes Unicode characters and non-ASCII symbols. This is useful for automatically generated GF grammars.
|
|
The fact that identifiers are represented as ByteStrings is now an internal
implementation detail in module GF.Infra.Ident. Conversion between ByteString
and identifiers is only needed in the lexer and the Binary instances.
|
|
Most of the explicit uses of ByteStrings were eliminated by using identS,
identS = identC . BS.pack
which was found in GF.Grammar.CF and moved to GF.Infra.Ident. The function
prefixIdent :: String -> Ident -> Ident
allowed one additional import of ByteString to be eliminated. The functions
isArgIdent :: Ident -> Bool
getArgIndex :: Ident -> Maybe Int
were needed to eliminate explicit pattern matching on Ident from two modules.
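
The helper functions named above could look roughly like this. The sketch is an assumption-laden stand-in: the `Ident` type here has only two hypothetical constructors (`IC` and `IA`), the real type in GF.Infra.Ident may differ, and the behaviour of `prefixIdent` on argument identifiers is a guess. Only the signatures and the definition of `identS` are taken from the commit message.

```haskell
import qualified Data.ByteString.Char8 as BS

-- Simplified stand-in for the Ident type (assumption, not the real definition).
data Ident
  = IC BS.ByteString      -- ordinary identifier
  | IA BS.ByteString Int  -- hypothetical argument identifier carrying an index
  deriving (Eq, Show)

identC :: BS.ByteString -> Ident
identC = IC

-- As quoted in the commit message.
identS :: String -> Ident
identS = identC . BS.pack

-- Prepend a string to the name, hiding the ByteString from callers.
prefixIdent :: String -> Ident -> Ident
prefixIdent pref (IC s)   = IC (BS.append (BS.pack pref) s)
prefixIdent pref (IA s i) = IA (BS.append (BS.pack pref) s) i

-- Queries that spare other modules from pattern matching on Ident.
isArgIdent :: Ident -> Bool
isArgIdent (IA _ _) = True
isArgIdent _        = False

getArgIndex :: Ident -> Maybe Int
getArgIndex (IA _ i) = Just i
getArgIndex _        = Nothing
```

Client modules then only need `identS "foo"` or `prefixIdent "x_" ident` and never import ByteString themselves.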
|
|
Instead of just "syntax error", you now get e.g.
PType is a predefined constant, it can not be redefined
This is a simple change in the parser.
|
|
flag beam_size in the top-level concrete module
|
|
files correctly.
The parser works on raw byte sequences read from source files. If parsing
succeeds the raw byte sequences are converted to proper Unicode characters
in a later phase. But the parser calls the function buildAnyTree, which can
fail and generate error messages containing source code fragments, which might
then contain raw byte sequences. To render these error messages correctly,
they need to be converted in accordance with the coding flag in the source
file. This is now done for UTF-8-encoded source files, but should ideally also
be done for other character encodings. (Latin-1-encoded files never suffered
from this problem, since raw bytes are proper Unicode characters in this case.)
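
The conversion step described here can be sketched as follows, assuming the error message is a String in which every Char stands for one raw byte. The function name `decodeRawUtf8` is hypothetical, and falling back to the raw text on invalid input is a choice made for this sketch, not necessarily what the compiler does.

```haskell
import qualified Data.ByteString.Char8 as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Reinterpret a String of raw bytes (each Char in 0..255) as UTF-8 text.
-- If the bytes are not valid UTF-8, the input is returned unchanged.
decodeRawUtf8 :: String -> String
decodeRawUtf8 raw = case TE.decodeUtf8' (BS.pack raw) of
  Right txt -> T.unpack txt
  Left _    -> raw
```

For example, the two raw bytes 195 and 169 are turned into the single character U+00E9, while pure ASCII text passes through unchanged.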
|
|
compilation schema is a few times faster.
|
|
separate PGF building
|
|
of the different definitions. There is a --tags option which generates a list of all identifiers with their source locations.
|
|
'instance Foo of Bar - [f,g,h]'
|
|
grammar. It may not be used accurately in the error messages yet
|
|
from deprecated
|