path: root/src/compiler/GF/Grammar/Lexer.x
Age | Commit message | Author
2016-04-07 | Lexer.x & Parser.y: add a partial parser for terms | hallgren
Lexer.x: Change the parser monad type P to allow the remaining input to be returned after a partial parse. Add function runPartial :: P t -> String -> Either (Posn, String) (String, t)
Parser.y: Add a partial parser pTerm for nonterminal Exp1. Re-export runPartial.
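To illustrate what the runPartial signature above implies, here is a minimal sketch of a parser monad that hands back the unconsumed input. Only the type of runPartial comes from the commit message. The Posn constructor, the newtype layout of P, and the toy parser `number` are assumptions made for illustration; the actual definitions in Lexer.x may look quite different.

```haskell
import Data.Char (isDigit)

-- Hypothetical stand-in for the lexer's position type: line and column.
data Posn = Posn Int Int deriving (Eq, Show)

-- Assumed representation: a parser maps the input either to an error
-- (position and message) or to the remaining input plus a result.
newtype P t = P { unP :: String -> Either (Posn, String) (String, t) }

instance Functor P where
  fmap f (P p) = P $ \s -> case p s of
    Left err     -> Left err
    Right (r, x) -> Right (r, f x)

instance Applicative P where
  pure x = P $ \s -> Right (s, x)
  P pf <*> P px = P $ \s -> case pf s of
    Left err     -> Left err
    Right (r, f) -> case px r of
      Left err      -> Left err
      Right (r', x) -> Right (r', f x)

instance Monad P where
  P p >>= k = P $ \s -> case p s of
    Left err     -> Left err
    Right (r, x) -> unP (k x) r

-- Same type as in the commit message: on success the caller also gets
-- the input that the parser did not consume.
runPartial :: P t -> String -> Either (Posn, String) (String, t)
runPartial = unP

-- Toy parser (illustration only): reads a leading run of digits.
-- The error position is fixed at 1:1 because this sketch does not
-- track positions.
number :: P Int
number = P $ \s -> case span isDigit s of
  ("", _)    -> Left (Posn 1 1, "expected a digit")
  (ds, rest) -> Right (rest, read ds)
```

With these definitions, runPartial number "42 + x" evaluates to Right (" + x", 42): the caller can keep working with the leftover " + x", which is the point of a partial parse.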
2016-04-06 | Lexer.x: fix cyclic Functor instance | hallgren
It looks like I introduced this cyclic definition in August 2014, but since it isn't used, it hasn't been a problem...
2016-03-21 | initial support for BNFC syntax in context-free grammars for GF | krasimir
Not all features are supported yet. Based on contribution from Gleb Lobanov
2015-09-29 | GF source lexer: allow numeric character escapes in string literals | hallgren
This makes the output from PGF.showExpr (and other Haskell code that uses the Prelude.show function to show strings) parsable as GF source code in more cases. This is a workaround for the problem that GHC's implementation of the show function uses numeric escapes for printable non-ASCII characters, e.g. show "dålig" = "d\229lig"...
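The behaviour of show described above can be checked directly, and decoding such escapes is straightforward. The sketch below is illustrative only: `unescapeNumeric` is a hypothetical helper, not the rule used in Lexer.x, and it handles decimal escapes only (no hexadecimal or octal forms, no `\&` separator, no range check on the code point).

```haskell
import Data.Char (chr, isDigit)

-- What show produces for the string "dålig" (å is U+00E5, decimal 229).
-- The result includes the surrounding double quotes.
shown :: String
shown = show "d\229lig"

-- Hypothetical helper: replace each decimal escape \NNN in the body of
-- a string literal by the character with that code point.
unescapeNumeric :: String -> String
unescapeNumeric ('\\' : c : cs)
  | isDigit c =
      let (ds, rest) = span isDigit (c : cs)
      in  chr (read ds) : unescapeNumeric rest
unescapeNumeric (c : cs) = c : unescapeNumeric cs
unescapeNumeric []       = []
```

Here `shown` is the ten-character text "d\229lig" with its quotes, and unescapeNumeric applied to the unquoted body d\229lig gives back the original five-character string.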
2014-08-13 | Fix warnings in 16 modules, mostly forward compatibility warnings from GHC 7.8 | hallgren
2014-06-12 | now GF keywords can be used as identifiers if they are quoted | kr.angelov
2014-03-21 | refactor the compilation of CFG and EBNF grammars | kr.angelov
Now they are parsed by using GF.Grammar.Parser just like the ordinary GF grammars. Furthermore now GF.Speech.CFG is moved to GF.Grammar.CFG. The new module is used by both the speech conversion utils and by the compiler for CFG grammars. The parser for CFG now consumes a lot less memory and can be used with grammars with more than 4 000 000 productions.
2013-11-26 | Represent identifiers as UTF-8-encoded ByteStrings | hallgren
This was a fairly simple change thanks to previous work on making the Ident type abstract and the fact that PGF.CId already uses UTF-8-encoded ByteStrings. One potential pitfall is that Data.ByteString.UTF8 uses the same type for ByteStrings as Data.ByteString. I renamed ident2bs to ident2utf8 and bsCId to utf8CId, to make it clearer that they work with UTF-8-encoded ByteStrings. Since both the compiler input and identifiers are now UTF-8-encoded ByteStrings, the lexer now creates identifiers without copying any characters.
This patch contains the following changes:
M ./src/compiler/GF/Compile/CheckGrammar.hs -3 +3
M ./src/compiler/GF/Compile/GrammarToPGF.hs -2 +2
M ./src/compiler/GF/Grammar/Binary.hs -5 +1
M ./src/compiler/GF/Grammar/Lexer.x -11 +13
M ./src/compiler/GF/Infra/Ident.hs -19 +36
M ./src/runtime/haskell/PGF.hs -1 +1
M ./src/runtime/haskell/PGF/CId.hs -2 +3
2013-11-25 | Change how GF deals with character encodings in grammar files | hallgren
1. The default encoding is changed from Latin-1 to UTF-8.
2. Alternate encodings should be specified as "--# -coding=enc", the old "flags coding=enc" declarations have no effect but are still checked for consistency.
3. A transitional warning is generated for files that contain non-ASCII characters without specifying a character encoding: "Warning: default encoding has changed from Latin-1 to UTF-8"
4. Conversion to Unicode is now done *before* lexing. This makes it possible to allow arbitrary Unicode characters in identifiers. But identifiers are still stored as ByteStrings, so they are limited to Latin-1 characters for now.
5. Lexer.hs is no longer part of the repository. We now generate the lexer from Lexer.x with alex>=3. Some workarounds for bugs in alex-3.0 were needed. These bugs might already be fixed in newer versions of alex, but we should be compatible with what is shipped in the Haskell Platform.
2012-05-04 | alex 3 incompatibility workaround | hallgren
As a temporary workaround, alex is no longer invoked automatically when building with cabal. Developers who want to modify the lexer need to run alex on Lexer.x manually and record the modified Lexer.hs.
src/compiler/GF/Grammar/lexer/Lexer.x -- hidden from cabal
src/compiler/GF/Grammar/Lexer.hs -- update it manually
2010-06-18 | Yay!! Direct generation of PMCFG from GF grammar | krasimir
2010-03-18 | syntax for inaccessible patterns in GF | krasimir
2010-02-08 | allow negative integers in the grammar syntax | krasimir
2009-12-13 | reorganize the directories under src, and rescue the JavaScript interpreter from deprecated | krasimir