Temporary fix for the grave accent a encoding problem: change compactPrint to id. - gf-core.git


author	bjorn <bjorn@bringert.net>	2008-09-15 12:38:37 +0000
committer	bjorn <bjorn@bringert.net>	2008-09-15 12:38:37 +0000
commit	dbb0f3f3e464044aef1de3abfc1286569ea6543f (patch)
tree	733f5399186b2ae323673d5a44738308fb4d3375 /src/GF/Compile/Coding.hs
parent	a6345877f832fe8b88e1a81124b9ba673796ff69 (diff)

Temporary fix for the grave accent a encoding problem: change compactPrint to id.

The problem is that lower case a with a grave accent is coded in UTF-8 as \195\160. Unicode character \160 is the non-breaking space, so Haskell's words function will break a UTF-8 encoded string at this character. String literals in the .gfo file are UTF-8 encoded in generateModuleCode, just before the call to prGrammar (which uses compactPrint, which used words).

The real solution would be to pretty-print the grammar to Unicode, and then encode as UTF-8. The problem with that is Latin-1 identifiers. They are now kept in Latin-1 in the .gfo file, since Alex can't handle Unicode. The real solution to that would be to fix Alex to handle Unicode, but that is non-trivial. GHC internally uses a very hacky .x file to be able to lex UTF-8 source files.

An alternative solution, which doesn't address the weirdness of using two different encodings in the same .gfo as we do now, is to incorporate compactPrint into the grammar printer, to avoid having to do any postprocessing.
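
The failure can be reproduced outside GF with a minimal sketch. The names encodeUtf8 and squeeze are hypothetical and not taken from the GF sources: squeeze only stands in for a words-based post-processor such as compactPrint, whose actual definition is not shown here, and encodeUtf8 is a hand-written encoder that stores one byte per Char.

```haskell
module Main where

import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (chr, isSpace, ord)

-- Hypothetical helper (not from GF): UTF-8 encode a String,
-- representing each output byte as one Char in the range 0..255.
encodeUtf8 :: String -> String
encodeUtf8 = concatMap enc
  where
    enc c
      | n < 0x80    = [c]
      | n < 0x800   = bytes [0xC0 .|. (n `shiftR` 6), cont n]
      | n < 0x10000 = bytes [0xE0 .|. (n `shiftR` 12), cont (n `shiftR` 6), cont n]
      | otherwise   = bytes [0xF0 .|. (n `shiftR` 18), cont (n `shiftR` 12),
                             cont (n `shiftR` 6), cont n]
      where n = ord c
    -- Continuation byte: 10xxxxxx carrying the low six bits.
    cont x = 0x80 .|. (x .&. 0x3F)
    bytes = map chr

-- Assumed stand-in for a post-processor built on words:
-- it collapses runs of whitespace into single spaces.
squeeze :: String -> String
squeeze = unwords . words

main :: IO ()
main = do
  let source  = "voil\224  tout"        -- \224 is a with grave accent
      encoded = encodeUtf8 source
  print encoded                          -- the byte pair \195\160 appears here
  print (isSpace '\160')                 -- words treats \160 as whitespace
  print (squeeze encoded)                -- encode first, then words: \160 is lost
  print (encodeUtf8 (squeeze source))    -- words first, then encode: bytes intact
```

Squeezing the already encoded string drops the \160 continuation byte and leaves a lone \195, which is no longer valid UTF-8. Applying the words-based step to the Unicode string and encoding afterwards keeps the \195\160 pair, which is the ordering the message calls the real solution.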

Diffstat (limited to 'src/GF/Compile/Coding.hs')

0 files changed, 0 insertions, 0 deletions

