UTF-8 Support
Issue #1
resolved
nfer should support specifications and events with non-ASCII characters.
This may be fairly simple. I believe that supporting non-ASCII characters in specifications should mostly be a matter of being more flexible about what the parser accepts. Tests certainly need to be added.
Comments (3)
-
reporter -
reporter - changed status to open
I have spent some time looking at this, and suggested at least one change. As a result, I am changing its status to open.
-
reporter - changed status to resolved
This has been implemented (I'm confused why the related commit message did not resolve the issue or appear here). A functional test was added with Chinese characters for event names to demonstrate that it works.
-
This post has a good suggestion for building Flex support for UTF-8 identifiers. https://stackoverflow.com/questions/9611682/flexlexer-support-for-unicode
The gist of it is that you need to add character classes to the lexer definitions. The post names three of them: UANY matches any ASCII or UTF-8 character, UANYN is the same but omits newlines, and UONLY omits ASCII, so it matches only multi-byte UTF-8 characters.
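
To make the byte ranges concrete, here is a minimal Python sketch of the three classes as byte-level regular expressions. It is not the Flex code from the linked post and not nfer's actual lexer. The helper names (ASC, ASCN, U, U2, U3, U4, alt, IDENT, is_identifier) and the identifier rule are assumptions made for illustration, and the ranges are deliberately loose, so some malformed byte sequences would still be accepted.

```python
import re


def alt(*parts: bytes) -> bytes:
    """Join byte patterns into one non-capturing alternation group."""
    return b"(?:" + b"|".join(parts) + b")"


# Building blocks, expressed as ranges over raw bytes (illustrative names).
ASC = rb"[\x00-\x7f]"            # any ASCII byte
ASCN = rb"[\x00-\x09\x0b-\x7f]"  # ASCII except newline (0x0a)
U = rb"[\x80-\xbf]"              # UTF-8 continuation byte
U2 = rb"[\xc2-\xdf]"             # lead byte of a 2-byte sequence
U3 = rb"[\xe0-\xef]"             # lead byte of a 3-byte sequence
U4 = rb"[\xf0-\xf4]"             # lead byte of a 4-byte sequence

# Multi-byte sequences: a lead byte followed by 1, 2, or 3 continuation bytes.
MULTI = (U2 + U, U3 + U + U, U4 + U + U + U)

UANY = alt(ASC, *MULTI)    # any ASCII or UTF-8 character
UANYN = alt(ASCN, *MULTI)  # same, but without newline
UONLY = alt(*MULTI)        # non-ASCII characters only

# Assumed identifier rule: a letter, underscore, or non-ASCII character,
# followed by any number of letters, digits, underscores, or non-ASCII characters.
IDENT = re.compile(
    alt(rb"[A-Za-z_]", UONLY) + alt(rb"[A-Za-z0-9_]", UONLY) + b"*"
)


def is_identifier(name: str) -> bool:
    """Check the UTF-8 encoding of name against the assumed identifier rule."""
    return IDENT.fullmatch(name.encode("utf-8")) is not None
```

With this rule, is_identifier accepts a name made of Chinese characters followed by an underscore and a digit, and rejects a name that starts with a digit or contains a space. In a Flex specification the same idea would be written as named definitions and then used in the identifier pattern.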