YAML Filter fails to parse keys using Explicit Key syntax (?key:value)
Hi,
As mentioned above, the YAML filter isn’t able to parse files with complex mapping keys:
net.sf.okapi.common.exceptions.OkapiBadFilterInputException: Error parsing YAML file: Lexical error at line 8948, column 8. Encountered: " " (32), after : "?"
I’ve attached two files:
complex_mapping_keys.yml
contains real-world examples that I’ve encountered in files that need to be translated
complex_mapping_list_keys.yml
contains example 2.11 extracted from the official YAML specs (https://yaml.org/spec/current.html), just for the sake of completeness
Thank you,
NOTE: There are two issues: (1) parsing the ?key:value syntax, (2) handling a key that is a map or list. This issue will track the first so that complex_mapping_keys.yml can be handled. I (@bhlkuro) will create another issue to track (2) for completeness because (2) seems less important practically for the localization field. Handling complex_mapping_list_keys.yml is deferred to the implementation of (2).
Comments (10)
-
-
- changed status to open
This is a long-standing weakness of our YAML parser. We need to add support for complex keys.
-
I have changed the filters that use JavaCC so that the JavaCC code is generated by the Maven build. This will make it easier to address this issue. Unfortunately, this is going to require some significant changes to the YAML parser, and this is probably an infrequent use case.
-
Hi, @Jim Hargrave (OLD)
I would like to know the status of development on the fix for this issue. I tested the 1.42.0-SNAPSHOT version, but I got the same error for complex key mappings. I saw some significant changes in this commit: https://bitbucket.org/okapiframework/okapi/commits/a66a84b1f955f32b5f0268b269992f5cd3750a2d. But I believe those changes were not included in 1.42.0-SNAPSHOT.
Thank you.
-
That commit was a general cleanup in preparation for more changes. I actually attempted to make the fix, but it ended up being more complicated than I expected and I had to revert those changes. There is still a chance I can address this before the fall 1.42.0 release, as I am currently focusing on all current filter failures. YAML is just a bit lower on the priority list :-(
-
Okay, thanks for your feedback.
-
- changed title to YAML Filter fails to parse keys using Explicit Key syntax (?key:value)
- edited description
Changing the title to use the proper name from the YAML specification and clarifying the scope of this issue.
-
I came up with this code:
https://bitbucket.org/bhlkuro/okapi/src/yaml-explicit-key/
This can parse complex_mapping_keys.yml and produce the .xlf file. However, the generated .xlf file is missing a level. Instead of:
<group id="sg6">
<trans-unit id="tu10" resname="en/Id::Header/id_header" xml:space="preserve">
it generates:
<trans-unit id="tu10" resname="en/id_header" xml:space="preserve">
Notice that “/Id::Header” is missing. Tracing how the parser handled the YAML file shows that the line
? "Id::Header" :
was handled as a key-value pair with a null value, and the parser returned immediately. As a result, the key-value pairs that follow were handled as though they were siblings of the “Id::Header” key-value pair rather than its children.
This is probably because the parsing rules don’t consider the case where the key and its “:” are on different lines. I attempted to change the rules to cover this case. But the token manager section of the JavaCC code contains hundreds of lines that create artificial tokens under very specific conditions, and I could not arrive at working code in a reasonable amount of time.
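The missing path level described above can be reproduced with a toy sketch. This is not the Okapi parser or its JavaCC grammar: the class name `ExplicitKeyPathSketch`, the `resnames` method, the indentation-based scope tracking, and the sample YAML are all assumptions made up for illustration, and the sample is not taken from complex_mapping_keys.yml. The sketch only shows how the resname changes depending on whether an explicit key opens a nested scope or is treated as a key with a null value.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; handles a tiny subset of block YAML.
class ExplicitKeyPathSketch {

    // Hypothetical sample: an explicit key whose ":" sits on the next line.
    static final String SAMPLE =
            "en:\n"
          + "  ? \"Id::Header\"\n"
          + "  :\n"
          + "    id_header: Header\n";

    // One open mapping: the key that opened it and that key's indentation.
    private static final class Scope {
        final int indent;
        final String key;
        Scope(int indent, String key) { this.indent = indent; this.key = key; }
    }

    // Returns one slash-separated path per scalar entry.
    // explicitKeyOpensScope=false mimics the reported behavior, where
    // "? key" is consumed as a key with a null value.
    static List<String> resnames(String yaml, boolean explicitKeyOpensScope) {
        Deque<Scope> stack = new ArrayDeque<>();
        List<String> paths = new ArrayList<>();
        for (String line : yaml.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) continue;
            if (trimmed.equals(":")) continue; // value indicator on its own line
            int indent = line.indexOf(trimmed);
            // Close every scope at the same or a deeper indentation.
            while (!stack.isEmpty() && stack.peek().indent >= indent) stack.pop();
            if (trimmed.startsWith("? ")) {
                String key = unquote(trimmed.substring(2).trim());
                if (explicitKeyOpensScope) stack.push(new Scope(indent, key));
                continue;
            }
            int sep = trimmed.indexOf(": ");
            if (sep < 0 && trimmed.endsWith(":")) {
                // "key:" with no value opens a nested mapping.
                stack.push(new Scope(indent, trimmed.substring(0, trimmed.length() - 1)));
            } else if (sep > 0) {
                // "key: value" is a leaf; record its full path.
                paths.add(join(stack, trimmed.substring(0, sep)));
            }
        }
        return paths;
    }

    private static String unquote(String s) {
        boolean quoted = s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"");
        return quoted ? s.substring(1, s.length() - 1) : s;
    }

    private static String join(Deque<Scope> stack, String leaf) {
        StringBuilder sb = new StringBuilder();
        // The deque's head is the innermost scope, so walk it in reverse.
        Iterator<Scope> it = stack.descendingIterator();
        while (it.hasNext()) sb.append(it.next().key).append('/');
        return sb.append(leaf).toString();
    }

    public static void main(String[] args) {
        System.out.println(resnames(SAMPLE, true));  // [en/Id::Header/id_header]
        System.out.println(resnames(SAMPLE, false)); // [en/id_header]
    }
}
```

With the flag off, the explicit key never becomes a parent, so the following entry attaches directly to “en”, which matches the shape of the resname in the generated .xlf file.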
@Jim Hargrave (OLD) suggested that we may need to refactor the parser code before trying to handle the explicit key syntax.
I also explored the idea of using an existing YAML parser such as SnakeYAML Engine. It would almost work, except that SnakeYAML does not generate tokens for the whitespace between other tokens, and therefore it does not work for our purpose, where regenerating the original YAML text is essential.
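A minimal sketch of why whitespace tokens matter for regenerating the original text. It does not use SnakeYAML or Okapi; the class `WhitespaceRoundTripSketch` and its `tokenize` and `regenerate` methods are hypothetical, and the tokenizer simply splits on whitespace runs rather than following YAML rules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not SnakeYAML or Okapi code.
class WhitespaceRoundTripSketch {

    // Splits text into alternating runs of whitespace and non-whitespace.
    // With keepWhitespace=false the whitespace runs are dropped, mimicking
    // a token stream that carries no whitespace tokens.
    static List<String> tokenize(String text, boolean keepWhitespace) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            boolean ws = Character.isWhitespace(text.charAt(i));
            int start = i;
            while (i < text.length() && Character.isWhitespace(text.charAt(i)) == ws) {
                i++;
            }
            if (!ws || keepWhitespace) {
                tokens.add(text.substring(start, i));
            }
        }
        return tokens;
    }

    // Rebuilds text by concatenating the tokens in order.
    static String regenerate(List<String> tokens) {
        return String.join("", tokens);
    }

    public static void main(String[] args) {
        String original = "? \"Id::Header\"\n:\n    id_header:   Header\n";
        // Whitespace kept: the original text comes back byte for byte.
        System.out.println(regenerate(tokenize(original, true)).equals(original));  // true
        // Whitespace dropped: indentation and line breaks are lost.
        System.out.println(regenerate(tokenize(original, false)).equals(original)); // false
    }
}
```

A filter that has to write the translated document back in its original layout needs the first behavior, which is why a token stream without whitespace is not enough on its own.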
-
- removed responsible