HTML sub-filter in JSON filter not working in OmegaT

Issue #37 resolved
Manuel Souto Pico created an issue

Preconditions

  • Okapi Filters plugin for OmegaT installed
  • An OmegaT project (attached) including:

    • /source: JSON file that contains HTML text
    • /omegat:

      • okf_json@foo.fprm, with parameter subfilter=okf_html
      • filters.xml with <option name="useCustom" value="okf_json@foo.fprm"/>

The source file is JSON and it contains HTML data, e.g.

{
    "key": "abc123",
    "text": "<div id=\"foo\"><strong>bar</strong></div>"
}

The filter definition (omegat/okf_json@foo.fprm):

extractIsolatedStrings.b=false
extractAllPairs.b=false
exceptions=current_translation
useKeyAsName.b=true
useFullKeyPath.b=false
useLeadingSlashOnKeyPath.b=false
escapeForwardSlashes.b=true
useCodeFinder.b=false
noteRules=
extractionRules=
idRules=key
genericMetaRules=
codeFinderRules.count.i=1
codeFinderRules.rule0=</?([A-Z0-9a-z]*)\b[^>]*>
codeFinderRules.sample=&name; <tag>
codeFinderRules.useAllRulesWhenTesting.b=true
subfilter=okf_html
subfilterRules=
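As a quick sanity check that the parameters file really carries the subfilter setting, the key=value lines can be read back with a few lines of Python. This is only a minimal sketch: read_fprm is a hypothetical helper, not part of Okapi or OmegaT, and it assumes the file is plain key=value text as shown above.

```python
def read_fprm(text):
    """Parse key=value lines into a dict, splitting on the first '=' only."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment/header lines starting with '#'.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        params[key] = value
    return params


# A few lines copied from the filter definition above (raw string keeps the backslash).
SAMPLE = r"""useCodeFinder.b=false
idRules=key
codeFinderRules.rule0=</?([A-Z0-9a-z]*)\b[^>]*>
subfilter=okf_html
subfilterRules=
"""

params = read_fprm(SAMPLE)
print("subfilter:", params.get("subfilter"))
```

If the printed value is okf_html, the setting is present in the file, and the problem is in how the file is being applied rather than in its content.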

Steps

(Already done in the sample project attached.)

  1. Custom filter: Put the okf_json@foo.fprm file in the omegat folder of the project (done*), then point OmegaT to the custom filter definition by its relative path (just the filename) in the project settings: Edit Project > File Filters > JSON files (Okapi) > Options > Use the following filter parameters file: okf_json@foo.fprm (done*).
  2. Subfilter: Add the line subfilter=okf_html to okf_json@foo.fprm so that the extracted text is parsed as HTML.

Expected results

  1. The JSON file is recognized and the values of the keys specified in the filter are extracted.
  2. The HTML text it contains is parsed as HTML and correctly segmented, and the HTML markup is converted to tags.

The sample project attached also contains an XLIFF file that shows the expected results for the original JSON file.
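To make the expected behaviour concrete, here is a rough Python sketch of the idea behind a subfilter: decode the JSON value first, then run an HTML parser over it and replace the markup with placeholder tags. This is not how Okapi implements it. The class and function names and the <g1>-style placeholder notation are made up for illustration, and every element is treated as an inline tag for simplicity, whereas a real HTML filter would likely treat block elements such as div as structure.

```python
import json
from html.parser import HTMLParser


class TagPlaceholderParser(HTMLParser):
    """Replace each HTML element with a numbered placeholder pair (illustrative only)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []    # output fragments, joined at the end
        self.stack = []    # open elements as (tag name, placeholder id)
        self.next_id = 1

    def handle_starttag(self, tag, attrs):
        tag_id = self.next_id
        self.next_id += 1
        self.stack.append((tag, tag_id))
        self.parts.append(f"<g{tag_id}>")

    def handle_endtag(self, tag):
        # Only well-nested markup is handled; void elements such as <br> are not.
        if self.stack and self.stack[-1][0] == tag:
            _, tag_id = self.stack.pop()
            self.parts.append(f"</g{tag_id}>")

    def handle_data(self, data):
        self.parts.append(data)


def html_value_to_tagged_text(json_text, field="text"):
    """Decode the JSON first, then parse the chosen value as HTML."""
    record = json.loads(json_text)
    parser = TagPlaceholderParser()
    parser.feed(record[field])
    parser.close()
    return "".join(parser.parts)


# The sample source file from this report (raw string keeps the JSON escapes).
SAMPLE = r'''{
    "key": "abc123",
    "text": "<div id=\"foo\"><strong>bar</strong></div>"
}'''

print(html_value_to_tagged_text(SAMPLE))  # prints <g1><g2>bar</g2></g1>
```

The point is the order of operations: the JSON escapes (\") are resolved before the HTML is parsed, so the translator sees only the text "bar" wrapped in tags instead of raw markup.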

Actual results

  1. As expected.
  2. The text is not segmented and the HTML markup is displayed as is, as raw text. The subfilter setting seems to be ignored.

Example of what one segment looks like:

Additional info

I can confirm the HTML subfilter works perfectly when I use my JSON filter definition to create the translation kit in Okapi Rainbow. The XLIFF created in Rainbow is included in the project.
