AbstractMarkupFilter subfiltering produces spurious segments
Original issue 303 created by @ysavourel on 2013-01-02T06:22:06.000Z:
See https://groups.google.com/d/topic/okapi-devel/SdXigj5Uiu4/discussion
Currently subfiltering in the AbstractMarkupFilter produces additional textunits which consist only of a single placeholder. These TUs seem to correspond to the original, pre-subfiltered content, which is then replaced by some inline resource/tag.
This behavior is sub-optimal. In the subfiltering case, we should be producing a START_GROUP event, then the event stream from the subfilter, then an END_GROUP event. No textunit should correspond to the pre-subfiltered content.
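To make the desired event order concrete, here is a minimal sketch in Python. It is not Okapi code: `Event` and `wrap_subfilter_events` are hypothetical names invented for this illustration, and the `sub-filter:<name>` label only mirrors the `resname` seen in the XLIFF output in this report.

```python
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical stand-in for a filter event (not the Okapi Event class)."""
    kind: str          # e.g. "START_GROUP", "TEXT_UNIT", "END_GROUP"
    payload: str = ""  # group name or text content, for illustration only


def wrap_subfilter_events(parent_name: str,
                          sub_events: Iterable[Event]) -> Iterator[Event]:
    """Yield START_GROUP, then the subfilter's events unchanged, then END_GROUP.

    No extra text unit is emitted for the pre-subfiltered content.
    """
    group_name = f"sub-filter:{parent_name}"
    yield Event("START_GROUP", group_name)
    for event in sub_events:
        yield event
    yield Event("END_GROUP", group_name)
```

With two text units coming out of the subfilter, the resulting stream would contain exactly four events, and none of them is a placeholder-only text unit for the parent element.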
Cutting and pasting the example from the above thread:
This XML:
<xml>
<foo><html><head><title>This is the title</title></head><body><p>This is the body.</p></body></html></foo>
</xml>
Produces this XLIFF:
<body>
<group id="tu2_ssf1" resname="sub-filter:foo">
<trans-unit id="tu2_tu1" resname="foo_3" restype="x-title">
<source xml:lang="en">This is the title</source>
</trans-unit>
<trans-unit id="tu2_tu2" resname="foo_6" restype="x-paragraph">
<source xml:lang="en">This is the body.</source>
</trans-unit>
</group>
<trans-unit id="tu2" restype="x-foo">
<source xml:lang="en"><x id="1"/></source>
</trans-unit>
<group id="tu1_ssf2" resname="sub-filter:xml">
</group>
<trans-unit id="tu1" restype="x-xml">
<source xml:lang="en"><x id="1"/></source>
</trans-unit>
</body>
So, there are a couple of things going on here. The subfiltered TUs appear in the tu2_ssf1 group. This is followed by the tu2 TU, which consists only of a placeholder, presumably representing the subfiltered content. There's then another group+TU pair, except in this case the group is also empty. This corresponds to subfiltering the whitespace between the <xml> and <foo> elements.
Comments (3)
Comment 1. originally posted by @ysavourel on 2013-04-05T19:04:22.000Z:
I am actively working on this. It looks like it's possible to move the reference into skeleton and have everything still work. The nastier part is that in order to avoid producing an empty TU, the skeleton from the partially-built TU needs to be shifted into a document part. Unfortunately, that skeleton includes a [$$self$] reference, which needs to be stripped (as it's no longer true). I'm still looking for the cleanest way to do this.
Comment 2. originally posted by @ysavourel on 2013-04-13T07:38:41.000Z:
Fix merged to dev, commit 43d37c87bd023b91b817e59c67c744e290dc78c9
I'll resolve this after Jim runs his private tests.
Comment 3. originally posted by @ysavourel on 2013-04-13T15:42:35.000Z:
Changed status to resolved.