MIF Filter: inner anchored frames content is not extracted

Issue #909 resolved
Denis Konovalyenko created an issue

When one frame is anchored inside another, the content of the inner frame is not extracted.

The actual extraction is:

<body>
<trans-unit id="1" xml:space="preserve">
<source xml:lang="en">Paragraph 0.</source>
<target xml:lang="fr">Paragraph 0.</target>
</trans-unit>
<trans-unit id="2" xml:space="preserve">
<source xml:lang="en">In anchored frame 1.</source>
<target xml:lang="fr">In anchored frame 1.</target>
</trans-unit>
</body>

The expected extraction would be:

<body>
<trans-unit id="1" xml:space="preserve">
<source xml:lang="en">Paragraph 0.</source>
<target xml:lang="fr">Paragraph 0.</target>
</trans-unit>
<trans-unit id="2" xml:space="preserve">
<source xml:lang="en">In anchored frame 1.</source>
<target xml:lang="fr">In anchored frame 1.</target>
</trans-unit>
<trans-unit id="3" xml:space="preserve">
<source xml:lang="en">In anchored frame 2.</source>
<target xml:lang="fr">In anchored frame 2.</target>
</trans-unit>
<trans-unit id="4" xml:space="preserve">
<source xml:lang="en">In anchored frame 3.</source>
<target xml:lang="fr">In anchored frame 3.</target>
</trans-unit>
<trans-unit id="5" xml:space="preserve">
<source xml:lang="en">In anchored frame 4.</source>
<target xml:lang="fr">In anchored frame 4.</target>
</trans-unit>
<trans-unit id="6" xml:space="preserve">
<source xml:lang="en">In anchored frame 5.</source>
<target xml:lang="fr">In anchored frame 5.</target>
</trans-unit>
<trans-unit id="7" xml:space="preserve">
<source xml:lang="en">In anchored frame 6.</source>
<target xml:lang="fr">In anchored frame 6.</target>
</trans-unit>
</body>

Note that the extraction of text lines will be covered in a separate issue.

For more details please refer to the attached file.
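
The difference between the actual and the expected extraction can be pictured as a shallow versus a recursive walk over the frames. The following is only a rough sketch in Python, not the MIF filter's code: the `Frame` class, both functions, and the nesting used here are assumptions made up for illustration, and the attached file may be laid out differently.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Frame:
    """Hypothetical stand-in for an anchored frame: its own text plus nested frames."""
    text: str
    children: List["Frame"] = field(default_factory=list)


def extract_shallow(frames: List[Frame]) -> List[str]:
    """Visit only the frames anchored directly in the text flow (the reported behaviour)."""
    return [frame.text for frame in frames]


def extract_deep(frames: List[Frame]) -> Iterator[str]:
    """Visit every frame depth-first, so frames anchored inside other frames are included."""
    for frame in frames:
        yield frame.text
        yield from extract_deep(frame.children)


# Assumed nesting for illustration only; not taken from the attached file.
top_level = [
    Frame("In anchored frame 1.", [
        Frame("In anchored frame 2.", [Frame("In anchored frame 3.")]),
    ]),
]

print(extract_shallow(top_level))     # only the outermost frame's text
print(list(extract_deep(top_level)))  # text of all three frames, outermost first
```

A depth-first walk is used here because it keeps each inner frame's text right after the frame it is anchored in, which matches the order of the trans-units in the expected extraction above.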

Comments (4)

  1. Denis Konovalyenko reporter

    The following cases are similar to the one described above.

    1. The text flow below is formed of independent paragraphs, the first and the last of which contain frame references:

    2. A textual frame is placed inside a table cell.
