OpenXML Filter: segmentation quality reduced for some PPTX documents
Issue #977
resolved
Please consider the following extraction:
<source xml:lang="en">The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown
fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. <x id="1"
ctype="x-x" equiv-text="<tags1/>"/>The quick brown fox jumps over the lazy dog. The
quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy
dog. <x id="2" ctype="x-x" equiv-text="<tags2/>"/>The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over
the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps
over the lazy dog. <x id="3" ctype="x-x" equiv-text="<tags3/>"/></source>
There is an extra <x id="3"> code at the end of the source.
The expected output must not contain it:
<source xml:lang="en">The quick brown fox jumps over the lazy dog. The quick brown fox
jumps over the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown
fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog. <x id="1"
ctype="x-x" equiv-text="<tags1/>"/>The quick brown fox jumps over the lazy dog. The
quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy dog.
The quick brown fox jumps over the lazy dog. The quick brown fox jumps over the lazy
dog. <x id="2" ctype="x-x" equiv-text="<tags2/>"/>The quick brown fox jumps over the
lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps over
the lazy dog. The quick brown fox jumps over the lazy dog. The quick brown fox jumps
over the lazy dog. </source>
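The difference between the two extractions can be checked mechanically. The following is only a minimal sketch in Python, not code from the OpenXML filter: the helper name `trailing_placeholder_id` and the regular expression are assumptions made for illustration. It reports the id of an `<x/>` placeholder that sits directly before `</source>`, or `None` when the source ends with text.

```python
import re

# Hypothetical check, not part of the Okapi OpenXML filter.
# Matches a self-closing <x .../> placeholder that is followed only by
# optional whitespace and the closing </source> tag. Attribute values are
# matched as quoted strings so that a ">" inside equiv-text is tolerated.
TRAILING_X = re.compile(
    r'<x\s+id="(\d+)"(?:\s+[\w:-]+="[^"]*")*\s*/>\s*</source>\s*$'
)


def trailing_placeholder_id(source_xml):
    """Return the id of a placeholder that ends the source, or None."""
    match = TRAILING_X.search(source_xml)
    return match.group(1) if match else None


# Shortened versions of the reported and the expected extraction.
reported = (
    '<source xml:lang="en">The quick brown fox. '
    '<x id="1" ctype="x-x" equiv-text="<tags1/>"/>The lazy dog. '
    '<x id="3" ctype="x-x" equiv-text="<tags3/>"/></source>'
)
expected = (
    '<source xml:lang="en">The quick brown fox. '
    '<x id="1" ctype="x-x" equiv-text="<tags1/>"/>The lazy dog. </source>'
)

print(trailing_placeholder_id(reported))
print(trailing_placeholder_id(expected))
```

Placeholders in the middle of the text, such as `<x id="1">` above, are not reported because they are followed by more text rather than by `</source>`.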
For more details please refer to the attached document.
Comments (2)
reporter - A corresponding pull request #434 has been opened.
reporter - changed status to resolved
The pull request #434 has been merged.