IDML Filter: the extraction of the hyperlink text source inner elements is not fully supported
A HyperlinkTextSource story element can contain inner elements, most of which match the inner elements allowed in a CharacterStyleRange.
Below is the hyperlink text source description from the specification:
An example document with UI:
its related structure:
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/$ID/NormalParagraphStyle" LeftIndent="18" FirstLineIndent="-18" BulletsAndNumberingListType="NumberedList" NumberingContinue="false">
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Hyperlink">
    <HyperlinkTextSource Self="u105" Name="http://hyperlink-1.net 1" Hidden="false" AppliedCharacterStyle="n">
      <Content>http://hyperlink-1.net</Content>
    </HyperlinkTextSource>
  </CharacterStyleRange>
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
    <Br />
  </CharacterStyleRange>
</ParagraphStyleRange>
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/$ID/NormalParagraphStyle" LeftIndent="18" FirstLineIndent="-18" BulletsAndNumberingListType="NumberedList">
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Hyperlink">
    <HyperlinkTextSource Self="u108" Name="http://hyperlink-2.net 1" Hidden="false" AppliedCharacterStyle="n">
      <Content>http://hyperlink-2.net</Content>
      <Br />
    </HyperlinkTextSource>
  </CharacterStyleRange>
  <HyperlinkTextSource Self="u109" Name="http://hyperlink-3.net 1" Hidden="false" AppliedCharacterStyle="n">
    <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Hyperlink">
      <Content>http://hyperlink-3.net</Content>
      <Br />
    </CharacterStyleRange>
    <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Hyperlink" Underline="false">
      <Properties>
        <AppliedFont type="string">Arial</AppliedFont>
      </Properties>
      <Content>Hyperlink text source as a character style</Content>
    </CharacterStyleRange>
    <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Hyperlink">
      <Br />
    </CharacterStyleRange>
  </HyperlinkTextSource>
</ParagraphStyleRange>
and its extraction:
<file original="Stories/Story_ue7.xml" source-language="en" target-language="fr" datatype="xml">
  <body>
    <trans-unit id="P50B5830A-tu1" xml:space="preserve">
      <source xml:lang="en"><g id="1"><g id="2">http://hyperlink-1.net</g></g></source>
      <target xml:lang="fr"><g id="1"><g id="2">http://hyperlink-1.net</g></g></target>
    </trans-unit>
    <trans-unit id="P50B5830A-tu2" xml:space="preserve">
      <source xml:lang="en"><g id="1"><g id="2">http://hyperlink-2.net<x id="3"/></g></g><g id="4"><g id="5">http://hyperlink-3.net<x id="6"/></g><g id="7">Hyperlink text source as a character style<x id="8"/></g></g></source>
      <target xml:lang="fr"><g id="1"><g id="2">http://hyperlink-2.net<x id="3"/></g></g><g id="4"><g id="5">http://hyperlink-3.net<x id="6"/></g><g id="7">Hyperlink text source as a character style<x id="8"/></g></g></target>
    </trans-unit>
  </body>
</file>
The example document can be found attached.
Currently, all inner elements of hyperlink text sources are extracted as inline codes, so several paragraphs of a single hyperlink text source end up in one textual unit. The extraction has to be improved to reflect the nesting described above, at least by handling the Br tags as textual unit boundaries.
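The requested behaviour can be illustrated with a minimal sketch. It is not the filter's implementation (the filter is written in Java); it only shows the idea of closing a textual unit at every Br inside a HyperlinkTextSource. The function name `split_at_breaks` is hypothetical, and the markup is a trimmed copy of the third hyperlink from the example above.

```python
import xml.etree.ElementTree as ET


def split_at_breaks(hyperlink_source):
    """Return the Content text of a HyperlinkTextSource, split at Br elements.

    Hypothetical helper for illustration only: styles and properties are
    ignored, and every Br is treated as a textual unit boundary.
    """
    units, current = [], []
    # iter() walks the element and all descendants in document order.
    for element in hyperlink_source.iter():
        if element.tag == "Content":
            current.append(element.text or "")
        elif element.tag == "Br":
            if current:
                units.append("".join(current))
            current = []
    if current:
        units.append("".join(current))
    return units


MARKUP = """
<HyperlinkTextSource Self="u109" Name="http://hyperlink-3.net 1" Hidden="false" AppliedCharacterStyle="n">
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Hyperlink">
    <Content>http://hyperlink-3.net</Content>
    <Br />
  </CharacterStyleRange>
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Hyperlink" Underline="false">
    <Content>Hyperlink text source as a character style</Content>
  </CharacterStyleRange>
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Hyperlink">
    <Br />
  </CharacterStyleRange>
</HyperlinkTextSource>
"""

units = split_at_breaks(ET.fromstring(MARKUP))
print(units)
# ['http://hyperlink-3.net', 'Hyperlink text source as a character style']
```

With such a split, the two lines of hyperlink u109 would become two textual units instead of one unit holding both lines and the Br placeholders.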
Comments (10)
- changed milestone to 1.45.0
- edited description
- changed title to IDML Filter: the extraction of the hyperlink text source inner elements is not fully supported
- attached list-of-hyperlinks.idml
- attached list-of-hyperlinks-2.idml
- A related pull request #669 was opened.
- A new configuration option was introduced: extractHyperlinkTextSourcesInline. The default value is false. When it is set to false, the extraction of hyperlink text sources is performed as reference groups of textual units. E.g.:
<trans-unit id="P50C39A8B-tu4" xml:space="preserve">
  <source xml:lang="en"><x id="1"/></source>
  <target xml:lang="fr"><x id="1"/></target>
</trans-unit>
<group id="P77553333-rg1">
  <trans-unit id="P8441FDF-tu1" xml:space="preserve">
    <source xml:lang="en">A hyperlink </source>
    <target xml:lang="fr">A hyperlink </target>
  </trans-unit>
  <trans-unit id="P8441FDF-tu2" xml:space="preserve">
    <source xml:lang="en">text source 1<g id="1"> and text source 2 and text source 3.</g></source>
    <target xml:lang="fr">text source 1<g id="1"> and text source 2 and text source 3.</g></target>
  </trans-unit>
</group>
for markup:
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Paragraph Style 1">
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Character Style 1">
    <HyperlinkTextSource Self="u124" Name="Hyperlink 4" Hidden="false" AppliedCharacterStyle="n">
      <Content>A hyperlink</Content>
      <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Character Style 1">
        <Br/>
      </CharacterStyleRange>
      <Content>text source 1</Content>
      <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
        <Content>and text source 2</Content>
      </CharacterStyleRange>
      <Content>and text source 3.</Content>
    </HyperlinkTextSource>
    <Br/>
  </CharacterStyleRange>
</ParagraphStyleRange>
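The difference between the two modes can be modelled with a small sketch. It is a simplification under stated assumptions, not the filter's code: the names `extract`, `collect_units` and `PLACEHOLDER` are hypothetical, inline codes for character styles are dropped, and Br elements outside a referenced hyperlink text source are ignored. It only shows that, with the option set to false, the paragraph keeps a placeholder while the hyperlink text source becomes a separate group of textual units split at Br.

```python
import xml.etree.ElementTree as ET

PLACEHOLDER = "<x/>"  # stands in for the reference left in the main unit


def collect_units(element):
    """Split the Content text below `element` into units at Br elements."""
    units, current = [], []
    for node in element.iter():
        if node.tag == "Content":
            current.append(node.text or "")
        elif node.tag == "Br" and current:
            units.append("".join(current))
            current = []
    if current:
        units.append("".join(current))
    return units


def extract(paragraph, inline):
    """Return (main unit text, list of referenced groups) for a paragraph."""
    main, groups = [], []

    def visit(element):
        for child in element:
            if child.tag == "HyperlinkTextSource" and not inline:
                # Referenced mode: leave a placeholder, move text to a group.
                main.append(PLACEHOLDER)
                groups.append(collect_units(child))
            elif child.tag == "Content":
                main.append(child.text or "")
            else:
                visit(child)

    visit(paragraph)
    return "".join(main), groups


MARKUP = """
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Paragraph Style 1">
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Character Style 1">
    <HyperlinkTextSource Self="u124" Name="Hyperlink 4" Hidden="false" AppliedCharacterStyle="n">
      <Content>A hyperlink</Content>
      <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Character Style 1">
        <Br/>
      </CharacterStyleRange>
      <Content>text source 1</Content>
      <Content>and text source 3.</Content>
    </HyperlinkTextSource>
    <Br/>
  </CharacterStyleRange>
</ParagraphStyleRange>
"""

paragraph = ET.fromstring(MARKUP)
main, groups = extract(paragraph, inline=False)
print(main)    # <x/>
print(groups)  # [['A hyperlink', 'text source 1and text source 3.']]
```

Calling `extract(paragraph, inline=True)` in this model returns all text in the main unit and an empty list of groups, which corresponds to the previous behaviour where everything stays in one textual unit.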
- changed status to resolved
Pull request #669 was merged.
Hi @Denis Konovalyenko, you have looked into an issue related to the IDML filter before here. Would you mind looking into this one, as you might have better context?