DocBook 5.0 XML Filter

DocBook 5.0 is an XML-based standard format for document publishing.
DocBook defines various elements, including inline elements such as emphasis, link, and literal. If the vanilla XMLFilter is used for extraction, these inline elements are not treated as part of the surrounding text.
Here is a short DocBook example:

<article xmlns='http://docbook.org/ns/docbook'>
  <title>Example emphasis</title>
  <para>The <emphasis>most</emphasis> important example of this phenomenon occurs in
  A. Nonymous's book <citetitle>Power Snacking</citetitle>.
  </para>
</article>

If we apply a vanilla XMLFilter to this, we’ll get six trans-units:

Example emphasis
The
most
important example of this phenomenon … book
Power Snacking
.

This isn’t what we want. For this DocBook example, we’d like to get two trans-units instead:

Example emphasis
The <g id="1">most</g> important example of this phenomenon ... book <g id="2">Power Snacking</g>.

This issue proposes creating a predefined XML Filter configuration, perhaps named okf_xml-docbook.
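
As a rough illustration of what such a configuration might contain, here is a minimal sketch. It assumes the configuration is expressed as W3C ITS 1.0 rules; the list of inline elements is only a starting point taken from the examples above, not a complete inventory of DocBook inline elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: an assumed rule set, not a finished okf_xml-docbook configuration -->
<its:rules xmlns:its="http://www.w3.org/2005/11/its"
           xmlns:db="http://docbook.org/ns/docbook"
           version="1.0">
  <!-- Treat these DocBook inline elements as part of the surrounding text,
       so they become inline codes instead of splitting the paragraph -->
  <its:withinTextRule withinText="yes"
      selector="//db:emphasis | //db:citetitle | //db:link | //db:literal"/>
</its:rules>
```

With rules along these lines, the para in the example above should be extracted as a single trans-unit, with emphasis and citetitle represented as inline codes.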
