IDML Filter: Merge tags that differ only by kerning, tracking, leading or baseline shift

Issue #756 resolved
Chase Tingley created an issue

I have an example of this that I can't share publicly; I'll work on getting a clean sample.

Some IDML documents contain text that micromanages the kerning on a nearly per-character basis:

            <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]" FillColor="Color/Black" BaselineShift="1.2" KerningValue="-2">
                <Content>A</Content>
            </CharacterStyleRange>
            <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]" FillColor="Color/Black" BaselineShift="1.2" KerningValue="-10">
                <Content>B</Content>
            </CharacterStyleRange>
            <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]" FillColor="Color/Black" BaselineShift="1.2" KerningValue="-30">
                <Content>C</Content>
            </CharacterStyleRange>
            <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]" FillColor="Color/Black" BaselineShift="1.2" KerningValue="-40">
                <Content>D</Content>
            </CharacterStyleRange>

This produces unusable segments:

<source xml:lang="en"><g id="1">A</g><g id="2">B</g><g id="3">C</g>...

This is similar to the thing we've seen in DOCX where horizontal and vertical spacing is managed per-character; we added an option to ignore those formatting differences. I think in this case it should always be safe to merge tags that differ only in kerning value.
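A minimal sketch of the proposed merge, assuming a simplified in-memory model of the ranges. `StyleRange` and `KerningMerger` are hypothetical names used only for illustration, not Okapi classes, and dropping the kerning value from the merged range is likewise an assumption:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a CharacterStyleRange: its XML attributes plus its text content.
final class StyleRange {
    final Map<String, String> attributes;
    final String content;

    StyleRange(Map<String, String> attributes, String content) {
        this.attributes = attributes;
        this.content = content;
    }
}

final class KerningMerger {
    // Attribute name as it appears in the IDML snippet above.
    static final String KERNING = "KerningValue";

    // Returns a copy of the attributes with the kerning value removed.
    static Map<String, String> withoutKerning(Map<String, String> attributes) {
        Map<String, String> copy = new LinkedHashMap<>(attributes);
        copy.remove(KERNING);
        return copy;
    }

    // Merges neighbouring ranges whose attributes are equal once kerning is ignored.
    // Assumption: the merged range carries no kerning value at all.
    static List<StyleRange> merge(List<StyleRange> ranges) {
        List<StyleRange> merged = new ArrayList<>();
        for (StyleRange range : ranges) {
            Map<String, String> key = withoutKerning(range.attributes);
            if (!merged.isEmpty()) {
                StyleRange last = merged.get(merged.size() - 1);
                if (last.attributes.equals(key)) {
                    // Same style apart from kerning: append the text to the previous range.
                    merged.set(merged.size() - 1, new StyleRange(key, last.content + range.content));
                    continue;
                }
            }
            merged.add(new StyleRange(key, range.content));
        }
        return merged;
    }
}
```

Applied to the four ranges in the snippet, this would leave a single range containing "ABCD", so the extracted segment would need at most one inline tag pair.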

Comments (22)

  1. Xu Lihang

    I see the wiki says that prior to m34 there was a parameter to simplify inline codes when possible, but it is now deprecated... XLIFF's inline tags work as placeholders, and we translators do not have much freedom. I think it would be very convenient to convert each paragraphstylerange to a tag like <p1> and each characterstylerange to a tag like <c1 id="0">, where the number stands for a style id and id stands for a local style rank. Adjacent style ranges with the same style id and different local styles can be merged like this:

    Before:

    <p1><c0 id="0">Th</c0>
    <c0 id="1">e </c0>
    <c0 id="2">Rea</c0>
    <c0 id="3">l </c0>
    <c0 id="4">Scienc</c0>
    <c0 id="5">e </c0>
    <c0 id="6">of
    </c0>
    <c0 id="7">Supers</c0>
    </p1>
    

    After:

    <p1><c0 id="0">The Real Science of
    Supers</c0></p1>
    

    This is how I do it in the open source basiccat project.

  2. Denis Konovalyenko

    @tingley , I have taken a look at this and composed an example, which is based on the provided story snippet (please find the 765-character-kerning.idml file attached). For better comprehension, the corresponding screenshot of this example can be observed below as well:

    756-character-kerning.png

    As you can see, there are other character range style options, which can also be present. For instance, tracking is really similar to kerning:

    756-character-tracking.png

    The already mentioned vertical and horizontal spacing:

    756-character-vertical-scale.png 756-character-horizontal-scale.png

    Font size, leading, baseline shift and skew:

    756-character-font-size.png 756-character-leading.png 756-character-baseline-shift.png 756-character-skew.png

    Do you think it would be better to take care of kerning and tracking in the scope of this issue or create a new one for tracking? What is your opinion about other styles? Would it be reasonable to create separate issues for those cases or they might not ever happen and should not be looked into at all?

    Regarding the implementation details, what if we go beyond a plain "ignore" option and introduce OPTIONAL min and max threshold values, within which the styles would be considered equal (they would be provided as configuration parameters for the IDML filter along with the mentioned "ignorance" flag)? E.g.:

    Kerning/tracking would not be taken into account (ignored) if min kerning/tracking threshold <= actual kerning/tracking <= max kerning/tracking threshold. If the min/max thresholds are not provided, the kerning/tracking character styles would be completely ignored.

  3. Chase Tingley reporter

    @DenisKonovalyenko I think the idea of a threshold would be a practical approach to this problem. Do we even need a minimum threshold, though? I would think that in practice it would basically always be set to 0. (In other words, the desired behavior is "ignore kerning/tracking/etc that is insignificant and below a certain threshold", which only requires one value.)

    My guideline is "if it affects the appearance of the characters, leave it alone". I think we should cover a subset of the ones you've highlighted for now:

    And leave these ones alone for now:

    • font size. (I think in practice most font size changes are intentional)
    • horizontal and vertical scale -- these look like they are actually stretching the text, rather than affecting the spacing
    • skew

    I am not sure what to do about baseline shift. Maybe ignoring it with a threshold would be good.

    Let's cover all the options we want to ignore in a single fix.

  4. Denis Konovalyenko

    @tingley , thanks for the quick answer!

    Regarding the need for the minimum threshold: I see your point, but in my opinion two thresholds (min and max) give a better user experience, because users can specify explicit values rather than guess which minimum threshold value is set implicitly in the system. Moreover, kerning and tracking can have negative threshold values as well as positive ones (please refer to the original story snippet, which contains kerning values of -2, -10, -30 and -40). The same is true for the baseline shift (e.g. -7pt..7pt). Vertical and horizontal scales can have meaningful ranges (e.g. 25%..175%), and so can skew (-85..+85 degrees). The only exceptions are font size and leading (0pt..12pt), but someone might want to set a 10pt..14pt range, and that would not be possible without a minimum threshold.

    Summing up, a more unified approach that allows specifying 2 thresholds would make everyone's life easier, I believe.

    Also, if we agree on this, I would also like to clarify the following implementation details (when 2 thresholds would be available for a user to set):

    1. If ignorance flag is set for a style but thresholds are not specified, then the style will be ignored completely.
    2. If ignorance flag is set for a style and only min threshold is specified, then the style will be ignored if the style value is greater than or equal to the min threshold value.
    3. If ignorance flag is set for a style and only max threshold is specified, then the style will be ignored if the style value is less than or equal to the max threshold value.
    4. If ignorance flag is set for a style and min and max thresholds are specified, then the style will be ignored if the style value is greater than or equal to the min threshold value and less than or equal to the max threshold value.
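The four rules above can be sketched as a single check. This is only an illustration under assumptions: `IgnoranceThresholds` and `isIgnored` are hypothetical names, not part of the Okapi API, and an absent threshold is modelled with `OptionalDouble`:

```java
import java.util.OptionalDouble;

final class IgnoranceThresholds {
    // Hypothetical helper: decides whether a style value should be ignored.
    // An empty threshold means "not specified" and places no bound on that side.
    static boolean isIgnored(boolean ignoranceFlag, double value,
                             OptionalDouble min, OptionalDouble max) {
        if (!ignoranceFlag) {
            return false; // the style is never ignored when the flag is off
        }
        boolean aboveMin = !min.isPresent() || value >= min.getAsDouble();
        boolean belowMax = !max.isPresent() || value <= max.getAsDouble();
        return aboveMin && belowMax;
    }
}
```

With no thresholds both bounds are open, so the style is ignored completely (rule 1). With only one threshold the other side stays open (rules 2 and 3), and with both the value must fall inside the closed interval (rule 4).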

    Last but not least, the scope of this issue is going to be limited to the following styles:

    • kerning
    • tracking
    • leading
    • baseline shift

    A separate issue for font size, vertical scale, horizontal scale and skew will be created.

  5. Xu Lihang

    I think we should also consider the target text. If some of these styles are applied to characters within a word, it will be difficult to find an equivalent one in the target text. Then it is best to abandon these styles.

    So we can decide whether to ignore these styles based on whether they are applied to a whole word or only to some of its characters.

  6. Denis Konovalyenko

    @xulihang, the styles correction is going to be applied before Okapi events are formed (depending on the filter configuration parameters specified). So, I think, everything would be aligned.

  7. Xu Lihang

    @DenisKonovalyenko, this is not very clear to me. Could you explain more?

    Say the source text is "Real". It should be translated as "真" in Chinese. The first "R" has a different kerning or font size. The extracted xliff is

    <g id="1">R</g><g id="2">eal</g>
    

    As for the translation, it can only be like this:

    <g id="1"></g><g id="2"></g>
    
  8. Chase Tingley reporter

    @xulihang This is an issue with any format, right? We could have the same scenario with HTML? For example

    <span style="...">R</span>eal?
    
  9. Denis Konovalyenko

    @tingley , the IDML filter configuration in the Rainbow editor looks like this:

    rainbow-editor-idml-filter-configuration.png

    All ignorance threshold inputs are set as optional (net.sf.okapi.common.uidescription.TextInputPart#allowEmpty is true). What I do not really like is that the thresholds of integer type (kerning and tracking) are filled with 0s, not when the configuration is opened for editing but when it is saved. This leads to confusion, with ignoreCharacterKerning.b=true and characterKerningMinIgnoranceThreshold.i=0, characterKerningMaxIgnoranceThreshold.i=0 (the style ignorance is in effect but the thresholds are not meaningful).

    A reasonable solution for this would be to use string type thresholds with "manual" checks for allowed values in the corresponding "set" methods of net.sf.okapi.filters.idml.Parameters. UX would be degraded a bit but not that significantly: the error message would be presented without selecting and focusing the affected input field.
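A rough sketch of the string-typed approach, under assumptions: `ThresholdParameters` and its methods are hypothetical stand-ins for illustration and do not reproduce the actual net.sf.okapi.filters.idml.Parameters class. An empty string means "no threshold", and anything else must parse as an integer:

```java
import java.util.OptionalInt;

// Hypothetical stand-in for a parameters class with a string-typed threshold.
final class ThresholdParameters {
    private String kerningMinThreshold = "";

    // Accepts an empty value (threshold not specified) or an integer literal.
    void setKerningMinThreshold(String raw) {
        String trimmed = raw == null ? "" : raw.trim();
        if (!trimmed.isEmpty()) {
            try {
                Integer.parseInt(trimmed); // validation only; the string is what gets stored
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                    "Kerning min threshold must be empty or an integer: " + raw, e);
            }
        }
        this.kerningMinThreshold = trimmed;
    }

    // Empty when no threshold was specified, so 0 is never written implicitly.
    OptionalInt getKerningMinThreshold() {
        return kerningMinThreshold.isEmpty()
            ? OptionalInt.empty()
            : OptionalInt.of(Integer.parseInt(kerningMinThreshold));
    }
}
```

Storing the raw string keeps "not specified" distinguishable from an explicit 0, which is the ambiguity described above for the integer-typed fields.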

    If you come up with something better, please let me know.

  10. Chase Tingley reporter

    It's not great UX but the ultimate solution would be to use a better widget, I think, and that's not really a priority for me right now. I think understanding the range of these values would probably require consulting our docs anyways, so if the behavior is a little strange with the integer types, it's ok.

  11. Denis Konovalyenko

    @tingley , you have asked for "reasonable" values to use while testing in the scope of the pull request #279. So, below you may find my thoughts on this.

    1. Kerning

    756-character-kerning.png

    Threshold values of -50 to 50 look appropriate to me.

    2. Tracking

    756-character-tracking.png

    The same as for kerning: -50 to 50.

    3. Leading

    756-character-leading.png

    The behaviour really depends on the font size (the lower line would even have overlapped the upper line if the leading of all characters had been about 0). So, it would be better to first understand what font sizes are possible in the document before specifying the thresholds.

    4. Baseline Shift

    756-character-baseline-shift.png

    In my opinion, values of -2 to 2 can be good starting points to consider.
