Some Unicode characters displayed wrongly in the text results panel

Issue #75 on hold
jsbien created an issue

Note hit #21, it's actually a with dot above.

Comments (19)

  1. jsbien reporter

    The problem also occurs on the marasca web page; it is easily reproduced with the query [] ci przenieśli do on the IMPACT GT corpus (2-d). Font bug? I will ask Wilk.

  2. jsbien reporter
    • changed status to open

    I'm not sure I can reproduce the problem now, but I notice another one, which is more specific: the display seems to use some fallback technique, which in this case is incorrect. For example, if the font doesn't contain a specific ligature, the ligature is converted into a sequence of characters.

    The display should use only the font selected - can it be achieved?
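One way a ligature can turn into a sequence of characters is Unicode compatibility normalization. The sketch below assumes the replacement resembles NFKC normalization; the thread does not establish which mechanism the display actually uses, and `expand_ligatures` is a hypothetical helper name.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Apply NFKC normalization, which replaces compatibility
    characters such as ligatures with their plain-letter sequences."""
    return unicodedata.normalize("NFKC", text)


sample = "\ufb01sh"  # LATIN SMALL LIGATURE FI followed by "sh"
expanded = expand_ligatures(sample)
# The ligature is one code point; after expansion it becomes two letters.
print(len(sample), len(expanded))
```

Note that NFKC affects other compatibility characters as well, so it only illustrates the effect described above and is not a suggested fix.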

  3. Michał Rudolf repo owner

    It is hard to know what happened from such a general description, but one clarification: "the [missing] ligature is converted into a sequence of characters" doesn't imply that the display doesn't use the selected font. Is that really true? What is the source of the replacement?

    Also, with such low-level errors, more technical details are needed: OS, OS version, Qt version.

  4. jsbien reporter

    Install and select Parkosz font: https://bitbucket.org/jsbien/parkosz-font/downloads/Parkosz.ttf, which contains only a very small number of characters, cf. https://bitbucket.org/jsbien/parkosz-font/downloads/fntsampleParkosz.pdf.

    Already when selecting the font you will notice that the sample is displayed using other fonts as well; the same holds for search results.

    By default some fall-back mechanism is used which should be switched off. I guess it applies to any OS and any Qt version.
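To see which characters of a hit would trigger the fallback, the text can be compared against the set of code points the font covers. This is a minimal sketch: the coverage set below is made up for illustration (plain ASCII letters only) and is not the real coverage of Parkosz, and `missing_characters` is a hypothetical helper.

```python
import unicodedata


def missing_characters(text, covered):
    """Return the sorted non-space characters of `text` whose code
    points are not in `covered` (a set of integers)."""
    return sorted(
        {ch for ch in text if not ch.isspace() and ord(ch) not in covered}
    )


# Hypothetical coverage: basic ASCII letters only.
covered = set(range(ord("a"), ord("z") + 1)) | set(range(ord("A"), ord("Z") + 1))

# Words from the query quoted earlier in this thread.
for ch in missing_characters("ci przenie\u015bli do", covered):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```

With a real font, the coverage set would have to be read from the font's character map, which this sketch does not attempt.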

  5. Michał Rudolf repo owner

    I tested some non-Qt programs and it seems this behavior is a low-level system fallback. There is no way to disable it other than reimplementing font handling from scratch.

  6. Michał Rudolf repo owner

    I doubt it; isn't it the Gtk library? Even if we did, the suggested solution requires the user to recompile and install their own copy of the Pango library. I seriously doubt anybody will do it.

  7. jsbien reporter

    Recent version of djview4poliqarp, IMPACT 2-d, query [orth='[[=w=]]'/x & orth='[^wW]+'], font Junicode selected

    W with acute (LATIN CAPITAL LETTER W WITH ACUTE, 0x1E82) is displayed as a replacement character

    CORRECTED: this is not the replacement character.

    in the textual results panel, which is misleading, because the replacement character occurs also in the original texts, and correctly (because of the fallback?) in the metadata panel.

    CORRECTED: this is perhaps the correct behaviour.

    Looks like font selection doesn't affect the metadata panel!

    COMMENT: does it deserve a separate issue?

  8. jsbien reporter

    The same test but font TeXGyreSchola. The second hit contains LATIN SMALL LETTER W WITH DOT ABOVE (0x1E87). The character is simply omitted in the text results panel, which is definitely wrong.
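The code points and names quoted in the two reports above can be double-checked with Python's standard `unicodedata` module. The sketch also prints each character's canonical decomposition (base letter plus combining mark); whether the renderer involved here uses such a decomposition is not known from this thread. `describe` is a hypothetical helper name.

```python
import unicodedata


def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME -> U+.... U+....' with the NFD decomposition."""
    ch = chr(code_point)
    parts = unicodedata.normalize("NFD", ch)
    decomposed = " ".join(f"U+{ord(p):04X}" for p in parts)
    return f"U+{code_point:04X} {unicodedata.name(ch)} -> {decomposed}"


for cp in (0x1E82, 0x1E87):
    print(describe(cp))
```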
