Encoding problem - All German Umlauts are replacement characters '\ufffd'

Issue #769 new
Sven Liebehentze
created an issue

See in controlsfx_de_DE.properties.

Comments (21)

  1. Wolfgang Fahl

    I would love to fix this, but the change process (signing the CLA, learning Bitbucket) is too much work for such a simple change. It would be much better if the source code were hosted on GitHub and no CLA signing were required for small changes like language settings. Also, at this point I cannot even find the source for controlsfx_de_DE.properties. Where is this file?

    https://bitbucket.org/controlsfx/controlsfx/src/8f65cfd8545ac2bc135680cc7494955063f2b084/controlsfx/src/main/resources/controlsfx_de_DE.properties?at=default&fileviewer=file-view-default

    looks like an old version; it does not contain the definition for wizard.next, which Sven's issue shows as an example of the breakage.

  2. Romain DEP.

    Came to report the same problem with the French locale. Could it be a bug in the way the API result is handled?

    $ curl -L --user api:<token> -X GET https://www.transifex.com/api/2/project/controlsfx/resource/controlsfx-core/translation/fr_FR\?file -H "Accept-Charset: ISO-8859-1" > bytes.txt
    
    $ file bytes.txt 
    → bytes.txt: ISO-8859 text
    

    So, I guess, everything's fine on the transifex side?

  3. Jonathan Giles

    The encoding is handled in the controlsfx/build.gradle file around line 126 (the native2ascii method), and in particular the EscapeUnicode call. I don't know enough about what is required, so I hope someone can propose a patch.

  4. Jonathan Giles

    In Transifex, the text for 'Select Font' is 'Sélectionnez une police'. When I disable the filter(EscapeUnicode) call, I get 'SÈlectionnez une police'. When the build script is unchanged, I get 'S\ufffdlectionnez une police'.

    The issue I have is I don't know what is expected in the translation files - is it some escaped text, or is it always the text from transifex?
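One way the '\ufffd' symptom can arise is sketched below: a byte that is valid in ISO-8859-1 (0xE9 for 'é') is not a valid UTF-8 sequence on its own, so a UTF-8 decoder substitutes U+FFFD, and any later escaping step then writes that out as '\ufffd'. This is a minimal illustration under the assumption that the downloaded file is ISO-8859-1 and is decoded as UTF-8 somewhere along the way; the class and method names are hypothetical and not taken from the ControlsFX build.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Decodes raw bytes with an explicit charset (hypothetical helper for this sketch).
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The bytes an ISO-8859-1 encoded file would contain for this word.
        byte[] latin1Bytes = "S\u00e9lectionnez".getBytes(StandardCharsets.ISO_8859_1);

        String asLatin1 = decode(latin1Bytes, StandardCharsets.ISO_8859_1);
        String asUtf8 = decode(latin1Bytes, StandardCharsets.UTF_8);

        // Decoding with the matching charset preserves the accented letter.
        System.out.println("ISO-8859-1 decode keeps e-acute: " + (asLatin1.charAt(1) == '\u00e9'));
        // Decoding as UTF-8 replaces the lone 0xE9 byte with the replacement character.
        System.out.println("UTF-8 decode yields U+FFFD: " + (asUtf8.charAt(1) == '\ufffd'));
    }
}
```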

  5. Jonathan Giles

    I think the issue is probably in the impl.build.transifex.Transifex class - rather than writing out a stream, we should write out text, but I don't have time to fix this right now.
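The idea of writing out text rather than a raw stream could look roughly like the sketch below: decode the incoming bytes with a known charset, then re-encode them with the charset the consumer expects. This is not the actual impl.build.transifex.Transifex code; the class name, method names, and the choice of charsets are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TranscodeSketch {
    // Copies characters, not raw bytes: decode with `from`, re-encode with `to`.
    static void copyAsText(InputStream in, Charset from, OutputStream out, Charset to) {
        try (Reader reader = new InputStreamReader(in, from);
             Writer writer = new OutputStreamWriter(out, to)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience wrapper for in-memory data (hypothetical helper for this sketch).
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyAsText(new ByteArrayInputStream(input), from, out, to);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] utf8 = "S\u00e9lectionnez une police".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(utf8.length + " bytes in, " + latin1.length + " bytes out");
    }
}
```

Making both charsets explicit parameters avoids depending on the platform default encoding, which differs between machines.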

  6. Romain DEP.

    So, I downloaded the French properties file from transifex, and replaced it in controlsfx-8.40.13.jar successfully:

    proper_locale.png

    For translations outside the ISO-8859-1 range, it appears that Transifex does the escaping automatically. For instance, the Arabic resource straight from Transifex looks like: dlg.ok.button = \u0645\u0648\u0627\u0641\u0642

    From that, I think we can safely assume that transifex does the bulk of the escaping and encoding for us.
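The escaping described above can be sketched as follows: characters that do not fit in ISO-8859-1 are written as a backslash, the letter u, and four hex digits, and java.util.Properties turns them back into the original characters when loading. This is only an illustration of the format, not Transifex's implementation; the class and helper names are hypothetical, and the sketch ignores other characters (such as backslashes or line breaks) that a real properties writer must also escape.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class EscapeSketch {
    // Escapes every character above U+00FF; ISO-8859-1 characters pass through unchanged.
    static String escapeNonLatin1(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0xFF) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Parses one "key = value" line with java.util.Properties, which undoes the escapes.
    static String roundTrip(String key, String escapedValue) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(key + " = " + escapedValue));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0648\u0627\u0641\u0642";
        String escaped = escapeNonLatin1(arabic);
        System.out.println(escaped);
        System.out.println(roundTrip("dlg.ok.button", escaped).equals(arabic));
    }
}
```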

  7. Romain DEP.

    No go for me:

    Current: 8.40.13.png

    With the change: 8.40.14-SNAPSHOT.png

    The reason is simple: the new .properties file is now UTF-8 encoded, while Java expects ISO-8859-1. If I convert it from UTF-8 to ISO-8859-1 with:

    iconv -f UTF-8 -t ISO-8859-1 controlsfx_fr_FR.properties -o controlsfx_fr_FR.properties.new

    the file I obtain can be used.
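The mismatch described here can be reproduced in a few lines. Properties.load(InputStream) is documented to read ISO-8859-1, so a UTF-8 encoded file comes out garbled unless it is loaded through a Reader with an explicit charset. The sketch below assumes a hypothetical key name (example.key) and uses in-memory bytes in place of the real controlsfx_fr_FR.properties file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {
    // Loads via InputStream: the bytes are interpreted as ISO-8859-1.
    static String loadFromStream(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    // Loads via Reader: the bytes are decoded as UTF-8 before parsing.
    static String loadFromUtf8Reader(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // "example.key" is a made-up key for this sketch.
        byte[] utf8File = "example.key = S\u00e9lectionnez une police\n"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(loadFromStream(utf8File, "example.key"));
        System.out.println(loadFromUtf8Reader(utf8File, "example.key"));
    }
}
```

The first call returns the two-character garbling of 'é' (U+00C3 followed by U+00A9), while the second returns the intended text. Converting the file to ISO-8859-1, as the iconv command does, makes the stream-based load work without changing the consumer.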

  8. Jonathan Giles

    Yes, I thought about that too, but I really didn't want to fuss about different encodings for different releases! :-) Hopefully the new approach works for all releases / platforms, but I really want more testing.
