resolve error while exporting special characters (e.g. æøå)

#70 Declined
Repository
elenhinan
Branch
develop
Repository
openrem
Branch
develop
Author
  1. Njål Brekke
Reviewers
Description

When I tried to export to Excel and CSV, the log contained errors because the study description included special (Norwegian) characters that could not be converted into the ASCII character set.

This patch fixes that by encoding all Unicode strings as UTF-8 before writing.
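
The idea can be sketched as follows. This is not the patch itself: it assumes Python 3, where the text is built first and encoded once on the way out rather than string by string, and the function name `write_rows_utf8` and the sample rows are hypothetical.

```python
import csv
import io


def write_rows_utf8(rows):
    """Serialise rows to CSV text, then return it as UTF-8 encoded bytes."""
    # newline="" leaves line endings entirely to the csv module.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerows(rows)
    # Encode explicitly so non-ASCII characters never pass through an
    # implicit ASCII conversion.
    return buffer.getvalue().encode("utf-8")


# Hypothetical study description containing Norwegian characters.
csv_bytes = write_rows_utf8([["Study description"], ["Thorax æøå"]])
```

The returned bytes can be written to a file opened in binary mode, which keeps the encoding decision in one place.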

I still couldn't export to Excel because of multiple X-ray filters (Exception: get() returned more than one XrayFilters -- it returned 2!), but this might already be fixed in development; my server is running 0.7.4.

The CSV export now outputs UTF-8 encoded files. Although ÆØÅ are included in 8-bit extensions of ASCII such as ISO 8859-1 (Latin-1), though not in 7-bit ASCII itself, using UTF-8 should work with more exotic characters as well.
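
The difference in coverage can be checked directly. The snippet below is an illustrative sketch assuming Python 3; the variable names are hypothetical and the Arabic sample is just an example of text outside the Latin-1 range.

```python
norwegian = "ÆØÅ"
arabic = "مستشفى واحد"


def can_encode(text, encoding):
    """Return True if text can be represented in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# 7-bit ASCII has no code points for the Norwegian letters.
norwegian_in_ascii = can_encode(norwegian, "ascii")

# Latin-1 maps each Norwegian letter to a single byte, but not Arabic text.
norwegian_in_latin1 = can_encode(norwegian, "latin-1")
arabic_in_latin1 = can_encode(arabic, "latin-1")

# UTF-8 covers both strings and round-trips them without loss.
utf8_round_trip = all(
    s.encode("utf-8").decode("utf-8") == s for s in (norwegian, arabic)
)
```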

Comments (4)

  1. Ed McDonagh

    Thanks very much for this @elenhinan. Sorry for the delay in responding.

    I will add some of those characters to the existing test files and add those fields to the export tests to ensure they fail with the current code, then try again with your modifications. However, it might be a few days before I get a chance to do so!

  2. Ed McDonagh

    Hi @elenhinan. I took a look at this this evening. I think the issue is that Microsoft Excel doesn't deal very well with encoding in CSV files. For me, using my RF test files, which contain the characters مستشفى واحد, your fixes don't help. They do change the presentation in Excel, but the characters are still wrong!

    The only way I know of to make it work is as per this Stack Overflow answer: open Excel, then import the CSV file and set the 'File origin' to 65001 : Unicode (UTF-8). With this method, the original export and the export with your modifications both work fine.

    Thankfully this doesn't affect XLSX exports. It is also difficult for me to write test cases for this, because I believe it is an issue with Excel rather than OpenREM. LibreOffice opens the CSV file correctly without any modification.

    Would you please try to replicate my finding, both with your Norwegian characters and data and with RF-RDSR-Siemens-Zee.dcm?

    Thanks,

    Ed

  3. Ed McDonagh

    Hi @elenhinan - did you get any further with this?

    I know the answer I came up with was unsatisfactory, but were you able to replicate what I found? I'd like to close this PR if we can't make the situation better, though we should describe the situation in the docs.
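
As background for any documentation note on Excel and CSV encoding: one widely reported option, which was not tried in this thread, is to start the CSV file with a UTF-8 byte-order mark, which Excel generally treats as a hint to decode the file as UTF-8 when it is opened directly. The sketch below assumes Python 3 and its built-in `utf-8-sig` codec; the function name and sample row are hypothetical, and whether this suits OpenREM's exports is an assumption that would need checking in Excel itself.

```python
import codecs
import csv
import io


def csv_bytes_with_bom(rows):
    """Return CSV data as UTF-8 bytes prefixed with a byte-order mark."""
    text = io.StringIO(newline="")
    csv.writer(text).writerows(rows)
    # The utf-8-sig codec prepends the three-byte BOM (EF BB BF) on encoding.
    return text.getvalue().encode("utf-8-sig")


data = csv_bytes_with_bom([["Hospital"], ["مستشفى واحد"]])
has_bom = data.startswith(codecs.BOM_UTF8)

# Decoding with utf-8-sig removes the BOM again, so the text is unchanged.
text_back = data.decode("utf-8-sig")
```

Readers that decode with plain `utf-8` will see the mark as a leading U+FEFF character, so this trades convenience in Excel against a small quirk elsewhere.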