Switch decoding fixes to use .decode()

Issue #503 resolved
Ed McDonagh created an issue

Non-ASCII encoding issues have been tackled in various ways across the versions, addressing issues #256, #385, #400, #403, #476.

Turns out I should have just added a simple ds.decode() as soon as the file was imported. I think.

This will mainly replace the decoding function added to get_value_kw() and related functions, but it will be better because it is done once at the start and can cope with sequences that use multiple encodings.
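To illustrate the idea of decoding once at import, here is a minimal sketch using only the standard library. It is not the project's code and not the DICOM library's API: `decode_dataset`, `CHARSET_TO_CODEC` and the plain-dict "dataset" are hypothetical stand-ins, and the charset table lists only three entries. The point is that a single recursive pass converts every byte string up front, with each sequence item using its own Specific Character Set when it declares one.

```python
# Hypothetical, heavily truncated mapping from DICOM Specific Character Set
# terms to Python codec names. A real library carries a much fuller table.
CHARSET_TO_CODEC = {
    "ISO_IR 6": "ascii",
    "ISO_IR 100": "latin_1",
    "ISO_IR 192": "utf_8",
}


def decode_dataset(dataset, inherited_charset="ISO_IR 6"):
    """Return a copy of `dataset` with every bytes value decoded to str.

    `dataset` is a plain dict standing in for a parsed file (an assumption
    made for this sketch). A list of dicts stands in for a sequence; each
    item may declare its own "SpecificCharacterSet", otherwise it inherits
    the charset of its parent.
    """
    charset = dataset.get("SpecificCharacterSet", inherited_charset)
    codec = CHARSET_TO_CODEC[charset]
    decoded = {}
    for keyword, value in dataset.items():
        if isinstance(value, bytes):
            decoded[keyword] = value.decode(codec)
        elif isinstance(value, list):
            # Assumed to be a sequence: recurse into every item.
            decoded[keyword] = [decode_dataset(item, charset) for item in value]
        else:
            decoded[keyword] = value
    return decoded


# Example: a Latin-1 parent dataset holding a sequence item encoded as UTF-8.
raw = {
    "SpecificCharacterSet": "ISO_IR 100",
    "InstitutionName": b"H\xf4pital",
    "RequestAttributesSequence": [
        {
            "SpecificCharacterSet": "ISO_IR 192",
            "StudyDescription": "Thorax \u00e5".encode("utf-8"),
        },
    ],
}
ds = decode_dataset(raw)
```

After this single pass, later helpers such as get_value_kw() can read values as text without any per-value decoding of their own.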

Comments (37)

  1. Ed McDonagh reporter

    Now works for mammo, committing to test with pipelines and postgres. Lots of comments to be removed. Possibly impossible to test unicode values in tests, as everything is unicode. Could insert concatenation in extractor to test instead... Refs #503

    → <<cset e809b6018a15>>

  2. Ed McDonagh reporter

    Moved Kodak float fix to new function carried out at start of import if decode fails. Removed original test, replaced with file import based test of the same. Refs #503

    → <<cset 9a2a6aa2419e>>
