Full-width English letters are ignored by XLIFFWriter

Issue #21 resolved
Xinqi Su created an issue


Full-width English characters in segment source/target content are ignored by XLIFFWriter when writeUnit() is called. The same happens to half-width East Asian characters.

We have the following use cases:

  • both full-width and half-width English characters co-exist in source content

  • both half-width English and Japanese characters co-exist in source content

In the use cases above, part of the content is missing after conversion to XLIFF because the Okapi XLIFFWriter ignores those characters. Would it be possible for Okapi to handle these cases? That would be very helpful. Thank you.
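
For anyone trying to reproduce this, the characters involved appear to be those in the Unicode block "Halfwidth and Fullwidth Forms" (U+FF00 to U+FFEF). Below is a minimal sketch that lists which code points of a sample string fall in that block. It uses only the Java standard library and does not call the Okapi API; the class name, method name, and sample strings are hypothetical and only mirror the two use cases.

```java
import java.util.ArrayList;
import java.util.List;

public class WidthFormsProbe {

    // Returns the code points of `text` that belong to the Unicode block
    // "Halfwidth and Fullwidth Forms" (U+FF00..U+FFEF), in order of appearance.
    static List<Integer> widthFormCodePoints(String text) {
        List<Integer> found = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            if (Character.UnicodeBlock.of(cp)
                    == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS) {
                found.add(cp);
            }
        });
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs (assumptions, not taken from real data):
        // ASCII "ABC" followed by full-width A, B, C
        String mixedEnglish = "ABC \uFF21\uFF22\uFF23";
        // ASCII "ABC" followed by half-width katakana KA, NA
        String mixedJapanese = "ABC \uFF76\uFF85";

        for (String sample : new String[] {mixedEnglish, mixedJapanese}) {
            for (int cp : widthFormCodePoints(sample)) {
                // Print each affected code point in U+XXXX notation.
                System.out.printf("U+%04X%n", cp);
            }
        }
    }
}
```

The code points it reports are the ones to look for in the XLIFF output after writeUnit() is called.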

Please find the relevant Okapi source code at https://tinyurl.com/y9k5bg64

