Propose an extension to fields to include language and transliteration status

Issue #1208 new
Julian White created an issue

It would be useful to be able to express the script used in each field (especially name), and whether the value has been transliterated when it is not in the script of the message's character set encoding. This matters when services take the returned identity claims and try to match them to existing records in their systems. The matching is harder when the claim data has been transliterated from its original language and the receiving system, not knowing that, has to guess.

For example, both MÖLLER and MØLLER will be transliterated to MOELLER on an ICAO 9303 compliant MRZ. A receiving party that gets MOELLER needs to know whether that is the natural script representation (i.e. the name is in fact MOELLER) or whether it should consider that the original might be MÖLLER or MØLLER, which might have been stored as MOLLER in its system.
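The collision described above can be illustrated with a minimal Python sketch. The mapping below is a small, partial subset chosen for this example, not the full ICAO 9303 transliteration table, and the names `MRZ_MAP` and `to_mrz` are hypothetical.

```python
import unicodedata

# Partial, illustrative mapping only (an assumption for this sketch).
# ICAO Doc 9303 defines the full transliteration table.
MRZ_MAP = {"Ö": "OE", "Ø": "OE", "Ä": "AE", "Ü": "UE", "Å": "AA"}


def to_mrz(name: str) -> str:
    """Upper-case a name and replace mapped characters with MRZ digraphs."""
    # Normalise to NFC so that "O" + combining diaeresis becomes "Ö".
    normalised = unicodedata.normalize("NFC", name).upper()
    return "".join(MRZ_MAP.get(ch, ch) for ch in normalised)


# Two different original names and one untransliterated name
# all end up as the same MRZ string.
names = ["MÖLLER", "MØLLER", "MOELLER"]
mrz_forms = {to_mrz(n) for n in names}
```

Because all three inputs map to the same string, the MRZ form alone cannot tell a receiver which original it came from.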

Comments (4)

  1. Julian White reporter

    Yes, I think what I would get is the language used by the OP rather than whether it's been transliterated or not.

    I think what I want to know is:

    1. the name as recognised/used by the OP (mandatory - obvs)
    2. the language the OP is using for that name (mandatory where the language used for the claim is not the same as the one expected from the OP)
    3. whether the OP has transliterated that name, and if so how; options here could be:

      1. original - name is the same as in the natural language name
      2. manual - manual transliteration by someone, could be a bit random in how they do it
      3. automated_9303 - computed transliteration following ICAO 9303 rules (probably the most common)
    4. the natural language version of the name (optional)
    5. the natural language of the name (optional)
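As a rough illustration of the five items above, here is a minimal Python sketch of how such a name claim might be modelled. The class and field names (`NameClaim`, `Transliteration`, `natural_name`, and so on) are hypothetical and are not taken from any published eKYC-IDA schema; the `de` language tag in the example is likewise an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Transliteration(Enum):
    """The three options proposed in the comment (names are illustrative)."""
    ORIGINAL = "original"              # same as the natural language name
    MANUAL = "manual"                  # transliterated by a person
    AUTOMATED_9303 = "automated_9303"  # computed following ICAO 9303 rules


@dataclass
class NameClaim:
    name: str                               # 1. name as used by the OP
    transliteration: Transliteration        # 3. whether and how it was transliterated
    language: Optional[str] = None          # 2. language the OP uses for the name
    natural_name: Optional[str] = None      # 4. natural language version (optional)
    natural_language: Optional[str] = None  # 5. language of that version (optional)

    def may_differ_from_original(self) -> bool:
        """True when the receiver should not assume `name` is the original."""
        return self.transliteration is not Transliteration.ORIGINAL


# Hypothetical claim for the MOELLER example, with chip data available.
claim = NameClaim(
    name="MOELLER",
    transliteration=Transliteration.AUTOMATED_9303,
    language="en-GB",
    natural_name="MÖLLER",
    natural_language="de",
)
```

Making `transliteration` a required field reflects the idea that the receiver should never have to guess; whether it ought to be mandatory is still open in this thread.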

    This is most interesting when you are taking things from electronic travel documents, as they can store both the transliterated and natural language versions of the names via data groups 1 and 11 of the chip.

    Using my previous example, if a German national in the UK were trying to use the eKYC-IDA API to do something on a German system, it’s likely that the UK OP would return MOELLER with a language of en-GB. The German system can’t tell from that whether it should be using MOELLER, MÖLLER or MØLLER on its side, because it doesn’t know whether the name was transliterated. If it wasn’t transliterated, the name is MOELLER; if it was, it could be MÖLLER or MØLLER. If the UK system happened to have the original data from the passport chip, it could also send the natural language data, even if that isn’t the name the OP hangs its records off, so the German system would receive the original MÖLLER or MØLLER as well as MOELLER.
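The receiving side of this scenario could be sketched roughly as follows. This is only an illustration of how a transliteration status might narrow a record search; the function `candidate_names`, the status strings, and the one-entry reverse table are assumptions made for this example.

```python
from typing import List, Optional

# Hypothetical reverse table: MRZ digraph -> letters it may have replaced.
# Only the entry needed for the MOELLER example is included.
REVERSE_MAP = {"OE": ["Ö", "Ø"]}


def candidate_names(
    name: str, status: str, natural_name: Optional[str] = None
) -> List[str]:
    """Return the names a receiver might search for in its own records."""
    if natural_name is not None:
        # The OP sent the original as well, so no guessing is needed.
        return [natural_name, name]
    if status == "original":
        # Not transliterated: the name is exactly what was received.
        return [name]
    candidates = [name]
    for digraph, originals in REVERSE_MAP.items():
        if digraph in name:
            # Possible original spellings, e.g. MÖLLER and MØLLER.
            candidates.extend(name.replace(digraph, o) for o in originals)
            # A form a legacy system may have stored, e.g. MOLLER.
            candidates.append(name.replace(digraph, digraph[0]))
    return candidates


guesses = candidate_names("MOELLER", "automated_9303")
exact = candidate_names("MOELLER", "original")
with_chip = candidate_names("MOELLER", "automated_9303", natural_name="MÖLLER")
```

Without a status, the receiver would have to treat every name as the widest case, which is the guessing the issue describes.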

  2. Nat Sakimura

    In some languages and cultures, transliteration is not readily possible, and the option in such cases is for a person to register a preferred ASCII representation of their name. It may then be useful to have information about what kind of representation it is, e.g., passport representation.

    Phonetic transcription also does not work in many cases. FYI, Japanese names have no official phonetic representation, just the characters.
