Bug in org.yaml.snakeyaml.emitter.Emitter.writeDoubleQuoted

Hello

Class org.yaml.snakeyaml.emitter.Emitter

private void writeDoubleQuoted(String text, boolean split) throws IOException {
    ...
    if (ch <= '\u00FF') {
        String s = "0" + Integer.toString(ch, 16);
        data = "\\x" + s.substring(s.length() - 2);
    } else if (ch >= '\uD800' && ch <= '\uDBFF') {
        //if (end + 1 < text.length()) { // Also need to check low part of surrogate pair
        if (end + 1 < text.length() && Character.isLowSurrogate(text.charAt(end + 1))) {
            Character ch2 = text.charAt(++end);
            String s = "000" + Long.toHexString(Character.toCodePoint(ch, ch2));
            data = "\\U" + s.substring(s.length() - 8);
        } else {
            String s = "000" + Integer.toString(ch, 16);
            data = "\\u" + s.substring(s.length() - 4);
        }
    } else {
        String s = "000" + Integer.toString(ch, 16);
        data = "\\u" + s.substring(s.length() - 4);
    }
}

The "low" surrogate must be checked as well, because the source string may contain malformed surrogate pairs. Without the check, for example,
"\uD800\uFEFF" is emitted as \U000122ff, and ScannerImpl:scanFlowScalarNonSpaces then reads it back as
"\uD808\uDEFF" (new String(Character.toChars(0x122ff))), which is no longer equal to the source string =(
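
To make the round-trip problem concrete, here is a minimal standalone sketch. It is not SnakeYAML code: the class `SurrogateEscapeDemo` and its `escape`/`unescape` helpers are hypothetical names that only imitate the escaping branches quoted above, and the `checkLow` flag switches between the original and the proposed condition.

```java
import java.util.Locale;

class SurrogateEscapeDemo {

    // Escapes every char of text (hypothetical helper, not the real Emitter).
    // checkLow = false: a high surrogate is merged with whatever char follows it.
    // checkLow = true:  it is merged only when the next char is a low surrogate.
    static String escape(String text, boolean checkLow) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            boolean hasNext = i + 1 < text.length();
            boolean pair = Character.isHighSurrogate(ch) && hasNext
                    && (!checkLow || Character.isLowSurrogate(text.charAt(i + 1)));
            if (ch <= 0xFF) {
                out.append(String.format(Locale.ROOT, "\\x%02x", (int) ch));
            } else if (pair) {
                int cp = Character.toCodePoint(ch, text.charAt(++i));
                out.append(String.format(Locale.ROOT, "\\U%08x", cp));
            } else {
                out.append(String.format(Locale.ROOT, "\\u%04x", (int) ch));
            }
        }
        return out.toString();
    }

    // Decodes the three escape forms produced by escape() back into chars.
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < escaped.length()) {
            char kind = escaped.charAt(i + 1); // the char right after the backslash
            int digits = kind == 'x' ? 2 : kind == 'u' ? 4 : 8;
            int cp = Integer.parseInt(escaped.substring(i + 2, i + 2 + digits), 16);
            out.append(Character.toChars(cp));
            i += 2 + digits;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String source = "\uD800\uFEFF"; // high surrogate followed by a non-surrogate
        String unchecked = escape(source, false);
        String checked = escape(source, true);
        System.out.println(unchecked + " round-trips: " + unescape(unchecked).equals(source));
        System.out.println(checked + " round-trips: " + unescape(checked).equals(source));
    }
}
```

With the unchecked condition the two chars collapse into the single escape \U000122ff and the round trip fails. With the low-surrogate check each char gets its own four-digit escape and the decoded string equals the source.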

The example pair comes from https://github.com/google/guava/blob/master/guava/src/com/google/common/base/CharMatcher.java (line 1460); I ran into it while trying to dump a class constant pool to a YAML file =)

Here is a shorter form of the same code:

    if (ch <= '\u00FF') {
        String s = "0" + Integer.toString(ch, 16);
        data = "\\x" + s.substring(s.length() - 2);
    } else if (end + 1 < text.length() && Character.isHighSurrogate(ch)
            && Character.isLowSurrogate(text.charAt(end + 1))) {
        Character ch2 = text.charAt(++end);
        String s = "000" + Long.toHexString(Character.toCodePoint(ch, ch2));
        data = "\\U" + s.substring(s.length() - 8);
    } else {
        String s = "000" + Integer.toString(ch, 16);
        data = "\\u" + s.substring(s.length() - 4);
    }

Thanks =)
