Quartz/CGEventKeyboardSetUnicodeString: No output with Python 3 when using Unicode character beyond \uFFFF

Issue #162 resolved
Ted Morin
created an issue

I've been using CGEventKeyboardSetUnicodeString to send arbitrary Unicode characters with Python 2 for a while now. In our migration to Python 3, I'm finding that characters outside the Basic Multilingual Plane (code points above \uFFFF, which UCS-2 cannot represent) don't get output.

I've prepared some code that runs perfectly in Python 2 (a�😁𐂀 gets written) and doesn't work beyond \uFFFF with Python 3 (only a� gets written).

# coding: utf-8

from Quartz import (
    CGEventSourceCreate,
    kCGEventSourceStateHIDSystemState,
    CGEventCreateKeyboardEvent,
    kCGSessionEventTap,
    CGEventPost,
    CGEventKeyboardSetUnicodeString,
)

OUTPUT_SOURCE = CGEventSourceCreate(kCGEventSourceStateHIDSystemState)

def _send_string_press(c):
    event = CGEventCreateKeyboardEvent(OUTPUT_SOURCE, 0, True)
    _set_event_string(event, c)
    CGEventPost(kCGSessionEventTap, event)
    event = CGEventCreateKeyboardEvent(OUTPUT_SOURCE, 0, False)
    _set_event_string(event, c)
    CGEventPost(kCGSessionEventTap, event)

def _set_event_string(event, s):
    CGEventKeyboardSetUnicodeString(event, len(s), s)

if __name__ == '__main__':
    chars = [u'a', u'�', u'😁', u'𐂀']
    print('Printing')
    for i, c in enumerate(chars):
        print('%s:' % i, c)
    print('Sending\n')
    for c in chars:
        _send_string_press(c)

I've tried working around the issue but ultimately CGEventKeyboardSetUnicodeString is asking me for a Unicode buffer, and I don't have a way to send one in UCS2 like in Python 2 (as far as I'm aware).

Comments (5)

  1. Ted Morin reporter

    I've tried using an NSString:

    c = Foundation.NSString.stringWithString_(c)
    

    This still works with Python 2, but with 3.5 I instead get: Fatal Python error: Impossible unicode object state, wstr and str should share memory already.

  2. Ted Morin reporter

    Good news! I was working with @Benoit Pierre and we pinpointed the problem and have a fix.

    The problem lies in PyObjC_PythonToCArray in objc_util.m. The code checks the requested size against PyUnicode_GetSize and raises a ValueError on a mismatch, and only then converts the string to UTF-16. The UTF-16 length in code units can differ from that size, as it does for characters above \uFFFF, which need a surrogate pair. Calculating the size after the conversion fixes this. The patch to PyObjC:

    diff -r 2a7d06203914 pyobjc-core/Modules/objc/objc_util.m
    --- a/pyobjc-core/Modules/objc/objc_util.m  Sun Jul 24 11:23:12 2016 +0200
    +++ b/pyobjc-core/Modules/objc/objc_util.m  Fri Aug 05 19:06:00 2016 -0400
    @@ -773,17 +773,6 @@
         }
    
         if (*elementType == _C_UNICHAR && PyUnicode_Check(pythonList)) {
    -        Py_ssize_t bufsize = PyUnicode_GetSize(pythonList);
    -
    -        if (*size == -1) {
    -            *size = bufsize;
    -
    -        } else if ((exactSize && *size != bufsize) || (!exactSize && *size > bufsize)) {
    -            PyErr_Format(PyExc_ValueError,
    -                "Requesting unicode buffer of %"PY_FORMAT_SIZE_T"d, have unicode buffer "
    -                "of %"PY_FORMAT_SIZE_T"d", *size, bufsize);
    -            return -1;
    -        }
    
     #if PY_VERSION_HEX >= 0x03030000
             *bufobj = _PyUnicode_EncodeUTF16(
    @@ -799,6 +788,16 @@
                 return -1;
             }
    
    +        Py_ssize_t bufsize = PyBytes_Size(*bufobj) / 2;
    +        if (*size == -1) {
    +            *size = bufsize;
    +        } else if ((exactSize && *size != bufsize) || (!exactSize && *size > bufsize)) {
    +            PyErr_Format(PyExc_ValueError,
    +            "Requesting unicode buffer of %"PY_FORMAT_SIZE_T"d, have unicode buffer "
    +            "of %"PY_FORMAT_SIZE_T"d", *size, bufsize);
    +            return -1;
    +        }
    +
             /* XXX: Update API protocol to make the extra copy not necessary
              * Cannot use the code at the end because 'buffer' is assumed to be
              * the value to return to the python caller.
    

    And an adjustment to my own code to hand-in the correct size:

    def _set_event_string(event, s):
        # Length in UTF-16 code units (2 bytes each), not bytes or code points.
        utf16_len = len(s.encode('utf-16-le')) // 2
        CGEventKeyboardSetUnicodeString(event, utf16_len, s)
    

    With this I'm having success in 2.7.11 and 3.5.1.
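
For anyone hitting the same mismatch, here is a minimal sketch of why the two lengths disagree. It uses only the standard library and does not call Quartz or PyObjC. The helper name `utf16_code_units` is hypothetical, chosen just for illustration. On Python 3, `len()` counts code points, while the UTF-16 buffer needs two code units (a surrogate pair) for every character above \uFFFF.

```python
def utf16_code_units(s):
    """Return how many UTF-16 code units are needed to encode s.

    Hypothetical helper for illustration; not part of PyObjC or Quartz.
    """
    # 'utf-16-le' emits no byte-order mark, and every code unit is 2 bytes.
    return len(s.encode('utf-16-le')) // 2


if __name__ == '__main__':
    # One BMP character and two characters above \uFFFF.
    for ch in [u'a', u'\U0001F601', u'\U00010080']:
        # On Python 3, len(ch) is 1 for all three, but the two
        # non-BMP characters each need 2 UTF-16 code units.
        print(repr(ch), len(ch), utf16_code_units(ch))
```

The code-unit count is the value to pass as the length argument. Passing `len(s)` only happens to work when every character is at or below \uFFFF.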
