Armin Rigo committed 998ec27

Document wchar_t.

Files changed (1)

doc/source/index.rst

 * intN_t, uintN_t (for N=8,16,32,64), intptr_t, uintptr_t, ptrdiff_t,
   size_t, ssize_t
 
+* wchar_t (if supported by the backend)
+
 As we will see on `the verification step`_ below, the declarations can
 also contain "``...``" at various places; these are placeholders that will
 be completed by a call to ``verify()``.
 
 The C code's integers and floating-point values are mapped to Python's
 regular ``int``, ``long`` and ``float``.  Moreover, the C type ``char``
-correspond to single-character strings in Python.  (If you want it to
+corresponds to single-character strings in Python.  (If you want it to
 map to small integers, use either ``signed char`` or ``unsigned char``.)
 
+Similarly, the C type ``wchar_t`` corresponds to single-character
+unicode strings, if supported by the backend.  Note that in some
+situations (a narrow Python build with an underlying 4-byte wchar_t
+type), a single wchar_t character may correspond to a pair of
+surrogates, which is represented as a unicode string of length 2.  If
+you need to convert a wchar_t to an integer, do not use ``ord(x)``,
+because it doesn't accept such unicode strings; use instead
+``int(ffi.cast('int', x))``, which does.
+
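Note on the surrogate case described above: the following is a minimal, standard-library-only sketch of the arithmetic behind it, not code from cffi. The helper names ``to_surrogate_pair`` and ``from_surrogate_pair`` are hypothetical and chosen only for illustration. It shows how one code point above U+FFFF, which fits in a single 4-byte wchar_t, splits into the two UTF-16 code units that a narrow Python build would store as a unicode string of length 2.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point must lie in U+10000..U+10FFFF")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into the original code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# One 4-byte wchar_t value, two UTF-16 code units.
pair = to_surrogate_pair(0x1F600)
assert pair == (0xD83D, 0xDE00)
assert from_surrogate_pair(*pair) == 0x1F600
```

This is why ``ord(x)`` is unsuitable in that configuration: it expects exactly one code unit, whereas the value being converted may arrive as two.
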
 Pointers, structures and arrays are more complex: they don't have an
 obvious Python equivalent.  Thus, they correspond to objects of type
 ``cdata``, which are printed for example as
     >>> str(x)        # interpret 'x' as a regular null-terminated string
     'Hello'
 
+Similarly, arrays of wchar_t can be initialized from a unicode string,
+and calling ``unicode()`` on the cdata object returns the current unicode
+string stored in the wchar_t array (encoding and decoding surrogates as
+needed).
+
 Note that unlike Python lists or tuples, but like C, you *cannot* index in
 a C array from the end using negative numbers.
 
 
     assert C.strlen("hello") == 5
 
+So far, passing unicode strings as ``wchar_t *`` arguments is not
+implemented.  You need to write e.g.::
+  
+    >>> C.wcslen(ffi.new("wchar_t[]", u"foo"))
+    3
+
 CFFI supports passing and returning structs to functions and callbacks.
 Example (sketch)::
 