Many calls to cdef() are much slower than a few calls to cdef()

I noticed that when I changed opentls's use of cffi from calling cdef() once per type definition and prototype (i.e., many hundreds or perhaps even a thousand calls) to a single call containing all the same definitions and prototypes, import time went from about 48 seconds to about 0.7 seconds.

I suggest either documenting that the preferred usage is to call cdef() a few times with lots of definitions rather than many times with few definitions, or making it faster to call it many times.
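For illustration, here is a minimal sketch of the batching pattern I mean. The helper name `cdef_batched` is hypothetical (it is not part of cffi), and the OpenSSL declarations in the usage comment are only examples, not the actual list opentls uses. The only cffi call it relies on is `ffi.cdef(source)`.

```python
def cdef_batched(ffi, declarations):
    """Declare everything with a single ffi.cdef() call.

    `ffi` is expected to be a cffi.FFI instance (anything with a
    cdef(source) method works). `declarations` is an iterable of C
    declaration strings, in the order they should be declared.
    Returns the combined source that was passed to cdef().
    """
    # Join in the original order so that typedefs still precede
    # the prototypes that use them.
    source = "\n".join(declarations)
    ffi.cdef(source)  # one call instead of one call per declaration
    return source


# Hypothetical usage with cffi (declarations are illustrative only):
#
#     from cffi import FFI
#     ffi = FFI()
#     cdef_batched(ffi, [
#         "typedef struct bignum_st BIGNUM;",
#         "BIGNUM *BN_new(void);",
#         "void BN_free(BIGNUM *);",
#     ])
```

The slow variant would be the equivalent of `for decl in declarations: ffi.cdef(decl)`.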

