Overview

Sample files demonstrating Argument Clinic output approaches

2014/01/14
Larry Hastings
larry at hastings dot org


These are samples of the proposed alternatives for how Argument Clinic
should write its output.  I assume you've at least skimmed the notes
for the prototype, here:

    https://bitbucket.org/larry/python-clinic-buffer/src/916cd7bf5e58/CLINIC.BUFFER.NOTES.TXT?at=default

I made the samples from "Modules/_pickle.c", partly because its
Clinic conversion is mostly complete, and partly because it
demonstrates a small wrinkle: _pickle.c is one of those files that
has multiple PyMethodDef arrays and PyTypeObject structs sprinkled
throughout the file.  I made some notes below.

I encourage you to play with the Clinic buffer prototype:

    https://bitbucket.org/larry/python-clinic-buffer/

Clone it, build it, then experiment!  Post in python-dev if you
find something you like.  Cheers!


0) Using Clinic's original behavior

File:
    _pickle.original.c

This is just here to give you an idea of the problem.  Opponents of
Clinic's current approach suggest that having the Clinic generated
text commingled with the original file degrades readability and
maintainability.


1) Using a "side file"

Files:
    _pickle.using-sidefile.c
    _pickle.using-sidefile.side.c

This takes all the Clinic output and writes it to a second file,
what I've called a "side file".  The problem here is: where do you
#include the side file?  Some of the code in the side file uses a type
that isn't defined until line 4171 (PicklerMemoObject), so you can't
include it until after that.  But there's a PyMethodDef array that uses
symbols from the side file on line 3971!

This means you have to rearrange the file a little to get it to work
with the "side file".  In _pickle's case it was easy: just move two
structures up to the top of the file.  Of course other files might
require much more work.
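
The ordering constraint is easier to see in miniature.  Here is a rough
sketch, not taken from the sample files: MemoObject, MethodDef,
memo_clear and find_method are all made-up stand-ins (so it compiles
without Python.h), and the lines under (2) only imitate what a side
file might contain.  The point is just the order: type first, then the
include, then the table that uses the generated symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins so the sketch compiles without Python.h. */
typedef struct { int size; } MemoObject;       /* plays PicklerMemoObject */
typedef struct {
    const char *name;
    int (*fn)(MemoObject *);
} MethodDef;                                   /* plays PyMethodDef */

/* (1) The type definition above sits ahead of the include. */

/* (2) The real file would have a single line here:
 *         #include "_pickle.using-sidefile.side.c"
 *     The three lines below imitate side-file contents (assumed). */
static int memo_clear_impl(MemoObject *self);
static int memo_clear(MemoObject *self) { return memo_clear_impl(self); }
#define MEMO_CLEAR_METHODDEF {"clear", memo_clear}

/* (3) Hand-written code: the impl body, then the method table.  The
 *     table uses a symbol from (2), so it has to come after it. */
static int memo_clear_impl(MemoObject *self)
{
    self->size = 0;
    return 0;
}

static const MethodDef memo_methods[] = {
    MEMO_CLEAR_METHODDEF,
    {NULL, NULL}                               /* sentinel */
};

/* Look up a method by name; returns NULL if there is no such entry. */
static const MethodDef *find_method(const MethodDef *table, const char *name)
{
    assert(table != NULL && name != NULL);
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table;
    }
    return NULL;
}
```

Swap (1) and (2) and the wrapper no longer compiles; swap (2) and (3)
and the table no longer compiles.  That is the whole problem.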


2) Using a single "buffer" (but with major surgery)

File:
    _pickle.using-buffer.c

This takes all the Clinic output and writes it to one giant blob of text
at the end of the file.  This suffers from the same problem as the "side
file" approach: the chicken-and-egg problem of outputting stuff in the
"buffer" that's needed higher in the file.  It's worse with "buffer" than
with "side file", though, because of the requirement that we hide the
"buffer" as late as possible in the file so it's out of the way.  (With
"side file" at least we can put the #include near the top.)

I solved the problem by moving every PyMethodDef array and PyTypeObject
to the bottom of the file.  (I moved the PyMemberDef and PyGetSetDef
arrays too, just to keep everything together.)  I also had to add forward
declarations for all the PyTypeObject structures to the top of the file.
If you search for the string "larry" you'll find both of these.

I don't think this is too bad, but it does require hacking up the file.
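
For a feel of the resulting layout, here is a minimal sketch.  It is
not lifted from _pickle.using-buffer.c; TypeObject, MethodDef,
Memo_Type and the memo_double functions are invented stand-ins so it
builds without Python.h.  It shows the three pieces: a forward
declaration at the top, the dumped "buffer" near the end, and the
tables moved below the dump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyMethodDef and PyTypeObject. */
typedef struct { const char *name; int (*fn)(int); } MethodDef;
typedef struct { const char *name; const MethodDef *methods; } TypeObject;

/* Top of file: forward declaration (a C tentative definition), so
 * code in the middle of the file can refer to the type object. */
static TypeObject Memo_Type;

/* Hand-written code in the middle of the file. */
static const TypeObject *memo_type(void) { return &Memo_Type; }
static int memo_double_impl(int x) { return 2 * x; }

/* ---- stand-in for the Clinic "buffer", dumped at the end ---- */
static int memo_double(int x) { return memo_double_impl(x); }
#define MEMO_DOUBLE_METHODDEF {"double", memo_double}
/* ---- end of dump ---- */

/* The tables now live below the dump, where its symbols are visible. */
static const MethodDef memo_methods[] = {
    MEMO_DOUBLE_METHODDEF,
    {NULL, NULL}                               /* sentinel */
};
static TypeObject Memo_Type = { "Memo", memo_methods };

/* Call the first method in a type's table. */
static int call_first_method(const TypeObject *type, int arg)
{
    assert(type != NULL && type->methods != NULL);
    assert(type->methods[0].fn != NULL);
    return type->methods[0].fn(arg);
}
```

The forward declaration is what lets memo_type() sit above the real
definition of Memo_Type; without it, moving the tables down would
break every earlier use of the type object.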


3) Using multiple "buffers"

File:
    _pickle.using-multiple-buffers.c

This is the same as 2) above, but I tried to modify the file as little
as possible.  Instead of reordering the contents of the file, I made
Clinic work around it by dumping the "buffer" multiple times.  That way
everything was defined before it was needed.

The way I did it: I added the junk at the top to make Clinic output to
the buffer, then I added the "dump buffer" to the bottom like normal.
Then I compiled.  Every time the compilation failed, I added a
"dump buffer" just above the problem spot and tried again.  I had to
do this six times until it compiled cleanly.

This solves the problem, but I don't think it's much of an improvement
over the original Clinic approach, and I doubt anyone will champion it.
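
Roughly, the file ends up shaped like the sketch below.  This is an
illustration only: the names are invented, MethodDef is a stand-in for
PyMethodDef, and the "dump" comments merely mark where Clinic's output
would land.  Each dump sits just above the first thing that needs it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyMethodDef. */
typedef struct { const char *name; int (*fn)(int); } MethodDef;

static int pickler_dump_impl(int x) { return x + 1; }

/* ---- dump #1: everything Clinic has buffered so far ---- */
static int pickler_dump(int x) { return pickler_dump_impl(x); }
#define PICKLER_DUMP_METHODDEF {"dump", pickler_dump}
/* ---- end of dump #1 ---- */

/* First table, left where it was; dump #1 was added just above it. */
static const MethodDef pickler_methods[] = {
    PICKLER_DUMP_METHODDEF,
    {NULL, NULL}
};

static int unpickler_load_impl(int x) { return x - 1; }

/* ---- dump #2: whatever was buffered since dump #1 ---- */
static int unpickler_load(int x) { return unpickler_load_impl(x); }
#define UNPICKLER_LOAD_METHODDEF {"load", unpickler_load}
/* ---- end of dump #2 ---- */

/* Second table, also unmoved. */
static const MethodDef unpickler_methods[] = {
    UNPICKLER_LOAD_METHODDEF,
    {NULL, NULL}
};

/* Count the entries before the sentinel. */
static size_t count_methods(const MethodDef *table)
{
    size_t n = 0;
    assert(table != NULL);
    while (table[n].name != NULL)
        n++;
    return n;
}
```

Nothing moves, but the generated text is once again scattered through
the file, which is why this isn't much of an improvement.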


4) Using a "modified buffer"

File:
    _pickle.using-modified-buffer.c

This is another approach to "buffer": instead of writing *everything*
to the buffer, we only write the big definitions (the docstring and
the parsing function) to the buffer.  We still write the methoddef
#define and forward declarations for the docstring and the parsing
function to the block's output.  This solves the chicken-and-egg
problem of the "buffer" approach without requiring editing the file,
at the cost of five or six lines of extra text in the clinic output block.
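
A small sketch of the split, with made-up names throughout (MethodDef
stands in for PyMethodDef, and memo_length is not a real _pickle
function).  The docstring forward declaration is left out to keep the
sketch short; it would presumably follow the same pattern as the
parsing function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PyMethodDef. */
typedef struct { const char *name; int (*fn)(const char *); } MethodDef;

/* ---- what stays in the Clinic output block: a few short lines ---- */
static int memo_length(const char *arg);       /* forward declaration */
#define MEMO_LENGTH_METHODDEF {"length", memo_length}
static int memo_length_impl(size_t n);         /* impl prototype */
/* ---- end of block output ---- */

/* Hand-written impl body, directly after the block as usual. */
static int memo_length_impl(size_t n)
{
    return (int)n;
}

/* The table stays where it always was: the #define and the forward
 * declaration above are all it needs. */
static const MethodDef memo_methods[] = {
    MEMO_LENGTH_METHODDEF,
    {NULL, NULL}                               /* sentinel */
};

/* ---- the "buffer", dumped at the end: the big definitions ---- */
static int memo_length(const char *arg)
{
    assert(arg != NULL);
    return memo_length_impl(strlen(arg));
}
/* ---- end of dump ---- */
```

Because the table only needs a declaration, not the definition, the
bulky parsing code can go at the bottom without reordering anything.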