Add set_context_id_callback().

#3 Declined
  1. A. Jesse Jiryu Davis

Hi! I appreciate your great work on yappi. I want to make yappi useful for applications that use greenlets, like Gevent.

The problem using yappi with greenlets is that yappi doesn't know about switches between greenlets. Consider:

import time
import greenlet

def myfunction():
    print 'done'

main_greenlet = greenlet.getcurrent()
g = greenlet.greenlet(myfunction)
g.switch()  # Prints 'done'.

yappi thinks myfunction took one second of wall-clock time, but in fact it was paused while the main greenlet was sleeping for one second.

My solution is to add a function, yappi.set_context_id_callback(). If you set a callback, yappi uses it instead of the thread id to find the current context.

Then, in a separate project, I'll use the new set_context_id_callback function to make a greenlet profiler. Here's a test that demonstrates the greenlet-aware profiler. I like having the greenlet-aware profiler in a separate project. I don't think yappi should include code specific to greenlets. I prefer adding a more general-purpose function to yappi, like set_context_id_callback.

The new function doesn't add much overhead. When there is no context_id_callback, yappi has no extra overhead compared to the previous version. When there is a context_id_callback, yappi's overhead increases about 13%, in my tests.

This pull request is incomplete. I'm showing it to you now to see what you think. If you approve of my basic idea, then I'll do some additional testing before I make a final pull request.


Comments (14)

  1. Sümer Cip repo owner

    Hi ,

    First of all: It is a great idea!

    I am OK with the way you implemented this. I also agree with having another project for greenlet profiling on top of yappi. However, let me couple of days(I need some time because of my real job) to suck in some overview details about the greenlets/gevent. Maybe it can be changed a little bit...

    In the meanwhile, as a beginner to this subject, let me ask you very trivial questions:

    • Are greenlets application level threads? Meaning there is actually only one thread, but you control it via switch statements basically, right?
    • Is this gevent library the most accepted lib on the subject, are there any others, if so, can they also be integrated with the hook you implemented? (No problem if you don't know others, just your initial thoughts on this is enough)
    • Also, I am just asking to verify: the only function yappi will implement is set_context_id_callback() right? Because in your code there are new variables added to _ctx struct? I am assuming these will be moved to your greenlet-profiler project, right?

    Again: Great job!

  2. Sümer Cip repo owner

    Hmm I think I understood your implementation and the basic problem, you also traverse the callstack to advance the timing values. So, just disregard my last question. Let me some time to review your changes .

  3. A. Jesse Jiryu Davis author

    Hi Sumer.

    • "Are greenlets application level threads?" Right, the "greenlet" package is a C extension for CPython. There is typically one actual kernel thread, but it can switch the running greenlet at any time by calling greenlet.switch() explicitly. However, in some applications there are multiple kernel threads, and each kernel thread can run multiple greenlets. (Greenlets can't cross threads.) I need to think and test to make sure this pull request handles multiple threads correctly.
    • "Is this gevent library the most accepted lib on the subject, are there any others." Yes and yes. Besides Gevent there's also eventlet and my own library, Motor. Profiling Motor is the motivation for this project. All of them are built on top of the same greenlet library. I believe this pull request supports all possible libraries that use greenlets. But I haven't actually tested Gevent or Motor yet, that's next on my agenda.
  4. A. Jesse Jiryu Davis author

    I'm changing my mind about something. In my patch, if a context_id_callback is set, yappi pauses the old context and resumes the new context whenever the context_id changes. But now, I think yappi should only do this if clock_type is "cpu".

    Here's why: if clock_type is "wall", we want to know how long each function took in wall-clock time. It doesn't matter if the function took a long time because its greenlet was paused: the fact is, it took a long time. This is how yappi already works: if a function is paused waiting on a mutex, for example, then yappi will record that it took a long time in wall-clock time.

    So I should only pause / resume contexts if a context_id_callback is set and clock_type is "cpu". It's only when clock_type is "cpu" that yappi needs to be changed, because it currently gets CPU time for the whole thread, not for each greenlet.

  5. Sümer Cip repo owner

    I have prepared a long mail explaining this to you. Happy to hear that from you:)

    Bu again, there seems to be some problems regarding this algorithm of "pausing./resuming" context when a context switch occurs. Because we only detect context switch whenever a function is called. I will try to demonstrate this problem by an example in the following days. Just FYI: I have thought this kind of approach in the past and cannot come up with a reasonable/general solution.

  6. Sümer Cip repo owner

    Hi Jesse,

    The problem you are trying to solve is another beast not only detecting context switch of greenlets. Subtracting the elapsed time from all of the call stack items seems to solve the problem , but you also need to think about some corner cases, too. I have thought about this problem for “normal thread wall-time profiling” in past, and discard the idea as there seems not to be a good overall solution. For greenlets, on the other hand, there may be a possibility.

    Here is an example:

    def a():
        # cpu intensive task that generates no CALL, CCALL, RETURN etc. event
        # cpu intensive task that generates no CALL, CCALL, RETURN etc. event
    def b():
        # cpu intensive task that generates no CALL, CCALL, RETURN etc. event
        # cpu intensive task that generates no CALL, CCALL, RETURN etc. event

    So in this example, if you use normal threads, it is impossible to detect context switches as no profile event is generated for context switches. However, for greenlets you may have that opportunity as switch() is actually a function which again generates a profile event. It seems to be OK but again need to tested carefully and you need to think about it a lot. Example: How will you handle recursive/mutual recursive functions when substracting paused time? Will there be any defects?

    So, in general, I don't like the idea of subtracting time information from call stack
    when a thread is paused. I don't like the idea because it seems not applicable to normal
    threads which means the code you suggests contains greenlet specific code.

    What I am suggesting is that following:

    If you look at the update rate of yappi, you will see that I update it once or twice in a year.
    I hopefully, want to be backward compatible starting from v0.82, so my suggestion is to
    start "Gappi" on top of yappi 0.82 with the changes you shared, and it seems not so hard to
    pull back the changes to yappi along the year. Because the problems you are trying to solve are complex and seem not to be in domain of yappi.

    How that sounds?

  7. A. Jesse Jiryu Davis author

    Thanks Sumer. I added a test for deep recursion with greenlets and it actually works perfectly.

    I see what you mean: this is a tough problem, and immediately accepting my patch might not be wise. On the other hand, I hope yappi can eventually include hooks to support greenlets, since I don't want to have a separate fork of yappi forever.

    What do you suggest for now? Should I just pull yappi's full source (plus my patches) into greenlet-profiler and release it as an alpha to see if it works for people in the real world?

  8. Sümer Cip repo owner

    Yes. At first, I think you should release the greenlet-profiler by including yappi's full source. I agree with you that yappi should include some hooks to support different threading models eventually. Let's see the profiler in the wild a while, and in the meantime I will find some chance to think about the problem more, too. I would specifically try to focus to somehow eliminate/move the "substracting pause time" part of your code. If we can find a way to do that kind of interaction externally, (via some other hooks), I think it would be better.

    Because, as you indicated before: not all greenlet libraries use N:1 thread model: meaning one kernel thread for all. Maybe some of them uses N:M model which makes the pause/resume implementation useless, right? We should somehow find a better way to support better integration for different threading models.

  9. A. Jesse Jiryu Davis author

    There's only one greenlet library. Although the greenlet library is used by several libraries (Eventlet, Gevent, Motor), they all use the same greenlet library at the core. And greenlets can never cross threads, so the pause / resume implementation for CPU time will work for all these libraries.

    But I agree, let's see how this works in the real world before we change yappi. I've copied yappi into greenlet-profiler. I'll announce greenlet-profiler next week and see what happens.


  10. Sümer Cip repo owner

    Hi Jesse,

    The only thing that disturbs me about the current implementation is the pause/resume functionality when a context_id_callback is set. This seems not right as when you set context_id_callback, it only should set context_id_callback nothing more.

    Well, anyway, I think I have figured out a way to export that functionality to Python code.

    Here's the draft:

    • set_id_context_callback() will be same as you implemented.
    • set_context_switch_callback() function will be defined.
    • when yappi detects a context switch, it will call the callback if configured. (this may also be useful for other use cases, too)
    • advance_callstack_() function(I cannot decide on a proper name yet) which will advance the initial timing values with a provided value. As, you will provide the value, you will have to bookkeep previous context paused time and maybe other info you want.

    Any thoughts?

  11. A. Jesse Jiryu Davis author

    This sounds good to me.

    I also added a set_context_name_callback so that get_thread_stats() works better. I've added it here:

    With the context_name_callback set, get_thread_stats().print_all() looks like this in a Gevent application:

    name           tid              ttot      scnt
    Greenlet       4386064816       1.031498  11
    Hub            4386064976       0.000719  32

    Otherwise it's just:

    name           tid              ttot      scnt
    _MainThread    4388202608       1.031153  11
    _MainThread    4388202928       0.000668  32

    I love the idea of a context_switch_callback. In the case of greenlets there's actually a greenlet.settrace function, so I could pass the context_switch_callback to greenlet.settrace instead of yappi.set_context_switch_callback. However, there may be other circumstances, besides greenlets, in which we want yappi to be responsible for calling the context_switch_callback.

    How about we name the final function adjust_time(long context_id, long amount)? amount can be positive or negative, it's added to the t0 for all items on the context's current call stack.

    Whoever calls adjust_time needs access to the tickcount() when the context is paused, and when the context is resumed. Should Yappi make tickcount() public too?

  12. A. Jesse Jiryu Davis author

    Hi! Would you like me to develop set_context_switch_callback, adjust_time, and tickcount, or do you plan to do this?

  13. Sümer Cip repo owner


    It would be nice that to have a contributor like you:)

    So following shall be done:

    • implement set_context_switch_callback() and write a unittest for it.
    • instead of making tickcount() public, change the name of get_clock_type() to get_clock() and return the current clock as a dict item there. Currently get_clock_type() returns:
    PyDict_SetItemString(result, "type", type);
    PyDict_SetItemString(result, "api", api);
    PyDict_SetItemString(result, "resolution", resolution);
    Add one more:
    PyDict_SetItemString(result, "time", tickfactor()*tickcount());
    • adjust_time() function name creates an assumption that time is broke if we not call it. Let's name it pause_context() or other name you like and maybe we will also change it.
    • It would be nice to have a unittest for greenlets that have chained-recursive functions, like a->b->c->b->c. There are lots of examples for this scenario in yappi's original test_functionality module.
    • set_context_name_callback() would be great. Please also write a unittest for that,

    But warning: Everytime you define a callback and call a function from yappi, you have the risk of creating a deadlock: as the profiled code may be in the process of an import and if you also import a module in those callback's, then you will wait forever on the global import lock of Python. Believe me that happens quite a while: that is the reason I am using PyImport_ImportModuleNoBlock() in _get_current_thread_class_name(). Anyway, keep that in mind and maybe in your documentation for GreenletProfiler.

  14. A. Jesse Jiryu Davis author

    I'm going to follow your instructions with a series of small pull requests. Meanwhile, let's close this pull request.