Issue #54 new

API cleanup

Andreas Beham
created an issue

A few issues regarding the API showed up. This issue should collect some of them and discuss fixes:

  • Timeouts, Put, Get, Process, ... are events which are not to be scheduled by the user, consequently succeed, fail and trigger methods do not make sense
  • schedule can basically called by anyone at any time with any event. Events can be "scheduled" even though they may have already been triggered.
  • Resources offer a release method that may accept request tokens that originate from different resources

Comments (13)

  1. Andreas Beham reporter

    Just a thought: What if we didn't schedule timeout during init ? But only when created through env.timeout(). That way you can even have timeouts that you can succeed later on for yourself.

  2. luensdorf

    To amend your list, succeed, fail and trigger do only make sense for a user event. It also doesn't make sense for put, get, condition, process and so on.

  3. luensdorf

    About schedule() (and also enqueue() in future): we could mark these methods as internal.

    I see two roles of users, the "ordinary" user and the developer. The ordinary user writes simulation models and should not use schedule(). The developer may want to extend simpy with custom events or resources and needs access to schedule().

  4. luensdorf

    But to schedule a timeout, the user would need access to schedule(). See the above remark, why I think that should not be necessary for the "ordinary" user.

    Meta: Am I blind or is it really not possible to reply on issue comments in bitbucket?

  5. Andreas Beham reporter

    The API would potentially get simpler if the event's succeed() and fail() was combined with the environment's schedule(). The event would retain trigger() to enable chain reactions, but loses succeed and fail. Instead of schedule() the environment would have

    def succeed(self, event, priority=NORMAL, delay=0, value=None):
      if (event._value is not PENDING):
        raise RuntimeError('')
      event.ok = True
      event._value = value
      heappush(self._queue,
                     (self._now + delay, priority, next(self._eid), event))
    
    def fail(self, event, exception, priority=NORMAL, delay=0):
      if self._value is not PENDING:
        raise RuntimeError('%s has already been triggered' % self)
      if not isinstance(exception, BaseException):
        raise ValueError('%s is not an exception.' % exception)
      event.ok = False
      event._value = exception
      heappush(self._queue,
                     (self._now + delay, priority, next(self._eid), event))
    

    That's the first step. In the second step I would remove all self.env.schedule() (which calls to either succeed() or fail()) in the events entirely and introduce further methods into Environment:

    def timeout(self, delay=0, value=None, ok=True):
      t = Event(self)
      if ok:
        self.succeed(t, delay=delay, value=value)
      else:
        self.fail(t, value, delay=delay)
      return t
    
    def process(self, generator):
      p = Process(self, generator)
      p._target = Initialize(self, p)
      self.succeed(p._target, priority=URGENT)
      return p
    
    ...
    

    I've used Event, but one could also retain the Timeout class, but it wouldn't be needed anymore.

  6. luensdorf

    I guess with simplification you mean the removal of succeed() and fail() from Event, right? That is a simplification but incomplete, because - as you said - trigger() still remains. A BaseEvent type wouldn't have any of these methods.

    My main concern is however, that this won't make it simpler but more difficult to introduce new event types. In your example you need to write classes for events (Process) as well as a matching method (process()). This spreads the code for an event type in two places. How would this look like for conditions and put/get requests? We need classes because events may carry additional data (like the event list of a condition or the resource in put/get requests), but we don't need the methods because the initializer can do all the scheduling if necessary.

    The event methods of environment are only shortcuts. They should never carry functionality, otherwise a developer writing new event types would also always need to provide a new environment type with additional methods.

    I guess we shouldn't have provided the convenience methods in the first place. It isn't really that much shorter to write yield env.timeout(3) instead of yield Timeout(env, 3). The shortcuts don't require you to import the event types, but this also isn't that useful, since it doesn't work with resources.

  7. Andreas Beham reporter

    That is a simplification but incomplete, because - as you said - trigger() still remains

    trigger() remains because of the chain reaction feature that is mentioned in the docs. It's a generic method for any event to chain it to another event. I agree with the doc that this could potentially be useful, though I have yet to see a case where this is actually used. I think trigger() is not referenced anywhere in the base framework nor in the examples. You cannot remove trigger() unless you also give up chaining.

    In your example you need to write classes for events (Process) as well as a matching method (process())

    Hmm, it actually would require a matching method only for events which are core to the framework. In my opinion, as it is there are only two core events that would make sense to create through the environment: Process and Timeout. The rest could also remain untouched. Resource is kind of a small environment for itself. It has its own event queue and manages when to schedule which event for itself. Theoretically, a resource could process events itself instead of scheduling them. So request or release don't have much to do with the environment.

    This spreads the code for an event type in two places.

    I would say it separates the decision when to create an object and when to trigger it. This makes the object easier to understand. Second it puts the code for scheduling an event in one place. Now it's spread between succeed() and schedule(). When I tried to understand what was going on in SimPy initially, I had to hop around different classes, because the code that does something was always somewhere else: A timeout is created, this calls init, this calls schedule, this pushes to the heap and so on. This creates long call stacks in your head while reading code which is much harder to understand. In the timeout() method I proposed above it's clearer what happens and the call stack is not as deep.

    The event methods of environment are only shortcuts. They should never carry functionality, otherwise a developer writing new event types would also always need to provide a new environment type with additional methods.

    Yes, but how does the developer not need to write a new environment for a new event type in the current case? Interesting, a question I've had on my mind for some time: How many times does a developer write a new event type vs how many time they use their own environment type. I think an own environment is pretty convenient as I can place all resources therein. An environment is omnipresent so that's a convenient place to put the model objects. However, I haven't thought about a new event type. What are the use cases for new event types?

    I guess we shouldn't have provided the convenience methods in the first place. It isn't really that much shorter to write yield env.timeout(3) instead of yield Timeout(env, 3).

    I feel it's more natural that way than having to remember that somehow a Timeout needs to know the environment? Also in my case you get Intellisense shownig up pretty quick. After typing "env." you get a nice list of what you can do. I have even added "TimeoutExponential(MTTF)" to the environment class.

  8. luensdorf

    You cannot remove trigger() unless you also give up chaining.

    Yes, but as with succeed() and fail(), trigger() only makes sense for user events and should not be available for other events.

    In my opinion, as it is there are only two core events that would make sense to create through the environment: Process and Timeout. The rest could also remain untouched.

    Why are these events special? I don't understand why they should be treated differently.

    Theoretically, a resource could process events itself instead of scheduling them. So request or release don't have much to do with the environment.

    That is certainly possible, but wouldn't that require resource to have step(), succeed() and fail() methods, too?

    I would say it separates the decision when to create an object and when to trigger it.

    Second it puts the code for scheduling an event in one place. Now it's spread between succeed() and schedule().

    What do you mean with scheduling? schedule() does scheduling in one place. succeed() and fail() set the value of the event and schedule the event for processing. These are separate things in my mind.

    When I tried to understand what was going on in SimPy initially, I had to hop around different classes, because the code that does something was always somewhere else: A timeout is created, this calls init, this calls schedule, this pushes to the heap and so on. This creates long call stacks in your head while reading code which is much harder to understand. In the timeout() method I proposed above it's clearer what happens and the call stack is not as deep.

    I guess this is also the fault of our convenience approach. env.timeout() is not a separate method, it is the constructor of Timeout. The call stack is at most only 2 levels deep: Timeout.init -> Environment.schedule. Your approach is also two levels deep: Enviroment.timeout -> Timeout.init (or Event.init as in your example).

    Yes, but how does the developer not need to write a new environment for a new event type in the current case?

    It is only necessary to write a new Event class. You can think of resources as such an example. They don't require a subclass of Environment.

    Interesting, a question I've had on my mind for some time: How many times does a developer write a new event type vs how many time they use their own environment type.

    That I don't know. But if we take simpy itself as an example, we've got two environment classes but many event types. The events do also work in both environments because all they need is the schedule() method.

    I think an own environment is pretty convenient as I can place all resources therein. ... Also in my case you get Intellisense shownig up pretty quick. After typing "env." you get a nice list of what you can do. I have even added "TimeoutExponential(MTTF)" to the environment class.

    It is convenient, but also abusing a class as a namespace. If you put event types in a namespace, I'm sure auto-completion will also work.

    An environment is omnipresent so that's a convenient place to put the model objects. However, I haven't thought about a new event type. What are the use cases for new event types?

    That depends on the system you want to model. If you want to model network traffic, you need to create events for read or write attempts. These events need the amount of data to be read respectively the data to be sent and be triggered when data is available or the data can be sent. Or think of resources, I'm pretty sure there use-cases for strange put and get requests (I think we got a request for a filtered-priority resource at one time). Event types are also useful for debugging. Instead of a generic event, they have attributes with proper names and also have readable string representations.

    In summary: I cannot yet imagine how your implementation would look like, especially how you want to modify process and timeout and how to implement resources. From my current understanding, I'm skeptical if there is a substantial simplification/improvement. Would you like to come up with a rough proof-of-concept implementation (I would of course prefer python, but I've also no problem with C#)? If there is something to play with, I can help myself to understand :)

  9. Log in to comment