Game instability in coop with extreme uptimes.

Issue #8 new
Frank Sapone created an issue

If you have a long uptime on the same map in coop, game instability can result. I'm not sure of the exact length, but the last verified case was after about a month of uptime. One time I saw triggers for opening doors not firing properly, and on joining the game I saw errors about bad frames with negative frame numbers.

Simply typing "map <mapname>" in the console fixes this issue. Running back and forth through the exits won't resolve it in game, because that uses changelevel, which does not reset whatever is overflowing.

I've created a hack which basically waits 24 hours and then, as long as no players are currently on the server, forces my command sv_resetmap (which was created for this purpose) on the server.
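
The decision behind that hack can be pictured roughly as follows. This is a minimal sketch, not the actual code: `should_force_reset`, its parameters, and `RESET_INTERVAL_SECONDS` are hypothetical names. Only the 24-hour wait, the no-players condition, and the `sv_resetmap` command come from the description above.

```c
#include <stdbool.h>

/* Interval taken from the description above: 24 hours, in seconds. */
#define RESET_INTERVAL_SECONDS (24.0 * 60.0 * 60.0)

/*
 * Hypothetical helper: returns true when the server should issue
 * sv_resetmap. Both parameters are illustrative; a real server would
 * read them from its own state.
 */
bool should_force_reset(double seconds_since_map_load, int connected_players)
{
    if (connected_players > 0)
        return false;   /* never reset while anyone is playing */
    return seconds_since_map_load >= RESET_INTERVAL_SECONDS;
}
```

A server loop would call this once per frame (or less often) and run `sv_resetmap` when it returns true.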

Comments (5)

  1. Maik Merten

    In Quake, similar effects can be observed. There, the issue is that the "time" clock is represented by a 32-bit single-precision float. As the absolute value increases, precision diminishes.

    http://stackoverflow.com/questions/872544/precision-of-floating-point

    "If you want an accuracy of +/-0.5 (or 2^-1), the maximum size that the number can be is 2^23. Any larger than this and the distance between floating point numbers is greater than 0.5."

    Thus if a precision of +/-0.5 seconds was sufficient (it is not!) you'd run into problems after 97 days. In Daikatana, as the server tickrate is of course much higher than just two Hertz and the clock should progress with each tick, problems will become apparent sooner.
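
    To put numbers on this, here is a minimal C sketch. It assumes IEEE 754 single precision and a server frame of 0.1 s (Quake 2's FRAMETIME; whether Daikatana uses the same value is an assumption). The function names are illustrative and not taken from either code base.

```c
#include <stdint.h>
#include <string.h>

/*
 * Gap between a positive, finite float t and the next representable
 * float above it. Assumes IEEE 754 single precision, where adding 1 to
 * the bit pattern of a positive finite value yields its successor.
 */
float float_spacing_at(float t)
{
    uint32_t bits;
    float next;

    memcpy(&bits, &t, sizeof bits);
    bits += 1u;
    memcpy(&next, &bits, sizeof next);
    return next - t;
}

/*
 * Returns 1 if adding one frame's worth of time still changes the
 * clock, 0 if the sum rounds back to the same value.
 */
int clock_advances(float t, float frametime)
{
    /* volatile forces the sum to be rounded to float before comparing */
    volatile float next = t + frametime;
    return next != t;
}
```

    Under these assumptions the spacing is 1.0 at 2^23 = 8388608 seconds (about 97 days) and 0.25 at 2^21 = 2097152 seconds (about 24 days). At that second point `clock_advances(2097152.0f, 0.1f)` returns 0, so a clock that is stepped by adding 0.1 s per frame would stop moving altogether.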

    This may be fixable either

    a) by increasing the precision of timestamps (time, thinktimes etc.) to double precision or

    b) by applying a modulo of, e.g., 3600 seconds (one hour) to both time and thinktimes, so that they wrap back to zero before precision issues occur. This could perhaps be done as a postprocessing step at the end of each tick.
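
    Option b) could look roughly like the sketch below. It subtracts a fixed offset rather than taking a literal modulo, so that the difference between a think time and the clock is preserved. `entity_t`, `level_t`, `wrap_clock` and `WRAP_PERIOD` are simplified, hypothetical stand-ins, not the game's real structures.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the game's structures. */
typedef struct {
    float nextthink;   /* absolute time of the next think; <= 0 means none */
} entity_t;

typedef struct {
    float time;        /* seconds since map load */
} level_t;

#define WRAP_PERIOD 3600.0f   /* one hour, as suggested above */

/*
 * Once the clock has passed WRAP_PERIOD, subtract the same offset from
 * the clock and from every scheduled think time, so that differences
 * such as (nextthink - time) stay the same. Stale think times that were
 * already in the past may end up non-positive.
 * Returns the offset applied (0 if nothing was done).
 */
float wrap_clock(level_t *level, entity_t *ents, size_t count)
{
    size_t i;

    if (level->time < WRAP_PERIOD)
        return 0.0f;

    level->time -= WRAP_PERIOD;
    for (i = 0; i < count; i++) {
        if (ents[i].nextthink > 0.0f)
            ents[i].nextthink -= WRAP_PERIOD;
    }
    return WRAP_PERIOD;
}
```

    Every other stored absolute timestamp would need the same offset applied in the same tick; any field that is missed ends up an hour out of step with the clock.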

  2. Frank Sapone reporter

    I tried a wraparound timer, with a local command that forces the timer to wrap so I could simulate the condition, and the AI breaks until the timer has wrapped around and is incrementing again. It would be very tricky to fix this bug, maybe near impossible unless you really had way too much time on your hands to redo how the whole timer and thinking system works. So unfortunately there is no real resolution to this bug. Luckily, it only affects coop servers that have been running for a very long time, and that's basically just me.

  3. Frank Sapone reporter

    @maikmerten I've done more investigating into this issue. I started working on a Quake 2 coop mod and realized that the instability problems happen much faster there (about 3-4 hours before things start getting odd). Right before level.framenum (which is an int) is incremented, you can check its value and then reset all timers, including level.time. This has worked in the minimal testing I've done so far. There may be a few special cases where I would have to check things like certain relay triggers, but I'm not sure yet.
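
    A rough sketch of where such a check could sit in the frame loop. This is not the code from the commit linked below. `RESET_FRAMENUM`, `advance_frame` and the `shift_timers` hook are hypothetical, and `level_locals_t` is cut down to the two fields mentioned here. FRAMETIME is Quake 2's 0.1 s server frame.

```c
#define FRAMETIME       0.1f    /* Quake 2 server frame length, in seconds */
#define RESET_FRAMENUM  36000   /* hypothetical threshold: one hour of frames */

/* Cut-down stand-in holding only the two fields discussed here. */
typedef struct {
    int   framenum;
    float time;
} level_locals_t;

/*
 * Hypothetical hook: receives the offset in seconds that must be
 * subtracted from every stored timer (think times, delays, and so on).
 */
typedef void (*shift_timers_fn)(float offset);

/*
 * Advances the level by one frame. Before the increment, checks the
 * integer frame counter; at the threshold it asks the hook to shift all
 * timers and restarts the counter. Returns 1 if a reset happened.
 */
int advance_frame(level_locals_t *level, shift_timers_fn shift_timers)
{
    int did_reset = 0;

    if (level->framenum >= RESET_FRAMENUM) {
        if (shift_timers)
            shift_timers(level->framenum * FRAMETIME);
        level->framenum = 0;
        did_reset = 1;
    }

    level->framenum++;
    level->time = level->framenum * FRAMETIME;
    return did_reset;
}
```

    Keying the reset off the integer counter keeps the trigger exact, since the int does not lose precision the way the float clock does.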

    Either way, if you're curious how I solved it in Quake 2 (for the most part) check out this commit: https://bitbucket.org/neozeed/q2dos/commits/885928ac075ed7e17317f9dccd7251f5b914b9d3

  4. Frank Sapone reporter

    With that being said, Daikatana is still a bit harder because there are more than two timers (level.framenum and level.time), and I'm pretty sure there are more float timers in the structs that are tied to level.time or level.framenum. There is also a P_RealTime, but it looks like it may not need fudging.

    At some point I can take the Q2 fix I've come up with and adapt it for DK, after I've tested it a bit more.

  5. Frank Sapone reporter

    Part of the issue with the e1m1a->e1m1b transition had nothing to do with a long uptime, but rather with hosting multiple servers on the same machine from the same directory. DM has e1m1b and was generating an autosave, which the coop server would grab during the transition. That has now been fixed. Longer uptimes are better than they were, but multiple days can still cause some lagged movement. Haven’t had time to make special reset functions for everything, but maybe someday.
