Quickstatements batch running list keeps getting shorter
Last week I noticed there were up to 16 Quickstatements batches running simultaneously. Over the last few days that dropped to 9 - there were usually many waiting in “Running” state, but only 9 showed at least one “RUN” entry. As of right now, that 9 has dropped to just 8 running batch jobs. It looks like the problem may be stopped or DONE batches that still have an entry labeled “RUN” (but it’s clearly not running - nothing is happening or changing with those batches). The change from 9 to 8 seems to have coincided with one of my batches hitting that state: #12982 is currently stopped, but it still has one entry labeled “RUN”. So I’m guessing something counts the total number of “RUN” entries to decide when to allow another batch job to start, and occasionally those entries get hung up?
Anyway, an interim fix is probably to reset those bogus “RUN” entries somehow. Longer term maybe anything in “RUN” state that’s older than an hour should be ignored/reset automatically?
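To make the suggestion concrete, here is a minimal sketch of the "reset anything stuck in RUN for more than an hour" idea. It is not Quickstatements code: the `Command` record, its field names, and the choice of resetting to “INIT” are all assumptions for illustration, and I don’t know how the real scheduler stores or counts these entries.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Threshold proposed in this report; the right value is an assumption.
STALE_AFTER = timedelta(hours=1)


@dataclass
class Command:
    """Hypothetical stand-in for one entry of a batch."""
    batch_id: int
    status: str            # "INIT", "RUN" or "DONE", as shown in the batch list
    last_update: datetime  # assumed: time the entry last changed state


def is_stale_run(cmd, now, stale_after=STALE_AFTER):
    """True if the entry says RUN but has not changed for longer than stale_after."""
    return cmd.status == "RUN" and now - cmd.last_update > stale_after


def reset_stale_runs(commands, now, stale_after=STALE_AFTER):
    """Put stale RUN entries back to INIT and return the affected batch IDs."""
    touched = set()
    for cmd in commands:
        if is_stale_run(cmd, now, stale_after):
            cmd.status = "INIT"  # assumed target state; could also be an error state
            touched.add(cmd.batch_id)
    return touched


def batches_holding_slots(commands):
    """Batch IDs that still have at least one RUN entry (my guess at what is counted)."""
    return {cmd.batch_id for cmd in commands if cmd.status == "RUN"}
```

With one entry of #12982 stuck in “RUN” for 5 hours and one entry of another batch updated a few minutes ago, `reset_stale_runs` would touch only #12982, and `batches_holding_slots` would afterwards report only the batch that is really running. One caveat: resetting to “INIT” could re-run an edit that actually went through before the entry got stuck, so a separate error state might be the safer choice.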
Comments (9)
-
reporter -
reporter Batch #12982 is back in “STOP” state, but still shows 1 entry with status “RUN”.
-
reporter Here’s an image of my current “Last Batches” screen - #12982 hasn’t changed in 5 hours, but it lists a “RUN: 1” item.
-
reporter The running list dropped to 7 a little earlier today, and I think the batch run backlog is heading towards 24 hours (generally at least 15-20 batches waiting to run). Batch #13180 appears to be the latest culprit - right now it shows the same problem as “RUN: 2” (when it’s finished in a few hours it’ll be “DONE” but with a “RUN: 1” entry).
-
reporter - marked as critical
Now the running list is down to just 4 (FOUR!) batches. At least 30 are queued up waiting to run - max running batch ID is 13288 but there’s one queued with ID 13318. I think the latest problem was that somehow one of my jobs acquired 4 extra “RUN”-state entries: #13284
-
reporter The running list is back up to 7 batches; somehow the extra “RUN” entries on #13284 cleared up.
-
reporter Now back up to 8 - I’ll lower this from critical, as at this level I think it can roughly keep up with the backlog. But if there are supposed to be up to 16 simultaneous batches, it’s still only running at half capacity.
-
reporter - marked as blocker
I think it’s reached beyond critical level this morning - only ONE job is running in the queue at the moment, dozens are listed as “Running” but have not updated in 2 hours or more!
-
reporter And now nothing is running - at least last I could see. Now I can’t get to the last batches page at all!?
-
I restarted #12982 for now, to let it finish the items in “INIT” state. However, you’ll notice (if you see this while it’s still running over the next few hours) that it now lists 2 entries in “RUN” state, rather than just 1.