Allow scheduling a JIRA Cluster job more frequently than once a minute

Issue #10 invalid
Serhiy Onyshchenko created an issue

As an Atlassian plugin developer, I would like to run a background task for the whole cluster once every 5 seconds. However, it seems that every schedule is rounded so that it runs at an interval of at least one minute.

Currently, the JIRA scheduler configuration returns false (hard-coded into the implementation) for config.useFineGrainedSchedules(). Thus, when one attempts to submit a schedule (either interval or cron), it gets to the "quantize" branch:

    private Schedule quantize(final Schedule schedule) {
        if (config.useFineGrainedSchedules()) {
            return schedule;
        }

        switch (schedule.getType()) {
            case INTERVAL: {
                final IntervalScheduleInfo info = schedule.getIntervalScheduleInfo();
                return Schedule.forInterval(quantizeToMinutes(info.getIntervalInMillis()), info.getFirstRunTime());
            }

            case CRON_EXPRESSION: {
                final CronScheduleInfo info = schedule.getCronScheduleInfo();
                return Schedule.forCronExpression(quantizeSecondsField(info.getCronExpression()), info.getTimeZone());
            }

            default:
                throw new IllegalStateException("Unsupported schedule type: " + schedule.getType());
        }
    }
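
For illustration, the effect of this branch on a 5-second schedule can be sketched as follows. The bodies of quantizeToMinutes and quantizeSecondsField are not shown in this issue, so the versions below are assumptions (round the interval up to whole minutes; force the seconds field of the cron expression to 0), and the class name QuantizeSketch is hypothetical:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins for the helpers called by quantize(); the real
// implementations are not shown in the issue, so the rounding rules here
// are assumptions made for illustration only.
final class QuantizeSketch {
    private static final long MINUTE_MILLIS = TimeUnit.MINUTES.toMillis(1);

    // Assumption: round a positive interval up to the next whole minute.
    static long quantizeToMinutes(long intervalMillis) {
        if (intervalMillis <= 0L) {
            // Assumption in this sketch: non-positive intervals pass through unchanged.
            return intervalMillis;
        }
        long minutes = (intervalMillis + MINUTE_MILLIS - 1) / MINUTE_MILLIS;
        return minutes * MINUTE_MILLIS;
    }

    // Assumption: the seconds field is the first whitespace-separated field
    // of the cron expression and is replaced with 0.
    static String quantizeSecondsField(String cronExpression) {
        String[] parts = cronExpression.trim().split("\\s+", 2);
        if (parts.length < 2) {
            return cronExpression;
        }
        return "0 " + parts[1];
    }
}
```

Under these assumptions, a 5000 ms interval becomes 60000 ms, and a cron expression that fires every 5 seconds fires once a minute instead.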

Comments (8)

  1. Chris Fuller

    This was a deliberate design decision for atlassian-scheduler, as we really do not want plugins abusing it.

    What is it that the cluster would need done that frequently? It would very likely have serious performance implications for the cluster. How are you going to provide mutual exclusion between them? If you don't need mutual exclusion, then you may as well just have your own executor for something like this. If you do need it, you're going to need to contend for a cluster-wide lock in the database every 5 seconds.

    It is almost certainly the wrong answer for solving whatever your problem is.

    Also, this is not an issue for either atlassian-scheduler or caesium. The option is configurable; it is up to the application to decide whether or not to configure the scheduler with this option enabled or not.

  2. Chris Fuller

    Whether or not the option is enabled is not up to Caesium. The change would need to happen in JIRA, if it happens at all.

  3. Serhiy Onyshchenko reporter

    Hello, Chris. Thanks for your response. I would really like to discuss this with you, as I suppose there is no one who knows the scheduler API and its guarantees better than you do.

    What is it that the cluster would need done that frequently?

    I would like my application to have a single background job, which would poll for new records created in the DB and then would invoke the required services for those records.

    How are you going to provide mutual exclusion between them?

    That is actually one of my concerns. As far as I have noticed, if there is a schedule for each minute, and a job takes more than a minute to return the job response, the atlassian-scheduler aborts the subsequent job invocation. Is my observation correct? Are there any guarantees in the atlassian-scheduler for mutual exclusion for a job with one JobId being run on different schedule periods?

    Also, this is not an issue for either atlassian-scheduler or caesium.

    Could I ask you to point me to the right issue tracker to address this problem?

    Kind Regards, Serhiy.

  4. Chris Fuller

    I would like my application to have a single background job, which would poll for new records created in the DB and then would invoke the required services for those records.

    This sounds very similar to what the ClusterMessagingService does. Have you looked at that API to see if it already does what you need? How is it that the records would get into the database in the first place? What do those records mean? Do you use the same mechanism for JIRA Server, or does this only matter for JIRA Data Center?

    As far as I have noticed, if there is a schedule for each minute, and a job takes more than a minute to return the job response, the atlassian-scheduler aborts the subsequent job invocation. Is my observation correct?

    Partially. In both the Quartz (pre-7.0) and Caesium schedulers, this is only guaranteed locally. If the cluster node that claims the job for execution happens to be the same node that is already executing it from an earlier time, then it will abort the job instead of running it. It is actually the atlassian-scheduler library that takes care of this (you can see the logic for it here).
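
    A node-local guard of this kind can be sketched roughly as below. This is not the atlassian-scheduler code; the class LocalRunGuard and its method are hypothetical, and the sketch only shows why the check cannot see runs on other nodes: the set of running job IDs lives in one JVM's memory.

    ```java
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical illustration of a per-node "already running" check.
    // The set below exists only in this JVM, so another cluster node
    // running the same job is invisible to it.
    final class LocalRunGuard {
        private final Set<String> runningJobIds = ConcurrentHashMap.newKeySet();

        // Returns true if the job ran, false if it was aborted because the
        // same job ID is already running on this node.
        boolean runIfNotAlreadyRunning(String jobId, Runnable job) {
            if (!runningJobIds.add(jobId)) {
                return false; // already running locally: abort this invocation
            }
            try {
                job.run();
                return true;
            } finally {
                runningJobIds.remove(jobId);
            }
        }
    }
    ```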

    Although Quartz is aware of all the currently running jobs in the cluster, it only provides the APIs for viewing those that are running locally. It also has ways to ensure that only one copy of a job runs at a time, but this is based on the executing job class, which belongs to the scheduler library itself, so we could not take advantage of this. Caesium does not track this information in the database at all, so with that implementation too, atlassian-scheduler can only tell the job is already running if you get lucky and it is claimed on the same server.

    Because of this, we recommend that you ensure the job schedules are spaced sufficiently far apart that they would never risk overlapping and/or use a cluster lock to ensure mutual exclusion, as otherwise the results could be erratic in a cluster.
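
    The lock-based approach can be sketched as follows, assuming the job is handed a java.util.concurrent.locks.Lock. In a real cluster the lock would have to be cluster-wide (for example, one obtained from the host application's cluster lock service, which is an assumption here); a ReentrantLock would only work as a single-JVM stand-in. The class name ExclusiveJob is hypothetical.

    ```java
    import java.util.concurrent.locks.Lock;

    // Hypothetical wrapper that skips a run when another holder already has the lock.
    // Mutual exclusion across nodes requires a cluster-wide Lock implementation;
    // this sketch does not provide one.
    final class ExclusiveJob {
        private final Lock lock;
        private final Runnable work;

        ExclusiveJob(Lock lock, Runnable work) {
            this.lock = lock;
            this.work = work;
        }

        // Returns true if the work ran, false if the lock was held elsewhere.
        boolean runOnce() {
            if (!lock.tryLock()) {
                return false; // someone else is processing: skip this run
            }
            try {
                work.run();
                return true;
            } finally {
                lock.unlock();
            }
        }
    }
    ```

    Using tryLock rather than lock means an overlapping invocation is skipped instead of queuing up behind the running one.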

    Are there any guarantees in the atlassian-scheduler for mutual exclusion for a job with one JobId being run on different schedule periods?

    This question doesn't make sense, because the JobId uniquely identifies a single schedule. You cannot have two of them at all. If you tried to create a second one, the first one would be deleted.

    If you mean two different schedules with the same JobRunner, that is a different story. However, the scheduler considers them two separate jobs that have no relationship other than that they would both be returned by the same call to getJobsByJobRunnerKey.

    Could I ask you to point me to the right issue tracker to address this problem?

    The proper place to request this is https://jira.atlassian.com/ in the JRA project with Suggestion as the Issue Type.

  5. Serhiy Onyshchenko reporter

    Hey, Chris. Thanks again for the detailed response.

    This sounds very similar to what the ClusterMessagingService does. Have you looked at that API to see if it already does what you need?

    As far as I understand, the cluster messaging service notifies listeners on multiple nodes about an event that happened on another node. In my use case, however, I need to ensure that there is only one processing thread across the whole cluster.

    How is it that the records would get into the database in the first place?

    Asynchronously, upon receiving a REST request / UI interaction.

    Do you use the same mechanism for JIRA Server, or does this only matter for JIRA Data Center?

    On JIRA Server I could create a single-thread executor that runs at an interval of 5 seconds. On Data Center, however, the same code would create an executor per node.
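
    A minimal sketch of such a per-node executor, using only java.util.concurrent, is shown below. The class name LocalPoller is hypothetical. As noted above, every node that loads this code gets its own executor, so it provides no cluster-wide exclusivity.

    ```java
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.ScheduledFuture;
    import java.util.concurrent.TimeUnit;

    // Hypothetical per-node poller: one background thread per JVM.
    final class LocalPoller implements AutoCloseable {
        private final ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor();

        // Runs pollOnce immediately, then again 'delay' after each run finishes,
        // so two runs never overlap within this JVM.
        ScheduledFuture<?> start(Runnable pollOnce, long delay, TimeUnit unit) {
            Runnable safe = () -> {
                try {
                    pollOnce.run();
                } catch (RuntimeException e) {
                    // An uncaught exception would suppress all later runs.
                    System.err.println("Poll failed: " + e);
                }
            };
            return executor.scheduleWithFixedDelay(safe, 0L, delay, unit);
        }

        @Override
        public void close() {
            executor.shutdownNow();
        }
    }
    ```

    A 5-second poll would then be started with something like poller.start(task, 5, TimeUnit.SECONDS), where task is the polling logic.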

    This question doesn't make sense, because the JobId uniquely identifies a single schedule. You cannot have two of them at all. If you tried to create a second one, the first one would be deleted.

    The question was actually an attempt to rephrase and clarify the question about the guarantees for a single schedule. Sorry for that :)

    Ah, and thanks for the JAC tip.

    BR, Serhiy.

  6. Chris Fuller

    I just had a conversation with @mkonecny who gave me a little more background on what you're trying to accomplish and why. Abusing atlassian-scheduler as you intend is a very bad way to approach this problem. I would instead suggest that you consider using webhooks that target the cluster through the load balancer to inform it of changes, having issue views check for updates out-of-band if they have not checked very recently, and possibly having an infrequent (once an hour?) job to check for anything that this has missed.
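
    The "check only if it hasn't checked very recently" part could be sketched as a small throttle like the one below. The class name RecentCheckThrottle and the caller-supplied clock value are hypothetical, and the state is per node, so this only limits how often one node performs the out-of-band check.

    ```java
    import java.util.concurrent.atomic.AtomicLong;

    // Hypothetical throttle: allows a check only if the previous one on this
    // node happened at least minIntervalMillis ago.
    final class RecentCheckThrottle {
        private static final long NEVER = Long.MIN_VALUE;

        private final long minIntervalMillis;
        private final AtomicLong lastCheckMillis = new AtomicLong(NEVER);

        RecentCheckThrottle(long minIntervalMillis) {
            this.minIntervalMillis = minIntervalMillis;
        }

        // Returns true if the caller should perform the check now.
        boolean tryAcquire(long nowMillis) {
            long last = lastCheckMillis.get();
            if (last != NEVER && nowMillis - last < minIntervalMillis) {
                return false; // checked recently enough
            }
            // Only one of several concurrent callers wins the right to check.
            return lastCheckMillis.compareAndSet(last, nowMillis);
        }
    }
    ```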

    In any case, this is not an issue with either caesium or atlassian-scheduler, both of which will allow fine-grained schedules if the application configures them that way. It is a Suggestion for JIRA itself, as that is what specifies that fine-grained schedules are not allowed. We are very unlikely to change this, as trying to run jobs more frequently than once a minute is a sure sign that there is something wrong with your design.

  7. Serhiy Onyshchenko reporter

    Thanks for the feedback. You have certainly given us a good deal of valuable information to consider. We will have another round of design discussions. As for upcoming scheduler-related questions, what is the best way to engage and involve you in a discussion (AAC, BB issues, JAC, emails)? BR, Serhiy.

  8. Chris Fuller

    If it's a question that might be of interest to other developers, AAC is usually a reasonably good way to reach me (though I am about to go on leave, returning Feb 8th). I watch the caesium repository, so BB issues are another way.

    JAC would make sense except somebody seems to have restricted access to the SCHEDULER project. The JRA project is where general JIRA issues should go. I would prefer not to get involved in direct emails, as there is more chance that I'll overlook something.
