Why you shouldn’t use Quartz Scheduler

If you need to schedule jobs in Java, it is fairly common in the industry to use Quartz directly or via Spring integration. At the time of writing, the Quartz home page claims that using Quartz is a simple 3-step process: download, add to app, execute jobs when you need to. For anyone who actually has experience with Quartz, this is truly laughable.

First of all, adding the Quartz library to your app does not begin to ready your application to schedule jobs. Getting your code to run on a schedule with Quartz is anything but straightforward. You have to write your own implementation of the Job interface, then either construct large XML configuration files or add code to your application that builds new instances of JobDetail and Trigger using a complex API such as
 .withIdentity("myTrigger", "group1").startNow().withSchedule(simpleSchedule()
and then schedule them using a Scheduler instance obtained from a SchedulerFactory. All of this is code you have to write for each job, or the equivalent XML configuration. Quite the headache for something that was supposed to simply be “execute jobs when you need to”. Even in Quartz’ tutorial, it takes 6 lessons to set up a job. What happens when you need to change a job’s schedule? Temporarily disable a job? Change the parameters bound to a job? All of these require a build/test/deploy cycle, which is impractical for any organization.

Quartz is also deficient in its feature set. Out of the box, it is just a code library for job execution. No monitoring console for reviewing errors and history, no useful and reasonably searchable logging, no support for multiple execution nodes, no administration interface, no alerts or notifications, inflexible and buggy recovery mechanisms for failed jobs and missed jobs.

Quartz does provide add-on support for multiple nodes, but it requires additional advanced configuration. Quartz also provides an add-on called Quartz Manager, but it too needs additional advanced configuration, is a Flash app, and is incredibly cumbersome and impractical to use.

Simply put, Quartz falls short on these basic needs:

  • No out of the box support for multiple execution nodes (pooling or clustering)
  • No administration UI that allows all job scheduling and configuration to be done outside of code
  • No monitoring
  • No alerts
  • Insufficient mechanisms for dealing with errors/failures and recovery

All this means Quartz is not really a justifiable choice as an enterprise scheduler. It is feature-poor and has high implementation and ongoing usage costs in terms of time and energy.

Obsidian Scheduler is a great choice for your Java-based applications. You truly can be up and running the same day you download it. We have a live, interactive demo where you can try out the interface and see first-hand how easy it is to add, change and disable jobs, monitor all node activity, disable and enable nodes, and even take advantage of advanced schedule configuration such as chaining and sticky nodes.

In addition to our product website, we’ve discussed Obsidian’s standout features many times here on our blog. Download it today and give it a try!

Scheduler Monitoring Done Right

One of the most important features of Obsidian is the ability to be notified of application events and also quickly locate and correct any issues that arise. While products like Quartz and cron4j give us the basic scheduling, we have to develop our own solutions for monitoring and notification of events. Unfortunately, in many corporate environments, developers are very limited in how much time they can work on infrastructure pieces and things like error handling and monitoring end up getting far too little focus. When you are developing in-house or revenue-generating software, managers just want the product out the door, and aren’t worried about maintenance implications until far too late.

With these realities in mind, we’ve built in 3 levels of functionality to facilitate monitoring. Based on decades of software and operations experience, we’ve identified the primary areas where developers and support teams spend their time, and the ways that we could reduce costs so that using our scheduler would be a huge gain for our clients.

Level One – Logging Console

The first problem we identified is that many applications rely on text logs generated by frameworks like log4j to troubleshoot and locate issues. These logs can be very verbose, especially over time, and if logging is not correctly configured, the information you need to troubleshoot an issue may simply not be there. It is also a major pain to find what you are looking for within these logs. Add in issues with access control for production environments and delays from change requests, and you have a major headache on your hands.

To alleviate this problem, we developed a dashboard console within our scheduler web application. This dashboard has records of dozens of types of application events. This dashboard can be made accessible to read-only users, and can even be used by support teams with little software experience to see issues when they arise. Events can be filtered by type, time, severity and the contents of the message. Any time an event worth logging occurs, it is put into the dashboard history and can then be navigated from the dashboard console.

The real beauty of this approach is that the information is available at your fingertips and can easily be located again and again with a few simple clicks. All events are logged so there is no chance of missing information because it is at too fine a level, and you can easily configure our default job to clean old dashboard information. It also avoids the issue of granting production access to support teams, or having to contact administrators to provide logs.

Obsidian Log Dashboard

Level Two – Notifications

Another big issue of system monitoring and maintenance is knowing immediately when things go wrong, or when unexpected things happen. We knew we needed some kind of notification support, but we had to carefully consider the best way to provide this to our users so they have the flexibility and simplicity they require to maintain their applications inexpensively.

Since we had such a clean and detailed system to log application events, we leveraged this to provide event notifications. Every event may be tied to a specific entity – this may be a scheduled job, a job configuration, etc. Every event also has a severity, which ranges from Trace to Fatal (inspired by logging levels).

The natural way to approach notification was to allow users to subscribe to a certain severity level for events of a certain type. Not only can they be alerted when errors occur, but they can also subscribe to informational events or warnings. Users can also target a specific entity to limit when they are notified – for example, they may wish to know when any job fails, or when a specific job fails. We outlined every type of subscribable event and exposed a clean and simple interface within our administration console so that users could quickly change their event subscriptions, and even temporarily disable them.
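The matching rule described above can be sketched in a few lines of Java. This is only an illustration of the idea, not Obsidian's actual code: the class names, the event type string and the severity levels between Trace and Fatal are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class NotificationSketch {
    // Only Trace and Fatal are named in the text; the levels in between are assumed.
    enum Severity { TRACE, DEBUG, INFO, WARNING, ERROR, FATAL }

    // Hypothetical event: a type, a severity, and the entity (e.g. a job) it relates to.
    static final class Event {
        final String type;
        final Severity severity;
        final String entity;

        Event(String type, Severity severity, String entity) {
            this.type = type;
            this.severity = severity;
            this.entity = entity;
        }
    }

    // Hypothetical subscription: an event type, a minimum severity, and an optional entity.
    static final class Subscription {
        final String subscriber;
        final String type;
        final Severity minSeverity;
        final String entity;      // null means "any entity"
        boolean enabled = true;   // lets a user temporarily switch the subscription off

        Subscription(String subscriber, String type, Severity minSeverity, String entity) {
            this.subscriber = subscriber;
            this.type = type;
            this.minSeverity = minSeverity;
            this.entity = entity;
        }

        boolean matches(Event e) {
            return enabled
                    && type.equals(e.type)
                    && e.severity.compareTo(minSeverity) >= 0
                    && (entity == null || entity.equals(e.entity));
        }
    }

    // Returns the subscribers that should be notified about the event.
    static List<String> recipients(List<Subscription> subscriptions, Event e) {
        List<String> result = new ArrayList<>();
        for (Subscription s : subscriptions) {
            if (s.matches(e)) {
                result.add(s.subscriber);
            }
        }
        return result;
    }
}
```

A subscription with a null entity corresponds to "tell me when any job fails", while one naming an entity corresponds to "tell me when this specific job fails".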

We include email notification support out of the box currently and this also gives virtually all mobile users access to SMS notifications as well – see Wikipedia’s list of SMS email gateways to see the address to use for your carrier.

Obsidian Notifications

Level Three – Full and Complete Application Logging

While we log just about any event of any relevance within Obsidian, we also know that organizations have current standards and infrastructure for logging in place. In response to this, we use log4j to log all events logged to the dashboard, plus additional diagnostic information. Since we use the well-established log4j framework, you have access to all its appender types including text appenders, network appenders and JDBC appenders. In this way, developers can customize Obsidian’s logging to fit their needs. We still believe our notification and event logging approach is tough to beat, but we want to make life as easy for our customers as possible. This approach also makes it easy for our clients to embed Obsidian within their applications while keeping logging consistent and simple.

If you have any questions about our experience or would like to suggest a feature for Obsidian’s monitoring and logging, leave a comment.

Obsidian Scheduler 1.2 Released!

We’ve added more to Obsidian Scheduler. Version 1.2 is now available here.

Features added in this release include

  • Description annotation on job class enables inline help in admin web app
  • Scheduled one-time runs
  • New Job Status supporting any unscheduled events (ad hoc & chaining)
  • Bundled Winstone support for quick start
  • Option to wait indefinitely for job completions on graceful shutdown
  • Licence key release on graceful shutdown

As always, our live online demo has been updated with this latest release. Fully functional and completely open for your use at http://demo.carfey.com.

Scheduler Fault Tolerance & Load Balancing

Obsidian Scheduler provides enterprise scheduling features while natively supporting pooling and clustering, or in other words, load balancing and fault tolerance. But Obsidian does so in a way that is painless and non-invasive. In fact, you don’t have to do anything. Load balancing and fault tolerance are built into each instance of Obsidian Scheduler, whether you choose to run it inside the web admin app, embedded in your own application, as a standalone instance or any combination of these. This is critical for a scheduler, since software or hardware faults, unanticipated load or any number of other things could cripple or bring down a scheduler instance and prevent critical items from firing. This is where pooling and clustering fit so well.

In fact, we are so passionate about fault tolerance and load balancing, that we don’t offer a single node version of Obsidian. All licences are a minimum of two nodes and your fully functional trial allows you to see two nodes running without any functional restriction. We want you to have, at minimum, a second instance running to ensure your scheduled jobs run on time and that a failure doesn’t prevent other scheduled items from completing or subsequent instances from firing.

Many enterprise server solutions support pooling and clustering, but often rely on a variety of complex configuration strategies and/or approaches to communication between pool participants. Obsidian doesn’t need any of these. Every Obsidian Scheduler instance of any type automatically joins the existing pool/cluster, or establishes it if it is the first one on the scene. No extra configuration required. No communication between servers necessary. No multicast, no replication of data between servers. This means that you can easily swap out hardware in case of failure or add a new member for load sharing. In fact, if you have standby hardware, you can have it running, awaiting availability of a node licence, and it will automatically take over as soon as a node licence is available.
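One way a pool can form without any server-to-server communication is for every node to record a heartbeat in a shared data store and to treat any node with a recent heartbeat as a member. The sketch below illustrates that general technique only; it is an assumption on our part for the sake of example, not a description of Obsidian's internals. The in-memory map stands in for a shared database table, and the class name, method names and timeout are hypothetical.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

class PoolRegistrySketch {
    // Stands in for a shared table of (node id, last heartbeat time in milliseconds).
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    PoolRegistrySketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // A node joins the pool (or creates it, if it is first) just by writing a heartbeat.
    void heartbeat(String nodeId, long nowMillis) {
        lastHeartbeat.put(nodeId, nowMillis);
    }

    // Members are the nodes whose last heartbeat is no older than the timeout.
    SortedSet<String> activeNodes(long nowMillis) {
        SortedSet<String> active = new TreeSet<>();
        for (Map.Entry<String, Long> entry : lastHeartbeat.entrySet()) {
            if (nowMillis - entry.getValue() <= timeoutMillis) {
                active.add(entry.getKey());
            }
        }
        return active;
    }
}
```

Because membership is derived from the store rather than negotiated between servers, a failed node simply ages out and a replacement appears as soon as it writes its first heartbeat.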

Obsidian also supports fault tolerance of individual jobs. A job may stall, fail to complete because its instance failed, fail with an exception, not run because no nodes were running, or be blocked by a conflicting job. Obsidian provides recovery and tolerance mechanisms for all of these failure modes, and all of them are configurable and managed via the web interface. You can even configure specialized job chaining based on the state of the source job. In an upcoming release, Obsidian will expose fully manageable workflow based on source job state and/or its output/results, or really any condition or criteria you may have. You can also use the web interface to subscribe to server and job events at a high level, or target just the events you are concerned with, so that you’re kept up to date without having to log in, parse and review log files, etc.

We know that running software in production environments can be unpredictable at times and that all too frequently, bad things happen. We want Obsidian Scheduler to keep you safe and to help you feel secure. Share with us your stories or let us know if you can think of any other ways we can make Obsidian better able to adapt to scheduling problems.

Scheduler Management

When we started working on our Obsidian Scheduler, one of the primary motivations was to give our clients full control over the configuration of their server runtime and total control over job configuration, including schedule, state, parameterization, etc. More than just full control, we wanted the configuration of said items to take effect immediately across all instances, without the need for restarts and without any necessary code or config file changes.

The established means of handling this is having a user interface that fronts the configuration items. But many scheduler solutions, including cron4j, default Quartz and Spring’s simple scheduler support, do not anticipate needs beyond the scheduling of a job. Even from a development perspective, a developer would reap productivity gains from not having to crack open code or config files or stop and start the server runtime to make these types of changes, and from having them take immediate effect.

Liveness of these changes across all running scheduler hosts is accomplished by using a single data store to store and view the configuration, while ensuring proper concurrency safety to avoid dirty writes, deadlocks and the like. As previously discussed on this blog, using an optimistic locking strategy globally has negligible impact on performance while protecting against dirty writes. By choosing this strategy for Obsidian and making strategic use of pessimistic locks, semaphores and reentrant locks, we can handle thousands of concurrent jobs and dozens of concurrent hosts on minimal hardware without collisions or failures.
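For readers unfamiliar with optimistic locking, here is a minimal in-memory sketch of the general technique. It is not Obsidian's implementation: the class and field names are hypothetical, and the map stands in for a database table where the same check would typically be written as an UPDATE that only succeeds when the version column still holds the value the writer originally read.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class VersionedConfigStore {
    // An immutable snapshot of one job's configuration plus its version number.
    static final class Row {
        final String cron;
        final long version;

        Row(String cron, long version) {
            this.cron = cron;
            this.version = version;
        }
    }

    private final ConcurrentMap<String, Row> rows = new ConcurrentHashMap<>();

    void insert(String jobName, String cron) {
        rows.put(jobName, new Row(cron, 0L));
    }

    Row read(String jobName) {
        return rows.get(jobName);
    }

    // Succeeds only if nobody has changed the row since 'expected' was read.
    // On failure the caller re-reads the row and decides whether to retry.
    boolean update(String jobName, Row expected, String newCron) {
        Row next = new Row(newCron, expected.version + 1);
        return rows.replace(jobName, expected, next);
    }
}
```

No lock is held between the read and the write, which is why the cost is negligible when conflicts are rare; a stale writer simply gets false back instead of silently overwriting someone else's change.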

One of the great features of Obsidian is its administration user interface. A webapp that can run with or without an embedded scheduler instance, this is the control panel, if you will, of your scheduler pool and your runnable jobs.

Here are some examples of configuration items that you can control from Obsidian Scheduler’s Admin UI and whose changes take effect immediately across all pooled instances.

  • Job State – including disabling/enabling a job
  • Job Schedule – cron pattern
  • Future State/Schedules – e.g. disable a job at midnight tomorrow or change a job to run hourly as of next week
  • Job Parameters
  • The script itself in Script Jobs
  • Scheduling a job
  • Remove a running host from the pool – Without shutting it down
  • Job Chaining
  • Job Conflict Configuration – including prioritization
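To illustrate the idea behind future state and schedule changes from the list above, here is a small sketch of effective-dated configuration: every change is stored with the time it takes effect, and the scheduler looks up whichever entry is in force at the moment. The class and method names are hypothetical and this is only one way such a feature could be modelled, not a description of Obsidian's data model.

```java
import java.time.Instant;
import java.util.Map;
import java.util.TreeMap;

class ScheduleTimelineSketch {
    // One configuration entry: whether the job is enabled, and its cron pattern.
    static final class Setting {
        final boolean enabled;
        final String cron;

        Setting(boolean enabled, String cron) {
            this.enabled = enabled;
            this.cron = cron;
        }
    }

    // Changes keyed by the instant at which they take effect.
    private final TreeMap<Instant, Setting> changes = new TreeMap<>();

    void effectiveFrom(Instant when, boolean enabled, String cron) {
        changes.put(when, new Setting(enabled, cron));
    }

    // Returns the setting in force at 'now', or null if none applies yet.
    Setting at(Instant now) {
        Map.Entry<Instant, Setting> entry = changes.floorEntry(now);
        return entry == null ? null : entry.getValue();
    }
}
```

With this shape, "change a job to run hourly as of next week" is just one more entry with a future key; nothing needs to be redeployed when that moment arrives.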

A convenient byproduct of putting all this management control in an interface is that it is very easy to have a group of people have read-only access (e.g. production support) to these configuration items. Obsidian does just that, including a read-only role in both native DB and LDAP controlled modes. Server-level controls are also exposed in Obsidian’s admin interface, secured to the admin role. These items provide fine-grained control over advanced Obsidian Scheduler features.

I’ve mentioned the Obsidian pooling support in passing several times in this post. This is another hallmark feature of Obsidian Scheduler. In our next post, we’ll discuss what makes this feature so special.

Obsidian Scheduler 1.1 Released!

We’ve been working hard to make Obsidian Scheduler even better. Version 1.1 is now available.

Features added in this release include

  • Perfectly distributed load balancing across running hosts within seconds of pool membership changes
  • Header icons indicating the number of active hosts
  • Ability to disable targeted scheduler instances without shutting them down
  • Ad hoc job submission support
  • Job State/Schedule UI enhancements

Our demo has been updated with this latest release. Check it out at http://demo.carfey.com and download the update here.

Scheduler Goals

As software professionals, we need our job schedulers to be reliable, easy to use and transparent.

When it comes to job schedulers, transparent means letting us know what is going on. We want information such as when the job was scheduled to run, when it started, when it completed and which host ran it (in pool scenarios). We want to know if it failed and what the problem was. We want the option to be notified by email, page and/or message when all or specific jobs fail. We want detailed information available to us should we wish to investigate problems in our executing jobs. And we want all of this without having to write code, create our own interface, parse log files or do extra configuration. The scheduler should just work that way.

We want our scheduler to be easy to use. We want an intuitive interface where we can control everything. Scheduling jobs, changing schedules, signing up for alerts, configuring workflow, investigating problems, previewing the runtime schedule of our environments, temporarily disabling and re-enabling pool participants, resubmitting failed jobs and reviewing job errors should all be at our fingertips and easy to do. The changes should take effect immediately in all pool participants and be logged. If we want to add or remove nodes based on load, we should just be able to do so without any drama.

We want our scheduler to be reliable. It should participate in load balancing and fault tolerance without any extra work. It needs to notify us when something goes wrong. It needs to be incredibly fast so that it stays out of the way and lets the jobs run.

As you’re probably starting to see, software long ago settled on a way to solve these types of problems: a single data store, typically a database. For reasons that are beyond me, job schedulers either don’t use a database or only provide it as an optional configuration setup, an afterthought. This is extremely short-sighted. By not driving your solution off a database, most of the needs identified above become impossible or, at best, impractical. Even when a database is used optionally, your job scheduler doesn’t provide the user interface that gives you easy access to the information you require. It’s like a reference book without an index or glossary. You can go find the information you want, but it will be much more work than it needs to be.

Carfey Software’s scheduler has all these features and more. Sign up for your trial licence now at www.carfey.com.

The Problems With Schedulers

Job scheduling is a common need in business software environments. The first thing that likely comes to mind is cron. While cron is a very useful tool, it fails to provide for even basic business needs including logging, history, monitoring and availability (beyond the availability of the host it runs on).

Open source tools such as Quartz, cron4j and Spring’s scheduler are common choices. These schedulers are all moderate improvements over cron in that you can run them within existing JVM instances, be it a running servlet container or even a J2EE server. Generally, though, these schedulers fall short of even the most basic software expectations.

Let’s first of all dispense with Spring’s scheduler – using TimerTask. Spring’s focus is not automation or scheduling. This is clear when we start to review the feature set and usage of scheduling in Spring. While some may argue for Spring’s wiring via static configuration (I would not), this is much less practical when it comes to scheduling jobs. Can you work around this static configuration by rolling your own solution around it? Sure, but you’d be violating a basic principle of Spring, and do you really want to introduce into your software project the maintenance of code to handle deficiencies in an afterthought feature of a framework with an alternate focus? This is besides the fact that there is no built-in failover, logging or monitoring.
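To show how little the underlying primitive offers, here is a minimal sketch using the JDK's own java.util.Timer and TimerTask, without any Spring wiring around it. The class name, thread name and the latch used to observe the runs are ours, added only so the example can be executed and checked.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class TimerTaskSketch {
    // Runs a task every periodMillis and reports whether it fired 'runs' times within 5 seconds.
    static boolean runFixedSchedule(long periodMillis, int runs) {
        final CountDownLatch fired = new CountDownLatch(runs);
        Timer timer = new Timer("sketch-timer", true); // daemon thread
        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                // The "job". An uncaught exception here would terminate the Timer thread,
                // and nothing would log it, retry it or notify anyone.
                fired.countDown();
            }
        };
        // The delay and period are fixed at the point of scheduling; changing them
        // means cancelling the task and scheduling a new one in code.
        timer.schedule(task, 0L, periodMillis);
        try {
            return fired.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel();
        }
    }
}
```

Everything beyond "run this at an interval" (history, failover, monitoring, changing the schedule at runtime) is left for you to build.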

What about other Java schedulers whose focus is in fact scheduling and execution? Some of these acknowledge that they are nothing more than cron in Java, but others are clear improvements over Spring’s scheduler. The most popular and comprehensive open source tool available is Quartz. Let’s evaluate Quartz.

Quartz does provide much control over when a job runs: time of day, days of week, days of month, days of year and any combination of these, with support for custom calendars. Apart from the custom calendars, you probably recognize what it supports as being very similar to cron. It probably wouldn’t surprise you that Quartz uses cron patterns as a basis for its configurations. Unfortunately, it does not use a true cron pattern: it adds an extra seconds field at the beginning of each pattern. For anyone who has been using cron for even a short period of time, it’s truly bewildering and frustrating that they chose not to adhere to the established pattern and that, when they did so, they made their changes non-optional and put them at the beginning of each pattern. Quartz also does not support some functionality in cron, such as day-of-week combined with day-of-month.
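To make the difference concrete, here is a rough sketch of what it takes to turn a classic five-field cron pattern into the six-field form Quartz expects. The class and method are hypothetical helpers written for this post, not part of Quartz. The sketch prepends a seconds field of 0 and puts "?" in whichever of day-of-month or day-of-week is unrestricted, since Quartz requires "?" in one of the two. It deliberately rejects numeric day-of-week values, because Quartz numbers days 1-7 starting at Sunday while classic cron starts at 0, and a faithful converter would have to renumber them.

```java
class CronToQuartzSketch {
    // Converts "minute hour day-of-month month day-of-week" into the Quartz layout
    // "second minute hour day-of-month month day-of-week".
    static String toQuartz(String unixCron) {
        String[] fields = unixCron.trim().split("\\s+");
        if (fields.length != 5) {
            throw new IllegalArgumentException("Expected 5 cron fields, got " + fields.length);
        }
        String minute = fields[0];
        String hour = fields[1];
        String dayOfMonth = fields[2];
        String month = fields[3];
        String dayOfWeek = fields[4];

        boolean anyDayOfMonth = dayOfMonth.equals("*");
        boolean anyDayOfWeek = dayOfWeek.equals("*");

        if (!anyDayOfMonth && !anyDayOfWeek) {
            // The cron feature the text mentions as unsupported in Quartz.
            throw new IllegalArgumentException("Quartz cannot combine day-of-month and day-of-week");
        }
        if (!anyDayOfWeek && dayOfWeek.matches(".*\\d.*")) {
            // Not handled in this sketch: the two tools number the days differently.
            throw new IllegalArgumentException("Numeric day-of-week is not handled; use names such as MON-FRI");
        }

        // Quartz requires '?' in exactly one of the two day fields.
        if (anyDayOfWeek) {
            dayOfWeek = "?";
        } else {
            dayOfMonth = "?";
        }
        return String.join(" ", "0", minute, hour, dayOfMonth, month, dayOfWeek);
    }
}
```

For example, the cron pattern "0 12 * * *" (noon every day) becomes "0 0 12 * * ?", and "30 8 * * MON-FRI" becomes "0 30 8 ? * MON-FRI".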

This is about all Quartz does moderately well. Our next blog post will identify all the things a world-class, full-featured scheduler needs to support and the right way to do so.

Great New Product

Carfey Software is proud to announce the soon-to-be released Carfey Scheduler. This product is a significant step forward in the software marketplace for scheduling, workflow, automation and monitoring. It is a Java-based product, but can work within virtually any software environment.

This incredible new product is easily administered with an intuitive web-based interface where detailed and categorized monitoring logs can be searched and reviewed. With full fault tolerance, fail-over, recoverability and load balancing, Carfey Software is setting a new benchmark in this space.

Go to carfey.com to be notified shortly when you can have your own trial licence.