When to Use Pessimistic Locking

There are cases where we need a pessimistic locking strategy. While optimistic updates are the bare minimum for safe concurrent writes, pessimistic locking should be deployed only as part of a carefully thought-out design. We use pessimistic locking strategies in two primary cases:

  • As a semaphore to ensure only a single process executes a certain block of code at a time
  • As a semaphore for the data itself

Let’s first make sure we agree on the term semaphore. In this context, we’re talking about a binary semaphore, or a token. The token is either available or held by something, and it is required in order to proceed with a certain task. A request is made to obtain the token, and it is granted if available. When it isn’t available, the options are to wait for it, blocking until it becomes available, or to fail immediately.
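
To make the token semantics concrete, here’s a minimal in-process sketch using Java’s java.util.concurrent.Semaphore. In our case a database lock plays the role of the token, but the acquire/release semantics are the same:

    import java.util.concurrent.Semaphore;
    import java.util.concurrent.TimeUnit;

    public class TokenExample {
        // A binary semaphore: exactly one permit, i.e. one token.
        private final Semaphore token = new Semaphore(1);

        public void failFast() {
            if (!token.tryAcquire()) {
                return;                      // token held elsewhere: fail immediately
            }
            try {
                // ... proceed with the task that requires the token ...
            } finally {
                token.release();             // return the token
            }
        }

        public void waitForIt() throws InterruptedException {
            if (token.tryAcquire(30, TimeUnit.SECONDS)) {  // block up to 30 seconds
                try {
                    // ... proceed with the task ...
                } finally {
                    token.release();
                }
            }
        }
    }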

Using a semaphore to ensure that certain blocks in your code are run by only a single process at a given time is a great strategy for supporting load balancing and failover safely. We can group related changes under a token, requiring any other process endeavouring to do the same thing to wait for our lock to be released. When the process terminates, either successfully or abnormally, the token is returned by a commit or rollback.
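
Here’s a sketch of that pattern over JDBC. The app_lock table and the helper are hypothetical, shown only to illustrate the shape of the approach: a SELECT ... FOR UPDATE on a dedicated row acts as the token, and the surrounding transaction returns it.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class ExclusiveBlock {
        // Runs work while holding the row lock for lockName; the lock is held
        // until the transaction commits or rolls back.
        public static void runExclusively(Connection conn, String lockName, Runnable work)
                throws SQLException {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT lock_name FROM app_lock WHERE lock_name = ? FOR UPDATE")) {
                ps.setString(1, lockName);
                ps.executeQuery();   // blocks until any competing process releases the row
                work.run();          // the related changes, grouped under the token
                conn.commit();       // success returns the token
            } catch (SQLException | RuntimeException e) {
                conn.rollback();     // abnormal termination also returns the token
                throw e;
            }
        }
    }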

When used as a semaphore for the data itself, pessimistic locking allows us to ensure the data stays consistent for the duration of our block of code. Changes by other processes are prevented, ensuring that the actions we take are based on a consistent state for the duration of the code. For example, we may want to make a collection of changes to some foreign-key-related data based on the parent and need to ensure that the parent data remains unchanged. Pessimistically locking the parent row allows us to guarantee just that. And again, by relying on our existing infrastructure for managing the transaction, either successful completion or failure results in a commit or rollback that releases our lock.
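
A sketch of that scenario, with hypothetical orders and order_items tables: we lock the parent row, verify it’s in the state we require, and only then touch the children.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class ParentChildUpdate {
        public static void repriceItems(Connection conn, long orderId) throws SQLException {
            conn.setAutoCommit(false);
            try {
                try (PreparedStatement lock = conn.prepareStatement(
                        "SELECT status FROM orders WHERE order_id = ? FOR UPDATE")) {
                    lock.setLong(1, orderId);
                    try (ResultSet rs = lock.executeQuery()) {
                        if (!rs.next() || !"OPEN".equals(rs.getString("status"))) {
                            conn.rollback();   // parent missing or not in the required state
                            return;
                        }
                    }
                }
                try (PreparedStatement upd = conn.prepareStatement(
                        "UPDATE order_items SET price = price * 0.9 WHERE order_id = ?")) {
                    upd.setLong(1, orderId);
                    upd.executeUpdate();       // safe: the locked parent cannot change
                }
                conn.commit();                 // releases the lock
            } catch (SQLException e) {
                conn.rollback();
                throw e;
            }
        }
    }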

As mentioned, pessimistic locking does have the potential to cause performance problems. Most database implementations allow you either to not wait at all (thanks, Oracle!) or at least to wait no longer than a maximum amount of time for the lock to be acquired. Knowing your database implementation and its ins and outs is crucial if you’ll be using this approach.
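
For instance, Oracle supports SELECT ... FOR UPDATE NOWAIT to fail immediately and FOR UPDATE WAIT n to wait at most n seconds. Here’s a sketch of the fail-fast variant; the error-code check is Oracle-specific (ORA-00054 is the “resource busy” error), and the app_lock table is the same hypothetical one as above:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class FailFastLock {
        // Returns true if the token row was locked, false if another session holds it.
        public static boolean tryLock(Connection conn, String lockName) throws SQLException {
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT lock_name FROM app_lock WHERE lock_name = ? FOR UPDATE NOWAIT")) {
                ps.setString(1, lockName);
                ps.executeQuery();            // lock held until commit/rollback
                return true;
            } catch (SQLException e) {
                if (e.getErrorCode() == 54) { // ORA-00054: resource busy, NOWAIT specified
                    return false;             // fail fast instead of blocking
                }
                throw e;
            }
        }
    }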

Our new scheduler has been tested with hundreds of concurrent jobs running across dozens of instances without any collisions, deadlocks, or degradation in performance. Drop us a line if you’d like to learn more about how we were able to do so.

Database Concurrency

As previously discussed, the database is a crucial component of our new scheduler, as it is for most business systems. With an appropriately designed data model, well-written SQL, and well-thought-out transaction lifecycles, most database implementations will provide little cause for complaint, even with many concurrent requests.

But you need to think carefully about the potential issues you will face with concurrent writes, and even with writes based on reads that have since become stale. What this boils down to is having a strategy that ensures writes are performed only against an expected prior state. These strategies are commonly known as pessimistic and optimistic concurrency control.

Pessimistic control (or pessimistic locking) is so named because it takes the view that concurrent requests will frequently try to overwrite each other’s changes. Its solution is to place a blocking lock on the data, releasing the lock once the changes are complete or undone. A read lock would be sufficient, though a dummy write is frequently used to acquire the lock. All requests similarly require a blocking lock before beginning the read/write process. This approach requires state to be maintained from the read all the way through to the write, which is essentially impossible and entirely impractical for stateless applications such as web applications. It would also have significant performance implications if requests were continually waiting to acquire locks.

Optimistic control is just as safe as pessimistic control, but it assumes that concurrent writes to the same data will be infrequent, and it verifies during the write that the state is still as it was when the read was first made. Let’s look at a simple example to see what we’re talking about.

We’re tracking scores. I get a request to give William Thomas 5 more points. At the same time, you get a request to give him 3 more points. We both look at his current score, 87, and adjust accordingly. If we don’t verify at write time that the score is still 87, one of us will overwrite the other’s change and the score will be wrong.

An optimistic update ensures that the score is updated only if it is still 87 at the time of the write. Optimistic control has clear benefits when it sufficiently meets your requirements: it doesn’t block, giving you a performance benefit over pessimistic locking, and it’s just as safe.
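
Here’s a sketch of that optimistic update over JDBC, with an assumed player table and column names. The WHERE clause does the verification, and the affected-row count tells us whether our read was still current:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class ScoreService {
        // Returns false if the score was no longer expectedScore, i.e. our read was stale.
        public static boolean addPoints(Connection conn, long playerId,
                                        int expectedScore, int points) throws SQLException {
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE player SET score = ? WHERE player_id = ? AND score = ?")) {
                ps.setInt(1, expectedScore + points);
                ps.setLong(2, playerId);
                ps.setInt(3, expectedScore);
                return ps.executeUpdate() == 1;   // 0 rows: someone beat us to it
            }
        }
    }

When the update reports zero rows, the caller simply re-reads the current score and retries. That retry loop is the price optimistic control pays instead of blocking.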

Let’s briefly review the strategies for optimistic writes, be they updates or deletes.

  • Verify all updating row values against expected values
  • Verify only modified values against expected values
  • Verify that a version column, which changes with every write, is as expected

The first and second options potentially require a lot of previous state to be maintained along with the primary key. The third option requires only that our key and the row’s version be maintained. At Carfey Software, we use this third option for all writes. We view straight updates based solely on the primary key as having no real place in a concurrent application.
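
Here’s a sketch of a version-column write, again with assumed table and column names. The write carries only the primary key and the version from our read; since every write bumps the version, any intervening write, whatever columns it touched, invalidates ours:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class VersionedWrite {
        // Returns false if the row was modified since our read (version mismatch).
        public static boolean updateScore(Connection conn, long playerId,
                                          int newScore, long expectedVersion) throws SQLException {
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE player SET score = ?, version = version + 1 "
                  + "WHERE player_id = ? AND version = ?")) {
                ps.setInt(1, newScore);
                ps.setLong(2, playerId);
                ps.setLong(3, expectedVersion);
                return ps.executeUpdate() == 1;   // 0 rows: the row changed underneath us
            }
        }
    }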

Choosing a version column has some interesting implications for the transaction isolation level that is in play. Our next post will review those implications.