Clojure Software Transactional Memory (STM)

788 VIEWS

The upsurge of multi-core and multi-CPU computers demands that applications become more concurrent to effectively utilize increased computational capacity. Concurrency has traditionally been managed using locks, which unfortunately have proven to be less than ideal as they often result in deadlocks, starvation, race conditions, and error-prone code that can be difficult to debug. In this post, we will explore how to leverage Clojure’s Software Transactional Memory (STM) to craft programs that better utilize modern hardware.

Clojure’s way

Clojure addresses this issue of state and concurrency by providing immutable data structures and language-level semantics for safe concurrency through managed references to state, facilitated by a runtime that implements a control mechanism called Software Transactional memory (STM).

STM is a software-level design for controlling access to shared data in memory. It works in a way that is similar to database transactions (only that it pertains to controlling access to in-memory data as opposed to permanent data storage).

STM transactions ensure that changes are done atomically—that is, either all changes associated with a transaction are applied, or if the transaction fails, the changes are rolled back. In addition, changes must be done consistently, meaning a transaction will fail if changes fail to satisfy certain specified constraints. Lastly, changes must happen in an isolated manner. All changes made to data in a transaction are made visible only to the thread that made the changes inside the transaction.

The Clojure STM uses multi-version control concurrency which works by taking a snapshot of data before a transaction is commenced. Changes made to the snapshot are invisible to the outside world (threads) until changes are committed at the end of a successfully completed transaction.

Clojure provides the ref (reference) construct for managing data that allows synchronous and
coordinated changes. All modifications to refs must occur inside an STM transaction, which is specified using the dosync syntax.

The bank scenario

To demonstrate how this works, let’s consider a typical bank account-handling scenario. It is possible to make a lot of transactions on an account or even across several accounts. The account, however, is expected to maintain a consistent balance throughout all these transactions. Let’s create a function that uses a ref to create an account given the account name, account number and an initial starting balance.
(defn create-account [ account-name account-number balance ]
( ref { :account-name account-name
:account-number account-number
:balance balance
} :validator allowable-balance? ) )

(defn allowable-balance? [ { :keys [ balance ]}]
( or ( > balance 0 )
(throw (IllegalStateException. "Balance cannot be less than zero" ) )
))

This function returns a ref which represents an account when it is called. We have also added a
validator function to ensure that an account balance is always greater than zero. (We are a bank, after all.)

Now, let’s create three account numbers.

(def first-account ( create-account "Robert" 300045 120 ) )
( def second-account (create-account "Mike" 30046 500 ) )
( def third-account (create-account "Rose" 30047 200 ) )

This creates three account refs with the given details, respectively. Now, suppose Mike decides to be charitable and instructs the bank to make a transfer of $200 from his account to Robert’s account.

To do this, two things must be done. First, we need to debit Mike’s account by an amount of $200 and then credit it to Robert’s account. To the outside world, these two operations must be seen as happening at the same time (atomic). The transfer needs to occur in a way that the money being transferred is only in one account at a time from the perspective of any outside observers.

To do that, Clojure requires that this be done in a transaction using the dosync construct.
Let’s define a function called make-transfer that transfers a specified amount of money from one
account to the other.

(defn make-transfer [ from-account to-account transfer-amount ]
(dosync
(alter from-account update-in [ :balance ] - transfer-amount )
(alter to-account update-in [ :balance ] + transfer-amount )))

Now, let’s issue the transfer.

(transfer second-account first-account 200 )
(println "first account -> " @first-account "Second account -> " @second-account )

This is the output as displayed on my screen.

first account -> {:account-name Robert, :account-number 300045, :balance 320} Second account ->
{:account-name Mike, :account-number 30046, :balance 300}

However, a transfer like this fails:

(make-transfer second-account third-account 600 )
CompilerException java.lang.IllegalStateException: Balance cannot be less than zero, compiling:(~/clojure/concurrency.clj:44:1)

We could have performed these two operations on separate threads like this:

(future (make-transfer second-account first-account 100 ) )
(future (make-transfer second-account third-account 600) )

Understanding the alter function

The alter function is used to atomically mutate a ref. It is invoked by supplying the ref and a function that takes the ref as an argument and returns a value that will be used as the new value of the ref. Because Clojure allows several transactions to be executed concurrently in multiple threads, a transaction is only allowed to commit if the snapshot of the ref that was given to the alter function is the same as the current value of the ref—or else all changes will be dropped, and the transaction will be retried with the latest value of the ref. This lock-free approach allows threads to freely execute transactions without being blocked, including those that are solely reading.

The commute function

To limit the number of retries that have to be made at commit time by the alter function, the commute function can be used in cases where the order of the function application is insignificant. It is used in cases where it matters little which thread commits first at the end of two transactions.

So in a case where the order the bank transfers are issued in doesn’t really matter, the make-transfer function could have been implemented like this:

(defn make-transfer-2 [ from-account to-account transfer-amount ]
(dosync
(commute from-account update-in [ :balance ] - transfer-amount )
(commute to-account update-in [ :balance ] + transfer-amount )
))

Conclusion

Multi-core and multi-CPU computers have come to stay, and we will continue to witness improvements—at least for the time being. Therefore, a programing language that makes it easier to program in such an environment is of grave importance. Clojure, a dynamically typed functional language, thrives within such domains, since it was designed with concurrency in mind. So the next time you think about writing your next (or first) concurrent application, you might want to check out Clojure.


Robert Antwi is currently with Data and Management Systems Limited (DMSL) Ghana where he works with a team of data scientists to provide creative and innovative learning-based solutions.


Discussion

Click on a tab to select how you'd like to leave your comment

Leave a Comment

Your email address will not be published. Required fields are marked *

Menu