Piecewise immutable programming
May 14, 2011 Leave a comment
Copyright © 2011 Gabriel Zs. K. Horvath
Although not a programming language expert at all, I will try to muster all my limited knowledge to briefly discuss software transactions in terms of programming model. Be aware that this is all very tentative…
On the one hand the level of isolation and immutability is very much reminiscent of functional programming, on the other hand the mutability of the versioned variables is essentially imperative programming. To better understand this relationship let’s see, very tentatively, how my model can be used to code in a “pure” imperative way or in a more functional one.
By wrapping every element of the code in its own transaction and let them communicate through unchecked IO operations we should be able to reproduce the imperative programming model. Every transaction which accesses a versioned variable will be either a read-only transaction or a write-only transaction and will always successfully commit. This takes us back to a “pure” form of imperative programming.
Since the versioned variables are eventually mutable, the order in which operations are performed on them is determinant, which is not compatible with the pure functional model. We might however move closer to the pure functional programming model by, for example, coding in a more functional style: an atomic block which consists of a single read-only function whose results are written at the end of the transaction might fit the bill. Note that within the scope of a single transaction any function that doesn’t perform any write or IO operation will have some of the properties of pure functions. Also the write operation could possibly be implemented using monads…
An entire program coded in this fashion could then consist of locally almost pure functional code blocks separated by commits; as these commit operations break the functional purity of the code by mutating the state, one could then call this programming model piecewise functional programming or piecewise quasi-imperative programming or just piecewise immutable programming.
Just like monads can be used to code in an imperative way in pure functional programming, software transactions could be a way of allowing more functional programming style in an essentially mutable state model.
I am currently working on the Khronos programming language which combines software transactions with join-calculus.
An extension consists in creating a fully transactional operating system which would extend the concept of transactions everywhere to all levels of software, KhronosOS. Application to distributed systems is another obvious avenue of research. Finally the application of software transactions to database systems is probably the most obvious and trivial application of this model and is the goal of my KhronosDb project.
The idea of versioning is an old one; it is widely used for example in databases and source control systems. Its power and usefulness seems however to be generally speaking underestimated.
The idea of semi-mutability came to me as I was struggling with mutable versioned variables, trying to distinguish reads of unmodified variables from reads of locally changed variables. At first it felt like an odd idea, but I quickly convinced myself that it is actually a very powerful idea:
- It is entirely in line with the idea of maximum isolation, not only is an atomic block is isolated from all writes, it is even isolated from its own writes.
- Variables become totally immutable inside a transaction
- It re-enforces the idea that only commits can modify variables
- I believe that there is little need for mutability within a transaction, reading locally modified variables is often just a shortcut to avoid using local variables
- It allows to perform rich queries at no extra cost while avoiding the creation of spurious conflicts
- Multiple threads can execute in parallel within a transaction with no risk of race conditions as the data is immutable
So I now think that local or immediate mutability is one of the little things worth giving up if it can buy us better concurrency.
Contiguous and notifications transactions
These are a natural extension to the atomicity of transactions. They will also be needed for most application of software transactions.
I have not recently carried out an exhaustive bibliographical research on topics related to software transactions. I will only mention the most obviously related work. Since I started working on this model about three and a half years ago I have notice a general convergence of similar ideas.
Obviously the software transactions model presented here didn’t come out of the blue. My original and essentially only source of inspiration was the “Versioned boxes as the basis for memory transactions” paper . Leveraging the semantics of data types has been around for well over twenty years in some form or another. This leveraging turns out to be particularly useful and natural in the context of software transactions. Semi-mutability is also mentioned in the latest version of Coherent Reactions . The idea of Transactions everywhere was first suggested in . Microsoft Research has a team working on some similar ideas, but I haven’t looked at their work except for their Channel 9 interview . Finally I haven’t seen any mention anywhere of other features of my model such as contiguous and notification transactions, silent reads, some aspects of semantics encapsulation, lazy merging of writes and finally semi-mutability in the context of transactions.
[Update 17-7-2011] I have now read some of the papers published by the Microsoft Research group . Their model is similar to version control systems hence the name Concurrent Revisions and seems to be more oriented towards parallelizing code than concurrency as it is based on forking out (branching out) of and joining back (integrating) into a thread. The forked threads are isolated from each other’s writes. The code can mix versioned and non-versioned variables, but it looks like all forked threads runs in the context of a revision (transaction). They focus on maximum parallelism and ease of merging, and ignore write-read conflicts. In one example their merge function (cumulative) supports only increments and decrements, but there doesn’t seem to be a way of enforcing this constraint. They also discuss the merging of abstract data types, e.g. sets and lists. Their approach relies on the actual merge function defining and leveraging the semantics. This approach leads them to consider merging revisions where a variable is set in one and incremented in another. They applied their revisions model to an existing game and produced some impressive speed ups (see Concurrent Programming with Revisions and Isolation Types).
 “Versioned boxes as the basis for memory transactions“, J. Cachopo and A. Rito-Silva, 2006, http://portal.acm.org/citation.cfm?id=1228566
 “Declarative Objects“, Jonathan Edwards, PDF http://coherence-lang.org/EmergingLangs.pdf
 “Transactions Everywhere“, Bradley C. Kuszmaul and Charles E. Leiserson, http://dspace.mit.edu/bitstream/handle/1721.1/3692/CS028.pdf?sequence=2
 Concurrent Revisions: Programming With Versioned Memory, http://research.microsoft.com/en-us/projects/revisions/
[October 2013] Minor updates