Lessons learned about locks

Locks are handy and unavoidable in parallel programming. Especially in some applications like databases, when there may be multiply threads modify one piece of data. Locks can make sure only one thread can get access and modify the data in the same time.

Morpheus key-value store rely on locks heavily. There is lock for segments in trunks, and lock for cells. Recently, I am trying to improve the performance of the store, and reduce memory footprint.

My first attempt is to replace the CellMeta Java objects with simple addresses (long data type) for each cell index. This can save at least 24 bytes each, that will be considerable amount of spaces when there may be millions of cell in the store per-node. But there will be one compromise. In the beginning, I use synchronized keyword to lock on cells. But when there is no object entity that have been replaced as address, there will also be no way to ues synchronized as locks. My first solution is to create 1024 locks for each trunk, each cell will assigned to one lock according by it's hash. That means, a locks may corresponding to multiply cells. It looks like a controllable solution, even JDK use this principle to implement it's ConcurrentHashMap. But it will produce deadlock you trying to make a lock chain in parallel. A lock chain means a cell will be locked when another cell be locked. When 2 or more locks be acquired in different order, deadlocks occurs. In the the key-value store case, deadlocks may happens even when lock targets are irrelevant because cells may share the same lock. After discover this defect, I implemented a lock cache. Because I don't want to create lock for every cell in the beginning, I only want lock objects to be created when it is needed to save memory space. When a cell is asked to be locked, the store will try to find the lock in the cache according to cell hash. If it was found, lock will locked, or lock will be created. If the lock is asked to be released and lock count of a cell is zero, the lock will be removed from cache and waiting for GC to remove it from memory. The cache did a little impact on performance, but more memory efficient. It is equivalent to previous synchronized way, a bit heavy but with able to provide read write lock feature rather than spin lock monitor.

I use reentrant locks in the first time. But I used to not fully understand what the term 'reentrant' means. It means you can lock the same lock in the same thread more than one times, without cause deadlocks[1]. This feature is useful to build more clean code without concern about the lock status. The rule of using a lock is lock and unlock appears in pairs regardless what happened between the lock period even a exception thrown. In Java and Clojure, unlock should be put in finally block of a try clause, catch block is irrelevant. When you are using reentrant lock, lock time must equal to unlock one, or, deadlock and exceptions will occur. This rule looks easy, but in some situation, developers still need to track the locks, like reference count in old school iOS programming without ARC. I made a mistake by use some judgements like if (! l.isLockedByCurrentThread()) l.lock() and if (l.isLockedByCurrentThread()) l.unlock(). This will make the lock been locked in the outer layer but be released in the inner layer, which will cause inconsistency.

Programming concurrent applications is a delicate work. Developer have to consider all of the possible corner cases and use locks prudently. Use locks too much or in unnecessary places will damage the performance, but not using it when required will cause unexpected states and data corruption. Some applications claimed they have better performance to others, but unstable in practice, it is likely because the developer traded stability for performance. Lots of the lock problems also are hard to been tested without proper production level stress, a mature project should test all of the possible circumstances before release..