In this month’s column, we continue our discussion of the threading support introduced in the new C++11 standard. We first look at how threads can be managed.
In last month’s column, we discussed how to use std::thread to create threads and how std::mutex can be used to protect shared data from concurrent access by multiple threads. While it is possible to use mutexes to protect global variables, the general practice is to group a mutex and the data it protects together in the same class. The member functions of the class acquire the appropriate mutex before accessing or updating the shared data.
The intent is to hide the implementation of the synchronised access to the protected data inside the class. However, the burden of making sure that data is appropriately protected by locks and is always accessed after acquiring the required locks is on the programmer and is not automatically enforced by the language. For instance, consider the following example:
class my_data {
private:
    // members...
public:
    // do_somework performs a sensitive operation.
    // Proper locks need to be held, since concurrent
    // threads can execute this.
    void do_somework();
};

class my_data_wrapper {
private:
    my_data data1;
    std::mutex my_data_mutex;
public:
    template<typename Function>
    void PerformOperation(Function my_func) {
        my_data_mutex.lock();
        my_func(data1);
        my_data_mutex.unlock();
    }
};

// Now consider an unprotected use of my_data
my_data* unprotected;

void bad_func(my_data& data1) {
    unprotected = &data1;
}

void foo() {
    my_data_wrapper A;
    // Here we expose the protected data through a pointer
    A.PerformOperation(bad_func);
    // Now we can circumvent my_data_wrapper and invoke
    // functions on the unprotected data directly
    unprotected->do_somework();
}
What is wrong with the above piece of code? The original intent of wrapping the my_data
class in the wrapper class my_data_wrapper
was to make sure that all users end up using the my_data
class after acquiring the necessary lock, namely, my_data_mutex
. This is achieved by calling the PerformOperation
function of my_data_wrapper
class with the function my_func to be invoked, so that the lock my_data_mutex is acquired first, then my_func is executed, and once my_func completes, the mutex is released.
However, it is possible to circumvent this protection mechanism, as shown above, by storing a pointer to the “sensitive”/“protected” data in a global pointer and using that pointer to invoke a function that operates on the protected data directly, bypassing the my_data_wrapper
class. This results in bypassing the lock acquire/release and hence can lead to race conditions if executed by concurrent threads.
Therefore, it is important to note that passing out pointers or references to protected data and bypassing the locks associated with them can lead to race conditions and the onus for ensuring consistency is on the programmers.
What are Lock Guards?
In the above code snippet, we called the lock and unlock functions of std::mutex directly. However, the recommended practice is to avoid calling these functions directly. Why do we want to discourage developers from calling lock/unlock explicitly?
The reason is that the developer has to ensure that the mutex is unlocked on every code path once the operation on the protected data is completed. For instance, if an exception is thrown inside my_func (which was called from PerformOperation in the above code snippet) and control exits through the exception path, the lock is never released.
Recall the RAII idiom which stands for Resource Acquisition Is Initialisation. In order to support RAII for mutexes, C++ provides the std::lock_guard
class template. A std::lock_guard locks the mutex supplied to it as a constructor argument and releases it on destruction. This ensures that the locked mutex is unlocked on every path of execution when the corresponding lock_guard object is destroyed.
Now we can rewrite the PerformOperation
function of my_data_wrapper
class using std::lock_guard
as shown below:
class my_data_wrapper {
private:
    my_data data1;
    std::mutex my_data_mutex;
public:
    template<typename Function>
    void PerformOperation(Function my_func) {
        std::lock_guard<std::mutex> my_lock_guard(my_data_mutex);
        my_func(data1);
    }
};
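The exception-safety benefit can be seen in a small sketch (the function risky_operation and the names below are illustrative, not from the original code): even when an exception propagates out, the lock_guard’s destructor releases the mutex during stack unwinding.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex m;

// Illustrative operation that fails while the lock is held
void risky_operation() { throw std::runtime_error("failure"); }

// Without RAII: the unlock below is never reached when
// risky_operation throws, so the mutex stays locked forever.
void manual_locking() {
    m.lock();
    risky_operation();
    m.unlock();
}

// With RAII: the lock_guard's destructor runs during stack
// unwinding and releases the mutex on the exception path too.
void raii_locking() {
    std::lock_guard<std::mutex> guard(m);
    risky_operation();
}
```

Calling raii_locking() and catching the exception leaves the mutex free for the next caller, whereas manual_locking() would leave it permanently locked.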
So far we have skirted around the issues of how an application can acquire the multiple locks it needs. For instance, consider the simple example of two tables, A and B, each protected with their own mutex locks, A_lock
and B_lock
, respectively. So any operation on Table A will acquire the A_lock
, perform all operations on it and release the A_lock
. Similar conditions hold for operations on Table B.
Now consider the case of an operation Move_from_TableA_To_TableB
where we want to move an item from Table A to Table B. We will now need to hold both locks before we perform the move. So let us acquire the lock on Table A first and then the lock on Table B, perform the “move” operation and release the locks once done.
Now, if there is another operation Move_from_TableB_To_TableA
, we need to ensure that the order in which the locks are acquired in this operation is the same as the order in which we acquire the locks in Move_from_TableA_To_TableB
. Else, it can lead to a deadlock situation where one thread which is performing Move_from_TableA_To_TableB
has acquired the lock for Table A and is waiting for the lock for Table B, while another thread which is performing Move_from_TableB_To_TableA
has acquired the lock for Table B and is trying to acquire the lock for Table A.
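The fix described above can be sketched as follows, assuming for illustration that the tables are simple std::vector&lt;int&gt; containers (the names here are hypothetical): both move operations acquire the locks in one agreed order, A_lock first, so the circular wait can never arise.

```cpp
#include <mutex>
#include <vector>

// Illustrative tables and their locks
std::vector<int> tableA{1, 2, 3};
std::vector<int> tableB;
std::mutex A_lock;
std::mutex B_lock;

// Both move operations take the locks in the SAME order
// (A_lock first), so two threads can never each hold one
// lock while waiting for the other.
void Move_from_TableA_To_TableB() {
    std::lock_guard<std::mutex> ga(A_lock);
    std::lock_guard<std::mutex> gb(B_lock);
    if (!tableA.empty()) {
        tableB.push_back(tableA.back());
        tableA.pop_back();
    }
}

void Move_from_TableB_To_TableA() {
    std::lock_guard<std::mutex> ga(A_lock);  // A_lock first here too, not B_lock
    std::lock_guard<std::mutex> gb(B_lock);
    if (!tableB.empty()) {
        tableA.push_back(tableB.back());
        tableB.pop_back();
    }
}
```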
Hence, it is necessary for the programmer to define and follow a lock order in order to avoid deadlocks. C++ helps the programmer by providing the std::lock
function. This function can take multiple arguments of type std::mutex
and can lock them without the possibility of a deadlock.
std::mutex my_mutex1;
std::mutex my_mutex2;

void foo() {
    std::lock(my_mutex1, my_mutex2);
    // do work
    my_mutex1.unlock();
    my_mutex2.unlock();
}
std::lock
provides “all or nothing” semantics when acquiring the locks: either both locks are acquired or neither is. If an exception is thrown after the first mutex has been acquired successfully but before the second one is, the first mutex is automatically released.
However, it is still possible to end up in a deadlock situation in C++, if we cannot acquire all the locks in a single call to std::lock
due to application design issues, since we may need to acquire the locks at different points in the program. Hence, the responsibility of avoiding deadlocks rests with the programmer.
While std::lock_guard
and std::lock
cover most of the basic functionality needed for ensuring that shared data is protected, C++11 also offers certain additional functionality. One of them is std::unique_lock
which we will cover in next month’s column.
My ‘must-read book’ for this month
This month’s suggestion for a must-read book comes from one of our readers, Raghuram, who recommended Patterns for Parallel Programming by Timothy G Mattson, Beverly A Sanders and Berna L Massingill, published by Addison Wesley. As Raghuram says, “This book gives an excellent overview of the various techniques used for writing effective parallel code. The authors cover Java, OpenMP and MPI as the three parallel programming paradigms while explaining the parallel patterns. They also explain in detail lots of parallel programming jargon such as CC-NUMA, SPMD, ‘Explicitly Parallel’ and so on.”
Thank you, Raghuram, for the recommendation.
If you have any favourite programming puzzles that you would like to discuss on this forum, please send them to me, along with your solutions and feedback, at sandyasm_AT_yahoo_DOT_com. Till we meet again next month, happy programming and here’s wishing you the very best!
Feature image courtesy: Phillip Taylor. reused under the terms of CC-BY-SA 2.0 License.