When two components, call them A and B, are used together, it is a good idea to figure out which one is using the other: if A is using B, then B should not know about A (and likewise with the roles swapped).
This rule of thumb lowers complexity and makes both refactoring and re-use of code easier.
One scenario where it might be appealing to ignore this rule is when outsourcing computation to a worker thread, but here it is actually more important to stick with it.
Let us say we want to search folders recursively and provide the user with status about where we are in the process.
To provide this status we can have the worker thread send a message back to the main thread to let it know which folder it is presently searching, but this breaks the rule! The main thread sets up the worker thread and will also terminate it should the user abort the search, so the main thread clearly knows about the worker thread (and needs to). If the worker thread sends back messages, then it also knows about the main thread.
If we do synchronous message passing, this simple design can lead to a deadlock. For example, if the worker thread sends back a status update at the same moment the main thread sends a terminate message to the worker, both threads are stuck waiting for the other to acknowledge its message.
By using asynchronous message passing we avoid the deadlock but instead introduce potential race conditions. The main thread may send a terminate message to an already completed worker thread, because it hasn’t received a “did terminate” message yet, or the worker thread may send a status update to the main thread after the main thread sent a terminate message.
This may lead to messages being sent to disposed objects or to resources being leaked. It is not impossible to “get right”, but it is definitely not a simple problem.
While polling in general should be avoided, it fits this problem very well. Our search code will look something like the following (C++):
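#include <atomic>
#include <filesystem>
#include <functional>
#include <mutex>
#include <string>
#include <system_error>
#include <utility>
#include <vector>

class searcher
{
    typedef std::function<bool(std::filesystem::path const&)> criterion_t;

    // std::atomic instead of volatile: volatile does not make
    // cross-thread reads and writes well-defined in C++.
    std::atomic<bool> keep_running, done;
    std::mutex mutex;                 // guards results
    std::vector<std::string> results;
    criterion_t matches;              // assumption: the caller supplies the match test

public:
    // By default every entry matches; pass a predicate to narrow the search.
    searcher (criterion_t criterion = [](std::filesystem::path const&){ return true; })
        : keep_running(true), done(false), matches(std::move(criterion)) { }

    void start_search (std::string const& src)
    {
        namespace fs = std::filesystem;
        std::vector<std::string> toSearch(1, src);
        while(keep_running && !toSearch.empty())
        {
            std::vector<std::string> tmp;
            std::string const folder = toSearch.back();
            toSearch.pop_back();

            // Folders that cannot be read are skipped instead of throwing.
            std::error_code ec;
            for(fs::directory_iterator it(folder, ec), end; !ec && it != end; it.increment(ec))
            {
                if(matches(it->path()))
                    tmp.push_back(it->path().string());
                // Symlinked folders are not followed, to avoid cycles.
                if(it->is_directory(ec) && !it->is_symlink(ec))
                    toSearch.push_back(it->path().string());
            }

            std::lock_guard<std::mutex> lock(mutex);
            results.insert(results.end(), tmp.begin(), tmp.end());
        }
        done = true;
    }

    void stop_search ()
    {
        keep_running = false;
    }

    std::vector<std::string> get_results ()
    {
        std::vector<std::string> res;
        std::lock_guard<std::mutex> lock(mutex);
        res.swap(results);
        return res;
    }

    bool is_done () const
    {
        return done;
    }
};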
This encapsulates the searching but does not itself use a thread. The get_results member function is thread-safe, though, so a user can spawn a thread and call start_search in that thread. The main thread then starts a timer and periodically calls get_results (together with is_done).
When is_done returns true, the main thread knows that the search is done and can stop the timer (and delete the searcher object).
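To make the polling side concrete, here is a minimal sketch of what the main thread might do. It is not the real thing: fake_searcher is a hypothetical stand-in with the same member functions that "finds" numbers instead of files, run_and_poll is a name I made up, and the sleeping loop stands in for a GUI timer since there is no event loop here.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the searcher: same member functions,
// but it "finds" the numbers 0..n-1 instead of files.
class fake_searcher
{
    std::atomic<bool> keep_running{true}, done{false};
    std::mutex mutex;               // guards results
    std::vector<int> results;
public:
    void start_search (int n)
    {
        for(int i = 0; keep_running && i < n; ++i)
        {
            std::lock_guard<std::mutex> lock(mutex);
            results.push_back(i);
        }
        done = true;
    }
    void stop_search ()   { keep_running = false; }
    bool is_done () const { return done; }
    std::vector<int> get_results ()
    {
        std::vector<int> res;
        std::lock_guard<std::mutex> lock(mutex);
        res.swap(results);
        return res;
    }
};

// The main thread's side: spawn the worker, then poll until it is done.
// The worker never calls back into this code.
std::vector<int> run_and_poll (int n)
{
    fake_searcher s;
    std::thread worker([&s, n]{ s.start_search(n); });

    std::vector<int> all;
    bool finished = false;
    while(!finished)
    {
        // Stand-in for waiting on the next timer tick.
        std::this_thread::sleep_for(std::chrono::milliseconds(1));

        // Read the flag first, then drain, so the last batch is not missed.
        finished = s.is_done();
        std::vector<int> batch = s.get_results();
        all.insert(all.end(), batch.begin(), batch.end());
    }
    worker.join();
    return all;
}
```

Note that calling stop_search on a searcher that has already finished is harmless in this design, since it only sets a flag; there is no message that can arrive at the wrong time.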
In addition to avoiding the potential deadlock and/or race conditions, two other advantages of this approach are:
I started by writing that if A knows about B, B should not know about A. When deciding which of the two should know about the other, the component most likely to be re-used is the one that should not know about the other.
In the above example we made the search code the candidate for re-use by not letting it have any dependencies (knowledge about other objects). In an MVC pattern it is the view and the model we want to re-use, and so these do not know about any of the other parts.
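As a rough illustration of the MVC case, here is a small sketch in which only the controller knows about the other two parts. All the names (temperature_model, label_view, temperature_controller) are made up for the example, and a real view would of course draw something rather than store a string.

```cpp
#include <string>

// Model: plain data, no reference to a view or a controller.
struct temperature_model
{
    double celsius = 0.0;
};

// View: displays whatever text it is given, no reference to a model
// or a controller.
struct label_view
{
    std::string text;
    void show (std::string const& s) { text = s; }
};

// Controller: the only part that knows about the other two.
class temperature_controller
{
    temperature_model& model;
    label_view& view;
public:
    temperature_controller (temperature_model& m, label_view& v)
        : model(m), view(v) { }

    void set_celsius (double c)
    {
        model.celsius = c;
        // Whole degrees are enough for this sketch.
        view.show(std::to_string(static_cast<int>(c)) + " C");
    }
};
```

Both temperature_model and label_view can be lifted into another program unchanged; only the controller would need to be rewritten.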