Aug 23, 2012
 

Some modern languages (Go, Erlang, etc.) claim to have better concurrency support than languages like Java or C/C++.  Their features are often touted as making it easier to “get concurrency right,” as though it’s somehow difficult to have similar capabilities in other languages.  Sometimes this sounds to me like the closures vs. objects debate.

To demonstrate what I mean, here’s one way to implement a Go-style message channel for inter-thread communication in C++ (using C++11’s thread support):

#include <condition_variable>
#include <list>
#include <mutex>
#include <thread>
template<class item>
class channel {
private:
  std::list<item> queue;
  std::mutex m;
  std::condition_variable cv;
public:
  void put(const item &i) {
    std::unique_lock<std::mutex> lock(m);
    queue.push_back(i);
    cv.notify_one();
  }
  item get() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [&](){ return !queue.empty(); });
    item result = queue.front();
    queue.pop_front();
    return result;
  }
};

That wasn’t so hard now, was it?

This class is an example of a monitor; it encapsulates its own synchronization primitives.  By building them in, users of this class never need to think about synchronization when putting items in or getting items out of the channel.  By keeping them private and unlocking when exiting the member functions, there’s no chance of a dueling-mutex deadlock.  This approach of keeping synchronization inside re-usable code simplifies multi-threaded programs and avoids potential bugs.
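
As a rough illustration of how the class might be used, here’s a sketch in which one thread puts integers into a channel while the calling thread gets them out.  The function name sum_through_channel and the idea of summing the items are made up for this example; the channel itself is the class above, repeated (with the headers for std::mutex and std::condition_variable) so the example stands alone.

```cpp
#include <condition_variable>
#include <list>
#include <mutex>
#include <thread>

// Same channel as above, repeated so this example stands alone.
template<class item>
class channel {
private:
  std::list<item> queue;
  std::mutex m;
  std::condition_variable cv;
public:
  void put(const item &i) {
    std::unique_lock<std::mutex> lock(m);
    queue.push_back(i);
    cv.notify_one();
  }
  item get() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [&](){ return !queue.empty(); });
    item result = queue.front();
    queue.pop_front();
    return result;
  }
};

// Hypothetical demo: a producer thread sends 1..count, and the caller
// receives exactly count items and adds them up.
int sum_through_channel(int count) {
  channel<int> ch;
  std::thread producer([&ch, count]() {
    for(int i = 1; i <= count; ++i)
      ch.put(i);
  });
  int sum = 0;
  for(int i = 0; i < count; ++i)
    sum += ch.get();  // blocks until the producer has put something
  producer.join();
  return sum;
}
```

Neither thread touches the mutex or the condition variable directly.  All of the locking stays inside put and get.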

To be fair, there are at least three significant differences between this and a channel in Go:

  1. Go channels use a fixed-size buffer, whereas this class uses an unbounded one.  Putting a message into a Go channel that’s full will block until there’s room in the buffer.  The “put” member function in this class never blocks, though it will allocate memory.
  2. Go channels can be “closed” like a socket or a pipe to indicate to the receiver that no more messages will ever be sent.
  3. Go’s select statement can wait on several channels at once, or (when it has a default case) check them without blocking, whereas the get member function here works on a single channel and always blocks until an item is available.
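
The first difference isn’t hard to address either.  Here’s a minimal sketch of a bounded variant, assuming a capacity of at least 1 (a capacity of 0 would make put block forever here, unlike Go’s unbuffered channels, which hand items over directly).  The names bounded_channel, not_empty, not_full and sum_through_bounded are my own for this example.

```cpp
#include <condition_variable>
#include <cstddef>
#include <list>
#include <mutex>
#include <thread>

// Hypothetical bounded variant: put() blocks while the buffer is full.
template<class item>
class bounded_channel {
private:
  std::list<item> queue;
  std::mutex m;
  std::condition_variable not_empty;  // signalled when an item is added
  std::condition_variable not_full;   // signalled when an item is removed
  const std::size_t capacity;         // assumed to be at least 1
public:
  explicit bounded_channel(std::size_t cap) : capacity(cap) { }
  void put(const item &i) {
    std::unique_lock<std::mutex> lock(m);
    not_full.wait(lock, [&](){ return queue.size() < capacity; });
    queue.push_back(i);
    not_empty.notify_one();
  }
  item get() {
    std::unique_lock<std::mutex> lock(m);
    not_empty.wait(lock, [&](){ return !queue.empty(); });
    item result = queue.front();
    queue.pop_front();
    not_full.notify_one();
    return result;
  }
};

// Hypothetical demo: push count items through a small buffer.
int sum_through_bounded(int count, std::size_t cap) {
  bounded_channel<int> ch(cap);
  std::thread producer([&ch, count]() {
    for(int i = 1; i <= count; ++i)
      ch.put(i);  // blocks whenever cap items are already waiting
  });
  int sum = 0;
  for(int i = 0; i < count; ++i)
    sum += ch.get();
  producer.join();
  return sum;
}
```

Using two condition variables keeps the wake-ups targeted: a get only wakes a waiting put, and a put only wakes a waiting get.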

Let’s see what it would take to expand the code above a little to address #2 and #3.

#include <condition_variable>
#include <list>
#include <mutex>
#include <stdexcept>
#include <thread>
template<class item>
class channel {
private:
  std::list<item> queue;
  std::mutex m;
  std::condition_variable cv;
  bool closed;
public:
  channel() : closed(false) { }
  void close() {
    std::unique_lock<std::mutex> lock(m);
    closed = true;
    cv.notify_all();
  }
  bool is_closed() {
    std::unique_lock<std::mutex> lock(m);
    return closed;
  }
  void put(const item &i) {
    std::unique_lock<std::mutex> lock(m);
    if(closed)
      throw std::logic_error("put to closed channel");
    queue.push_back(i);
    cv.notify_one();
  }
  bool get(item &out, bool wait = true) {
    std::unique_lock<std::mutex> lock(m);
    if(wait)
      cv.wait(lock, [&](){ return closed || !queue.empty(); });
    if(queue.empty())
      return false;
    out = queue.front();
    queue.pop_front();
    return true;
  }
};

Since the channel can now be closed, the return value of “get” indicates whether any item was received.  This could be used to process items until a channel is closed and drained:

my_item received;
while(my_channel.get(received)) {
  //...
}
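
To make that loop concrete, here’s a sketch with a producer thread that puts a few integers and then closes the channel.  The function name drain_sum is invented for this example, and the channel class is the expanded one from above, repeated so the example is self-contained.

```cpp
#include <condition_variable>
#include <list>
#include <mutex>
#include <stdexcept>
#include <thread>

// The expanded channel from above, repeated so this example stands alone.
template<class item>
class channel {
private:
  std::list<item> queue;
  std::mutex m;
  std::condition_variable cv;
  bool closed;
public:
  channel() : closed(false) { }
  void close() {
    std::unique_lock<std::mutex> lock(m);
    closed = true;
    cv.notify_all();
  }
  bool is_closed() {
    std::unique_lock<std::mutex> lock(m);
    return closed;
  }
  void put(const item &i) {
    std::unique_lock<std::mutex> lock(m);
    if(closed)
      throw std::logic_error("put to closed channel");
    queue.push_back(i);
    cv.notify_one();
  }
  bool get(item &out, bool wait = true) {
    std::unique_lock<std::mutex> lock(m);
    if(wait)
      cv.wait(lock, [&](){ return closed || !queue.empty(); });
    if(queue.empty())
      return false;
    out = queue.front();
    queue.pop_front();
    return true;
  }
};

// Hypothetical demo: the producer sends 1..count and closes the channel;
// the caller keeps receiving until the channel is closed and drained.
int drain_sum(int count) {
  channel<int> ch;
  std::thread producer([&ch, count]() {
    for(int i = 1; i <= count; ++i)
      ch.put(i);
    ch.close();  // tells the receiver that nothing more is coming
  });
  int sum = 0;
  int received;
  while(ch.get(received))
    sum += received;
  producer.join();
  return sum;
}
```

Items that were put before the close are still delivered.  The loop only ends once the channel is both closed and empty.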

The second argument to the revised “get” controls whether it blocks until an item is available (or the channel is closed). To do something like a non-blocking “select” (one with a default case) over multiple channels, you might do this:

item_type1 item1;
item_type2 item2;
if(channel1.get(item1,false)) {
  // ...
} else if(channel2.get(item2,false)) {
  // ...
}
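
Wrapped in a loop, the same idea can service two channels until both are finished.  What follows is only a sketch: it polls (yielding the CPU when nothing is ready) rather than sleeping until either channel has something, which is what a blocking select in Go would do.  The function name poll_two_channels and the two producers are invented for this example.  Note that the closed flags are sampled before the gets; a channel seen as closed and then found empty can never receive another item, so it’s safe to stop polling it.

```cpp
#include <condition_variable>
#include <list>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>

// The expanded channel from above, repeated so this example stands alone.
template<class item>
class channel {
private:
  std::list<item> queue;
  std::mutex m;
  std::condition_variable cv;
  bool closed;
public:
  channel() : closed(false) { }
  void close() {
    std::unique_lock<std::mutex> lock(m);
    closed = true;
    cv.notify_all();
  }
  bool is_closed() {
    std::unique_lock<std::mutex> lock(m);
    return closed;
  }
  void put(const item &i) {
    std::unique_lock<std::mutex> lock(m);
    if(closed)
      throw std::logic_error("put to closed channel");
    queue.push_back(i);
    cv.notify_one();
  }
  bool get(item &out, bool wait = true) {
    std::unique_lock<std::mutex> lock(m);
    if(wait)
      cv.wait(lock, [&](){ return closed || !queue.empty(); });
    if(queue.empty())
      return false;
    out = queue.front();
    queue.pop_front();
    return true;
  }
};

// Hypothetical demo: poll two channels of different types until both are
// closed and drained, and return how many items were handled in total.
int poll_two_channels(int n1, int n2) {
  channel<int> numbers;
  channel<std::string> words;
  std::thread p1([&numbers, n1]() {
    for(int i = 0; i < n1; ++i) numbers.put(i);
    numbers.close();
  });
  std::thread p2([&words, n2]() {
    for(int i = 0; i < n2; ++i) words.put("w");
    words.close();
  });
  int handled = 0;
  bool done1 = false, done2 = false;
  while(!done1 || !done2) {
    // Sample the closed flags first, then poll without blocking.
    bool closed1 = numbers.is_closed();
    bool closed2 = words.is_closed();
    int number;
    std::string word;
    if(numbers.get(number, false)) {
      ++handled;
    } else if(words.get(word, false)) {
      ++handled;
    } else {
      // Both were empty; one that was already closed is finished for good.
      if(closed1) done1 = true;
      if(closed2) done2 = true;
      if(!done1 || !done2) std::this_thread::yield();
    }
  }
  p1.join();
  p2.join();
  return handled;
}
```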

Once you have an understanding of the synchronization primitives available in C++ (or Java, or other similar languages with multi-threading), it’s pretty easy to write your own re-usable code like this.  While it’s possible to make mistakes in any code, even the expanded version is only about 40 lines of code.  It shouldn’t be much work to make something like this bug-free.

Why bother with this at all, when one could just use Go instead of C++ and get built-in channels?  One reason is greater control over the behavior of the message channel.  I’ll describe a specific example that came up in software that I was maintaining.  It was using something essentially identical to the channel class above.  It received requests from clients over TCP and used the channel to distribute them to multiple threads which serviced those requests.  When doing some performance analysis we discovered that some client hosts were submitting more requests than others.  Since the channel was a FIFO (just as implemented above) these busier clients were getting more attention from the server, slowing down the response time to other clients.  To fix this, we changed the channel by separating the requests based on the client host.  The “get” member function would round-robin through the client hosts first, then extract the next request from the selected host.  This ensured that greedy clients could not starve out others.  Because the message passing was in a class like the one described here and all the information needed to implement this new prioritization was in the messages themselves, making this change was completely transparent to the rest of the code.
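
A minimal sketch of that idea (not the actual code from that project) might look like the following.  The request struct, its host and id fields, and the name fair_channel are all invented for illustration; the only real requirement is that each message identifies the client host it came from.  Hosts are visited in sorted order here simply because std::map makes that convenient.

```cpp
#include <condition_variable>
#include <list>
#include <map>
#include <mutex>
#include <string>

// Hypothetical request type: each message names its client host.
struct request {
  std::string host;
  int id;
};

// Hypothetical fair channel: FIFO within a host, round-robin across hosts.
class fair_channel {
private:
  // Pending requests grouped by host; a host is only present while it
  // has at least one pending request.
  std::map<std::string, std::list<request>> per_host;
  std::string last_host;  // host served by the most recent get()
  std::mutex m;
  std::condition_variable cv;
public:
  void put(const request &r) {
    std::unique_lock<std::mutex> lock(m);
    per_host[r.host].push_back(r);
    cv.notify_one();
  }
  request get() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [&](){ return !per_host.empty(); });
    // Pick the first host after the one served last time, wrapping around.
    auto it = per_host.upper_bound(last_host);
    if(it == per_host.end())
      it = per_host.begin();
    request result = it->second.front();
    it->second.pop_front();
    last_host = it->first;
    if(it->second.empty())
      per_host.erase(it);  // drop hosts with nothing left to serve
    return result;
  }
};
```

With three requests queued from host “a” and one each from “b” and “c”, successive calls to get would return them in the order a, b, c, a, a rather than a, a, a, b, c.  The interface is still just put and get, so callers don’t need to change.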

I’m not trying to make an argument against using Go.  It’s an interesting language that has a lot of good ideas.  But I don’t see how a feature like channels is much of a reason to choose it over C++, or how concurrency is much easier to “get right” with Go.  Channels themselves are easy (as I think I’ve demonstrated).  The difficult stuff is at a higher, more abstract level of the design.  It’s just as difficult whether you get a few tools like channels for free or not.