
The trick behind uvw's clever wrapping of libuv

Illustrated by Julia Hanke

libuv is a library for cross-platform asynchronous I/O. It's also the library that powers Node.js.

For a C++ developer, there are well-known alternatives (such as asio). Nevertheless, there are still reasons to use libuv, and that's why many C++ developers turn to it. However, libuv is primarily a C library and requires extra effort to handle things that are no longer a problem in modern C++. Over time, C++ wrappers for libuv have been developed for different purposes. The goal is always to provide a less C-ish API and to automate some otherwise error-prone processes.

Among them, uvw seems to be the most appreciated by users. It stays true to the libuv API and provides a layer that manages its objects, and thus their memory, during an asynchronous call. Several times I've been asked what the trick was for this kind of thing. Every time, the person asking was surprised to find that the trick was a clever use of a void * and a few other things. This aspect is also particularly interesting for those who want to write their own wrapper around libuv or another C library. So let's see what lies behind uvw and how it solved this particular problem.

Introduction

libuv provides two abstractions to work with: the handles and the requests. The former are meant for long-lived objects (e.g., TCP handles) while the latter are used for short-lived operations usually performed over a handle (e.g., write requests on TCP handles). The event loop is the tool aimed at running handles and requests so as to perform the required operations. It follows the usual single-threaded asynchronous approach and allows multiple operations to run concurrently. The callbacks attached to these objects are invoked to notify the user that a given operation is finished.

I strongly suggest referring to the design overview section of the libuv documentation for more information on how it works. What is important is that handles and requests are usually dynamically allocated objects. Therefore, the burden of managing their memory is on the final user. This is where uvw comes into play. The library exploits an interesting trick that is applicable to the vast majority of C libraries when wrapped with C++. In particular, it combines:

  • Type erasure, a typical idiom widely used in C++;
  • The callback system of libuv and the possibility of attaching opaque data
    to its handles;
  • A controlled leak, that is, a way to manage the life cycles of its objects under the hood by means of an std::shared_ptr.

Similar to what happens in libuv, uvw also exposes some classes for handles and requests. However, it also succeeds in managing memory for the final user and providing an event model to benefit from the potential of C++ (lambda and such) rather than remaining anchored to plain function pointers.

Building Blocks

Erase Your Type

Type erasure is an idiom that usually combines one or more functions that define a behavior with some data on which these functions operate. The goal of the technique is to literally erase the type information from both the functions and the data, hence the name. A few different approaches to the problem exist. The typical one is also the one of interest for this article:

template<typename Type>
void prototype(void *ptr) {
    Type *instance = static_cast<Type *>(ptr);
    // ...
}

using function_t = void(void *);
function_t *my_class_handler = &prototype<my_class>;

The function exposes no information regarding the type for which it's used. However, it accepts an erased pointer to an instance and has enough information to know how to cast it to the right type for its purposes. The only thing we need to pay attention to is not passing pointers to objects that cannot be cast to Type. This is usually easily solved by pairing functions and instances before sending them around as opaque objects. In the context of libuv we'll use a slightly different model that fits better with what this library offers us. However, all techniques for type erasure boil down to the same idea, and it shouldn't be too difficult to understand it.
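To make the idea of pairing functions and instances more concrete, here is a minimal sketch. The names erased, make_erased and count_hits are hypothetical and exist only for this illustration, and my_class is reduced to a single counter. It isn't how uvw does it, just one possible way to keep a handler and its object together so that the cast always matches:

```cpp
#include <cassert>

// Hypothetical type used only for this illustration.
struct my_class {
    int hits{0};
};

// One instantiation per concrete type, all sharing the same erased signature.
template<typename Type>
void prototype(void *ptr) {
    Type *instance = static_cast<Type *>(ptr);
    ++instance->hits;
}

using function_t = void(void *);

// Function and instance travel together, so the cast inside the handler
// always matches the object it receives.
struct erased {
    function_t *handler;
    void *instance;

    void operator()() const {
        handler(instance);
    }
};

// The only place where the concrete type is still known.
template<typename Type>
erased make_erased(Type &instance) {
    return erased{&prototype<Type>, &instance};
}

// Invokes the erased handler twice and returns how many calls the object saw.
int count_hits() {
    my_class object;
    erased call = make_erased(object);
    assert(call.handler != nullptr);
    call();
    call();
    return object.hits;
}
```

Once make_erased has returned, nothing in the type erased mentions my_class anymore, yet every invocation still reaches the right object.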

The Doorway from C++ to C

Fortunately, libuv offers a hook for bridging C++ and C: the possibility of attaching a void * to handles and requests. In fact, almost all data structures in libuv expose a data member that can host arbitrary user-defined data and isn't used by the library. Moreover, all the callbacks receive as their first argument the handle, the request, or whatever other object generated them. As an example, this is the signature of the callback invoked by a timer once it has been started and its time has run out:

void (*)(uv_timer_t *)

This gives us the possibility of associating data to an asynchronous operation, which we'll soon lose track of but get back when it ends.

The most careful reader has probably already found the connection with the previous building block.
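For those who prefer to see the connection spelled out, the following sketch mimics the shape of a C library without depending on libuv. The types fake_handle and fake_cb, the function fake_fire and the class counter are all made up for the example. Only the pattern (an opaque data member plus a callback that receives the handle) mirrors what libuv offers:

```cpp
#include <cassert>

// Stand-ins for a C library: NOT libuv types, they only have the same shape.
struct fake_handle;
using fake_cb = void (*)(fake_handle *);

struct fake_handle {
    void *data; // opaque slot reserved for the user, never touched by the "library"
    fake_cb cb; // callback stored by the "library"
};

// What the C side would do when the operation completes.
void fake_fire(fake_handle *handle) {
    handle->cb(handle);
}

// C++ side: the object hides itself inside the handle and gets itself back later.
class counter {
    static void on_fire(fake_handle *handle) {
        // Reconstruct the erased type from the opaque pointer.
        auto *self = static_cast<counter *>(handle->data);
        ++self->fired;
    }

public:
    counter() {
        handle.data = this;
        handle.cb = &on_fire;
    }

    // handle.data points to this object, so copies would be misleading.
    counter(const counter &) = delete;
    counter &operator=(const counter &) = delete;

    fake_handle *raw() {
        return &handle;
    }

    int fired{0};

private:
    fake_handle handle{};
};

// Fires the handle twice and returns how many times the object was notified.
int fire_twice() {
    counter instance;
    assert(instance.raw()->data == &instance);
    fake_fire(instance.raw());
    fake_fire(instance.raw());
    return instance.fired;
}
```

The "library" never learns what a counter is. It only hands back the same pointer it was given.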

May the Leak Be with You

The other tool we'll need is the little-known class template std::enable_shared_from_this from the standard library. It allows users to create shared pointers that share the ownership of an instance, starting from the instance itself, as long as it's already managed by a shared pointer.

An example can explain these twisted words better than a thousand other sentences:

struct my_class: std::enable_shared_from_this<my_class> { /* ... */ };

auto instance = std::make_shared<my_class>();
auto other = instance->shared_from_this();

In the first line, we introduce a new type that inherits from the class std::enable_shared_from_this of the standard library. The more careful reader has perhaps also recognized the CRTP idiom, typical of the C++ language. This class provides our type with a member function shared_from_this that returns an std::shared_ptr<my_class> when invoked (as long as our objects are managed by std::shared_ptrs).

The other lines are there just to show how this mechanism works. They may seem useless here, since copying the pointer would have been enough to get another instance of it. However, imagine what happens when you receive a reference or a naked pointer to an object instead and you want to create a shared pointer from it. This is exactly the kind of problem this class solves and why it's so useful for our purposes.
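The case of the plain reference can be sketched as follows. The helper functions share, owners_after_sharing and throws_when_unmanaged are hypothetical names introduced for the example. The last one relies on the fact that, since C++17, calling shared_from_this on an object that isn't owned by a shared pointer throws std::bad_weak_ptr (before C++17 the behavior was undefined):

```cpp
#include <cassert>
#include <memory>

struct my_class: std::enable_shared_from_this<my_class> {};

// Receives a plain reference, yet it can return an owning pointer.
std::shared_ptr<my_class> share(my_class &ref) {
    return ref.shared_from_this();
}

// Returns the number of owners after rebuilding one from a reference.
long owners_after_sharing() {
    auto instance = std::make_shared<my_class>();
    auto other = share(*instance);
    // Both pointers refer to the same object and share the same control block.
    assert(other == instance);
    return instance.use_count();
}

// Returns true if shared_from_this refuses to work on an unmanaged object.
bool throws_when_unmanaged() {
    my_class on_stack;

    try {
        auto ptr = on_stack.shared_from_this();
        (void)ptr;
        return false;
    } catch(const std::bad_weak_ptr &) {
        return true;
    }
}
```

The second function is a reminder that the mechanism only works for objects created through std::make_shared or otherwise handed to a shared pointer first.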

Put the Pieces Together

With the few building blocks discussed above, we can finally get the job done. Unfortunately, ten articles wouldn't be enough to analyze examples for every object exposed by libuv. The same is true for a detailed analysis of uvw. Therefore, I'll show how to write a small wrapper for one of libuv's handles so as to understand the idea behind it all. Since I've already mentioned it, let's look at uv_timer_t, a handle used to schedule callbacks that will be called in the future. These are the functions of interest:

  • uv_timer_init: initializes the handle.
  • uv_timer_start: starts the timer and accepts the callback to fire when the
    timeout has elapsed.
  • uv_close: closes a handle, that is, a uv_handle_t.

Note that a timer is also a handle, and therefore it can be used with all the functions that accept the base type uv_handle_t. uv_close is particularly important because libuv considers a handle as alive until it's closed, no matter what. In other words, handles must always be initialized and closed.

Before going any further, let's see some code that finally puts the pieces together:

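#include <memory>
#include <uv.h>

class timer: public std::enable_shared_from_this<timer> {
    static void cb(uv_timer_t *handle) {
        // recover the erased type from the opaque data member
        auto *self = static_cast<timer *>(handle->data);
        self->callback(*self);
    }

    static void on_close(uv_handle_t *handle) {
        // libuv is done with the handle only now: move the self-reference
        // to the stack so that the object dies when this function returns
        auto *self = static_cast<timer *>(handle->data);
        auto keep = std::move(self->leak);
    }

public:
    using callback_type = void(timer &);

    timer() = default;

    void init(uv_loop_t *loop = uv_default_loop()) {
        uv_timer_init(loop, &handle);
        handle.data = this;
        // the controlled leak: requires the instance to be owned by a shared pointer
        leak = shared_from_this();
    }

    void start(unsigned int timeout, callback_type *func) {
        callback = func;
        uv_timer_start(&handle, &cb, timeout, 0u);
    }

    void close() {
        // closing is asynchronous, the leak is released in the close callback
        uv_close(reinterpret_cast<uv_handle_t *>(&handle), &on_close);
    }

private:
    std::shared_ptr<timer> leak;
    callback_type *callback{nullptr};
    uv_timer_t handle;
};

// ...

auto my_timer = std::make_shared<timer>();
my_timer->init();
my_timer->start(1000u, +[](timer &my_timer) {
    // ...
    my_timer.close();
});

// the callback fires only once the loop is running
uv_run(uv_default_loop(), UV_RUN_DEFAULT);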

What's going on here?

First of all, remember that we're talking about asynchronous calls. Our objects must outlive the calls themselves. This translates into dynamic allocations and a controlled leak so that we don't have to worry about keeping them alive. This is where std::enable_shared_from_this enters the design. Since the handle must always be initialized and closed explicitly, and since a reference to it is passed to our callback, we can store inside the object a shared pointer to the object itself and create a sort of leak that is completely under control: the object stays alive until the handle is closed, whether or not the caller still holds a pointer to it.

uv_timer_t.data is our void * aimed at containing a pointer to an object. libuv works with its own data structures and these are completely decoupled from our classes. Also, we don't want to leak the types from the underlying library to the caller space. Therefore, taking advantage of the type erasure due to the void * and the possibility of reconstructing the type inside the callback, we succeed in passing our objects silently inside libuv and receive them back at the right moment.

That's it. We've built a minimal object that completely separates the two worlds and allows us to manage memory in a transparent way, moving our objects back and forth from one asynchronous call to the next.
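To observe the controlled leak in isolation, without linking against libuv, one can imagine a toy loop like the one below. Everything here is hypothetical (fake_handle, fake_loop, task, lifetimes, run_example, the alive counter) and only meant as a sketch of the life cycle. It isn't code from uvw. The caller drops its own pointer right after submitting the operation, and the object still survives until its callback has run:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Stand-ins for a C library: NOT libuv types.
struct fake_handle {
    void *data;
    void (*cb)(fake_handle *);
};

// A toy loop that only stores pending handles and fires them on request.
struct fake_loop {
    std::vector<fake_handle *> pending;

    void run() {
        while(!pending.empty()) {
            fake_handle *handle = pending.back();
            pending.pop_back();
            handle->cb(handle);
        }
    }
};

int alive = 0; // number of task objects currently alive

class task: public std::enable_shared_from_this<task> {
    static void on_done(fake_handle *handle) {
        auto *self = static_cast<task *>(handle->data);
        // Move the self-reference to the stack: if it's the last owner,
        // the object is destroyed when this function returns.
        auto keep = std::move(self->leak);
    }

public:
    task() { ++alive; }
    ~task() { --alive; }

    void submit(fake_loop &loop) {
        handle.data = this;
        handle.cb = &on_done;
        leak = shared_from_this(); // the controlled leak
        loop.pending.push_back(&handle);
    }

private:
    std::shared_ptr<task> leak;
    fake_handle handle{};
};

struct lifetimes {
    int pending;  // live objects while the operation is in flight
    int finished; // live objects after the loop has run
};

lifetimes run_example() {
    fake_loop loop;

    {
        auto instance = std::make_shared<task>();
        instance->submit(loop);
        assert(instance.use_count() == 2);
    } // the caller's pointer is gone, the leak keeps the object alive

    const int while_pending = alive;
    loop.run();
    return {while_pending, alive};
}
```

In this sketch run_example reports one live object while the operation is pending and none after the loop has run, which is the life cycle we want from the wrapper.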

Go Further...

This is a small example that fits the purposes of an article pretty well but is also quite far from reality. The library from which it was derived extends the concepts presented in the form of a small hierarchy that exploits its own layer of template machinery. The CRTP idiom is also used extensively all throughout the code.

Following are the links to the repository and some existing articles on the web for those wishing to go into the subject further:

From my point of view, the last one is particularly interesting. It explores the library in terms of how it uses shared pointers, a tool as loved as it is hated by the C++ community. I too recommend using them with great caution, and I consider this library one of the few use cases where they make sense, because the life cycles of the objects involved in asynchronous calls are unknown.

Conclusion

An article like this can only scratch the surface of a complex library like uvw. However, I hope I've given you an idea of the approach used to make C and C++ coexist. Type erasure and the lesser-known classes of the standard library support this approach and allow us to create something powerful in a few lines of code. However, the real protagonist is a void *, which acts as a bridge between the two worlds, allowing us to exchange objects with those who don't have objects at all.

Michele Caini

author

Michele is fond of two things: C++ and gaming. When he isn't spending his time attending conferences, he blogs about coding and works on his popular open source game engine EnTT.

Julia Hanke

illustrator

Julia Hanke is an illustrator living in Warsaw, Poland. She has worked in creative agencies and currently works as a full-time freelance illustrator, mainly making illustrations for animations and web design. She is now shifting her focus to editorial and children's book illustration. You can follow her on Instagram @julia_hanke.