Language Features

C++20: The Unspoken Features

Illustrated by Bulma Juozaityte
This article is brought to you free by PVS-Studio. More about them at the end of the article.

C++20 is coming. As of this writing, the new revision of the language isn't a thing yet, but by the time you are reading this, it may be published.

The C++ community is really excited about C++20. It looks like a groundbreaking update similar to C++11. Some major features are finding their way in the standard: modules, coroutines, concepts, and ranges. These are the big four, and almost everybody is giving a talk or a presentation on these topics.

But let's be honest, modules will take a while before they are widely adopted, coroutines aren't something you use daily, and concepts are mainly dedicated to those who write libraries rather than end-users; it's unlikely they'll be used extensively in a medium-sized application.

Of the big four, ranges seem to be the one candidate that most people will be using daily; but C++20 is not just that. This revision of the standard also contains many small changes that will help us as developers.

We'll be taking a look at some of these unspoken features in this article.

Introduction

What's new in C++ besides the big four? A lot of things. We certainly won't be able to see them all in one post. Among the most interesting (at least from my point of view) are the following:

  • Designated initializers
  • Spaceship operator
  • Extended range-based for statement
  • Pack expansions in lambda init-capture
  • Template parameter lists on lambdas
  • String literals as template parameters
  • consteval and constinit keywords
  • Constexpr containers
  • Constexpr virtual functions

Let's take a moment to discuss how everyday C++ has improved thanks to these smaller features, beyond the big four.

Designated Initializers

Designated initializers are a long-awaited feature of the language, borrowed in a restricted form from C. Consider the following class:

struct position { int x; int y; int z; };

Before this feature, there were many ways to initialize an instance of position, but none of them were really explicit. As an example, because this is an aggregate type, we can use aggregate initialization:

position pos{100, 0, 200};

It looks good, but colleagues who don't know how the type is laid out can't tell which member is being set to 100. Moreover, we would like to leave y at its default value rather than spell it out. Designated initializers make this possible for aggregate classes and unions:

position pos{ .x = 100, .z = 200 };

This syntax is easier to read for someone who hasn't written the class. It's also more concise because we can omit the members whose default value is fine: they are initialized from their default member initializer if there is one, and from an empty initializer list otherwise, so y is 0 here. Designated initializers in C++ are less powerful than their counterparts in C. They don't support out-of-order initialization and don't work with arrays. Despite this, it's something I bet we'll see used all over a codebase.
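As a small sketch of the rules just described, the snippet below checks at compile time what the omitted members end up holding. The settings type and its default values are made up for illustration only.

```cpp
struct position { int x; int y; int z; };

// y is omitted, so it is initialized from an empty initializer list: 0.
constexpr position pos{ .x = 100, .z = 200 };
static_assert(pos.x == 100 && pos.y == 0 && pos.z == 200);

// Hypothetical type: omitted members fall back to their default member initializers.
struct settings { int width = 640; int height = 480; bool fullscreen = false; };
constexpr settings cfg{ .height = 720 };
static_assert(cfg.width == 640 && cfg.height == 720 && !cfg.fullscreen);

// Unlike C, designators must follow declaration order:
// position bad{ .z = 200, .x = 100 }; // ill-formed in C++
```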

Spaceship Operator

Have you ever made a class comparable? Every seasoned C++ developer has done it at least once and knows what a pain it may sometimes be.

In C++, this is as powerful as it is cumbersome. These are the operators to define:

struct clazz {
  bool operator==(const clazz &) const;
  bool operator!=(const clazz &) const;
  bool operator<(const clazz &) const;
  bool operator<=(const clazz &) const;
  bool operator>(const clazz &) const;
  bool operator>=(const clazz &) const;

  // ...

private:
  int value;
};

Most of the time, they are trivial, and to avoid redundant code, we make them depend on each other:

bool operator!=(const clazz &other) const { return !(*this == other); }

However, imagine a class that only contains a couple of integers. All I want is to have these operators autogenerated in such a way that the data members are compared in the order they are defined. This is where the spaceship operator enters the game. As usual, however, since it's C++, the committee wasn't satisfied with this and did much more!

With a single operator (that we can even default with = default), we get all the comparison operators for a type at once, since a defaulted operator<=> also brings a defaulted operator== along with it:

struct clazz {
  auto operator<=>(const clazz &) const = default;

  // ...

private:
  int value;
};

When the default semantics isn't enough, we can also enforce a different one by giving a different definition for the spaceship operator. Whether we want to compare members out of order or define a weak order among our instances, it's possible, and we'll end up writing way less code than what we did so far. This is an example taken directly from one of the online references for the C++ language:

struct case_insensitive_string {
  std::weak_ordering operator<=>(const case_insensitive_string &other) const {
    return case_insensitive_compare(str.c_str(), other.str.c_str());
  }

  // ...

private:
    std::string str;
};

The default comparison operators for std::string compare two strings lexicographically and case-sensitively, and therefore aren't suitable for our purposes. By defining a custom operator, we get around the limitation and still get <, <=, > and >= for free. Note that only a defaulted operator<=> brings operator== along with it; with a user-provided one, == has to be declared separately (and != then follows from it). By returning an instance of std::weak_ordering, we also tell the compiler (and the reader, of course) that all values are comparable but equivalent values may still be distinguishable (imagine what happens for the strings "foo" and "FOO").

All in all, it is possible to express a very complex concept that includes all relational operators in just three lines of code. This is a time-saving feature that is worth a try.

Extended Range-Based for Statement

I don't know whether you are one of the supporters or detractors of the extended if, but I make extensive use of it like so:

if(auto value = gimme_my_value(); value) { /* ... */ }

It limits the scope of value and helps me write secure code more concisely. I always wondered why this blessing had been granted only to if and switch. Finally, the range-based for became part of this group as well:

for(auto vec = gimme_my_items(); auto &&value: vec) { /* ... */ }

Regardless of personal taste, this feature can solve a problem where temporaries are involved:

for(auto &&value: gimme_my_wrapper().items()) { /* ... */ }

In this case, we may have undefined behavior, and nobody knows what will happen in production. Whether it happens depends on the actual definition of gimme_my_wrapper: if it returns a temporary and items() returns a reference into that temporary, the temporary is destroyed before the loop body runs and the loop iterates over a dangling reference. The extended range-based for statement solves this dilemma quite elegantly, since we can store the returned value in the init-statement and keep it alive for the whole loop, thus avoiding all problems.
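Here is a minimal sketch of that fix. The wrapper type and its contents are assumptions made up for the example, and a std::array is used only so the result can be verified at compile time.

```cpp
#include <array>

// Hypothetical wrapper: items() hands out a reference into the object itself.
struct wrapper {
    std::array<int, 3> data{1, 2, 3};
    constexpr const std::array<int, 3> &items() const { return data; }
};

// Returns by value, so every call produces a temporary.
constexpr wrapper gimme_my_wrapper() { return {}; }

constexpr int sum_items() {
    int sum = 0;
    // Risky form in C++20: for(auto &&value: gimme_my_wrapper().items()) { ... }
    // Safe form: the init-statement owns the wrapper for the whole loop.
    for(auto holder = gimme_my_wrapper(); auto &&value: holder.items()) {
        sum += value;
    }
    return sum;
}

static_assert(sum_items() == 6);
```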

Pack Expansions in Lambda init-capture

Let's be honest with ourselves for a moment. We all know the following is possible:

template<typename... Args>
auto very_important_function(Args... args) {
    return [args...]() { /* ... */ };
}

Raise your hand if you have ever tried to do this:

template<typename... Args>
auto very_important_function(Args... args) {
    return [args = std::move(args)...]() { /* ... */ };
}

just to discover that it isn't accepted. Instead, the valid form in C++17 is:

template<typename... Args>
auto very_important_function(Args... args) {
    return [tup = std::make_tuple(std::move(args)...)]() { /* ... */ };
}

with the obvious drawback of having to use a std::tuple now.

Luckily, C++20 makes things right. It allows us to expand a parameter pack in an init-capture, with the ellipsis placed before the captured name, thus obtaining the desired version without resorting to workarounds like std::make_tuple.
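For reference, this is roughly what the C++20 form looks like. Note that the ellipsis goes in front of the name being introduced rather than after the initializer. The function name make_summer and the fold expression in the body are illustrative choices, picked so the result is easy to check.

```cpp
#include <utility>

template<typename... Args>
constexpr auto make_summer(Args... args) {
    // C++20 init-capture pack: each element is moved into the closure.
    return [...args = std::move(args)]() {
        return (args + ... + 0);
    };
}

static_assert(make_summer(1, 2, 3)() == 6);
```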

Template Parameter Lists on Lambdas

Here is another thing I have done countless times since C++14 came out:

auto func = [](auto &&... args) {
    return another_func(std::forward<decltype(args)>(args)...);
};

It's not too ugly, but is surely uglier than its counterpart with a free function:

template<typename... Args>
auto func(Args &&... args) {
    return another_func(std::forward<Args>(args)...);
}

Despite that, we can use a lambda here. It's only a matter of syntax, but the language already gives us all we need to get the job done. The worst part comes when we try to unpack something using a bunch of indexes. Here is an example:

auto func = [](auto indexes) {
    // ???
};

// ...

func(std::make_index_sequence<42>{});

This time, there is no way to get the indexes from the sequence passed as an argument and therefore we must use a separate function for that. This is pretty annoying in many cases because we split the logic in two parts and we might want to hide this function rather than make it part of a class or a namespace.

For a long time, some of the major compilers also supported template arguments for lambdas so as to make the code concise and easy to read and work with. Finally, this feature entered the standard with C++20, and it's a good candidate for nearly daily use by many developers:

auto func = []<std::size_t... Index>(std::index_sequence<Index...>) {
    // ???
};

Moreover, we can also simplify the preceding syntax and make it similar to that of a plain function:

auto func = []<typename... Args>(Args &&... args) {
    return another_func(std::forward<Args>(args)...);
};

Overall, everything we can do with a function template we can now do with a lambda. This wasn't possible before, and I'm pretty sure it will greatly change the way we code.
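As one possible way to fill in the body of the index-sequence lambda, the sketch below sums the elements of a tuple without a separate helper function. The name sum_tuple is an assumption for this example.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

template<typename Tuple>
constexpr auto sum_tuple(const Tuple &tuple) {
    // The template parameter list lets the lambda deduce the indexes directly.
    auto impl = [&tuple]<std::size_t... Index>(std::index_sequence<Index...>) {
        return (std::get<Index>(tuple) + ... + 0);
    };

    return impl(std::make_index_sequence<std::tuple_size_v<Tuple>>{});
}

static_assert(sum_tuple(std::make_tuple(1, 2, 3)) == 6);
```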

String Literals as Template Parameters

Template argument lists accept both type and non-type template parameters, as long as what we pass into the list is suitable for using in a constant context, of course. So far, so good. If you want to define a compile-time hashing function for human-readable identifiers, then a const char * or a constant array of characters seem like good candidates to use as non-type template parameters. But do they actually work? The short answer is no; prior to C++20 they didn't.

It doesn't matter how many attempts we made, there was no way to make it work. The standard explicitly stated that a non-type template parameter could not be the address of a string literal, no matter what.

Of course, it was still possible to define compile-time hashing functions for strings, as well as many other things. However, because of this limitation, the whole syntax was as ugly as could be, and we needed to set up a layer of template machinery to make it work properly.

So, to make a long story short, we could have this:

constexpr auto value = hash_str("my_string");

or even this:

constexpr auto value = "my_string"_hs;

However, we could not have these until C++20:

constexpr auto value = hash_str_v<"my_string">;
constexpr auto another_value = hash_str<"my_string">();

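Fortunately, the new revision of the standard accepts class types as non-type template parameters, provided all their members are public and usable in constant expressions. A string literal still cannot bind directly to a pointer or reference parameter such as auto &, but it can initialize a small wrapper class that copies the characters into an array, and the compiler deduces the array size on its own. Assuming such a wrapper, here called fixed_string (a hypothetical name, not a standard type), we can write:

template<fixed_string str>
constexpr auto hash_str() { /* ... */ }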

At this point, I can only imagine the number of uses this feature will have. I already have at least a couple of uses in mind—do you?

consteval and constinit Keywords

The whole constexpr thing is great, in my opinion. I've thought so from the first time I saw it. However, the fact that a call to a constexpr function may be, but isn't guaranteed to be, evaluated at compile time always drove me crazy. What's that then? Just a kind of suggestion for the compiler apparently.

Of course, there has also always been a way to enforce constexpr-ness on a function call, at least to an extent. Imagine using a constexpr function func that returns an integer value. As long as this function is invoked in a constant context, there are no doubts that it's executed at compile time:

std::integral_constant<int, func()> named_value;

However, this is as ugly as it can be and pretty annoying to do all the time.

C++20 introduced two keywords to get around this: consteval and constinit. The former informs the compiler that our function is an immediate function. That is, every call must produce a compile-time constant no matter what. The latter is the other side of that coin. It applies to variables rather than functions and dictates that they must have static initialization. In other words, C++20 will put an end to the use of tricks to make sure that what we think is constant is indeed that.

Constexpr Containers

Admittedly, this is odd. Putting aside fixed-size containers like std::array, variable-size containers were mainly useful at runtime, as far as I knew. However, everything changed when someone said that they are also potentially useful at compile time. Whatever that means, it sounds great.

In fact, with C++20, std::string and std::vector can be used at compile time and probably more will come with future revisions of the standard. This wasn't possible before. It was due mainly to the lack of constexpr destructors and constexpr allocators, as well as the fact that a constexpr placement-new wasn't viable. In other words, the main problem was the wording in the standard rather than a real difficulty in implementing it. C++20 relaxed all these constraints and therefore makes it possible to have constexpr containers. How do they work though? What's the goal here?

The reasoning started from the "constexpr All the things!" talk Ben Deane and Jason Turner gave at C++Now 2017.

The basic idea was to create a compile-time JSON parser and value representation by leveraging constexpr. Unfortunately, the lack of constexpr containers quickly turned into an issue and they had to find workarounds for it. To get around this limitation, the concept of transient allocation was introduced. Long story short, an allocation is transient as long as the result of the constant expression that involves it doesn't contain pointers or references to the allocated memory. This allows us to do manipulation at compile time as long as we don't expect to use the same objects at runtime. Someone went even further on this and suggested that a non-transient allocation could be promoted to static storage as long as it respects specific constraints (such as producing an object with a non-trivial constexpr destructor). This turns objects that allocate memory, and therefore don't fit the above definition, into objects that are suitable for use at compile time.

Unfortunately, as of the time I'm writing this, none of the major compilers have implemented constexpr std::string and std::vector. Because of this, it's a bit hard to provide an example you can play with since I can't even assure you it will compile correctly. On the other hand, this is a good starting point for another article. Stay tuned if you want to know more on this!

Constexpr Virtual Functions

Another thing that has been taken for granted for a long time, and is now overturned by C++20, is that a virtual function cannot be constexpr. Wait, what? virtual is all about runtime as far as you know, right? This was my first reaction as well.

This is obvious in the summary of the proposal accepted to enter the standard (the emphasis is mine):

Virtual function calls are currently prohibited in constant expressions. Since in a constant expression the dynamic type of the object is required to be known (in order to, for example, diagnose undefined behavior in casts), the restriction is unnecessary and artificial.

It can't be truer than that. When inspecting a type for any reason in a constant expression, its dynamic type is and must be known. In a sense, the fact that a function is virtual or not loses meaning here because the compiler knows what version of the function to invoke at compile time. If the compiler didn't know the dynamic type of the object, it could neither adhere to the standard nor help the developer to respect its rules. In fact, the compiler knows this type and makes good use of it. So why not remove this limitation?

Without this restriction, code such as the following becomes valid:

struct base {
    virtual int func() const = 0;
};

struct derived: base {
    constexpr int func() const override { return 2; }
};

constexpr auto func(const base &b) {
    return b.func();
}

int main() {
    constexpr derived d{};
    static_assert(func(d) == 2);
}

I can only imagine the number of uses that can be made of this feature when it comes to working with templates—it is certainly one of the least discussed but, in my humble opinion, one of the most promising features as far as metaprogramming is concerned. Time will tell if I am right.

Conclusion

So, C++20 isn't made only of the big four. There is much more behind the scenes, especially regarding the use we make of it every day. With this article I hope I have intrigued you. There is still a lot to say but, of course, there will also be fun!

This article is brought to you free by PVS-Studio

Although C++20 is more liberal and less restrictive on implementation details, providing great flexibility for developers, nobody is immune from making mistakes. That's why there are code quality control tools like static code analyzers. One of the most popular is PVS-Studio, which detects not only bugs but potential vulnerabilities as well and has proven its capabilities in many open source project checks. Visit the link and write the #HRM_cpp promo code in the message field to get a one-month free trial.

Michele Caini

author

Michele is fond of two things: C++ and gaming. When he isn't spending his time attending conferences, he blogs about coding and works on his popular open source game engine EnTT.

Bulma Juozaityte

illustrator

Bulma is a multi-talented freelance illustrator with a quirky, colourful style. Bulma loves animation and predominantly illustrates for motion graphics, using her creativity to help tell stories for brands. Originally from Lithuania, and having studied and worked in the UK, Bulma now enjoys island life on Malta. There, she learned to swim and dove even deeper into illustrating. Bulma is always looking for new challenges and opportunities.