Why we need compile-time reflection in C++1y

2:14:00 PM 0 Comments A+ a-

Programs need data. That's a no brainer. Programs are only as good as the data you provide them. Based on what kind of data is consumed, programs can be divided into two broad categories: (1) those that operate on regular data (a file), and (2) those that operate on other programs. The first kind of programs are abundant. Your browser, for instance, is showing you this page--its data. The second kind of programs are more interesting and they are called meta-programs.

Meta-programs need data too. As with the other programs, meta-programs are only as good as the data you provide them. So what do we feed them? ... Well, In C++, more important than 'what' is 'when'. (remember Morpheus?) A C++ program is just a sequence of bits the compiler is trying to understand. So, while the compiler is trying to make sense of your program, most of it gets translated (to assembly) but some of it gets executed. Quite intriguing! We're talking about compile-time meta-programming.

Coming back to the 'what'. We want to be able to feed whatever is available at compile-time: types, members, functions, arguments, namespaces, line numbers, file names, all are a fair game. Less obvious things are relationships among types: convertibility, parent/child, base/derived, container/iterator, friends, and more.

A C++ compiler already has this information but it is not in a form a meta-program can use. So we're in a soup, where we can run programs (at compile-time) but there is no data! So the next question is 'how' do we make the data available to our meta-programs? And that brings me to what I like to call the Curiously Recurring Template Meta-Programming  (CRTMP) pattern.

Curiously Recurring Template Meta-Programming Pattern

The idea is rather general and many have done it successfully before: Make data available to meta-programs without offending the compiler and do something interesting with it.

Let's look at who are the subjects (players) in this pattern. (1) the compiler, (2) the meta-program, and last but not the least is (3) the programmer itself because machines haven't taken over yet and humans still write most of the programs as of today.

The compile-time data must make sense to all three above. Today, C++ programmers, because we don't mind pain, create that data in a form that is understood by the former two. The prime examples are the traits idiom, the type_traits library, and sometimes code generators that parse C++ files and spit out relationships between classes.  For example, LEESA's gen-meta.py script generates typelists (Boost MPL vectors) for classes that contain other classes (think XML data-binding). Effectively it builds a compile-time tree of the XML node types.

When things are not auto generated, we make it palatable to the fellow programmers using macros. To many, macros are as obnoxious as the data they hide/generate but lets move on. There are many examples of super-charged too: Boost SIMD, pre-variadic Boost MPL, smart enumerations, and many more. When macros are used in a clever way (abused!) they really do look like magic. I got a first-hand experience of that while developing the RefleX library.

RefleX is a compile-time reflection-based type modeling in C++ for DDS Topics. It is open-source but you need the RTI Connext DDS to play with it. It essentially transforms your native C/C++ type into a serializable type representation called a TypeObject and marshals your data in what is called a DynamicData object. Note that both, type and data are serialized. There are systems--perhaps many we owe our modern life to--that need to distribute types and data over the network for discovery, interoperability, compatibility, and for other reasons.

Here's an example:


The RTI_ADAPT_STRUCT macro expands to about 120 lines of C++ code, which is primarily reflection information about ShapeType and it can be used at compile-time. It is based on the BOOST_FUSION_ADAPT_STRUCT macro. The macro opens the guts of the specified type to the RefleX library. The meta-programs in RefleX use this "data" to do their business. The reflection information includes member types, member names, enumerations, and other ornaments such as a "key". The point is that the same CRTMP pattern is used to "export" information about a native C++ type.

So, the last two open-source C++ libraries I wrote use the CRTMP pattern: In one, "data" is generated using a Python script and in the other using a macro. CRTMP makes C++ libraries remarkably powerful. The reality is there is nothing novel about it. It is seen everywhere.

The natural step in evolution of a idiom/pattern is first-class language support.  If something is so prevalent, the language itself should absorb it eliminate the crud involved developing and writing CRTMP-based libraries.

That brings us to the main point of this post: Compile-time Reflection. We need it. Period. It's a natural step of evolution from where C++ is now. When available, it will make vast amount of compile-time data available to C++ meta-programs. They will run faster, look nicer, and they will knock your socks off! It is mind boggling what has been achieved using template and preprocessor meta-programming. Compile-time reflection will push it two notches up. So stay tuned for C++1y.