Size matters, part 3
In part 2 we got our types sorted properly for optimal layout. The next step is building the mappings to and from the storage indices.
Before we start, I need to make clear what these mappings are. The code can be quite confusing and difficult to follow without a clear idea of this.
When we have a call like
get<0>(some_tuple) we need to know where in the
storage is the element 0, since we may have relocated it to get our optimal
layout. In the picture below, we can see an example where the element 0 is at
position 1 in storage. We need to map index 0 from the interface to index 1 in
That would mean that our representation of this map, being merely a list of
indices, would store in position 0 the value 1:
indices<1, x, y, z>. Applying
the same logic to the other elements, we would have
This map is not enough, though. If we have a call to
make_tuple(a, b, c, d),
we need to rearrange these arguments to place each one at the appropriate spot.
To call the constructor for the tuple that we use for storage, we need to figure
out which of those arguments to place at index 0, so we can pass it as the first
argument of that constructor.
The picture shows that at index 0 we would place the object that was given at
position 3. The previous map does not readily provide this information. We need
one that gives us a 3 in position 0, and so on for the other elements. For this
example, that map would be
Mapping from interface to storage
After sorting, we have a type list that might look like the following.
This type contains information about the order of the types, which we will need
to define a
std::tuple for storage; and also the indices for the map from the
interface indices to the storage indices, which we will need to write
just need to extract this information into a more appropriate format.
So far we needed to carry the types and indices together, and now we need to split them. Variadic templates actually make this task pretty easy. We can simply do some pattern matching in a partial specialization and take the two parameter packs at once. Then we repack them separately in two different types.
One map down, one to go. The other map is a bit trickier. We could implement it by simply searching for the position of each index in the map we already built. A naive approach would result in O(N2) template instantiations, but we can avoid that and keep everything compiling with linear instantiations.
Convincing overload resolution to do our work for us
Instead of writing a template to search for indices, we can use overload resolution to do that search for us. When we pass a type that inherits from multiple bases into a function with different overloads for each base, the compiler considers all those overloads as possible for the call. If those overloads are template functions, we can narrow down the possibilities by making some template parameters explicit. We can take advantage of this to do our search and don’t need to write our own with recursive template instantiations.
For this to work, we need a type that derives from all the candidates for our search. That is not difficult to produce using variadic templates.
We will use only our empty TMP types, so we can inherit from them directly. If there was a possibly of having fundamental types, references, etc, we would need some sort of wrapper because we cannot inherit from those.
I should also note that there is no risk for ambiguity here: we know for certain
that all types are different specializations of
indexed and each one has a
Now we need a function template that accepts all our candidates. Those
candidates need to carry around their positions in the map we have already
computed. We can use our
WithIndices templates for this.
This parameter of our function template will be built from this template. It actually has information we don’t need (the type), but that is not problematic, and it has the advantage that it can be built simply by attaching indices to our already sorted list.
Note how the index we are looking for is the first template parameter. This is
so we can pass it explicitly, and still have the compiler deduce the rest. The
type deduction is crucial here: it is how we get our result back. We put that
result into the return type so we can obtain it easily with
With this we can write
find_index<0, sorted_example>::value and obtain 3, and
we have a constant number of template instantiations.
Putting it all together
To put everything together we need to apply
find_index to all indices from 0
to N-1, where N is the number of types involved. This can be done easily if we
IndicesFor alias that builds a list of indices with the same size as a
given tuple. I will leave that as an exercise for the reader.
The lower level
And we finally have enough machinery in place to start the actual tuple class. So far everything was done on the metaprogramming level, but the next post in this series will finally involve a bit at the “regular” program level.