C++ How to mix std::array and legacy code, and stay sane

Ancient C manuscript restored

[2021 Apr 08]

Godbolt “proof of the pudding” link added. (Or is it “better” to say “the proof is in the pudding”?)

[2020 Mar 18]

Please find part one here. That is one way of dealing with native arrays. Here is another one. You choose.

Along with std::vector, std::array is probably the most useful C++ std lib type. So, let’s use it all the time and everywhere.

No can do.

It is not easy to mix std::array and legacy code

Why not? First of all, there is this thing mentioned above: the “legacy code”. And that, my honourable reader, is one stormy ocean of code beneath this world. Written in both C and C++ during the last few decades.

Reminder: part one is deliberately not handling native arrays with the std::array. Here is another approach.

I am not talking about C Run-Time (CRT) lib calls here. I am talking about somewhat more unusual challenges, the kind one can and does meet almost every day while trying to use std::array when calling legacy functions from C++.

Very often developers solve some “funny” legacy API calls, in their C++, “on the spot”. Often, it all looks easy and idiomatic and seemingly does not require much thinking. But that line of action can, and often does, result in some extremely nasty bugs. Very often descending into the realm of C. The beast that lives below. This translates into a lot of man-hours spent. In Fighting the Beast.

For those challenging moments, I have developed a tiny C++ utility that has saved me (and others) a lot of time. I will present only four categories of legacy functions, which I think will serve the purpose of explaining:

What are the unusual legacy code problems? Why is this small C++ utility needed (and effective)? Legacy specimens will be tagged L1 ... L4. Let’s start with the deep end of the legacy pool.

Modern C and legacy C

Here are the two specimens one will not find in any CRT. Mainly because CRT libs are made to serve almost all versions of C compilers, in existence today.

First is the legal C99 way of passing an array argument of a (maybe) known size.

L1

If not using std::array this is easy to call from C++.

That is modern C. Only 20 years old.
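A minimal sketch of the L1 situation, seen from the C++ side. C99's `[static n]` parameter syntax is not valid C++, so the declaration below uses the plain pointer form that C++ sees. The function `l1_sum` and its body are hypothetical stand-ins, not a real legacy API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an L1-style legacy function.
// In C99 it could be declared as:
//   int l1_sum(size_t n, const int arr[static n]);
// From C++ the parameter is just a pointer to the first element.
extern "C" int l1_sum(std::size_t n, const int* arr) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += arr[i];
    return total;
}

// Calling it with a native array: the array decays to a pointer.
int l1_demo() {
    int native[3] = {1, 2, 3};
    int total = l1_sum(3, native);
    assert(total == 6);
    return total;
}
```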

L2

The next legacy example is not so modern C, but I do like a pointer to an array. C95 anyone? Here we deal with that peculiar type: pointer to an array. Not the pointer to the first array element. The distinction is important. Slight detour first.

Address of a string literal

Why would anybody do this? What is the type of that mystery? It is a pointer to an array of 8 chars, of course:
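A small sketch to check that claim, assuming C++17 and a hypothetical seven-character literal (seven characters plus the terminating zero make 8 chars). Note that in C++ the elements of a string literal are const, so the pointer type carries a const that C would not.

```cpp
#include <cassert>
#include <type_traits>

bool literal_demo() {
    // Taking the address of the literal itself, not of its first char.
    auto mystery = &"ABCDEFG";

    // In C++ this is a pointer to an array of 8 const chars.
    static_assert(std::is_same_v<decltype(mystery), const char (*)[8]>,
                  "address of a string literal is a pointer to an array");

    // Dereferencing gives the whole array back, size included.
    static_assert(sizeof(*mystery) == 8, "the whole array, not one char");

    assert((*mystery)[7] == '\0');
    return (*mystery)[0] == 'A';
}
```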

And yes, there are of course, “clever” C++ authors using this “feature” to pass arrays into low-level C APIs.  But what is this “pointer to array” thing? Please consider this diagram.

[Diagram: pointer to array]

The following legacy code is legal C and legal C++ too. This is also what I would like to describe as “C atavism”. A good comment is here. Here is a typical specimen, seen roaming in the wild.

The closer to the metal you wander, the more likely you are to meet this kind of C specimen. In my mind, this idiom might be put to good use when API authors want to be absolutely sure of the exact type allowed as a pointer argument to their function (using C). The above is not a native char array decaying to char *, the above is exactly a pointer to an array of three chars. That function can not be called with anything else.

I can see (and I have seen) this in mission-critical C code and such. To call that function from C++, while having only a std::array instance available, is far from easy. And pretty far from standard C++.
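To make the L2 category concrete, here is a minimal sketch. The function `l2_fill` is a hypothetical stand-in; its body is kept to constructs that are legal in both C and C++.

```cpp
#include <cassert>

// Hypothetical L2-style legacy function: the parameter is a pointer to an
// array of exactly three chars, not a pointer to the first element.
void l2_fill(char (*three)[3], char value) {
    for (int i = 0; i < 3; ++i) (*three)[i] = value;  // *three is the array
}

bool l2_demo() {
    char triplet[3] = {'a', 'b', 'c'};
    l2_fill(&triplet, 'x');           // &triplet has type char (*)[3]

    // Neither of these compiles:
    //   char* p = triplet;  l2_fill(p, 'x');      // char* is not char (*)[3]
    //   char four[4];       l2_fill(&four, 'x');  // wrong array size
    assert(triplet[1] == 'x');
    return triplet[0] == 'x' && triplet[2] == 'x';
}
```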

Legacy C++

What? Yes, there is such a thing. C++ is by now a “mature” aka “old” language. In the halcyon days of primordial C++, there was no “standard C++”. And

Once upon a time there were no people who never coded C before coding C++.

L3

Consequently one can easily bump into legacy C++ using pointers to native arrays. But. As an added “benefit”, mixed with templates too. Sigh.

Using that with the native array is not easy.

And to use the above using an instance of std::array is definitely not easy. To put it mildly.

Far away from the comfort zone that is.
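A sketch of what an L3-style specimen might look like. The name `l3_count` is hypothetical; the point is only the parameter type, a pointer to a native array with the element type and size as template parameters.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical L3-style legacy C++: a template taking a pointer to a native
// array. T and N are deduced from the pointed-to array type.
template <typename T, std::size_t N>
std::size_t l3_count(T (*)[N]) {
    return N;
}

std::size_t l3_demo() {
    int native[5] = {};
    // The caller must pass the address of the array, not the array itself:
    // l3_count(native) would decay to int* and N could not be deduced.
    std::size_t n = l3_count(&native);
    assert(n == sizeof(native) / sizeof(native[0]));
    return n;
}
```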

L4

The next legacy specimen is adjacent to the pointer to the array. C++ has this thing called “native array reference”. That is a type mechanism C does not have.

That declaration is modern C++. And it can be very useful indeed. And no, that is not easy to use when having just an instance of std::array. With a native array, that API is easy to use. Just pass the array to that function. An added benefit is that it will compile only if called with native arrays.

But. Preserving the result as the native array reference is more involved. Focus, please.
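A minimal sketch of the L4 category, assuming C++17. The function `l4_identity` is hypothetical; it takes a native array by reference and hands the same reference back, which is enough to show how the result is kept or lost.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical L4-style function: native array reference in, same one out.
template <typename T, std::size_t N>
T (&l4_identity(T (&arr)[N]))[N] {
    return arr;
}

bool l4_demo() {
    int native[3] = {1, 2, 3};

    // auto& keeps the array type: int (&)[3].
    auto& kept = l4_identity(native);
    static_assert(std::is_same_v<decltype(kept), int (&)[3]>, "still an array");

    // Plain auto lets the array decay to a pointer: int*.
    auto decayed = l4_identity(native);
    static_assert(std::is_same_v<decltype(decayed), int*>, "decayed to pointer");

    kept[0] = 42;                     // writes through to native
    assert(decayed[0] == 42);
    return native[0] == 42;
}
```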

OK then, we are done with our herbarium of legacy specimens; these are the four legacy representatives we shall use. I might think the audience is convinced by now that we are “not barking up the wrong tree” and that audience motivation is now firm indeed. But wait … I can hear a question.

Is there a hack to save us all?

There is always a hack. And it is never to be used, like any other hack. Here it is.

The core of the problem, as we have all seen above, is that in modern C++ we do not use or “have” a native array. We have an instance of std::array instead. But, wait a minute: the std::array implementation contains one native array inside. And it is publicly available. The MS STL std::array source looks something like this:

The funny fact is that _Elems[_Size] must be public. Otherwise, one could not use std::array and initialize it as an aggregate.
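The aggregate point can be sketched with a hypothetical simplified stand-in. To be clear, `tiny_array` below is not the MS STL source; it only mirrors the idea of one public native array member and no user-declared constructors.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical simplified stand-in, NOT the real std::array implementation.
// One public native array member, no constructors: the type is an aggregate.
template <typename T, std::size_t N>
struct tiny_array {
    T elems[N];                                  // public on purpose
    constexpr std::size_t size() const { return N; }
    constexpr const T* data() const { return elems; }
};

// Aggregate initialization works because elems is public.
constexpr tiny_array<int, 3> three{{1, 2, 3}};
static_assert(three.size() == 3, "size comes from the template argument");
static_assert(three.data()[1] == 2, "elements are initialized in order");

// If elems were private, tiny_array would stop being an aggregate and
// the brace initialization above would no longer compile.
int tiny_array_demo() {
    tiny_array<int, 3> local{{7, 8, 9}};
    assert(local.elems[0] == 7);
    return local.elems[2];
}
```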

“Vulgaris” in Latin means common. Our “HACKATRON VULGARIS” begins here. Calling the L3 legacy function catalogued above, for example, we can use the _Ty _Elems[_Size] because we concluded it is in the std::array, because we have seen it in the source.

Ditto. To get to the reference of that native array that is inside the std::array implementation, and if we use MS STL and if this is not ever going to be silently changed in any way we can simply do this:

But that hack is never going to save anybody. That is a “path peppered with shards of glass”. Just do not go there. End of the hack. The reasons are many and varied. If you do not consider any of them as valid there is only one: the implementation of any std:: lib is constantly shape-shifting.

Shaping up a solution

Thus far we have understood that, much as we like it, we can not always enjoy the services of good old std::array. We need to use only the std::array public interface to develop a solution that lets one mingle easily with the legacy API and the native array crowd at the same time.

Let us think together. Shall we? As we all know C++ has this curious thing called a reference. And an even curiouser thing called array reference. How is this helping us? For starters, this is how we declare, define and use those in standard C++ (step by step):
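The steps can be sketched like this, assuming C++17. All names are illustrative.

```cpp
#include <cassert>
#include <type_traits>

bool array_reference_demo() {
    // Step 1: a native array.
    char native[3] = {'a', 'b', 'c'};

    // Step 2: declare and define a reference to it. The parentheses matter:
    // char& ref[3] would be an (illegal) array of references.
    char (&ref)[3] = native;

    // Step 3: a type alias makes the same declaration easier to read.
    using char3_ref = char (&)[3];
    char3_ref same = native;

    // Step 4: use it. The reference keeps the array type and size.
    static_assert(sizeof(ref) == 3, "size of the array, not of a pointer");
    static_assert(std::is_same_v<decltype(same), char (&)[3]>, "same type");

    ref[0] = 'x';                     // writes to native[0]
    assert(same[0] == 'x');
    return native[0] == 'x';
}
```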

We know now how to declare, make and use the reference to a standard C++ native array. In an easy and compliant way. Fine. Great. So what?

Given std::array how do we legally get to its internal native array, and use it as such? Answer: we produce a reference to it. Without reaching it in an illegal direct way.

To transform the result of the std::array data() method, a const T *, into a reference to its internal native array, one needs to dive deep into the toxic waste of C-style casts.

We need to cast the result of the data() method from a char pointer to a pointer to the internal array of 3 chars. And then we have to de-reference what we got so that we can assign it to the reference of the internal array. And yes, we need to know the size of the std::array instance we are using. Ugly as hell that is.
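A minimal sketch of that stunt on a non-const std::array of three chars. It assumes the elements live in one contiguous native array, which is how std::array is implemented in practice; the cast itself is not something the standard explicitly blesses.

```cpp
#include <array>
#include <cassert>

bool cast_demo() {
    std::array<char, 3> arr{'a', 'b', 'c'};

    // data() returns char*, a pointer to the first element.
    // Cast it to "pointer to array of 3 chars", then de-reference the result
    // to bind a reference to the whole internal array.
    // Assumption: the storage really is a native char[3].
    char (&ref)[3] = *(char (*)[3])arr.data();

    ref[1] = 'x';                       // modifies arr through the reference
    assert(sizeof(ref) == arr.size());  // the 3 had to be spelled out by hand
    return arr[1] == 'x';
}
```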

Side note: I deliberately do not use reinterpret_cast, as I do not see it as safe or helpful in any use case. C style cast is at least a much more obvious warning some stunt is going on.

But. We digress. Back to the task at hand. Here is the design aka the plan. Surely we should be able to package some solution, to get to the reference to the native array, present in every std::array. And, just to repeat, we need that in order to be able to use std::array with legacy specimens listed above. And with other much less dangerous legacy specimens, too.

But we need first to somehow manipulate non-academic types in the audience, to help them leave us in peace… No, not you.

The methodology, known as: “Do it quick and dirty, ’till five-thirty”

That is the title of a particular school of thought, in existence. And, inevitably, a portion of the audience of this post favours that kind of software development philosophy. Instead of antagonizing them, I might give them one very shiny foot gun. And let them go and use it.

So. Inheritance for implementation is evil. But who cares, I can hear “them” say. Why don’t we just develop a specialized std::array derivative, with the methods we need added into that potent magical mixture?

One might think that is a symphony of simplicity. Usage seems simple too:

Now. I know there are a number of people who will copy-paste the above and simply leave this blog. Or it might even be they have already done that. They (the “leavers”) might never ever experience the problems with the foot gun “solution” above. But that will be just by luck or by accident.

If one C++ piece of advice is rock solid, it is the following: Never (ever) inherit from std or develop “inside” std namespace.

The reasons are numerous. And we have already mentioned one: std lib of any vendor is constantly shape-shifting its implementation.

Each std lib release increment inevitably changes things inside. Sometimes dramatically. Do not ever rely on anything inside the std lib. However innocent it seems. Even in an unlikely scenario of not needing a totally portable code. Clear? OK. Let us calmly proceed, to the solution (at last) :

Version One

We will use what we got from modern C++ and replace the above confused (not)solutions, with two palatable solutions.

Function template, with scary-looking template arguments. You might be thinking: “Oh boy, this C++ is endless, I will never learn it all”. Do not despair if you have not seen this before.

Those template arguments are just a convenient place (I happen to like) to name the internal types we need in a single function template. Now beware. One using this utility has to be sure to receive the result explicitly as a reference, like this:

Otherwise, if we would not do it that way (as auto &) we would be left with the pointer to T, again. Generally, one has to be very careful to stop the “magical” array decay into the pointer to the first element.
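A minimal sketch of such a function template and of the auto & versus auto difference, assuming C++17. The names are hypothetical, and for simplicity this sketch takes a non-const lvalue reference, so it is not necessarily the exact signature of the original utility.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// The defaulted template arguments only name the types needed inside.
template <typename T, std::size_t N,
          typename ARF = T (&)[N],    // reference to the native array
          typename ARP = T (*)[N]>    // pointer to the native array
ARF internal_array_reference(std::array<T, N>& arr) {
    return *(ARP)arr.data();          // cast the pointer, then de-reference
}

bool version_one_demo() {
    std::array<char, 3> arr{'a', 'b', 'c'};

    // Received as a reference: the array type survives.
    auto& ref = internal_array_reference(arr);
    static_assert(std::is_same_v<decltype(ref), char (&)[3]>, "array reference");

    // Received by value: the array decays to a pointer to the first element.
    auto ptr = internal_array_reference(arr);
    static_assert(std::is_same_v<decltype(ptr), char*>, "decayed to char*");

    ref[2] = 'z';
    assert(ptr == arr.data());
    return arr[2] == 'z';
}
```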

And oh, by the way, above we have produced one very not-nice dangling reference, by creating a std::array only as a function argument, and then returning the reference to its internals. That is an example of one sure temporary object foot gun. We shall deal with this obvious issue right now. We will explicitly delete the function signature that allows for references to temporaries.

Again. Beware of using auto. Please do not make a mistake and be left with what you do not want. A pointer to the first element in an array.
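Deleting the overload for temporaries might be sketched as below, assuming C++17. This variant takes a const reference and returns a reference to a const native array; the detection helper `accepts` is a hypothetical addition, used only to show at compile time which calls are allowed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// A const lvalue reference would happily bind to a temporary too ...
template <typename T, std::size_t N>
const T (&internal_array_reference(const std::array<T, N>& arr))[N] {
    return *(const T (*)[N])arr.data();
}

// ... so calls with temporaries are banned explicitly.
template <typename T, std::size_t N>
void internal_array_reference(const std::array<T, N>&&) = delete;

// Hypothetical helper: is internal_array_reference callable with an A?
template <typename A, typename = void>
struct accepts : std::false_type {};
template <typename A>
struct accepts<A, std::void_t<decltype(internal_array_reference(
                      std::declval<A>()))>> : std::true_type {};

static_assert(accepts<std::array<char, 3>&>::value, "named arrays are fine");
static_assert(!accepts<std::array<char, 3>>::value, "temporaries are rejected");

bool no_temporaries_demo() {
    std::array<char, 3> arr{'a', 'b', 'c'};
    auto& ref = internal_array_reference(arr);   // const char (&)[3]
    assert(sizeof(ref) == 3);
    return ref[0] == 'a';
}
```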

The above solution has its deficiencies but it works. If users are (very) careful, that is. Also, to develop a “pointer to native array solution” we would simply need a separate (almost the same) function as above. Hmm. This all starts to look somewhat clunky to me.

You might like what we have done up till now. Let us quickly proceed to a better solution without further ado.

Version Two

Good old templates to the rescue.

Usage is classic, standard (but simple) C++. First, we instantiate the template for the exact std::array type we need to mix with legacy code. Remember: a template is not a type, it is just that: a template, waiting to be made into a type. A template instantiation is that template with concrete types as template arguments.

Godbolt :

Thus, we have fully encapsulated the solution in one simple template struct. This “helper” struct keeps no data. It holds just the std::array type being handled and the types missing from std::array as it stands today in standard C++. And there are only two static methods, returning a reference and a pointer to the internal native array of the std::array argument.
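A helper along those lines could look like the sketch below, assuming C++17. The struct name, the alias names and the method names are all hypothetical, and this sketch takes non-const lvalue references so that the results can be handed to legacy functions expecting non-const arrays.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: no data, only types and two static methods.
template <typename T, std::size_t N>
struct array_util final {
    using std_array_type = std::array<T, N>;
    using native_type    = T[N];            // the native array type
    using native_ref     = native_type&;    // reference to it
    using native_ptr     = native_type*;    // pointer to it

    static native_ref ref(std_array_type& arr) { return *(native_ptr)arr.data(); }
    static native_ptr ptr(std_array_type& arr) { return (native_ptr)arr.data(); }

    // Temporaries are banned explicitly.
    static void ref(std_array_type&&) = delete;
    static void ptr(std_array_type&&) = delete;
};

bool version_two_demo() {
    using util = array_util<char, 3>;       // the exact type we will handle
    static_assert(std::is_same_v<util::native_ref, char (&)[3]>, "array ref");
    static_assert(std::is_empty_v<util>, "the helper keeps no data");

    util::std_array_type arr{'a', 'b', 'c'};
    util::native_ref r = util::ref(arr);
    util::native_ptr p = util::ptr(arr);

    r[0] = 'x';                             // through the reference
    (*p)[1] = 'y';                          // through the pointer
    assert(arr[0] == 'x');
    return arr[1] == 'y';
}
```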

Moving and copying an instance of the above template is zero overhead. There is no data, there are no instance methods, just class methods. And.

Finally

The solution to the funny legacy quartet catalogued at the top:
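As a sketch of how the pieces could fit together: the four functions `l1` to `l4` below are hypothetical stand-ins for the four categories, and `array_util` is a hypothetical minimal helper with two static methods, one returning a reference and one a pointer to the internal native array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the four legacy categories.
int l1(std::size_t n, const int* arr) {            // L1: size plus array
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += arr[i];
    return total;
}
int l2(int (*arr)[3]) { return (*arr)[0]; }         // L2: pointer to array of 3
template <typename T, std::size_t N>
std::size_t l3(T (*)[N]) { return N; }              // L3: templated pointer to array
template <typename T, std::size_t N>
T (&l4(T (&arr)[N]))[N] { return arr; }             // L4: array reference in and out

// Hypothetical minimal helper.
template <typename T, std::size_t N>
struct array_util final {
    using native_ref = T (&)[N];
    using native_ptr = T (*)[N];
    static native_ref ref(std::array<T, N>& arr) { return *(native_ptr)arr.data(); }
    static native_ptr ptr(std::array<T, N>& arr) { return (native_ptr)arr.data(); }
};

bool quartet_demo() {
    using util = array_util<int, 3>;
    std::array<int, 3> arr{1, 2, 3};

    int sum = l1(arr.size(), util::ref(arr));       // reference decays to int*
    int first = l2(util::ptr(arr));                 // exact type: int (*)[3]
    std::size_t count = l3(util::ptr(arr));         // T = int, N = 3 deduced
    auto& back = l4(util::ref(arr));                // kept as int (&)[3]
    back[2] = 30;                                   // writes into arr

    assert(sum == 6);
    return first == 1 && count == 3 && arr[2] == 30;
}
```

Everything happens inside one scope, while arr is alive, so no reference or pointer outlives the std::array it points into.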

Enjoyed it so far? Well, I have but not fully; not yet.

Caveat Emptor

Caveat emptor is Latin for “Let the buyer beware”. So far, the methods in the utility presented above receive const references to the instance and then manipulate the data() result, with casting stunts, to return what we want. Yes, we have explicitly banned using temporaries as arguments. But that is not enough.

Still, if and when the original array goes out of calling scope the result of these two functions becomes invalid. Either a dangling reference or a dangling pointer, that is.

So please treat their results the same as you are treating any other references or pointers. What does this mean?

This means, one should remember, that standard C++ is best when dealing primarily with values, not references or pointers. That simply means:

Do not carry around references or pointers, but values.

Do whatever you have to do inside the same scope where you have used this mechanism.

This is a classical performance vs resilience balance. The balance one has to perpetually and skillfully watch while using powerful programming languages like standard C++.