C++ How to mix std::array and legacy code

Along with “the other one”, std::array is probably the most useful C++ std lib type. So, let’s use it all the time and everywhere?

No can do.

It is not easy to mix std::array and legacy

Why not? First of all, there is this legacy code. And it is one stormy ocean of code, written in both C and C++. I am not talking about the C Run-Time (CRT) lib here. I am talking about the more unusual challengers one meets almost every day when trying to call them from C++ with only a std::array in hand.

Very often developers solve some “funny” legacy API calls, in their C++, “on the spot”. Often, it all looks easy and idiomatic and seemingly does not require much thinking. But that line of action can, and often does, result in some extremely nasty bugs, very often descending into the realm of C. The beast that lives below. Which translates into a lot of man-hours spent fighting the beast.

For those challenging moments, I have developed a tiny C++ utility that has saved me (and others) a lot of time.

Ancient C manuscript, restored

 

I will present only four legacy functions, which I think will serve the purpose of explaining:

  1. What are the unusual legacy code problems
  2. Why is this small C++ utility needed (and effective)

Legacy specimens will be tagged L1 ... Ln. Let’s start with the deep end of the legacy pool.

Modern C and legacy C

Here are two specimens one will not find in the CRT, mainly because CRT libs are made to serve almost all versions of C compilers in existence today.

First is the legal C99 way of passing an array argument of a known size.

L1
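A minimal sketch of such a specimen, as seen from both sides. The C99 form is shown only in a comment, because `static` inside the brackets is not valid C++. The name `l1_sum` and its body are hypothetical stand-ins.

```cpp
// C99 side (not valid C++), hypothetical header:
//   int l1_sum(int arr[static 3]);   /* caller promises at least 3 elements */
//
// The parameter adjusts to int*, so the C++ side declares it as:
//   extern "C" int l1_sum(int *arr);
//
// Stand-in definition, so this sketch links without the C translation unit.
extern "C" int l1_sum(int *arr)
{
    return arr[0] + arr[1] + arr[2];
}
```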

If not using std::array this is easy to call from C++. Allow me to leave it as an exercise to the reader. This is modern C. Only 20 years old.

L2

Next is not so modern C. C95, anyone? Here we deal with that peculiar type: pointer to an array. Not the pointer to the first array element. The distinction is important: both hold the same address, but they have different types. Incrementing a pointer to an array steps over the whole array, while incrementing a pointer to the first element steps over one element.

The following is legal C, and legal C++ too. This is also what I would like to describe as “C atavism”. A good comment is here. Here is a typical specimen, seen roaming in the wild.
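A minimal sketch of that kind of specimen. The function `l2_fill` and the helper `l2_demo` are hypothetical, made up for illustration only:

```cpp
#include <cassert>

// Hypothetical legacy function: accepts ONLY a pointer to an array of 3 chars.
extern "C" void l2_fill(char (*arr)[3], char value)
{
    for (int i = 0; i < 3; ++i)
        (*arr)[i] = value;
}

// Calling it with a native array: note &buf (pointer to the whole array),
// not buf (which would decay to char*).
inline char l2_demo()
{
    char buf[3] = {'a', 'b', 'c'};
    l2_fill(&buf, 'x');
    // l2_fill(buf, 'x');   // does not compile: char* is not char(*)[3]
    return buf[2];
}
```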

The closer to the metal you wander, the more likely you are to meet this kind of C specimen. In my mind, this concept might be put to good use when an API author wants to be absolutely sure of the exact type allowed in as a pointer argument. The above is not a native char array decaying to char *; the above is exactly a pointer to an array of three chars. That function cannot be called with anything else.

I can see (and I have seen) this in mission-critical C code and such. To call that function from C++, while having only a std::array instance available, is far from easy. And pretty far from standard C++.

Legacy C++

What? Yes, there is such a thing. C++ is by now a “mature”, aka “old”, language. In them halcyon days of primordial C++, there was no “standard C++”. And there were no people who had never coded C before coding C++.

L3

Consequently one can easily bump into legacy C++ using pointers to native arrays. But. As an added “benefit”, mixed with templates too. Sigh.

Using this with a native array is not easy.

Using the above with an instance of std::array is definitely not easy, to put it mildly. I have deliberately not used auto above, to visualize the level of difficulty.
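A minimal sketch of such legacy C++ and of its use with a native array. The names `l3_count` and `l3_demo` are hypothetical:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical legacy C++: pointer to a native array, plus templates.
template <typename T, std::size_t N>
std::size_t l3_count(T (*arr)[N])
{
    (void)arr;   // only the deduced size is of interest here
    return N;
}

inline std::size_t l3_demo()
{
    int native[4] = {1, 2, 3, 4};
    int (*ptr_to_array)[4] = &native;   // spelled out in full, no auto
    return l3_count(ptr_to_array);      // T = int and N = 4 are deduced
}
```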

L4

C++ has this thing called “native array reference”.

That is modern C++. And it can be very useful indeed. And no, that is not easy to use when having just an instance of std::array. And yes, with a native array that API is easy to use. Just pass the array to that function. An added benefit is that it will compile only if called with arrays.

But. Preserving the result as the native array reference is more involved. Focus.
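A minimal sketch, assuming a hypothetical `l4_pass` that takes a native array by reference and returns the very same reference. It shows what it takes to keep the result as an array reference instead of a decayed pointer:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical L4: native array reference in, native array reference out.
template <typename T, std::size_t N>
T (&l4_pass(T (&arr)[N]))[N]
{
    return arr;
}

inline bool l4_demo()
{
    char native[3] = {'a', 'b', 'c'};
    char (&kept)[3] = l4_pass(native);  // result kept as a native array reference
    auto &also_ref  = l4_pass(native);  // auto& keeps the array type too
    auto decayed    = l4_pass(native);  // plain auto decays to char*
    static_assert(sizeof(kept) == 3, "still an array");
    static_assert(sizeof(also_ref) == 3, "still an array");
    static_assert(sizeof(decayed) == sizeof(char *), "decayed to a pointer");
    kept[0] = 'x';                      // writes into native
    return native[0] == 'x' && decayed == native;
}
```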

OK then, these are the four legacy representatives. But first.

Is there a hack to save us all?

There is always a hack. And it is never to be used, like any other hack. The core of the problem, as we have all seen above, is that we do not have a native array. We have an instance of std::array. But, wait a minute? The std::array implementation contains one native array inside. And it is publicly available.

Funny fact: that member (named _Elems[_Size] in the MSVC implementation; other vendors use other names) must be public. Otherwise, one could not initialize std::array as an aggregate.

But that hack is never going to save anybody. That is a “path peppered with shards of glass”. Just do not go there. End of the hack.

Thus far, as we have understood, we do like std::array, but we cannot always enjoy its services. We need to use only the std::array public interface to develop a solution with which one can easily mingle with legacy APIs and the native arrays crowd.

Shaping up a solution

Let us think together. Shall we? As we all know C++ has this curious thing called a reference. And an even curiouser thing called array reference.  How is this helping us?  For starters, this is how we declare, define and use those in standard C++ (step by step):
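Those steps can be sketched as follows. The names `int3_ref` and `array_ref_demo` are made up for this illustration:

```cpp
#include <cassert>

inline int array_ref_demo()
{
    // 1. a native array
    int native[3] = {1, 2, 3};
    // 2. declare the type: "reference to an array of 3 ints"
    using int3_ref = int (&)[3];
    // 3. define the reference; it must be bound on the spot
    int3_ref ref = native;
    //    the same thing without the alias
    int (&ref2)[3] = native;
    // 4. use it exactly like the array itself
    ref[0] = 42;
    ref2[1] = 1;
    static_assert(sizeof(ref) == sizeof(native), "size is preserved, no decay");
    return native[0] + native[1];
}
```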

We know now how to declare, make and use the reference to a standard C++ native array. In an easy and compliant way. Fine. Great. So what?

Given std::array, how do we legally get to its internal native array, and use it as such? Answer: we produce a reference to it.

To transform the result of the std::array data() method, which is a pointer to T, into a reference to the internal native array, one needs to dive deep into the toxic waste of C style casts.

We need to cast the result of data from char pointer to the pointer of the internal array of 3 chars. And then we have to de-reference what we got so that we can assign it to the reference of the internal array. And yes, we need to know the size of the std::array we are using. Ugly as hell this is.
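A minimal sketch of that stunt. It assumes the elements are stored as one plain char[3], which holds for the mainstream implementations but is not spelled out by the standard. The name `cast_demo` is hypothetical:

```cpp
#include <array>
#include <cassert>

inline char cast_demo()
{
    std::array<char, 3> arr{{'a', 'b', 'c'}};
    // data() yields char*. Cast it to "pointer to array of 3 chars",
    // then dereference, to bind a reference to the internal native array.
    // Assumption: the storage really is a char[3] starting at data().
    char (&ref)[3] = *(char (*)[3])arr.data();
    ref[1] = 'x';                       // writes into arr
    static_assert(sizeof(ref) == 3, "a real array reference, not a pointer");
    return arr[1];
}
```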

Side note: I deliberately do not use reinterpret_cast, as I do not see it as safe or helpful in any use case. A C style cast is at least a much more obvious warning that some stunt is going on.

But. We digress. Back to the task at hand. Here is the design, aka the plan. Surely we should be able to package some solution to get to the reference to the native array present in every std::array. And, just to repeat, we need that in order to be able to use std::array with the legacy specimens listed above. And with other, much less dangerous, legacy specimens too.

But first we need to somehow manipulate the non-academic types in the audience, to help them leave us in peace… No, not you.

Methodology: “Do it quick and dirty, ’till five-thirty”

That is the title of a particular school of thought. And, inevitably, a portion of the audience of this post favours that kind of software development philosophy. Instead of antagonizing them, I might give them one very shiny foot gun. And let them go.

So. Inheritance for implementation is evil. But who cares. Why don’t we just develop a specialized std::array derivative, with the methods we need added into that potent mixture.

One might think that is a symphony of simplicity. Usage seems simple too:
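A minimal sketch of that foot gun and its usage, to be admired and then left alone. The name `legacy_array` and its two methods are hypothetical:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Foot gun, shown for illustration only: inheriting from a std:: type.
template <typename T, std::size_t N>
struct legacy_array : std::array<T, N>
{
    using native_ref = T (&)[N];
    using native_ptr = T (*)[N];
    // Assumption: the inherited storage is one plain T[N] starting at data().
    native_ref ref() { return *(native_ptr)this->data(); }
    native_ptr ptr() { return (native_ptr)this->data(); }
};

inline char footgun_demo()
{
    legacy_array<char, 3> arr{};
    arr.fill('a');
    char (&ref)[3] = arr.ref();   // reference to the internal native array
    ref[2] = 'z';
    return arr[2];
}
```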

Now. I know there are a number of people who will copy-paste the above and simply leave. Or it might even be that they have already done that? They (the “leavers”) might never ever experience the problems with the foot gun “solution” above. But that will be just by luck or by accident.

If one piece of C++ advice is rock solid, it is the following: Never (ever) inherit from std or develop “inside” the std namespace.

The reasons are numerous. This is the reasoning I would like to offer this time:

std lib of any vendor is constantly shape-shifting its implementation.

Each std lib release increment inevitably changes things inside.  Sometimes dramatically.

Do not ever rely on anything inside std lib.

However innocent it seems. Even in the unlikely scenario of not needing totally portable code. Clear? OK. Let us calmly proceed to:

Version One

We will use what we got from modern C++ and replace the above confused (not)solutions with two palatable solutions.

Function template, with scary-looking template arguments.  Do not despair if you have not seen this before. You might be thinking: “Oh boy, this C++ is endless,  I will never learn it all”.

Them template arguments are just a convenient place (I happen to like) to write a single function template with the internal types we need. Now beware. One using this utility has to be sure to receive the result explicitly as a reference, like this:

Otherwise, if we do not do it that way (as auto &), we are left with the pointer to T, again. Generally, one has to be very careful to stop the array decaying into the pointer to the first element.

And oh, by the way, above we have produced one very not-nice dangling reference, by creating a std::array as a function argument only, and then returning the reference to its internals. That is an example of one sure temporary object. We shall deal with this obvious issue right now. We will explicitly delete the function signature that allows for references to temporaries.

Again. Beware of using auto. Please do not make a mistake and be left with what we do not want: a pointer to the first element of an array.
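A minimal sketch of such a function template and of receiving its result. The name `internal_array_ref` and the template argument names `ARP` and `ARF` are assumptions, and the cast relies on the elements being stored as one plain native array:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// The defaulted template arguments only name the internal types we need;
// callers never spell them out.
template <typename T, std::size_t N,
          typename ARP = T (*)[N],   // pointer to the native array
          typename ARF = T (&)[N]>   // reference to the native array
ARF internal_array_ref(const std::array<T, N> &arr)
{
    // The C style cast also drops the const added by the parameter. Writing
    // through the result is valid only if the original object is not const.
    return *(ARP)arr.data();
}

inline char version_one_demo()
{
    std::array<char, 3> arr{{'a', 'b', 'c'}};
    auto &ref = internal_array_ref(arr);   // auto& keeps char(&)[3]
    auto ptr  = internal_array_ref(arr);   // plain auto decays to char*
    static_assert(sizeof(ref) == 3, "array reference");
    static_assert(sizeof(ptr) == sizeof(char *), "decayed pointer");
    (void)ptr;
    // This compiles too, but dangles: the temporary dies at the end of the statement.
    // auto &bad = internal_array_ref(std::array<char, 3>{{'a', 'b', 'c'}});
    ref[0] = 'x';
    return arr[0];
}
```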
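A sketch of that deletion, using the same assumed name `internal_array_ref`. The second overload is the better match for temporaries, so such calls are rejected at compile time:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <typename T, std::size_t N,
          typename ARP = T (*)[N], typename ARF = T (&)[N]>
ARF internal_array_ref(const std::array<T, N> &arr)
{
    // Assumption: the storage is one plain T[N] starting at data().
    return *(ARP)arr.data();
}

// Temporaries (rvalues) bind here first, and the call does not compile.
template <typename T, std::size_t N>
void internal_array_ref(const std::array<T, N> &&) = delete;

inline char no_temporaries_demo()
{
    std::array<char, 3> arr{{'a', 'b', 'c'}};
    auto &ref = internal_array_ref(arr);   // fine: arr outlives ref
    // auto &bad = internal_array_ref(std::array<char, 3>{{'a', 'b', 'c'}});
    //             error: call to a deleted function
    return ref[2];
}
```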

The above solution has its deficiencies, but it works. If users are (very) careful, that is. Also, to develop a “pointer to native array solution” we would simply need a separate (almost the same) function. Hmm. This all starts to look somewhat clunky to me. Take a look at the mandatory Wandbox solution.

You might like what we have up till now. Let us quickly proceed to a better solution without further ado.

Version Two

Good old template to the rescue.

Usage is classical, simple standard C++. First we instantiate the template into the type we will use, so that it deals with the exact std::array type we need to mix with legacy code. Remember: a template is not a type, it is just that: a template, waiting to be made into a type. Giving it concrete template arguments is what produces the actual type.

Thus, we have fully encapsulated the solution in one template struct. This “helper” struct keeps no data, just the std::array type it handles and the types missing from std::array as it stands today in standard C++. And there are only two static methods, returning a reference and a pointer to the internal native array of the std::array argument.

Moving or copying an instance of that type has zero overhead. There is no data and there are no instance methods, just class (static) methods. And.
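A minimal sketch of such a template and its usage. The names `array_util`, `iar` and `iap` are hypothetical, and the casts again assume the elements are stored as one plain native array:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <typename T, std::size_t N>
struct array_util final
{
    using std_arr_type = std::array<T, N>;   // the std::array type handled
    using arr_ref = T (&)[N];                // reference to the native array
    using arr_ptr = T (*)[N];                // pointer to the native array

    // internal array reference / internal array pointer
    static arr_ref iar(const std_arr_type &arr) { return *(arr_ptr)arr.data(); }
    static arr_ptr iap(const std_arr_type &arr) { return (arr_ptr)arr.data(); }

    // no temporaries allowed
    static void iar(const std_arr_type &&) = delete;
    static void iap(const std_arr_type &&) = delete;
};

inline bool version_two_demo()
{
    using A3 = array_util<char, 3>;          // the template made into a type
    A3::std_arr_type arr{{'a', 'b', 'c'}};
    A3::arr_ref ref = A3::iar(arr);
    A3::arr_ptr ptr = A3::iap(arr);
    ref[0] = 'x';
    (*ptr)[1] = 'y';
    return arr[0] == 'x' && arr[1] == 'y';
}
```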

Finally

The solution to the funny legacy quartet catalogued at the top:
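A minimal sketch of how that might look. The helper `array_util` and the four functions `l1` to `l4` are hypothetical stand-ins for the specimens described earlier, not anybody's real API:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: static access to the internal native array.
// Assumption: std::array stores its elements as one plain T[N].
template <typename T, std::size_t N>
struct array_util final
{
    using std_arr_type = std::array<T, N>;
    using arr_ref = T (&)[N];
    using arr_ptr = T (*)[N];
    static arr_ref iar(const std_arr_type &arr) { return *(arr_ptr)arr.data(); }
    static arr_ptr iap(const std_arr_type &arr) { return (arr_ptr)arr.data(); }
    static void iar(const std_arr_type &&) = delete;
    static void iap(const std_arr_type &&) = delete;
};

// L1: C99 "char arr[static 3]", which C++ sees as char*
extern "C" char l1(char *arr) { return arr[0]; }
// L2: pointer to an array of exactly 3 chars
extern "C" char l2(char (*arr)[3]) { return (*arr)[1]; }
// L3: legacy C++, pointer to a native array plus templates
template <typename T, std::size_t N>
T l3(T (*arr)[N]) { return (*arr)[N - 1]; }
// L4: native array reference in, native array reference out
template <typename T, std::size_t N>
T (&l4(T (&arr)[N]))[N] { return arr; }

inline bool quartet_demo()
{
    using A3 = array_util<char, 3>;
    A3::std_arr_type arr{{'a', 'b', 'c'}};

    char r1 = l1(A3::iar(arr));          // array reference decays to char*
    char r2 = l2(A3::iap(arr));          // exact type: char(*)[3]
    char r3 = l3(A3::iap(arr));          // T = char, N = 3 deduced
    A3::arr_ref r4 = l4(A3::iar(arr));   // result kept as an array reference
    r4[0] = 'x';                         // writes into arr

    return r1 == 'a' && r2 == 'b' && r3 == 'c' && arr[0] == 'x';
}
```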

Enjoyed it so far? Well, I have but not fully; not yet.

Caveat Emptor

Caveat emptor is Latin for “Let the buyer beware”. So far, the methods in the utility presented above receive const references to the instance and then manipulate the data() result, with casting stunts, to return what we want. Yes, we have explicitly banned using temporaries as arguments. But that is not enough.

Still, if and when the original array goes out of the calling scope, the result of these two functions becomes invalid. Either a dangling reference or a dangling pointer, that is.

So please treat them results the same as you treat any other references or pointers. What does this mean?

This means, one should remember, standard C++ is best when dealing primarily with values, not references or pointers. That simply means:

Do not carry around references or pointers, but values.

Do whatever you have to do inside the same scope where you have used this mechanism.
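A minimal sketch of that discipline. The legacy function `legacy_upper3` and the wrapper `upper3` are hypothetical. A value goes in, a value comes out, and the pointer to the internal array never leaves the scope it was made in:

```cpp
#include <array>
#include <cassert>
#include <cctype>

// Hypothetical legacy function: upper-cases exactly three chars, in place.
extern "C" void legacy_upper3(char (*arr)[3])
{
    for (int i = 0; i < 3; ++i)
        (*arr)[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>((*arr)[i])));
}

// Takes a value, returns a value. The pointer to the internal array is
// created, used and forgotten inside this one scope.
inline std::array<char, 3> upper3(std::array<char, 3> in)
{
    // Assumption: the elements are stored as one plain char[3].
    legacy_upper3((char (*)[3])in.data());
    return in;
}

inline char scope_demo()
{
    const std::array<char, 3> result = upper3({{'a', 'b', 'c'}});
    return result[0];
}
```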

This is a classical performance vs resilience balance. It is the balance one has to perpetually and skillfully watch while using powerful programming languages such as standard C++.

 
