C++ How to mix std::array and legacy code

Along with “the other one”, std::array is probably the most useful C++ std lib type. So, let’s use it all the time and everywhere? No can do.

It is not easy to mix std::array and legacy

Why not? First of all, there is this legacy code. And it is one stormy ocean of code, written in both C and C++. I am not talking about the C Run-Time (CRT) lib here. I am talking about more unusual challengers, the kind one can and does meet almost every day when trying to call them from C++ with a std::array in hand.

Very often developers deal with some “funny” legacy API call in their C++ “on the spot”. Often, it all looks easy and idiomatic and seemingly does not require much thinking. But that line of action can, and often does, result in some extremely nasty bugs. Very often descending into the realm of C. The beast that lives below. Which translates into a lot of man-hours spent. In fighting the beast.

For those challenging moments, I have developed a tiny C++ utility, that saved me (and others) a lot of time.

Ancient C manuscript, restored

 

I will present only four legacy functions, which I think will serve the purpose of explaining:

  1. What are the unusual legacy code problems
  2. Why is this small C++ utility needed (and effective)

Legacy specimens will be tagged L1 ... Ln. Let’s start with the deep end of the legacy pool.

Modern C and legacy C

Here are the two specimens one will not find in CRT. Mainly because CRT libs are made to serve almost all versions of C compilers, in existence today.

First is the legal C99 way of passing an array argument of a known size.

L1

/* C99 passing array argument of a known size */ 
char * 
L1
 ( int len_, char charr[len_] ) ;

If not using std::array this is easy to call from C++. Allow me to leave it as an exercise to the reader. This is modern C. Only 20 years old.
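For readers who would rather skip the exercise, here is a minimal sketch. It rests on two assumptions. First, the C99 parameter form `char charr[len_]` is C only, so the C++ side is assumed to see L1 as taking a plain `char *`. Second, the body of L1 below is a hypothetical stand-in (it just returns its argument) so that the sketch is self-contained; a real L1 would live in a C translation unit and be declared `extern "C"`.

```cpp
#include <cassert>

// Hypothetical stand-in for the C function L1.
// Assumption: from C++ the array parameter is seen as a plain char *.
// Assumption: the body simply hands back the buffer it was given.
char* L1(int len_, char* charr)
{
    (void)len_; // length is unused in this stand-in
    return charr;
}

// Calling L1 with a native array: the array decays to char *
// and its length travels alongside as a separate argument.
bool call_L1_with_native_array()
{
    char charr[] = { 'A', 'B', 'C' };
    char* rez = L1(static_cast<int>(sizeof charr), charr);
    return rez == charr && rez[0] == 'A';
}
```

With a native array nothing special is required: the decay to `char *` does all the work.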

L2

Next, is not so modern C. C95 anyone? Here we deal with that peculiar type: pointer to an array. Not the pointer to the first array element.  The distinction is important. Please consider this diagram.

/*
      +---------------+
      | +---+---+---+ |
+---> | |   |   |   | |          
|     | +-^-+---+---+ |
|     +---|-----------+
|         |
|         + first_element_ptr   
|
+ entire_array_ptr 
*/
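The difference shown in the diagram can be checked with a few lines of C++. This is only a sketch; `pointer_facts` and `compare_pointers` are hypothetical names made up for the illustration. Both pointers hold the same address, but they have different types and point to objects of different sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical record of what we learn about the two pointers.
struct pointer_facts {
    bool same_address;
    std::size_t first_element_size;
    std::size_t entire_array_size;
};

pointer_facts compare_pointers()
{
    static char arr[3] = { 'A', 'B', 'C' };

    char* first_element_ptr = arr;       // array decays to a pointer to its first element
    char (*entire_array_ptr)[3] = &arr;  // pointer to the whole array, no decay

    static_assert(!std::is_same<decltype(first_element_ptr),
                                decltype(entire_array_ptr)>::value,
                  "the two pointer types are distinct");

    return {
        static_cast<void*>(first_element_ptr) == static_cast<void*>(entire_array_ptr),
        sizeof(*first_element_ptr),  // one char
        sizeof(*entire_array_ptr)    // the whole array: three chars
    };
}
```

So incrementing `first_element_ptr` steps over one char, while incrementing `entire_array_ptr` steps over the whole array.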

The following is legal C, and legal C++ too. This is also what I would like to describe as “C atavism”. A good comment is here. Here is a typical specimen, seen roaming in the wild.

/*
pointer to an array of 3 chars
*/
typedef char(*charr_3_pointer)[3];
/*
argument is a pointer to an array of 3 chars
return is the same type
*/
charr_3_pointer 
L2
(charr_3_pointer entire_array_ptr );

The closer to the metal you wander, the more likely you are to meet this kind of C specimen. In my mind, this concept might be put to good use when the API author wants to be absolutely sure of the exact type allowed in as a pointer argument. The above is not a native char array decaying to char *; it is exactly a pointer to an array of three chars. That function cannot be called with anything else.

char  charr[] = { 'A','B','C' };

typedef char(*charr_3_pointer)[3];

charr_3_pointer exactly_3_chars_ptr = L2( & charr );

I can see (and I have seen) this in mission-critical C code and such. To call that function from C++, while having only a std::array instance available, is far from easy. And pretty far from standard C++.

Legacy C++

What? Yes, there is such a thing. C++ is by now a “mature” aka “old” language. In the halcyon days of primordial C++, there was no “standard C++”. And there were no people who had never coded C before coding C++.

L3

Consequently one can easily bump into legacy C++ using pointers to native arrays. But. As an added “benefit”, mixed with templates too. Sigh.

// receive pointer to array
// return pointer to the same array
template<
  typename T, size_t N,
  typename charp_type = T(*)[N]
>
charp_type 
L3 
( T(* arp_)[N]);

Using this even with a native array is not easy.

// C++
char  charr[] = { 'A','B','C' };
// auto * p_to_arr_of_3_chars = L3(&charr );
// or
char(* p_to_arr_of_3_chars )[3] =  L3(&charr );

To use the above with an instance of std::array is definitely not easy. To put it mildly. I have deliberately not used auto above, to visualize the level of difficulty.

L4

C++ has this thing called “native array reference”.

// receive native array reference
// return the same
template<
 typename T, size_t N,
 typename narf_type = T(&)[N]
>
narf_type 
L4
( T ( & arf_)[N] );

That is modern C++. And it can be very useful indeed. And no, it is not easy to use when all one has is an instance of std::array. And yes, with a native array it is easy to use: just pass the array to that function. An added benefit is that it will compile only if called with arrays.

But. Preserving the result as the native array reference is more involved. Focus.

char  charr[] = { 'A','B','C' };
// pass as array reference
// keep the result as array reference
char(& result)[3] = L4( charr );
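Why “more involved”? The sketch below shows what happens when the result is received with plain auto instead. It is an illustration under assumptions: L4 is given a simplified signature (without the defaulted third template argument) and an identity body, following the “return the same” comment above, and `auto_decays_but_auto_ref_does_not` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Simplified L4: receive a native array reference, return the same.
// The identity body is an assumption made for this sketch.
template <typename T, std::size_t N>
T (&L4(T (&arf_)[N]))[N]
{
    return arf_;
}

bool auto_decays_but_auto_ref_does_not()
{
    char charr[] = { 'A', 'B', 'C' };

    auto  decayed = L4(charr);  // reference dropped, array decays: char *
    auto& kept    = L4(charr);  // stays a native array reference: char (&)[3]

    static_assert(std::is_same<decltype(decayed), char*>::value,
                  "plain auto yields a pointer to the first element");
    static_assert(std::is_same<decltype(kept), char (&)[3]>::value,
                  "auto & yields the native array reference");

    // Only the reference still knows there are three elements.
    return sizeof(kept) == 3 && decayed == &kept[0];
}
```

Both variables refer to the same storage; only the one declared with `auto &` still carries the array type and size.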

OK then, these are the four legacy representatives. But first.

Is there a hack to save us all?

There is always a hack. And it is never to be used, like any other hack. The core of the problem, as we have all seen above, is we do not have a native array. We have an instance of std::array. But, wait a minute? std::array implementation,  contains one native array inside.  And it is publicly available.

// <array>
// MSVC std lib std::array source 
// start of it
template <class _Ty, size_t _Size>
class array { 
public:

// ... implementation here ...

// end of it
// native array on stack
    _Ty _Elems[_Size];
};

Funny fact: _Elems[_Size] must be public. Otherwise, one could not use std::array and initialize it as aggregate.

// define and initialize the std::array
// as aggregate struct
// for this to work _Elems must be public
std::array<int,2> i2 { 1, 2 } ;

// now we can use the   _Ty _Elems[_Size]; 
// calling L3 for example
// HACKATRON VULGARIS next
auto const & whatever =  
    L3(& i2._Elems );

But that hack is never going to save anybody. That is a “path peppered with shards of glass”. Just do not go there. End of the hack.

Thus far we have understood that, much as we like it, we cannot always enjoy the services of std::array. We need to use the std::array public interface to develop a solution that lets it mingle easily with legacy APIs and the native array crowd.

Shaping up a solution

Let us think together. Shall we? As we all know C++ has this curious thing called a reference. And an even curiouser thing called array reference.  How is this helping us?  For starters, this is how we declare, define and use those in standard C++ (step by step):

/* 
native array as a type
easy and compliant standard C++ way 
*/
using charray  = char [] ;
/* 
concretize that native array type 
with an instance of it
*/
charray arr_of_chars{ 'A','B','C' };
/*
get to its reference 
compiler generates the reference to array for you
*/
auto & ref_to_arr_of_chars = arr_of_chars;
/*
using auto above is much easier than
char (&ref_to_arr_of_chars)[3]  = arr_of_chars;
*/

We know now how to declare, make and use the reference to a standard C++ native array. In an easy and compliant way. Fine. Great. So what?

Given std::array, how do we legally get to its internal native array, and use it as such? Answer: we produce a reference to it.

To transform the result of the std::array data() method (a const T *) into a reference to its internal native array, one needs to dive deep into the toxic waste of C-style casts.

std::array<char,3> 
    arr_of_chars{ 'A','B','C' }; 

// do  not repeat this at home
const char(&ref_to_arr_of_chars)[3] = 
   *(char(*)[3])
      arr_of_chars.data();

We need to cast the result of data from char pointer to the pointer of the internal array of 3 chars. And then we have to de-reference what we got so that we can assign it to the reference of the internal array. And yes, we need to know the size of the std::array we are using. Ugly as hell this is.

Side note: I deliberately do not use reinterpret_cast, as I do not see it as safe or helpful in any use case. A C-style cast is at least a much more obvious warning that some stunt is going on.

But. We digress. Back to the task at hand.  Here is the design aka the plan. Surely we should be able to package some solution, to get to the reference to the native array, present in every std::array . And,  just to repeat, we need that in order to be able to use std::array with legacy specimens listed above. And with other much less dangerous legacy specimens, too.

But first we need to somehow manipulate the non-academic types in the audience, to help them leave us in peace… No, not you.

Methodology: “Do it quick and dirty, ’till five-thirty”

That is the title of a particular school of thought. And, inevitably, a portion of the audience of this post favours that kind of software development philosophy. Instead of antagonizing them, I might give them one very shiny foot gun. And let them go.

So. Inheritance for implementation is evil. But who cares. Why don’t we just develop a specialized std::array derivative, with the methods we need added into that potent mixture?

//
// WARNING: using MSVC, up to date as of 2020 Mar 18
// C++ version 17
// using default set of compiler switches and 
// building a standard Windows Application
//
template <typename T, size_t N>
struct legacy_compliant_array final
 : public std::array<T, N> {
 
 using parent = std::array<T, N>;
 using type = legacy_compliant_array;

 // native array reference
 using narf = T(&)[N];
 // native array pointer
 using narp = T(*)[N];

 narf internal_array_reference() {
   return  * narp( this->data()  );
 }

  narp internal_array_pointer() {
   return  narp( this->data()  );
 }
}; // legacy_compliant_array

One might think that is a symphony of simplicity. Usage seems simple too:

legacy_compliant_array<char, 3> charr{ {'X','Y','Z'} };

// let us try this on our legacy 
// specimens L3 and L4
legacy_compliant_array<char, 3>::narf
  narf_1 = 
    L4(charr.internal_array_reference());

legacy_compliant_array<char, 3>::narp
  narp_2 = 
   L3(charr.internal_array_pointer());

Now. I know there is a number of people who will copy-paste the above and simply leave.  Or it might even be they have already done that? They (the “leavers”) might even never ever experience the problems with the foot gun “solution” above. But, that will be just by luck or by accident.

If one piece of C++ advice is rock solid, it is the following: Never (ever) inherit from std types or develop “inside” the std namespace.

The reasons are numerous. This is the reasoning I would like to offer this time:

std lib of any vendor is constantly shape-shifting its implementation.

Each std lib release increment inevitably changes things inside.  Sometimes dramatically.

Do not ever rely on anything inside std lib.

However innocent it seems. Even in an unlikely scenario of not needing a totally portable code. Clear? OK. Let us calmly proceed, to:

Version One

We will use what we got from modern C++ and replace the above confused (not)solutions with two palatable solutions.

/*
(c) 2018-2020 by dbj.org CC BY SA 4.0 

return array reference to the 
native array inside std::array
*/
template<typename T, size_t N,
/* std::array */
typename ARR = std::array<T, N>, 
/* the native array */
typename ART = T[N],    
/* reference to it */
typename ARF = ART & ,  
/* pointer to it */
typename ARP = ART * >  
constexpr inline 
ARF
internal_array_reference(const std::array<T, N> & arr)
{
  return *(ARP) 
      const_cast<typename ARR::pointer>
         (arr.data());
}

Function template, with scary-looking template arguments.  Do not despair if you have not seen this before. You might be thinking: “Oh boy, this C++ is endless,  I will never learn it all”.

Those template arguments are just a convenient place (one I happen to like) to declare the internal types we need, while still writing a single function template. Now beware. Anyone using this utility has to be sure to receive the result explicitly as a reference, like this:

int main()
{
// spot the '&' after the auto!
auto & narf = 
 internal_array_reference( std::array<int,3>{1,2,3} ) ;
// narf is native array reference after this
// quick check  
  for ( auto & e_ : narf ) {
      cout << e_ << "\n" ;
  }
}

Otherwise, if we would not do it that way ( as auto & ) we would be left with the pointer to T, again. Generally, one has to be very careful to stop the array decay into the pointer to the first element.

And oh, by the way, above we have produced one very not-nice dangling reference, by creating std array as a function argument only, and then returning the reference to its internals.  That is an example of one sure temporary object. We shall deal with this obvious issue right now. We will explicitly delete the function signature that allows for references to temporaries.

// we allow references, but not references to temporaries
template<typename T, size_t N >
constexpr inline auto 
internal_array_reference
(const std::array<T, N> && arr) = delete ;
// this is standard C++ mechanism
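How such a deleted overload behaves can be checked at compile time. The sketch below is an illustration only: `first_of` and `can_call_first_of` are hypothetical names, and `first_of` is a deliberately trivial accessor standing in for the utility. A call with an lvalue is well-formed; a call with a temporary selects the deleted overload and is rejected.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical accessor, used only to illustrate the deleted overload.
template <typename T, std::size_t N>
constexpr const T& first_of(const std::array<T, N>& arr)
{
    return arr[0];
}

// Temporaries prefer this overload, and it is deleted.
template <typename T, std::size_t N>
constexpr const T& first_of(const std::array<T, N>&& arr) = delete;

// Detection idiom: is the call first_of(A) well-formed?
template <typename A, typename = void>
struct can_call_first_of : std::false_type {};

template <typename A>
struct can_call_first_of<A, decltype(void(first_of(std::declval<A>())))>
    : std::true_type {};

static_assert(can_call_first_of<std::array<int, 3>&>::value,
              "lvalues are accepted");
static_assert(!can_call_first_of<std::array<int, 3>>::value,
              "temporaries are rejected at compile time");
```

The rejection happens at overload resolution, so the mistake never reaches run time.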

Again. Beware of using auto. Please do not make a mistake and be left with what we do not want. A pointer to the first element in an array.

The above solution has its deficiencies, but it works. If users are (very) careful, that is. Also, to develop a “pointer to native array solution” we would simply need a separate (almost the same) function as above. Hmm. This all starts to look somewhat clunky to me. Take a look at the mandatory WANDBOX solution.

You might like what we have up till now. Let us quickly proceed to a better solution without further ado.

Version Two

Good old template to the rescue.

/*
Array Helper struct
(c) 2018-2020 by dbj.org
CC BY SA 4.0
*/
template< typename T, size_t N >
struct ARH final
{
// std::array type
typedef std::array<T, N> ARR;
// inbuilt ARray type
typedef T ART[N];
// reference to ART
typedef ART& ARF;
// pointer to ART
typedef ART* ARP;

/*
return pointer to the underlying array
of an instance of std::array<T,N>
*/
static constexpr ARP
  to_arp(const ARR & arr)
  {
   return (ARP)
       const_cast<typename ARR::pointer>
          (arr.data());
  }
// ban temporary references as arguments
static constexpr ARP to_arp(const ARR && ) = delete ;
/*
return reference to the underlying array
of an instance of std::array<T,N>
*/
static constexpr ARF
   to_arf(const ARR & arr)
   {
    return *(ARP) 
       const_cast<typename ARR::pointer>
         (arr.data());
   }
// ban temporary references as arguments
static constexpr ARF to_arf(const ARR &&) = delete ;
};

Usage is classical, standard (simple) C++. First, we instantiate the template into the type we will use, matching the exact std::array type we need to mix with legacy code. Remember: a template is not a type; it is just that: a template, waiting to be made into a type. Supplying concrete types as template arguments instantiates it, and the result of that instantiation is the type.

// handler of native int[3]
// and of std::array<int,3>
// A3 is now a type, instantiated from the template
using A3 = ARH<int, 3>;
// That instantiation contains
// all the nested types we need

// instantiate std::array<int,3>
A3::ARR arr{1,2,3};

// get to the  
// pointer to the native array
// inside 
A3::ARP arp = A3::to_arp(arr);

// get to the reference 
// to the native array inside
A3::ARF arf = A3::to_arf(arr);

// notice how above we just use the nested types 
// prepared for us
// A3::ARP and A3::ARF

Thus, we have fully encapsulated the solution in one template struct. This “helper” struct keeps no data. It holds just the std::array type being handled, plus the types missing from std::array as it stands today in standard C++. And there are only two static methods, returning a reference and a pointer to the internal native array of the std::array argument.
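The “no data” claim can be checked by the compiler. What follows is a minimal sketch, not the ARH struct itself: `array_helper` and `write_through_reference` are hypothetical names, the helper is cut down to one static function, and it takes a non-const reference (a simplification that also keeps temporaries out). The sizeof guard is an added assumption: the cast relies on std::array<T, N> storing nothing but one T[N], which holds on the common implementations but is not spelled out as a guarantee by the standard.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical cut-down helper: nested types and one static function only.
template <typename T, std::size_t N>
struct array_helper final
{
    using ARR = std::array<T, N>;
    using ART = T[N];
    using ARF = ART&;
    using ARP = ART*;

    // Guard for the layout assumption the cast below relies on.
    static_assert(sizeof(ARR) == sizeof(ART),
                  "std::array<T, N> layout differs from T[N]");

    // Non-const lvalue reference: temporaries cannot bind here.
    static ARF to_arf(ARR& arr)
    {
        return *(ARP)arr.data();
    }
};

// No non-static data members: the helper is an empty type.
static_assert(std::is_empty<array_helper<int, 3>>::value,
              "helper carries no data");

// Writing through the native array reference changes the std::array.
int write_through_reference()
{
    array_helper<int, 3>::ARR arr{ {1, 2, 3} };
    array_helper<int, 3>::ARF arf = array_helper<int, 3>::to_arf(arr);
    arf[0] = 42;
    return arr[0];
}
```

Nobody ever needs an instance of such a helper; it is used purely through its nested types and static functions.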

Moving or copying the above helper is zero overhead. There is no data and there are no instance methods, just class (static) methods. And.

Finally

The solution to the funny legacy quartet catalogued at the top:

// first we declare our helper
// there is no class and no object
// just a type
using arh_char_3 = ARH<char,3> ;

// pointer to native array of 3 chars
// we can use auto too
typedef char(*charr_3_pointer)[3];

// same as 
// std::array sar = { 'A', 'B', 'C' };
arh_char_3::ARR sar = { 'A', 'B', 'C' }; 

// dealing with C legacy L1 and L2
// we make and pass reference to the array 
// inside the sar object
char * rez_1
 = L1( sar.size(),  arh_char_3::to_arf(sar) ); 

// we make and pass pointer to the array 
// inside the sar object
charr_3_pointer rez_2
 = L2(arh_char_3::to_arp(sar)); 

// dealing with C++ legacy L3 and L4

auto rez_3
 = L3(arh_char_3::to_arp(sar));

auto& rez_4
  = L4(arh_char_3::to_arf(sar));

Enjoyed it so far? Well, I have but not fully; not yet.

Caveat Emptor

Caveat emptor is Latin for “Let the buyer beware”. So far, the methods in the utility presented above receive a const reference to the instance and then manipulate the data() result, with casting stunts, to return what we want. Yes, we have explicitly banned using temporaries as arguments. But that is not enough.

Still,  if and when the original array goes out of calling scope the result of these two functions becomes invalid.  Either a dangling reference or a dangling pointer, that is.

So please treat those results the same as you treat any other references or pointers. What does this mean?

This means, one should remember, standard C++ is best when dealing primarily with values, not references or pointers. That simply means:

Do not carry around references or pointers, but values.

Do whatever you have to do inside the same scope where you have used this mechanism.

This is a classical performance vs resilience balance. It is the balance one has to watch perpetually and skillfully while using powerful programming languages such as standard C++.
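One way to follow that advice is sketched below, under stated assumptions: `legacy_sum` is a hypothetical legacy function that insists on a native array reference, and `sum_of` is a hypothetical wrapper. The native array reference is produced, used, and dropped inside one scope, and only a value leaves it. The cast is the same stunt as before and carries the same layout assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical legacy function: accepts exactly a native array of 3 ints.
int legacy_sum(const int (&arr)[3])
{
    return arr[0] + arr[1] + arr[2];
}

// Obtain the native array reference, use it, return a value.
// Neither the reference nor a pointer escapes this scope.
int sum_of(const std::array<int, 3>& sar)
{
    // Guard for the layout assumption behind the cast.
    static_assert(sizeof(std::array<int, 3>) == sizeof(int[3]),
                  "std::array<int, 3> layout differs from int[3]");

    const int (&arf)[3] = *(const int (*)[3])sar.data();
    return legacy_sum(arf);
}
```

Because only an int is returned, the caller may even pass a temporary std::array: it lives until the end of the full expression, which is longer than the reference inside `sum_of` is ever used.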

 

