[2021 Apr 08]
Godbolt “proof of the pudding” link added. (Or is it “better” to say “the proof is in the pudding”?)
[2020 Mar 18]
Please find part one here. That is one way of dealing with native arrays. Here is another one. You choose.
With the vector, std::array is probably the most useful C++ std lib type. So, let’s use it all the time and everywhere.
No can do.
It is not easy to mix std::array and legacy code
Why not? First of all, there is this thing mentioned above: the “legacy code”. And that, my honourable reader, is one stormy ocean of code beneath this world, written in both C and C++ during the last few decades.
Reminder: part one deliberately does not handle native arrays with std::array. Here is another approach.
I am not talking about C Run-Time (CRT) lib calls here. I am talking about somewhat more unusual challenges, the kind one can and does meet almost every day while trying to use std::array with legacy functions, calling them from C++.
Very often developers solve some “funny” legacy API calls, in their C++, “on the spot”. Often, it all looks easy and idiomatic and seemingly does not require much thinking. But that line of action can, and often does, result in some extremely nasty bugs. Very often descending into the realm of C. The beast that lives below. This translates into a lot of man-hours spent. In Fighting the Beast.
For those challenging moments, I have developed a tiny C++ utility that saved me (and others) a lot of time. I will present only four categories of legacy functions, which I think will serve the purpose of explaining:
What are the unusual legacy code problems?
Why is this small C++ utility needed (and effective)?
Legacy specimens will be tagged L1 ... L4. Let’s start with the deep end of the legacy pool.
Modern C and legacy C
Here are two specimens one will not find in any CRT, mainly because CRT libs are made to serve almost all versions of C compilers in existence today.
First is the legal C99 way of passing an array argument of a (maybe) known size.
L1
    /*
    C99 passing array argument of a known size
    of course the callers can choose to ignore
    the -Wnonnull
    */
    char * L1 (const int len_, char charr[static len_] ) ;
If not using std::array, this is easy to call from C++.
    // native array use
    char arr[] = {'0','1','2'};
    char * rez = L1( 3, arr ) ;
That is modern C. Only 20 years old.
L2
The next legacy example is not-so-modern C, but I do like a pointer to an array. C95 anyone? Here we deal with that peculiar type: a pointer to an array. Not a pointer to the first array element. The distinction is important. Slight detour first.
Address of a string literal
    // C++
    auto mystery = & "LITERAL";
Why would anybody do this? What is the type of that mystery? It is a pointer to an array of 8 const chars, of course (7 letters plus the terminating zero):
    // the type of the mystery variable
    char const (*) [8]
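One way to convince yourself of that type is to let the compiler confirm it. The following is only a minimal sketch, assuming C++17; the function name literal_address_is_pointer_to_array is made up for this illustration and is not part of the post's utility.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper, for illustration only.
// The address of a string literal is a pointer to the whole array,
// terminating zero included.
inline bool literal_address_is_pointer_to_array()
{
    auto mystery = &"LITERAL";

    // compile-time check of the deduced type
    static_assert(
        std::is_same_v<decltype(mystery), const char (*)[8]>,
        "expected a pointer to an array of 8 const chars");

    // de-referencing gives back the array itself,
    // so sizeof sees all 8 chars, not the size of a pointer
    return sizeof(*mystery) == 8;
}
```

If the static_assert compiles, the claim about the type holds on your compiler; the returned bool only repeats the size check at run time.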
And yes, there are of course, “clever” C++ authors using this “feature” to pass arrays into low-level C APIs. But what is this “pointer to array” thing? Please consider this diagram.
The following legacy code is legal C and legal C++ too. This is also what I would like to describe as “C atavism”. A good comment is here. Here is a typical specimen, seen roaming in the wild.
    /* pointer to array of 3 chars */
    typedef char(*charr_3_pointer)[3];
    /*
    argument pointer is to array of 3 chars
    return is the same type
    */
    charr_3_pointer
    L2 (charr_3_pointer entire_array_ptr );
The closer to the metal you wander, the more likely you are to meet this kind of C specimen. In my mind, this idiom might be put to good use when API authors want to be absolutely sure of the exact type allowed as a pointer argument to their function (using C). The above is not a native char array decaying to char *; it is exactly a pointer to an array of three chars. That function can not be called with anything else.
    char charr[] = { 'A','B','C' };
    typedef char(*charr_3_pointer)[3];
    charr_3_pointer exactly_3_chars_ptr = L2( & charr );
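The difference between “pointer to the array” and “pointer to the first element” is perhaps easier to see in code. This is a hedged sketch only: the post declares L2 without a body, so the trivial body below is an assumption, and pointer_to_array_demo is a made-up name.

```cpp
#include <cassert>
#include <cstddef>

typedef char (*charr_3_pointer)[3];

// ASSUMPTION: the post only declares L2; this body is a stand-in
// that simply hands the same pointer back.
charr_3_pointer L2(charr_3_pointer entire_array_ptr)
{
    return entire_array_ptr;
}

// Hypothetical demo function, for illustration only.
inline bool pointer_to_array_demo()
{
    char charr[] = {'A', 'B', 'C'};

    charr_3_pointer whole = L2(&charr); // pointer to the entire array
    char* first = charr;                // decayed: pointer to the first element

    // both hold the same address ...
    const bool same_address =
        static_cast<void*>(whole) == static_cast<void*>(first);

    // ... but the pointed-to types differ: 3 chars versus 1 char
    const bool sizes_differ =
        sizeof(*whole) == 3 && sizeof(*first) == 1;

    // elements are reached by de-referencing the array pointer first
    const bool element_ok = (*whole)[1] == 'B';

    // char four[4] = {};
    // L2(&four);   // does not compile: char(*)[4] is not char(*)[3]
    // L2(first);   // does not compile: char* is not char(*)[3]

    return same_address && sizes_differ && element_ok;
}
```

The two commented-out calls are the point of the idiom: the size is part of the type, so the compiler rejects anything that is not exactly an array of three chars.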
I can see (and I have seen) this in mission-critical C code and such. To call that function from C++, while having only a std::array instance available, is far from easy. And pretty far from standard C++.
Legacy C++
What? Yes, there is such a thing. C++ is by now a “mature”, aka “old”, language. In the halcyon days of primordial C++, there was no “standard C++”. And once upon a time, there were no people who had never coded C before coding C++.
L3
Consequently one can easily bump into legacy C++ using pointers to native arrays. But. As an added “benefit”, mixed with templates too. Sigh.
    // receive pointer to native array
    // return pointer to the same array
    template< typename T, size_t N,
        typename charp_type = T(*)[N] >
    charp_type
    L3 ( T(* arp_)[N]);
Using that with the native array is not easy.
    // C++
    char charr[] = { 'A','B','C' };
    char(* p_to_arr_of_3_chars )[3] = L3(&charr );
And using the above with an instance of std::array is definitely not easy. To put it mildly.
    std::array stdarr = {'A', 'B', 'C'};
    char(* p_to_arr_of_3_chars )[3] =
        L3( (char(*)[stdarr.size()])stdarr.data() );
Far away from the comfort zone that is.
L4
The next legacy specimen is adjacent to the pointer to the array. C++ has this thing called “native array reference”. That is a type mechanism C does not have.
    // receive native array reference
    // and return the same
    template< typename T, size_t N,
        typename narf_type = T(&)[N] >
    narf_type
    L4 ( T ( & arf_)[N] );
That declaration is modern C++. And it can be very useful indeed. And no, that is not easy to use having just an instance of std::array. With a native array, that API is easy to use: just pass the array to the function. An added benefit is that it will compile only if called with native arrays.
But. Preserving the result as the native array reference is more involved. Focus, please.
    char charr[] = { 'A','B','C' };
    // pass as array reference
    // keep the result as an array reference
    char(& result)[3] = L4( charr );
    // Warning: NOT an array reference
    // resulting type below is: char *
    auto result_2 = L4( charr );
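The decay can be made visible at compile time. A minimal sketch, assuming C++17: the body given to L4 is an assumption (the post only declares it), decay_demo is a made-up name, and decltype(auto) is an extra option that the post itself does not use.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// ASSUMPTION: the post only declares L4; this body simply
// returns the same native array reference it was given.
template <typename T, std::size_t N, typename narf_type = T (&)[N]>
narf_type L4(T (&arf_)[N])
{
    return arf_;
}

// Hypothetical demo function, for illustration only.
inline bool decay_demo()
{
    char charr[] = {'A', 'B', 'C'};

    auto decayed = L4(charr);             // plain auto: array decays
    auto& kept = L4(charr);               // auto&: stays an array reference
    decltype(auto) also_kept = L4(charr); // keeps the exact return type

    static_assert(std::is_same_v<decltype(decayed), char*>);
    static_assert(std::is_same_v<decltype(kept), char (&)[3]>);
    static_assert(std::is_same_v<decltype(also_kept), char (&)[3]>);

    // the reference still knows the size, the decayed pointer does not
    return sizeof(kept) == 3
        && sizeof(also_kept) == 3
        && decayed == &kept[0];
}
```

All three variables refer to the same storage; only the amount of type information that survives is different.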
OK then, we are done with our herbarium of legacy specimens; these are the four legacy representatives we shall use. I think the audience is convinced by now that we are not “barking up the wrong tree”, and that audience motivation is now firm indeed. But wait … I can hear a question.
Is there a hack to save us all?
There is always a hack. And it is never to be used, like any other hack. Here it is.
The core of the problem, as we have all seen above, is that in modern C++ we do not use or “have” a native array. We have an instance of std::array instead. But wait a minute: the std::array implementation contains one native array inside. And it is publicly available. The MS STL std::array source looks something like this:
    //
    // MS STL
    template<typename _Ty, size_t _Size >
    class array {
    public:
        // ... implementation here ...
        // end of it
        // NOT! private native array
        // as a data member
        // What could possibly go wrong?(tm)
        _Ty _Elems[_Size];
    };
The funny fact is that _Elems[_Size] must be public. Otherwise, one could not use std::array and initialize it as an aggregate.
    // define and initialize the std::array
    // as aggregate struct
    // for this to work _Elems must be public
    std::array<int,2> i2 {'1', '2' } ;
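The link between “public data member” and “aggregate initialization” can be checked with a standard type trait. This is a sketch under the assumption of C++17 (std::is_aggregate_v); the two small types and the function aggregate_demo are invented purely for this illustration.

```cpp
#include <array>
#include <cassert>
#include <type_traits>

// Hypothetical type: the native array is a public member
struct public_elems {
    int elems[2];
};

// Hypothetical type: the same array, but private
class private_elems {
    int elems[2];
public:
    int first() const { return elems[0]; }
};

// a class with a private non-static data member is not an aggregate
static_assert(std::is_aggregate_v<public_elems>);
static_assert(!std::is_aggregate_v<private_elems>);

// std::array is specified to be an aggregate
static_assert(std::is_aggregate_v<std::array<int, 2>>);

// Hypothetical demo function, for illustration only.
inline int aggregate_demo()
{
    public_elems p{{1, 2}};     // aggregate initialization works
    std::array<int, 2> a{1, 2}; // same mechanism for std::array
    // private_elems q{{1, 2}}; // does not compile: not an aggregate
    return p.elems[1] + a[1];
}
```

The standard only promises that std::array is an aggregate; the name of the inner member (_Elems in MS STL) is an implementation detail.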
“Vulgaris” in Latin means common. Our “HACKATRON VULGARIS” begins here. To call the L3 legacy function catalogued above, for example, we can use _Ty _Elems[_Size], because we have seen in the source that it is inside std::array.
Ditto. To get to that native array inside the std::array implementation, if we use MS STL and if this is never going to be silently changed in any way, we can simply do this:
    // MS STL
    std::array<int,2> i2 {'1', '2' } ;
    // use the pointer to the
    // std::array internal native array
    auto const & whatever = L3(& i2._Elems );
But that hack is never going to save anybody. That is a “path peppered with shards of glass”. Just do not go there. End of the hack. The reasons are many and varied. If you do not consider any of them as valid there is only one: the implementation of any std:: lib is constantly shape-shifting.
Shaping up a solution
Thus far, as we have understood, we do like it, but we can not always enjoy the services of good old std::array. We need to use the std::array public interface in order to develop a solution in which one can easily mingle with legacy APIs and the native array crowd at the same time.
Let us think together. Shall we? As we all know C++ has this curious thing called a reference. And an even curiouser thing called array reference. How is this helping us? For starters, this is how we declare, define and use those in standard C++ (step by step):
    /*
    native array as a type
    easy and compliant standard C++ way
    */
    using charray = char [] ;
    /*
    concretize that native array type
    with an instance of it
    */
    charray arr_of_chars{ 'A','B','C' };
    /*
    get to its reference
    compiler generates the reference to array for you
    */
    auto & ref_to_arr_of_chars = arr_of_chars;
    /*
    using auto above is much easier than
    char (&ref_to_arr_of_chars)[3] = arr_of_chars;
    */
We know now how to declare, make and use the reference to a standard C++ native array. In an easy and compliant way. Fine. Great. So what?
Given std::array how do we legally get to its internal native array, and use it as such? Answer: we produce a reference to it. Without reaching it in an illegal direct way.
To transform the result of the std::array const T * data() method into a reference to its internal native array, one needs to dive deep into the toxic waste of C-style casts.
    std::array<char,3> arr_of_chars{ 'A','B','C' };
    // do not repeat this at home
    const char(&ref_to_arr_of_chars)[3] =
        *(char(*)[3]) arr_of_chars.data();
We need to cast the result of the data() method from the char pointer to the pointer to the internal array of 3 chars. And then we have to de-reference what we got, so that we can bind it to the reference to the internal array. And yes, we need to know the size of the std::array instance we are using. Ugly as hell that is.
Side note: I deliberately do not use reinterpret_cast, as I do not see it as safe or helpful in any use case. A C-style cast is at least a much more obvious warning that some stunt is going on.
But. We digress. Back to the task at hand. Here is the design, aka the plan. Surely we should be able to package some solution to get to the reference to the native array present in every std::array. And, just to repeat, we need that in order to be able to use std::array with the legacy specimens listed above. And with other, much less dangerous legacy specimens, too.
But we need first to somehow manipulate non-academic types in the audience, to help them leave us in peace… No, not you.
The methodology, known as: “Do it quick and dirty, ’till five-thirty”
That is the title of a particular school of thought in existence. And inevitably, a portion of the audience of this post favours that kind of software development philosophy. Instead of antagonizing them, I might give them one very shiny foot gun. And let them go and use it.
So. Inheritance for implementation is evil. But who cares, I can hear “them” say. Why don’t we just develop a specialized std::array derivative, with the methods we need added into that potent magical mixture?
    //
    // WARNING: using MSVC, up to date as of 2020 Mar 18
    // C++ version 17
    // using default set of compiler switches and
    // building a standard Windows Application
    //
    template<typename T, size_t N>
    struct legacy_compliant_array final
        : public std::array<T, N>
    {
        using parent = std::array<T, N>;
        using type = legacy_compliant_array;
        // native array reference
        using narf = T(&)[N];
        // native array pointer
        using narp = T(*)[N];

        narf internal_array_reference() {
            return * narp( this->data() );
        }
        narp internal_array_pointer() {
            return narp( this->data() );
        }
    }; // legacy_compliant_array
One might think that is a symphony of simplicity. Usage seems simple too:
    legacy_compliant_array<char, 3>
        charr{ {'X','Y','Z'} };
    // let us try this on our legacy
    // specimens L3 and L4
    legacy_compliant_array<char, 3>::narf
        narf_1 = L4(charr.internal_array_reference());
    legacy_compliant_array<char, 3>::narp
        narp_2 = L3(charr.internal_array_pointer());
Now. I know there are a number of people who will copy-paste the above and simply leave this blog. Or it might even be they have already done that. They (the “leavers”) might never ever experience the problems with the foot gun “solution” above. But that will be just by luck or by accident.
If one C++ piece of advice is rock solid, it is the following: Never (ever) inherit from std or develop “inside” the std namespace.
The reasons are numerous. And we have already mentioned one: std lib of any vendor is constantly shape-shifting its implementation.
Each std lib release increment inevitably changes things inside. Sometimes dramatically. Do not ever rely on anything inside the std lib. However innocent it seems. Even in an unlikely scenario of not needing a totally portable code. Clear? OK. Let us calmly proceed, to the solution (at last) :
Version One
We will use what we got from modern C++ and replace the above confused (not)solutions, with two palatable solutions.
    /*
    (c) 2018-2021 by dbj.org, https://dbj.org/license_dbj/
    return array reference to the native array inside std::array
    */
    template< typename T, size_t N,
        /* std::array */
        typename ARR = std::array<T, N>,
        /* the native array */
        typename ART = T[N],
        /* reference to it */
        typename ARF = ART & ,
        /* pointer to it */
        typename ARP = ART * >
    constexpr inline ARF
    internal_array_reference(const std::array<T, N> & arr)
    {
        return *(ARP) const_cast<T*>(arr.data());
    }
Function template, with scary-looking template arguments. You might be thinking: “Oh boy, this C++ is endless, I will never learn it all”. Do not despair if you have not seen this before.
Those template arguments are just a convenient place (I happen to like) to write, in a single function template, the internal types we need. Now beware. Whoever uses this utility has to be sure to receive the result explicitly as a reference, like this:
    int main()
    {
        // spot the '&' after the auto
        // preserve it as a reference!
        auto & narf = internal_array_reference(
            std::array<int,3>{1,2,3}
        ) ;
        // narf is native array reference after this
        // quick check
        for ( auto & e_ : narf ) {
            std::cout << e_ << "\n" ;
        }
    }
Otherwise, if we did not do it that way (as auto &), we would be left with the pointer to T, again. Generally, one has to be very careful to stop the “magical” array decay into the pointer to the first element.
And oh, by the way, above we have produced one very not-nice dangling reference, by creating a std array as a function argument only, and then returning the reference to its internals. That is an example of one sure temporary object foot gun. We shall deal with this obvious issue right now. We will explicitly delete the function signature that allows for references to temporaries.
    // we allow references, but not references to temporaries
    template<typename T, size_t N>
    constexpr inline auto
    internal_array_reference (const std::array<T, N> && arr) = delete ;
    // this is standard C++ mechanism
Again. Beware of using auto. Please do not make a mistake and be left with what you do not want: a pointer to the first element of the array.
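To see the deleted overload at work without triggering a compile error, one can ask the compiler whether a call would be accepted. This is a simplified sketch, not the exact Version One code: it assumes C++17, spells the return type explicitly through a made-up alias narf_t instead of using defaulted template arguments and auto, and the trait is_accepted and the function sum_through_reference are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// hypothetical alias: reference to a native array of N elements
template <typename T, std::size_t N>
using narf_t = T (&)[N];

// lvalues are welcome
template <typename T, std::size_t N>
narf_t<T, N> internal_array_reference(const std::array<T, N>& arr)
{
    return *(T(*)[N]) const_cast<T*>(arr.data());
}

// temporaries are banned
template <typename T, std::size_t N>
narf_t<T, N> internal_array_reference(const std::array<T, N>&&) = delete;

// hypothetical detection trait: true when the call would compile
template <typename Arg, typename = void>
struct is_accepted : std::false_type {};

template <typename Arg>
struct is_accepted<Arg,
    std::void_t<decltype(internal_array_reference(std::declval<Arg>()))>>
    : std::true_type {};

static_assert(is_accepted<std::array<int, 3>&>::value);
static_assert(is_accepted<const std::array<int, 3>&>::value);
static_assert(!is_accepted<std::array<int, 3>>::value);   // temporary
static_assert(!is_accepted<std::array<int, 3>&&>::value); // moved-from

// Hypothetical demo function, for illustration only.
inline int sum_through_reference()
{
    std::array<int, 3> values{1, 2, 3};
    auto& narf = internal_array_reference(values); // note the '&'
    int sum = 0;
    for (int e : narf) { sum += e; }
    return sum;
}
```

With the deleted overload in place, the dangling-reference call with a temporary argument no longer compiles, which is exactly the intent.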
The above solution has its deficiencies but it works. If users are (very) careful, that is. Also, to develop a “pointer to native array solution” we would simply need a separate (almost the same) function as above. Hmm. This all starts to look somewhat clunky to me.
You might like what we have done up till now. Let us quickly proceed to a better solution without further ado.
Version Two
Good old templates to the rescue.
    /*
    ARH == ARray Helper
    (c) 2018-2021 by dbj.org, https://dbj.org/license_dbj/
    */
    template< typename T, size_t N >
    struct ARH final
    {
        // std::array type
        typedef std::array<T, N> ARR;
        // inbuilt ARray type
        typedef T ART[N];
        // reference to ART
        typedef ART& ARF;
        // pointer to ART
        typedef ART* ARP;

        /*
        return pointer to the underlying array
        of an instance of std::array<T,N>
        */
        static ARP
        to_arp(const ARR & arr)
        {
            return (ARP)const_cast<T*>(arr.data());
        }
        // ban temporary references as arguments
        static ARP to_arp(const ARR && ) = delete ;

        /*
        return reference to the underlying array
        of an instance of std::array<T,N>
        */
        static ARF
        to_arf(const ARR & arr)
        {
            return *(ARP)const_cast<T*>(arr.data());
        }
        // ban temporary references as arguments
        static ARF to_arf(const ARR &&) = delete ;
    }; // ARH
Usage is classical standard (but simple) C++. First, we instantiate the template into the type we will use, to deal with the exact std::array type we need to mix with legacy code. Remember: the template is not a type, it is just that: a template, waiting to be made into a type. The template is just a blueprint. A template instantiation, called a “template definition” in this post, is that template with concrete types as template arguments.
Godbolt :
    // handler of native int[3]
    // and of std::array<int,3>
    // A3 is now a type made from template definition
    using A3 = ARH<int, 3>;
    // Above template definition contains
    // all the nested types we need

    // instantiate std::array<int,3>
    A3::ARR arr{1,2,3};

    // get to the
    // pointer to the native array
    // inside
    A3::ARP arp = A3::to_arp(arr);

    // get to the reference
    // to the native array inside
    A3::ARF arf = A3::to_arf(arr);

    // notice how above we just use the nested types
    // prepared for us
    // A3::ARP and A3::ARF
Thus, we have fully encapsulated the solution in one simple template struct. This “helper” struct keeps no data. It holds just the std::array type handled and the types missing from std::array as it stands today in standard C++. And there are only two static methods, returning a reference and a pointer to the internal native array of the std::array argument.
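A short sketch may help to confirm two of those claims: that the helper carries no data, and that the returned reference and pointer really alias the storage of the std::array. The ARH below is a condensed copy of the one above; alias_demo is a made-up name, and the sizeof check is an assumption that holds on mainstream implementations but is not spelled out by the standard.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// condensed copy of the ARH helper from this post
template <typename T, std::size_t N>
struct ARH final {
    typedef std::array<T, N> ARR;
    typedef T ART[N];
    typedef ART& ARF;
    typedef ART* ARP;

    static ARP to_arp(const ARR& arr)
    { return (ARP) const_cast<T*>(arr.data()); }
    static ARP to_arp(const ARR&&) = delete;

    static ARF to_arf(const ARR& arr)
    { return *(ARP) const_cast<T*>(arr.data()); }
    static ARF to_arf(const ARR&&) = delete;
};

using A3 = ARH<int, 3>;

// the helper keeps no data at all
static_assert(std::is_empty_v<A3>);

// ASSUMPTION the casts quietly rely on: std::array<T,N> is laid out
// exactly like T[N]. True for the usual std libs, hence checked here.
static_assert(sizeof(A3::ARR) == sizeof(A3::ART));

// Hypothetical demo function, for illustration only.
inline bool alias_demo()
{
    A3::ARR arr{1, 2, 3};
    A3::ARF arf = A3::to_arf(arr);
    A3::ARP arp = A3::to_arp(arr);

    arf[0] = 42;   // write through the native array reference
    (*arp)[2] = 7; // write through the native array pointer

    // both writes are visible through the std::array itself
    return arr[0] == 42 && arr[1] == 2 && arr[2] == 7
        && sizeof(arf) == 3 * sizeof(int);
}
```

Keeping the layout assumption as a static_assert next to the helper is one cheap way to be told early if some std lib ever breaks it.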
Moving and copying the above template definition has zero overhead. There is no data, there are no instance methods, just class methods. And.
Finally
The solution to the funny legacy quartet catalogued at the top :
    // first we declare our helper
    // there is no class and no object
    // just a type
    using arh_char_3 = ARH<char,3> ;

    // pointer to native array of 3 chars
    // we can use auto too
    typedef char(*charr_3_pointer)[3];

    // same as
    // std::array sar = { 'A', 'B', 'C' };
    arh_char_3::ARR sar = { 'A', 'B', 'C' };

    // dealing with C legacy L1 and L2
    // we make and pass reference to the array
    // inside the sar object
    char * rez_1 = L1(
        sar.size(),
        arh_char_3::to_arf(sar)
    );
    // we make and pass pointer to the array
    // inside the sar object
    charr_3_pointer rez_2
        = L2(arh_char_3::to_arp(sar));

    // dealing with C++ legacy L3 and L4
    auto rez_3
        = L3(arh_char_3::to_arp(sar));
    auto& rez_4
        = L4(arh_char_3::to_arf(sar));
Enjoyed it so far? Well, I have but not fully; not yet.
Caveat Emptor
Caveat emptor is Latin for “Let the buyer beware”. So far, the methods in the utility presented above receive const references to the instance and then manipulate the data() result, with casting stunts, to return what we want. Yes, we have explicitly banned using temporaries as arguments. But that is not enough.
Still, if and when the original array goes out of calling scope the result of these two functions becomes invalid. Either a dangling reference or a dangling pointer, that is.
So please treat their results the same as you are treating any other references or pointers. What does this mean?
This means, one should remember, that standard C++ is best when dealing primarily with values, not references or pointers. That simply means:
Do not carry around references or pointers, but values.
Do whatever you have to do inside the same scope where you have used this mechanism.
This is a classical performance vs resilience balance. The balance one has to perpetually and skillfully watch while using powerful programming languages like standard C++.
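One possible shape of “carry values, not references” is sketched below. It is only an illustration under assumptions: legacy_to_upper stands in for some legacy API taking a native array reference and is not from this post, and upper_copy is a made-up wrapper. The native array reference is created, used, and dropped inside one scope, and only a std::array value travels in and out.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <cstddef>

// ASSUMPTION: a stand-in for some legacy function that insists
// on a native array reference. Not part of the post.
template <std::size_t N>
void legacy_to_upper(char (&arr)[N])
{
    for (char& c : arr) {
        c = static_cast<char>(
            std::toupper(static_cast<unsigned char>(c)));
    }
}

// Hypothetical wrapper: takes a value, returns a value.
template <std::size_t N>
std::array<char, N> upper_copy(std::array<char, N> value)
{
    // the reference to the internal array lives only in this scope
    char (&inner)[N] = *(char(*)[N]) value.data();
    legacy_to_upper(inner);
    // what leaves the scope is the value, never the reference
    return value;
}
```

A call such as upper_copy(std::array<char,3>{'a','b','c'}) is safe even with a temporary argument, because the function owns its own copy for as long as the reference is alive. The price is the copy, which is the performance side of the balance mentioned above.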