C++ and modern C are two different languages. C++ branched from K&R C. Today it is not based on modern C99, C11, and beyond.
There are C++ users who truly believe native arrays are to be avoided in modern C++. I disagree. On the other side of that coin, perhaps, somewhat like me, you also believe that no feature of the modern C++ std lib has to be used just because it exists, without knowing why.
Thus the scene is set. Two opposing groups of C++ practitioners. But they both agree: std::array is a “good thing”. They disagree on when, how, and why to use it.
What do I think? When you think you need it, you better be sure it is the right tool to solve your problem, and be sure you use it properly too. Enter the native array. Why does it exist, C++ zealots might be bold to ask. Let’s try and reason with them. For starters:
The native array is a foundation cornerstone
Of course, the native array is one of the cornerstones in the foundations of modern C and modern C++. Here is the diagram you have seen before, perhaps many times. But please stop and understand: this is first and foremost the diagram representing the key C++ concepts. This is not a K&R C diagram.
Contiguous memory and iterators (begin & end), operating on it.
Iterators are conceptually a pointer to the first element of the contiguous memory (aka array) and a pointer to the “one past” (aka “+1”) address after the last element. Repeat: the end is last + 1.
And next is yet another important detail that many of you, dear fellow C++ developers, surely learned a long time ago, but which has perhaps not been much in your focus recently.
The Key abstraction
T (&)[N]
That is a declaration of the reference to the array of size N, which I like to call: NAtive aRray reFerence or abbreviated NARF. In standard C, this does not exist. The closest to it we have in C is
T (*) [N]
Pointer to an array. A source of much confusion, because C++ array decay does not transform the array into that. Array decay simply yields a pointer to the first array element.
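To make the three types easier to tell apart, here is a minimal sketch. The names numbers, narf, ptr_to_array and decayed are made up for this illustration; the static_assert lines only restate what the compiler already knows.

```cpp
#include <type_traits>

int numbers[3] = {1, 2, 3};

// Reference to the whole array: the size 3 is part of the type.
int (&narf)[3] = numbers;

// Pointer to the whole array: the closest thing C offers. Note the &.
int (*ptr_to_array)[3] = &numbers;

// Array decay: a pointer to the first element, the size is gone.
int *decayed = numbers;

static_assert(std::is_same<decltype(narf), int (&)[3]>::value, "reference to array");
static_assert(std::is_same<decltype(ptr_to_array), int (*)[3]>::value, "pointer to array");
static_assert(std::is_same<decltype(decayed), int *>::value, "pointer to element");

// sizeof through the reference still sees the whole array.
static_assert(sizeof(narf) == 3 * sizeof(int), "size survives in the reference");
```

All three refer to the same memory; only the reference and the pointer to the array still carry N in their type.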
Why is NARF a “key abstraction”?
Most importantly, NARF is how C++ passes arrays into functions and out of functions with their size intact. Basically, in such code NARF is always used and a pointer to the array is never used.
One good reason NARF is important: the native array reference is the type the compiler works with when a native array is used in the modern C++ range-based for loop.

#include <iostream>

int main()
{
    // native array of pointers to char
    const char * nativarr[] {"Look", "native", "array", "iteration"};
    // the range-for binds nativarr to a NARF:
    // const char * (&)[4]
    for ( auto word : nativarr ) {
        std::cout << word << " ";
    }
}

This works because the C++ compiler sees T(&)[N] (here const char * (&)[4]) and thus, in turn, knows the number N of elements in nativarr above. In turn, the compiler can also deduce the begin and end iterators of the NARF.
// for the above, the compiler 'knows'
auto begin = nativarr;
// one after the end
auto end = nativarr + 4;
Hint: Remember the diagram on the top?
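The same begin and end pair can be asked for explicitly with std::begin and std::end. The following is only a sketch; the function name sum is made up for the illustration.

```cpp
#include <cstddef>
#include <iterator>

// Sums a native array using the begin / end pair derived from N.
template <typename T, std::size_t N>
T sum(const T (&arr)[N]) {
    const T *first = std::begin(arr);  // same address as arr + 0
    const T *last = std::end(arr);     // arr + N, one past the last element
    T total{};
    for (const T *it = first; it != last; ++it) {
        total += *it;
    }
    return total;
}
```

Calling sum on an int array of four elements walks exactly four elements; no size argument is passed by hand.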
Using C, native arrays can be declared as function arguments
// C99
void take_array ( int size, char string[ size ] ) ;
That is legal C99. That is not C++. Note that even here the array is not copied: the argument is adjusted to a plain char *. Also, we can not return arrays from functions.
C++ NARF can be passed as the return type
In C++ there is a somewhat different way to get native arrays out of functions. One can declare arguments as array references, return array references from functions, etc.

#include <cstring>

using narf_char_256 = char(&)[256];

// take native array reference, aka NARF,
// and return the same type too
narf_char_256 all_zeros(narf_char_256 narf)
{
    std::memset(narf, 0, sizeof narf);
    return narf;
    // be careful not to return a reference to a local!
}

A useful NARF (native array reference) needs its size. A reference to an array of unknown size carries no N, and older C++ standards did not even allow it as a function argument. Hence the explicit ‘256’ above.

// not useful as a NARF:
// no size is given, so N is lost
using sizeless_NARF = char(&)[];

For a native array reference to be of any use, the size must be known at compile time. NARF and template are a marriage made in heaven.

// common C++ idiom for array passing
// is using NARF
template< typename T, size_t N>
constexpr size_t array_count ( const T(&) [N] )
{
    return N;
}

That is the standard way. It works for any type T and any size N. It is generic and an obvious candidate for compile-time evaluation.
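A short sketch of that compile-time use follows. The arrays primes and squares are made up for the illustration; array_count is repeated only so that the snippet compiles on its own.

```cpp
#include <cstddef>

template <typename T, std::size_t N>
constexpr std::size_t array_count(const T (&)[N]) { return N; }

constexpr int primes[] = {2, 3, 5, 7, 11};

// N is deduced by the compiler, no run-time work is involved.
static_assert(array_count(primes) == 5, "N is deduced at compile time");
static_assert(array_count("abc") == 4, "a string literal includes the terminating zero");

// The result is a constant expression, so it can size another array.
int squares[array_count(primes)] = {};
```

Since C++17 the standard library offers std::size, which does the same job for native arrays.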
Does it seem you are by now beginning to like native arrays in C++?
We have to talk about std::array
I know, some of you are itching for us to jump onto this one by now. You want to shout out: Why not just use std::array?!
The answer is simple: The integration. The legacy.
std::array is a full-blown C++ std container. Very basic, but still a container, holding one native array inside and carried by value. NARF is not a container. NARF is a tiny abstraction representing a reference to the native array. NARF could be visualized as an atomic building block of larger abstractions.
And yes, there is no std::array in the land of Modern C.
Modern C
C++ is/was based on K&R C and ANSI C. The modern C is C99 and beyond. C++ is not based on modern C. The outcome is: this common heritage requires a lot of native arrays on both sides of the proverbial fence, between C run-time (CRT) and C++ above it.
For example. Just take a look at the MSVC STL source peppered with C files. In there, you will find a lot of very important implementations operating on native arrays, aka contiguous memory, coded in C. Very often using two native pointers representing array begin and end concepts, straight from the C++ space one level above. (Hint: lookup that diagram on top now; again please.)
To dance in and out of these C functions would be a very complex dance without native arrays being possible in the modern C++ code. C/C++ string literals are perhaps the prime example of that “dance”.
String comparisons are (mega) important functions written in C. Their declarations are very similar to these two examples:

// this is C code, by one "DBJ"
int __cdecl dbj_ordinal_compareA(
    const char *_string1, const char *_end1,
    const char *_string2, const char *_end2);

int __cdecl dbj_ordinal_compareW(
    const wchar_t *_string1, const wchar_t *_end1,
    const wchar_t *_string2, const wchar_t *_end2);

Now imagine using these from modern C++ but without native arrays, and also without references to native arrays, at hand. I am sure you understand that might work but will be a far inferior implementation to the fortunate reality of using native arrays in modern C++.
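Here is a rough sketch of that “dance”. The function c_count_byte is a hypothetical stand-in for a C run-time style routine (it is not part of the UCRT or of dbj code), and count_byte is an equally made-up C++ front for it.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical C-style routine: it only understands a pair of
// pointers, begin and one-past-the-end.
extern "C" int c_count_byte(const char *first, const char *last, char wanted) {
    int hits = 0;
    for (; first != last; ++first) {
        if (*first == wanted) ++hits;
    }
    return hits;
}

// C++ side: the NARF argument keeps N, so the pointer pair is
// produced by the compiler, not by hand-written arithmetic.
template <std::size_t N>
int count_byte(const char (&text)[N], char wanted) {
    return c_count_byte(std::begin(text), std::end(text), wanted);
}
```

The caller passes just the array, for example count_byte("banana", 'a'), and the begin and end pointers handed to the C side can not get out of sync with the real array size.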
Functions of the same kind as the two C functions above make up the UCRT, aka the (Microsoft) Universal C Run Time.
Using modern C++ to encapsulate what we know
dbj::narf
Godbolt
You might be a bit tired by now and thinking: Grand words, show me some code. Fine. Without any more delay, here is my key abstraction for dealing with native arrays in modern C++. It is a reference to the native array encapsulated using std::reference_wrapper. More precisely, it is a modern C++ alias template.

/*
Copyright 2017-2018 by dbj@dbj.org
Licensed under the Apache License, Version 2.0
*/
#include <cstddef>
#include <functional>

namespace dbj::narf {
    template <typename T, std::size_t N>
    using wrapper = std::reference_wrapper<T[N]>;
}

NARF. Yes, this funny acronym will perhaps make it easier to remember this key abstraction: dbj::narf::wrapper. So what is dbj narf?
dbj NARF contains (encapsulates) a reference to a native array, T(&)[N], aka “NARF”. A naked NARF in itself cannot be safely copied and moved around. One, in essence, uses dbj NARF to “carry around” references to native arrays. The array stays where it was made.
That makes for one key requirement: handle dbj NARF with care. Or you will experience a dangling reference to the native array.
And perhaps the key reason for its existence: unlike std::array<T,N>, dbj NARF readily delivers an easy to use reference to the native array, aka T(&)[N], that it encapsulates.
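The “carry around” idea can be checked with a few lines. This is a sketch only: the alias is repeated outside of the dbj::narf namespace so that the snippet stands alone, and storage, first and second are made-up names.

```cpp
#include <cstddef>
#include <functional>
#include <type_traits>

template <typename T, std::size_t N>
using wrapper = std::reference_wrapper<T[N]>;

int storage[3] = {1, 2, 3};

// The wrapper is an ordinary copyable value ...
wrapper<int, 3> first{storage};
wrapper<int, 3> second = first;  // copies the handle, not the array

// ... while get() hands back the full native array reference.
static_assert(std::is_same<decltype(first.get()), int (&)[3]>::value,
              "get() yields T(&)[N]");
static_assert(std::is_copy_assignable<wrapper<int, 3>>::value,
              "the wrapper can be reassigned");
static_assert(!std::is_copy_assignable<int[3]>::value,
              "a native array can not");
```

Both first and second refer to the one and only storage array; nothing has been copied except the handle.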
By using the dbj::narf functionality one can safely and practically deal with native arrays in a modern way in modern C++.
Why would anybody use dbj NARF?
Abstractions based on dbj NARF
If people adopt this “tiny thing”, I am sure abstractions based on it will come. Take, for example, the “array of references”, which is illegal in modern C++ but easy to code by using dbj NARF.

typedef char four_chars[4] ;
four_chars abc{"ABC"}, def{"DEF"}, ghj{"GHJ"};
// array of references? why not ..
dbj::narf::wrapper<char,4> narf_arr[]{ abc, def, ghj } ;

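To show that such an array really refers to the originals, here is a small sketch built directly on std::reference_wrapper. The alias narf4 and the function mark_all are made up for the illustration.

```cpp
#include <cstddef>
#include <functional>

using four_chars = char[4];
using narf4 = std::reference_wrapper<four_chars>;

// Overwrites the first char of every referenced array
// and reports how many arrays were touched.
template <std::size_t Count>
std::size_t mark_all(narf4 (&refs)[Count], char mark) {
    for (narf4 &ref : refs) {
        ref.get()[0] = mark;  // ref.get() is char(&)[4]
    }
    return Count;
}
```

After mark_all runs over an array of narf4, the native arrays it refers to are the ones that changed; the array of wrappers holds no characters of its own.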
I know, one can do almost all of this with std::array, besides one key problem: easily getting to the reference of the native array it contains. But there is also one more broad subject that delivers focused and convincing proof that we need native arrays. As already mentioned, there is one very important land where std::array does not exist: standard and modern C, already 20+ years old.
dbj::narf
I have made a very small core of function helpers to make for comfortable usage of dbj::narf::wrapper. This is functional programming. dbj::narf does not contain a single class.
Of course, feel free to browse and wander around. The code is never enough on its own, but still, this one is pretty well documented, I hope. Next, I will focus on its perceived usage. First, a few dbj::narf examples.
Make dbj NARF from std::array instance:
auto arf = dbj::narf::make(std::array<int, 10>{});
Important side note: the line above makes a reference to a std::array that has already disappeared. That array is a temporary, declared and defined in line, on the stack, and it is gone as soon as that call returns. The correct code is this:

// first create the instance
std::array<int, 10> i10 {};
// second use the instance
auto arf = dbj::narf::make(i10);
// make sure that instance stays in the memory while arf is used

Make it from the init list:

// make ref to array of 10 int's
// wrong, original array disappears
dbj::narf::make({ 0,1,2,3,4,5,6,7,8,9 });
// correct
int i10[] = { 0,1,2,3,4,5,6,7,8,9 };
dbj::narf::make( i10 );
// make sure i10 stays

Make dbj NARF from the native array of narrow string literals

// wrong
dbj::narf::make({ "native","array","of", "asci","string","literals" });
// correct
// (string literals are const in C++, hence const char *)
const char * charr[] = { "native","array","of", "asci","string","literals" };
dbj::narf::make(charr);

Or just make dbj NARF contain a native char array:

// auto & keeps the array type;
// plain auto would decay the literal to const char *
const auto & nacharr = "native char array" ;
auto buffer = dbj::narf::make( nacharr );

Iteration
Observe in the testing code how default_print is declared and implemented. We use dbj::narf::apply, which in turn calls dbj::narf::for_each, which uses std::for_each on the native array held inside the dbj::narf instance.

template< typename T, size_t N, typename FUN >
constexpr auto for_each(
    const wrapper<T, N> & arf,
    const FUN & fun_
) {
    const auto & B = begin(arf);
    const auto & E = end(arf);
    return std::for_each(B, E, fun_);
}

where begin() and end() return the standard begin and end iterators to the native array, as in the diagram above. And we can do this quite easily by using the std::reference_wrapper get() method, which returns the (by now famous) native array reference. Aka NARF!
Phew! Sounds rather complicated. Follow it once through your favorite debugger and you will see it is not. Here is the code:

template <typename T, std::size_t N>
using narf_wrapper = std::reference_wrapper<T[N]>;

template<typename T, size_t N>
constexpr auto begin(const narf_wrapper <T, N> & wrp_) {
    return std::begin(wrp_.get());
}

template<typename T, size_t N>
constexpr auto end(const narf_wrapper <T, N> & wrp_) {
    return std::end(wrp_.get());
}

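For those who would rather compile than debug, the pieces can be put together in one small unit. This is a simplified sketch, not the dbj::narf source: the namespace sketch is made up, constexpr is left out, and the callable is taken by value.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>

namespace sketch {

template <typename T, std::size_t N>
using wrapper = std::reference_wrapper<T[N]>;

// begin / end simply forward to the native array behind get().
template <typename T, std::size_t N>
auto begin(const wrapper<T, N> &w) { return std::begin(w.get()); }

template <typename T, std::size_t N>
auto end(const wrapper<T, N> &w) { return std::end(w.get()); }

// Applies fun to every element of the referenced native array.
template <typename T, std::size_t N, typename Fun>
auto for_each(const wrapper<T, N> &w, Fun fun) {
    return std::for_each(sketch::begin(w), sketch::end(w), fun);
}

}  // namespace sketch
```

A wrapper made over int values[3] can then be handed to sketch::for_each together with a lambda taking int &, and the lambda sees, and may change, the original elements.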
The Holy Grail
So, to get to the “holy grail”, the reference to the native array that std::array keeps, we just do this, in a quite straightforward manner:

// make ref to std array of 10 int's
// (a named instance, so the reference does not dangle)
std::array<int, 10> i10 {};
auto arf = dbj::narf::make(i10);
// get to native array reference
const auto & native_arr_ref = arf.get();

And to get to a pointer into the native array that std::array contains, we simply do this:

// default auto by-val result decays to a pointer to the
// first element of the native array contained
auto native_arr_ptr = arf.get();

So if we wish to use some function that declares its argument as a native array reference we will do it as simply as this:

template<typename T, size_t N>
void native_arr_consumer( const T(&arrf)[N] ) ;

// create narf from an array that stays alive
int i3[] = {1,2,3};
auto arf = dbj::narf::make(i3);
native_arr_consumer(arf.get());

And of course, we can always give the narf instance to the range for loop like so:

// the for loop usage
// notice the auto & declaration
// so that internal array
// elements are updated
for (auto & element : arf.get()) {
    element = random(255);
}

Please notice we do not need to know the size of the native array contained, still, we enjoy the advantages of being able to use a reference to it.
counter-homage to std::data()
dbj::narf::data() returns the reference to the n-array it holds. Please note the difference to the std::data() version for native arrays, which returns a pointer to the first element of the n-array.

// std version
template <class T, std::size_t N>
constexpr T* data(T(&array)[N]) noexcept {
    return array;
}

To use dbj::narf::data(), users will tend to use an auto declaration for the result, but one has to be a bit careful, as one gets different types depending on how the result is declared:

auto narf = dbj::narf::make("narf to Array of chars");
// plain auto decays: this is a pointer
auto native_array_pointer = dbj::narf::data(narf);
// decltype(auto) keeps the reference
decltype(auto) not_elegant_ref = dbj::narf::data(narf);
// the standard way
auto & the_arr_ref = dbj::narf::data(narf);

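The three outcomes can be verified without dbj::narf at hand. In the sketch below data_ref is a hypothetical stand-in for any function that returns a native array by reference; buffer and the variable names are made up too.

```cpp
#include <type_traits>

char buffer[8] = "narf";

// Hypothetical stand-in: returns the array by reference, char(&)[8].
char (&data_ref())[8] { return buffer; }

auto as_pointer = data_ref();          // decays to char *
decltype(auto) as_ref = data_ref();    // stays char(&)[8]
auto &as_ref_too = data_ref();         // stays char(&)[8]

static_assert(std::is_same<decltype(as_pointer), char *>::value,
              "plain auto loses the array type");
static_assert(std::is_same<decltype(as_ref), char (&)[8]>::value,
              "decltype(auto) keeps it");
static_assert(std::is_same<decltype(as_ref_too), char (&)[8]>::value,
              "auto & keeps it");
static_assert(sizeof(as_ref) == 8, "the whole array is still visible");
```

Only the first declaration loses N; the other two can still be given to anything that expects T(&)[N].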
The Danger Zone
is always near when using dbj NARF (or just NARFs). Please do not forget one can easily “drop the ball” and accidentally use dangling references in this case.
Keep in mind dbj::narf::wrapper is just an alias to std::reference_wrapper<T[N]>. And in there the only data is a pointer. Just one naked pointer.
Summary
In the coming days and weeks, I shall use dbj::narf extensively, and report back with real-life usage examples and findings.
And here we are, the usage promoted. Please proceed here.