C++ Small string optimizations

One size does not fit all

What is “small string optimization”?

A standard C++ string stores its data on the heap, but only once the string grows beyond an implementation-dependent size. Below that size, the characters live in a small buffer inside the string object itself. For std::string that threshold is, or was, typically 15 characters for MSVC and GCC (libstdc++) and 22 for Clang (libc++) in 64-bit builds. That is: a C++ string stays “small” as long as you have not asked it to hold more than 15 or 22 characters. A string that can stay small will not attempt to move its storage to the heap.
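To see this on your own compiler, here is a minimal sketch. The helper names stored_inline and small_buffer_capacity are mine, not part of any library, and the address comparison relies on how mainstream implementations lay out std::string rather than on anything the C++ standard promises.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: true when the character data lives inside the
// std::string object itself (the small buffer) and not in a heap block.
// Assumption: comparing raw addresses is meaningful on this platform.
inline bool stored_inline(const std::string& s)
{
    const auto object_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto object_end   = object_begin + sizeof(std::string);
    const auto data_address = reinterpret_cast<std::uintptr_t>(s.data());
    return data_address >= object_begin && data_address < object_end;
}

// Hypothetical helper: capacity of a default-constructed string, which is
// the small-buffer size of the standard library in use.
inline std::size_t small_buffer_capacity()
{
    return std::string().capacity();
}
```

Calling small_buffer_capacity() on your toolchain tells you the actual threshold, and stored_inline() lets you check whether a given string has already moved to the heap.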

Heap memory allocations and de-allocations take a lot of time compared to most standard C runtime calls.

Thus, if you avoid them, your program will usually run faster and may also consume less memory.

In the case of strings (plural: there are several predefined string types in C++), you do this by always creating strings with a certain “smallish” predefined size, so that the majority of string usage in your program does not keep going back to the heap for more storage, yet still operates on usable strings.

So, in essence, you want to give a string a usable size/capacity before it is used. And yes, 15 is a very small size. So, basically, each time you need a string you have to explicitly reserve a larger one and then use it. That is tedious, error-prone and easy to forget or skip.
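The effect of reserving up front can be made visible with a small sketch. The function count_capacity_changes is a hypothetical helper written for this illustration. It treats every change of capacity() as a stand-in for a heap reallocation, which is an approximation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends `appends` characters one at a time and counts
// how often capacity() changes after the optional up-front reserve.
// Each change is taken as a proxy for one heap reallocation.
inline std::size_t count_capacity_changes(std::size_t reserved,
                                          std::size_t appends)
{
    std::string text;
    if (reserved > 0) {
        text.reserve(reserved);  // a single allocation, done before use
    }
    std::size_t changes = 0;
    std::size_t last_capacity = text.capacity();
    for (std::size_t i = 0; i < appends; ++i) {
        text.push_back('x');
        if (text.capacity() != last_capacity) {
            ++changes;
            last_capacity = text.capacity();
        }
    }
    return changes;
}
```

With 255 characters reserved, appending 200 characters never changes the capacity. Without the reserve, the string has to grow at least once on the way from its small buffer to 200 characters.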

For you I have prepared a string utility function that encapsulates making a string of a predefined size. This is how one would use it:

/* (c) 2018 by dbj.org */
auto optimized_small_string
   = dbj::str::optimal<char>();
/*
 the size of the above string is 255
 and its capacity is at least 255;
 it is also initialized with
 255 null characters ('\0')
*/

If you or your team always use this to create strings, the resulting programs are likely to be faster and may take less memory. The predefined size in there is 255.
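One thing to keep in mind: a string created this way already has size 255, so appending with += places the new text after the 255 null characters. Below is a minimal sketch of using such a pre-sized string as a write buffer instead. The function format_id is a hypothetical helper, and a plain std::string(255, '\0') stands in for the result of dbj::str::optimal<char>().

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats "id-<number>" into a pre-sized buffer and
// then shrinks the string to the number of characters actually written.
inline std::string format_id(int id)
{
    // Stand-in for dbj::str::optimal<char>(): size 255, all '\0'.
    std::string buffer(255, '\0');

    // snprintf writes at most buffer.size() - 1 characters plus a '\0'
    // and returns the length it wanted to write (negative on error).
    const int written =
        std::snprintf(&buffer[0], buffer.size(), "id-%d", id);
    if (written < 0) {
        buffer.clear();
        return buffer;
    }

    const std::size_t wanted = static_cast<std::size_t>(written);
    const std::size_t limit  = buffer.size() - 1;
    buffer.resize(wanted < limit ? wanted : limit);  // drop unused '\0's
    return buffer;
}
```

The resize at the end only reduces the size; the capacity stays, so the storage can be reused for later writes.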

255 is an arbitrary size. For any program you should try different sizes and measure the results. Of course, any fundamental char type can also be used. A few examples:
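For the "measure the results" part, a rough timing sketch might look like the following. The function time_string_building is a hypothetical helper; a serious measurement would use a proper benchmarking tool and an optimized build, so treat the numbers as a first hint only.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a `length`-character string `rounds` times,
// reserving `reserved` characters up front (0 means no reserve), and
// returns the elapsed time.
inline std::chrono::nanoseconds time_string_building(std::size_t reserved,
                                                     std::size_t length,
                                                     std::size_t rounds)
{
    using clock = std::chrono::steady_clock;
    std::size_t checksum = 0;  // makes it harder to optimize the loop away

    const auto start = clock::now();
    for (std::size_t r = 0; r < rounds; ++r) {
        std::string text;
        if (reserved > 0) {
            text.reserve(reserved);
        }
        for (std::size_t i = 0; i < length; ++i) {
            text.push_back(static_cast<char>('a' + i % 26));
        }
        checksum += text.size();
    }
    const auto stop = clock::now();

    // Sanity check that every round produced a string of the full length.
    if (checksum != rounds * length) {
        return std::chrono::nanoseconds::max();
    }
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

Comparing, say, time_string_building(0, 200, 100000) against time_string_building(255, 200, 100000) on your own machine gives a first idea of whether a given size pays off for your string lengths.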

auto os2 = dbj::str::optimal<wchar_t>( 1024 );

auto os3 = dbj::str::optimal<char16_t>( 512 , u'=');

auto os4 = dbj::str::optimal<char32_t>( 128 , U'+' );

Please try this little utility. You can sometimes achieve dramatic gains in speed and memory consumption.

And this is the code for your little library of utilities.

#include <string>

// nested namespace syntax requires C++17
namespace dbj::str {

/*
Make a string optimized for small sizes:
the result has SMALL_SIZE characters,
each set to init_char_ (null by default)
*/
template < 
typename CT,
typename string_type = std::basic_string< CT > ,
typename char_type = typename string_type::value_type,
typename size_type = typename string_type::size_type
>
constexpr inline string_type optimal
(
size_type SMALL_SIZE = 255,
char_type init_char_ 
  = static_cast<char_type>(0)
)
{
 // fill constructor: string_type(count, character)
 return string_type(    
    SMALL_SIZE, 
    init_char_
  );
}

} // dbj::str