C++ Small string optimizations

One size does not fit all

What is “small string optimization”?

A standard C++ string stores its data on the heap, but only once it grows past an implementation-dependent size. For std::string that limit is (or was) 15 characters for MSVC and GCC, and 22 for Clang's libc++, whose 23-byte inline buffer includes the terminating null. That is: the C++ string stays “small” as long as you have not asked for more than 15 or 22 characters. It will not allocate storage on the heap while it can stay small.
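A quick way to see the limit on your own toolchain is to ask an empty string for its capacity. This is a minimal sketch: the function name is made up, and it assumes that a default-constructed string reports its inline buffer size, which holds on the mainstream implementations but is not promised by the standard.

```cpp
#include <cstddef>
#include <string>

// Capacity of a default-constructed string. On mainstream implementations
// this equals the size of the inline (small string) buffer, but the
// standard does not guarantee it.
template <typename CharT>
std::size_t inline_capacity() {
    return std::basic_string<CharT>{}.capacity();
}
```

Calling inline_capacity<char>() typically reports 15 with MSVC and GCC and 22 with libc++ on 64-bit builds. Wider character types fit fewer characters into the same buffer.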

Heap memory allocations and deallocations take a lot of time compared to most standard C runtime calls.

Thus, if you avoid them, your program will usually run faster and will often consume less memory.
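To check whether a given string really stays away from the heap, one option is to count allocations with a custom allocator. The sketch below is an illustration only: CountingAllocator, CountedString and allocations_for are hypothetical names, and the counter is a plain static, so it is not thread-safe.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Number of allocate() calls made through CountingAllocator so far.
static std::size_t allocation_count = 0;

// Minimal allocator that forwards to std::allocator and counts requests.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocation_count;
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
    }
};

// All CountingAllocator instances are interchangeable.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Builds a string of `length` characters and reports how many heap
// allocations that construction required.
std::size_t allocations_for(std::size_t length) {
    const std::size_t before = allocation_count;
    CountedString s(length, 'x');
    return allocation_count - before;
}
```

With a release build on the mainstream libraries, a 5-character string should report zero allocations and a 1000-character string at least one. Some debug configurations add bookkeeping allocations of their own, so measure with the same settings you ship.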

In the case of strings (plural, because there are several predefined string types in C++), you do this by always making strings of a certain “smallish” predefined size, so that the majority of your program's string usage does not have to go back to the heap for more, yet still operates on usable strings.

So, in essence, you always want to create a string of a certain usable size or capacity before it is used. And yes, 15 is very small. So, basically, each time you need to explicitly reserve a larger string and then use it. That is tedious, error-prone and easy to forget or skip.
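Done by hand, the pattern looks roughly like this. It is a minimal sketch: the function greet is invented for illustration, and 255 is simply the size this article settles on.

```cpp
#include <string>

// Builds "Hello, <name>!" after reserving room up front, so the appends
// below should not need to reallocate as long as the result fits.
std::string greet(const std::string& name) {
    std::string out;
    out.reserve(255);   // the step that is easy to forget
    out += "Hello, ";
    out += name;
    out += '!';
    return out;
}
```

Every function that builds a string needs its own reserve call, which is exactly the repetition a shared helper is meant to remove.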

For you, I have prepared a string utility function that encapsulates making a string of a predefined size. And this is how one would use it.

If you or your team always use this to create strings, the resulting programs are very likely to be faster and to take less memory. The predefined size in there is 255. It probably took many meetings to arrive at it, but 255 is still an arbitrary size.

For any program, you should try different sizes and measure the results. Of course, any fundamental character type can be used as well. A few examples:
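As a rough illustration that assumes nothing about the utility itself, here is the same idea written directly against the standard string aliases. The function names and the buffer_size constant are made up for this sketch.

```cpp
#include <cstddef>
#include <string>

// 255 mirrors the size used in this article; it is not a magic number.
constexpr std::size_t buffer_size = 255;

// Each function returns a string holding buffer_size null characters of
// the given character type.
inline std::wstring wide_buffer() {
    return std::wstring(buffer_size, L'\0');
}
inline std::u16string utf16_buffer() {
    return std::u16string(buffer_size, u'\0');
}
inline std::u32string utf32_buffer() {
    return std::u32string(buffer_size, U'\0');
}
```

Keep in mind that the memory cost scales with the character width: 255 char32_t characters take four times the bytes of 255 plain chars.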

Please try this little utility. You can sometimes achieve dramatic gains in speed and memory consumption.

And this is the code for your little library of utilities.
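The listing below is only a sketch of what such a helper could look like, not necessarily the original code. The name make_string, its template parameters and the default of 255 are assumptions. It relies on the standard std::basic_string constructor that takes a count and a fill character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a string of N value-initialised (null)
// characters. The name, the template parameters and the default of 255
// are assumptions made for this sketch.
template <typename CharT = char, std::size_t N = 255>
std::basic_string<CharT> make_string() {
    return std::basic_string<CharT>(N, CharT{});
}
```

With this in place, make_string() gives a std::string of 255 characters and make_string<wchar_t, 1024>() a std::wstring of 1024. Note that size() equals N here, so you write into the string through operator[] or data() instead of appending to it.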

How does this work? The code uses the std::basic_string constructor that allocates memory of the required size up front. It does not rely on the built-in small string optimization at all.

Caveat Emptor

Of course, this does not actually tame the small string optimization. The whole std::basic_string<> machinery is still in there, ready to start managing its internal dynamic storage again as soon as you step past the predefined size. And that is where things are back to slow.
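The effect is easy to observe by watching the capacity on either side of an append that does not fit. This is a minimal sketch with invented names (GrowthReport, append_past_capacity), and the exact capacities you see depend on the standard library.

```cpp
#include <cstddef>
#include <string>

struct GrowthReport {
    std::size_t capacity_before;
    std::size_t capacity_after;
};

// Starts from a string pre-sized to `initial` characters, appends `extra`
// more, and reports the capacity before and after the append.
GrowthReport append_past_capacity(std::size_t initial, std::size_t extra) {
    std::string s(initial, '\0');
    GrowthReport report{s.capacity(), 0};
    s.append(extra, 'x');   // may force a new, larger heap block
    report.capacity_after = s.capacity();
    return report;
}
```

Calling append_past_capacity(255, 1000) should report a larger capacity afterwards, which means the string had to allocate again and copy its contents over.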