C++ : codecvt deprecated. Panic?

P0618R0: “…The entire header <codecvt> (which does not contain the class codecvt!) is deprecated, as are the utilities wstring_convert and wbuffer_convert. These features are hard to use correctly, and there are doubts about whether they are even specified correctly. Users should use dedicated text-processing libraries instead…”

(Update: for the comprehensive “all in one” solution, please head here.)

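Therefore, as of C++17 the <codecvt> header is officially deprecated, and so are wstring_convert and wbuffer_convert. Deprecated is not yet removed, but it is the official notice to stop relying on them. And then there is this highly suspicious “text-processing libraries” advice. Panic? Please don’t.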

My advice is to “stay cool, calm and collected, and all things will fall into place”. Read on.

In case you want to transform from, let’s say, a wide string type to the std::string type, here is the standard C++ (17 and beyond) solution:
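A minimal sketch of such a function, assuming the name transform_to and the T (target) and F (source) template parameters that the text further down refers to. The body is an illustration of the idea, not necessarily the original listing:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Sketch only. T is the target std string type, F is the source
// std string or std string view type.
// Every character is converted by plain value conversion;
// no encoding is translated along the way.
template <typename T, typename F>
inline T transform_to(F str) {
    return T(str.begin(), str.end());
}
```

The target type is named explicitly and the source type is deduced, for example transform_to<std::string>(some_wide_string).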

Just one function. Almost simple.

NOTE 1: This is the standard C++ way. For WIN32 aficionados this is apparently not quite the way. They would need to use something like the C++/WinRT to_string.

NOTE 2: In case you see nothing wrong with this approach: it is standard and, at the same time, somewhat controversial. This code does nothing but convert each character from one std char type to another. That works for the 7-bit ASCII range (code points 0 to 127), that is, for English-speaking users and developers. But not for the others. For a good introductory text please see here.

NOTE 3: In case you have thoughts like “How do I transform UTF-8 to UTF-16?”, or in case you are looking for a true text internationalization and localization solution for your project, please start from here.
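To make NOTE 2 concrete, here is a small sketch of what “works for ASCII only” means. It assumes a transform_to shaped like the one this post describes (repeated so the snippet compiles on its own); survives_round_trip is a hypothetical helper introduced only for this illustration:

```cpp
#include <cassert>
#include <string>

// Assumed shape of the function discussed in this post.
template <typename T, typename F>
inline T transform_to(F str) {
    return T(str.begin(), str.end());
}

// Hypothetical helper: true when narrowing to std::string and
// widening back to std::wstring reproduces the input exactly.
inline bool survives_round_trip(const std::wstring& wide) {
    const std::string narrow = transform_to<std::string>(wide);
    return transform_to<std::wstring>(narrow) == wide;
}
```

A pure ASCII string such as L"plain ASCII" survives the round trip. A string containing, say, U+017D (written as L"\u017D") does not, because its value does not fit into a single char and is mangled on the way down.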

Usage example:
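A sketch of what such usage might look like, assuming the transform_to shape described above (repeated so the snippet compiles on its own). The function name usage_example is made up for this illustration:

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Assumed shape of the function discussed in this post.
template <typename T, typename F>
inline T transform_to(F str) {
    return T(str.begin(), str.end());
}

inline void usage_example() {
    using namespace std::string_literals;       // enables the s suffix
    using namespace std::string_view_literals;  // enables the sv suffix

    // The argument is a std::wstring literal.
    std::string narrow = transform_to<std::string>(L"Hello"s);
    // The argument is a std::string_view literal.
    std::wstring wide = transform_to<std::wstring>("Hello"sv);

    assert(narrow == "Hello");
    assert(wide == L"Hello");
}
```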

Yes, I am using C++ string literals.

The transform_to() argument can be anything that is a std string or a std string view.

The F type has to be a std string or std string_view type. The T type has to be a std string type. Ah, yes, all four std string and view types will work. A little reminder follows:
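As a reminder sketch, these are the four standard string types and their four matching view types, each checked against the character type it holds. The checks are plain compile-time assertions, so this snippet produces no run-time code:

```cpp
#include <string>
#include <string_view>
#include <type_traits>

// The four standard string types and the character type each one holds.
static_assert(std::is_same_v<std::string::value_type,    char>);
static_assert(std::is_same_v<std::wstring::value_type,   wchar_t>);
static_assert(std::is_same_v<std::u16string::value_type, char16_t>);
static_assert(std::is_same_v<std::u32string::value_type, char32_t>);

// The four matching view types (C++17).
static_assert(std::is_same_v<std::string_view::value_type,    char>);
static_assert(std::is_same_v<std::wstring_view::value_type,   wchar_t>);
static_assert(std::is_same_v<std::u16string_view::value_type, char16_t>);
static_assert(std::is_same_v<std::u32string_view::value_type, char32_t>);
```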

I could add type mismatch traps in here. I decided not to. Anybody who uses this with the wrong types will be greeted with very long compiler errors.

Also, I could have done a lot more template jockeying in here, using std::enable_if and such. Again, I have decided that is counter-intuitive for the majority of readers/users and achieves little. Illegal usage will simply be stopped by the compiler.
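For the curious, here is a rough sketch of what such traps could look like with static_assert instead of std::enable_if. This is not part of the solution in this post; the names is_std_string, is_std_string_view and checked_transform_to are all made up for this illustration:

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical trait: true only for std::basic_string instantiations.
template <typename>
struct is_std_string : std::false_type {};
template <typename C, typename Tr, typename A>
struct is_std_string<std::basic_string<C, Tr, A>> : std::true_type {};

// Hypothetical trait: true only for std::basic_string_view instantiations.
template <typename>
struct is_std_string_view : std::false_type {};
template <typename C, typename Tr>
struct is_std_string_view<std::basic_string_view<C, Tr>> : std::true_type {};

// Same conversion as before, with short and readable error messages
// when T or F is not one of the expected types.
template <typename T, typename F>
inline T checked_transform_to(F str) {
    static_assert(is_std_string<T>::value,
                  "T must be a std string type");
    static_assert(is_std_string<F>::value || is_std_string_view<F>::value,
                  "F must be a std string or std string view type");
    return T(str.begin(), str.end());
}
```

The price is two helper traits and more template machinery for the reader to digest, which is exactly the trade-off described above.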

Back to the subject. The above will not work for native string literals, which decay to plain pointers when passed by value. We could stop people from trying to compile that by deleting the overload that has a pointer argument:
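A sketch of that idea, assuming the transform_to shape described above (repeated so the snippet compiles on its own). The is_transformable probe is a hypothetical helper, added only so the effect can be checked without triggering the compile error itself:

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Assumed shape of the function discussed in this post.
template <typename T, typename F>
inline T transform_to(F str) {
    return T(str.begin(), str.end());
}

// Deleted overload: any pointer argument, including a decayed native
// string literal, selects this one, and the call fails to compile.
template <typename T, typename F>
T transform_to(F*) = delete;

// Hypothetical compile-time probe: can transform_to<std::string> be
// called with an argument of type F?
template <typename F, typename = void>
struct is_transformable : std::false_type {};
template <typename F>
struct is_transformable<
    F, std::void_t<decltype(transform_to<std::string>(std::declval<F>()))>>
    : std::true_type {};

static_assert(is_transformable<std::wstring>::value, "std strings are accepted");
static_assert(is_transformable<std::wstring_view>::value, "std views are accepted");
static_assert(!is_transformable<const wchar_t*>::value, "pointers are rejected");
```

With this in place, a call such as transform_to<std::string>(L"abc") is rejected with a short “use of deleted function” style message.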

Thus, I could bar any native literal usage and “force” users into standard C++ string and view literals only. But that is not very beginner friendly. Also, I like to provide comfortable APIs. So here is the overload that takes care of native string literals.
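A sketch of how such an overload might look, assuming the transform_to shape described above (repeated so the snippet compiles on its own). It takes the literal as a reference to an array, so the length is known at compile time, and forwards to the first function through a string view:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Assumed shape of the function discussed in this post.
template <typename T, typename F>
inline T transform_to(F str) {
    return T(str.begin(), str.end());
}

// Overload for native string literals, which are arrays of N characters.
// N counts the terminating zero, hence N - 1 characters are viewed.
// Assumption: the array really is a zero-terminated literal.
template <typename T, typename F, std::size_t N>
inline T transform_to(const F (&str)[N]) {
    return transform_to<T>(std::basic_string_view<F>(str, N - 1));
}
```

Now transform_to<std::string>(L"abc") compiles and yields "abc", while std strings and views still go through the first function. Note that this overload is meant to replace the deleted pointer overload, not to sit next to it.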

In standard C/C++, a native string literal is compiled into a char array. Two functions. One standard C++ solution.

For an advanced version which consists of one function and does all of this, plus any other standard character sequence type please jump here.

Now let us imagine the solution sketched here is all implemented. Here is one (almost) comprehensive test “suite”:
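A sketch of such a test “suite”, assuming the two transform_to overloads described above (repeated so the snippet compiles on its own). The helper names check_all_targets and run_tests are made up for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Assumed shape of the two functions discussed in this post.
template <typename T, typename F>
inline T transform_to(F str) {
    return T(str.begin(), str.end());
}
template <typename T, typename F, std::size_t N>
inline T transform_to(const F (&str)[N]) {
    return transform_to<T>(std::basic_string_view<F>(str, N - 1));
}

// Transform one source holding "abc" into each of the four std string types.
template <typename F>
inline void check_all_targets(const F& source) {
    assert(transform_to<std::string>(source) == "abc");
    assert(transform_to<std::wstring>(source) == L"abc");
    assert(transform_to<std::u16string>(source) == u"abc");
    assert(transform_to<std::u32string>(source) == U"abc");
}

inline void run_tests() {
    using namespace std::string_literals;
    using namespace std::string_view_literals;

    // std string sources
    check_all_targets("abc"s);
    check_all_targets(L"abc"s);
    check_all_targets(u"abc"s);
    check_all_targets(U"abc"s);
    // std string view sources
    check_all_targets("abc"sv);
    check_all_targets(L"abc"sv);
    check_all_targets(u"abc"sv);
    check_all_targets(U"abc"sv);
    // native string literal sources
    check_all_targets("abc");
    check_all_targets(L"abc");
    check_all_targets(u"abc");
    check_all_targets(U"abc");
}
```

Only ASCII content is used on purpose: as explained in NOTE 2, that is the range in which this kind of transformation is safe.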

To be 100% comprehensive there are more tests one can imagine here. I am sure if you have been reading until this point, you will understand they might be redundant.

That’s it. No codecvt in sight. Enjoy standard C++.

Appendix

In standard C++, when returning from a function, one does not need to repeat the return type. Instead of:

std::string fun(std::wstring str) {
    return std::string{ str.begin(), str.end() };
}

Standard C++ does allow:

std::string fun(std::wstring str) {
    return { str.begin(), str.end() };
}

A standard C++ compiler already knows std::string is the return type, so one can just write the brace-init list without mentioning the type again. This works (as ever) only if the return type has a suitable non-explicit constructor or if there is a user-defined conversion.
