C++ Convert any character sequence to any standard string

Four standard char types can be transformed into each other. Caveat emptor: if using C or C++, avoid char8_t.

[Update 2023-01-07] The author of this humble post has decided not to use C or C++ for any char, text, or string processing. It is just too much trouble for no obvious gain. No drama. Might use Go, a language whose designers include the two gentlemen who invented UTF-8.

[Update 2021-09-05] Here is the link to the whys and hows of Unicode, with a focus on Windows code. At last, managed to find the time.

[Update 2021-03-18] Code in here is updated and there is a link to the Godbolt working version too.

(Note: this is the second part of C++ : codecvt deprecated. Panic? )

Update: This is not a foreign language translator or some such code. This is a standard C++17 utility to transform the core character sets between each other. The first 128 characters (7-bit ASCII), that is. As such, it is remarkably useful and simple. For full-blown, locale-aware solutions please look elsewhere, starting from here. End of update.

The C++ standard library is a very complete and useful library. But there are times when you realize you can build one or two very simple utilities on top of it.

Simple but sometimes surprisingly powerful. Like perhaps this one is.

A mechanism for transforming any standard sequence of chars (i.e. one holding a standard char type) into any of the four standard string types, which are:

| Type | Definition |
|------|------------|
| std::string | std::basic_string<char> |
| std::wstring | std::basic_string<wchar_t> |
| std::u16string (C++11) | std::basic_string<char16_t> |
| std::u32string (C++11) | std::basic_string<char32_t> |

First, the reason you are here, the code:

One struct with one function call operator does it all. The usage:
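A minimal sketch of how such a struct and its usage might look, assuming C++17. The name `transform_to` and the separate overload for native arrays are assumptions made for illustration; the original listing may differ in detail. Each char is simply cast to the target char type, so only 7-bit ASCII content survives intact.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical name, for illustration only.
// String is the target type: one of the four standard string types.
template <typename String>
struct transform_to final {
    using char_type = typename String::value_type;

    // Any range that std::begin / std::end accept: strings, views, containers.
    template <typename Range>
    String operator()(const Range& input) const {
        return convert(std::begin(input), std::end(input));
    }

    // Native arrays, string literals included: drop one terminating zero, if present.
    template <typename Char, std::size_t N>
    String operator()(const Char (&input)[N]) const {
        const Char* last = input + N;
        if (*(last - 1) == Char{}) --last;
        return convert(input, last);
    }

private:
    // Element-wise cast; no encoding conversion is attempted.
    template <typename Iterator>
    static String convert(Iterator first, Iterator last) {
        String out;
        for (; first != last; ++first)
            out.push_back(static_cast<char_type>(*first));
        return out;
    }
};

// The usage: name the target string type, pass any sequence of standard chars.
const auto wide   = transform_to<std::wstring>{}("Hello");    // char literal to std::wstring
const auto narrow = transform_to<std::string>{}(wide);         // std::wstring to std::string
const auto u16    = transform_to<std::u16string>{}(U"Hello"); // char32_t literal to std::u16string
const auto u32    = transform_to<std::u32string>{}(std::vector<char>{'H', 'i'});
```

The array overload is the more specialized of the two, so overload resolution picks it for literals and the terminating zero does not end up in the result.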

And so on. Any standard sequence made up of standard chars will do as a legal input, as long as it has begin() and end() methods and the value_type typedef. Native string literals are legal input too.

char8_t is best avoided. We could also serve stunt programmers to a certain extent, too:

Perhaps (one might remark) we could code this in a more “resilient” way. But why should we? Using (for example) non-standard strings as return type simply will not compile.

And after all, it is certainly wise to wait for C++20 constraints and concepts to appear soon in a compiler near you. Applying that standard feature will certainly make for a more resilient and more user-friendly version.
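Until then, a C++17 approximation of such constraints could be sketched with type traits and static_assert. This is an illustration under assumptions, not the author's code: the names `is_std_char_v`, `is_std_string_v` and `checked_transform` are hypothetical, and the special handling of string literals is left out for brevity.

```cpp
#include <iterator>
#include <string>
#include <type_traits>

// Hypothetical trait: true only for the four standard char types used here.
template <typename T>
inline constexpr bool is_std_char_v =
    std::is_same_v<T, char> || std::is_same_v<T, wchar_t> ||
    std::is_same_v<T, char16_t> || std::is_same_v<T, char32_t>;

// Hypothetical trait: true only for the four standard string types.
template <typename T>
inline constexpr bool is_std_string_v =
    std::is_same_v<T, std::string> || std::is_same_v<T, std::wstring> ||
    std::is_same_v<T, std::u16string> || std::is_same_v<T, std::u32string>;

// Same element-wise cast as before, but bad types are named in the diagnostics.
template <typename String, typename Range>
String checked_transform(const Range& input) {
    using source_char = std::decay_t<decltype(*std::begin(input))>;
    static_assert(is_std_string_v<String>,
                  "target must be one of the four standard string types");
    static_assert(is_std_char_v<source_char>,
                  "input must hold one of the four standard char types");
    String out;
    for (auto ch : input)
        out.push_back(static_cast<typename String::value_type>(ch));
    return out;
}
```

A call such as `checked_transform<std::u32string>(std::string{"abc"})` compiles as usual, while a non-standard target or element type is rejected at compile time with one of the messages above among the diagnostics.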

In case you would like to try this yourself but need some guidance, do mail us, please.