This is Episode 3 from a miniseries on “Strong Types”
Motivation
You are a software developer, reading the header files of two phone book components for the first time. Your task is to select one, make your company pay for it, and then ask everyone to use it.
A
```c
/* Phone Book API -- A */
int phone_book_store ( const char * /*name*/, const char * /* surname */ ) ;
```
B
```c
/* Phone Book API -- B */
int phone_book_store( Name, Surname ) ;
```
Which one is better and why?
I vote B. I can see broken glass on the path of the developer who selects option “A”.
The omnipotent “Access violation”
And with that, we are entering yet another real-life use case, helping you to decide in favour of simple strong types.
Recently I was given the “honourable” task of finding and fixing a “mystery” bug in a pretty complex multi-threaded system. The core is written in C.
In software development we all know: “mystery” is one sure way to “misery”.
The symptom: run-time “Access violation” errors coming from previously working parts, in particular when freeing pointers into a corrupted heap. And not on every run.
It was one of those “But it was working before?!” kinds of mysteries. To cut a long story short, there was this function:
```c
/*
   make (on the heap) a name made up of id and origin
   following system specific logic
*/
char *make_name_( char * id_, char * origin_ ) ;
```
It is used by a lot of threads: it is called whenever one of them is started. It is part of the tried and tested common library. The name it returns is passed as the initial thread data, and it is the responsibility of the thread to free it once it has been used.
So. A lot of developers calling one simple function from a lot of locations, sprinkled around their complex code.
What could possibly go wrong?
What went wrong is very simple, and extremely hard to test for properly. Some of the callers had mixed up the order of the arguments.
```c
/* origin and id in wrong order. */
start_thread_with( make_name_( new_origin, next_id() ) );
```
But which callers? Where? We are talking about many components, many developers and several projects.
That mistake was easy to make. Both arguments are of the same type, and the compiler cannot spot logical mistakes.
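To make the failure mode concrete, here is a minimal sketch. The body of make_name_ is not shown in this post, so the "<origin>:<id>" format below is purely an assumption for illustration, and swapped_arguments_example is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the real function: the actual
   "system specific logic" is not shown, so "<origin>:<id>" is assumed. */
char *make_name_( char * id_, char * origin_ )
{
    size_t len = strlen(origin_) + 1 + strlen(id_) + 1 ;
    char * name = malloc(len) ;
    if ( name != NULL )
        snprintf( name, len, "%s:%s", origin_, id_ ) ;
    return name ; /* the caller must free() the result */
}

/* Both calls are accepted by the type checker; only one is right. */
void swapped_arguments_example( void )
{
    char id[]     = "42" ;
    char origin[] = "Sector 5" ;

    char * good = make_name_( id, origin ) ;  /* intended order */
    char * bad  = make_name_( origin, id ) ;  /* swapped, still compiles */

    assert( good != NULL && bad != NULL ) ;
    assert( strcmp( good, bad ) != 0 ) ;      /* silently a different name */

    free( good ) ;
    free( bad ) ;
}
```

Nothing in the types distinguishes the two calls, so the only way to find the swapped one is to read every call site or to wait for the damage at run time.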
Enter Strong Types
So instead of spending a lot of coordinated effort and time (and that means money) hunting for the wrong calls, I re-declared this one function like so.
```c
/*
   strong type is a struct with a single member
   struct name is a type name
*/
typedef struct { char * val ; } Id ;
typedef struct { char * val ; } Origin ;

/* make (on the heap) a name made up of id and origin */
char *make_name_( Id, Origin) ;
```
Infinitely better than the previous declaration.
Everything was recompiled and, lo and behold, the several “culprits” were found. Every existing call still passed plain char * arguments where Id and Origin were now expected, so each call site stopped compiling and had to be revisited. As soon as the arguments were re-declared as strong types, the callers realized they had made a mistake in the order of arguments.
Now the code required to call make_name_ makes it very obvious which argument is which.
```c
// calling from C
// the function where arguments are strong types

// declaring and defining strong type literals
// perhaps to some unusual but perfectly legal syntax
make_name_( (Id){"42"}, (Origin){"Sector 5"} );

// standard call syntax is not less instructional
Id id = {"42"};
Origin origin = {"Sector 5"};
char * name_ = make_name_( id, origin) ;
```
In any case, it is very obvious which value is used as which argument. Maintenance of software is often forgotten as a critical and potentially very costly activity. Anyone diving into the code above, even years after the release, will see immediately that the call to make_name_ is correct.
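For readers who want to try this, the following is a minimal self-contained sketch of the strong-typed version. As before, the function body and the "<origin>:<id>" format are assumptions made only for illustration, and strong_call_example is a hypothetical helper; only the typedefs and the signature come from the declaration above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char * val ; } Id ;
typedef struct { char * val ; } Origin ;

/* Hypothetical body: the real name-making logic is not shown,
   so "<origin>:<id>" is assumed. */
char *make_name_( Id id_, Origin origin_ )
{
    size_t len = strlen(origin_.val) + 1 + strlen(id_.val) + 1 ;
    char * name = malloc(len) ;
    if ( name != NULL )
        snprintf( name, len, "%s:%s", origin_.val, id_.val ) ;
    return name ; /* the caller must free() the result */
}

void strong_call_example( void )
{
    /* each argument names its own type at the call site */
    char * name_ = make_name_( (Id){"42"}, (Origin){"Sector 5"} ) ;
    assert( name_ != NULL ) ;
    assert( strcmp( name_, "Sector 5:42" ) == 0 ) ;
    free( name_ ) ;

    /* Swapping the arguments is now rejected by the compiler, because
       Id and Origin are distinct, incompatible struct types:
       make_name_( (Origin){"Sector 5"}, (Id){"42"} ) ;
    */
}
```

Uncommenting the swapped call turns the run-time mystery into a compile-time error at the exact line that is wrong.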
That “strong types as literals” C code is better than the C++ equivalent. Why? Please compare.
```cpp
// C++ has no compound literals
// so brace-initialized arguments are used instead
make_name_( {"42"}, {"Sector 5"} );
```
You have not seen this code before. You need to debug it. In the C++ call above, are you always going to be sure which argument is Id and which one is Origin?
```cpp
// C++
// Bug .. wrong order of arguments
make_name_( {"Sector 5"}, {"42"} );
```
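That is the same logical bug, again. It compiles in C++, but not in C: a C compound literal must spell out its type, as in (Origin){"Sector 5"}, so passing it in the Id position is a type error.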
Addendum
Before I left, the team decided they liked the approach and improved the declaration of the “offending function”:
```c
#include <stdio.h> /* BUFSIZ */

typedef struct { char * val ; } Id ;
typedef struct { char * val ; } Origin ;

#define Name_size BUFSIZ
typedef struct { char val[Name_size] ; } Name ;

/* make a name made up of id and origin */
/* static inline: a plain C99 inline needs a separate external definition */
static inline Name make_name_( Id id_, Origin origin_ )
{
    Name retval_ = { {0} } ; /* start from an empty string */
    /* special name making system, logic here */
    return retval_ ;
}
```
They added yet another strong type to this function's signature: Name, as the return type. The name is now returned by value, so there is no longer a heap pointer for the thread to free. After that, they compared the performance of this version with the previous version with no strong types.
The difference in favour of no strong types was negligible. They paid the cost, learned the lesson, and decided to stay with the “strong types” variant.
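A filled-in version of that declaration might look like the sketch below. The snprintf call and the "<origin>:<id>" format are assumptions standing in for the team's real name-making logic, and name_by_value_example is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct { char * val ; } Id ;
typedef struct { char * val ; } Origin ;

#define Name_size BUFSIZ
typedef struct { char val[Name_size] ; } Name ;

/* Hypothetical body: "<origin>:<id>" is an assumed format. */
static inline Name make_name_( Id id_, Origin origin_ )
{
    Name retval_ = { {0} } ;
    /* snprintf truncates rather than overflowing the fixed buffer */
    snprintf( retval_.val, sizeof retval_.val, "%s:%s",
              origin_.val, id_.val ) ;
    return retval_ ; /* returned by value: nothing to free */
}

void name_by_value_example( void )
{
    Name name_ = make_name_( (Id){"42"}, (Origin){"Sector 5"} ) ;
    assert( strcmp( name_.val, "Sector 5:42" ) == 0 ) ;
    /* no free() here: name_ lives in automatic storage */
}
```

The trade-off is a copy of Name_size bytes per call in exchange for removing the ownership hand-over between the caller and the thread.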
Option C
Ah and by the way. Option B is better than option A. But the best option I vote for is:
```c
/*
   Phone Book API -- Option C
   Strong type is returned too
*/
Phonebook_entry_handle phone_book_store( Name, Surname ) ;
```
Wink, wink … Enjoy.