Switch Operator
Motivation
Sometimes it is necessary to compare a value against a fixed set of other values and execute code specific to each of them. An example from the Compiler1 code:
if( t == U_FundamentalType::i8_ ) { return KeywordToString( Keyword::i8_ ); }
if( t == U_FundamentalType::u8_ ) { return KeywordToString( Keyword::u8_ ); }
if( t == U_FundamentalType::i16_ ) { return KeywordToString( Keyword::i16_ ); }
if( t == U_FundamentalType::u16_ ) { return KeywordToString( Keyword::u16_ ); }
if( t == U_FundamentalType::i32_ ) { return KeywordToString( Keyword::i32_ ); }
if( t == U_FundamentalType::u32_ ) { return KeywordToString( Keyword::u32_ ); }
if( t == U_FundamentalType::i64_ ) { return KeywordToString( Keyword::i64_ ); }
if( t == U_FundamentalType::u64_ ) { return KeywordToString( Keyword::u64_ ); }
Another example:
auto escaped_c= it.front();
it.drop_front();
if( escaped_c == "\""c8 || escaped_c == "\\"c8 )
{
    result_lexem.text.push_back( escaped_c );
}
else if( escaped_c == "b"c8 ){ result_lexem.text.push_back( "\b"c8 ); }
else if( escaped_c == "f"c8 ){ result_lexem.text.push_back( "\f"c8 ); }
else if( escaped_c == "n"c8 ){ result_lexem.text.push_back( "\n"c8 ); }
else if( escaped_c == "r"c8 ){ result_lexem.text.push_back( "\r"c8 ); }
else if( escaped_c == "t"c8 ){ result_lexem.text.push_back( "\t"c8 ); }
else if( escaped_c == "0"c8 ){ result_lexem.text.push_back( "\0"c8 ); }
else if( escaped_c == "u"c8 )
{
    // ...
As you can see, such code is implemented via chains of if-else operators.
Such chains are verbose and do not look great.
So, in order to tidy up such code, I decided to add to Ü something like the switch operator from C++.
Such an operator can be shorter than the equivalent if-else chain.
Initially I thought of implementing it via some library/built-in macro.
Such a macro could look like a true switch operator, but internally would produce the same if-else chain.
But I found some reasons not to do that.
I decided to implement the switch operator as a part of the language.
I will explain why later.
Implementation
So, a simple switch operator in Ü looks like this:
switch(x)
{
    0 -> { return 42; },
    1 -> { ++y; },
    2 -> { foo(); },
    default -> {}
}
Such an operator compares the value in () against the values before ->.
If the values are equal, control flow passes to the block after ->.
If no matching value is found, control flow passes to the default block.
But what is the difference between such a switch and an if-else chain?
There are a lot of differences!
It is possible to specify multiple values:
switch(x)
{
    0, 10, 66 -> { return 42; },
    15, 16 -> { ++y; },
    21, 22, 100, 500 -> { foo(); },
    default -> {}
}
It is similar to using multiple case labels in a C++ switch.
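For comparison, the C++ counterpart of the example above can be sketched with stacked case labels. The function name `classify` and the returned group numbers are hypothetical, chosen only to keep the sketch self-contained.

```cpp
// Hypothetical helper: maps a value to a group number using stacked case labels.
int classify(int x)
{
    switch (x)
    {
    case 0: case 10: case 66:
        return 1;
    case 15: case 16:
        return 2;
    case 21: case 22: case 100: case 500:
        return 3;
    default:
        return 0;
    }
}
```

Each value needs its own `case` keyword here, while the Ü form lists the values in a single comma-separated label.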
Ranges are also possible, including half-open ones:
switch(x)
{
    0 ... 10, 100 ... 1000 -> { return 42; },
    ... -16, 33 -> { ++y; },
    9999 ..., 77, 99 -> { foo(); },
    default -> {},
}
Standard C++ has no ranges in switch.
They exist only as extensions in some compilers (like GCC).
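In portable C++ the range example above has to fall back to comparisons. A minimal sketch follows, assuming that range bounds in Ü are inclusive on both ends (the text does not state this). The function `select_branch` is hypothetical and returns the number of the branch that would be taken, with 0 standing for default.

```cpp
// Hypothetical helper mirroring the branch layout of the range example.
// Assumption: both bounds of a range are inclusive.
int select_branch(int x)
{
    if ((x >= 0 && x <= 10) || (x >= 100 && x <= 1000)) return 1; // 0 ... 10, 100 ... 1000
    if (x <= -16 || x == 33) return 2;                            // ... -16, 33
    if (x >= 9999 || x == 77 || x == 99) return 3;                // 9999 ..., 77, 99
    return 0;                                                     // default
}
```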
The Main Feature
OK, great, switch in Ü is pretty flexible.
But this is not the main feature of the switch operator.
There is something else.
The switch operator does have some limitations, though.
It supports only integer, character and enum types.
And it supports only constexpr label values.
So, in some cases it is less flexible than an if-else chain.
But these limitations exist for a good reason.
Because all label values are compile-time constants and internally are just integers, the Ü compiler can statically check whether all possible values are handled inside the switch!
So, if you forget to handle some values, the compiler will complain about it:
fn Foo( i32 x ) : i32
{
    switch(x)
    {
        0, 1, 2 -> { return 42; },
        3, 4, 5 -> { return 24; },
        // Compilation error - values before 0 and values after 5 are not handled.
    }
}
You need to specify such values manually (one by one or via ranges) or add a default branch:
fn Foo( i32 x ) : i32
{
    switch(x)
    {
        0, 1, 2 -> { return 42; },
        3, 4, 5 -> { return 24; },
        default -> { return 0; }, // Ok - all other values are handled in "default" branch.
    }
}
There is another kind of check. The compiler will complain if the default branch is unnecessary:
enum E{ A, B, C }
fn Foo( E e ) : i32
{
    switch(e)
    {
        E::A -> { return 123; },
        E::B -> { return 456; },
        E::C -> { return 789; },
        default -> { return 0; }, // Compilation error - default branch is unreachable. Ü assumes that only values listed in the enum declaration are possible.
    }
}
The compiler can also check for overlapping and duplicate values:
fn Foo( i32 x ) : i32
{
    switch(x)
    {
        5 -> { return 1; },
        3 + 2 -> { return 2; }, // Error, label value "3 + 2" is equal to value of other label "5".
        10 ... 20 -> { return 3; },
        19 ... 1000 -> { return 4; }, // Error, this range overlaps with previous one.
        default -> { return 0; },
    }
}
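To illustrate how checks of this kind can work once every label is a constant integer range, here is a minimal C++ sketch. It is not the Ü compiler's actual code. The names `LabelRange` and `check_labels` and the message texts are hypothetical, and it assumes inclusive bounds, `low <= high` for every range, and a value domain narrower than 64 bits, so that `high + 1` cannot overflow.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Inclusive range of label values; a single value v is stored as {v, v}.
struct LabelRange
{
    std::int64_t low;
    std::int64_t high;
};

// Checks labels against the inclusive domain [type_min, type_max].
// Returns diagnostics; an empty result means the labels are disjoint
// and together cover the whole domain.
std::vector<std::string> check_labels(
    std::vector<LabelRange> ranges, std::int64_t type_min, std::int64_t type_max)
{
    std::vector<std::string> errors;
    std::sort(ranges.begin(), ranges.end(),
        [](const LabelRange& a, const LabelRange& b) { return a.low < b.low; });

    std::int64_t next_unhandled = type_min; // smallest value not covered so far
    for (const LabelRange& r : ranges)
    {
        if (r.low < next_unhandled)
            errors.push_back("overlap at " + std::to_string(r.low));
        else if (r.low > next_unhandled)
            errors.push_back("gap " + std::to_string(next_unhandled) + ".." + std::to_string(r.low - 1));
        next_unhandled = std::max(next_unhandled, r.high + 1);
    }
    if (next_unhandled <= type_max)
        errors.push_back("gap " + std::to_string(next_unhandled) + ".." + std::to_string(type_max));
    return errors;
}
```

In this sketch a default branch would make the reported gaps acceptable, and the absence of any gap would mean that the default branch is unreachable.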
Reason for static checks
So, why exactly does switch in Ü have such checks?
The main reason is code safety.
It is better to catch errors with static checks during compilation than to spend time debugging and testing.
Where possible, a programmer can use switch and get all its checks.
Where not, it is still possible to use an if-else chain.
Static checks are especially useful for enums. Consider this example:
enum E{ A, B, C }
// Some function, defined far away from the enum definition, possibly in another file/library.
fn Foo( E e ) : i32
{
    switch(e)
    {
        E::A -> { return 123; },
        E::B -> { return 456; },
        E::C -> { return 789; },
    }
}
Everything works and everyone is happy. But one day the declaration of the enum changes and new values are added:
enum E{ Before, A, B, C, AfterC }
After such an addition, compilation of the function with the switch breaks with messages like these:
test.u:6:10: error: Value 0 is not handled in switch.
test.u:8:10: error: Value 4 is not handled in switch.
And it is great that compilation breaks! If no such static checks existed, the program would compile successfully but contain a bug at this switch. And it might take a long time to find and fix this bug. But with such static checks it is trivial to find all such places and fix them:
fn Foo( E e ) : i32
{
    switch(e)
    {
        E::Before -> { return 999; },
        E::A -> { return 123; },
        E::B -> { return 456; },
        E::C -> { return 789; },
        E::AfterC -> { return 1; },
    }
}
Alternatively, it is possible to add a default branch.
But such a solution may not be good: when more enum values are introduced later, no new compilation errors will be generated, and it may be hard to find the places in the code that need to be changed.
Ü is not unique in having such static checks.
Some modern C++ compilers can also complain about unhandled enum values in switch.
The Rust compiler will also complain about unhandled variants in the match operator, though match in Rust works internally very differently from switch in Ü.
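As an illustration of the C++ side, GCC and Clang report unhandled enumerators through the -Wswitch warning (part of -Wall in GCC), provided the switch has no default label. The sketch below mirrors the enum example from this article. The function name `to_number` is hypothetical.

```cpp
enum class E { A, B, C };

// No default label on purpose: if a new enumerator is added to E,
// -Wswitch warns that it is not handled in this switch.
int to_number(E e)
{
    switch (e)
    {
    case E::A: return 123;
    case E::B: return 456;
    case E::C: return 789;
    }
    return 0; // Reached only for values outside the declared enumerators.
}
```

Unlike in Ü, this is only a warning unless it is promoted to an error, for example with -Werror=switch.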
Conclusion
Now the switch operator is available.
It is recommended to use it wherever possible.
With the switch operator I managed to rewrite some boilerplate code in Compiler1 (like some of the examples at the beginning of this article).
A small bonus: the switch operator is also useful with only a couple of handler blocks but many values.
It is still better than writing a chain of == and ||.
Code before:
fn IsWhitespace( char32 c ) : bool
{
    return
        c == " "c32 || c == "\f"c32 || c == "\n"c32 || c == "\r"c32 || c == "\t"c32 ||
        c <= char32(0x1Fu) || c == char32(0x7Fu);
}
Code after:
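fn IsWhitespace( char32 c ) : bool
{
    switch(c)
    {
        // The half-open range matches "c <= char32(0x1Fu)" from the code before.
        // It already covers "\f", "\n", "\r" and "\t", so listing them separately would be an overlap.
        " "c32, ... char32(0x1Fu), char32(0x7Fu) -> { return true; },
        default -> { return false; }
    }
}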
}