Function visibility evolution
In this article I will explain how the visibility of functions (and not only functions) evolved during the development of Ü.
Initial approaches
Development of the Ü language started in 2016, but active development began only in 2017. That year the LLVM library was introduced for code generation.
Initially all functions had external linkage, like any regular (non-template, non-static, non-inline) C++ function. This was the default behavior, and there was no reason to choose anything else.
In November 2017, after templates and imports were introduced, function linkage was changed. This was needed to avoid redefinition errors when linking files that contain identical instantiations of the same templates. From then on linkonce_odr was used, together with comdat any. This approach made templates work and even allowed all functions defined in common imported files to be treated as a sort of inline.
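For comparison, C++ faces the same problem with templates and inline functions defined in shared headers. The sketch below is only an analogy, not Ü code, and the names square, add_one and square_plus_one are made up for illustration. On ELF targets Clang typically emits such functions as linkonce_odr inside a comdat, so the linker can keep one copy out of the many identical ones.

```cpp
// Hypothetical contents of a header shared by several .cpp files.
// All names here are illustrative only.

// Implicit instantiations of a template are emitted in every translation
// unit that uses them, so the linker has to fold the copies into one.
template <typename T>
T square(T x) { return x * x; }

// An inline function is likewise defined in every including translation unit.
inline int add_one(int x) { return x + 1; }

// A caller that forces an instantiation of square<int> in this unit.
int square_plus_one(int x) { return add_one(square(x)); }
```

If two .cpp files both include these definitions, each object file carries its own square<int> and add_one, and the final program still links without redefinition errors.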
Moderate approach
The approach above was in use for quite a long time. It lasted even through the development of Compiler1 (2020).
During Compiler1 development I noticed that its build was becoming slower and slower. Initially I thought this was because of the class template model in Ü: each class template instantiation produces all methods, even if some of them are never used. Anyway, even with such slow compilation I managed to get Compiler1 to build itself. After that I decided to fix the slow compilation.
I found that there really were a lot of template methods in the resulting code. But generating the methods was itself pretty fast. Compiling all these methods into machine code was much slower. But do we really need to generate machine code for all these methods? Not at all!
Since some template methods are never used, it is safe to just remove them! The LLVM optimizer can freely remove unused methods once they have private linkage. linkonce_odr linkage and comdat are not needed here, because a template must be instantiated in each compilation unit anyway, so each user of a template gets its own copy of the template's method code.
So I changed function linkage from linkonce_odr to private for every function located inside a template. I also changed linkage to private for all generated methods of regular classes.
The result was shocking: the Compiler1 build became several times faster! It now took about 30 seconds to build Compiler1 instead of several minutes. Debug builds became much faster too.
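The effect of local linkage can be sketched with a C++ analogy. This is not Ü code and all names are made up for illustration. Functions in an anonymous namespace are local to their translation unit (Clang uses LLVM internal linkage for them, which is as removable as private), so the compiler knows every possible caller.

```cpp
// Illustrative names only; not taken from the Ü code base.
namespace {

// Local to this translation unit and called below, so it is kept
// (or inlined into its caller).
int used_helper(int x) { return x * 2; }

// Local to this translation unit and never called. No other unit can
// reference it, so the compiler may drop it without emitting machine code.
[[maybe_unused]] int unused_helper(int x) { return x * 3; }

} // namespace

// Externally visible entry point. It must be emitted even if nothing in
// this file calls it, because another translation unit might.
int entry(int x) { return used_helper(x) + 1; }
```

With many template methods per class and only a few of them called, most of the generated functions end up in the position of unused_helper.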
Advanced approach
The approach above was good, but not good enough.
I found that the Ü compiler was missing a very important feature: regular functions with private visibility, like static functions in C++. It was even necessary, since with linkonce_odr linkage it was possible to accidentally define functions with the same name (and signature) in different compilation units. The linker would silently merge them, breaking the resulting program without anyone noticing.
In July 2023 I decided to fix these issues.
I decided that the location of a function's declaration/definition could be used to control its visibility.
If a function is defined inside a non-main (imported) file, it can safely be made private. If you need to use this function, just import the file with its definition. If you import a file with a bunch of function definitions and use only some of them, the other functions will be optimized out.
But what if a function with private visibility is needed inside the main (compilation root) file? I found a solution for this problem. A function may be made private if it has no prototypes in imported files. There is a reason for this. External visibility is needed when a function defined in one compilation unit is used in another compilation unit. But in order to do this, the function prototype must be declared in a common imported file. So, if you use declarations inside a common file, you get external visibility automatically.
So, the new approach to function visibility is as follows:
- Every generated function is private
- Every function inside template is private
- Every function defined in imported file is private
- A function defined in the main file is private, unless it has a prototype in one of the imported files
- A function defined in the main file that has a prototype in one of the imported files has external linkage
- A nomangle function defined in the main file also has external linkage, even if no prototype exists in any imported file
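The rules above can be sketched as a small decision function. This is a minimal C++ model, not the actual Ü compiler code: the FunctionInfo struct, its field names, and choose_linkage are all hypothetical, and the order in which the checks are applied is an assumption.

```cpp
// Hypothetical model of the linkage rules; not the real compiler code.
enum class Linkage { Private, External };

// Facts about a function that the rules depend on (illustrative field names).
struct FunctionInfo {
    bool is_generated = false;                   // compiler-generated method
    bool is_inside_template = false;             // located inside a template
    bool is_defined_in_main_file = false;        // false: defined in an imported file
    bool has_prototype_in_imported_file = false; // declared in some imported file
    bool is_nomangle = false;                    // marked nomangle
};

// Assumed precedence: generated and template functions first,
// then the location of the definition, then prototype/nomangle.
constexpr Linkage choose_linkage(const FunctionInfo& f) {
    if (f.is_generated || f.is_inside_template)
        return Linkage::Private;
    if (!f.is_defined_in_main_file)
        return Linkage::Private;
    if (f.has_prototype_in_imported_file || f.is_nomangle)
        return Linkage::External;
    return Linkage::Private;
}
```

In this model a function defined in the main file with no prototype elsewhere maps to Linkage::Private, and setting either has_prototype_in_imported_file or is_nomangle switches it to Linkage::External.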
The approach described above was also applied to some non-function symbols:
- Every immutable global variable is private
- A mutable variable defined in an imported file is external and has a comdat (for deduplication)
- The TypeID table of a polymorph class is private if the class is declared in the main file, otherwise it is external and has a comdat
- The virtual table of a polymorph class is always private
With this approach it is no longer possible to get a silent merge of distinct functions during linking. External linkage without a comdat prevents it: two definitions of the same symbol now produce a link error instead of being merged. And creating a private function is now as simple as it can be. Just define a function and it will be private, unless you also create a declaration for it in another file.
Conclusion
Now Ü has a reliable and configurable function visibility model. It is especially nice that no special language constructs are required to control visibility (such as static in C++ or pub in Rust).
The current behavior is also friendly to compilation speed and runtime performance. The Ü compiler's preference for private linkage lets it discard unused functions, which speeds up builds and reduces object file size. Private linkage also allows the LLVM optimizer to inline functions more aggressively, which improves the performance of the resulting code.
Will the current linkage model change? It is likely (as history shows), but I think that it will not change as drastically as before.