Minimal ABI Example
In compiled languages such as C++, developers can use and build shared libraries, also known as dynamic libraries. To depend on a shared library (in an executable or another shared library), said library needs to be present at compile time,1 but the symbols from the library (functions, types, global variables, etc.) are actually resolved at runtime (when something is executed) by the dynamic linker.
Shared libraries have many advantages. They can be shared between multiple programs, reducing memory and disk usage. Because they are resolved at runtime, they can be changed without impacting any program that depends on them. This is convenient for delivering updates to users without requiring many tools on their machine (which can be hard to get right). It also means we can wait until program execution to choose the most appropriate library. For instance, a library such as NumPy is compiled against BLAS. BLAS has many implementations to choose from, for instance OpenBLAS, ATLAS, and BLIS. One of them, Intel MKL, is built specifically for Intel CPUs, so we can expect it to be more efficient on these machines. For NumPy, however, it does not change anything: it only needs to compile against one BLAS library, and it will work with all BLAS libraries.
All of them? What if we took something completely different and called it BLAS?
As you can expect, that will not work.
Some information (e.g. function names) has been looked up when compiling against the shared library, and this information needs to be the same in the library used at runtime.
This is known as the library interface: the observable boundary of components (functions, classes, etc.) that can be used to interact with the library.
In non-compiled languages (e.g. Python), it is the same as the Application Programming Interface (API), i.e. the way we (humans) can write code using a library.
For shared libraries, however, there also exists an Application Binary Interface (ABI) that defines the way computers can interact with them (in binary form).
Understanding ABI is highly technical.
It can change even if the shared library API (or even all of its source code) remains unchanged.2
ABI is what makes a shared library compatible with another.
The first thing that impacts ABI is the CPU architecture.
If we build a library for a given architecture, say x86-64, we cannot expect it to be interchangeable with one built for another, like ARM64.
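As a rough illustration of how the target architecture is baked in at compile time, the sketch below reports it through macros that GCC, Clang, and MSVC predefine. The helper names are hypothetical and not part of libcomplex.

```cpp
#include <cstddef>
#include <string>

// Name of the CPU architecture this translation unit was compiled for,
// based on macros that common compilers predefine.
// Hypothetical helper for illustration only.
std::string target_architecture() {
#if defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "ARM64";
#else
    return "other";
#endif
}

// Sizes of fundamental types are also fixed by the platform ABI,
// so they are known (and frozen) when the binary is built.
std::size_t pointer_size() { return sizeof(void*); }
std::size_t long_size() { return sizeof(long); }
```

The answer is decided when the code is compiled, not when it runs, which is why a binary built for one architecture cannot simply be swapped for one built for another.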
If we want to update a library without having to recompile its dependent programs, the ABI has to remain stable. This is why in this post we will build our intuition by studying an example that breaks the ABI.
Should you care about ABI stability in your library?
If you are wondering this, then chances are the answer is no.
Libraries that are ABI stable across many versions and implementations are usually cornerstones of
our computing ecosystem and require dramatically more work and knowledge than what is in this post.
With this article, you will however better understand the struggle of package managers.
libcomplex 0.1.0 - Hello world
Complex numbers are at the foundation of many algorithms and applications, so we have decided to put much of our knowledge into a library, libcomplex, so that other developers can build on it.
Watch out for bugs, it is still in beta! 😉
We build a class to represent complex numbers and put it in our include/complex.hpp header.
It uses the Cartesian representation, but can compute a complex number's modulus and argument.
We also added an operator<< overload to get a human-readable representation.
#pragma once
#include <ostream>
struct Complexe {
    // The constructor is defined inline, directly in the header.
    Complexe(double real, double imaginary) noexcept :
        real(real), imaginary(imaginary) {}
    double modulus() const noexcept;
    double argument() const noexcept;
    friend std::ostream& operator<<(std::ostream& out, const Complexe& c);
private:
    double imaginary;
    double real;
};
And we put the implementation of the methods in the associated src/complex.cpp file
#include <cmath>
#include "complex.hpp"
double Complexe::modulus() const noexcept {
    return std::sqrt(real*real + imaginary*imaginary);
}
double Complexe::argument() const noexcept {
    if(modulus() == 0.){
        return std::nan("");
    } else if(imaginary < 0.){
        return - std::asin(real / modulus());
    }
    return std::asin(real / modulus());
}
std::ostream& operator<<(std::ostream& out, const Complexe& c) {
    out << c.real << " + " << c.imaginary << 'i';
    out << " = " << c.modulus() << "e^(" << c.argument() << " i)";
    return out;
}
Our library is an overnight success, and many projects use it as a dependency!
One of them, awesome-app, has the following code
#include <iostream>
#include <string>
#include "complex.hpp"
int main(int argc, char** argv) {
    const Complexe z{std::stod(argv[1]), std::stod(argv[2])};
    std::cout << z << '\n';
}
Because compiling a library can be tricky to get right, we write a short CMake file to do it.
For simplicity, we will add awesome-app to the same file, but in practice these would be two separate projects.
cmake_minimum_required(VERSION 3.13)
project(Complex VERSION 0.1.0 LANGUAGES CXX DESCRIPTION "Complex number library")
add_library(complex SHARED src/complex.cpp)
target_include_directories(complex PUBLIC "${CMAKE_CURRENT_SOURCE_DIR}/include")
target_compile_features(complex PUBLIC cxx_std_14)
add_executable(awesome-app src/awesome-app.cpp)
target_link_libraries(awesome-app PRIVATE complex)
And now we compile and run everything
cmake -B build
cmake --build build
./build/awesome-app 3 4
Yields
3 + 4i = 5e^(0.643501 i)
libcomplex 0.2.0 - Pardon my French
Too excited about our library, we named our class Complexe, in French, instead of the English Complex.
We make the change and try to recompile only our library (i.e. not awesome-app).
cmake --build build --target complex
Now if our dependent awesome-app updates the library without recompiling
./build/awesome-app 3 4
It will get an error similar to the following (this is on macOS)
dyld: Symbol not found: __ZlsRNSt3__113basic_ostreamIcNS_11char_traitsIcEEEERK8Complexe
Referenced from: ./build/awesome-app
Expected in: ./build/libcomplex.dylib
This means that the dynamic loader cannot find the (binary) function to print a Complexe (the names in the output are mangled).
Indeed, when recompiling the library we replaced the function to print a Complexe with a function to print a Complex.
The ABI is no longer compatible.
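To read the mangled name in the error, one option is to demangle it. The sketch below assumes GCC or Clang, whose <cxxabi.h> header exposes abi::__cxa_demangle; the demangle helper and the missing_symbol constant are hypothetical names used only for illustration.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>
#include <string>

// Turn a mangled symbol back into a readable C++ declaration.
// Returns an empty string if the name cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    char* readable =
        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return "";
    }
    std::string result(readable);
    std::free(readable);  // the buffer was allocated with malloc
    return result;
}

// The symbol from the dyld error above, minus the extra leading
// underscore that macOS prepends to symbol names.
const std::string missing_symbol =
    "_ZlsRNSt3__113basic_ostreamIcNS_11char_traitsIcEEEERK8Complexe";
```

Calling demangle(missing_symbol) should give something close to an operator<< taking a std::__1::basic_ostream reference and a Complexe const&. The class name is part of the symbol, which is why renaming the class makes the old symbol disappear from the library.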
awesome-app needs to be recompiled
cmake --build build --target awesome-app
And this time we have the compilation error
./src/awesome-app.cpp:8:8: error:
unknown type name 'Complexe'; did you mean 'Complex'?
const Complexe z{std::stod(argv[1]), std::sto...
^~~~~~~~
Complex
Indeed, the name has changed.
The API has also been modified, and awesome-app needs to modify its code to replace Complexe with Complex.
Renaming a public class like this changes the API and breaks the ABI along with it, so hoping for ABI stability here was a lost cause anyway.
libcomplex 0.2.1 - Bug fix
In our enthusiasm, we made a mistake in the formula to compute a complex number's argument.
We used asin instead of acos.
To be fair, there is an equivalent formula using asin, and also one using atan.
There is actually a function in the standard library to compute this angle directly: std::atan2.3
Because it is made just for this, we can expect it to be numerically more accurate.
It also reduces the amount of code we have to maintain and test, so we make the switch.
double Complex::argument() const noexcept {
    return std::atan2(imaginary, real);
}
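To see the effect of the fix in isolation, here is a small sketch that puts both formulas side by side as free functions. The names argument_v010 and argument_v021 are hypothetical; they only mirror the two implementations of Complex::argument shown in this post.

```cpp
#include <cmath>

// Formula shipped in libcomplex 0.1.0 (uses asin, which is the bug).
double argument_v010(double real, double imaginary) {
    const double modulus = std::sqrt(real * real + imaginary * imaginary);
    if (modulus == 0.) {
        return std::nan("");
    }
    const double angle = std::asin(real / modulus);
    return imaginary < 0. ? -angle : angle;
}

// Formula shipped in libcomplex 0.2.1.
// Note the argument order of std::atan2: y (imaginary) first, x (real) second.
double argument_v021(double real, double imaginary) {
    return std::atan2(imaginary, real);
}
```

For 3 + 4i the old formula gives about 0.643501 and the new one about 0.927295. The difference is even more visible for the number i, where the old formula returns 0 instead of π/2. One behavior change to keep in mind: for 0 + 0i the old code returned NaN, whereas std::atan2(0., 0.) returns 0.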
If we recompile only the library
cmake --build build --target complex
And use it directly without recompiling awesome-app,
./build/awesome-app 3 4
We get
3 + 4i = 5e^(0.927295 i)
The ABI remains unchanged, so we are able to deliver a bug fix to our users (and the users of awesome-app) by simply swapping in the new library.
This could happen without any action from the developers of awesome-app!
libcomplex 0.3.0 - ABI break
While putting some documentation in our code, we notice the following attributes in the Complex class
struct Complex {
    ...
private:
    double imaginary;
    double real;
};
Complex numbers are usually represented with their real part first, so we would like our code to look the same. We swap the attributes
struct Complex {
    ...
private:
    double real;
    double imaginary;
};
These are private attributes. They cannot be accessed outside of the class, so our API has not changed. If we try again to only swap in the new library
cmake --build build --target complex
And use it directly without recompiling awesome-app,
./build/awesome-app 3 4
This time, we get
4 + 3i = 5e^(0.643501 i)
Notice how 3 and 4 have been swapped (and the phase has changed accordingly), just like we swapped the attributes in the class?
This happens because when the complex.hpp header gets included in awesome-app.cpp, some code for Complex gets generated in awesome-app.
In particular, the compiler lays out the attributes in declaration order (as if the class were a tuple) and discards their names.
The constructor of Complex is defined in the header, so when the compiler compiles awesome-app, it puts the imaginary part first and the real part afterwards.
When we use a function from the updated library, the library expects on the contrary to find the real attribute first and the imaginary second, hence the flip.
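The flip can be reproduced in a single file. The sketch below is an illustration under simplifying assumptions: LayoutV021 and LayoutV030 are hypothetical stand-ins with public members for the two versions of Complex, and a memcpy plays the role of the library reading bytes that the application wrote.

```cpp
#include <cstddef>
#include <cstring>

// Layout awesome-app was compiled against (libcomplex 0.2.1): imaginary first.
struct LayoutV021 {
    double imaginary;
    double real;
};

// Layout the 0.3.0 library code expects: real first.
struct LayoutV030 {
    double real;
    double imaginary;
};

static_assert(sizeof(LayoutV021) == sizeof(LayoutV030),
              "same size, but the bytes mean something different");

// Mimic the old inline constructor writing the object, followed by
// the new library reading the very same bytes.
LayoutV030 read_with_new_library(double real, double imaginary) {
    const LayoutV021 written_by_app{imaginary, real};  // declaration order
    LayoutV030 seen_by_library;
    std::memcpy(&seen_by_library, &written_by_app, sizeof(seen_by_library));
    return seen_by_library;
}
```

With real = 3 and imaginary = 4, the returned object has real == 4 and imaginary == 3: only the offsets survive compilation, not the member names.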
We made a modification that left the API unchanged but changed the ABI.
This is a particularly nasty one, because it does not give any errors.
The program is completely well behaved as far as the computer is concerned, but it does not compute what we expect.
As before, the solution is to recompile awesome-app.
The topic of ABI stability is extremely complex.
For instance, renaming imaginary to any other name would have preserved the ABI.
Similarly, if the constructor was only declared in include/complex.hpp
struct Complex {
    Complex(double real, double imaginary) noexcept;
    ...
};
And defined in src/complex.cpp
Complex::Complex(double real, double imaginary) noexcept :
    real(real), imaginary(imaginary) {}
Then, in this limited example, we could swap the attributes without changing the ABI.
I hope this example helped you understand how and when package managers can decide to replace one shared library with another, and hopefully it will help you figure out related bugs in the future. This post was meant to be illustrative and is not an incentive to develop ABI-stable libraries. In particular, the examples in this post may behave differently in a more complete library.
- For instance, GCC changed its C++ standard library ABI in GCC 5. Therefore the same shared library compiled with GCC < 5 and GCC >= 5 may not be ABI compatible.
- Ok, there is also a std::complex type, but 🤫.