I just released mrcal 2.5; much more about that in a future post. Here, I'd like to talk about some implementation details.
libpython3 and linking
mrcal is a C library and a Python library. Much of mrcal itself interfaces the C
and Python libraries. And it is common for external libraries to want to pass
Python mrcal.cameramodel objects to their C code. The obvious way to do this
is with a converter function passed via an O& argument to
PyArg_ParseTupleAndKeywords(). I wrote this mrcal_cameramodel_converter()
function, which opened a whole can of worms when thinking about the compiling
and linking and distribution of this thing.
function, which opened a whole can of worms when thinking about the compiling
and linking and distribution of this thing.
mrcal_cameramodel_converter() is meant to be called by code that implements
Python-wrapping of C code. This function will be called by the
PyArg_ParseTupleAndKeywords() Python library function, and it uses the Python
C API itself. Since it uses the Python C API, it would normally link against
libpython. However:
- The natural place to distribute this is in libmrcal.so, but this library
  doesn't touch Python, and I'd rather not pull in all of libpython for this
  utility function, even in the 99% case when that function won't even be called
- In some cases linking to libpython actually breaks things, so I never do that
  anymore anyway. This is fine: since this code will only ever be called by
  libpython itself, we're guaranteed that libpython will already be loaded, and
  we don't need to ask for it.
OK, let's not link to libpython then. But if we do that, we're going to have
unresolved references to our libpython calls, and the loader will complain
when loading libmrcal.so, even if we're not actually calling those functions.
This has an obvious solution: the references to the libpython calls should be
marked weak. That won't generate unresolved-reference errors, and everything
will be great.
OK, how do we mark things weak? There are two usual methods:
- We mark the declaration (or definition?) of the relevant functions with
  __attribute__((weak))
- We weaken the symbols after the compile with objcopy --weaken
Method 1 is more work: I don't want to keep track of what Python API calls I'm
actually making. This is non-trivial, because some of the Py_...() invocations
in my code are actually macros that internally call functions that I must
weaken. Furthermore, all the functions are declared in Python.h, which I don't
control. I can re-declare stuff with __attribute__((weak)), but then I have to
match the prototypes. And I have to hope that re-declaring these will make
__attribute__((weak)) actually work.
So clearly I want method 2. I implemented it:
python-cameramodel-converter.o: %.o:%.c
	$(c_build_rule)
	mv $@ _$@
	$(OBJCOPY) --wildcard --weaken-symbol='Py*' --weaken-symbol='_Py*' _$@ $@
Works great on my machine! But it doesn't work on other people's machines,
because only very recent objcopy can weaken references. Apparently the older
tools only weaken definitions, which isn't useful to me; handling references is
a recent addition.
Well that sucks. I guess I will need to mark the symbols with
__attribute__((weak)) after all. I use the nm tool to find the symbols that
should be weakened, and I apply the attribute with this macro:
#define WEAKEN(f) extern __typeof__(f) f __attribute__((weak));
The prototypes are handled by __typeof__. So are we done? With gcc, we are
done. With clang we are not done. Apparently this macro does not weaken symbols
generated by inline function calls when using clang; I have no idea if this is a
bug. The Python internal machinery has some of these, so this doesn't weaken
all the symbols. I give up on the people who both have a too-old objcopy and
are using clang, and declare victory. So the logic ends up being:
- Compile
- objcopy --weaken
- Run nm to find the non-weak Python references
- If there aren't any, our objcopy call worked, and we're done!
- Otherwise, compile again, but explicitly ask to weaken those symbols
- Run nm again to see whether the compiler failed to do it
- If any non-weak references still remain, complain and give up.
Whew. This logic appears here and here. There were even more things to deal
with here: calling nm and objcopy needed special attention and build-system
support in case we were cross-building. I took care of it in mrbuild.
This worked for a while. Until the converter code started to fail. Because ….
Supporting old Python
…. I was using PyTuple_GET_ITEM(). This is a macro to access PyTupleObject
data. So the layout of PyTupleObject ended up encoded in libmrcal.so. But
apparently this wasn't stable, and changed between Python3.13 and Python3.14. As
described above, I'm not linking to libpython, so there's no NEEDED tag to
make sure we pull in the right version. The solution was to call the
PyTuple_GetItem() function instead. This is unsatisfying, and means that in
theory other stuff here might stop working in some Python 3.future, but I'm
ready to move on for now.
There were other annoying gymnastics that had to be performed to make this work with old-but-not-super-old tooling.
The Python people deprecated PyModule_AddObject(), and added PyModule_Add()
as a replacement. I want to support Pythons from before and after this change,
so I needed some if statements. Today the old function still works, but
eventually it will stop, and I would have had to do this work sooner or later
anyway.
Supporting old C++ compilers
mrcal is a C project, but it is common for people to want to #include the
headers from C++. I widely use C99 designated initializers (27 years old in C!),
which causes issues with not-very-old C++ compilers. I worked around the
initialization in one spot, and disabled a feature for too-old compilers in
another spot. Fortunately, semi-recent tooling supports my usages, so this is
becoming a non-issue as time goes on.