Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weak linking CMake module #47

Closed
wants to merge 8 commits into from

Conversation

opadron
Copy link
Member

@opadron opadron commented Jul 6, 2016

Adapted from vtk#1511.

Adds a new experimental CMake module: targetLinkLibrariesWithDynamicLookup that allows one to portably indicate that a target uses symbols from another library, but that the resolution of symbols should be delayed until runtime, i.e.: the linker should tolerate undefined symbols when linking. "Weak linking" has been used to refer to this code sharing scheme as well as its many subtle variants.

Reviewers, please note the major differences in this PR when compared to vtk#1511.

  • This version uses two instances of a sample project with try_compile to infer the level of support for dynamic symbol resolution. The first determines if the linker responds to the dynamic lookup flag currently present in linker on OS X. The second determines if the linker and run-time loader can properly coalesce symbols that have been duplicated across link boundaries, such as ld/Linux. The original version only included the first check.
  • In addition, each sample project is tested after compiling to ensure that the test program produces the expected output as well as compile successfully. The original used a much simpler test program that would only compile.
  • In the original version, no link is performed when the linker is found to not support the dynamic lookup flag. In general, a proper linking is still necessary on platforms like Linux, whose loader can coalesce duplicate symbols at run-time. In this version, the link is performed when such a loader has been detected.
  • The original version's system checks were performed immediately upon module load and then cached. This version runs the system checks and caches on demand.

Also, includes a fix for when testing the tower-of-babel sample project under anaconda.

@AppVeyorBot
Copy link

set(run_output)

if(project_compiles)
execute_process(COMMAND "${test_project_bin_dir}/main"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

execute_process(COMMAND ${CMAKE_CROSSCOMPILING_EMULATOR} "${test_project_bin_dir}/main"

@blowekamp
Copy link
Contributor

blowekamp commented Jul 6, 2016

In the original version, no link is performed when the linker is found to not
support the dynamic lookup flag. In general, a proper linking is still necessary
 on platforms like Linux, whose loader can coalesce duplicate symbols at 
run-time. In this version, the link is performed when such a loader has been 
detected.

The goal of the original cmake file (in SimpleITK) was NOT to link again libpython on Linux and OSX as per distutils and Pypa recommendations. This sounds like you are trying to still link against the library?

I am not sure what the goal of this is now...

endfunction()


function(target_link_libraries_with_dynamic_lookup target)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the original logic:

if ( ${CMAKE_SYSTEM_NAME} MATCHES "Darwin" )
     set_target_properties( ${target} PROPERTIES LINK_FLAGS "-undefined dynamic_lookup" )
   elseif(VTK_UNDEFINED_SYMBOLS_ALLOWED)
     # linker allows undefined symbols, let's just not link
   else()
     target_link_libraries ( ${target} ${ARGN}  )
   endif()

This is doing something different now.

@blowekamp
Copy link
Contributor

blowekamp commented Jul 6, 2016

I am also going to point out that having a single executable linked to two shared libraries has quite a different scope for symbol resolution than an executable dynamically loading at runtime two separate libraries.

@jcfr
Copy link
Contributor

jcfr commented Jul 6, 2016

@blowekamp
Copy link
Contributor

blowekamp commented Jul 6, 2016

This I think is how I began down this issue by naively linking against the shared library:

Perhaps it would be work testing on the manylinux1 docker image too?

@opadron
Copy link
Member Author

opadron commented Jul 6, 2016

The logic change was on purpose, but clearly, it was a mistake if everything I read is true.

# linker allows undefined symbols, let's just not link

So, my main confusion stems from the fact that I could not replicate this behavior on Linux -- the linker always refuses to leave the symbols undefined. Is there a linker flag I should be using?

I am also going to point out that having a single executable linked to two shared libraries has quite a different scope for symbol resolution than an executable dynamically loading at runtime two separate libraries.

This sounds right to me, but I'm not sure what the point of bringing this up is? Neither of those scenarios are being tested in the test project. Should they be? Besides this tolerance for undefined symbols that I have yet to observe, is there anything we should be checking for that we're not?

@blowekamp
Copy link
Contributor

blowekamp commented Jul 6, 2016

What is your platform and what exactly are you trying to do to test if "undefined" symbols are allowed? The original CMake file seems to pass on all my linux (elf binaries) systems I have tests [1].

I may have been mistaken about what this CMake code and the test project was trying to do. And I don't know why the test is constructed this way as opposed to other permutations of executable, linked library and runtime loaded library.

I also don't know what problem or issue the HAS_SYMBOL_DEDUPE is trying to address.

I thought I was understanding the linker process pretty well now, but I am guess I have more to read.

[1] https://github.com/SimpleITK/SimpleITK/blob/master/CMake/sitkTargetLinkLibrariesWithDynamicLookup.cmake#L44-L64

@opadron
Copy link
Member Author

opadron commented Jul 6, 2016

I haven't actually tried that particular test code, but I was more referring to building actual Python extension modules. When trying to build a simple extension module, the link fails with undefined symbol errors on Linux. That is- when I don't link to any libpython.

The test project in this PR is based on my experiments here. There is a shared lib, L; a module lib, M; and the executable, E. E is linked against L. E also dynamically loads M. And M, itself uses code from L, but the main question is whether M should be linked against L. This test was supposed to emulate a typical example of building a python extension module. As an aside, I haven't been able to build M without linking it against L on Linux, I run into the exact same problems as I did with actual Python extension modules.

I figure besides library version mismatches, the only other potential source of problems with linking for both the module and the executable is that it may result in code duplication. If the library had some shared state, we should be able to observe the effects of such duplication by inspecting the shared state from the perspectives of E and M, and then comparing. If there is code duplication, the counting routines from each would have their own counters instead of sharing. If the loader coalesces the duplicate symbols, then they would share, which is the desired behavior. This is what I check for under the "SYMBOL_DEDUPING" name.

I'm going to check to see if I can get your simpler test code to compile and link. If so, then there's clearly something about how linking on Linux works that I'm not privy to.

@blowekamp
Copy link
Contributor

Another specific case where not linking solved a problem was on OS X. With binaries of SimpleITK compiled against the system Python 2.7 with explicit linking, when these would create a load error when run on Anaconda or home brew. However, if the "-undefined dynamic_lookup" was used and not linked against libpython, then the binaries would work!

SimpleITK is not linking against libpython, on every linux system I have tested RH5,RH6,RH7, Cent OS5, Some Debian, Ubuntu 12, Ubuntu 14....

I'd be interested in a minimal example and the link line used. I'm thinking of starting tests with gcc command lines here.

@opadron
Copy link
Member Author

opadron commented Jul 6, 2016

Well, I've finally gotten extension modules to build and link on Linux as you and others describe -- without linking.

I have no idea what I'm doing differently, but I'm now a believer. 😅

I want to try to incorporate this into my weak-linking-demo and see how that changes things. Then, I'll come back to this and make the necessary corrections.

Thanks for keeping me honest. 👍

@opadron
Copy link
Member Author

opadron commented Jul 8, 2016

Check out opadron/weak-linking-demo#3 for an updated version of the module and an updated discussion in the README.

@AppVeyorBot
Copy link

@AppVeyorBot
Copy link

@opadron
Copy link
Member Author

opadron commented Jul 16, 2016

This should be ready to go.

@AppVeyorBot
Copy link

endif()

file(APPEND "${test_project_src_dir}/main.c" "
result = count() != 0 ? 1 :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 -> EXIT_FAILURE

@opadron
Copy link
Member Author

opadron commented Jul 16, 2016

continued in #51

@opadron opadron closed this Jul 16, 2016
@opadron opadron deleted the weak-linking-module branch July 16, 2016 21:38
jcfr added a commit to jcfr/weak-linking-demo that referenced this pull request Sep 25, 2016
This commit integrates changes associated with the module from
scikit-build/scikit-build@a415df9

The corresponding changes were originally added in scikit-build/scikit-build#47
and scikit-build/scikit-build@a606dc7 (handle missing target case)

IMPORTANT:

This backported commit causes all tests to *FAIL*. It has been kept
like this so that the next commit can clearly illustrate what is
required to fix them.

It also mean that the implementation in "scikit-build" is broken.


For example, on Linux, output is:

$ rm  -rf _build/ && dockcross-manylinux-x64 ./testAllBuilds.bash

CASE  CONFIG BUILD  TEST
s00   fail   fail   RTE
s01   fail   fail   RTE
s10   fail   fail   RTE
s11   pass   pass   pass
d00   fail   fail   RTE
d01   fail   fail   RTE
d10   fail   fail   RTE
d11   pass   pass   pass


whereas the following is expected:

CASE  CONFIG BUILD  TEST
s00   pass   pass   RTE
s01   pass   pass   pass
s10   pass   pass   RTE
s11   pass   pass   pass
d00   pass   pass   RTE
d01   pass   pass   pass
d10   pass   pass   RTE
d11   pass   pass   pass
jcfr added a commit that referenced this pull request Sep 25, 2016
This commit has two purposes:

(1) it fixes logic checking for cross-compilation (the regression
was introduced by #51 and #47

(2) It configure the text project setting CMAKE_ENABLE_EXPORTS to ON. Doing
so ensure the executable compiled in the test exports symbols (if supported
by the underlying platform).
See https://cmake.org/cmake/help/v3.6/prop_tgt/ENABLE_EXPORTS.html?highlight=enable_export
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants