Add cmake flag USE_FATBIN_COMPRESSION, ON by default #19123

DickJC123 · 2020-09-11T23:18:06Z

As libmxnet.so grows toward 2GB, driven by added functionality and additional CUDA architectures, we're running into link failures; see e.g. issue #17045.

One technique that lowers lib size dramatically is 'fatbin compression', enabled by the nvcc options --fatbin-options -compress-all. This has always been part of the Makefile builds; this PR adds it to the cmake builds. Specifically, this PR adds support to CMakeLists.txt for the cmake option -DUSE_FATBIN_COMPRESSION={ON,OFF}, with a default of ON for CUDA 11 builds and beyond. This PR proposes to leave existing cmake builds against 10.2 as they are, without fatbin compression, to avoid introducing unforeseen consequences to existing use cases.
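For readers unfamiliar with how such a switch is usually wired up, here is a minimal sketch of what the option could look like in a CMakeLists.txt. It is not the actual diff of this PR: the help string and the unconditional default are illustrative assumptions, and only the option name and the nvcc flags come from the description above.

```cmake
# Sketch only, not this PR's actual diff.
# The help string is an illustrative assumption.
option(USE_FATBIN_COMPRESSION "Compress device code embedded in the fatbin" ON)

if(USE_FATBIN_COMPRESSION)
  # Append to CMAKE_CUDA_FLAGS rather than overwriting it,
  # so that flags set earlier in the build are preserved.
  string(APPEND CMAKE_CUDA_FLAGS " --fatbin-options -compress-all")
endif()
```

With something like this in place, configuring with -DUSE_FATBIN_COMPRESSION=OFF would skip the extra nvcc flags and produce an uncompressed fatbin.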

Results of experiments building the 1.x branch with cuda11:

With cmake options -DMXNET_CUDA_ARCH="5.2 6.0 6.1 7.0 7.2 7.5 8.0" -DUSE_FATBIN_COMPRESSION=OFF, a cuda11 build fails with the link error:

libmxnet.so: PC-relative offset overflow in PLT entry for void mxnet::op::mxnet_op::Kernel<...> ...

With the same cmake options, but dropping arches 5.2 and 7.2, the build succeeds with a libmxnet.so size of 1.8GB.
Finally, with the original cmake option -DMXNET_CUDA_ARCH="5.2 6.0 6.1 7.0 7.2 7.5 8.0" and fatbin compression left at its default of ON, a cuda11 build succeeds with a libmxnet.so size of 750MB. That is more than 2X smaller, even though it includes two more architectures than the 1.8GB build.

Both succeeding builds, one with fatbin compression and one without, ran the command:

time python -c "import mxnet as mx; x = mx.nd.array([1,], ctx=mx.gpu(0)); print((x+1).asnumpy())"
[2.]

in the same time of 7.6 secs.

@samskalicky @anirudh2290 @ChaiBapchya @ptrendx

mxnet-bot · 2020-09-11T23:18:08Z

Hey @DickJC123 , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

To trigger all jobs: @mxnet-bot run ci [all]
To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [website, windows-gpu, miscellaneous, edge, centos-cpu, unix-cpu, centos-gpu, sanity, unix-gpu, windows-cpu, clang]

Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

leezu · 2020-09-11T23:43:40Z

Since we can change default options before the 2.0 release, would it be helpful to always enable USE_FATBIN_COMPRESSION? A couple of users have run into the linking errors when GPU auto-detection fails and cmake defaults to building for all "common" GPU architectures.

DickJC123 · 2020-09-12T00:06:35Z

I've adjusted the default to be always ON, regardless of CUDA version.

samskalicky

Thanks @DickJC123 for this great addition!

…19123) (#19158) * [1.x] Backport Add cmake flag USE_FATBIN_COMPRESSION, ON by default (#19123) * Trigger CI * Appending to existing CMAKE_CUDA_FLAGS in all cases

Add cmake flag USE_FATBIN_COMPRESSION, ON by default for CUDA >= 11

de15a92

DickJC123 requested review from leezu and szha as code owners September 11, 2020 23:18

DickJC123 requested a review from ptrendx September 11, 2020 23:21

cmake flag USE_FATBIN_COMPRESSION default is ON for all builds

205656b

samskalicky mentioned this pull request Sep 12, 2020

[RFC] v1.8.0 release #18800

Open

samskalicky approved these changes Sep 12, 2020

szha merged commit 5c1aadc into apache:master Sep 12, 2020

DickJC123 changed the title ~~Add cmake flag USE_FATBIN_COMPRESSION, ON by default for CUDA >= 11~~ Add cmake flag USE_FATBIN_COMPRESSION, ON by default Sep 15, 2020

DickJC123 added a commit to DickJC123/mxnet that referenced this pull request Sep 15, 2020

[1.x] Backport Add cmake flag USE_FATBIN_COMPRESSION, ON by default (a…

2878ae8

…pache#19123)

DickJC123 mentioned this pull request Sep 15, 2020

[1.x] Backport Add cmake flag USE_FATBIN_COMPRESSION, ON by default (#19123) #19158

Merged

6 tasks

DickJC123 mentioned this pull request Sep 18, 2020

FwdPort of "1.x-Backport Add cmake flag USE_FATBIN_COMPRESSION, ON by default (#19123) (#19158)" #19175

Merged

leezu mentioned this pull request Oct 5, 2020

Relocation truncation issues #17045

Open

This was referenced Mar 9, 2023

Build failure: magma NixOS/nixpkgs#220357

Closed

cudaPackages: fix #220357; use -Xfatbin=-compress-all; prune default cudaCapabilities NixOS/nixpkgs#220402

Merged
