TriBITS: Nalu-Wind Cuda builds won't configure #10954

Closed
tasmith4 opened this issue Aug 30, 2022 · 35 comments
Labels
client: ExaWind (All issue that impact the ECP project ExaWind)
TriBITS (Issues with the TriBITS framework itself, not usage of the TriBITS framework)
type: bug (The primary issue is a bug in Trilinos code or tests)

Comments

@tasmith4
Contributor

Bug Report

@bartlettroscoe @psakievich

Description

When building Nalu-Wind with Cuda enabled against the latest Trilinos develop, we get Cuda-related configure errors such as: add_library cannot create imported target "CUDA::cufft" because another target with the same name already exists (see below for more extended output). We are reasonably certain these are caused by #10614 (unfortunately it took us a while to suspect this was the culprit, hence the delayed report).

@bartlettroscoe I did not see these errors in the list in #10774. Do you have ideas on what needs to change in Nalu-Wind's CMake system?

This can be reproduced with spack-manager (developer workflow quick start instructions here) on ascicgpu machines, using the spec nalu-wind+cuda cuda_arch=70 ^trilinos@develop, although @psakievich and I are happy to try stuff out ourselves as needed to resolve this issue.

Extended output:
-- The CUDA compiler identification is NVIDIA 11.2.152
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDAToolkit: /projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/include (found version "11.2.152").
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE..
-- Found CUDAToolkit = 11.2.152 (/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/lib64)
CMake Error at /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/external_packages/CUDA/CUDAConfig.cmake:34 (add_library):
  add_library cannot create imported target "CUDA::cufft" because another
  target with the same name already exists.
Call Stack (most recent call first):
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/KokkosCore/KokkosCoreConfig.cmake:156 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/TeuchosCore/TeuchosCoreConfig.cmake:196 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Teuchos/TeuchosConfig.cmake:194 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/TpetraCore/TpetraCoreConfig.cmake:251 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Tpetra/TpetraConfig.cmake:204 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Zoltan2Core/Zoltan2CoreConfig.cmake:186 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Zoltan2/Zoltan2Config.cmake:154 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Trilinos/TrilinosConfig.cmake:123 (include)
  CMakeLists.txt:82 (find_package)


CMake Error at /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/external_packages/CUDA/CUDAConfig.cmake:41 (add_library):
  add_library cannot create imported target "CUDA::cublas" because another
  target with the same name already exists.
Call Stack (most recent call first):
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/KokkosCore/KokkosCoreConfig.cmake:156 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/TeuchosCore/TeuchosCoreConfig.cmake:196 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Teuchos/TeuchosConfig.cmake:194 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/TpetraCore/TpetraCoreConfig.cmake:251 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Tpetra/TpetraConfig.cmake:204 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Zoltan2Core/Zoltan2CoreConfig.cmake:186 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Zoltan2/Zoltan2Config.cmake:154 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Trilinos/TrilinosConfig.cmake:123 (include)
  CMakeLists.txt:82 (find_package)


CMake Error at /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/external_packages/CUDA/CUDAConfig.cmake:47 (add_library):
  add_library cannot create imported target "CUDA::cudart" because another
  target with the same name already exists.
Call Stack (most recent call first):
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/KokkosCore/KokkosCoreConfig.cmake:156 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/TeuchosCore/TeuchosCoreConfig.cmake:196 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Teuchos/TeuchosConfig.cmake:194 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/TpetraCore/TpetraCoreConfig.cmake:251 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Tpetra/TpetraConfig.cmake:204 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Zoltan2Core/Zoltan2CoreConfig.cmake:186 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Zoltan2/Zoltan2Config.cmake:154 (include)
  /fgs/tasmit/exawind/sandbox/halley_root/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-yjrp7k4c5yxagjzv5yspb35fs5q7ihky/lib/cmake/Trilinos/TrilinosConfig.cmake:123 (include)
  CMakeLists.txt:82 (find_package)
@tasmith4 tasmith4 added type: bug The primary issue is a bug in Trilinos code or tests TriBITS Issues with the TriBITS framework itself, not usage of the TriBITS framework client: ExaWind All issue that impact the ECP project ExaWind labels Aug 30, 2022
@tasmith4
Contributor Author

@alanw0

@bartlettroscoe
Member

Is it possible to reproduce the problem outside of Spack?

@tasmith4
Contributor Author

Is it possible to reproduce the problem outside of Spack?

It's very difficult to set up everything correctly to get a build outside of spack-manager. Inside the spack-manager framework, we tried Trilinos hashes from before and after #10614, and the Nalu-Wind build succeeded with the old hash and failed with the new hash. Also, the last night that the Nalu-Wind Cuda nightlies configured correctly on ascicgpu was July 17/18, right before #10614 merged.

If needed, I can develop a set of scripts to reproduce this without using spack-manager (although you'll still need spack for TPLs below Trilinos), but it will take a little while. Let me know though, because if it's the only path forward I'll definitely do it.

@bartlettroscoe
Member

It's very difficult to set up everything correctly to get a build outside of spack-manager

@tasmith4, that is what I figured. But can you at least run a reconfigure after Spack loads the build env and puts you in the build dir? There is no mention at all of CMake in:

What is your experience with this, just so I have the right expectations?

@tasmith4
Contributor Author

So spack will automatically run cmake as part of the install step. When you have Trilinos set up as a develop spec in spack, you can go to the Trilinos build directory and blow away the CMake* files as you might in any other project, which will force spack to do the reconfigure.

@bartlettroscoe
Member

@tasmith4, how do you force Spack to do just the configure? What are the exact commands to use in this workflow?

@tasmith4
Contributor Author

Do you want to reconfigure nalu-wind, trilinos, or both?

@bartlettroscoe
Member

Do you want to reconfigure nalu-wind, trilinos, or both?

Likely both. It depends on whether the errors are present only when building Nalu. Can you also reproduce them by enabling and building the Trilinos tests?

I am guessing that I will need to be on an ascicgpu machine to reproduce this?

@tasmith4
Contributor Author

We can't reproduce by enabling Trilinos tests (specifically, we did not see the issue in a build that had STK tests turned on).

I think in this workflow you need to run spack install once to get everything set up. Then if you tweak the Nalu-Wind cmake files, I believe you can run the configure step only with spack install -u cmake. If you have to change anything in Trilinos, I think you might be forced to allow the entire Trilinos build to proceed, but I will admit to never having actually tried to do this before.

It is likely you'll need to be on an ascicgpu machine, do you have access to one?

@bartlettroscoe
Member

If you have to change anything in Trilinos, I think you might be forced to allow the entire Trilinos build to proceed, but I will admit to never having actually tried to do this before.

Would I be blazing new territory here with Spack usage at SNL? Has no one tried to do micro-level CMake configuration and building with Trilinos through Spack?

It is likely you'll need to be on an ascicgpu machine, do you have access to one?

I have access to some. Just checking that is the correct platform to try.

@tasmith4
Contributor Author

Would I be blazing new territory here with Spack usage at SNL? Has no one tried to do micro-level CMake configuration and building with Trilinos through Spack?

Probably? Typically, we've done that by editing the trilinos package at https://github.com/psakievich/spack-manager/blob/main/repos/exawind/packages/trilinos/package.py. But I don't know anyone who's only been focused on the configure step, usually it's a means to getting the right build so we always want the build & install. @psakievich might have tried something else or know others who have.

@psakievich
Contributor

@bartlettroscoe you can dive into the build environment and tweak things as you would in normal development with the spack-manager command build-env-dive nalu-wind or build-env-dive trilinos, respectively. That will take you to the build directory and load the exact build environment spack used in a subshell. From there you can manually run make clean, cmake, or any other commands you would like. This is probably your best bet for iterating on the configure step.

Spack is essentially just embodying the generation of a CMake script through its package and the configure phase, so you can also edit the package.py files for each of the packages, but that would be considerably slower.

@bartlettroscoe
Member

@psakievich and @tasmith4, what exact version of the Spack repo are you using and what are the exact arguments you are passing to Spack? Is it just nalu-wind+cuda cuda_arch=70 ^trilinos@develop?

@psakievich
Contributor

psakievich commented Aug 30, 2022

@bartlettroscoe we have spack submoduled in spack-manager. We stay pretty close to the edge of the develop branch. But an exact reproducer would be:

# setup spack-manager
git clone --recursive https://github.com/psakievich/spack-manager
export SPACK_MANAGER=$(pwd)/spack-manager
source ${SPACK_MANAGER}/start.sh
# create the spack environment
quick-create-dev --name bug --spec nalu-wind@master+cuda cuda_arch=70 trilinos@develop
# build
spack install
# modify configure (takes you to the build directory etc.)
build-env-dive nalu-wind

Edit: If you would like any help with this feel free to schedule a meeting with me.

@tasmith4
Contributor Author

@ldh4

@bartlettroscoe
Member

Edit: If you would like any help with this feel free to schedule a meeting with me.

@psakievich, I will give this a try and get back to you. It will give me a chance to kick the tires on Spack support for development and provide some feedback.

@jjellio
Contributor

jjellio commented Sep 6, 2022

@bartlettroscoe When I want to stop spack after a configure - I break it. spack -d install --verbose --keep-stage ... then wait for the config spam (or you can break it as soon as you see it echo the CMake stuff). Then mash ctrl-C like a champion. Then copy all the cmake stuff, go to where it staged it, and mess around. Things can get weird though, sometimes this can ruin your terminal (need to start a new shell).

This isn't ideal, because Spack sets various ENVs - most relevant is CMAKE_PREFIX_PATH - that tells CMake where to find cmake modules. Still, for most of my stuff, Trilinos' CMake makes it explicit where TPLs live... so you can just make sure you've got the correct compilers in your path and configure with the copy/pasted CMake line.

There's a way to have spack stop - and drop you into the build ENV, but that's way above my pay grade.

Another way is to spack edit trilinos and add a hook around CMake (the real call to CMake) - and have it exit(0) - i.e., exit with a "good" code. This usually blocks it from deleting the stage.

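    def cmake(self, spec, prefix):
        print("++++========== ////      ||||     \\\\\\\\ ================== ++++++")
        print(" Calling cmake hook:")

        # Run the real CMake configure step.
        super(Trilinos, self).cmake(spec, prefix)
        # Exit with a "good" code (same effect as exit(0)) so Spack stops here.
        raise SystemExit(0)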

It all feels like a mess to me. Wish spack supported R&D "using spack" (ya know, just drop the build Dir + cmake configure and exit) - I think it can do that, but it should be way easier and more obvious (Clearly!)

@psakievich
Contributor

@jjellio FWIW this is a way to drop into the environment without all that extra work that I curated. It is the last line in #10954 (comment)

The source code really isn't bad. Just dump the env to file and then source it in a new shell.

There is another native command spack dev-build which is what you are referencing. I'm not sure if that works in environments though. It's been a long time since I've used that one. However both of these seem much simpler than the tactics you are currently using.

@tasmith4
Contributor Author

@bartlettroscoe have you had a chance to look into this? It is a blocker to upgrading the version of Trilinos we use with Nalu-Wind.

@bartlettroscoe
Member

Thanks for the reminder. Let me see if I can reproduce this by following the commands above.

@bartlettroscoe
Member

bartlettroscoe commented Sep 29, 2022

I have been able to reproduce the configure failure shown above (see details section below) and I believe the solution is straightforward. The problem is that the TriBITS TPL glue module tribits/cmake/core/std_tpls/FindTPLCUDA.cmake is using the deprecated FindCUDA.cmake module when it should be using the new FindCUDAToolkit.cmake module.

What is happening is that the old FindCUDA.cmake module does not create modern imported targets, so the TriBITS glue module FindTPLCUDA.cmake has to add modern imported CMake targets itself, and their names happen to clash with the names of the modern imported targets created by FindCUDAToolkit.cmake.
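The clash can be reproduced in isolation with a few lines of CMake. This is only an illustrative sketch: the project name is made up, and the add_library() call is a stand-in for a config file that defines the target itself. It is not the actual TriBITS-generated CUDAConfig.cmake.

```cmake
cmake_minimum_required(VERSION 3.17)  # FindCUDAToolkit needs CMake >= 3.17
project(CudaTargetClash LANGUAGES CXX)  # hypothetical project name

# What the downstream project does first: the modern module defines
# imported targets such as CUDA::cufft, CUDA::cublas and CUDA::cudart
# for the toolkit libraries it finds.
find_package(CUDAToolkit REQUIRED)

# Stand-in for a config file that defines the same target name itself.
# If CUDA::cufft already exists at this point, CMake stops with:
#   add_library cannot create imported target "CUDA::cufft" because
#   another target with the same name already exists.
add_library(CUDA::cufft INTERFACE IMPORTED)
```

Swapping the order of the two calls does not help in general, because target names are global to the configure run no matter who defines them first.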

So the next step is to upgrade tribits/cmake/core/std_tpls/FindTPLCUDA.cmake along the lines of Creating FindTPL.cmake using find_package() with IMPORTED targets. That will result in both Nalu-Wind's CMakeLists.txt file and the TriBITS-generated CUDAConfig.cmake module calling find_package(CUDAToolkit) (and the latter call will be a no-op because CUDAToolkit would have already been found).
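A minimal sketch of why calling find_package(CUDAToolkit) from both places should be safe, assuming CMake 3.17 or newer. The project and target names here are hypothetical, and this plain-CMake file only mimics the two call sites. The real change belongs in FindTPLCUDA.cmake and the config file TriBITS generates from it.

```cmake
cmake_minimum_required(VERSION 3.17)  # FindCUDAToolkit needs CMake >= 3.17
project(CudaToolkitTwice LANGUAGES CXX)  # hypothetical project name

# First call, as made in the downstream project's CMakeLists.txt.
find_package(CUDAToolkit REQUIRED)

# Second call, as an installed config file would make it. The imported
# targets already exist, and the module does not try to create them again,
# so no duplicate-target error is raised.
find_package(CUDAToolkit REQUIRED)

# Both sides end up referring to the same imported targets.
add_library(cuda_deps INTERFACE)  # hypothetical helper target
target_link_libraries(cuda_deps INTERFACE
  CUDA::cudart CUDA::cublas CUDA::cufft)
```

The design point is that only one module owns the CUDA:: target names, so nothing else has to define them.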

Using spack-manager, is it possible to just rebuild Trilinos from the local Git repo in-place and then try building Nalu-Wind again? That would allow me to quickly confirm that this fixes the Nalu-Wind use case.

Detailed notes on reproduction and analysis of the bug:

I am now going to try to reproduce the Nalu failure. It seems I need to reproduce this on an ascicgpu machine. The instructions are at:

Doing:

$ ssh ascicgpu17

$ cd /fgs/rabartl/SpackManager.base/

$ git clone --recursive https://github.com/psakievich/spack-manager
Cloning into 'spack-manager'...
remote: Enumerating objects: 3958, done.
remote: Counting objects: 100% (337/337), done.
remote: Compressing objects: 100% (197/197), done.
remote: Total 3958 (delta 142), reused 284 (delta 114), pack-reused 3621
Receiving objects: 100% (3958/3958), 3.54 MiB | 3.79 MiB/s, done.
Resolving deltas: 100% (2030/2030), done.
Submodule 'spack' (https://github.com/spack/spack) registered for path 'spack'
Cloning into '/fgs/rabartl/SpackManager.base/spack-manager/spack'...
remote: Enumerating objects: 394217, done.        
remote: Counting objects: 100% (188/188), done.        
remote: Compressing objects: 100% (112/112), done.        
remote: Total 394217 (delta 69), reused 158 (delta 48), pack-reused 394029        
Receiving objects: 100% (394217/394217), 198.11 MiB | 13.70 MiB/s, done.
Resolving deltas: 100% (157984/157984), done.
Submodule path 'spack': checked out '66b451a70d8d89e00baad7614821cb7cc21b3e3c'

$ export SPACK_MANAGER=$(pwd)/spack-manager

$ source ${SPACK_MANAGER}/start.sh

$ quick-create-dev --name bug --spec nalu-wind@master+cuda cuda_arch=70 trilinos@develop
+ spack-start
==> Added repo with namespace 'exawind'.
+ spack manager create-dev-env --name bug --spec nalu-wind@master+cuda cuda_arch=70 trilinos@develop
making /fgs/rabartl/SpackManager.base/spack-manager/environments/bug
==> Configuring spec nalu-wind@master+cuda cuda_arch=70 for development at path nalu-wind
==> Warning: included configuration files should be updated manually [files=include.yaml]
Cloning into '/fgs/rabartl/SpackManager.base/spack-manager/environments/bug/trilinos'...
remote: Enumerating objects: 1113803, done.
remote: Counting objects: 100% (222/222), done.
remote: Compressing objects: 100% (157/157), done.
remote: Total 1113803 (delta 91), reused 176 (delta 65), pack-reused 1113581
Receiving objects: 100% (1113803/1113803), 617.33 MiB | 27.62 MiB/s, done.
Resolving deltas: 100% (901254/901254), done.
Updating files: 100% (52559/52559), done.
==> Configuring spec trilinos@develop for development at path trilinos
+ spack env activate --dir /fgs/rabartl/SpackManager.base/spack-manager/environments/bug --prompt

$ time spack install

...

==> Installing nalu-wind-master-gs4m3a3codejvactaj765tp7fxaoyods
==> No binary for nalu-wind-master-gs4m3a3codejvactaj765tp7fxaoyods found: installing from source
==> No patches needed for nalu-wind
==> nalu-wind: Executing phase: 'cmake'
==> Error: ProcessError: Command exited with status 1:
    '/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cmake-3.22.2-s67tgoaqaw4ky5kmpuzxdg5une7xouvh/bin/cmake' '-G' 'Unix Makefiles' '-DCMAKE_INSTALL_PREFIX:STRING=/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/nalu-wind-master-gs4m3a3codejvactaj765tp7fxaoyods' '-DCMAKE_BUILD_TYPE:STRING=Release' '-DBUILD_TESTING:BOOL=OFF' '-DCMAKE_INTERPROCEDURAL_OPTIMIZATION:BOOL=OFF' '-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON' '-DCMAKE_INSTALL_RPATH_USE_LINK_PATH:BOOL=ON' '-DCMAKE_INSTALL_RPATH:STRING=/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/nalu-wind-master-gs4m3a3codejvactaj765tp7fxaoyods/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/nalu-wind-master-gs4m3a3codejvactaj765tp7fxaoyods/lib64;/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/mpich-3.4.2-4h2muy6jlgfdahehxmxa4ybndfdb6gx2/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/netcdf-c-4.7.4-fhjnttm33jzlhzlpacidomfedonlrx4h/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/hdf5-1.10.7-opndkl5zmxwmn2uolfvydijj7uvgty3u/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/zlib-1.2.12-p3jjc3wrlk3twnxtlxgs5ishtew652mu/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/parallel-netcdf-1.12.2-yruxln2rpy6bkjbvy5ambpmtyf7ai4qh/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/boost-1.76.0-z6ogxo3eotdutkfvsczhi6gidki5o5ei/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cgns-4.3.0-2tq3i7krfbdpsue25cypkpaoz2dgmv2g/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/sp
ack/linux-rhel7-x86_64/gcc-9.3.0/hwloc-2.8.0-j26x7fuce32u53r47h4o7jforusu3vkz/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/libpciaccess-0.16-5nwvyfabqkomjcet44xufnzoious2czn/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/libxml2-2.10.1-nvx5uofxclw53ffthkj2egqrfdlnf2dg/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/libiconv-1.16-zunfjy4bcka4jr3q377yzbxsz6ceebjf/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/xz-5.2.5-d7jjten22dit7unhmacijnlddjzzxwbn/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/ncurses-6.3-4pwku6zergquosiwpe2ocf5juqcpxrbj/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/matio-1.5.17-b3fgfiq22nsqdoby2c2ueufw4jiu3knf/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/metis-5.1.0-rkfcjo3ytp5p5uqikc445xx4girvrwts/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/parmetis-4.0.3-ifeofmlnrwu7btqysph2neackprtx4pg/lib;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/yaml-cpp-0.6.3-nl5y4ihyxzs5gnooxqermu2dokm6acro/lib;/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/lib64;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/netlib-lapack-3.10.1-6637vng6we5kxbhw4hz5agbveonefbsh/lib64' 
'-DCMAKE_PREFIX_PATH:STRING=/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/yaml-cpp-0.6.3-nl5y4ihyxzs5gnooxqermu2dokm6acro;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/parmetis-4.0.3-ifeofmlnrwu7btqysph2neackprtx4pg;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/netlib-lapack-3.10.1-6637vng6we5kxbhw4hz5agbveonefbsh;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/metis-5.1.0-rkfcjo3ytp5p5uqikc445xx4girvrwts;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/matio-1.5.17-b3fgfiq22nsqdoby2c2ueufw4jiu3knf;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/hwloc-2.8.0-j26x7fuce32u53r47h4o7jforusu3vkz;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/libxml2-2.10.1-nvx5uofxclw53ffthkj2egqrfdlnf2dg;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/xz-5.2.5-d7jjten22dit7unhmacijnlddjzzxwbn;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/libpciaccess-0.16-5nwvyfabqkomjcet44xufnzoious2czn;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cgns-4.3.0-2tq3i7krfbdpsue25cypkpaoz2dgmv2g;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/boost-1.76.0-z6ogxo3eotdutkfvsczhi6gidki5o5ei;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/nccmp-1.9.0.1-a5dzngmwkzowpwjqkrfssbfugcihvqho;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/netcdf-c-4.7.4-fhjnttm33jzlhzlpacidomfedonlrx4h;/fgs/rabartl/SpackManager.base/spack-manager/spack
/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/parallel-netcdf-1.12.2-yruxln2rpy6bkjbvy5ambpmtyf7ai4qh;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/ncurses-6.3-4pwku6zergquosiwpe2ocf5juqcpxrbj;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/libiconv-1.16-zunfjy4bcka4jr3q377yzbxsz6ceebjf;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/hdf5-1.10.7-opndkl5zmxwmn2uolfvydijj7uvgty3u;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/zlib-1.2.12-p3jjc3wrlk3twnxtlxgs5ishtew652mu;/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/kokkos-nvcc-wrapper-3.2.00-m7c2p4ebbvazvpncj4nqxbhjsknrwqiw;/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/mpich-3.4.2-4h2muy6jlgfdahehxmxa4ybndfdb6gx2/;/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/;/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/targets/x86_64-linux/lib/cmake;/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cmake-3.22.2-s67tgoaqaw4ky5kmpuzxdg5une7xouvh/' '-DCMAKE_POSITION_INDEPENDENT_CODE:BOOL=ON' '-DCMAKE_CXX_COMPILER:STRING=/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/mpich-3.4.2-4h2muy6jlgfdahehxmxa4ybndfdb6gx2/bin/mpic++' '-DCMAKE_Fortran_COMPILER:STRING=/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/mpich-3.4.2-4h2muy6jlgfdahehxmxa4ybndfdb6gx2/bin/mpif90' '-DTrilinos_DIR:STRING=/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx' '-DYAML_DIR:STRING=/fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/yaml-cpp-0.6.3-nl5y4ihyxzs5gnooxqermu2dokm6acro' '-DENABLE_CUDA:BOOL=ON' 
'-DENABLE_WIND_UTILS:BOOL=OFF' '-DENABLE_BOOST:BOOL=OFF' '-DENABLE_OPENFAST:BOOL=OFF' '-DENABLE_TIOGA:BOOL=OFF' '-DENABLE_HYPRE:BOOL=OFF' '-DENABLE_PARAVIEW_CATALYST:BOOL=OFF' '-DENABLE_FFTW:BOOL=OFF' '-DENABLE_TESTS:BOOL=OFF' '-DCMAKE_CXX_STANDARD:STRING=14' '-DCMAKE_EXPORT_COMPILE_COMMANDS:BOOL=ON' '-DENABLE_TESTS:BOOL=ON' '-DNALU_WIND_SAVE_GOLDS:BOOL=ON' '-DNALU_WIND_SAVED_GOLDS_DIR:STRING=/fgs/rabartl/SpackManager.base/spack-manager/golds/tmp/nalu-wind' '-DNALU_WIND_REFERENCE_GOLDS_DIR:STRING=/fgs/rabartl/SpackManager.base/spack-manager/golds/current/nalu-wind' '/fgs/rabartl/SpackManager.base/spack-manager/environments/bug/nalu-wind'

3 errors found in build log:
     18    -- Looking for C++ include pthread.h
     19    -- Looking for C++ include pthread.h - found
     20    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
     21    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
     22    -- Found Threads: TRUE
     23    -- Found CUDAToolkit = 11.2.152 (/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/lib64)
  >> 24    CMake Error at /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/external_packages/CUDA/CUDAConfig.cmake:10 (add_library):
     25      add_library cannot create imported target "CUDA::cufft" because another
     26      target with the same name already exists.
     27    Call Stack (most recent call first):
     28      /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/KokkosCore/KokkosCoreConfig.cmake:156 (include)
     29      /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/TeuchosCore/TeuchosCoreConfig.cmake:196 (include)
     30      /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/Teuchos/TeuchosConfig.cmake:193 (include)

     ...

     33      /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/Zoltan2Core/Zoltan2CoreConfig.cmake:185 (include)
     34      /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/Zoltan2/Zoltan2Config.cmake:153 (include)
     35      /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/Trilinos/TrilinosConfig.cmake:123 (include)
     36      CMakeLists.txt:79 (find_package)

Okay, so that reproduced the problem.

Now to see if I can enter the dev env and do the configure manually:

$ ssh ascicgpu17

$ cd /fgs/rabartl/SpackManager.base/

$ cat load-env.sh 
export SPACK_MANAGER=$(pwd)/spack-manager
source ${SPACK_MANAGER}/start.sh
quick-create-dev --name bug --spec nalu-wind@master+cuda cuda_arch=70 trilinos@develop

$ . load-env.sh

$ build-env-dive nalu-wind

$ pwd
/fgs/rabartl/SpackManager.base/spack-manager/environments/bug/nalu-wind/spack-build-gs4m3a3

Now let's try to configure manually:

$ cmake .
-- Found CUDAToolkit = 11.2.152 (/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/lib64)
CMake Error at /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/external_packages/CUDA/CUDAConfig.cmake:10 (add_library):
  add_library cannot create imported target "CUDA::cufft" because another
  target with the same name already exists.
Call Stack (most recent call first):
  /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/KokkosCore/KokkosCoreConfig.cmake:156 (include)
  /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/TeuchosCore/TeuchosCoreConfig.cmake:196 (include)
  /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/Teuchos/TeuchosConfig.cmake:193 (include)
  /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/TpetraCore/TpetraCoreConfig.cmake:251 (include)
  /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/Tpetra/TpetraConfig.cmake:203 (include)
  /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/Zoltan2Core/Zoltan2CoreConfig.cmake:185 (include)
  /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/Zoltan2/Zoltan2Config.cmake:153 (include)
  /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/trilinos-develop-wmhhsdfv3on6ahzhepsrxn7373eakdlx/lib/cmake/Trilinos/TrilinosConfig.cmake:123 (include)
  CMakeLists.txt:79 (find_package)

Now I need to find out where the nalu-wind source tree is located. Grepping the CMakeCache.txt file, I see:

Nalu-Wind_SOURCE_DIR:STATIC=/fgs/rabartl/SpackManager.base/spack-manager/environments/bug/nalu-wind

And in the file:

  • /fgs/rabartl/SpackManager.base/spack-manager/environments/bug/nalu-wind/CMakeLists.txt

at lines 77-86, I see:

########################## TRILINOS ####################################
set(CMAKE_PREFIX_PATH ${Trilinos_DIR} ${CMAKE_PREFIX_PATH})
find_package(Trilinos QUIET REQUIRED)
message(STATUS "Found Trilinos = ${Trilinos_LIBRARY_DIRS}")
target_link_libraries(nalu PUBLIC ${Trilinos_LIBRARIES})
target_include_directories(nalu SYSTEM PUBLIC ${Trilinos_INCLUDE_DIRS})
target_include_directories(nalu SYSTEM PUBLIC ${Trilinos_TPL_INCLUDE_DIRS})
if(Trilinos_BUILD_SHARED_LIBS)
  set(BUILD_SHARED_LIBS ON)
endif()

Okay, I think the problem here is the find_package(CUDAToolkit) command at lines 42-52:

if(ENABLE_CUDA)
  enable_language(CUDA)
  find_package(CUDAToolkit REQUIRED)
  message(STATUS "Found CUDAToolkit = ${CUDAToolkit_VERSION} (${CUDAToolkit_LIBRARY_DIR})")
  target_link_libraries(nalu PUBLIC
    CUDA::cusparse
    CUDA::curand
    CUDA::cudart
    CUDA::cublas
    CUDA::nvToolsExt)
endif()

So it must be that both the FindCUDAToolkit.cmake module and the TriBITS TPL system are creating the target CUDA::cufft. The TriBITS side comes from the file trilinos/cmake/tribits/core/std_tpls/FindTPLCUDA.cmake, which currently is:

find_package(CUDA REQUIRED)  # Will abort if not found!

macro(package_add_cuda_library cuda_target)
  tribits_add_library(${cuda_target} ${ARGN} CUDALIBRARY)
endmacro()

set(TPL_CUDA_INCLUDE_DIRS ${CUDA_TOOLKIT_INCLUDE})
set(TPL_CUDA_LIBRARIES ${CUDA_CUDART_LIBRARY} ${CUDA_cublas_LIBRARY}
   ${CUDA_cufft_LIBRARY})

tribits_tpl_find_include_dirs_and_libraries(CUDA
  REQUIRED_LIBS_NAMES  willNotBeUsed)

unset(TPL_CUDA_INCLUDE_DIRS)
unset(TPL_CUDA_LIBRARIES)

To test this, I will comment out lines CMakeLists.txt:42-52:

$ git diff --word-diff-regex=.
diff --git a/CMakeLists.txt b/CMakeLists.txt
index a83d89b..c318f4a 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -39,17 +39,17 @@ add_executable(${utest_ex_name} ${CMAKE_CURRENT_SOURCE_DIR}/unit_tests.C)
find_package(MPI REQUIRED)
target_link_libraries(nalu PUBLIC $<$<BOOL:${MPI_CXX_FOUND}>:MPI::MPI_CXX>)

{+#+}if(ENABLE_CUDA)
{+#+}  enable_language(CUDA)
{+#+}  find_package(CUDAToolkit REQUIRED)
{+#+}  message(STATUS "Found CUDAToolkit = ${CUDAToolkit_VERSION} (${CUDAToolkit_LIBRARY_DIR})")
{+#+}  target_link_libraries(nalu PUBLIC
{+#+}    CUDA::cusparse
{+#+}    CUDA::curand
{+#+}    CUDA::cudart
{+#+}    CUDA::cublas
{+#+}    CUDA::nvToolsExt)
{+#+}endif()

if(ENABLE_ROCM)

and configure again:

$ cmake .
-- Enabled Kokkos devices: CUDA;SERIAL
-- Enabled Kokkos devices: CUDA;SERIAL
-- Found Trilinos = 
-- Found YAML-CPP = /fgs/rabartl/SpackManager.base/spack-manager/spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/yaml-cpp-0.6.3-nl5y4ihyxzs5gnooxqermu2dokm6acro/lib/cmake/yaml-cpp
-- CMAKE_SYSTEM_NAME = Linux
-- CMAKE_CXX_COMPILER_ID = GNU
-- CMAKE_BUILD_TYPE = Release
-- Trilinos git commit = 820615d4fa4
-- Reference gold files will be expected here: /fgs/rabartl/SpackManager.base/spack-manager/golds/current/nalu-wind/Linux/GNU/9.3.0
-- Using test tolerance: abs = 1.0e-15, rel = 1.0e-12
-- Gold files will be saved to: /fgs/rabartl/SpackManager.base/spack-manager/golds/tmp/nalu-wind/Linux/GNU/9.3.0
-- Configuring done
-- Generating done
-- Build files have been written to: /fgs/rabartl/SpackManager.base/spack-manager/environments/bug/nalu-wind/spack-build-gs4m3a3

And bingo.

The TriBITS-generated CUDAConfig.cmake file contains:

# Package config file for external package/TPL 'CUDA'
#
# Generated by CMake, do not edit!

# Guard against multiple inclusion
if (TARGET CUDA::all_libs)
  return()
endif()

add_library(CUDA::cufft IMPORTED UNKNOWN)
set_target_properties(CUDA::cufft PROPERTIES
  IMPORTED_LOCATION "/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/lib64/libcufft.so")

add_library(CUDA::cublas IMPORTED UNKNOWN)
set_target_properties(CUDA::cublas PROPERTIES
  IMPORTED_LOCATION "/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/lib64/libcublas.so")
target_link_libraries(CUDA::cublas
  INTERFACE CUDA::cufft)

add_library(CUDA::cudart IMPORTED UNKNOWN)
set_target_properties(CUDA::cudart PROPERTIES
  IMPORTED_LOCATION "/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/lib64/libcudart.so")
target_link_libraries(CUDA::cudart
  INTERFACE CUDA::cublas)

add_library(CUDA::all_libs INTERFACE IMPORTED)
target_link_libraries(CUDA::all_libs
  INTERFACE CUDA::cufft
  INTERFACE CUDA::cublas
  INTERFACE CUDA::cudart
  )
target_include_directories(CUDA::all_libs SYSTEM
  INTERFACE "/projects/wind/system-spack/opt/spack/linux-rhel7-x86_64/gcc-9.3.0/cuda-11.2.2-wlah7an4q7rej4uylqlimczaa6z3zlq7/include"
  )

You can see the targets CUDA::cufft, CUDA::cublas and CUDA::cudart that are also being created by the FindCUDAToolkit.cmake module.
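The clash can be reproduced in isolation, without CUDA installed. The following is a minimal sketch, not taken from Nalu-Wind or Trilinos: the project name is hypothetical, and the first add_library() call merely stands in for what FindCUDAToolkit.cmake does in the downstream project. It illustrates that the guard on CUDA::all_libs in the generated file does not protect the individual CUDA::cufft target, and how a per-target if(TARGET ...) check would detect the conflict.

```cmake
cmake_minimum_required(VERSION 3.17)
project(TargetClashDemo LANGUAGES NONE)  # hypothetical demo project

# Stand-in for the downstream project's find_package(CUDAToolkit),
# which defines CUDA::cufft (among others) as an imported target.
add_library(CUDA::cufft UNKNOWN IMPORTED)

# The generated CUDAConfig.cmake only checks CUDA::all_libs. That target
# does not exist yet at this point, so its guard would not return early.
if(TARGET CUDA::all_libs)
  message(STATUS "CUDA::all_libs already defined; config file would return")
endif()

# A per-target check detects the conflict. Without it, a second
# add_library(CUDA::cufft ...) here would abort the configure step.
if(TARGET CUDA::cufft)
  message(STATUS "CUDA::cufft already exists; skipping add_library()")
else()
  add_library(CUDA::cufft UNKNOWN IMPORTED)
endif()
```

Configuring this file should print the "already exists" message; removing the if(TARGET CUDA::cufft) check and calling add_library() unconditionally should produce the same kind of "another target with the same name already exists" error shown above.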

So I think the solution to this problem is to refactor trilinos/cmake/tribits/core/std_tpls/FindTPLCUDA.cmake to use find_package(CUDAToolkit). It seems that FindCUDAToolkit.cmake was added in CMake 3.17. In fact, the CMake documentation officially marks FindCUDA.cmake as deprecated.

So the solution seems straightforward. TriBITS and Trilinos should not be using a deprecated module.
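One way the refactoring could look, as a sketch only: the snippet below is plain CMake and is not the actual TriBITS change (the real fix goes through the TriBITS TPL machinery, which is not shown here). It assumes CMake 3.17 or newer, uses a hypothetical project name, and reuses the target names from the generated CUDAConfig.cmake above. The idea is to let FindCUDAToolkit.cmake own CUDA::cufft, CUDA::cublas and CUDA::cudart, and to add only the aggregate target on top.

```cmake
cmake_minimum_required(VERSION 3.17)
project(CudaAllLibsSketch LANGUAGES CXX)  # hypothetical demo project

# Let the standard module define the namespaced CUDA:: library targets.
find_package(CUDAToolkit REQUIRED)

# Only add the aggregate target; do not redefine the individual libraries.
if(NOT TARGET CUDA::all_libs)
  add_library(CUDA::all_libs INTERFACE IMPORTED)
  target_link_libraries(CUDA::all_libs
    INTERFACE CUDA::cufft CUDA::cublas CUDA::cudart)
endif()
```

With the three library targets coming from a single module, a downstream project that also calls find_package(CUDAToolkit) should see the same targets rather than a second, conflicting definition.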

@psakievich
Contributor

@bartlettroscoe this is good news. Yes, if you make the changes in the trilinos source code that was cloned and then run spack install, it will rebuild trilinos and all its dependents in the DAG. In this case, that is just Nalu-Wind. Some simple commands to do this would be:

spack env activate ${SPACK_MANAGER}/environments/bug # ensure the spack environment is active (assuming the commands above were executed)
spack cd -s trilinos # spack will take you directly to the source code the environment is using
# make local code changes
spack install

If you already have the environment active, then you can skip the first command.

bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Sep 29, 2022
@bartlettroscoe
Member

@psakievich, I will have to do more testing on the Trilinos side, but I believe I have the fix.

Is there a way to run tests for nalu-wind from within the Spack nalu-wind build dir?

@psakievich
Contributor

@bartlettroscoe yes go ahead and run these commands

build-env-dive nalu-wind
ctest -R unit -VV

If it builds and the unit tests run then that should be sufficient for this issue.

@psakievich
Contributor

@bartlettroscoe thanks for this. The segfault is a nalu-wind developer issue we'll have to track down. From the trilinos perspective it looks like the build issues are resolved. Thank you for getting this worked out.

@bartlettroscoe
Member

From the trilinos perspective it looks like the build issues are resolved. Thank you for getting this worked out.

@psakievich, okay, I will clean this up, do some testing with Trilinos and post a Trilinos PR to resolve this.

Thanks for posting this issue and pointing out the problem. With the move to modern CMake and the generation of modern namespaced imported targets, there was bound to be a conflict at some point.

bartlettroscoe added a commit to bartlettroscoe/TriBITS that referenced this issue Sep 30, 2022
NOTE: This has matching changes in the downstream Trilinos TPLs CUBLAS and
CUSPARSE.  You can't use this updated TriBITS version with Trilinos without
those matching Trilinos FindTPL<tplName>.cmake changes.
bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Sep 30, 2022
NOTE: This requires the matching change to the TriBITS file
tribits/core/std_tpls/FindTPLCUDA.cmake.
@bartlettroscoe
Member

bartlettroscoe commented Sep 30, 2022

FYI: We have run into a problem when testing against the full Trilinos test suite. See #11084.

I will post a PR, and then someone will need to upgrade Stokhos's usage of CUSPARSE to allow that PR to merge and therefore fix things for Nalu-Wind.

Update: Looks like Stokhos is disabled in the rhel7 cuda-11.4.2-uvm-off PR build, so this may not impact my ability to push this. I am doing full testing locally now and will post the PR once that is complete and checks out.

bartlettroscoe added a commit to bartlettroscoe/TriBITS that referenced this issue Oct 2, 2022
NOTE: This has matching changes in the downstream Trilinos TPLs CUBLAS and
CUSPARSE.  You can't use this updated TriBITS version with Trilinos without
those matching Trilinos FindTPL<tplName>.cmake changes (see
trilinos/Trilinos#10954).
bartlettroscoe added a commit to bartlettroscoe/TriBITS that referenced this issue Oct 3, 2022
This is needed in order to avoid namespace clashes with downstream CMake
projects that call find_package(CUDAToolkit) (e.g. like Nalu-Wind, see
trilinos/Trilinos#10954).

NOTE: This has matching changes in the downstream Trilinos TPL files
FindTPLCUBLAS.cmake and FindTPLCUSPARSE.cmake.  You can't use this updated
TriBITS version with Trilinos without those matching Trilinos
FindTPL<tplName>.cmake changes (see trilinos/Trilinos#10954).
bartlettroscoe added a commit to TriBITSPub/TriBITS that referenced this issue Oct 3, 2022
Change to use find_package(CUDAToolkit) and some other changes (trilinos/Trilinos#10954)
bartlettroscoe added a commit that referenced this issue Oct 3, 2022
Origin repo remote tracking branch: 'github/master'
Origin repo remote repo URL: 'github = git@github.com:TriBITSPub/TriBITS.git'
Git describe: Vera4.0-RC1-start-1292-g6d3bb5b3

At commit:

commit 23dc20b901ab55943b71e51f5e64a244ad186b5a
Author:  Roscoe A. Bartlett <rabartl@sandia.gov>
Date:    Thu Sep 29 11:46:47 2022 -0600
Summary: Change to use find_package(CUDAToolkit) (#10954)
@bartlettroscoe
Member

@psakievich, @tasmith4,

The fix is in PR #11093. It just needs to be approved and pass the PR builds, and then it will be on the 'develop' branch.

@psakievich
Contributor

Thanks @bartlettroscoe !

trilinos-autotester added a commit that referenced this issue Oct 5, 2022
…d-fix-cuda

Automatically Merged using Trilinos Pull Request AutoTester
PR Title: Change TriBITS/Trilinos TPLs to use find_package(CUDATookit) to fix builds with downstream customers using find_package(CUDATookit) (#10954)
PR Author: bartlettroscoe
@bartlettroscoe
Member

bartlettroscoe commented Oct 5, 2022

FYI: PR #11093 just merged to the 'develop' branch. Don't know when the 'master' branch will be updated but if you use the 'develop' branch you should be good to go.

When it is confirmed that Nalu-Wind builds with the updated Trilinos version, can you put a comment in this issue so we can close this?

Sorry for the delays in getting this triaged, fixed, and deployed.

@tasmith4
Contributor Author

tasmith4 commented Oct 5, 2022

thanks @bartlettroscoe! Our nightly testing has lines that use trilinos@develop, we'll close this issue once we see it build there (likely tomorrow, unless an unrelated bug in our nightly testing setup takes it down again tonight).

@psakievich
Contributor

@bartlettroscoe @tasmith4 as a friendly data point, I was able to build a cuda enabled nalu-wind with trilinos' develop branch this afternoon. This uses the same configuration as the nightly test. Thanks for all your work on this @bartlettroscoe. I'll let @tasmith4 give the final okay for closing this issue though.

@tasmith4
Contributor Author

tasmith4 commented Oct 5, 2022

sweet! I doubt there will be an issue, but would still like to see it work on the nightlies before closing.

jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Oct 6, 2022
…s:develop' (c748ef0).

* trilinos-develop: (21 commits)
  Tpetra FECrs_MatrixMatrix: Adjust tolerance
  Tpetra: Removing floating point hard-equality check in test
  Sync latest Percept from Sierra
  Removes unneeded fences after Tpetra multiply in Belos and Thyra interfaces. (trilinos#10991)
  ZELLIJ: Add offset; fix parallel output
  Revert "Applications:"
  Revert "Update to work with current TriBITs where needed"
  Revert "Snapshot changes (partial)"
  Revert "IOSS Snapshot changes "
  Tpetra: Modifications to BlockCrsMatrix test to fix geminga nightlies
  Drivers: oops
  Drivers: Adding SYCL/CPU nightly to lightsaber
  Drivers: Adding SYCL/CPU nightly to lightsaber
  IOSS Snapshot changes
  Snapshot changes (partial)
  Update to work with current TriBITs where needed
  Applications: * Clean out extra blank lines * Add better url for showing users where documentation is located. Printed when -help entered. * clang-format differences
  Automatic snapshot commit from tribits at 23dc20b9
  Intrepid2: changes to avoid compiler warnings
  Change to use find_package(CUDAToolkit) (trilinos#10954)
  ...
@tasmith4
Contributor Author

tasmith4 commented Oct 6, 2022

looks good this morning -- thanks again @bartlettroscoe!
