
enable use of CVs defined by PyTorch neural network models #570

Draft
wants to merge 82 commits into base: master
Conversation

zwpku
Member

@zwpku zwpku commented Aug 28, 2023

This branch implements a class called torchANN, which allows CV components to be defined by loading pretrained PyTorch neural network models.

Installation Steps

  1. Download LibTorch. This package is required to enable the torchann class. First, download the archive and unzip it:

         wget https://download.pytorch.org/libtorch/nightly/cpu/libtorch-cxx11-abi-shared-with-deps-latest.zip
         unzip libtorch-cxx11-abi-shared-with-deps-latest.zip
    

    This unpacks the library into the current directory; let's say it is located at /path/to/libtorch.

  2. Patch the MD engine. This step is done as usual using the script update-colvars-code.sh. Enter the source directory of the Colvars package and run:

         ./update-colvars-code.sh /path/to/md-engine        
    
  3. Compilation. This step depends on the engine to be compiled.

    • NAMD: add "--with-colvars-torch --torch-prefix /path/to/libtorch" to the arguments of ./config

      Assume the packages required to build NAMD (e.g. charm, tcl/tcl-threaded) are already prepared.
      Then, NAMD can be compiled with the following commands:

        ./config Linux-x86_64-g++ --charm-arch multicore-linux-x86_64 --with-colvars-torch    \
              --torch-prefix /path/to/libtorch  --with-fftw3 --fftw-prefix /path/to/fftw
        cd Linux-x86_64-g++
        make 
    
    • GROMACS: add "-DTorch_DIR=/path/to/libtorch/share/cmake/Torch" when running cmake

    An example command is:

        cmake .. -DCMAKE_INSTALL_PREFIX=/home/username/local/gromacs  \
                        -DFFTWF_LIBRARY=/home/username/mambaforge/lib/libfftw3f.so  \
                        -DFFTWF_INCLUDE_DIR=/home/username/mambaforge/include \
                        -DTorch_DIR=/path/to/libtorch/share/cmake/Torch/  \
                        -DCMAKE_CXX_COMPILER=/usr/bin/mpicxx \
                        -DOpenMP_gomp_LIBRARY=/home/username/mambaforge/lib/libgomp.so
    
    • LAMMPS: only installation via CMake is supported. In the LAMMPS source directory, run
         mkdir build && cd build
         cmake ../cmake -D PKG_COLVARS=yes -D COLVARS_TORCH=yes 
    

    and set the variable Torch_DIR in the file CMakeCache.txt. When a CPU version of the libtorch library is used, it may
    also be necessary to set the MKL path to empty:

         MKL_INCLUDE_DIR:PATH=
    

    Alternatively, one could combine these steps in one command:

         cmake ../cmake -D PKG_COLVARS=yes -D COLVARS_TORCH=yes \
             -D Torch_DIR=/path/to/libtorch/share/cmake/Torch -D MKL_INCLUDE_DIR=
    

    After that, run make and make install to compile and install the package.
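For reference, the build-system glue added by these patches ultimately relies on the standard CMake pattern documented for libTorch; out of tree, a minimal consumer looks roughly like this (project and target names are placeholders):

```cmake
cmake_minimum_required(VERSION 3.18)
project(torchann_example CXX)

# Torch_DIR (or CMAKE_PREFIX_PATH) must point into /path/to/libtorch
find_package(Torch REQUIRED)

add_executable(example example.cpp)
target_link_libraries(example "${TORCH_LIBRARIES}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")
set_property(TARGET example PROPERTY CXX_STANDARD 17)
```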

The class has only been tested with simple neural network models (i.e. an autoencoder on alanine dipeptide) under the NAMD and GROMACS engines. Feedback is welcome!

A (trivial) example

  1. Create a PyTorch model
import torch

# Identity model: forward() returns its input unchanged.
class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return x

model = MyModel()
scripted_cv_filename = './identity.pt'
torch.jit.script(model).save(scripted_cv_filename)

This Python script creates a model that is simply the identity map and saves it to a file named identity.pt.
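For intuition: a torchann component evaluates the model on the values of its input CVs, and biasing forces applied to the outputs are pulled back to the inputs through the model's Jacobian (F_x = J^T F_y). The sketch below illustrates this in plain Python for a linear stand-in model; the function names are illustrative only, not the Colvars implementation.

```python
def forward_linear(W, x):
    """y = W x: a stand-in for evaluating a scripted model on input CVs."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def backprop_force(W, f_y):
    """F_x = W^T F_y: for a linear model the Jacobian is just W."""
    n_in = len(W[0])
    return [sum(W[i][j] * f_y[i] for i in range(len(W))) for j in range(n_in)]

# With the identity model (as in identity.pt), CV values and applied
# forces pass through unchanged.
identity = [[1.0, 0.0], [0.0, 1.0]]
print(forward_linear(identity, [-60.0, 45.0]))   # [-60.0, 45.0]
print(backprop_force(identity, [1.0, -2.0]))     # [1.0, -2.0]
```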

  2. Define the Colvars config file

This file defines two CVs using the torchann class, taking other CV components (here dihedral angles) as inputs.

colvarsTrajFrequency    10000
colvarsRestartFrequency 10000

colvar {
  name nn_0
  lowerBoundary -180.0
  upperBoundary 180
  width 5.0
  extendedLagrangian on
  extendedFluctuation 5.0
  extendedTimeConstant 200

  torchann {
    modelFile identity.pt
    m_output_index 0
    period 360

    dihedral {
      group1 { 
	atomnumbers 5
      }
      group2 { 
	atomnumbers 7
      }
      group3 { 
	atomnumbers 9
      }
      group4 { 
	atomnumbers 15
      }
    }

    dihedral {
      group1 { 
	atomnumbers 7
      }
      group2 { 
	atomnumbers 9
      }
      group3 { 
	atomnumbers 15
      }
      group4 { 
	atomnumbers 17
      }
    }

  }
}

colvar {
  name nn_1
  lowerBoundary -180.0
  upperBoundary 180
  width 5.0
  extendedLagrangian on
  extendedFluctuation 5.0
  extendedTimeConstant 200

  torchann {
    modelFile identity.pt
    m_output_index 1
    period 360

    dihedral {
      group1 { 
	atomnumbers 5
      }
      group2 { 
	atomnumbers 7
      }
      group3 { 
	atomnumbers 9
      }
      group4 { 
	atomnumbers 15
      }
    }

    dihedral {
      group1 { 
	atomnumbers 7
      }
      group2 { 
	atomnumbers 9
      }
      group3 { 
	atomnumbers 15
      }
      group4 { 
	atomnumbers 17
      }
    }
  }
}

abf {
  colvars nn_0 nn_1
  fullSamples	200
}
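Both colvars above set period 360 with boundaries at ±180, matching the dihedral inputs that the identity model passes through. Conceptually, a periodic CV measures the difference between two values by wrapping it into [-period/2, period/2); a minimal sketch of that wrapping (illustrative only, not the actual Colvars code):

```python
def periodic_diff(a, b, period=360.0):
    """Minimum-image difference a - b for a CV with the given period."""
    d = (a - b) % period
    if d >= period / 2.0:
        d -= period
    return d

# Crossing the -180/180 boundary gives the short way around:
print(periodic_diff(170.0, -170.0))  # -20.0
print(periodic_diff(-170.0, 170.0))  # 20.0
```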

@jhenin
Member

jhenin commented Sep 13, 2024

Two items would help here:

  1. a way to run the tests in GH actions to catch regressions
  2. a way to load a dynamic libtorch, because we probably won't be able to add it as a dependency to the back-end codes.

@HanatoK
Member

HanatoK commented Sep 16, 2024

> 2. a way to load a dynamic libtorch, because we probably won't be able to add it as a dependency to the back-end codes.

There are two ways: (i) Colvars provides a mechanism to load a library dynamically, just like the LOAD command in PLUMED (https://www.plumed.org/doc-v2.9/user-doc/html/_l_o_a_d.html); (ii) torchann uses a private implementation that is not compiled with the Colvars main source tree and loads libtorch dynamically.

@jhenin
Member

jhenin commented Sep 18, 2024

Of interest here, a libtorch interface is making its way into the Gromacs build system: https://gitlab.com/gromacs/gromacs/-/merge_requests/4551

Also allow torchANN components to be declared as periodic to fix the regtest
@jhenin
Member

jhenin commented Sep 18, 2024

@zwpku I have made changes to enable building with Gromacs 2024. Have you tested that?
The regtest is broken, possibly due to changes in the extended Lagrangian integrator.

@zwpku zwpku closed this Sep 19, 2024
@giacomofiorin
Member

@zwpku Did you mean to close this PR?

@zwpku
Member Author

zwpku commented Sep 19, 2024

@jhenin @HanatoK I am not sure whether I understood dynamic loading correctly. One issue for me is that, in order to call functions in the loaded library, one has to declare them in the C++ source code with extern "C" so that the names are not mangled when creating the .so file. I found many such declarations in PLUMED, but only a few in libtorch, so I am not sure whether this is supported by libtorch.

Also, it would be good to have a compatible solution with the development of libtorch interface in Gromacs.

@zwpku zwpku reopened this Sep 19, 2024
@zwpku
Member Author

zwpku commented Sep 19, 2024

> @zwpku Did you mean to close this PR?

Sorry, I clicked the wrong button.

@HanatoK
Member

HanatoK commented Sep 19, 2024

> @jhenin @HanatoK I am not sure whether I understood dynamic loading correctly. One issue for me is that, in order to call functions in the loaded library, one has to declare them in the C++ source code with extern "C" so that the names are not mangled when creating the .so file. I found many such declarations in PLUMED, but only a few in libtorch, so I am not sure whether this is supported by libtorch.
>
> Also, it would be good to have a solution compatible with the development of the libtorch interface in Gromacs.

You don't have to declare everything in extern "C". In my opinion, what we should do includes:

  1. Colvars should have the ability to load a dynamic library that includes a subclass of colvarcomp. That means Colvars needs to wrap dlopen (for example: https://theo-penavaire.medium.com/loading-of-a-c-class-from-a-shared-library-modern-c-722d6a830a2b and https://github.com/theopnv/Dynamic-Loading). This may require help from @giacomofiorin and @jhenin to make some fundamental changes in add_component_type of colvar.cpp.
  2. The colvarcomp_torchann class can then be declared in a separate out-of-tree header (no need for extern "C"), and in the implementation colvarcomp_torchann.cpp there should be creator and deleter functions declared extern "C". For example:
    https://github.com/theopnv/Dynamic-Loading/blob/820e15b2ff8af6ae8a75a6cfb92b10e40a02f686/Libraries/Tatooine/Tatooine.cpp#L3-L31

If you have access to the NAMD Gitlab repository, you can have a look at the proposed PytorchForces implementation, which also uses dynamic loading:
https://gitlab.com/tcbgUIUC/namd/-/merge_requests/287/diffs#ae77510c784effb9160a1e44869634b6b0221c8a

@zwpku
Member Author

zwpku commented Sep 19, 2024

> You don't have to declare everything in extern "C". [...]

@HanatoK Thanks! So this means that we compile the colvarcomp_torchann class together with libtorch to build a dynamic library, say libcolvarcomp_torchann.so. Then Colvars loads this library at runtime?

@HanatoK
Member

HanatoK commented Sep 19, 2024

> @HanatoK Thanks! So this means that we compile the colvarcomp_torchann class together with libtorch to build a dynamic library, say libcolvarcomp_torchann.so. Then Colvars loads this library at runtime?

Yes, that is what I mean. libcolvarcomp_torchann.so is linked to libtorch.so (either dynamically or statically), and Colvars loads libcolvarcomp_torchann.so dynamically. I am not sure whether the Colvars maintainers agree with this idea, though.

@giacomofiorin
Member

giacomofiorin commented Sep 19, 2024

> Yes, that is what I mean. libcolvarcomp_torchann.so is linked to libtorch.so (either dynamically or statically), and Colvars loads libcolvarcomp_torchann.so dynamically. I am not sure whether the Colvars maintainers agree with this idea, though.

This is actually a question that is very easy to answer (especially compared to others you have asked recently on this repo 😉).

The only arrangements we have ever followed are those set by the maintainers of the packages that Colvars is distributed with. In this specific case, if GROMACS is considering linking libTorch (see above), IMO Colvars should link it in the same manner, whether that is statically, dynamically, etc.

@HanatoK have you discussed with your colleagues in the NAMD team about how to package libTorch with NAMD? It would be awesome if it could be packaged statically as you do already for the NVIDIA libraries, FFTW, etc

@HanatoK
Member

HanatoK commented Sep 19, 2024

> The only arrangements we have ever followed are those set by the maintainers of the packages that Colvars is distributed with. In this specific case, if GROMACS is considering linking libTorch (see above), IMO Colvars should link it in the same manner, whether that is statically, dynamically, etc.

But it seems @jhenin wants to dynamically load libtorch, not link to it dynamically or statically. I also think dynamic loading is the most general and flexible solution, since it does not require the MD engine to link to libtorch.

> @HanatoK have you discussed with your colleagues in the NAMD team about how to package libTorch with NAMD? It would be awesome if it could be packaged statically as you do already for the NVIDIA libraries, FFTW, etc

We are not going to package or redistribute libtorch, because we want to use libtorch with CUDA, and libtorch may be compiled for different CUDA architectures. More specifically, libtorch_cuda.so may be compiled with -DTORCH_CUDA_ARCH_LIST, which depends on the actual GPUs being used. The official libtorch binary might not be the optimal build for all users.

I plan to distribute the PytorchForces and NAMD headers in source form in the NAMD binary tarball, let users download and compile libtorch following the official instructions, and then compile the binary plugin of PytorchForces linked against libtorch.

@giacomofiorin
Member

@jhenin's suggestion was meant to avoid maintaining patched versions of the build recipes for each engine, which this branch contains right now. Loading would allow the use of a precompiled libTorch in the CI jobs, i.e. loading would be preferable for the specific purpose of running tests.

As far as distribution goes, as long as the user is required to build dependencies there is no perfect approach 😄

@jhenin
Member

jhenin commented Sep 20, 2024

I did mention dynamic loading. However, linking at build time should be supported when possible. In Gromacs and LAMMPS, we have the benefit of a separate CMake file, so there we can relatively easily detect if torch is available at build time - especially if we coordinate with the Gromacs folks. In NAMD, it's more intrusive, so I don't know how easily the patch to the config script will be accepted. The build system of VMD is its own beast, and distribution is a problem, so dynamic loading makes more sense there.

@HanatoK
Member

HanatoK commented Sep 20, 2024

> I did mention dynamic loading. However, linking at build time should be supported when possible. In Gromacs and LAMMPS, we have the benefit of a separate CMake file, so there we can relatively easily detect if torch is available at build time - especially if we coordinate with the Gromacs folks. In NAMD, it's more intrusive, so I don't know how easily the patch to the config script will be accepted. The build system of VMD is its own beast, and distribution is a problem, so dynamic loading makes more sense there.

Do you want to support both dynamic loading and direct linkage? That would make the code more complicated, and I cannot see the benefit of using both; you would need a lot of ifdefs. For dynamic loading, if GROMACS is linked to libtorch, then libcolvarcomp_torchann.so does not need to link to libtorch, since it will find the libtorch symbols when loaded in GROMACS. If NAMD is not linked to libtorch, then libcolvarcomp_torchann.so needs to link to libtorch when used in NAMD.

@zwpku
Member Author

zwpku commented Sep 20, 2024

> @zwpku I have made changes to enable building with Gromacs 2024. Have you tested that? The regtest is broken, possibly due to changes in the extended Lagrangian integrator.

@jhenin I tried to patch Gromacs 2024 with this branch. It seems that Gromacs 2024 is treated as GROMACS-DEV in update-colvars-code.sh, and the corresponding commands do not update the gmxManageColvars.cmake file (which includes the dependence on libtorch). I didn't continue to build the code, but I am afraid the torchann class will not be compiled. Is there an issue in update-colvars-code.sh?

Had to make separate patch for the mdmodules version
@jhenin
Member

jhenin commented Sep 20, 2024

Thanks @zwpku , the patch was not in the right place.

@jhenin
Member

jhenin commented Sep 23, 2024

The current failures to build gromacs in regtests are due to the lack of downstream changes. Merging master should help.

@giacomofiorin
Member

@jhenin It looks like you have committed directly onto master the patches to GROMACS CMake files discussed above. Can you revert?

@@ -38,6 +38,19 @@ gmx_option_multichoice(GMX_USE_COLVARS
INTERNAL NONE)
mark_as_advanced(GMX_USE_COLVARS)

function(gmx_set_colvars_torch)
find_package(Torch)
Member

@HubLot HubLot Sep 27, 2024


Note: This line should be removed for GMX 2025

Member


@HubLot I propose to stop patching any build recipe for the engines, and just test the features inside the library itself. I added support for linking against PyTorch in tests in PR #721, and tagged you as reviewer since you already reviewed a similar MR for GROMACS ;-)

6 participants