[MFEM example] Wrong result when changing preconditioner in MFEM example using Ginkgo #633
It also occurs when I try to modify example 2 (https://github.com/mfem/mfem/blob/master/examples/ex2.cpp) to use the Ginkgo solver. It happens with all preconditioners on the CUDA executor. I tried the reference executor and the result is correct, but with the CUDA executor the result is wrong.
The examples with MFEM need to be modified to use the CUDA executor. Unfortunately, they do not work out of the box. We are working on integrating it completely. You can have a look at @nbeams' PR on the MFEM side with this branch. But you should still be able to use the CUDA executor from Ginkgo's side with the following modifications:

```diff
diff --git a/examples/ginkgo/ex1.cpp b/examples/ginkgo/ex1.cpp
index 2d9f565..ca8d9cb 100644
--- a/examples/ginkgo/ex1.cpp
+++ b/examples/ginkgo/ex1.cpp
@@ -195,8 +195,8 @@ int main(int argc, char *argv[])
 {
 #ifdef MFEM_USE_GINKGO
    // Solve the linear system with CG + ILU from Ginkgo.
-   std::string executor = "reference";
-   auto exec = gko::ReferenceExecutor::create();
+   std::string executor = "cuda";
+   auto exec = gko::CudaExecutor::create(0, gko::OmpExecutor::create());
    auto ilu_precond =
       gko::preconditioner::Ilu<gko::solver::LowerTrs<>,
                                gko::solver::UpperTrs<>, false>::build()
```

But be aware that this copies the data from the host to the GPU, performs the solve, and copies the results back to the CPU, so it can be slow. For the version without the copies, you can have a look at the PR from @nbeams.
Thank you for your reply. I mean I made the same modification as above, but the result is wrong. You can try modifying it to use the Jacobi preconditioner and run `./ex1 -o 3`, instead of the ILU preconditioner with `./ex1`.
I am sorry, but I cannot reproduce it. This seems to work for me:

```diff
diff --git a/examples/ginkgo/ex1.cpp b/examples/ginkgo/ex1.cpp
index 2d9f565..5539530 100644
--- a/examples/ginkgo/ex1.cpp
+++ b/examples/ginkgo/ex1.cpp
@@ -195,14 +195,16 @@ int main(int argc, char *argv[])
 {
 #ifdef MFEM_USE_GINKGO
    // Solve the linear system with CG + ILU from Ginkgo.
-   std::string executor = "reference";
-   auto exec = gko::ReferenceExecutor::create();
-   auto ilu_precond =
-      gko::preconditioner::Ilu<gko::solver::LowerTrs<>,
-                               gko::solver::UpperTrs<>, false>::build()
-      .on(exec);
-   GinkgoWrappers::CGSolver ginkgo_solver(executor, 1, 2000, 1e-12, 0.0,
-                                          ilu_precond.release());
+   using bj = gko::preconditioner::Jacobi<double, int>;
+   std::string executor = "cuda";
+   auto exec = gko::CudaExecutor::create(0, gko::OmpExecutor::create());
+   auto bj_precond = bj::build()
+                        .with_max_block_size(16u)
+                        .with_storage_optimization(
+                            gko::precision_reduction::autodetect())
+                        .on(exec);
+   GinkgoWrappers::CGSolver ginkgo_solver(executor, 1, 2000, 1e-12, 0.0,
+                                          bj_precond.release());
    ginkgo_solver.solve(&((SparseMatrix&)(*A)), X, B);
 #endif
 }
```
I see what you mean.
Yes, thank you for your support.
Hi @pratikvn. I'm sorry to bother you. I tried to follow the changes in the PR you mentioned, but it doesn't work when I build MFEM. I use ginkgo-1.2.0. It returns:

```
linalg/../fem/../mesh/../fem/../linalg/ginkgo.hpp: In constructor ‘mfem::GinkgoWrappers::GinkgoPreconditionerBase::GinkgoPreconditionerBase(std::shared_ptr, mfem::SparseMatrix&, mfem::Array&, bool)’:
1 error detected in the compilation of "/tmp/tmpxft_00002ad4_00000000-6_ginkgo.cpp1.ii".
```
It also doesn't work when I build with ginkgo-1.3.0.
@nbeams, maybe you can help out here?
Hi @nghiakvnvsd,

Let me make sure I understand exactly what you are testing here. The first error you reported was with the current master branches of MFEM and Ginkgo, correct? And you saw a problem with the block Jacobi preconditioner for ex1 with CUDA. Did you also try the OpenMP executor? Also, what happens if you set the

For the issue with ex2, I think I might be of more use if I could see exactly what modifications you made to the existing code. Do you have the code available somewhere?

Finally, the build issue with my branch of MFEM. Did you completely check out the branch, or add its commits on top of the current MFEM master? Also, I should say, I have been using that branch with my own branch of Ginkgo as well (https://github.com/ginkgo-project/ginkgo/tree/mfem-wrappers). I have not seen that particular build issue before, but it wouldn't necessarily surprise me that there are build issues when not using my branch of Ginkgo. There have been changes to both Ginkgo and MFEM since I last did updates on my branches.

So, I wouldn't actually recommend using my branch of MFEM as it currently stands. I think what @pratikvn meant was to look at the code in the PR for accessing an MFEM SparseMatrix on host or device, without copying, to wrap it as a Ginkgo matrix. E.g., with
Hi @nbeams, the modified ex2 I mentioned was just a combination of ex1 (ginkgo) and ex2 (serial). I have tried to use the mfem-wrappers branch and tried to merge ginkgo-precond with MFEM master, but it didn't work; when I tried to run example 1 (ginkgo) it reports:
Sorry, I should have changed "I wouldn't actually recommend using my branch of MFEM as it currently stands" to "do not use my branch of MFEM as it currently stands." I forgot I had commented out some lines related to building Ginkgo solvers. They were causing conflicts with an update in Ginkgo that was related to fixing a CUDA compiler issue. I wasn't using the Ginkgo solvers in the tests in my branch, just the preconditioners, so it didn't matter to me at the time. (But thanks for the reminder that I need to check again with the latest version of Ginkgo, and fix the solver integration in MFEM if needed.)
I think I will wait for the next version, MFEM 4.2. Thank you very much for your support.
Could you post the code of the modified ex2 to use Ginkgo together with it so that we can try to look into it? Thanks! |
Yes. Thank you!!! https://drive.google.com/drive/folders/1RNyy8UIcndXxzseSU4pO1mpcg_O83o3F?usp=sharing |
As you can see, ex2.cpp and ex2-cuda-executor.cpp differ only in the "cuda" vs. "reference" executor. But I don't understand why ex2 converges while ex2-cuda-executor doesn't.
@nghiakvnvsd, I have built your examples
@nghiakvnvsd: Unfortunately, I do not have an answer for you yet, but I haven't forgotten and I'm still looking into it. |
@nghiakvnvsd Update: here, for CUDA, I would actually recommend using the proper ILU(0) factorization method, which you can do with:

```cpp
factorization = share(
    gko::factorization::Ilu<value_type, index_type>::build().on(exec));
auto solver_gen =
    gko::solver::Cg<value_type>::build()
        .with_criteria(
            gko::stop::Iteration::build().with_max_iters(2000u).on(exec),
            gko::stop::ResidualNormReduction<value_type>::build()
                .with_reduction_factor(reduction_factor)
                .on(exec))
        .with_preconditioner(precond_fact)
        .on(exec);
```

For OpenMP and Reference, we have not implemented this factorization method yet, but ParILU does work fine there. I am now looking into the Jacobi preconditioner and why it sometimes prevents convergence.
This PR adds the option to sort the matrix before processing it inside the Jacobi preconditioner and the ILU factorization. By default, the input matrix is assumed to be unsorted, so a sorting step is performed (unless the factory parameter `skip_sorting` is set to `true`). Additionally, this PR adds functionality to unsort a given matrix for tests (currently only CSR and COO are supported). This functionality is then used in added tests that unsort the system matrix before generating / running for CSR, COO, the Jacobi preconditioner, and the ILU factorization. Related PR: #659. Closes issue: #633.
Hi!
When I run ex1 (https://github.com/mfem/mfem/blob/master/examples/ginkgo/ex1.cpp) with the reference executor and the cuda executor, it's OK. But when I change the preconditioner (to the Jacobi preconditioner), running with the reference executor is OK, while with the cuda executor the result is wrong: it does not converge.
I think it's an error. You can try it and see.
Thank you!