Iterative refinement (IR) #243
Conversation
solver_->apply(lend(one_op), lend(residual), lend(one_op), dense_x);
residual->copy_from(dense_b);
system_matrix_->apply(lend(neg_one_op), dense_x, lend(one_op),
It's quite simple, but nonetheless I believe we generally put the general algorithm of the solver in a comment. Maybe you could also do that here? (Or was that only for the step kernels?)
it was only for the step kernels
LGTM
LGTM!
size_type num_cols, stopping_status *stop_status)
{
    const auto tidx =
        static_cast<size_type>(blockDim.x) * blockIdx.x + threadIdx.x;
This is the first time I see it handled that way (using an explicit cast to size_type). What is the normal auto deduction? Is it not uint64?
I did a quick implementation of the iterative refinement method after getting back from work today (yesterday for you). It seemed like too much of a low-hanging fruit not to have it in the first release.
Needless to say, this does not try to do mixed-precision IR (MPIR), but just the plain old IR, where you can specify what you want to use as the inner solver. By default I set it to the simplest possible thing - the identity operator, which results in Richardson iteration with scaling parameter 1.
To add mixed precision, we would have to solve the problem of conversions between value types - for that we would need to implement #247 - but even without it it's still useful, as it gives us Richardson iteration support, and if the default solver is replaced by the block-Jacobi preconditioner, we actually get the block-Jacobi relaxation method (or what would be the "adaptive block-Jacobi relaxation method").
While writing the unit tests, I also figured out that stopping_status did not have comparisons, so I added them, and while adding them I noticed that we had 2 methods for resetting the status (clear() and reset()), so I removed clear() and used reset() wherever clear() was used previously.