Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add solver BiCG #438

Merged
merged 45 commits into from
Jan 28, 2020
Merged
Show file tree
Hide file tree
Changes from 35 commits
Commits
Show all changes
45 commits
Select commit Hold shift + click to select a range
ad58adc
initial setup via the create_algorithm script
Eoli-an Dec 11, 2019
da91b16
create inital code for bicg.cpp
Eoli-an Dec 17, 2019
5b00647
create inital code for bicg_kernels.cpp
Eoli-an Dec 17, 2019
0c96703
create inital code for bicg_kernels.cu
Eoli-an Dec 17, 2019
281cdcb
create inital code for bicg.hpp
Eoli-an Dec 17, 2019
c4cb0e3
create inital code for ginkgo.hpp
Eoli-an Dec 17, 2019
ac69bcf
create inital code for bicg_kernels.cpp
Eoli-an Dec 17, 2019
5724bb7
create inital code for bicg_kernels.hpp.inc
Eoli-an Dec 17, 2019
ba32e9c
create inital tests for the core, use only cg tests
Eoli-an Dec 17, 2019
0ef211a
remove unnecessary casting to csr matrix, which resulted in a segment…
Eoli-an Dec 17, 2019
b902e66
swap instructions so that we do not set the strategy for an empty poi…
Eoli-an Dec 17, 2019
57003b8
add implementation for the reference kernel
Eoli-an Dec 17, 2019
dcb759b
add implementation for the reference kernel tests, use only the cg tests
Eoli-an Dec 17, 2019
2ab69e9
create initial code for bicg_kernels.hpp.inc
Eoli-an Dec 17, 2019
c28bd2a
create initial code for bicg_kernels.cu
Eoli-an Dec 17, 2019
95ce8b1
create initial code for bicg_kernels.cpp, use only the cg tests
Eoli-an Dec 17, 2019
4e82749
fix bug where bicg did not correctly solve a non symmetrical system
Eoli-an Jan 7, 2020
561935d
clean up of the comments
Eoli-an Jan 7, 2020
59bf41d
fix bug where we did not check if the system_matrix_ was already tran…
Eoli-an Jan 21, 2020
fd4b7f1
remove a unnecessary apply and just set r2 to r
Eoli-an Jan 21, 2020
2433b83
additional empty line in the header file
Eoli-an Jan 21, 2020
06cb85c
use static_cast instead of dynamic_cast since the type of the Precond…
Eoli-an Jan 21, 2020
aee5f40
add additional line to follow the AAA rules
Eoli-an Jan 21, 2020
4e0cd70
create inital code for the Omp Executor
Eoli-an Jan 21, 2020
70969c2
create the code for Omp and add the tests
Eoli-an Jan 21, 2020
a540e94
swap the header for random and gtest around in all omp tests
Eoli-an Jan 21, 2020
2effc4f
swap around the core header files in all omp tests
Eoli-an Jan 21, 2020
aaad7c5
swap aroung the gtest and assertion headers in all reference tests
Eoli-an Jan 21, 2020
f1ad5d5
remove the comments around the cg tests and change some lines so we f…
Eoli-an Jan 21, 2020
16177e2
swap headers in all solver test for reference,omp and cuda
Eoli-an Jan 21, 2020
3c1eeba
transpose the preconditioner. For that the Identity matrix has to inh…
Eoli-an Jan 21, 2020
890b68a
remove unnecessary comments
Eoli-an Jan 21, 2020
62a20fe
clean up
Eoli-an Jan 23, 2020
b7bd148
remove unnecessary logger and stop criterion
Eoli-an Jan 23, 2020
351a44a
remove unnecessary dynmaic cast
Eoli-an Jan 23, 2020
7ee18c8
add name to contributers.txt
Eoli-an Jan 28, 2020
59147c5
add hip implementation
Eoli-an Jan 28, 2020
a1a40de
add non spd test for cuda,hip and omp. The cuda and hip test cases us…
Eoli-an Jan 28, 2020
76cbe0a
Instantiate for each value and index type instead of just the value t…
Eoli-an Jan 28, 2020
7d3a0f5
fix bug in GKO_DECLARE_BICG
Eoli-an Jan 28, 2020
1f71029
add bicg to the HIP CMakeLists
Eoli-an Jan 28, 2020
c6d93f0
remove unnecessary index type and instead check explicitly for all av…
Eoli-an Jan 28, 2020
92e705c
change math.hpp to math.hip.hpp
Eoli-an Jan 28, 2020
e731cf4
add bicg to the hip test list
Eoli-an Jan 28, 2020
e990b44
fix copy bug
Eoli-an Jan 28, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
114 changes: 114 additions & 0 deletions common/solver/bicg_kernels.hpp.inc
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
/*******************************<GINKGO LICENSE>******************************
Copyright (c) 2017-2020, the Ginkgo authors
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
******************************<GINKGO LICENSE>*******************************/

template <typename ValueType>
__global__ __launch_bounds__(default_block_size) void initialize_kernel(
size_type num_rows, size_type num_cols, size_type stride,
const ValueType *__restrict__ b, ValueType *__restrict__ r,
ValueType *__restrict__ z, ValueType *__restrict__ p,
ValueType *__restrict__ q, ValueType *__restrict__ r2,
ValueType *__restrict__ z2, ValueType *__restrict__ p2,
ValueType *__restrict__ q2, ValueType *__restrict__ prev_rho,
ValueType *__restrict__ rho, stopping_status *__restrict__ stop_status)
{
const auto tidx =
static_cast<size_type>(blockDim.x) * blockIdx.x + threadIdx.x;

if (tidx < num_cols) {
rho[tidx] = zero<ValueType>();
prev_rho[tidx] = one<ValueType>();
stop_status[tidx].reset();
}

if (tidx < num_rows * stride) {
r[tidx] = b[tidx];
z[tidx] = zero<ValueType>();
p[tidx] = zero<ValueType>();
q[tidx] = zero<ValueType>();
r2[tidx] = b[tidx];
z2[tidx] = zero<ValueType>();
p2[tidx] = zero<ValueType>();
q2[tidx] = zero<ValueType>();
}
}


template <typename ValueType>
__global__ __launch_bounds__(default_block_size) void step_1_kernel(
size_type num_rows, size_type num_cols, size_type stride,
ValueType *__restrict__ p, const ValueType *__restrict__ z,
ValueType *__restrict__ p2, const ValueType *__restrict__ z2,
const ValueType *__restrict__ rho, const ValueType *__restrict__ prev_rho,
const stopping_status *__restrict__ stop_status)
{
const auto tidx =
static_cast<size_type>(blockDim.x) * blockIdx.x + threadIdx.x;
const auto col = tidx % stride;
if (col >= num_cols || tidx >= num_rows * stride ||
stop_status[col].has_stopped()) {
return;
}
const auto tmp = rho[col] / prev_rho[col];

p[tidx] =
prev_rho[col] == zero<ValueType>() ? z[tidx] : z[tidx] + tmp * p[tidx];

p2[tidx] = prev_rho[col] == zero<ValueType>() ? z2[tidx]
: z2[tidx] + tmp * p2[tidx];
}


template <typename ValueType>
__global__ __launch_bounds__(default_block_size) void step_2_kernel(
size_type num_rows, size_type num_cols, size_type stride,
size_type x_stride, ValueType *__restrict__ x, ValueType *__restrict__ r,
ValueType *__restrict__ r2, const ValueType *__restrict__ p,
const ValueType *__restrict__ q, const ValueType *__restrict__ q2,
const ValueType *__restrict__ beta, const ValueType *__restrict__ rho,
const stopping_status *__restrict__ stop_status)
{
const auto tidx =
static_cast<size_type>(blockDim.x) * blockIdx.x + threadIdx.x;
const auto row = tidx / stride;
const auto col = tidx % stride;

if (col >= num_cols || tidx >= num_rows * num_cols ||
stop_status[col].has_stopped()) {
return;
}
if (beta[col] != zero<ValueType>()) {
const auto tmp = rho[col] / beta[col];
x[row * x_stride + col] += tmp * p[tidx];
r[tidx] -= tmp * q[tidx];
r2[tidx] -= tmp * q2[tidx];
}
}
1 change: 1 addition & 0 deletions core/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ target_sources(ginkgo
matrix/sellp.cpp
matrix/sparsity_csr.cpp
preconditioner/jacobi.cpp
solver/bicg.cpp
solver/bicgstab.cpp
solver/cg.cpp
solver/cgs.cpp
Expand Down
24 changes: 24 additions & 0 deletions core/device_hooks/common_kernels.inc.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#include "core/matrix/sellp_kernels.hpp"
#include "core/matrix/sparsity_csr_kernels.hpp"
#include "core/preconditioner/jacobi_kernels.hpp"
#include "core/solver/bicg_kernels.hpp"
#include "core/solver/bicgstab_kernels.hpp"
#include "core/solver/cg_kernels.hpp"
#include "core/solver/cgs_kernels.hpp"
Expand Down Expand Up @@ -213,6 +214,29 @@ GKO_INSTANTIATE_FOR_EACH_VALUE_TYPE(GKO_DECLARE_CG_STEP_2_KERNEL);
} // namespace cg


// TODO (script): adapt this block as needed
tcojean marked this conversation as resolved.
Show resolved Hide resolved
namespace bicg {


template <typename ValueType>
GKO_DECLARE_BICG_INITIALIZE_KERNEL(ValueType)
GKO_NOT_COMPILED(GKO_HOOK_MODULE);
GKO_INSTANTIATE_FOR_EACH_VALUE_TYPE(GKO_DECLARE_BICG_INITIALIZE_KERNEL);

template <typename ValueType>
GKO_DECLARE_BICG_STEP_1_KERNEL(ValueType)
GKO_NOT_COMPILED(GKO_HOOK_MODULE);
GKO_INSTANTIATE_FOR_EACH_VALUE_TYPE(GKO_DECLARE_BICG_STEP_1_KERNEL);

template <typename ValueType>
GKO_DECLARE_BICG_STEP_2_KERNEL(ValueType)
GKO_NOT_COMPILED(GKO_HOOK_MODULE);
GKO_INSTANTIATE_FOR_EACH_VALUE_TYPE(GKO_DECLARE_BICG_STEP_2_KERNEL);
tcojean marked this conversation as resolved.
Show resolved Hide resolved


} // namespace bicg


namespace lower_trs {


Expand Down
14 changes: 14 additions & 0 deletions core/matrix/identity.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,20 @@ std::unique_ptr<LinOp> IdentityFactory<ValueType>::generate_impl(
}


template <typename ValueType>
thoasm marked this conversation as resolved.
Show resolved Hide resolved
std::unique_ptr<LinOp> Identity<ValueType>::transpose() const
{
return this->clone();
}


template <typename ValueType>
std::unique_ptr<LinOp> Identity<ValueType>::conj_transpose() const
{
return this->clone();
}


#define GKO_DECLARE_IDENTITY_MATRIX(_type) class Identity<_type>
GKO_INSTANTIATE_FOR_EACH_VALUE_TYPE(GKO_DECLARE_IDENTITY_MATRIX);
#define GKO_DECLARE_IDENTITY_FACTORY(_type) class IdentityFactory<_type>
Expand Down
206 changes: 206 additions & 0 deletions core/solver/bicg.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,206 @@
/*******************************<GINKGO LICENSE>******************************
Copyright (c) 2017-2020, the Ginkgo authors
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
******************************<GINKGO LICENSE>*******************************/

#include <ginkgo/core/solver/bicg.hpp>


#include <ginkgo/core/base/exception.hpp>
#include <ginkgo/core/base/exception_helpers.hpp>
#include <ginkgo/core/base/executor.hpp>
#include <ginkgo/core/base/math.hpp>
#include <ginkgo/core/base/name_demangling.hpp>
#include <ginkgo/core/base/utils.hpp>


#include "core/solver/bicg_kernels.hpp"


namespace gko {
namespace solver {


namespace bicg {


GKO_REGISTER_OPERATION(initialize, bicg::initialize);
GKO_REGISTER_OPERATION(step_1, bicg::step_1);
GKO_REGISTER_OPERATION(step_2, bicg::step_2);


} // namespace bicg


template <typename ValueType, typename IndexType>
void Bicg<ValueType, IndexType>::apply_impl(const LinOp *b, LinOp *x) const
{
using std::swap;
using Vector = matrix::Dense<ValueType>;
using CsrMatrix = matrix::Csr<ValueType, IndexType>;
constexpr uint8 RelativeStoppingId{1};

auto exec = this->get_executor();

auto one_op = initialize<Vector>({one<ValueType>()}, exec);
auto neg_one_op = initialize<Vector>({-one<ValueType>()}, exec);

auto dense_b = as<const Vector>(b);
auto dense_x = as<Vector>(x);
auto r = Vector::create_with_config_of(dense_b);
auto r2 = Vector::create_with_config_of(dense_b);
auto z = Vector::create_with_config_of(dense_b);
auto z2 = Vector::create_with_config_of(dense_b);
auto p = Vector::create_with_config_of(dense_b);
auto p2 = Vector::create_with_config_of(dense_b);
auto q = Vector::create_with_config_of(dense_b);
auto q2 = Vector::create_with_config_of(dense_b);

auto alpha = Vector::create(exec, dim<2>{1, dense_b->get_size()[1]});
auto beta = Vector::create_with_config_of(alpha.get());
auto prev_rho = Vector::create_with_config_of(alpha.get());
auto rho = Vector::create_with_config_of(alpha.get());

bool one_changed{};
Array<stopping_status> stop_status(alpha->get_executor(),
dense_b->get_size()[1]);

// TODO: replace this with automatic merged kernel generator
thoasm marked this conversation as resolved.
Show resolved Hide resolved

exec->run(bicg::make_initialize(
dense_b, r.get(), z.get(), p.get(), q.get(), prev_rho.get(), rho.get(),
r2.get(), z2.get(), p2.get(), q2.get(), &stop_status));

// rho = 0.0
// prev_rho = 1.0
// z = p = q = 0
// r = r2 = dense_b
// z2 = p2 = q2 = 0

decltype(copy_and_convert_to<CsrMatrix>(
exec, const_cast<LinOp *>(
system_matrix_.get()))) csr_system_matrix_unique_ptr{};

std::unique_ptr<LinOp> trans_A;
auto transposable_system_matrix =
dynamic_cast<const Transposable *>(system_matrix_.get());

if (transposable_system_matrix) {
trans_A = transposable_system_matrix->transpose();

tcojean marked this conversation as resolved.
Show resolved Hide resolved
} else {
csr_system_matrix_unique_ptr = copy_and_convert_to<CsrMatrix>(
system_matrix_->get_executor(),
const_cast<LinOp *>(system_matrix_.get()));

csr_system_matrix_unique_ptr->set_strategy(
std::make_shared<typename CsrMatrix::classical>());

auto csr_system_matrix_ = csr_system_matrix_unique_ptr.get();

trans_A = csr_system_matrix_->transpose();
}

auto trans_preconditioner_tmp =
as<const Transposable>(get_preconditioner().get());
auto trans_preconditioner = trans_preconditioner_tmp->transpose();

system_matrix_->apply(neg_one_op.get(), dense_x, one_op.get(), r.get());
// r = r - Ax = -1.0 * A*dense_x + 1.0*r

r2->copy_from(r.get());
// r2 = r

auto stop_criterion = stop_criterion_factory_->generate(
system_matrix_, std::shared_ptr<const LinOp>(b, [](const LinOp *) {}),
x, r.get());

int iter = -1;

while (true) {
get_preconditioner()->apply(r.get(), z.get());
trans_preconditioner->apply(r2.get(), z2.get());

z->compute_dot(r2.get(), rho.get());

++iter;
this->template log<log::Logger::iteration_complete>(this, iter, r.get(),
dense_x);

if (stop_criterion->update()
.num_iterations(iter)
.residual(r.get())
.solution(dense_x)
.check(RelativeStoppingId, true, &stop_status, &one_changed)) {
break;
}

exec->run(bicg::make_step_1(p.get(), z.get(), p2.get(), z2.get(),
rho.get(), prev_rho.get(), &stop_status));
// tmp = rho / prev_rho
// p = z + tmp * p
// p2 = z2 + tmp * p2
system_matrix_->apply(p.get(), q.get());

trans_A->apply(p2.get(), q2.get());

p2->compute_dot(q.get(), beta.get());

exec->run(bicg::make_step_2(dense_x, r.get(), r2.get(), p.get(),
q.get(), q2.get(), beta.get(), rho.get(),
&stop_status));
// tmp = rho / beta
// x = x + tmp * p
// r = r - tmp * q
// r2 = r2 - tmp * q2
swap(prev_rho, rho);
}
}


template <typename ValueType, typename IndexType>
void Bicg<ValueType, IndexType>::apply_impl(const LinOp *alpha, const LinOp *b,
const LinOp *beta, LinOp *x) const
{
auto dense_x = as<matrix::Dense<ValueType>>(x);

auto x_clone = dense_x->clone();
this->apply(b, x_clone.get());
dense_x->scale(beta);
dense_x->add_scaled(alpha, x_clone.get());
}


#define GKO_DECLARE_BICG(_type) class Bicg<_type>
GKO_INSTANTIATE_FOR_EACH_VALUE_TYPE(GKO_DECLARE_BICG);
tcojean marked this conversation as resolved.
Show resolved Hide resolved


} // namespace solver
} // namespace gko
Loading