Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Triangular solvers #1193

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from
Open

Conversation

lksriemer
Copy link
Contributor

This PR adds several triangular solvers. Machinery to either select the solver from a configuration file or to automatically choose a good solver will be included in a future PR. For now, the default remains sparselib.

These solvers are added:

  • level based solver
  • warp diagonal-inverting solver 'winv'
  • (somewhat) load-balancing solver 'wvar'
  • thinned out syncfree solver
  • block solver

The changes are a little cluttered due to some renaming, e. g. trisolve_algorithm to trisolve_strategy.

Some minimal refactoring is still required, but in principle the code is ready for review. There are several TODOs still in the code. They are not really pressing, but it should probably be decided what to do about them. Some might just be deleted.

adds these triangular solvers:
- level based solver
- warp inverting solver 'winv'
- load-balancing solver 'wvar'
- thinned out syncfree solver
- block solver
@MarcelKoch
Copy link
Member

label!

@ginkgo-bot ginkgo-bot added mod:all This touches all Ginkgo modules. reg:build This is related to the build system. reg:testing This is related to testing. type:solver This is related to the solvers labels Nov 14, 2022
Copy link
Member

@upsj upsj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A first glance at the infrastructure and some of the algorithms. Really nice work!

@@ -73,7 +73,7 @@ template <typename ValueType, typename IndexType>
void generate(std::shared_ptr<const HipExecutor> exec,
const matrix::Csr<ValueType, IndexType>* matrix,
std::shared_ptr<solver::SolveStruct>& solve_struct,
bool unit_diag, const solver::trisolve_algorithm algorithm,
bool unit_diag, const solver::trisolve_strategy strategy,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool unit_diag, const solver::trisolve_strategy strategy,
bool unit_diag, const solver::trisolve_strategy* strategy,

@@ -73,7 +73,7 @@ template <typename ValueType, typename IndexType>
void generate(std::shared_ptr<const HipExecutor> exec,
const matrix::Csr<ValueType, IndexType>* matrix,
std::shared_ptr<solver::SolveStruct>& solve_struct,
bool unit_diag, const solver::trisolve_algorithm algorithm,
bool unit_diag, const solver::trisolve_strategy strategy,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool unit_diag, const solver::trisolve_strategy strategy,
bool unit_diag, const solver::trisolve_strategy* strategy,

@@ -70,7 +70,7 @@ template <typename ValueType, typename IndexType>
void generate(std::shared_ptr<const DpcppExecutor> exec,
const matrix::Csr<ValueType, IndexType>* matrix,
std::shared_ptr<solver::SolveStruct>& solve_struct,
bool unit_diag, const solver::trisolve_algorithm algorithm,
bool unit_diag, const solver::trisolve_strategy strategy,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool unit_diag, const solver::trisolve_strategy strategy,
bool unit_diag, const solver::trisolve_strategy* strategy,

@@ -70,7 +70,7 @@ template <typename ValueType, typename IndexType>
void generate(std::shared_ptr<const DpcppExecutor> exec,
const matrix::Csr<ValueType, IndexType>* matrix,
std::shared_ptr<solver::SolveStruct>& solve_struct,
bool unit_diag, const solver::trisolve_algorithm algorithm,
bool unit_diag, const solver::trisolve_strategy strategy,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool unit_diag, const solver::trisolve_strategy strategy,
bool unit_diag, const solver::trisolve_strategy* strategy,

@@ -32,13 +32,22 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

template <typename ValueType, typename IndexType>
__device__ __forceinline__
std::enable_if_t<!is_complex_s<ValueType>::value, ValueType>
std::enable_if_t<std::is_floating_point<ValueType>::value, ValueType>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this have to do with int not having a definition of is_complex? We will probably run into issues with #1257

Comment on lines +314 to +318
void should_perform_transpose_kernel(std::shared_ptr<const CudaExecutor> exec,
bool& do_transpose)
{
do_transpose = false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this most likely belongs outside the unnamed namespace

Comment on lines +732 to +734
if (levels[row] != sweep) {
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to sort by levels and instead launch only threads for the rows in that level?



template <typename IndexType, bool is_upper>
__global__ void level_generation_kernel(const IndexType* const rowptrs,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can find a way to merge this kernel with the numerical one (same for legacy) with the right generalizations of + and *?

const auto local_row = rhs;
const auto row = gid;

const auto activemask = __activemask();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

activemask and Volta don't play nice with each other :-( This may only be a subset of all threads in this branch


// This is "heavily inspired" by cppreference.
template <class T>
__device__ const T* lower_bound(const T* first, const T* const last,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have an implementation in searching.cuh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mod:all This touches all Ginkgo modules. reg:build This is related to the build system. reg:testing This is related to testing. type:solver This is related to the solvers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants