-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add flat sparse packs #877
Conversation
P.S., thanks @lroberts36 for your help with this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, for my own understanding, how is this intended to be used downstream with sparse variables? Is it just an if
check for each variable on each meshblock and the work is skipped if that variable isnt allocated on that meshblock?
It only packs allocated variables. So if for example, I have a total of 20 variables, but 3 variables are allocated on each of 5 blocks (where the set of variables allocated are different on each block), you get an outer index of length 15. |
I see what you did there ;) |
@lroberts36 you should probably be the other person to review this---seeing as you wrote sparse packs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM except that I think we should make sure the pack bounds arrays are meaningful even for flat packs.
src/interface/sparse_pack.hpp
Outdated
@@ -200,27 +252,32 @@ class SparsePack : public SparsePackBase { | |||
KOKKOS_INLINE_FUNCTION int GetLowerBound(const int b) const { return 0; } | |||
|
|||
KOKKOS_INLINE_FUNCTION int GetUpperBound(const int b) const { | |||
PARTHENON_DEBUG_REQUIRE(!flat_, "Bounds not valid for flat packs"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are they not valid if b==0
? I thought they were.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, after reading below I see they are not valid. I think we should make it so that they are valid, and maybe add a default const int b=0
here so that GetLowerBound()
and GetUpperBound()
work for sparse packs (or maybe make them separate functions that have a check if flat_==true
). I think this is clearer than exposing the pack size through GetMaxNumberOfVars()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe better to just call them GetFlatLowerBound()
and GetFlatUpperBound()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to make them valid and just re-use the API. I'm not sure if we should come up with new or more generic names, I just chose not to for now. We can revisit that later maybe.
src/interface/sparse_pack_base.cpp
Outdated
if (!desc.flat) { | ||
pack.bounds_h_(0, b, i) = idx; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we still set this with the actual block index, i.e. pack.bounds_h_(0, block, i) = idx
? I am not sure what the use case would be, but it would allow all of the Get*Bound
stuff to work returning the flat index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually desired behaviour though? I wonder if GetBounds
should return 0, and nvars always in the flat case? I could add an if statement to that effect maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now updated GetLowerBound
returns 0 and GetUpperBound
returns total numbers of vars and blocks (i.e., pack size) for flat packs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
I don't have a strong opinion with respect to the index handling.
In the spirit of our ParArrayNDs
that transparently handle 0
I could see the appeal to also allow using the standard ()
operator and bounds being used with a PARTHENON_DEBUG_REQUIRE(!flat || b == 0,...)
.
Co-authored-by: Philipp Grete <pgrete@hs.uni-hamburg.de>
After incorporating @lroberts36 's suggestions, I broke a test, so I'm going to need a little time to debug. We should not wait for this PR for the next release. @pgrete . |
@lroberts36 this is ready for re-review. I believe I've incorporated all your comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we were on a different page about how to keep the (block, variable) indexing working for flat packs. I included some possible changes that are along the lines of what I was thinking. I think we probably don't want to do it because of the extra H2D copy, but another array could be added that maps from flat idx to block idx (along with functions for querying the pack for which block an flat idx corresponds to) so that e.g. the correct coordinates could be accessed from a flat pack.
src/interface/sparse_pack.hpp
Outdated
KOKKOS_INLINE_FUNCTION int GetLowerBound(const int b) const { return 0; } | ||
KOKKOS_INLINE_FUNCTION int GetLowerBound(const int b = 0) const { return 0; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should change for flat packs. GetLowerBound(b)
with b > 0
should return bounds_(0, b, 0)
and GetUpperBound(b)
should return bounds_(1, b, nvar_)
.
src/interface/sparse_pack.hpp
Outdated
KOKKOS_INLINE_FUNCTION int GetUpperBound(const int b = 0) const { | ||
return bounds_(1, !flat_ * b, !flat_ * nvar_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there should be any check for if the pack is flat here. This is why I think that there actually needs to be a new function that returns the total upper bound
KOKKOS_INLINE_FUNCTION int GetUpperBound() const {
assert(flat_);
return bounds_(1, GetNBlocks() - 1, nvar_);
}
src/interface/sparse_pack.hpp
Outdated
KOKKOS_INLINE_FUNCTION int GetUpperBoundHost(const int b) const { | ||
return bounds_h_(1, b, nvar_); | ||
KOKKOS_INLINE_FUNCTION int GetUpperBoundHost(const int b = 0) const { | ||
return bounds_h_(1, !flat_ * b, !flat_ * nvar_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also no flat check here.
src/interface/sparse_pack_base.cpp
Outdated
@@ -179,13 +187,12 @@ SparsePackBase SparsePackBase::Build(T *pmd, const PackDescriptor &desc) { | |||
} | |||
} | |||
// Record the maximum for easy access | |||
pack.bounds_h_(1, b, nvar) = idx - 1; | |||
pack.bounds_h_(1, !desc.flat * b, !desc.flat * nvar) = idx - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pack.bounds_h_(1, !desc.flat * b, !desc.flat * nvar) = idx - 1; | |
pack.bounds_h_(1, block, nvar) = idx - 1; |
src/interface/sparse_pack_base.cpp
Outdated
if (!desc.flat) { | ||
pack.bounds_h_(1, b, i) = idx - 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!desc.flat) { | |
pack.bounds_h_(1, b, i) = idx - 1; | |
} | |
pack.bounds_h_(1, b, i) = idx - 1; |
if (!desc.flat) { | ||
idx = 0; | ||
b = block; | ||
// JMM: This line could be unified with the coords_h line below, | ||
// but it would imply unnecessary copies in the case of non-flat | ||
// packs. | ||
coords_h(b) = pmbd->GetBlockPointer()->coords_device; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!desc.flat) { | |
idx = 0; | |
b = block; | |
// JMM: This line could be unified with the coords_h line below, | |
// but it would imply unnecessary copies in the case of non-flat | |
// packs. | |
coords_h(b) = pmbd->GetBlockPointer()->coords_device; | |
} | |
if (!desc.flat) { | |
idx = 0; | |
b = block; | |
} | |
coords_h(block) = pmbd->GetBlockPointer()->coords_device; |
src/interface/sparse_pack_base.cpp
Outdated
if (!desc.flat) { | ||
pack.bounds_h_(0, b, i) = idx; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!desc.flat) { | |
pack.bounds_h_(0, b, i) = idx; | |
} | |
pack.bounds_h_(0, block, i) = idx; |
src/interface/sparse_pack_base.cpp
Outdated
if (desc.flat) { | ||
coords_h(idx) = pmbd->GetBlockPointer()->coords_device; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (desc.flat) { | |
coords_h(idx) = pmbd->GetBlockPointer()->coords_device; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is potentially confusing that for regular backs GetCoords
takes the block number as an argument while for flat packs it takes the variable flat index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to be able to make this call, as I think there's a clear use-case: i.e., you need coords for your stencil op. However, I would be willing to give GetCoords
two different names?
src/interface/sparse_pack_base.cpp
Outdated
// Size is nvar + 1 to store the maximum idx for easy access | ||
pack.bounds_ = bounds_t("bounds", 2, nblocks, nvar + 1); | ||
pack.bounds_ = bounds_t("bounds", 2, desc.flat ? 1 : nblocks, !desc.flat * nvar + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pack.bounds_ = bounds_t("bounds", 2, desc.flat ? 1 : nblocks, !desc.flat * nvar + 1); | |
pack.bounds_ = bounds_t("bounds", 2, nblocks, nvar + 1); |
pack.bounds_h_ = Kokkos::create_mirror_view(pack.bounds_); | ||
|
||
pack.coords_ = coords_t("coords", nblocks); | ||
pack.coords_ = coords_t("coords", desc.flat ? max_size : nblocks); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pack.coords_ = coords_t("coords", desc.flat ? max_size : nblocks); | |
pack.coords_ = coords_t("coords", nblocks); |
@lroberts36 I don't think I understand what you want. Can we talk over MM or something? |
yep, I am on MM |
@lroberts36 take a look now. I've re-enabled original functionality for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple more comments/suggestions below, but this LGTM and I approve
src/interface/sparse_pack_base.cpp
Outdated
if (desc.flat) { | ||
coords_h(idx) = pmbd->GetBlockPointer()->coords_device; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is potentially confusing that for regular backs GetCoords
takes the block number as an argument while for flat packs it takes the variable flat index.
src/interface/sparse_pack_base.hpp
Outdated
KOKKOS_INLINE_FUNCTION | ||
Real &operator()(int idx, const int k, const int j, const int i) const { | ||
PARTHENON_DEBUG_REQUIRE(flat_, "Accessor only valid for flat packs"); | ||
return pack_(0, 0, idx)(k, j, i); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for providing operator()
in sparse_pack_base.hpp
? I was trying to confine the interface to just SparsePack
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a mistake. I'll remove it.
@pgrete the cuda regression test appears to be failing, without a useful error message, but it works on my machine. I'll dig in a little bit, but I wonder if something is wrong with the CI machine? |
Weird problem went away. Pulling the trigger. |
PR Summary
In riot I recently encountered an interesting use-case, which I needed to support. It's actually familiar. In the boundary communication kernels, we fuse the loop over meshblocks, over faces, and over variables. So the outermost index has a range that spans ALL variables over ALL boundary buffers and ALL meshblocks in a given
MeshData
object. We do this so that when we run with hierarchical parallelism, there's enough "chunks" of work in the outermost index space to saturate the device... i.e., we need at least as many elements in the outer range as we have warps on the GPU.I found myself needing the same capability for loops over variables in the interior of a block. I.e., I needed a hierarchical structure where the outer loop is over meshblocks and variables TOGETHER and the inner loop is over the 3D index space of the meshblock. Because of sparse, this outer loop needs to be a SINGLE loop. The number of variables on each meshblock might vary.
Here I implement this capability in @lroberts36 's
SparsePack
s, which was the easiest place. When built with thesparse
flag in the generic factory function, or with theGetFlat
factory function, the sparse pack builds with this shape.Note that I added a blurb describing this functionality, but the documentation for the
SparsePack
is very sparse, and I didn't fill it in.PR Checklist