Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RMM integration plugin #5873
RMM integration plugin #5873
Changes from 45 commits
b7a322d
e15845d
812c209
a891112
b5eb54d
6abd4c0
2bdbc23
c723632
78c2254
a520fa1
a73391c
309efc0
fa4ec11
871fc29
48051df
c0a05ce
c12e0a6
a3e0e2f
3aeab69
862d580
2a064bf
ab4e7b4
dd05d7b
a4da8c5
789021f
f27d836
a4b86a9
e5eb262
4cf7f00
d023a50
abc64a3
1e7e42e
f1eeaff
1069ae0
99a7520
ecc16ec
92d1481
e74fd0d
2ee04b3
87422a2
9021a75
377580a
2f3c532
3df7cc3
567fb33
1e63c46
ad216c5
b4195cd
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we are constructing the predictor lazily?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The device vectors were being constructed before
cudaSetDevice()
was called. The device vectors need access to correct CUDA context at the time of construction, so I've delayed construction of the device vectors.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit worrying if this is necessary, not sure if we can guarantee this behaviour across xgboost. Also wouldn't it be easier to place DeviceModel inside a unique pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, this was not necessary, since Thrust's device vector builds its memory resource (MR) lazily for each GPU it was being used. On the other hand, if we use RMM allocator with device vectors, then the correct CUDA context needs to be set (with
cudaSetDevice()
) prior to the construction of the device vector.For now this line works, but in the longer term I can design a new device vector class that lazily constructs the device MR.
That won't work, because DeviceModel has a separate Init() function, and the correct CUDA context isn't set until we call Init() function.
xgboost/src/predictor/gpu_predictor.cu
Lines 244 to 245 in 1d22a9b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an earlier point in the program that we can set the device? e.g. in the learner as soon as it receives the parameter gpu_id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or you can use
HostDeviceVector
, which is lazy.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I replaced it with
HostDeviceVector
.