Separate Depthwidth and Lossguide growing policy in fast histogram #4102
Conversation
@CodingCat Can you rebase against the latest master?
Force-pushed from 38c8da2 to be17c1f.
@trivialfis @hcho3 @RAMitchell ping for review
Will review today.
I glanced over the changes and will leave comments today. I ran into some trouble with my network access recently; sorry for the delay.
I think we need better, more consistent logic around whether to build histograms or use the subtraction trick. For example, could we push a lot of this logic into some histogram class that maintains the state of whether it needs to build a new histogram, can use the subtraction trick, or needs to sync? Then we just ask it for the histogram for a particular node and it decides what to do. This is just an idea; I would like some more thoughts. @CodingCat @hcho3
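To make the suggestion concrete, here is a minimal, hypothetical sketch of such a class (all names are illustrative, not xgboost's actual API): it owns per-node histograms and, when asked for a node, decides on its own whether to use the subtraction trick (child = parent − sibling) or fall back to a full build.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch: a store that hides the build-vs-subtract decision.
// Callers just ask Get(nid); the store picks the cheapest way to answer.
class HistogramStore {
 public:
  // Record that `parent` was split into children `left` and `right`.
  void RegisterSplit(int parent, int left, int right) {
    parent_[left] = parent;   sibling_[left] = right;
    parent_[right] = parent;  sibling_[right] = left;
  }

  // Return the histogram for `nid`, computing it lazily if needed.
  const std::vector<double>& Get(int nid) {
    if (hist_.count(nid)) return hist_[nid];
    auto p = parent_.find(nid);
    if (p != parent_.end() && hist_.count(p->second) &&
        hist_.count(sibling_[nid])) {
      // Subtraction trick: this node's histogram = parent - sibling.
      const auto& ph = hist_[p->second];
      const auto& sh = hist_[sibling_[nid]];
      std::vector<double> h(ph.size());
      for (std::size_t i = 0; i < ph.size(); ++i) h[i] = ph[i] - sh[i];
      hist_[nid] = std::move(h);
    } else {
      hist_[nid] = BuildFromData(nid);  // fall back to a full build
    }
    return hist_[nid];
  }

  // Stand-in for scanning the quantized matrix; a dummy in this sketch.
  std::vector<double> BuildFromData(int nid) {
    (void)nid;
    return std::vector<double>(4, 1.0);
  }

  void Set(int nid, std::vector<double> h) { hist_[nid] = std::move(h); }

 private:
  std::unordered_map<int, std::vector<double>> hist_;
  std::unordered_map<int, int> parent_, sibling_;
};
```

A distributed-sync decision could be folded into the same `Get` path, so the expansion loops would no longer need to track any of this themselves.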
I did a first pass over this PR. I agree with the overall decision to split loss-guide and depth-wise strategies.
The complex accounting for sibling relations (`left_to_right_siblings` and `right_to_left_siblings`) can potentially be simplified. Doesn't each tree node already encode whether it's a left child or a right child? See `xgboost/include/xgboost/tree_model.h`, line 132 in 99a2904:

bool IsLeftChild() const {
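The point above can be sketched with a toy tree (this mirrors, but does not reproduce, xgboost's `RegTree`; all names here are illustrative): if each node stores its parent, the sibling can be derived on demand, so explicit sibling maps are unnecessary.

```cpp
#include <cassert>
#include <vector>

// Toy binary tree where sibling lookup needs no extra bookkeeping maps.
struct Tree {
  struct Node { int parent, left, right; };
  std::vector<Node> nodes;

  int AddRoot() {
    nodes.push_back({-1, -1, -1});
    return 0;
  }

  void Split(int nid) {
    int l = static_cast<int>(nodes.size()), r = l + 1;
    nodes.push_back({nid, -1, -1});  // left child
    nodes.push_back({nid, -1, -1});  // right child
    nodes[nid].left = l;
    nodes[nid].right = r;
  }

  bool IsLeftChild(int nid) const {
    int p = nodes[nid].parent;
    return p >= 0 && nodes[p].left == nid;
  }

  // Sibling derived purely from tree structure.
  int Sibling(int nid) const {
    int p = nodes[nid].parent;
    if (p < 0) return -1;  // the root has no sibling
    return IsLeftChild(nid) ? nodes[p].right : nodes[p].left;
  }
};
```

With something like `Sibling`, the subtraction-trick bookkeeping in the PR could ask the tree directly instead of maintaining parallel maps.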
@RAMitchell @hcho3 Thank you very much for the review. I have addressed most of the comments (I will continue working on the performance monitor class tomorrow).
I concur with @RAMitchell that we can put more thought into the general structure of the histogram-building code. Since this PR is about separating the two strategies, I tried to compare the two ExpandWith* methods side by side, and I think there's a chance we can share the split-evaluation and new-node-initialization code between the two methods. I'm OK with merging this after the issues mentioned in comments are resolved, but it might be better to bring more clarity to the code before merging.
...kages/xgboost4j-spark/src/test/scala/ml/dmlc/xgboost4j/scala/spark/XGBoostGeneralSuite.scala
It seems the comments got a little bit messy, because my machine froze during a review. After rebooting, the previous review was not visible to me, so I had to do it again; after publishing, the previous one somehow showed up. If you find similar comments, please ignore one of them.
Force-pushed from 5fc0b11 to 79f7d31.
I think I have addressed all comments; ready for the next round of review.
Is there some reason you didn't use the existing Monitor class? See line 41 in 99a2904.
@RAMitchell I was not aware of this class. When I took a look at the Monitor class there, I found it has several differences from the performance monitoring here, e.g. the format of the report and the log level. I am a bit hesitant about changing the behavior of performance reporting, either in hist or in the other places using that class. I personally vote to unify the performance monitoring across updaters after this release.
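For context on what a unified monitor might look like, here is a rough, hypothetical sketch (this is not xgboost's `common::Monitor`; the class and method names are invented for illustration): named start/stop pairs accumulate elapsed time, and the report format and log level could then be configured in one place rather than per updater.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Illustrative timing monitor: Start/Stop accumulate wall time per label.
class SimpleMonitor {
  using Clock = std::chrono::steady_clock;

 public:
  void Start(const std::string& name) { start_[name] = Clock::now(); }

  void Stop(const std::string& name) {
    elapsed_[name] +=
        std::chrono::duration<double>(Clock::now() - start_[name]).count();
  }

  // Total seconds accumulated for `name`; 0.0 if never stopped.
  double Elapsed(const std::string& name) const {
    auto it = elapsed_.find(name);
    return it == elapsed_.end() ? 0.0 : it->second;
  }

 private:
  std::map<std::string, Clock::time_point> start_;
  std::map<std::string, double> elapsed_;
};
```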
LGTM. I like the overall organization of the code. I have some minor stylistic comments.
src/tree/updater_quantile_hist.cc
Outdated
<< "tree_method=hist does not support multiple roots at this moment";
if (param_.grow_policy == TrainParam::kLossGuide) {
  ExpandWithLossGuide(gmat, gmatb, column_matrix, p_fmat, p_tree, gpair_h);
  while (!qexpand_loss_guided_->empty()) {
Let's just remove this loop, along with the three-line comment?
src/tree/updater_quantile_hist.cc
Outdated
  }
} else {
  ExpandWithDepthWidth(gmat, gmatb, column_matrix, p_fmat, p_tree, gpair_h);
}
…
// set all the rest expanding nodes to leaf
NVM, let's just remove the comment
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
LGTM
unsigned timestamp = 0;
int num_leaves = 0;
…
for (int nid = 0; nid < p_tree->param.num_roots; ++nid) {
Once this is merged, I will file a follow-up PR to deprecate the num_roots parameter.
Thanks for the review @trivialfis @hcho3 @RAMitchell
So depthwidth is now going to be per-level? :( That's unfortunate, because we use depthwidth and the per-node nature of fast histogram to do feature selection, adding a penalty to the loss change for unused features. Once a feature is used, no penalty is applied. With per-level growth it wouldn't work well, especially for deep trees. Was it really worth it?
@Denisevi4 It helps reduce communication overhead in distributed training and serves as the base for improving multi-core performance in the next release by increasing the maximum parallelism.
@Denisevi4 The per-node nature of fast histogram was not so good for performance, due to 1) the extra syncing needed in the distributed setting, and 2) the lack of parallelism on multi-core CPUs.
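The parallelism point can be illustrated with a small sketch (hypothetical helper, not xgboost code): under depth-wise growth, all nodes at the same depth can be grouped and have their histograms built in one batched pass, with a single distributed sync per level, whereas loss-guide expands one node at a time.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Group node ids by their depth. Under depth-wise growth, each group can be
// processed as one parallel batch and synced with one AllReduce per level,
// instead of one sync per node.
std::map<int, std::vector<int>> GroupNodesByDepth(
    const std::vector<int>& depth_of_node) {
  std::map<int, std::vector<int>> levels;
  for (int nid = 0; nid < static_cast<int>(depth_of_node.size()); ++nid) {
    levels[depth_of_node[nid]].push_back(nid);
  }
  return levels;
}
```

For a complete binary tree of depth d, this turns 2^d − 1 per-node syncs into d per-level syncs, which is the communication saving mentioned above.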
This is a follow-up to PR #4011; it separates the depthwidth and loss-guide growing policies in fast histogram, per our discussion in #4077.
Closes #4077.
@trivialfis @hcho3 @RAMitchell please help review.