Separate Depthwidth and Lossguide growing policy in fast histogram #4102

Merged
merged 73 commits into from
Feb 13, 2019

Member

@CodingCat CodingCat commented Feb 5, 2019

This is a follow-up PR to #4011 that separates the depth-wise and loss-guide growing policies in fast histogram, per our discussion in #4077.

closes #4077

@trivialfis @hcho3 @RAMitchell please help to review

@hcho3
Collaborator

hcho3 commented Feb 7, 2019

@CodingCat Can you rebase against the latest master?

@CodingCat CodingCat force-pushed the dist_fast_histogram_per_level branch from 38c8da2 to be17c1f Compare February 7, 2019 17:27
@CodingCat
Member Author

@trivialfis @hcho3 @RAMitchell ping for review

@hcho3
Collaborator

hcho3 commented Feb 10, 2019

Will review today

@trivialfis
Member

Glanced at the changes. Will leave comments today. I ran into some trouble with my network access recently. Sorry for the delay.

Member

@RAMitchell RAMitchell left a comment

I think we need better, more consistent logic around whether to build histograms or use the subtraction trick. For example, could we push much of this logic into a histogram class that tracks whether it needs to build a new histogram, can use the subtraction trick, or needs to sync? Then we just ask it for the histogram of a particular node and it decides what to do. This is just an idea; I would like some more thoughts, @CodingCat @hcho3.
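
One way to picture this suggestion is a small class that owns the cached histograms and decides, per request, whether to build from scratch or derive a node's histogram as parent minus sibling. The sketch below is only an illustration: `HistogramProvider`, `Get`, and the build callback are hypothetical names rather than XGBoost's API, and distributed syncing is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch; none of these names come from XGBoost.
using Histogram = std::vector<double>;

class HistogramProvider {
 public:
  // build_fn is the expensive path: scan the rows of a node and bin them.
  explicit HistogramProvider(std::function<Histogram(int)> build_fn)
      : build_fn_(std::move(build_fn)) {}

  // Return the histogram of node `nid`. Pass -1 for parent/sibling at the root.
  const Histogram& Get(int nid, int parent = -1, int sibling = -1) {
    auto cached = cache_.find(nid);
    if (cached != cache_.end()) return cached->second;

    auto p = cache_.find(parent);
    auto s = cache_.find(sibling);
    if (p != cache_.end() && s != cache_.end()) {
      // Subtraction trick: child = parent - sibling, bin by bin.
      assert(p->second.size() == s->second.size());
      Histogram h(p->second.size());
      for (std::size_t i = 0; i < h.size(); ++i) {
        h[i] = p->second[i] - s->second[i];
      }
      ++num_subtracted_;
      return cache_.emplace(nid, std::move(h)).first->second;
    }

    // Nothing to subtract from, so fall back to a full build.
    ++num_built_;
    return cache_.emplace(nid, build_fn_(nid)).first->second;
  }

  int num_built() const { return num_built_; }
  int num_subtracted() const { return num_subtracted_; }

 private:
  std::function<Histogram(int)> build_fn_;
  std::unordered_map<int, Histogram> cache_;
  int num_built_ = 0;
  int num_subtracted_ = 0;
};
```

The caller only asks for a node's histogram; whether it was built or subtracted is visible through the two counters, which could also feed performance reporting.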

Collaborator

@hcho3 hcho3 left a comment

I did a first pass over this PR. I agree with the overall decision to split loss-guide and depth-wise strategies.

The complex accounting for sibling relations (left_to_right_siblings and right_to_left_siblings) can potentially be simplified. Doesn't each tree node already encode whether it's a left child or right child? See

bool IsLeftChild() const {
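
To illustrate the point, a sibling can be looked up from the parent link and the left/right flag alone, without separate sibling maps. The `Node` struct and `SiblingOf` helper below are hypothetical stand-ins written for this sketch, not the actual tree node type in XGBoost.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal node; only the fields needed for the lookup.
struct Node {
  int parent = -1;
  int left = -1;
  int right = -1;
  bool is_left_child = false;

  bool IsRoot() const { return parent == -1; }
  bool IsLeftChild() const { return is_left_child; }
};

// Return the id of the sibling of `nid`, or -1 if `nid` is the root.
int SiblingOf(const std::vector<Node>& tree, int nid) {
  assert(nid >= 0 && static_cast<std::size_t>(nid) < tree.size());
  const Node& node = tree[nid];
  if (node.IsRoot()) return -1;
  const Node& parent = tree[node.parent];
  // A left child's sibling is the parent's right child, and vice versa.
  return node.IsLeftChild() ? parent.right : parent.left;
}
```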

@CodingCat
Member Author

@RAMitchell @hcho3 thank you very much for the review. I have addressed most of the comments (will continue working on the performance monitor class tomorrow).

Member

@trivialfis trivialfis left a comment

I concur with @RAMitchell that we can put more thought into the general structure of the histogram building code. Since this PR is about separating the two strategies, I compared the two ExpandWith* methods side by side, and I think there's a slight chance that we can share the split evaluation and new node initialization code between them. I'm OK with merging this after the issues mentioned in the comments are resolved, but it might be better to make the code clearer before merging.

@trivialfis
Member

trivialfis commented Feb 11, 2019

It seems the comments got a little messy. My machine froze during a review, and after rebooting, the previous review was not visible to me, so I had to do it again. After I published it, the previous one somehow showed up ... If you find similar comments, please ignore one of them.

@CodingCat
Member Author

I think I have addressed all comments, ready for the next round of review

@RAMitchell
Member

Is there some reason you didn't use the existing Monitor class?

* \struct Monitor

@CodingCat
Member Author

@RAMitchell I was not aware of this class. When I took a look at the Monitor class, I found it differs in several ways from the performance monitoring here, e.g. the format of the report and the log level. I am a bit hesitant about changing the behavior of performance reporting, either in hist or in other places that use that class.

I personally vote to unify the performance monitoring across updaters after this release

Collaborator

@hcho3 hcho3 left a comment

LGTM. I like the overall organization of the code. I have some minor stylistic comments

<< "tree_method=hist does not support multiple roots at this moment";
if (param_.grow_policy == TrainParam::kLossGuide) {
ExpandWithLossGuide(gmat, gmatb, column_matrix, p_fmat, p_tree, gpair_h);
while (!qexpand_loss_guided_->empty()) {
Collaborator

Let's just remove this loop, along with the 3-line comments?

}
} else {
ExpandWithDepthWidth(gmat, gmatb, column_matrix, p_fmat, p_tree, gpair_h);
}

// set all the rest expanding nodes to leaf
Collaborator

NVM, let's just remove the comment

hcho3 and others added 8 commits February 13, 2019 00:02
Co-Authored-By: CodingCat <CodingCat@users.noreply.github.com>
Collaborator

@hcho3 hcho3 left a comment

LGTM


unsigned timestamp = 0;
int num_leaves = 0;

for (int nid = 0; nid < p_tree->param.num_roots; ++nid) {
Collaborator

Once this is merged, I will file a follow-up PR to deprecate num_roots parameter.

@CodingCat
Member Author

thanks for the review @trivialfis @hcho3 @RAMitchell

@CodingCat CodingCat merged commit c18a366 into dmlc:master Feb 13, 2019
@Denisevi4

So depthwidth is now going to be per-level? :(

That's unfortunate, because we use depthwidth and the per-nodeness of fast histogram to do feature selection: adding a penalty to the loss change for unused features. Once a feature is used, no penalty is applied. With per-levelness it wouldn't work well, especially for deep trees. Was it really worth it?
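
A minimal sketch of the kind of penalty described here, assuming it is applied as a fixed deduction from the loss change of any feature not yet used in the model. `PenalizedLossChange`, `used_features`, and `penalty` are hypothetical names for illustration; the actual implementation is not shown in this thread.

```cpp
#include <unordered_set>

// Hypothetical helper: discount the split gain of features not used so far.
double PenalizedLossChange(double loss_chg, int feature,
                           const std::unordered_set<int>& used_features,
                           double penalty) {
  const bool already_used = used_features.count(feature) > 0;
  // Features already in the model compete with their raw loss change.
  return already_used ? loss_chg : loss_chg - penalty;
}
```

With per-node expansion, `used_features` can be updated after every split, so a later node sees a just-used feature as free. With per-level expansion, every node in a level would be evaluated against the same set, which is the concern raised in this comment.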

@CodingCat
Member Author

CodingCat commented Feb 14, 2019

@Denisevi4 it helps reduce communication overhead in distributed training and serves as the base for improving multi-core performance in the next release by increasing the maximum parallelism

@hcho3
Collaborator

hcho3 commented Feb 14, 2019

@Denisevi4 The per-nodeness of fast histogram was not so good for performance, due to 1) extra syncing needed in distributed setting, and 2) lack of parallelism in multi-core CPU
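
As a rough illustration of the first point, assume a simplified cost model in which the distributed setting needs one synchronization (for example, an allreduce) per histogram batch: per-node growth syncs once for every expanded node, while per-level growth syncs once for every non-empty level. This model and the function names are assumptions made for the sketch, not measurements or XGBoost code.

```cpp
#include <vector>

// nodes_per_level[d] is the number of nodes expanded at depth d.

// Assumed model: one sync for each expanded node.
int SyncsPerNode(const std::vector<int>& nodes_per_level) {
  int syncs = 0;
  for (int n : nodes_per_level) syncs += n;
  return syncs;
}

// Assumed model: one sync for each level that has at least one node.
int SyncsPerLevel(const std::vector<int>& nodes_per_level) {
  int syncs = 0;
  for (int n : nodes_per_level) syncs += (n > 0) ? 1 : 0;
  return syncs;
}
```

Under this model, a full binary tree expanded over four levels (1, 2, 4, and 8 nodes) needs 15 syncs per-node but only 4 per-level, and the gap widens with depth.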

@lock lock bot locked as resolved and limited conversation to collaborators May 15, 2019
Successfully merging this pull request may close these issues.

[proposed change] distinguish loss-guide and depth-guide in fast histogram
5 participants