Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring monotone constraints (linked to #2305) #2717

Conversation

CharlesAuguste
Copy link
Contributor

@aldanor @redditur

@guolinke @chivee @jameslamb @huanzhang12 @Laurae2 @StrikerRUS
You were all reviewers of the PR #2305. This PR was judged to be not merge-able because it is too big. Therefore, in my ultimate comment (#2305 (comment)) I said I would split into smaller PR easier to merge. This is the first PR in relation with #2305.

The goal of this PR is just to refactor the current code around monotone constraints. The behavior of the library should not be affected by this PR. Refactoring the monotone constraints that way was asked by @guolinke in the previous PR (from the comment #2305 (comment)). We think that refactoring is very important because it will make it easier to develop more efficient constraining methods.

The main issue with this PR is that I was not able to refactor the files data_parallel_tree_learner.cpp, gpu_tree_learner.cpp, voting_parallel_tree_learner because I am not familiar with them (already mentioned in #2305 (comment) and in the discussion below). I can spend more time on it, but I would really appreciate getting help on that specific part if that is possible for you. @guolinke Since this refactor is something you were looking forward to, are you willing to fix these files for us? If you are familiar with these tree learners, I hope that would not be too much work. Please let me know what you think. Thank you very much.

@CharlesAuguste CharlesAuguste changed the title Refactor monotone constraints (linked to #2305) Refactoring monotone constraints (linked to #2305) Jan 29, 2020
#ifndef LIGHTGBM_TREELEARNER_MONOTONE_CONSTRAINTS_H_
#define LIGHTGBM_TREELEARNER_MONOTONE_CONSTRAINTS_H_

#include <algorithm>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CharlesAuguste include vector and cstdint?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And limits

@guolinke
Copy link
Collaborator

As #2699 will significantly improve training efficiency. Could you wait for its merge? it is almost ready, except the GPU codes.

@CharlesAuguste CharlesAuguste force-pushed the PR1-refactor-code-to-prepare-for-monotone-constraints branch from 63dbcf4 to 5ce7bdd Compare February 3, 2020 14:45
@CharlesAuguste
Copy link
Contributor Author

@guolinke I just rebased on your last PR. I also improved the compilation of this PR, and now only GPU is failing (but that's to be expected because I was not able to fix the file as explained in my initial comment). Let me know what you think. Thanks

@guolinke
Copy link
Collaborator

guolinke commented Feb 5, 2020

@CharlesAuguste I create PR in your repo

@CharlesAuguste
Copy link
Contributor Author

@guolinke I merged your fix, we pass many more checks now. Thanks for that! However, there are still a few checks that we are not passing. These seem related to the docs and to the R package, but I don't understand how that is related to this PR. Any idea of what might be wrong? Thanks

@guolinke
Copy link
Collaborator

guolinke commented Feb 7, 2020

@CharlesAuguste I think they are temporary fails.
Also, ping @StrikerRUS and @jameslamb .

@aldanor
Copy link

aldanor commented Feb 7, 2020

Yea, it looks like some docs-related server is down, causing temporary connection errors.

@StrikerRUS
Copy link
Collaborator

Yea, it looks like some docs-related server is down, causing temporary connection errors.

I just re-run the test. Now everything is OK. Sorry about the inconvenience.

For R tests please refer to #2748 (comment).

@jameslamb
Copy link
Collaborator

Yea, it looks like some docs-related server is down, causing temporary connection errors.

I just re-run the test. Now everything is OK. Sorry about the inconvenience.

For R tests please refer to #2748 (comment).

@StrikerRUS @CharlesAuguste @aldanor I think #2752 will fix the R test issues, let's see.

@guolinke
Copy link
Collaborator

guolinke commented Feb 9, 2020

@CharlesAuguste could you fix the conflicts?

@CharlesAuguste
Copy link
Contributor Author

No problem, I will do fix them.

@CharlesAuguste CharlesAuguste force-pushed the PR1-refactor-code-to-prepare-for-monotone-constraints branch from 57f427c to a195864 Compare February 10, 2020 11:02
@CharlesAuguste
Copy link
Contributor Author

@guolinke I solved the merge conflicts, and all checks seem to pass. So it should be ready to merge if you are happy enough with the PR now :).

@guolinke guolinke merged commit 3670e47 into microsoft:master Feb 10, 2020
StrikerRUS pushed a commit that referenced this pull request Mar 23, 2020
#2770)

* Add util functions.

* Added monotone_constraints_method as a parameter.

* Add the intermediate constraining method.

* Updated tests.

* Minor fixes.

* Typo.

* Linting.

* Ran the parameter generator for the doc.

* Removed usage of the FeatureMonotone function.

* more fixes

* Fix.

* Remove duplicated code.

* Add debug checks.

* Typo.

* Bug fix.

* Disable the use of intermediate monotone constraints and feature sampling at the same time.

* Added an alias for monotone constraining method.

* Use the right variable to get the number of threads.

* Fix DEBUG checks.

* Add back check to determine if histogram is splittable.

* Added forgotten override keywords.

* Perform monotone constraint update only when necessary.

* Small refactor of FastLeafConstraints.

* Post rebase commit.

* Small refactor.

* Typo.

* Added comment and slightly improved logic of monotone constraints.

* Forgot a const.

* Vectors that are to be modified need to be pointers.

* Rename FastLeafConstraints to IntermediateLeafConstraints to match documentation.

* Remove overload of GoUpToFindLeavesToUpdate.

* Stop memory leaking.

* Fix cpplint issues.

* Fix checks.

* Fix more cpplint issues.

* Refactor config monotone constraints method.

* Typos.

* Remove useless empty lines.

* Add new line to separate includes.

* Replace unsigned ind by size_t.

* Reduce number of trials in tests to decrease CI time.

* Specify monotone constraints better in tests.

* Removed outer loop in test of monotone constraints.

* Added categorical features to the monotone constraints tests.

* Add blank line.

* Regenerate parameters automatically.

* Speed up ShouldKeepGoingLeftRight.

Co-authored-by: Charles Auguste <auguste@dubquantdev801.ire.susq.com>
Co-authored-by: guolinke <guolin.ke@outlook.com>
jameslamb pushed a commit to jameslamb/LightGBM that referenced this pull request Mar 24, 2020
…, microsoft#2717)  (microsoft#2770)

* Add util functions.

* Added monotone_constraints_method as a parameter.

* Add the intermediate constraining method.

* Updated tests.

* Minor fixes.

* Typo.

* Linting.

* Ran the parameter generator for the doc.

* Removed usage of the FeatureMonotone function.

* more fixes

* Fix.

* Remove duplicated code.

* Add debug checks.

* Typo.

* Bug fix.

* Disable the use of intermediate monotone constraints and feature sampling at the same time.

* Added an alias for monotone constraining method.

* Use the right variable to get the number of threads.

* Fix DEBUG checks.

* Add back check to determine if histogram is splittable.

* Added forgotten override keywords.

* Perform monotone constraint update only when necessary.

* Small refactor of FastLeafConstraints.

* Post rebase commit.

* Small refactor.

* Typo.

* Added comment and slightly improved logic of monotone constraints.

* Forgot a const.

* Vectors that are to be modified need to be pointers.

* Rename FastLeafConstraints to IntermediateLeafConstraints to match documentation.

* Remove overload of GoUpToFindLeavesToUpdate.

* Stop memory leaking.

* Fix cpplint issues.

* Fix checks.

* Fix more cpplint issues.

* Refactor config monotone constraints method.

* Typos.

* Remove useless empty lines.

* Add new line to separate includes.

* Replace unsigned ind by size_t.

* Reduce number of trials in tests to decrease CI time.

* Specify monotone constraints better in tests.

* Removed outer loop in test of monotone constraints.

* Added categorical features to the monotone constraints tests.

* Add blank line.

* Regenerate parameters automatically.

* Speed up ShouldKeepGoingLeftRight.

Co-authored-by: Charles Auguste <auguste@dubquantdev801.ire.susq.com>
Co-authored-by: guolinke <guolin.ke@outlook.com>
@lock lock bot locked as resolved and limited conversation to collaborators Apr 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants