

MultiThreading not working properly with version 0.71.2 #3543

Closed
xavierburtschell opened this issue Aug 1, 2018 · 16 comments
@xavierburtschell

Hello,

I'm using xgboost in R/RStudio Desktop (latest versions of both as of August 1, 2018) on a Dell laptop running Windows 10, with 4 cores / 8 logical processors and 16 GB RAM. Since I updated xgboost to 0.71.2, it has suddenly become very slow. After investigating, it appears it wasn't using multithreading/the CPU properly: all cores were used, but CPU usage wouldn't go beyond 50-60%, and the nthread entry in the params for xgb.train (for example) had close to no impact on performance. I rolled back to 0.71.1 and it's working fine again (for nthread = 8, all cores are used at 100% CPU usage).
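To be concrete, this is roughly how nthread is passed (a minimal sketch with a synthetic dataset; the parameter values here are illustrative, not my real model):

```r
library(xgboost)

# Synthetic regression data, just to exercise the trainer
set.seed(42)
x <- matrix(rnorm(1e4 * 50), ncol = 50)
y <- rnorm(1e4)
dtrain <- xgb.DMatrix(data = x, label = y)

params <- list(
  objective = "reg:linear",
  max_depth = 6,
  eta = 0.3,
  nthread = 8  # with 0.71.1 this drives all 8 logical processors to ~100%
)
model <- xgb.train(params = params, data = dtrain, nrounds = 10)
```

With 0.71.2, changing nthread in that list makes almost no difference to wall-clock time.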

Has anybody else noticed this issue? Is there a solution/fix for the latest version of xgboost?

Many thanks,

Xavier

@sjain777

sjain777 commented Aug 8, 2018

Yes, I'm having the same problem! Building models now takes almost twice as long as with earlier versions of xgboost. Could one of the authors please look into a fix for version 0.71.2? Thanks much!

@hcho3
Collaborator

hcho3 commented Aug 28, 2018

Do you have the same problem with version 0.80?

@hcho3
Collaborator

hcho3 commented Aug 30, 2018

Also, do you experience the same problem with Python package?

@xavierburtschell
Author

xavierburtschell commented Aug 30, 2018

@hcho3, I haven't tried either of those yet; for now I've decided to stay with 0.71.1. I will try 0.80 when I can.

@sjain777

Hi @hcho3, the following site shows version 0.71.2 as the latest available for Windows: https://cran.r-project.org/web/packages/xgboost/index.html. Where can I get 0.80 from? I would like to test it when the Windows binaries are available for 0.80. Thanks much.

@jeremiedb

I also noticed the slowdown from CRAN (0.71.2), but installing from current master (0.81.0.1 - compiled with VS) has similar performance to Python (0.80.0).
Strangely though, R took 12.8 sec. in my test vs 12.2 sec. for Python, but CPU usage capped at 65% during R training while at 100% with Python.
@sjain777 Installing xgboost from source is fairly straightforward following the docs: https://xgboost.readthedocs.io/en/latest/build.html

git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
git submodule init
git submodule update
cd R-package
R CMD INSTALL .
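After installing, it's worth confirming which build R actually loads (a quick check; the version string shown is just an example of what a build from master might report):

```r
# Confirm the installed xgboost build before benchmarking
library(xgboost)
packageVersion("xgboost")
```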

@hcho3
Collaborator

hcho3 commented Sep 23, 2018

@sjain777 0.80 is not yet available on CRAN, since some unit tests are failing spuriously on 32-bit Windows. We're working on it, but for now you'll have to compile from source.

@xavierburtschell Did you get around to trying the latest versions?

@xavierburtschell
Author

@hcho3 Still haven't tried; I'm sticking with 0.71.1 for now. I'm putting a model into production that uses it, so now is not the right time to try a new version. I'll try within the next few months, I think.

@hcho3
Collaborator

hcho3 commented Oct 27, 2018

Will investigate whether this is due to the choice of compiler (MinGW vs. Visual Studio). It has been reported that some OpenMP loops may perform worse when compiled with MinGW. See #2243 and https://medium.com/data-design/lightgbm-on-windows-visual-studio-vs-mingw-gcc-r-with-visual-studio-417fc14eca2c.

@hcho3
Collaborator

hcho3 commented Oct 28, 2018

Actually, it appears that the problem isn't restricted to Windows. Will investigate.

@hcho3
Collaborator

hcho3 commented Oct 28, 2018

I ran a small test script on EC2 and so far I'm not seeing performance degradation between versions.

Test script

library(xgboost)
library(microbenchmark)
set.seed(222)
N <- 2 * 10^5
p <- 350
x <- matrix(rnorm(N * p), ncol = p)
y <- rnorm(N)

microbenchmark(
  mymodel <- xgboost(data = x, label = y, nrounds = 5,
                     objective = "reg:linear", tree_method = "exact",
                     max_depth = 10, min_child_weight = 1, eta = 1,
                     subsample = 0.66, colsample_bytree = 0.33),
  times = 6)

microbenchmark(
  mymodel <- xgboost(data = x, label = y, nrounds = 5,
                     objective = "reg:linear", tree_method = "approx",
                     max_depth = 10, min_child_weight = 1, eta = 1,
                     subsample = 0.66, colsample_bytree = 0.33),
  times = 6)
| C5.9xlarge | 'Exact' run time (sec) | 'Approx' run time (sec) |
| --- | --- | --- |
| latest master (commit hash d81fedb) | 3.16 | 3.16 |
| 0.71.2 (commit hash 1214081) | 2.53 | 3.49 |
| 0.71.1 (commit hash 098075b) | 2.57 | 3.52 |
| 0.6.4 (commit hash ce84af7) | 2.91 | 3.18 |

| C5.18xlarge | 'Exact' run time (sec) | 'Approx' run time (sec) |
| --- | --- | --- |
| latest master (commit hash d81fedb) | 3.23 | 3.04 |
| 0.71.2 (commit hash 1214081) | 2.28 | 5.02 |
| 0.71.1 (commit hash 098075b) | 2.50 | 4.56 |
| 0.6.4 (commit hash ce84af7) | 2.58 | 4.47 |

@hcho3
Collaborator

hcho3 commented Oct 28, 2018

If anyone has detailed instructions to reproduce the bug, feel free to post them here. I will look at other bugs in the meantime.

@coder3344

coder3344 commented Jan 16, 2019

I get the same problem in v0.81: most of the time CPU usage stays at 99-100% (1 CPU core); occasionally it can reach 400% or more (I set 35 threads), but that can't be maintained for long.
When I roll back to v0.71, everything works perfectly.

xgboost install cmd: conda install -c conda-forge xgboost

@xavierburtschell
Author

xavierburtschell commented Jun 17, 2019

@hcho3 I just tried xgboost version 0.82.1, and the problem is solved! All cores/threads are used at 100%, with no noticeable difference in computation time between versions 0.71.1 and 0.82.1.
Thanks!
Just out of curiosity, any idea how version 0.82.1 solved it?

@hcho3
Collaborator

hcho3 commented Jul 3, 2019

@xavierburtschell We have an on-going effort to let XGBoost utilize multi-core CPUs better. See #3957, #4310, #4529.

@hcho3 hcho3 closed this as completed Jul 3, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 1, 2019