NoLD issue in xgb.train and/or xgb.attr #5935

Closed
vnijs opened this issue Jul 24, 2020 · 11 comments · Fixed by #6378

@vnijs

vnijs commented Jul 24, 2020

CRAN has notified me of a NoLD problem in one of my R packages that seems to be attributable to XGBoost. Unfortunately, I haven't been able to figure out a way to test CRAN's NoLD setup without actually submitting the package to CRAN. If you could provide any assistance in figuring out where the issue is coming from, so I can report back to CRAN without getting several of my packages archived, that would be wonderful.

Error in finalizer(env) : 
  Inconsistent 'best_score' values between the closure state: 0.826267 and the xgb.attr: 0.826267
Calls: %>% ... do.call -> <Anonymous> -> xgb.train -> f -> finalizer

https://www.stats.ox.ac.uk/pub/bdr/noLD/radiant.model.out

Relevant source code: https://github.com/radiant-rstats/radiant.model/blob/master/R/gbt.R

@trivialfis
Member

Probably due to floating point error.

Inconsistent 'best_score' values between the closure state: 0.826267 and the xgb.attr: 0.826267

Seems pretty consistent at the displayed precision.
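
To illustrate how two values can look identical in the message and still fail an exact comparison, here is a minimal R sketch. The variable names and the helper `scores_match` are hypothetical, and it assumes (the thread does not confirm this) that the check behind the error compares two doubles exactly. The values are made up so that they differ only in the last bits.

```r
# Hypothetical values: b differs from a only in the last bits of the double.
a <- 0.829749
b <- a + .Machine$double.eps

# Both format to the same text at 6 significant digits ...
shown_a <- format(a, digits = 6)
shown_b <- format(b, digits = 6)

# ... yet an exact comparison treats them as different.
exactly_equal <- identical(a, b)

# Hypothetical helper: compare within an absolute tolerance instead.
scores_match <- function(x, y, tol = 1e-6) {
  abs(x - y) < tol
}

print(c(shown_a, shown_b))
print(exactly_equal)
print(scores_match(a, b))
```

Comparing within a tolerance is one common way to make such a check robust across builds; whether it suits this case depends on how the two values are produced.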

@vnijs
Author

vnijs commented Jul 24, 2020

Thanks for the quick reply @trivialfis. I agree the numbers do seem pretty consistent, but they are not the same according to CRAN. Is there a way to check whether this error is indeed attributable to XGBoost? I have been looking at the discussion at the link below but can't find any such issue in my own R code.

https://blog.r-hub.io/2019/05/21/nold/

@trivialfis
Member

Could you try again after we merge #5934? Without a way to reproduce it, I can only make guesses.

@vnijs
Author

vnijs commented Aug 6, 2020

@trivialfis I was able to create a reproducible example of the issue using only the xgboost package and the titanic dataset from the titanic package. If you run the code below with NoLD you will get the error below.

Since testing with R (NoLD) was a bit of a challenge, at least for me, I created a Docker image (vnijs/nold) with R-devel NoLD on Ubuntu, with RStudio and build tools available. See the repo (vnijs/NoLD/README.md) and the Docker Hub repo linked below. Please do let me know if I can help in any way.

https://github.com/vnijs/NoLD
https://hub.docker.com/r/vnijs/nold

[1]	train-auc:0.829749 
Will train until train_auc hasn't improved in 3 rounds.

[2]	train-auc:0.829749 
[3]	train-auc:0.829749 
[4]	train-auc:0.829749 
Stopping. Best iteration:
[1]	train-auc:0.829749

Error in finalizer(env) : 
  Inconsistent 'best_score' values between the closure state: 0.829749 and the xgb.attr: 0.829749

install.packages("titanic")
library(xgboost)
titanic <- titanic::titanic_train
titanic$Pclass <-  as.factor(titanic$Pclass)
dtx <- model.matrix(~ 0 + ., data = titanic[, c("Pclass", "Sex")])
dty <- titanic$Survived

xgboost::xgboost(
  data = dtx, 
  label = dty, 
  objective = "binary:logistic",
  eval_metric = "auc",
  nrounds = 100,
  early_stopping_rounds = 3
)

@trivialfis
Member

Let me try looking into it tomorrow.

trivialfis self-assigned this Aug 11, 2020

@hcho3
Collaborator

hcho3 commented Nov 12, 2020

@vnijs Hello, I had a chance to look at your script. I was not able to reproduce the error on my machine.

Commands I ran:

docker pull vnijs/nold
docker run --rm -it docker.io/vnijs/nold:latest bash

# Inside Docker
mkdir -p ~/R/x86_64-pc-linux-gnu-library/4.1
R -e "install.packages('xgboost', repos='https://cloud.r-project.org')"
Rscript test.R

Output:

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/titanic_0.1.0.tar.gz'
Content type 'application/x-gzip' length 71672 bytes (69 KB)
==================================================
downloaded 69 KB

* installing *source* package ‘titanic’ ...
** package ‘titanic’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (titanic)

The downloaded source packages are in
	‘/tmp/RtmpwnI2BB/downloaded_packages’
[1]	train-auc:0.829749 
Will train until train_auc hasn't improved in 3 rounds.

[2]	train-auc:0.829749 
[3]	train-auc:0.829749 
[4]	train-auc:0.829749 
Stopping. Best iteration:
[1]	train-auc:0.829749

##### xgb.Booster
raw: 4.5 Kb 
call:
  xgb.train(params = params, data = dtrain, nrounds = nrounds, 
    watchlist = watchlist, verbose = verbose, print_every_n = print_every_n, 
    early_stopping_rounds = early_stopping_rounds, maximize = maximize, 
    save_period = save_period, save_name = save_name, xgb_model = xgb_model, 
    callbacks = callbacks, objective = "binary:logistic", eval_metric = "auc")
params (as set within xgb.train):
  objective = "binary:logistic", eval_metric = "auc", validate_parameters = "TRUE"
xgb.attributes:
  best_iteration, best_msg, best_ntreelimit, best_score, niter
callbacks:
  cb.print.evaluation(period = print_every_n)
  cb.evaluation.log()
  cb.early.stop(stopping_rounds = early_stopping_rounds, maximize = maximize, 
    verbose = verbose)
# of features: 4 
niter: 4
best_iteration : 1 
best_ntreelimit : 1 
best_score : 0.829749 
best_msg : [1]	train-auc:0.829749 
nfeatures : 4 
evaluation_log:
 iter train_auc
    1  0.829749
    2  0.829749
    3  0.829749
    4  0.829749

@vnijs
Author

vnijs commented Nov 12, 2020

Thanks for following up @hcho3. I can reproduce with xgboost 1.1.1.1, which was the version on CRAN when I posted this. I installed the latest version just now inside the container but I'm still getting the same issue (see output below). Is your test.R script exactly the same as my code below? I assume so, but that is the only thing I can think of at the moment.

> packageVersion("xgboost")
[1] '1.2.0.1'
> library(xgboost)
> titanic <- titanic::titanic_train
> titanic$Pclass <-  as.factor(titanic$Pclass)
> dtx <- model.matrix(~ 0 + ., data = titanic[, c("Pclass", "Sex")])
> dty <- titanic$Survived
>
> xgboost::xgboost(
+   data = dtx,
+   label = dty,
+   objective = "binary:logistic",
+   eval_metric = "auc",
+   nrounds = 100,
+   early_stopping_rounds = 3
+ )
[1]     train-auc:0.829749
Will train until train_auc hasn't improved in 3 rounds.

[2]     train-auc:0.829749
[3]     train-auc:0.829749
[4]     train-auc:0.829749
Stopping. Best iteration:
[1]     train-auc:0.829749

Error in finalizer(env) :
  Inconsistent 'best_score' values between the closure state: 0.829749 and the xgb.attr: 0.829749

> R.version
               _
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status         Under development (unstable)
major          4
minor          1.0
year           2020
month          08
day            03
svn rev        78964
language       R
version.string R Under development (unstable) (2020-08-03 r78964)
nickname       Unsuffered Consequences

@hcho3
Collaborator

hcho3 commented Nov 12, 2020

@vnijs Yes, I used the same code that you posted.

@hcho3
Collaborator

hcho3 commented Nov 12, 2020

@vnijs Interesting, the error is only thrown when I run the script inside an interactive session. Rscript test.R does not trigger the error at all.

@hcho3
Collaborator

hcho3 commented Nov 12, 2020

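For some reason, R and Rscript resolve to different installations in the vnijs/nold container. R is the R-devel build under /usr/local/bin, while Rscript comes from /usr/bin, where R is the 4.0.2 release: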

jovyan@5d84438b2fc8:~$ which R
/usr/local/bin/R
jovyan@5d84438b2fc8:~$ which Rscript
/usr/bin/Rscript

jovyan@5d84438b2fc8:~$ /usr/bin/R --version
R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

jovyan@5d84438b2fc8:~$ R --version
R Under development (unstable) (2020-08-03 r78964) -- "Unsuffered Consequences"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.
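
When several R installations coexist like this, it can help to ask the running interpreter which build it is. The following is a small sketch using base R only; the list name `build_info` is just for illustration. On a build where long double is unavailable or was disabled, `.Machine$sizeof.longdouble` is documented to be 0.

```r
# Collect facts about the R build that is executing this script.
build_info <- list(
  r_home            = R.home(),                             # installation directory
  version           = R.version.string,                     # release vs. R-devel
  long_double       = unname(capabilities("long.double")),  # FALSE on noLD builds
  sizeof_longdouble = .Machine$sizeof.longdouble            # 0 when long double is disabled
)
str(build_info)
```

Running this through both `R` and `Rscript` (for example as the first lines of test.R) shows whether the two commands use the same installation.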

@hcho3
Collaborator

hcho3 commented Nov 12, 2020

Indeed, /opt/R-devel/bin/Rscript test.R reproduced the error nicely:

Installing package into ‘/home/jovyan/R/x86_64-pc-linux-gnu-library/4.1’
(as ‘lib’ is unspecified)
Error in contrib.url(repos, type) : 
  trying to use CRAN without setting a mirror
Calls: install.packages -> startsWith -> contrib.url
Execution halted
jovyan@5d84438b2fc8:~$ vim test.R
jovyan@5d84438b2fc8:~$ /opt/R-devel/bin/Rscript test.R
[1]	train-auc:0.829749 
Will train until train_auc hasn't improved in 3 rounds.

[2]	train-auc:0.829749 
[3]	train-auc:0.829749 
[4]	train-auc:0.829749 
Stopping. Best iteration:
[1]	train-auc:0.829749

Error in finalizer(env) : 
  Inconsistent 'best_score' values between the closure state: 0.829749 and the xgb.attr: 0.829749
