Fix matrix attributes not sliced #4311

Merged 1 commit on Apr 10, 2019

@jeffzi (Contributor) commented Mar 29, 2019

In the R package, users can pass additional information to a custom objective function by attaching attributes to the DMatrix. xgb.cv relies on slice.xgb.DMatrix to slice the folds.

However, those attributes are sliced only if they are vectors whose length equals the number of rows of the DMatrix. In a multiclass problem, it makes sense to use a matrix instead of a vector for additional information about the classes.

MWE:

```r
library(xgboost)

data(agaricus.train, package = "xgboost")

dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
attr(dtrain, "vec") <- getinfo(dtrain, "label")
nrow(dtrain)
#> [1] 6513
train_len <- length(attr(dtrain, "vec"))
attr(dtrain, "mat") <- matrix(runif(train_len * 2), nrow = train_len, ncol = 2)
length(attr(dtrain, "mat")) # != nrow(dtrain), will not be sliced
#> [1] 13026
dim(attr(dtrain, "mat"))
#> [1] 6513    2

dtrain <- slice(dtrain, 1:20)
length(attr(dtrain, "vec"))
#> [1] 20
attr(dtrain, "mat")
#> NULL
```

Created on 2019-03-29 by the reprex package (v0.2.1)

This PR fixes the issue by testing NROW instead of length and by slicing tabular attributes by row, similar to what is done for the DMatrix itself.
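To illustrate the idea outside of xgboost, here is a minimal base-R sketch. `slice_attrs` is a hypothetical helper written only for this example, not the code changed in this PR, and it assumes the attributes are held in a plain named list. Vectors are sliced by element and anything with a `dim` is sliced by row.

```r
# Hypothetical helper (not part of xgboost): keep only the attributes whose
# row count matches n, then slice them with the row indices in idx.
slice_attrs <- function(attrs, n, idx) {
  # NROW() treats a plain vector as a one-column matrix, so vectors and
  # matrices can be tested the same way.
  keep <- vapply(attrs, function(a) NROW(a) == n, logical(1))
  lapply(attrs[keep], function(a) {
    if (is.null(dim(a))) {
      a[idx]                    # plain vector: slice by element
    } else {
      a[idx, , drop = FALSE]    # matrix or data frame: slice by row
    }
  })
}

attrs <- list(
  vec   = seq_len(6),
  mat   = matrix(seq_len(12), nrow = 6, ncol = 2),
  other = "not row-aligned"     # NROW is 1, so it is left out
)

sliced <- slice_attrs(attrs, n = 6, idx = 1:3)
length(sliced$vec)
#> [1] 3
dim(sliced$mat)
#> [1] 3 2
```

Using `drop = FALSE` keeps the result a matrix even when a single row is selected, so a custom objective can rely on the attribute's shape.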

@codecov-io commented Mar 29, 2019

Codecov Report

Merging #4311 into master will increase coverage by <.01%.
The diff coverage is n/a.

```diff
@@            Coverage Diff             @@
##           master    #4311      +/-   ##
==========================================
+ Coverage   67.82%   67.82%   +<.01%     
==========================================
  Files         132      132              
  Lines       12201    12202       +1     
==========================================
+ Hits         8275     8276       +1     
  Misses       3926     3926
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/tree/updater_colmaker.cc | 39.68% <0%> (+0.11%) | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7ea5b77...3ef8f1f.

@trivialfis (Member)

Hi @hetong007, could you help review this?

@hetong007 (Member)

LGTM.

@hetong007 merged commit 956e73f into dmlc:master on Apr 10, 2019
@lock bot locked as resolved and limited conversation to collaborators on Jul 9, 2019