Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add number of columns to native data iterator. #5202

Merged
merged 1 commit into from
Feb 25, 2020

Conversation

trivialfis
Copy link
Member

@trivialfis trivialfis commented Jan 14, 2020

This helps mitigating imputing sparse dataset from JVM package, where some columns might be completely missing. @wbo4958 will submit a follow up PR on JVM side.

For consistency and simplicity, NativeDataIter is turned into an adapter.

As a new data field is added, we break the ABI.

@trivialfis
Copy link
Member Author

@CodingCat

Copy link
Member

@RAMitchell RAMitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Last time I looked at the NativeDataIter I had the feeling that it uses a lot of extra memory.

@trivialfis
Copy link
Member Author

We do something similar on dask, but delegate the task to np/pd.

@trivialfis
Copy link
Member Author

Pls don't merge.

@trivialfis trivialfis changed the title Add number of columns to native data iterator. [WIP] Add number of columns to native data iterator. Jan 15, 2020
@trivialfis
Copy link
Member Author

Will wait for the corresponding jvm PR and merge them together.

* Change native data iter into an adapter.
@trivialfis trivialfis changed the title [WIP] Add number of columns to native data iterator. Add number of columns to native data iterator. Feb 23, 2020
Copy link
Member

@RAMitchell RAMitchell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, it seems to fit the adapter pattern well and now we have more consistency in data construction.

@trivialfis trivialfis merged commit f2b8cd2 into dmlc:master Feb 25, 2020
@trivialfis trivialfis deleted the jvm-get-num-columns branch February 25, 2020 15:42
@lock lock bot locked as resolved and limited conversation to collaborators Jun 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants