Fix dask API sphinx docstrings #4507

Merged 2 commits on May 28, 2019
24 changes: 19 additions & 5 deletions doc/gpu/index.rst
@@ -67,11 +67,6 @@ The experimental parameter ``single_precision_histogram`` can be set to True to

The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.

Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter, which defaults to 1. If this is set to -1, all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the selected GPU devices will be from ``gpu_id`` to ``gpu_id+n_gpus``; note that ``gpu_id+n_gpus`` must be less than or equal to the number of GPUs available on your system. As with GPU vs. CPU, multi-GPU training will not always be faster than a single GPU, since PCI bus bandwidth can limit performance.

.. note:: Enabling multi-GPU training

Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.

The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/build` for details.

@@ -82,6 +77,24 @@ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/bu
param['max_bin'] = 16
param['tree_method'] = 'gpu_hist'
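
A minimal sketch of how these parameters might be passed to ``xgboost.train``; the synthetic data below is illustrative only and not part of this documentation change.

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    # Illustrative data; any numeric feature matrix and binary label vector will do.
    X = np.random.rand(1000, 10)
    y = np.random.randint(2, size=1000)
    dtrain = xgb.DMatrix(X, label=y)

    param = {'max_bin': 16,
             'tree_method': 'gpu_hist',
             'objective': 'binary:logistic'}
    bst = xgb.train(param, dtrain, num_boost_round=10)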


Single Node Multi-GPU
=====================
Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter, which defaults to 1. If this is set to -1, all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the selected GPU devices will be from ``gpu_id`` to ``gpu_id+n_gpus``; note that ``gpu_id+n_gpus`` must be less than or equal to the number of GPUs available on your system. As with GPU vs. CPU, multi-GPU training will not always be faster than a single GPU, since PCI bus bandwidth can limit performance.

.. note:: Enabling multi-GPU training

Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
XGBoost supports multi-GPU training on a single machine by specifying the ``n_gpus`` parameter.
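
A minimal sketch of selecting several devices with ``n_gpus`` and ``gpu_id``, assuming a build with multi-GPU support and at least two visible GPUs; ``dtrain`` is the DMatrix from the previous sketch.

.. code-block:: python

    # Assumes XGBoost was built with multi-GPU support and that at least
    # two GPUs are visible to this process.
    param = {'tree_method': 'gpu_hist',
             'gpu_id': 0,   # first device to use
             'n_gpus': 2}   # use two devices; -1 would use all available GPUs
    bst = xgb.train(param, dtrain, num_boost_round=10)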


Multi-node Multi-GPU Training
=============================
XGBoost supports fully distributed GPU training using `Dask
<https://dask.org/>`_. See the Python documentation (:ref:`dask_api`) and worked examples `here
<https://github.com/dmlc/xgboost/tree/master/demo/dask>`_.
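
A rough, hedged sketch of this workflow: the scheduler address, input files, and ``label`` column below are placeholders, and the way ``create_worker_dmatrix`` consumes the distributed collections is an assumption; the linked demos are the authoritative reference.

.. code-block:: python

    import xgboost as xgb
    import dask.dataframe as dd
    from dask.distributed import Client
    from xgboost.dask import run, create_worker_dmatrix

    def train(X, y):
        # Executed on every worker; builds a DMatrix from the partitions
        # local to that worker (argument handling assumed to mirror DMatrix).
        dtrain = create_worker_dmatrix(X, y)
        params = {'tree_method': 'gpu_hist'}
        return xgb.train(params, dtrain, num_boost_round=10)

    client = Client('scheduler-address:8786')   # placeholder scheduler address
    df = dd.read_csv('train-*.csv')             # placeholder input files
    X = df.drop('label', axis=1)
    y = df['label']
    results = run(client, train, X, y)          # launch ``train`` on every worker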


Objective functions
===================
Most of the objective functions implemented in XGBoost can be run on the GPU. The following table shows the current support status.
@@ -209,6 +222,7 @@ References
Contributors
============
Many thanks to the following contributors (alphabetical order):

* Andrey Adinets
* Jiaming Yuan
* Jonathan C. McKinney
3 changes: 3 additions & 0 deletions doc/python/python_api.rst
@@ -74,6 +74,8 @@ Callback API

.. autofunction:: xgboost.callback.early_stop

.. _dask_api:

Dask API
--------
.. automodule:: xgboost.dask
@@ -83,3 +85,4 @@ Dask API
.. autofunction:: xgboost.dask.create_worker_dmatrix

.. autofunction:: xgboost.dask.get_local_data

2 changes: 2 additions & 0 deletions python-package/xgboost/dask.py
@@ -43,6 +43,7 @@ def _start_tracker(n_workers):
def get_local_data(data):
"""
Unpacks a distributed data object to get the rows local to this worker

:param data: A distributed dask data object
:return: Local data partition e.g. numpy or pandas
"""
@@ -107,6 +108,7 @@ def run(client, func, *args):
dask by default, unless the user overrides the nthread parameter.

Note: Windows platforms are not officially supported. Contributions are welcome here.

:param client: Dask client representing the cluster
:param func: Python function to be executed by each worker. Typically contains xgboost
training code.