diff --git a/doc/gpu/index.rst b/doc/gpu/index.rst
index 2de2495a0721..d02f2a69c6aa 100644
--- a/doc/gpu/index.rst
+++ b/doc/gpu/index.rst
@@ -67,11 +67,6 @@ The experimental parameter ``single_precision_histogram`` can be set to True to
 
 The device ordinal can be selected using the ``gpu_id`` parameter, which defaults to 0.
 
-Multiple GPUs can be used with the ``gpu_hist`` tree method using the ``n_gpus`` parameter. which defaults to 1. If this is set to -1 all available GPUs will be used. If ``gpu_id`` is specified as non-zero, the selected gpu devices will be from ``gpu_id`` to ``gpu_id+n_gpus``, please note that ``gpu_id+n_gpus`` must be less than or equal to the number of available GPUs on your system. As with GPU vs. CPU, multi-GPU will not always be faster than a single GPU due to PCI bus bandwidth that can limit performance.
-
-.. note:: Enabling multi-GPU training
-
-  Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
 
 The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/build` for details.
 
@@ -82,6 +77,24 @@ The GPU algorithms currently work with CLI, Python and R packages. See :doc:`/bu
   param['max_bin'] = 16
   param['tree_method'] = 'gpu_hist'
 
+
+Single Node Multi-GPU
+=====================
+XGBoost supports multi-GPU training on a single machine by specifying the ``n_gpus`` parameter.
+
+Multiple GPUs can be used with the ``gpu_hist`` tree method via ``n_gpus``, which defaults to 1. If it is set to -1, all available GPUs are used. If ``gpu_id`` is specified as non-zero, the selected GPU devices are ``gpu_id`` to ``gpu_id+n_gpus``; note that ``gpu_id+n_gpus`` must be less than or equal to the number of GPUs available on your system. As with GPU versus CPU training, using multiple GPUs is not always faster than a single GPU, because PCI bus bandwidth can limit performance.
+
+.. note:: Enabling multi-GPU training
+
+  Default installation may not enable multi-GPU training. To use multiple GPUs, make sure to read :ref:`build_gpu_support`.
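+
+For example, to train with two GPUs (a sketch that assumes at least two GPUs are
+present; setting ``n_gpus`` to ``-1`` would instead use every available GPU):
+
+.. code-block:: python
+
+  param['tree_method'] = 'gpu_hist'
+  param['gpu_id'] = 0
+  param['n_gpus'] = 2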
+
+
+Multi-node Multi-GPU Training
+=============================
+XGBoost supports fully distributed GPU training using `Dask <https://dask.org/>`_.
+See the Python documentation :ref:`dask_api` and the worked examples in the XGBoost repository.
+
+
 Objective functions
 ===================
 Most of the objective functions implemented in XGBoost can be run on GPU. Following table shows current support status.
@@ -209,6 +222,7 @@ References
 Contributors
 =======
 Many thanks to the following contributors (alphabetical order):
+
 * Andrey Adinets
 * Jiaming Yuan
 * Jonathan C. McKinney
diff --git a/doc/python/python_api.rst b/doc/python/python_api.rst
index 63c3fdd587a8..5a19f02f6cfc 100644
--- a/doc/python/python_api.rst
+++ b/doc/python/python_api.rst
@@ -74,6 +74,8 @@ Callback API
 
 .. autofunction:: xgboost.callback.early_stop
 
+.. _dask_api:
+
 Dask API
 --------
 .. automodule:: xgboost.dask
@@ -83,3 +85,4 @@ Dask API
 .. autofunction:: xgboost.dask.create_worker_dmatrix
 
 .. autofunction:: xgboost.dask.get_local_data
+
diff --git a/python-package/xgboost/dask.py b/python-package/xgboost/dask.py
index 5f6d7db20fc3..18e496ffd5b4 100644
--- a/python-package/xgboost/dask.py
+++ b/python-package/xgboost/dask.py
@@ -43,6 +43,7 @@ def _start_tracker(n_workers):
 def get_local_data(data):
     """
     Unpacks a distributed data object to get the rows local to this worker
+
     :param data: A distributed dask data object
     :return: Local data partition e.g. numpy or pandas
     """
@@ -107,6 +108,7 @@ def run(client, func, *args):
     dask by default, unless the user overrides the nthread parameter.
     Note: Windows platforms are not officially supported. Contributions are
     welcome here.
+
     :param client: Dask client representing the cluster
     :param func: Python function to be executed by each worker. Typically
         contains xgboost training code.
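+
+    A minimal usage sketch (illustrative only; it assumes a connected
+    ``dask.distributed.Client`` called ``client`` and dask collections ``X``
+    and ``y``, while ``train_fn`` is a user-defined helper rather than part
+    of the API)::
+
+        import xgboost as xgb
+
+        def train_fn(X, y):
+            # Build a DMatrix from the data partitions held by this worker.
+            dtrain = xgb.dask.create_worker_dmatrix(X, y)
+            # The workers are already connected for distributed training,
+            # so a regular train() call participates in the joint job.
+            return xgb.train({'tree_method': 'gpu_hist'}, dtrain)
+
+        results = xgb.dask.run(client, train_fn, X, y)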