Add Presto C++ config properties doc
steveburnett committed Jun 5, 2024
1 parent 7659fe4 commit 531ed39
Showing 2 changed files with 158 additions and 0 deletions.
1 change: 1 addition & 0 deletions presto-docs/src/main/sphinx/presto-cpp.rst
@@ -9,6 +9,7 @@ Note: Presto C++ is in active development. See :doc:`Limitations </presto_cpp/li

presto_cpp/features
presto_cpp/limitations
presto_cpp/properties

Overview
========
157 changes: 157 additions & 0 deletions presto-docs/src/main/sphinx/presto_cpp/properties.rst
@@ -0,0 +1,157 @@
===============================
Presto C++ Properties Reference
===============================

This section describes Presto C++ configuration properties.

The following is not a complete list of all configuration and
session properties, and does not include any connector-specific
catalog configuration properties. For information on catalog
configuration properties, see :doc:`Connectors </connector/>`.

.. contents::
    :local:
    :backlinks: none
    :depth: 1

Coordinator Properties
----------------------

Set the following configuration properties for the Presto coordinator exactly
as they are shown in this code block to enable the Presto coordinator's use of
Presto C++ workers.

.. code-block:: none

    native-execution-enabled=true
    optimizer.optimize-hash-generation=false
    regex-library=RE2J
    use-alternative-function-signatures=true
    experimental.table-writer-merge-operator-enabled=false

These Presto coordinator configuration properties are described here, in
alphabetical order.

``experimental.table-writer-merge-operator-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``boolean``
* **Default value:** ``true``

Merges TableWriter output before sending it to TableFinishOperator. Set this
property to ``false`` when running Presto C++ workers.

Optionally, use the session property ``experimental.table-writer-merge-operator-enabled = false``.

``native-execution-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``boolean``
* **Default value:** ``false``

This property is required when running Presto C++ workers because of
underlying differences in behavior from Java workers.

Optionally, use the session property ``native-execution-enabled = true``.

``optimizer.optimize-hash-generation``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``boolean``
* **Default value:** ``true``

Set this property to ``false`` when running Presto C++ workers
because Velox does not support optimized hash generation,
instead performing this optimization adaptively.

Optionally, use the session property ``optimizer.optimize-hash-generation = false``.


``regex-library``
^^^^^^^^^^^^^^^^^

* **Type:** ``string``
* **Allowed values:** ``RE2J``
* **Default value:** ``JONI``

Only `RE2J <https://github.com/google/re2j>`_ is currently supported by Velox.

Optionally, use the session property ``regex-library = RE2J``.


``use-alternative-function-signatures``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``boolean``
* **Default value:** ``false``

Some aggregation functions use generic intermediate types that are
not compatible with Velox aggregation function intermediate types. One
example is ``approx_distinct``, whose intermediate type is ``VARBINARY``.
Setting this property to ``true`` provides function signatures for built-in
aggregation functions that are compatible with Velox.

Worker Properties
-----------------

The configuration properties of Presto C++ workers are described here, in alphabetical order.

``async-data-cache-enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``boolean``
* **Default value:** ``true``

Whether to enable the in-memory data cache.

``query.max-memory-per-node``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``string``
* **Default value:** ``4GB``

Maximum amount of memory that each query can use on a worker node.

``query-memory-gb``
^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``38``

The total memory capacity that can be used across all query executions.
Memory for system usage, such as disk spilling and cache prefetch, is
not counted in query memory usage.

``query-reserved-memory-gb``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``4``

Specifies the amount of query memory capacity reserved
to ensure that each query has the minimum memory capacity it needs to run.
A query can allocate from the reserved query memory only if its current
capacity is less than the minimum memory capacity specified by
``memory-pool-reserved-capacity``.

Any capacity beyond that minimum must be allocated from the non-reserved
query memory.

``system-memory-gb``
^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``40``

Memory allocation limit enforced by the internal memory allocator.

Set ``system-memory-gb`` to the available machine memory of the deployment.
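
The three memory properties described above are sized relative to one another.
As a minimal sketch, assuming a worker with about 40 GB of memory available to
Presto C++, the documented defaults could be written out explicitly in the
worker configuration as follows. The values are the defaults listed above, not
tuning recommendations, and the comments reflect one reading of the property
descriptions.

.. code-block:: none

    # Assumption: roughly 40 GB of machine memory is available to this worker.
    # Overall allocation limit enforced by the internal memory allocator.
    system-memory-gb=40
    # Capacity shared by all query executions. The 2 GB difference from
    # system-memory-gb remains outside query memory, for example for
    # system usage such as disk spilling and cache prefetch.
    query-memory-gb=38
    # Portion of query memory reserved so that each query can reach its
    # minimum capacity.
    query-reserved-memory-gb=4

When the available machine memory differs, adjust ``system-memory-gb`` first
and then revisit the other two values so that they stay within it.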

``task.max-drivers-per-task``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``number of hardware CPUs``

Number of drivers to use per task. By default, this equals the number of
hardware CPUs on the worker.
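
To override the default, set the property explicitly in the worker
configuration. The value ``8`` below is purely illustrative and assumes a
deployment that wants fewer drivers per task than the host has CPUs.

.. code-block:: none

    # Hypothetical value: cap each task at 8 drivers instead of using
    # the number of hardware CPUs.
    task.max-drivers-per-task=8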
