Restructure and expand presto_cpp docs
steveburnett committed May 16, 2024
1 parent e344635 commit fccc180
Showing 7 changed files with 107 additions and 84 deletions.
2 changes: 1 addition & 1 deletion presto-docs/src/main/sphinx/index.rst
@@ -22,7 +22,7 @@ Presto Documentation
ecosystem
router
develop
-prestissimo
+presto-cpp
release

.. Note: If "release" is not the last item, the CSS must be updated.
25 changes: 0 additions & 25 deletions presto-docs/src/main/sphinx/prestissimo.rst

This file was deleted.

28 changes: 0 additions & 28 deletions presto-docs/src/main/sphinx/prestissimo/prestissimo-properties.rst

This file was deleted.

51 changes: 51 additions & 0 deletions presto-docs/src/main/sphinx/presto-cpp.rst
@@ -0,0 +1,51 @@
**********
Presto C++
**********

Note: Presto C++ is in active development. See :doc:`Limitations </presto_cpp/limitations>`.

.. toctree::
    :maxdepth: 1

    presto_cpp/features
    presto_cpp/limitations

Overview
========

Presto C++, sometimes referred to by the development name Prestissimo, is a
drop-in replacement for Presto workers written in C++ and based on the
`Velox <https://velox-lib.io/>`_ library.
It implements the same RESTful endpoints as Java workers using the Proxygen C++
HTTP framework.
Because communication with the Java coordinator and across workers is only
done using the REST endpoints, Presto C++ does not use JNI and does not
require a JVM on worker nodes.

Presto C++'s codebase is located at `presto-native-execution
<https://github.com/prestodb/presto/tree/master/presto-native-execution>`_.

Motivation and Vision
=====================

Presto aims to be the top performing system for data lakes.
To achieve this goal, the Presto community is moving the Presto
evaluation engine from the native Java-based implementation to a new
implementation written in C++ using `Velox <https://velox-lib.io/>`_.

By moving the evaluation engine to a library, the intent is to enable the
Presto community to focus on more features and better integration with table
formats and other data warehousing systems.

Supported Use Cases
===================

Only specific connectors are supported in the Presto C++ evaluation engine.

* The Hive connector is supported for reads and writes, including CTAS.

* Iceberg tables are supported only for reads.

* Iceberg connector supports both V1 and V2 tables, including tables with delete files.

* The TPCH connector is supported with the ``tpch.naming=standard`` catalog property.
@@ -1,6 +1,6 @@
-====================
-Prestissimo Features
-====================
+===================
+Presto C++ Features
+===================

.. contents::
    :local:
@@ -27,17 +27,17 @@ Other HTTP endpoints include:
* GET: v1/info
* GET: v1/status

-The request/response flow of Prestissimo is identical to Java workers. The
+The request/response flow of Presto C++ is identical to Java workers. The
tasks or new splits are registered via `TaskUpdateRequest`. Resource
utilization and query progress are sent to the coordinator via task endpoints.


Remote Function Execution
-------------------------

-Prestissimo supports remote execution of scalar functions. This feature is
+Presto C++ supports remote execution of scalar functions. This feature is
useful for cases when the function code is not written in C++, or if for
-security or flexibility reasons the function code cannot be linked to the same
+security or flexibility reasons, the function code cannot be linked to the same
executable as the main engine.

Remote function signatures need to be provided using a JSON file, following
@@ -114,7 +114,7 @@ function server. If specified, takes precedence over
JWT authentication support
--------------------------

-Prestissimo supports JWT authentication for internal communication.
+C++ based Presto supports JWT authentication for internal communication.
For details on the generally supported parameters visit `JWT <../security/internal-communication.html#jwt>`_.

There is also an additional parameter:
@@ -169,9 +169,9 @@ Size of the SSD cache when async data cache is enabled.
* **Default value:** ``true``
* **Presto on Spark default value:** ``false``

-Enable periodic clean up of old tasks. This is ``true`` for Prestissimo,
-however for Presto on Spark this defaults to ``false`` as zombie/stuck tasks
-are handled by spark via speculative execution.
+Enable periodic clean up of old tasks. The default value is ``true`` for Presto C++.
+For Presto on Spark this property defaults to ``false``, as zombie or stuck tasks
+are handled by Spark through speculative execution.

``old-task-cleanup-ms``
^^^^^^^^^^^^^^^^^^^^^^^
@@ -189,7 +189,7 @@ Old task is defined as a PrestoTask which has not received heartbeat for at leas
Session Properties
------------------

-The following are the native session properties for Prestissimo.
+The following are the native session properties for C++ based Presto.

``driver_cpu_time_slice_limit_ms``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
44 changes: 44 additions & 0 deletions presto-docs/src/main/sphinx/presto_cpp/limitations.rst
@@ -0,0 +1,44 @@
======================
Presto C++ Limitations
======================

.. contents::
    :local:
    :backlinks: none
    :depth: 1

General Limitations
===================

The C++ evaluation engine has a number of limitations:

* Not all built-in functions are implemented in C++. Attempting to use unimplemented functions results in a query failure. For supported functions, see `Function Coverage <https://facebookincubator.github.io/velox/functions/presto/coverage.html>`_.

* Not all built-in types are implemented in C++. Attempting to use unimplemented types results in a query failure.

* Certain parts of the plugin SPI are not used by the C++ evaluation engine. In particular, C++ workers will not load any plugin in the plugins directory, and certain plugin types are either partially or completely unsupported.

  * ``PageSourceProvider``, ``RecordSetProvider``, and ``PageSinkProvider`` do not work in the C++ evaluation engine.

  * User-supplied functions, types, parametric types, and block encodings are not supported.

  * The event listener plugin does not work at the split level.

  * User-defined functions do not work in the same way; see `Remote Function Execution <features.html#remote-function-execution>`_.

* Memory management works differently in the C++ evaluation engine. In particular:

  * The OOM killer is not supported.
  * The reserved pool is not supported.
  * In general, memory arbitration may let queries use more memory than they are otherwise allowed. See `Memory Management <https://facebookincubator.github.io/velox/develop/memory.html>`_.

Functions
=========

reduce_agg
----------

In C++ based Presto, neither the ``inputFunction`` nor the ``combineFunction``
passed to ``reduce_agg`` is permitted to return ``null``. In Presto (Java), this
is permitted, but the behavior is undefined. For more information about
``reduce_agg`` in Presto, see `reduce_agg <../functions/aggregate.html#reduce_agg>`_.
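
The restriction above can be modeled outside Presto. The following is a minimal Python sketch, not Presto or Velox code: the function name ``reduce_agg``, the ``partitions`` argument (a stand-in for splits processed independently), and the use of ``ValueError`` are all assumptions made for illustration. It only shows the shape of the rule: both lambdas are checked, and a ``None`` (null) result is rejected rather than silently propagated.

```python
from functools import reduce


def reduce_agg(partitions, initial_state, input_function, combine_function):
    """Toy model of reduce_agg with the 'no null results' rule enforced.

    partitions is a list of lists; each inner list stands in for the values
    seen by one independent partial aggregation (a hypothetical simplification).
    """

    def checked(fn, name):
        # Wrap a two-argument function so that a None result is an error.
        def wrapper(left, right):
            result = fn(left, right)
            if result is None:
                raise ValueError(f"{name} returned null, which is not permitted")
            return result

        return wrapper

    input_fn = checked(input_function, "inputFunction")
    combine_fn = checked(combine_function, "combineFunction")

    partial_states = []
    for values in partitions:
        state = initial_state
        for value in values:
            if value is None:
                # Null input values are skipped; only non-null values
                # reach inputFunction.
                continue
            state = input_fn(state, value)
        partial_states.append(state)

    if not partial_states:
        return initial_state
    # Merge the per-partition states pairwise with combineFunction.
    return reduce(combine_fn, partial_states)


if __name__ == "__main__":
    total = reduce_agg(
        [[1, 2], [3, None, 4]],
        0,
        lambda state, value: state + value,
        lambda left, right: left + right,
    )
    print(total)
```

In this model, a query whose lambdas can produce ``null`` would need to be rewritten, for example by substituting a sentinel value, before it behaves the same way on both engines. How the C++ engine actually reports the failure is not specified here.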
