Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(templating): Safer Jinja template processing #11704

Merged
merged 14 commits into from
Nov 17, 2020
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion superset/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -301,7 +301,7 @@ def _try_json_readsha( # pylint: disable=unused-argument
# Experimental feature introducing a client (browser) cache
"CLIENT_CACHE": False,
"ENABLE_EXPLORE_JSON_CSRF_PROTECTION": False,
"ENABLE_TEMPLATE_PROCESSING": False,
"ENABLE_TEMPLATE_PROCESSING": True,
"KV_STORE": False,
"PRESTO_EXPAND_DATA": False,
# Exposes API endpoint to compute thumbnails
Expand Down Expand Up @@ -677,6 +677,10 @@ class CeleryConfig: # pylint: disable=too-few-public-methods
# language. This allows you to define custom logic to process macro template.
CUSTOM_TEMPLATE_PROCESSORS: Dict[str, Type[BaseTemplateProcessor]] = {}

# Prevent access to classes/objects and proxy methods in the default Jinja context,
# unless explicitly overridden by JINJA_CONTEXT_ADDONS or CUSTOM_TEMPLATE_PROCESSORS.
Copy link
Member

@mistercrunch mistercrunch Nov 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not want to make JINJA_CONTEXT_ADDONS mutually exclusive with SAFE_JINJA_PROCESSING. Someone may want to add a safe function to their environment without having to fully pivot into the legacy/more risky approach. It should be easy to support this, but we should highlight the caveats.

"""
Exposing functionality through JINJA_CONTEXT_ADDONS has security implications as it opens a window for a user to execute untrusted code. It's important to make sure that you make sure that the objects exposed (as well as objects attached to those objets) are harmless. We recommend only exposing simple/pure functions that return native types.
"""

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unintentional. Fixing to allow JINJA_CONTEXT_ADDONS to coexist with SAFE_JINJA_PROCESSING

SAFE_JINJA_PROCESSING: bool = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to introduce a TEMPLATE_PROCESSOR parameter that accepts TemplateProcessorEnum values, something like

TemplateProcessorEnum(enum.Enum):
    SafeJinja = 1
    LegacyJinja = 2
    Chevron = 3
    Custom = 4

In this approach, LegacyJinja would include the old datetime, random etc base context, and SafeJinja would have a more limited set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to support so many different modes. To me it's more important to find a "paved path" of safe and flexible templating solution that makes the most sense. Every feature flag we added here is more like a temporary solution for compatibility rather than something we want to support in the long-term.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ktmud I agree, I think we should push safety and (potentially unsafe) customizability as a path forward.


# Roles that are controlled by the API / Superset and should not be changes
# by humans.
ROBOT_PERMISSION_ROLES = ["Public", "Gamma", "Alpha", "Admin", "sql_lab"]
Expand Down
24 changes: 13 additions & 11 deletions superset/extensions.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,20 +42,22 @@

class JinjaContextManager:
def __init__(self) -> None:
self._base_context = {
"datetime": datetime,
"random": random,
"relativedelta": relativedelta,
"time": time,
"timedelta": timedelta,
"uuid1": uuid.uuid1,
"uuid3": uuid.uuid3,
"uuid4": uuid.uuid4,
"uuid5": uuid.uuid5,
}
self._base_context: Dict[str, Any] = {}
self._template_processors: Dict[str, Type["BaseTemplateProcessor"]] = {}

def init_app(self, app: Flask) -> None:
if not app.config["SAFE_JINJA_PROCESSING"]:
self._base_context = {
"datetime": datetime,
"random": random,
"relativedelta": relativedelta,
"time": time,
"timedelta": timedelta,
"uuid1": uuid.uuid1,
"uuid3": uuid.uuid3,
"uuid4": uuid.uuid4,
"uuid5": uuid.uuid5,
}
self._base_context.update(app.config["JINJA_CONTEXT_ADDONS"])
self._template_processors.update(app.config["CUSTOM_TEMPLATE_PROCESSORS"])

Expand Down
93 changes: 76 additions & 17 deletions superset/jinja_context.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,14 @@
"""Defines the templating context for SQL Lab"""
import inspect
import re
from typing import Any, cast, List, Optional, Tuple, TYPE_CHECKING
from functools import partial
from typing import Any, Callable, cast, Dict, List, Optional, Tuple, TYPE_CHECKING

from flask import g, request
from jinja2.sandbox import SandboxedEnvironment
from flask import current_app, g, request
from jinja2.sandbox import ImmutableSandboxedEnvironment, SandboxedEnvironment

from superset import jinja_base_context
from superset.exceptions import SupersetTemplateException
from superset.extensions import feature_flag_manager, jinja_context_manager
from superset.utils.core import convert_legacy_filters_into_adhoc, merge_extra_filters

Expand Down Expand Up @@ -151,7 +153,7 @@ def cache_key_wrapper(self, key: Any) -> Any:

def url_param(
self, param: str, default: Optional[str] = None, add_to_cache_keys: bool = True
) -> Optional[Any]:
) -> Optional[str]:
"""
Read a url or post parameter and use it in your SQL Lab query.

Expand Down Expand Up @@ -186,6 +188,28 @@ def url_param(
return result


def safe_proxy(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
mistercrunch marked this conversation as resolved.
Show resolved Hide resolved
none_type = type(None).__name__
allowed_types = [
mistercrunch marked this conversation as resolved.
Show resolved Hide resolved
none_type,
"bool",
"str",
"unicode",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm pretty sure unicode has been replaced by str in Python 3.

"int",
"long",
"float",
"list",
"dict",
"tuple",
]
return_value = func(*args, **kwargs)
value_type = type(return_value).__name__
if value_type not in allowed_types:
raise SupersetTemplateException("Unsafe template value")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to get a more verbose error here, something like

raise SupersetTemplateException(__("Unsafe return type for function %(func)s: %(value_type)s", func=func.__name__, value_type=value_type)


return return_value


class BaseTemplateProcessor: # pylint: disable=too-few-public-methods
"""Base class for database-specific jinja context

Expand Down Expand Up @@ -218,9 +242,33 @@ def __init__(
self._schema = query.schema
elif table:
self._schema = table.schema
self._extra_cache_keys = extra_cache_keys
self._context: Dict[str, Any] = {}
self.set_env()
self.set_context(**kwargs)

def set_env(self) -> None:
self._env = SandboxedEnvironment()

extra_cache = ExtraCache(extra_cache_keys)
def set_context(self, **kwargs: Any) -> None:
self._context.update(kwargs)
self._context.update(jinja_base_context)

def process_template(self, sql: str, **kwargs: Any) -> str:
"""Processes a sql template

>>> sql = "SELECT '{{ datetime(2017, 1, 1).isoformat() }}'"
>>> process_template(sql)
"SELECT '2017-01-01T00:00:00'"
"""
template = self._env.from_string(sql)
kwargs.update(self._context)
return template.render(kwargs)


class JinjaTemplateProcessor(BaseTemplateProcessor):
def set_context(self, **kwargs: Any) -> None:
extra_cache = ExtraCache(self._extra_cache_keys)
self._context = {
"url_param": extra_cache.url_param,
"current_user_id": extra_cache.current_user_id,
Expand All @@ -229,22 +277,28 @@ def __init__(
"filter_values": filter_values,
"form_data": {},
}

self._context.update(kwargs)
self._context.update(jinja_base_context)

if self.engine:
self._context[self.engine] = self
self._env = SandboxedEnvironment()

def process_template(self, sql: str, **kwargs: Any) -> str:
"""Processes a sql template

>>> sql = "SELECT '{{ datetime(2017, 1, 1).isoformat() }}'"
>>> process_template(sql)
"SELECT '2017-01-01T00:00:00'"
"""
template = self._env.from_string(sql)
kwargs.update(self._context)
return template.render(kwargs)
class SafeJinjaTemplateProcessor(BaseTemplateProcessor):
def set_context(self, **kwargs: Any) -> None:
extra_cache = ExtraCache(self._extra_cache_keys)
self._context = {
"url_param": partial(safe_proxy, extra_cache.url_param),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't familiar with partial, but I'm guessing that the intent is to clean up other methods/context that could be attached to the callable (?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent with partial is to wrap the callable with a method to enforce a safe return value.

"current_user_id": partial(safe_proxy, extra_cache.current_user_id),
"current_username": partial(safe_proxy, extra_cache.current_username),
"cache_key_wrapper": partial(safe_proxy, extra_cache.cache_key_wrapper),
"filter_values": partial(safe_proxy, filter_values),
"form_data": {},
mistercrunch marked this conversation as resolved.
Show resolved Hide resolved
}

def set_env(self) -> None:
self._env = ImmutableSandboxedEnvironment()


class NoOpTemplateProcessor(
Expand All @@ -257,7 +311,7 @@ def process_template(self, sql: str, **kwargs: Any) -> str:
return sql


class PrestoTemplateProcessor(BaseTemplateProcessor):
class PrestoTemplateProcessor(JinjaTemplateProcessor):
"""Presto Jinja context

The methods described here are namespaced under ``presto`` in the
Expand Down Expand Up @@ -335,8 +389,13 @@ def get_template_processor(
**kwargs: Any,
) -> BaseTemplateProcessor:
if feature_flag_manager.is_feature_enabled("ENABLE_TEMPLATE_PROCESSING"):
default_processor = (
SafeJinjaTemplateProcessor
if current_app.config["SAFE_JINJA_PROCESSING"]
else JinjaTemplateProcessor
)
template_processor = template_processors.get(
database.backend, BaseTemplateProcessor
database.backend, default_processor
)
else:
template_processor = NoOpTemplateProcessor
Expand Down