Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flytekit] Pyflyte package needs to store serialization context #1359

Closed
wild-endeavor opened this issue Aug 19, 2021 · 2 comments
Closed

[flytekit] Pyflyte package needs to store serialization context #1359

wild-endeavor opened this issue Aug 19, 2021 · 2 comments
Assignees
Labels
enhancement New feature or request Epic: Interactive Experience Everything from FlyteRemote to how to better support Jupyter/vscode development with Flyte flytekit FlyteKit Python related issue

Comments

@wild-endeavor
Copy link
Contributor

wild-endeavor commented Aug 19, 2021

Motivation: Why do you think this is important?
We introduced the pyflyte package command to make the registration experience cleaner (by removing certain options for example). One of the examples that we removed was the --in-container-config-path setting. This setting gets stored as the FLYTE_INTERNAL_CONFIGURATION_PATH environment variable, which flytekit then looks up at run time to see where to load a configuration file.

At run time, typically all the settings that flytekit needs is provided by Propeller (things like admin location, auth settings, etc.). However, there is one case where the compilation settings are required at run time: the non-default images that tasks may specify within a dynamic task. There are other serialization settings that might be useful (default resources, etc) but these are not as important. Currently, if users use pyflyte package, their dynamic tasks can only return tasks that also use the default image.

Goal: What should the final outcome look like, ideally?
A way of capturing the serialization settings, most importantly the images, at pyflyte package time, that can be retrieved at dynamic task run time.

Broader config ticket.
#2214

@wild-endeavor wild-endeavor added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels Aug 19, 2021
@wild-endeavor wild-endeavor added bug Something isn't working and removed untriaged This issues has not yet been looked at by the Maintainers labels Aug 25, 2021
@kumare3
Copy link
Contributor

kumare3 commented Aug 26, 2021

The assertion that I would like to make here is, there is no in-container-config-path. If a config is required, we should assume it is available in a known path like /etc/flyte/config.ini or /etc/flyte/config.yaml.
But, we should also ensure that there are no configuration that is ever statically required by flytekit. This breaks the assumptions of portability and simplified configuration.

As @wild-endeavor rightly points out, for single container scenario, package already passes the container image information - using the badly named env variable called - FLYTE_INTERNAL_IMAGE.
This however, breaks in the case of multiple images, one per task. This is today controlled using the images config. We highly recommend users use naming of the format - default_image.name:<some-string>-default_image.tag, but this is not restricted anywhere and users can provide arbitrary images. if these images are directly provided in the container_image argument of the task then we do not need to send it over. But, flytekit today provides a way to specify it in the config, so that it can be changed.

that being said, we may have other things in the future that are required for successfully serializing dynamic tasks. We can capture this in a special object - serializationsettings and simply pickle them and send them over as an environment variable.
It is guaranteed that the code and python version are the same, it will be read by python and thus pickling is safe.

@kumare3 kumare3 added flytekit FlyteKit Python related issue and removed bug Something isn't working bugSquash-Cascade labels Sep 8, 2021
@wild-endeavor wild-endeavor added the Epic: Interactive Experience Everything from FlyteRemote to how to better support Jupyter/vscode development with Flyte label Feb 16, 2022
@wild-endeavor
Copy link
Contributor Author

Adding the interactive label because this might relate to how we handle configuration in interactive settings. The config settings needed by dynamic tasks (images is the most important, default resource values kinda, way lower in importance) is a subset of the configuration needed by an interactive experience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Epic: Interactive Experience Everything from FlyteRemote to how to better support Jupyter/vscode development with Flyte flytekit FlyteKit Python related issue
Projects
None yet
Development

No branches or pull requests

3 participants