This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Env_vars from KubernetesPodOperator does not assume secret (deploy as env var) because secrets are loaded after env vars into the pods. #40092

Closed
1 of 2 tasks
ana-carolina-januario opened this issue Jun 6, 2024 · 8 comments
Labels
area:core invalid kind:bug This is a clearly a bug provider:cncf-kubernetes Kubernetes provider related issues

Comments

@ana-carolina-januario

Apache Airflow version

Other Airflow 2 version (please specify below)

If "Other Airflow 2 version" selected, which one?

2.6.0

What happened?

We tried to run some bash commands where the command is built from env_vars.
The main environment variable, called COMMAND, contains a placeholder that should be replaced by a secret value, for instance:
(wrapping the env var name in (), {}, or nothing does not affect the result.)

COMMAND="echo 'This is my secret $(SECRET_AS_ENV_VAR)'"

but when I run:
echo $COMMAND

The result does not replace the variable.

When I run:
echo $SECRET_AS_ENV_VAR
The value is printed correctly, so the variable is successfully loaded.

I've tried to

I've analyzed the pods and noticed that the order in which the env vars are presented might be affecting this replacement.

I've noticed that when the pod is being built, the secrets (as env vars) are loaded after the env vars. This could be reviewed so that secrets are loaded before the env_vars.

Thanks,
Ana

What you think should happen instead?

I believe the env var COMMAND should have the SECRET_AS_ENV_VAR replaced properly when starting the pod.

After looking at the code, I've noticed this function should load secrets at the same time as (or even before) the env vars:
https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/7.3.0/_modules/airflow/providers/cncf/kubernetes/operators/pod.html#KubernetesPodOperator.build_pod_request_obj

How to reproduce

Using your KubernetesPodOperator example,
https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/1.2.0/_modules/airflow/providers/cncf/kubernetes/example_dags/example_kubernetes.html
add a variable with a value that uses the secret deployed as env:

init_environments = [
    k8s.V1EnvVar(name='key1', value='This string should show the value of SQL_CONN = $(SQL_CONN)'),
    k8s.V1EnvVar(name='key2', value='value2'),
]

For the args parameter:
args=['echo $key1']

With this echo, the value won't be replaced in the value of key1.
But if you run 'echo $SQL_CONN' the value will be correctly printed.
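
For context, Kubernetes documents that a `$(VAR_NAME)` reference inside an env var value is expanded only from variables defined earlier in the same container's env list, and that an unresolved reference is left unchanged. The sketch below is a simplified, standalone simulation of that rule, not Kubernetes or Airflow code. The helper name `expand_in_order` and the sample values are hypothetical, and the `$$` escape form is deliberately ignored. It illustrates why the order described in this report (secrets appended after env_vars) would leave `key1` unexpanded.

```python
import re

# Matches $(NAME) references; the $$(NAME) escape form is not handled here.
_REF = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand_in_order(env_list):
    """Expand $(NAME) using only variables that appear earlier in env_list.

    env_list is a list of (name, value) pairs. References to names that
    have not been defined yet are left untouched.
    """
    resolved = {}
    for name, value in env_list:
        resolved[name] = _REF.sub(
            lambda m: resolved.get(m.group(1), m.group(0)), value
        )
    return resolved


# Hypothetical values: the secret-backed variable comes last, as reported.
env_first = [
    ("key1", "SQL_CONN = $(SQL_CONN)"),
    ("SQL_CONN", "postgres://example"),
]
# Same variables, but the secret-backed one is defined first.
secret_first = [
    ("SQL_CONN", "postgres://example"),
    ("key1", "SQL_CONN = $(SQL_CONN)"),
]

print(expand_in_order(env_first)["key1"])     # SQL_CONN = $(SQL_CONN)
print(expand_in_order(secret_first)["key1"])  # SQL_CONN = postgres://example
```

In this simulation only the second ordering resolves the reference, which matches the behaviour observed with `echo $key1`.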

Operating System

PRETTY_NAME="Debian GNU/Linux 11 (bullseye)" NAME="Debian GNU/Linux" VERSION_ID="11" VERSION="11 (bullseye)" VERSION_CODENAME=bullseye ID=debian

Versions of Apache Airflow Providers

Version of the apache-airflow-providers-cncf-kubernetes is 6.1.0

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@ana-carolina-januario ana-carolina-januario added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Jun 6, 2024
@vatsrahul1001 vatsrahul1001 added the provider:cncf-kubernetes Kubernetes provider related issues label Jun 6, 2024
@ana-carolina-januario
Author

Hi,
Additionally, I've tried multiple workarounds:

  1. export to other env var like:
    export COMMAND_1=$COMMAND
    without success.
  2. export the secret to another env var(using the cmds parameter) and then export the command to another env_var as well:
    cmds=['sh', '-c', 'export NEW_SECRET=$(SQL_CONN) ; export COMMAND_1=$(COMMAND)']
    also without changes on the behaviour.

Thanks,
Ana.
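
A note on why these workarounds may not help: a POSIX shell does not re-scan the result of `$COMMAND` for further substitutions, so copying the value into another variable keeps the literal placeholder. In a shell, `$(NAME)` is also command substitution rather than variable expansion. One possible alternative, sketched here under assumptions, is to resolve the placeholders inside the container at run time, once both variables exist in the process environment. The function name `resolve_placeholders` and the sample values are hypothetical and are not part of Airflow or Kubernetes.

```python
import os
import re

# Matches $(NAME) placeholders, the syntax used in the COMMAND variable above.
_PLACEHOLDER = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def resolve_placeholders(template, environ=None):
    """Replace each $(NAME) with environ[NAME]; unknown names stay as-is."""
    env = os.environ if environ is None else environ
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), template)


# Stand-in for the container environment (hypothetical values).
fake_env = {
    "SQL_CONN": "postgres://example",
    "COMMAND": "some-tool $(SQL_CONN)",
}
resolved = resolve_placeholders(fake_env["COMMAND"], fake_env)
```

Inside a real container the function would be called without the second argument so that it reads `os.environ`. The resolved string contains the secret value, so it should be passed straight to the tool rather than logged.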

@raphaelauv
Contributor

raphaelauv commented Jun 11, 2024

Use a Kubernetes native secret (since Kubernetes env_vars are plain text):

secret_toto = Secret(
    deploy_type="env",
    deploy_target="THE_SECRET",
    secret="toto",
    key="tata")

KubernetesPodOperator(
    task_id="task-toto",
    kubernetes_conn_id="kubernetes_default",
    image="aaaaa:0.1",
    secrets=[secret_toto],
)

@ana-carolina-januario
Author

I'm not sure if I understood the suggestion.
My problem is that, once the pod is launched, the order of the env vars (secrets + env_vars) is not ideal. If they are independent from each other, the order does not matter.

I would expect the secrets to be loaded before the env_vars, but when I run "describe" from kubectl on the pod, I see that the secrets are loaded after the env_vars (which I build dynamically, so it is not feasible to put them in a secret).
Once I launch the pod, I 'need' the secrets to be loaded before the env vars so that I can perform all actions using env_vars that refer to the secrets' values.

Let me know if this does not make sense to you.

Thanks!

@potiuk
Member

potiuk commented Jun 28, 2024

This is how env vars are passed - there is no way you can add secrets to env vars before - because they are evaluated in two different places. When you create a K8S pod, you pass it the env vars, and K8S processes the pod definition and sets the env vars. Only then does it launch the pod and the Airflow process starts, and it's the Airflow process that reads secrets. If you do it the other way - this is also a terrible, terrible security fix - as @raphaelauv mentioned - env variables are visible and queryable when you have access to K8S, so you should absolutely NEVER pass env variables with resolved secrets to launch a pod.

You need to rethink your strategy. What @raphaelauv suggests is good - make your secrets available as K8S secrets, and in your pods you can mount the secrets as env variables (if you can make your secrets available as K8S secrets). If you need to retrieve them dynamically by Airflow from a secrets manager, you need to write code that will do it dynamically as part of your task's execute() method or a Jinja template that is resolved after the pod has been instantiated.

@ephraimbuddy
Contributor

Closing as won't fix

@potiuk
Member

potiuk commented Jul 1, 2024

Also, we just merged #40519, which addresses that by explicitly mentioning in the appropriate places that env variables are a bad way of passing secrets.

@potiuk potiuk added invalid and removed needs-triage label for new issues that we didn't triage yet labels Jul 1, 2024
@ana-carolina-januario
Author

Actually, I am using k8s native secrets to pass credentials/secrets.
What I pass as an env var is a command that references a secret deployed as an env var:

from airflow.kubernetes.secret import Secret

secret_value = Secret(
    deploy_type=deploy_type,
    deploy_target=secret_key,
    secret=secret_name,
    key=value,
)

and my command is like:
COMMAND="some-tool $(SECRET_DEPLOYED_AS_ENV_VAR)"
where some-tool needs the value passed through the env var that contains the secret value.

I am not passing the value of the secret directly as an env var, I am passing through k8s secret, deploying as env_var.

Thanks!

@potiuk
Member

potiuk commented Jul 1, 2024

I am not passing the value of the secret directly as an env var, I am passing through k8s secret, deploying as env_var.

I still do not understand. If you use a native secret in K8S you should be able to mount it as an env var directly - and the value of that secret should be available to you - this is native K8S behaviour, and Airflow has nothing to do with it.

I really have no idea in this case what is passed when and what problem you have - maybe I am stupid, but most likely you should explain in more detail what your problem is.

Please try to explain it in the way that we can understand it if you need help.

I will re-open it and convert into discussion so that you can continue explaining it, but this is almost for sure not an airflow issue, so discussion is more appropriate.

@potiuk potiuk reopened this Jul 1, 2024
@apache apache locked and limited conversation to collaborators Jul 1, 2024
@potiuk potiuk converted this issue into discussion #40526 Jul 1, 2024


5 participants